50.043 Database Systems and Big Data Course Handout

This page will be updated regularly. Sync up often.

Course Description

Database systems manage data, which is at the heart of modern computing applications. This course covers the fundamentals of traditional databases, such as Oracle and MySQL, and the core ideas of recent big data systems.

Students will learn the important data management problems that these systems are designed to solve. They will work with the internal design and implementation of relational databases. They will also study the internals of state-of-the-art big data platforms, namely Apache Spark, and use them on the Amazon cloud (Amazon Web Services). By the end of the course, students will be able to determine the advantages and limitations of different database systems.

Resources

The main resources are lecture slides, tutorial sessions, and online documentation. There are no official textbooks, but the following are useful for reference and for a deeper understanding of some topics.

  1. Abraham Silberschatz, Henry Korth, S. Sudarshan. Database System Concepts. 6th edition. (DSC)
  2. Raghu Ramakrishnan, Johannes Gehrke. Database Management Systems. 3rd edition. (DBM)
  3. Hector Garcia-Molina, Jeffrey D. Ullman, Jennifer Widom. Database Systems: The Complete Book. 2nd edition. (DS)

Instructors

  • Roy Lee (roy_lee@sutd.edu.sg) Office Hour:

  • Wenxuan Zhang (wxzhang@sutd.edu.sg) Office Hour:

  • Yanxia Qin (yanxia_qin@sutd.edu.sg) Office Hour: Friday 1:00-2:30 pm (please send email to arrange)

TAs

  • Zhengbo Zhang (zhengbo_zhang@mymail.sutd.edu.sg) Office Hour:

Communication

If you have questions related to the course, assignments, or project, please post them on the dedicated MS Teams channel.

Grading

Your final grade is computed as follows:

  1. Homework: 12% There will be 2 homework assignments, 6 points each.

  2. Project: 48% Group project, up to 3 students per group. Unless the instructors are notified otherwise, all group members receive the same grade for the project.

  3. Class participation: 3% Ask/answer questions during classes, spot mistakes, etc.

  4. Final exam: 37%

Things you need to prepare

  • If you are using Windows 10 or Windows 11, please install the Ubuntu subsystem (Windows Subsystem for Linux).
  • If you are using Linux, no additional platform setup is needed.
  • If you are using Mac, please install Homebrew.
  • Make sure Java >8 and Ant are installed.
  • To install Ant on Ubuntu: sudo apt install ant ant-contrib
  • To install Ant on Mac: brew install ant ant-contrib
  • When you receive the AWS Educate invitation email, please complete the AWS Academy setup.
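
The Java and Ant requirement in the list above can be checked from a terminal with a short script like the one below. This is only a sketch and not part of the official setup; the helper name check_tool is made up for illustration. It reports whether each tool is on your PATH and prints its version, which you then compare against the Java requirement yourself.

```shell
#!/bin/sh
# check_tool is a hypothetical helper: it reports whether a command is on PATH.
# Returns 0 if the command is found, 1 otherwise.
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1: found"
        return 0
    else
        echo "$1: MISSING"
        return 1
    fi
}

# Print the installed versions only when the tool exists.
# Note: java writes its version information to stderr.
if check_tool java; then
    java -version
fi

if check_tool ant; then
    ant -version
fi
```

If either tool is reported as MISSING, install it using the commands in the list above and run the script again.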

Project

Please refer to the project page.

Submission Policy and Plagiarism

  1. You will do the assignments on your own and the project within your own team, and you will not copy and paste solutions from anyone else.
  2. You will not post any solutions related to this course to a private/public repository that is accessible by the public/others.
  3. Students are allowed to have a private repository for their assignment which no one can access.
  4. For projects, students can only invite their partners as collaborators to a private repository.
  5. Failing to follow the Code of Honour will result in failing the course and/or being submitted to the University Disciplinary Committee. The consequences apply to both the person who shares their work and the person who copies the work.

Schedule (May 19 2025 - August 22 2025)

| Week | Lecture | Cohort | Reference | Remarks |
|---|---|---|---|---|
| 1 (5/19) | Intro, ER Model | ER Model | DBM: Chapter 1-2, DSC: Chapter 7 | |
| 2 (5/26) | Relational Model, Relational Algebra | Relational Model, Relational Algebra | DBM: Chapter 3-4, DSC: Chapter 2 & 6 | |
| 3 (6/2) | SQL, NoSQL | SQL | DBM: Chapter 4-5, DSC: Chapter 2-4 | Project Team Submission (6/8 23:59) |
| 4 (6/9) | Functional Dependency, Normal Forms | Functional Dependency, Normal Forms | DBM: Chapter 19, DSC: Chapter 8 | |
| 5 (6/16) | Storage, Index | Storage, Index | DBM: Chapter 19, DSC: Chapter 8 | Assignment 1 Submission (6/22 23:59) |
| 6 (6/23) | Query Operations | Query Operations | DBM: Chapter 12-14, DSC: Chapter 12 | Project Lab 1 Submission (6/29 23:59) |
| 7 (6/30) | Recess Week | | | Self-study Flintrock and Spark cluster setup (eDimension video tutorial) |
| 8 (7/7) | Query Optimization | Query Optimization | DBM: Chapter 15, DSC: Chapter 13 | Project Lab 2 Submission (7/13 23:59) |
| 9 (7/14) | Transaction Recovery and Concurrency | Transactions | DBM: Chapter 16-18, DSC: Chapter 14-16 | |
| 10 (7/21) | HDFS | HDFS, MapReduce | | |
| 11 (7/28) | MapReduce | | | Project Lab 3 Submission (8/3 23:59) |
| 12 (8/4) | Spark | Spark | | Assignment 2 Submission (8/10 23:59) |
| 13 (8/11) | YARN, Guest Lecture | Spark 2 | | Project Lab 4 Submission (8/17 23:59) |
| 14 (8/18) | Exam week | | | |

Make-Up and Alternative Assessment

A make-up for the final exam will be administered only when there is an official Leave of Absence from OSA. There will be only one make-up test, and students who miss it will not be given another.