50.043 Database Systems and Big Data Course Handout
This page will be updated regularly. Sync up often.¶
Course Description
Database systems manage data, which is at the heart of modern computing applications. This course covers the fundamentals of traditional databases, such as Oracle and MySQL, and the core ideas of recent big data systems.
Students will learn the important data management problems that these systems are designed to solve. They will work with the internal design and implementation of relational databases. They will also study the internals of state-of-the-art big data platforms, namely Apache Spark, and use them on the Amazon cloud (Amazon Web Services). By the end of the course, students will be able to determine the advantages and limitations of different database systems.
Resources
The main resources are lecture slides, tutorial sessions, and online documentation. There are no official textbooks, but the following are useful for reference and for a deeper understanding of some topics.
- Abraham Silberschatz, Henry Korth, S. Sudarshan. Database System Concepts. 6th edition. (DSC)
- Raghu Ramakrishnan, Johannes Gehrke. Database Management Systems. 3rd edition. (DBM)
- Hector Garcia-Molina, Jeffrey D. Ullman, Jennifer Widom. Database Systems: The Complete Book. 2nd edition. (DS)
Instructors
- Roy Lee (roy_lee@sutd.edu.sg) Office Hour:
- Wenxuan Zhang (wxzhang@sutd.edu.sg) Office Hour:
- Yanxia Qin (yanxia_qin@sutd.edu.sg) Office Hour: Friday 1:00-2:30 pm (please send an email to arrange)
TAs
- Zhengbo Zhang (zhengbo_zhang@mymail.sutd.edu.sg) Office Hour:
Communication
If you have questions related to the course, assignments, or project, please post them on the dedicated MS Teams channel.
Grading
Your final grade is computed as follows:
- Homework: 12%. There will be 2 homework assignments, worth 6 points each.
- Project: 48%. Group project, with up to 3 students per group. Unless the instructors are notified otherwise, all group members receive the same project grade.
- Class participation: 3%. Ask and answer questions during classes, spot mistakes, etc.
- Final exam: 37%.
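As a rough illustration of how these weights combine, the sketch below assumes that each component is first scored on a 0-100 scale and that the components are then combined linearly. The function name `final_grade` and the example scores are hypothetical, and the official computation may differ, for example in rounding.

```shell
#!/usr/bin/env bash
# Weights in percent, taken from the grading breakdown above.
W_HOMEWORK=12
W_PROJECT=48
W_PARTICIPATION=3
W_EXAM=37

# final_grade HOMEWORK PROJECT PARTICIPATION EXAM
# Each argument is that component's score on a 0-100 scale
# (an assumption of this sketch, not a stated course rule).
final_grade() {
  awk -v h="$1" -v p="$2" -v c="$3" -v e="$4" \
      -v wh="$W_HOMEWORK" -v wp="$W_PROJECT" \
      -v wc="$W_PARTICIPATION" -v we="$W_EXAM" \
      'BEGIN { printf "%.2f\n", (h*wh + p*wp + c*wc + e*we) / 100 }'
}

# Hypothetical scores: homework 80, project 90, participation 100, exam 70.
final_grade 80 90 100 70   # prints 81.70
```

Because the project carries 48% of the weight, a 10-point change in the project score moves the total by 4.8 points, more than the same change in any other component.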
Things you need to prepare
- If you are using Windows 10 or Windows 11, please install the Ubuntu subsystem (Windows Subsystem for Linux).
- If you are using Linux, you are already set.
- If you are using Mac, please install Homebrew.
- Make sure Java >8 and Ant are installed.
  - Ubuntu: `sudo apt install ant ant-contrib`
  - Mac: `brew install ant ant-contrib`
- When you receive the AWS Educate invitation email, please complete the AWS Academy setup.
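To confirm the Java and Ant items above, something like the following sketch may help. The helper names `java_major` and `check_prereqs` are hypothetical and not part of the course material. The sketch reads "Java >8" literally as major version 9 or later, which is an assumption.

```shell
#!/usr/bin/env bash
# Hypothetical helper script, not part of the course material.

# java_major VERSION: print the major version for strings such as
# "1.8.0_292" (legacy scheme, major 8), "11.0.2", or "17".
java_major() {
  local ver=$1
  case "$ver" in
    1.*) ver=${ver#1.} ;;          # legacy "1.x" scheme: drop the leading "1."
  esac
  printf '%s\n' "${ver%%[!0-9]*}"  # keep only the leading digits
}

# check_prereqs: report whether java (major version > 8) and ant are on PATH.
# Returns 0 if both checks pass, 1 otherwise.
check_prereqs() {
  local status=0 raw major
  if command -v java >/dev/null 2>&1; then
    # `java -version` writes to stderr, e.g.: openjdk version "11.0.2" 2019-01-15
    raw=$(java -version 2>&1 | sed -n 's/.* version "\([^"]*\)".*/\1/p' | head -n 1)
    major=$(java_major "$raw")
    if [ -n "$major" ] && [ "$major" -gt 8 ]; then
      echo "java: OK (major version $major)"
    else
      echo "java: found, but version '${raw:-unknown}' is not > 8"
      status=1
    fi
  else
    echo "java: not found"
    status=1
  fi
  if command -v ant >/dev/null 2>&1; then
    echo "ant: OK"
  else
    echo "ant: not found"
    status=1
  fi
  return "$status"
}
```

After sourcing the file, run `check_prereqs`; a non-zero exit status means at least one tool is missing or too old. If Java 8 itself turns out to be acceptable, change `-gt 8` to `-ge 8`.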
Project
Please refer to the project page.
Submission Policy and Plagiarism
- You will complete each assignment on your own, and each project within your own team, and will not copy and paste solutions from anyone else.
- You will not post any solutions related to this course to any repository, private or public, that others can access.
- Students are allowed to have a private repository for their assignment which no one else can access.
- For projects, students can only invite their partners as collaborators to a private repository.
- Failing to follow the Code of Honour will result in failing the course and/or being referred to the University Disciplinary Committee. The consequences apply to both the person who shares their work and the person who copies the work.
Schedule (May 19 2025 - August 22 2025)
Week | Lecture | Cohort | Reference | Remarks |
---|---|---|---|---|
1 (5/19) | Intro, ER Model | ER Model | DBM: Chapter 1-2, DSC: Chapter 7 | |
2 (5/26) | Relational Model, Relational Algebra | Relational Model, Relational Algebra | DBM: Chapter 3-4, DSC: Chapter 2 & 6 | |
3 (6/2) | SQL, NoSQL | SQL | DBM: Chapter 4-5, DSC: Chapter 2-4 | Project Team Submission (6/8 23:59) |
4 (6/9) | Functional Dependency, Normal Forms | Functional Dependency, Normal Forms | DBM: Chapter 19, DSC: Chapter 8 | |
5 (6/16) | Storage, Index | Storage, Index | DBM: Chapter 19, DSC: Chapter 8 | Assignment 1 Submission (6/22 23:59) |
6 (6/23) | Query Operations | Query Operations | DBM: Chapter 12-14, DSC: Chapter 12 | Project Lab 1 Submission (6/29 23:59) |
7 (6/30) | Recess Week | Self-study: Flintrock and Spark cluster setup (eDimension video tutorial) | | |
8 (7/7) | Query Optimization | Query Optimization | DBM: Chapter 15, DSC: Chapter 13 | Project Lab 2 Submission (7/13 23:59) |
9 (7/14) | Transaction Recovery and Concurrency | Transactions | DBM: Chapter 16-18, DSC: Chapter 14-16 | |
10 (7/21) | HDFS | HDFS, MapReduce | | |
11 (7/28) | MapReduce | | | Project Lab 3 Submission (8/3 23:59) |
12 (8/4) | Spark | Spark | | Assignment 2 Submission (8/10 23:59) |
13 (8/11) | YARN, Guest Lecture | Spark 2 | | Project Lab 4 Submission (8/17 23:59) |
14 (8/18) | Exam week | | | |
Make Up and Alternative Assessment
A make-up for the final exam will be administered when there is an official Leave of Absence from OSA. There will be only one make-up; students who miss it will not be given another.