Course Schedule
The following is a tentative schedule for the course (subject to change).
| Topic | Deadlines & Materials |
|---|---|
| Week 1: January 22 – 28 | |
| Introduction to Big Data Big Data Case Studies and Datasets A Quick Tour of Go | |
| Week 2: January 29 – February 4 | |
| Unit 1: Scaling and Storage Slightly More Advanced Go Scaling Out Research Papers and HDFS Presentation | |
| Week 3: February 5 – 11 | |
| Network Design | |
| Week 4: February 12 – 18 | |
| Fault Tolerance and Consensus | |
| Week 5: February 19 – 25 | |
| Distributed Hash Tables | |
| Week 6: February 26 – March 4 | |
| Data Models | |
| Week 7: March 5 – 11 | |
| Unit 2: Distributed & Parallel Computation Introduction to Distributed Computation Hadoop Setup | |
| Week 8: March 12 – 18 | |
| Spring Break! | |
| Week 9: March 19 – 25 | |
| Hadoop MapReduce | |
| Week 10: March 26 – April 1 | |
| Designing a MapReduce Job | |
| Week 11: April 2 – 8 | |
| Spark Setup Cluster Orchestration | |
| Week 12: April 9 – 15 | |
| Unit 3: Streaming Algorithms and Applications Big Data Sampling Techniques Bloom Filters | |
| Week 13: April 16 – 22 | |
| Data Sketches | |
| Week 14: April 23 – 29 | |
| Spatiotemporal Data Working with Spark | |
| Week 15: April 30 – May 6 | |
| Spark Streaming SageDB, Machine Learning | |
| Week 16: May 7 – 13 | |
| Wrapping up the Semester | |
| Week 17: May 14 – 20 | |
| Final Quiz: Monday, May 18 ⋅ 10:00 am – 10:30 am | |