Course Schedule
The following schedule is tentative and subject to change.
Topic | Deadlines & Materials |
---|---|
Week 1: January 22 – 28 | |
Introduction to Big Data; A Whirlwind Tour of Go | |
Week 2: January 29 – February 4 | |
Unit 1: Scaling and Storage; Scaling Out; Research Papers and HDFS Presentation | |
Week 3: February 5 – 11 | |
Finishing HDFS Talk; Network Design, Work on Lab 2 | |
Week 4: February 12 – 18 | |
Fault Tolerance and Consensus; Dynamo Paper Presentation | |
Week 5: February 19 – 25 | |
Distributed Hash Tables; Megastore Paper Presentation | |
Week 6: February 26 – March 4 | |
Data Models; Thurs: Class Cancelled | |
Week 7: March 5 – 11 | |
Unit 2: Distributed & Parallel Computation; Introduction to Distributed Computation; Hadoop Setup | |
Week 8: March 12 – 18 | |
Spring Break! | |
Week 9: March 19 – 25 | |
Hadoop MapReduce; MapReduce Paper Presentation | |
Week 10: March 26 – April 1 | |
Designing our MR Job; Spark, RDDs | |
Week 11: April 2 – 8 | |
Spark Setup; Zookeeper Presentation; Cluster Orchestration | |
Week 12: April 9 – 15 | |
Unit 3: Streaming Algorithms and Applications; Big Data Sampling Techniques; IPFS Presentation; Bloom Filters | |
Week 13: April 16 – 22 | |
Data Sketches | |
Week 14: April 23 – 29 | |
Spatiotemporal Data; Working with Spark | |
Week 15: April 30 – May 6 | |
Spark Streaming; SageDB, Machine Learning | |
Week 16: May 7 – 13 | |
Wrapping up the Semester | |
Week 17: May 14 – 20 | |
Final Quiz: Tuesday, May 16 ⋅ 10:00 am – 12:00 pm | |