Course Schedule

The course schedule below is tentative and subject to change.

Topic | Deadlines & Materials
Week 1: Aug 17 - 21

Introduction to Big Data

Paper Evaluations and Course Roadmap

Week 2: Aug 24 - 28

Scalable Server Design

Blocking vs. Non-blocking I/O

Week 3: Aug 31 - Sep 4

Fault Tolerance and Recovery

HDFS

Week 4: Sep 7 - 11

Distributed Hash Tables

Week 5: Sep 14 - 18

Bloom Filters

Megastore

Week 6: Sep 21 - 25

Data Models

Distributed Computation

Week 7: Sep 28 - Oct 2

Hadoop & YARN

  • Quiz 2: Sep 29

Week 8: Oct 5 - 9

Setting up a Hadoop Cluster

Week 9: Oct 12 - 16

Geospatial Data

Week 10: Oct 19 - 23

Sampling Streams

  • Quiz 3: Oct 20

Week 11: Oct 26 - 30

Sketching and Summarization

Week 12: Nov 2 - 6

Stream Cardinality Estimation

Stream Analytics

Week 13: Nov 9 - 13

Stream Processing Systems

  • Quiz 4: Nov 10

Week 14: Nov 16 - 20

Machine Learning

The Dataflow Model

Week 15: Nov 23 - 27

Fall Break!

Week 16: Nov 30 - Dec 4

Distributed Machine Learning

Machine Learning Frameworks