CS 677 Big Data

Course Schedule

The schedule below is tentative and may change.

Week Topic Materials
1 Aug 19 - 23

Introduction to Big Data

Paper Evaluations and Course Roadmap

2 Aug 26 - 30

Scalable Server Design

Fault Tolerance and Recovery

3 Sep 2 - 6

Distributed Hash Tables

HDFS

4 Sep 9 - 13

Data Models

Bloom Filters

5 Sep 16 - 20

Project 1 Design

PolarFS Discussion

6 Sep 23 - 27

Building a Bloom Filter

Distributed Computation

7 Sep 30 - Oct 4

Hadoop & YARN

RInK vs LInK

8 Oct 7 - 11

Project Q&A and Work Time

9 Oct 14 - 18

Tues: No class, fall break

Lab 4: Hadoop Setup

10 Oct 21 - 25

Geospatial Data

Sampling Streams

  • Quiz 3: Oct 22

11 Oct 28 - Nov 1

Sketching and Summarization

RDDs (Happy Halloween!)

12 Nov 4 - 8

Stream Cardinality Estimation

Stream Analytics at Twitter

13 Nov 11 - 15

Spark Setup

14 Nov 18 - 22

Machine Learning

15 Nov 25 - 29

Distributed Machine Learning

Thurs: No class, Thanksgiving break

16 Dec 2 - 6

Machine Learning Frameworks

  • Quiz 5: Dec 3