CS 677 focuses on building and using distributed systems to analyze large datasets. The course consists of large programming assignments, and you will also submit written reports on assigned readings from the literature.
Each assignment will include a detailed specification document with a description of the problem, breakdown of points, permitted libraries, etc. You are free to discuss the projects with your classmates, but sharing code or pseudocode is not acceptable. Please see the grading policy.
Submitting Assignments: use the project links below to create a git repository for your work. To submit, check your code into your git repository before the deadline.
- Due dates are posted on the course schedule page. Assignments are due at 11:59pm on the due date.
- Late lab and research paper assignments are not accepted.
- Late projects lose 10% per day for up to three days. After that, no credit is given.
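To make the late-project arithmetic concrete, here is a minimal sketch of how the penalty might be computed. It is not the official grading script. The function name `late_score` is hypothetical, and two details are assumptions that the policy above does not spell out: any fraction of a day late counts as a full day, and each late day removes 10% of the score the project would otherwise have earned. The assignment specification and grading policy are authoritative.

```python
import math
from datetime import datetime, timedelta


def late_score(raw_score: float, due: datetime, submitted: datetime) -> float:
    """Return a project score after applying the late penalty (illustrative only)."""
    if submitted <= due:
        return raw_score  # on time: no penalty

    # Assumption: any fraction of a day late is rounded up to a full day.
    days_late = math.ceil((submitted - due) / timedelta(days=1))

    # More than three days late: no credit.
    if days_late > 3:
        return 0.0

    # Assumption: 10% of the earned score is deducted per late day.
    return raw_score * (1 - 0.10 * days_late)
```

Under these assumptions, a project that would have earned 90 points and is checked in one hour after the 11:59pm deadline counts as one day late and keeps about 81 points.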
Presentation Order, Fall 2021:
- HDFS – Matthew Malensek
- Dynamo – Hugo Laboisse, Patrick Porter, Zhenzhen Wang
- BigTable – Alma Abbasi, Iris Li, Nikhil Matta
- MapReduce – Bill Li, Shulin Li, Kate Luo, Sam Wang
- Big Data Normalization – Nikhil Bhutani, Aryan Choudhary, Anthony Knox
- Resilient Distributed Datasets – Nuo Cheng, Ziyang Liu, Yiqi Wei
- SparkSQL – Daily Guo, He Wei
- SageDB – Yuan Qian, Yudan Su, Terry Tran
- AlphaFold – Junting Cai, Milton Carreno, Stephen Yu, Dan Zhong
Labs and Projects:
- Lab 0 - Getting to Know You
- Lab 1 - Choosing a Research Paper (post on Campuswire)
- Lab 2 - Go Chat
- Lab 3 - Chunked File Transfer
- Lab 4 - Project 1 Checkpoint
- Project 1 - Distributed File System