CS 677 focuses on building and leveraging distributed systems to analyze large datasets. The course consists of large programming assignments, and you will also be required to submit written reports on assigned readings from the literature.
Each assignment will include a detailed specification document with a description of the problem, breakdown of points, permitted libraries, etc. You are free to discuss the projects with your classmates, but sharing code or pseudocode is not acceptable. Please see the grading policy.
Submitting Assignments: Use the project links below to create a git repository for your work. To submit, check your code into your git repository before the deadline.
- Due dates are posted on the course schedule page. Assignments are due at 11:59pm on the due date.
- Late lab and research paper assignments are not accepted.
- Project deadlines are strong suggestions, but you are granted flexibility to promote creativity and risk-taking in your designs. Projects must be turned in by the end of the semester to receive credit.
Presentation Order, Spring 2023:
- Dynamo – Vinay Bojja, Gandhar Kulkarni, Ashutosh Malla
- Megastore – Cara Cao, Yaomin Zhang
- MapReduce – Xiuhui Wang, Heran Zhang
- Spark – Yifan Meng, Shelger Zhang
- RDDs – James Ambat, Deep Mistry
- ZooKeeper – Colin Inns, Colm Lang
- IPFS – Josh Li, Aneesh Madhavan, Mark Wu
- Storm – Ashley Radford, Yordanos Solomon
- TensorFlow – Logan Jendrusch, Joel Konuparamban, Matyas Krizek
Labs:
- Lab 0 - Getting to Know You
- Lab 1 - Log Analyzer
- Lab 2 - File Transfer Client and Server
- Lab 3 - Distributed Failure Detection
- Lab 4 - Choosing a Research Paper
- Lab 5 - Project 1 Design
- Lab 6 - Project 1 Checkpoint
- Lab 7 - Hadoop Setup
- Lab 8 - Logs, Passwords, and MapReduce Jobs
- Lab 9 - Spark Setup