Recent advancements in computing hardware and storage technologies have allowed us to collect and manage data at an unprecedented scale. This course explores the theory behind big data management and processing, and gives students a chance to gain real-world experience working with cutting-edge tools from the field. We’ll also have the opportunity to read and analyze research papers in big data computing.
Our goal for this course is to gain experience designing and implementing scalable distributed systems, using distributed computation frameworks such as Hadoop and Spark, and analyzing large-scale datasets.
- November 18 – Project 3 now available.
- October 22 – Project 2 now available.
- August 27 – Project 1 now available.
- August 20 – Classes begin. Welcome!
Lecture: Tuesday & Thursday ⋅ 4:35 – 6:20pm ⋅ LS 307
Communication: Piazza ⋅ Zoom Live Stream
Instructor: Matthew Malensek
Office: HR 406
Hours: T 2:30 – 4:00pm ⋅ Th 10:00 – 11:30am ⋅ F 2:00 – 3:00pm