Fall 2017
Recent advancements in computing hardware and storage technologies have allowed us to collect and manage data at an unprecedented scale. This course explores the theory behind big data management and processing, and gives students a chance to gain real-world experience working with cutting-edge tools from the field. We’ll also read and analyze research papers and follow new developments in big data computing.
Course Objectives
Our goal for this course is to gain experience designing and implementing scalable distributed systems, using distributed computation frameworks such as Hadoop and Spark, and analyzing large-scale datasets.
Announcements
- Nov 20 – Project 3 description is now available.
- Oct 23 – Project 2 description is now available.
- Oct 11 – Project 1 Test Files
- Oct 6 – Active Bass Nodes
- Aug 30 – Project 1 description is now available.
- Aug 28 – Research paper review template now available.
- Aug 23 – Classes begin. Welcome!
Lecture Coordinates
MWF 11:45 am – 12:50 pm, HR 148
Instructor
Matthew Malensek
Office: HR 416
Hours: T 10-11am, WF 1-2pm
Email: mmalensek@usfca.edu
Phone: (415) 422-4756