The Distributed Analytics, Technologies, and Algorithms (DATA) lab at the University of San Francisco tackles big data problems across a variety of domains. We focus on building scalable distributed systems and algorithms to analyze and gain insights from voluminous datasets.
- Galileo: a high-throughput distributed storage system for multidimensional data
- Minerva: a disk scheduling agent and cloud resource management framework
- Contributions to the Granules framework for distributed stream processing
- jbsdiff: a Java implementation of the bsdiff algorithm
Occasionally, research or teaching projects lead to the development of small tools or scripts. Some of these utilities are available at sigpipe.io, and miscellaneous utility scripts are on GitHub.