Project 4: Cloud Storage Service (v1.0)

Starter repository on GitHub: https://classroom.github.com/g/qKjctn-_

In this project, you will build your own cloud storage system similar to Dropbox or Google Drive, with resilient, replicated backend servers and a command line client application. Specific features of the system include:

You will need to use the Go standard library and socket programming to complete this project. However, external libraries may be allowed if they do not implement core functionality (ask first!).

NOTE: you are allowed to work in teams of 2 for this project if you’d like.

Components

Your cloud storage system will have two components:

Storage/Retrieval Operations

Both the server and client in your system will support a variety of messages that influence behavior. You may design your own protocol to implement these operations:

If a client tries to put a file that already exists, you can reject the operation; the client should delete the existing file first. Or, if you’re feeling adventurous, you could add an overwrite operation that automatically does a delete followed by a put.

To ensure the system is trustworthy, each of these operations should be acknowledged as either a success or a failure. Users are generally willing to retry storage operations if/when they fail, so it’s better to be explicit about failures.

Handling Replication and Failures

When a client stores a file, the server should ensure that it has been replicated before acknowledging the storage operation as successful. To do this, the first server will contact the replica server and ask it to store the file as well. You must support at least 2 storage servers in this project (meaning every file is stored twice for redundancy), but you are welcome to support higher replication levels if you wish. The current best practice at companies like Google or Amazon is to store 3-5 replicas of every file.

If a replica server goes down, then your only option is to reject put operations. This may seem unintuitive – why not store the file and wait for the other server to come back up later? At that point, you could synchronize files between the two machines and continue operating normally. However, “synchronize the files” is something that seems relatively simple on the surface but quickly descends into a multitude of edge cases.

However, note that it is completely reasonable to allow get operations when the system is in a degraded state (one storage server has gone down).

Detecting and Repairing File Corruption

When storing a file, your server should also store its checksum (you can use any hash algorithm you’d like). For example, if you have my_file.txt you may also store my_file.txt.checksum in a separate directory. When retrieving a file, you’ll read the file, checksum it, and verify that the checksum matches the original checksum stored on disk. If the file is corrupted, contact a backup server to repair the file.

Tips and Resources

Testing and Grading

Since certain people coughMatthewcough are terribly slow and probably unable to implement a robust test suite before the semester ends, we’ll do testing and grading somewhat differently for this assignment.

You’ll set up your system and then perform the following tasks to demonstrate functionality. Point values are shown by each test.

For the remaining 2 points, you should include a README with detailed instructions on how to set up and use your system. Since this project is less prescriptive, make sure to discuss your design and the logic behind the decisions you made. Creativity is encouraged!

NOTE: testing will be performed on our VMs. Make sure your project works in the test environment before turning the project in!

Test Dataset

Find the test dataset that will be used for grading here: p4-dataset.tar.gz. You can download it to your VM with wget. Here are the files’ names, sizes, and sha1 checksums:

$ wget 'https://www.cs.usfca.edu/~mmalensek/cs521/assignments/p4-dataset.tar.gz'
$ tar xf p4-dataset.tar.gz
$ cd p4-dataset
$ for i in *.bin; do stat --printf '%s ' $i; sha1sum $i; done | column -t
3           ca919eca39b3ed092622b8ae0875ddd0d637254e  test1.bin
12601476    34a308cf63ae2f20bd061733f3e0c1db6577332f  test2.bin
57690244    28b047c55ed0b68df52cf931d973e76aade87545  test3.bin
127944836   4a6d2c9b72e511436b2cf8c075c0a395f4be8de9  test4.bin
719332421   c8166f20e8bdc7d79fb6c7ae36dc170d98abee85  test5.bin
1164986500  2be1550d2d44c578efc1297e17a9652633353a7f  test6.bin

You should make sure that the files that are stored (with put) match these SHA-1 checksums when they are retrieved (with get). Even if the files look the same, you need to be sure by fingerprinting them with a hash function.

Changelog