Project 1: Test Files
Here are the files we will use to test your DFS. Each entry lists the file name, its size, and its MD5 checksum:
- test_file_1.bin (512 bytes) - 391b1598572e7f2b9d1d1dfa1224fec3
- test_file_2.bin (1.5 KB) - d57a71c50c6c5d0762fc5346d04f4a79
- test_file_3.bin (1 MB) - d9cb06b3a61f8a7c3165783f62e19b1f
- test_file_4.bin (5 MB) - 6127211f9101b8b2bdecbdda64d54e05
- test_file_5.bin (142 MB) - 4c79e941e7811a2619fa2a1cf1b50584
We should be able to store each file, retrieve it, and verify that the retrieved copy's checksum matches the value listed above. You can compute checksums with the md5sum utility on Linux, or md5 on Mac/BSD. Since we will be grading on Linux machines, familiarize yourself with md5sum.
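If you want to check checksums from inside your own test scripts rather than shelling out to md5sum, a minimal sketch using Python's standard hashlib (the function name `md5_of` is ours, not part of the project spec):

```python
import hashlib

def md5_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Compute the MD5 digest of a file, reading it in 1 MB chunks
    so even the 142 MB test file never has to fit in memory at once."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

# Compare a retrieved file against the expected checksum from the list above:
expected = "391b1598572e7f2b9d1d1dfa1224fec3"  # test_file_1.bin
# md5_of("retrieved/test_file_1.bin") == expected
```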
Setup
To get ready for grading, you will need to launch:
- A controller
- 10 storage nodes
Then we’ll execute (some of) the test cases below.
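It helps to have a one-shot launcher so the cluster comes up the same way every time. A hedged sketch, assuming the components are Python scripts named `controller.py` and `storage_node.py` taking a port argument; both the script names and the port scheme are placeholders for whatever your project actually uses:

```python
import subprocess
import sys

CONTROLLER_PORT = 8000  # hypothetical port; use whatever your project expects
STORAGE_NODE_COUNT = 10
STORAGE_PORTS = [CONTROLLER_PORT + 1 + i for i in range(STORAGE_NODE_COUNT)]

def launch_cluster():
    """Start the controller first, then the 10 storage nodes.
    Script names and arguments here are placeholders."""
    procs = [subprocess.Popen(
        [sys.executable, "controller.py", str(CONTROLLER_PORT)])]
    for port in STORAGE_PORTS:
        procs.append(subprocess.Popen(
            [sys.executable, "storage_node.py",
             "localhost", str(CONTROLLER_PORT), str(port)]))
    return procs  # keep handles so individual nodes can be killed in Test Case 5
```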
Test Case 1
- Store file 1
- List files in DFS
- Retrieve file 1
- Verify checksum
Test Case 2
- Store files 1, 2, and 3
- List files in DFS
- Retrieve files 1 and 3
- Verify checksums
Test Case 3
- Store/retrieve file 5
Test Case 4
- Corrupt one of the file chunks on disk
- Retrieve file
- Verify repair occurred
- Verify file checksum
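Test Case 4 implies that a storage node can tell a good chunk from a corrupted one at retrieval time. One common way to do this is to keep a checksum alongside each chunk and re-hash on every read; a minimal sketch, assuming a hypothetical on-disk layout of a chunk file plus a `.md5` sidecar (your own metadata format may differ):

```python
import hashlib
import os

def write_chunk(path: str, data: bytes) -> None:
    """Store a chunk plus a sidecar checksum file (hypothetical layout)."""
    with open(path, "wb") as f:
        f.write(data)
    with open(path + ".md5", "w") as f:
        f.write(hashlib.md5(data).hexdigest())

def chunk_is_corrupt(path: str) -> bool:
    """Re-hash the chunk on read and compare with the stored checksum."""
    with open(path, "rb") as f:
        actual = hashlib.md5(f.read()).hexdigest()
    with open(path + ".md5") as f:
        expected = f.read().strip()
    return actual != expected

# Simulate the grading scenario: corrupt a chunk on disk, then detect it.
write_chunk("chunk_0", b"original chunk bytes")
with open("chunk_0", "r+b") as f:
    f.write(b"X")                   # flip the first byte on disk
print(chunk_is_corrupt("chunk_0"))  # True: fetch a clean replica and repair
os.remove("chunk_0")
os.remove("chunk_0.md5")
```

On a detected mismatch the node would discard its copy and re-fetch the chunk from another replica before serving it, which is the repair behavior Test Case 4 checks for.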
Test Case 5
- Kill primary and secondary replicas for a file (terminate the Storage Node processes)
- Verify controller has orchestrated re-replication
- Retrieve file
- Verify file checksum
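The re-replication step above boils down to a placement decision: once the controller notices dead nodes, it must pick new hosts to restore the replication factor. A minimal sketch of that decision, assuming a replication factor of 3 and illustrative node names (both are assumptions, not part of the spec):

```python
def plan_re_replication(replicas, live_nodes, factor=3):
    """Given the replica set recorded for a chunk and the currently live
    nodes, return a new replica set that restores the replication factor.
    Candidates are taken in sorted order here purely for determinism; a
    real controller might pick by free space or load instead."""
    survivors = [n for n in replicas if n in live_nodes]
    candidates = [n for n in sorted(live_nodes) if n not in survivors]
    needed = factor - len(survivors)
    return survivors + candidates[:max(needed, 0)]

# Primary and secondary replicas were killed; only the third copy survives.
new_set = plan_re_replication(
    replicas={"node1", "node2", "node3"},
    live_nodes={"node3", "node4", "node5", "node6"},
)
print(new_set)  # ['node3', 'node4', 'node5']
```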
Test Case 6: Code Walkthrough
- Client -> Controller communication
- Client -> Storage Node communication
- Storage Node -> Client transfer (in parallel)
- Controller replication logic and fault detection
- Storage Node corrupt-file detection
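For the parallel Storage Node -> Client transfer in the walkthrough, the key idea is that the client requests every chunk concurrently but reassembles them in order. A minimal sketch using a thread pool, with `fetch_chunk` as a stand-in for the real network call (the function names and chunk format are ours, for illustration only):

```python
from concurrent.futures import ThreadPoolExecutor

def fetch_chunk(chunk_id: int) -> bytes:
    """Stand-in for a network request to whichever storage node
    holds this chunk; here it just fabricates recognizable bytes."""
    return f"chunk-{chunk_id}|".encode()

def retrieve_file(num_chunks: int) -> bytes:
    """Request every chunk in parallel, then reassemble in order.
    executor.map yields results in input order even though the
    underlying calls run concurrently across worker threads."""
    with ThreadPoolExecutor(max_workers=8) as pool:
        return b"".join(pool.map(fetch_chunk, range(num_chunks)))

print(retrieve_file(4))  # b'chunk-0|chunk-1|chunk-2|chunk-3|'
```

Using `executor.map` keeps ordering trivial; an alternative is submitting futures keyed by chunk index and sorting on completion, which also lets you retry a slow or failed chunk against a different replica.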