Project 4: Cloud Storage Service (v1.0)

Starter repository on GitHub: https://classroom.github.com/g/qKjctn-_

In this project, you will build your own cloud storage system similar to Dropbox or Google Drive, with resilient, replicated backend servers and a command line client application. Specific features of the system include:

You will need to use the Go standard library and socket programming to complete this project. However, external libraries may be allowed if they do not implement core functionality (ask first!).

NOTE: you are allowed to work in teams of 2 for this project if you’d like.

Components

Your cloud storage system will have two components:

Storage/Retrieval Operations

Both the server and client in your system will support a variety of messages that influence behavior. You may design your own protocol to implement these operations:

If a client tries to put a file that already exists, you can reject the operation; the client should delete the existing file first. Or, if you’re feeling adventurous, you could add an overwrite operation that automatically does a delete followed by a put.

To ensure the system is trustworthy, each of these operations should be acknowledged as either a success or a failure. Users are generally willing to retry storage operations if/when they fail, so it’s better to be explicit about failures.

Handling Replication and Failures

When a client stores a file, the server should ensure that it has been replicated before acknowledging the storage operation as successful. To do this, the first server will contact the replica server and ask it to store the file as well. You must support at least 2 storage servers in this project (meaning every file is stored twice for redundancy), but you are welcome to support higher replication levels if you wish. The current best practice at companies like Google or Amazon is to store 3-5 replicas of every file.

If a replica server goes down, then your only option is to reject put operations. This may seem unintuitive – why not store the file and wait for the other server to come back up later? At that point, you could synchronize files between the two machines and continue operating normally. However, “synchronize the files” is something that seems relatively simple on the surface but quickly descends into a multitude of edge cases.

However, note that it is completely reasonable to allow get operations when the system is in a degraded state (one storage server has gone down).

Detecting and Repairing File Corruption

When storing a file, your server should also store its checksum (you can use any hash algorithm you’d like). For example, if you have my_file.txt you may also store my_file.txt.checksum in a separate directory. When retrieving a file, you’ll read the file, checksum it, and verify that the checksum matches the original checksum stored on disk. If the file is corrupted, contact a backup server to repair the file.

Tips and Resources

Testing and Grading

Since certain people coughMatthewcough are terribly slow and probably unable to implement a robust test suite before the semester ends, we’ll do testing and grading somewhat differently for this assignment.

You’ll set up your system and then perform the following tasks to demonstrate functionality. Point values are shown by each test.

For the remaining 2 points, you should include a README with detailed instructions on how to set up and use your system. Since this project is less prescriptive, make sure to discuss your design and the logic behind the decisions you made. Creativity is encouraged!

NOTE: testing will be performed on our VMs. Make sure your project works in the test environment before turning the project in!

Test Dataset

Find the test dataset that will be used for grading here: p4-dataset.tar.gz. You can download it to your VM with wget. Here are the files’ names, sizes, and sha1 checksums:

$ wget 'https://www.cs.usfca.edu/~mmalensek/cs521/assignments/p4-dataset.tar.gz'
$ tar xf p4-dataset.tar.gz
$ cd p4-dataset
$ for i in *.bin; do stat --printf '%s ' $i; sha1sum $i; done | column -t
3           ca919eca39b3ed092622b8ae0875ddd0d637254e  test1.bin
12601476    34a308cf63ae2f20bd061733f3e0c1db6577332f  test2.bin
57690244    28b047c55ed0b68df52cf931d973e76aade87545  test3.bin
127944836   4a6d2c9b72e511436b2cf8c075c0a395f4be8de9  test4.bin
719332421   c8166f20e8bdc7d79fb6c7ae36dc170d98abee85  test5.bin
1164986500  2be1550d2d44c578efc1297e17a9652633353a7f  test6.bin

You should make sure that the files that are stored (with put) match these SHA-1 checksums when they are retrieved (with get). Even if the files look the same, you need to be sure by fingerprinting them with a hash function.

Changelog