Project 3: Parallel Cryptocurrency Miner (v 1.0)
Starter repository on GitHub: https://classroom.github.com/a/d8fTEYeP
NOTE: If you’d like to build a Go implementation of this project, then you can work in a team of 2. Please contact the course staff if you are interested!
As we all know, the safest investments in life are the ones you can truly rely on. Things like GameStop or AMC stock, corn dogs, Dogecoin, the list goes on. However, as computer scientists, we have the responsibility to be skeptical about what we can actually trust. In fact, why trust anyone? Using cryptocurrencies (or corn dogs) as a means to buy goods and services is a step in the right direction, but you can only truly trust a currency if you actually control it. So, let’s make our own cryptocurrency.
As luck would have it, such a task is quite amenable to parallelization (and you thought we were learning that stuff just for fun!). Cryptocurrencies like Bitcoin rely on a distributed transaction ledger composed of blocks. The job of miners in the Bitcoin network is to verify transactions within these blocks, which is computationally expensive.
In this assignment, you will get more familiar with:
- The pthread library and parallelization using threads
- The producer/consumer paradigm
- Taking performance measurements
You are given a complete sequential program and must parallelize it using the pthreads library. Because of the inherent randomness of cryptocurrency mining, using more threads will improve the probability of finding a solution in a shorter amount of time.
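To see whether extra threads actually help, you need to time your runs. Here is a minimal sketch of one way to take that measurement with the POSIX monotonic clock. The helper names `now_seconds` and `hash_rate` are made up for illustration and are not part of the starter code.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Returns the current reading of the monotonic clock, in seconds.
 * CLOCK_MONOTONIC is not affected by wall-clock adjustments. */
double now_seconds(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (double) ts.tv_sec + (double) ts.tv_nsec / 1e9;
}

/* Computes hashes per second. The elapsed time must be positive. */
double hash_rate(uint64_t total_hashes, double elapsed)
{
    assert(elapsed > 0.0);
    return (double) total_hashes / elapsed;
}
```

One reasonable approach is to call `now_seconds()` right before the workers start and again after they have all been joined, then pass the difference to `hash_rate()` along with the total number of hashes computed by all threads.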
Bitcoin mining (like that of many other cryptocurrencies) is based on Hashcash. The idea behind these systems is called proof-of-work, which is somewhat like a CAPTCHA designed for computers instead of humans. The computer works on a computationally expensive problem to prove its actions are legitimate; in the case of Hashcash, performing the computation proves that you are not a spammer, while in Bitcoin it serves to verify transactions as being valid. A key feature of proof-of-work systems is that once the solution to the problem is found, it is trivial to verify that the answer is correct.
In Bitcoin, hash inversions are the computationally expensive problem being solved by the computer. Given a block, the algorithm tries to find a nonce (number only used once) that, when combined with the block data, produces a hash code with a set number of leading zeros. The more leading zeros requested, the harder the problem is to solve. It’s kind of like rolling dice; rolling any number from 1 through 3 is much easier than rolling a 6. This assignment searches for the nonce that satisfies a given difficulty level by splitting the work across multiple threads.
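The "leading zeros" test can be done with a bit mask. The sketch below assumes the difficulty is at most 32 bits and that only the first four bytes of the hash need to be inspected. The function names are hypothetical, and the starter code may do this differently.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Builds a 32-bit mask whose top `difficulty` bits are zero.
 * For example, a difficulty of 24 gives 0x000000FF. */
uint32_t make_difficulty_mask(unsigned int difficulty)
{
    assert(difficulty <= 32);
    if (difficulty == 32) {
        return 0; /* shifting a 32-bit value by 32 is undefined in C */
    }
    return UINT32_MAX >> difficulty;
}

/* Returns true when the first 32 bits of the digest have no bits set
 * outside the mask, i.e. enough leading zero bits. */
bool meets_difficulty(const uint8_t digest[4], uint32_t mask)
{
    uint32_t front = ((uint32_t) digest[0] << 24)
                   | ((uint32_t) digest[1] << 16)
                   | ((uint32_t) digest[2] << 8)
                   |  (uint32_t) digest[3];
    return (front & mask) == front;
}
```

Each extra bit of difficulty halves the fraction of hashes that pass, which is why the problem gets harder so quickly.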
After successfully performing the hash inversion, the Bitcoin network rewards the miner (or pool of miners) for their work. With our cryptocurrency, a working program will reward you with a good grade :-).
Here’s a demo run for the completed, parallel version of the program:
./miner 4 24 'Hello CS 521!!!'
Number of threads: 4
Difficulty Mask: 00000000000000000000000011111111
Block data: [Hello CS 521!!!]
----------- Starting up miner threads! -----------
Solution found by thread 3:
Nonce: 10211906
Hash: 0000001209850F7AB3EC055248EE4F1B032D39D0
10221196 hashes in 0.26s (39312292.30 hashes/sec)
In this example, 4 threads are used to find the solution to the block: the nonce that satisfies the given difficulty (24 leading zeros in this case). The process behind finding a solution works like this:
- Program starts, block data is provided by the user:
Hello CS 521!!!
- Each thread receives a task to work on. A task is simply an array of nonces.
- The thread begins appending each nonce in the task array to the block data and then hashing the resulting string. For example, the first combination will be Hello CS 521!!!1, followed by Hello CS 521!!!2, and so on.
- We start at 1 so that we can use a nonce of 0 to indicate failure.
- Once a thread finds a hash that begins with the correct number of leading zeros (specified by the difficulty mask), the process is complete.
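The steps above can be sketched roughly as follows. Standard C has no SHA-1 routine, so this sketch takes the "hash and check" step as a function pointer and ships a toy placeholder (`toy_solves`) purely so that it runs on its own. All names here are hypothetical. A real miner would hash the candidate string and test the result against the difficulty mask instead.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Writes "<data><nonce>" into buf. Returns false if it did not fit. */
bool build_candidate(char *buf, size_t size, const char *data, uint64_t nonce)
{
    assert(buf != NULL && data != NULL);
    int n = snprintf(buf, size, "%s%" PRIu64, data, nonce);
    return n >= 0 && (size_t) n < size;
}

/* Stand-in for "hash the string and check the difficulty". */
typedef bool (*solves_fn)(const char *candidate);

/* Toy placeholder: accepts any candidate that ends in '3'.
 * This is NOT a hash; it only makes the sketch runnable. */
bool toy_solves(const char *candidate)
{
    size_t len = strlen(candidate);
    return len > 0 && candidate[len - 1] == '3';
}

/* Tries each nonce in the task array in order. Returns the first nonce
 * that solves the block, or 0 to indicate failure. */
uint64_t mine_task(const char *data, const uint64_t *nonces, size_t count,
                   solves_fn solves)
{
    char buf[256];
    for (size_t i = 0; i < count; i++) {
        if (build_candidate(buf, sizeof buf, data, nonces[i]) && solves(buf)) {
            return nonces[i];
        }
    }
    return 0;
}
```

Returning 0 on failure works only because valid nonces start at 1.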
In our implementation, the main thread will produce tasks, and each worker thread performs the hash inversions. If the user specifies 4 for the thread count, this means you’ll have five total threads (main thread + 4 workers).
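As a rough illustration of that thread layout, the sketch below has the calling (main) thread start a given number of workers and then wait for all of them. The struct and function names are invented for this example, and the worker body is a stub where the real task-consuming loop would go.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Per-worker argument. The fields are illustrative only. */
struct worker_arg {
    int id;
    int started; /* set to 1 by the worker once it runs */
};

/* Worker entry point. A real worker would loop here, taking tasks
 * from the queue and hashing until a solution is found. */
void *worker_main(void *ptr)
{
    struct worker_arg *arg = ptr;
    arg->started = 1;
    return NULL;
}

/* Starts `count` workers from the calling thread and joins them all.
 * Returns the number of workers that ran, or -1 on error. */
int run_workers(int count)
{
    assert(count > 0);
    pthread_t *threads = malloc(sizeof *threads * (size_t) count);
    struct worker_arg *args = calloc((size_t) count, sizeof *args);
    if (threads == NULL || args == NULL) {
        free(threads);
        free(args);
        return -1;
    }

    int created = 0;
    for (int i = 0; i < count; i++) {
        args[i].id = i;
        if (pthread_create(&threads[i], NULL, worker_main, &args[i]) != 0) {
            break;
        }
        created++;
    }

    int ran = 0;
    for (int i = 0; i < created; i++) {
        pthread_join(threads[i], NULL); /* join makes `started` visible */
        ran += args[i].started;
    }

    free(threads);
    free(args);
    return created == count ? ran : -1;
}
```

Calling `run_workers(4)` from `main` gives the five-thread picture described above: the main thread plus four workers.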
The Bounded Producer-Consumer Problem
You might be wondering why a producer-consumer setup is the right approach for this assignment. After all, individual threads could split the entire range of possible nonces up and begin working on separate blocks in parallel. However, coordinating what nonces the threads work on from the producer side will give us a reliable way to test your code; everyone will start at 1 and proceed until they find the first nonce that ‘solves’ the data block. This also means that, for example, each student’s miner could be working on different ranges and we could compete to see who mines the most blocks :-).
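For reference, a bounded buffer is commonly built from one mutex and two condition variables. The following is a minimal sketch under a few assumptions: the capacity of 4 is arbitrary, all names are hypothetical, and a task is simplified to a contiguous range of nonces rather than an explicit array. It also leaves out shutdown handling, which your miner will need so that workers stop once a solution is found.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

#define QUEUE_CAPACITY 4 /* illustrative bound */

/* Simplified task: the nonces start, start + 1, ..., start + count - 1. */
struct task {
    uint64_t start;
    uint64_t count;
};

struct task_queue {
    struct task items[QUEUE_CAPACITY];
    int head; /* index of the next task to remove */
    int tail; /* index of the next free slot */
    int size; /* number of tasks currently stored */
    pthread_mutex_t lock;
    pthread_cond_t not_full;
    pthread_cond_t not_empty;
};

void queue_init(struct task_queue *q)
{
    assert(q != NULL);
    q->head = q->tail = q->size = 0;
    pthread_mutex_init(&q->lock, NULL);
    pthread_cond_init(&q->not_full, NULL);
    pthread_cond_init(&q->not_empty, NULL);
}

/* Producer side: blocks while the queue is full. */
void queue_put(struct task_queue *q, struct task t)
{
    pthread_mutex_lock(&q->lock);
    while (q->size == QUEUE_CAPACITY) {
        pthread_cond_wait(&q->not_full, &q->lock);
    }
    q->items[q->tail] = t;
    q->tail = (q->tail + 1) % QUEUE_CAPACITY;
    q->size++;
    pthread_cond_signal(&q->not_empty);
    pthread_mutex_unlock(&q->lock);
}

/* Consumer side: blocks while the queue is empty. */
struct task queue_get(struct task_queue *q)
{
    pthread_mutex_lock(&q->lock);
    while (q->size == 0) {
        pthread_cond_wait(&q->not_empty, &q->lock);
    }
    struct task t = q->items[q->head];
    q->head = (q->head + 1) % QUEUE_CAPACITY;
    q->size--;
    pthread_cond_signal(&q->not_full);
    pthread_mutex_unlock(&q->lock);
    return t;
}
```

The `while` loops around `pthread_cond_wait` matter: a thread can wake up spuriously, or another thread may have changed the queue first, so the condition has to be rechecked after every wakeup.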
Testing Your Code
The great thing about cryptocurrencies is their proof-of-work paradigm: it takes a long time to produce a solution for a given block, but verifying that the solution is correct is relatively trivial. We can even do this on the UNIX command line. Let’s assume your block data was 'Hello CS 521!!!' and the nonce of the solution you found was 10211906 with a difficulty level of 24. We can test this with the sha1sum utility:
echo -n 'Hello CS 521!!!10211906' | sha1sum
0000001209850f7ab3ec055248ee4f1b032d39d0
Note that the resulting hash has 6 leading zeros, which is what we’d expect: a 24-bit difficulty means 6 hex characters’ worth of zeros (24 / 4 = 6).
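The same check can be done in code. This small sketch (the function names are made up for illustration) counts the leading zero bits in a hex string such as the one printed by sha1sum, including any zero bits at the top of the first nonzero digit.

```c
#include <assert.h>
#include <stddef.h>

/* Returns the value of a hex digit, or -1 if c is not a hex digit. */
static int hex_value(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/* Counts the leading zero bits of a hex string. Counting stops at the
 * first nonzero digit or the first non-hex character. */
unsigned int hex_leading_zero_bits(const char *hex)
{
    assert(hex != NULL);
    unsigned int bits = 0;
    for (; *hex != '\0'; hex++) {
        int v = hex_value(*hex);
        if (v < 0) {
            break;
        }
        if (v == 0) {
            bits += 4; /* a zero digit contributes four zero bits */
            continue;
        }
        /* Nonzero digit: count the zero bits above its highest set bit. */
        for (int mask = 8; (v & mask) == 0; mask >>= 1) {
            bits++;
        }
        break;
    }
    return bits;
}
```

For the hash above, the six zero digits give 24 bits and the following 1 (binary 0001) adds three more, so the count is 27, which comfortably satisfies a difficulty of 24.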
- ~15 pts - Passing the test cases (coming soon!)
- 2 pts - Code review:
- Code quality and stylistic consistency
- Functions, structs, etc. must have documentation in Doxygen format (similar to Javadoc). Describe inputs, outputs, and the purpose of each function. NOTE: this is included in the test cases, but we will also look through your documentation.
- No dead, leftover, or unnecessary code.
- You must include a README.md file that describes your program, how it works, how to build it, and any other relevant details. You’ll be happy you did this later if/when you revisit the codebase. Here is an example README.md file.
Restrictions: you may use any standard C library functionality. You are also required to use a bounded producer-consumer implementation to allocate tasks to your worker threads; this makes testing your code predictable. Your code must compile and run on your VM set up with Arch Linux as described in class. Failure to follow these guidelines will result in a grade of 0.
- Initial project specification posted (5/3)