Project 1 - Load Balancing in a Distributed Data Backup Application
Still subject to change...
Due - Monday, February 26, 2007
For this project, you will implement a testbed that will enable
you to evaluate various load balancing strategies for a distributed
data backup application. Your testbed will consist of
servers that accept requests to store data and
clients that upload data to the servers. Undergraduate
students will implement a client-aware load balancing algorithm
that enables clients to independently select from a set of
potential servers. Graduate students will implement a separate
load balancer that will attempt to balance load among the
candidate backup servers.
The Server
The server will be a multithreaded application that simply waits
for a client request and spawns a thread to process the request.
Processing the request will consist of receiving data from the
client and storing it to disk. You may use any protocol you like
for communication between the client and the server. You may also
reuse code that you have previously written. However, keep
it simple! The goals of this assignment are to give you
experience with socket programming and performance measurement.
You will have the opportunity to further develop this application
for Project 2 if you so choose.
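One possible shape for the thread-per-request server is sketched below in Java. The assignment leaves the protocol up to you, so everything specific here is an assumption made for illustration: the class name BackupServer, the storage directory argument, and the wire format (a file name sent with writeUTF, followed by the file bytes until the client closes the connection).

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;

// Hypothetical thread-per-request backup server.
// Assumed protocol: the client sends a file name with writeUTF(),
// then the file bytes, then closes the connection.
class BackupServer implements Runnable {
    private final ServerSocket listener;
    private final Path storageDir;

    BackupServer(int port, Path storageDir) throws IOException {
        this.listener = new ServerSocket(port); // port 0 asks the OS for a free port
        this.storageDir = Files.createDirectories(storageDir);
    }

    int port() {
        return listener.getLocalPort();
    }

    // Accept loop: each client connection is handled on its own thread.
    @Override
    public void run() {
        while (!listener.isClosed()) {
            try {
                Socket client = listener.accept();
                new Thread(() -> handle(client)).start();
            } catch (IOException e) {
                // accept() fails once close() is called; the loop condition then ends the loop
            }
        }
    }

    void close() throws IOException {
        listener.close();
    }

    // Receives one upload and stores it to disk.
    private void handle(Socket client) {
        try (Socket s = client;
             DataInputStream in = new DataInputStream(s.getInputStream())) {
            // Keep only the last path component so uploads stay inside storageDir.
            String name = Paths.get(in.readUTF()).getFileName().toString();
            Path target = storageDir.resolve(name);
            // This sketch overwrites an existing file with the same name.
            Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
        } catch (IOException | RuntimeException e) {
            System.err.println("upload failed: " + e.getMessage());
        }
    }
}
```

To try it, construct a BackupServer with a port and a directory and run it on a thread, for example new Thread(new BackupServer(9000, Paths.get("backup"))).start(). The port number and directory name are placeholders.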
The Client
The client will be designed to test the performance of the
system. It will repeatedly wait for X seconds, where X is a
configurable parameter, and upload a file to a backup server. You
should not implement a user interface or data restore capability.
Again, you will have the opportunity to further develop this
application for Project 2 if you wish.
The client implementation will vary slightly depending on the load
balancing algorithm used. Undergraduates will implement
Algorithm 1, a client-aware load balancing algorithm. Your
client will be configured with a list of the IP addresses of all
candidate servers. For each file uploaded by the client, it will
select the IP address of a server randomly, using a uniform
distribution.
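As a minimal sketch of Algorithm 1, the selection step could look like the following Java class. The class name RandomServerSelector and the idea of passing in the Random instance (so experiments can be seeded) are assumptions for illustration, not requirements of the assignment.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Algorithm 1 sketch: the client picks one server uniformly at random per upload.
class RandomServerSelector {
    private final List<String> servers;
    private final Random random;

    RandomServerSelector(List<String> servers, Random random) {
        if (servers.isEmpty()) {
            throw new IllegalArgumentException("need at least one server");
        }
        this.servers = new ArrayList<>(servers); // defensive copy
        this.random = random;
    }

    // Random.nextInt(n) returns each index in [0, n) with equal probability.
    String next() {
        return servers.get(random.nextInt(servers.size()));
    }
}
```

The client would call next() once before each upload and connect to the address it returns.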
Graduate students will begin with Algorithm 1 described
above. They will compare the results of this algorithm with
two additional algorithms implemented in the load balancer.
To use the load balancer, the client will be modified to upload
data in a two-step process. In step 1, the client will contact
the load balancer to request the IP address of a backup server.
In step 2, the client will connect to the IP address given and
upload the data.
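The two-step process might be sketched as follows. The protocol here is purely hypothetical: the client sends the file size as one text line, the load balancer replies with one line of the form host:port, and the upload itself is just the raw bytes. The class name TwoStepClient and the inclusion of a port in the reply are likewise assumptions; your own protocol may differ.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

// Hypothetical two-step upload client.
class TwoStepClient {

    // Step 1: ask the load balancer which backup server to use.
    static String requestServer(String lbHost, int lbPort, long fileSize) throws IOException {
        try (Socket lb = new Socket(lbHost, lbPort);
             OutputStream out = lb.getOutputStream();
             BufferedReader in = new BufferedReader(
                     new InputStreamReader(lb.getInputStream(), StandardCharsets.UTF_8))) {
            out.write((fileSize + "\n").getBytes(StandardCharsets.UTF_8));
            out.flush();
            String reply = in.readLine();
            if (reply == null) {
                throw new IOException("load balancer closed the connection");
            }
            return reply.trim();
        }
    }

    // Step 2: connect to the chosen server and send the data.
    static void upload(String address, byte[] data) throws IOException {
        int colon = address.lastIndexOf(':');
        if (colon < 0) {
            throw new IOException("malformed server address: " + address);
        }
        String host = address.substring(0, colon);
        int port;
        try {
            port = Integer.parseInt(address.substring(colon + 1));
        } catch (NumberFormatException e) {
            throw new IOException("malformed server port: " + address);
        }
        try (Socket server = new Socket(host, port);
             OutputStream out = server.getOutputStream()) {
            out.write(data);
        }
    }

    // Both steps in sequence for one file.
    static void backup(String lbHost, int lbPort, byte[] data) throws IOException {
        upload(requestServer(lbHost, lbPort, data.length), data);
    }
}
```

Sending the file size in step 1 is not needed for round robin, but it lets the same client work unchanged with a size-aware load balancer.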
The Load Balancer
The load balancer will receive a client request for an IP address,
apply one of two load balancing algorithms, and return the IP
address of the chosen backup server. When the load balancer is
launched, it will be configured to use either Algorithm 2 or 3
(described below). In other words, for a single experiment, the
load balancer will only use one algorithm.
Algorithm 2 - Round Robin: Algorithm 2 will be a simple
round robin. The load balancer will maintain a list of the IP
addresses of the candidate servers. The first client request will
be directed to server 1, the second to server 2, and so on, wrapping
back to server 1 after the last server. In this way, each server
should receive a balanced number of requests.
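A round-robin selector can be quite small. The sketch below is one way to write it in Java; the class name RoundRobinSelector is hypothetical, and the atomic counter is an assumption that the load balancer handles requests on several threads.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicLong;

// Algorithm 2 sketch: hand out servers in list order, wrapping around at the end.
class RoundRobinSelector {
    private final List<String> servers;
    private final AtomicLong counter = new AtomicLong();

    RoundRobinSelector(List<String> servers) {
        if (servers.isEmpty()) {
            throw new IllegalArgumentException("need at least one server");
        }
        this.servers = new ArrayList<>(servers); // defensive copy
    }

    // getAndIncrement() is atomic, so concurrent callers never share a turn.
    String next() {
        long turn = counter.getAndIncrement();
        return servers.get((int) (turn % servers.size()));
    }
}
```

With four servers, the fifth call to next() returns the first server again.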
Algorithm 3 - Size-aware: Algorithm 3 will choose servers
based on the size of the data to be uploaded. When a client
contacts the load balancer it will provide the size of the file it
intends to upload. The load balancer will keep track of the total
amount of data it believes has been uploaded to each server and
will select the server in an effort to balance the amount of
data stored on each server. Note that if a client upload
fails after it has received the server IP from the load balancer,
the load balancer may have an incorrect view of the amount of data
stored on each server. You do not need to solve this problem for
Project 1, though you may choose to work on it for Project 2.
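One simple way to interpret "balance the amount of data" is to send each file to the server with the fewest bytes assigned so far. That policy is an assumption, as is the class name SizeAwareSelector; other policies would also satisfy the description above.

```java
import java.util.ArrayList;
import java.util.List;

// Algorithm 3 sketch: pick the server that has been assigned the fewest bytes so far.
class SizeAwareSelector {
    private final List<String> servers;
    private final long[] assignedBytes; // the load balancer's belief, not a measurement

    SizeAwareSelector(List<String> servers) {
        if (servers.isEmpty()) {
            throw new IllegalArgumentException("need at least one server");
        }
        this.servers = new ArrayList<>(servers);
        this.assignedBytes = new long[servers.size()];
    }

    // synchronized so that concurrent requests see a consistent tally.
    synchronized String next(long fileSize) {
        int best = 0;
        for (int i = 1; i < assignedBytes.length; i++) {
            if (assignedBytes[i] < assignedBytes[best]) {
                best = i; // ties go to the earlier server in the list
            }
        }
        // Charged at selection time; a failed upload is never subtracted.
        assignedBytes[best] += fileSize;
        return servers.get(best);
    }

    synchronized long assignedBytes(int index) {
        return assignedBytes[index];
    }
}
```

Because the tally is updated when the server is chosen rather than when the upload completes, this sketch has exactly the failed-upload inaccuracy noted above.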
Experimental Setup
You will run several experiments for each algorithm. Each
experiment will run for a fixed length of time, for example 5
minutes. You will fix the number of servers at 4 and use one or
more instances of your client to generate the workload. Though
decreasing the time between uploads for one client will have an
impact similar to increasing the total number of clients, you
should do some preliminary experimentation to determine how many
clients you will use for your experiments. Your goal is to
generate a reasonable amount of load for the servers and load
balancer. You will likely use 8-10 clients for your experiments.
You will vary the following parameters:
- The size of the files uploaded by the client - For each upload, your client will upload either a small (10-20K) or large (1-2MB) file. You will configure the client with the percentage of files that should be large. In other words, if you set this parameter at 20%, 2 out of 10 files uploaded by the client should be large files and the other 8 should be small files.
- Time between uploads for one client - Between each upload, each client will sleep for X seconds. X is your configurable parameter and will range from 0 to 30.
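The file-size parameter could be implemented along these lines. This sketch makes several assumptions: the class name UploadPlanner is hypothetical, K and MB are taken as 1024 and 1024*1024 bytes, and each file is chosen to be large independently with the configured probability, so the ratio holds on average rather than exactly in every group of 10.

```java
import java.util.Random;

// Workload sketch: decides the size of each upload from the "percent large" parameter.
class UploadPlanner {
    static final int KB = 1024;
    static final int MB = 1024 * KB;

    private final int percentLarge; // 0..100
    private final Random random;

    UploadPlanner(int percentLarge, Random random) {
        if (percentLarge < 0 || percentLarge > 100) {
            throw new IllegalArgumentException("percentLarge must be between 0 and 100");
        }
        this.percentLarge = percentLarge;
        this.random = random;
    }

    // Returns the size in bytes of the next file to upload.
    int nextFileSize() {
        boolean large = random.nextInt(100) < percentLarge;
        return large ? uniform(1 * MB, 2 * MB) : uniform(10 * KB, 20 * KB);
    }

    // Uniform integer in [low, high], both ends included.
    private int uniform(int low, int high) {
        return low + random.nextInt(high - low + 1);
    }
}
```

If you need exactly 2 large files in every 10, you could instead shuffle a fixed list of 2 large and 8 small entries and cycle through it.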
You will collect data about the following metrics:
- Number of requests served by each server - You will measure the total number of requests that each server is able to service during the run of the experiment.
- Number of MBytes stored on each server - You will measure the total number of MBytes stored on each server at the end of the experiment.
- Client service time - You will measure the average amount of time required for a single upload at the client side.
- Requests/second served by the load balancer (Graduate students only) - You will measure the average number of requests per second that the load balancer is able to service.
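For the client service time metric, one approach is to wrap each upload in a timer based on System.nanoTime() and keep a running total. The class name ServiceTimeRecorder and the choice to leave failed uploads out of the average are assumptions; state whichever choice you make in your report.

```java
import java.util.concurrent.Callable;

// Measurement sketch: accumulates per-upload durations and reports the average.
class ServiceTimeRecorder {
    private long totalNanos;
    private long count;

    // Adds one completed upload's duration to the running total.
    synchronized void record(long nanos) {
        totalNanos += nanos;
        count++;
    }

    // Runs the upload and records how long it took. If the upload throws,
    // the exception propagates and nothing is recorded.
    <T> T time(Callable<T> upload) throws Exception {
        long start = System.nanoTime(); // monotonic, unlike currentTimeMillis()
        T result = upload.call();
        record(System.nanoTime() - start);
        return result;
    }

    synchronized long count() {
        return count;
    }

    // Average service time in milliseconds, or 0 if nothing was recorded.
    synchronized double averageMillis() {
        return count == 0 ? 0.0 : totalNanos / 1e6 / count;
    }
}
```

The same pattern works on the server side for counting requests and bytes: update a synchronized counter at the end of each successfully handled request.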
Results
You will submit a written report that contains an overview of
your implementation, an overview of your experimental setup, and
the results of your experiments. In this report, make sure to
note any conditions or observations affecting your results. For
example, you might notice that the same experiment yields
different results on different days/times. Why? Well, it could
be that the load on the machine varies. Do your best to come up
with an explanation for why you see these effects.
Your results will consist of several graphs, each accompanied by
at least one paragraph summarizing what the graph shows as well as
the findings evident from the graph. You should explain the
general trends you see (for example, "Not surprisingly, as the
time between requests decreases, the number of requests per second
served by the load balancer increases.") and also point out and
explain any anomalies. This latter part is the most interesting,
and you should be sure to explain any and all strange behavior.
As a guide, undergraduates should have 6 graphs. All
will use Algorithm 1. The first three will fix parameter 1 (the
time between uploads) to some reasonable value (e.g., 10 seconds)
and vary parameter 2 (the percentage of large files) from 0 to
100. Graphs 1-3 will report metrics 1-3. The second three will
fix parameter 2 to some reasonable value (e.g., 50) and vary
parameter 1 from 0 to 30 seconds. Graphs 4-6 will report metrics
1-3.
Graduate students should have 8 graphs. The first four
will fix parameter 1 (the time between uploads) and vary
parameter 2 (the percentage of large files) as described above for
the undergraduate graphs. Graphs 1-4 will report metrics 1-4
for all three algorithms. In other words, each graph will
have three lines or sets of bars, one for each algorithm. Graphs
5-8 will fix parameter 2 and vary parameter 1 and report metrics
1-4 for all three algorithms.
Implementation Requirements and Hints
- You may work in groups of 2 or 3.
- If you have not done any previous socket programming, you will use Java for this assignment. If you have done socket programming in the past and would like to use a different language, email the instructor and specify the language you would like to use. Requests to use other languages must be approved.
- Start early. Experimentation can take a long time, particularly when others are using the same machines.
- You will need to think a bit about how to integrate your measurement framework into your testbed. Make all of your measurements as accurate as possible.
- If your experiments yield unstable results, for example the client service time bounces between 5 ms and 3 seconds, run your experiments again.
- Though we will not be testing fault-tolerance or other similar properties, your protocols should handle all possible error conditions. For example, a client should not explode if it tries to contact a server that is down.
- Make sure you correctly deal with pesky issues such as file naming. For example, if two clients upload index.html to the same server, the server should recognize that there are two copies of the file.
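For the file-naming hint, one option is to keep the first copy under its original name and append a numeric suffix to later copies. The suffix scheme and the class name UniqueNames below are assumptions; storing each client's files in a separate directory would be another reasonable design.

```java
import java.io.IOException;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;

// File-naming sketch: never overwrite an earlier upload that has the same name.
class UniqueNames {

    // Creates and returns dir/name, or dir/name.1, dir/name.2, ... if taken.
    // Files.createFile checks for existence and creates the file atomically,
    // so two server threads can never reserve the same path.
    static Path reserve(Path dir, String name) throws IOException {
        for (int copy = 0; ; copy++) {
            Path candidate = dir.resolve(copy == 0 ? name : name + "." + copy);
            try {
                return Files.createFile(candidate);
            } catch (FileAlreadyExistsException e) {
                // Name already used by an earlier upload; try the next suffix.
            }
        }
    }
}
```

The server thread would call reserve() first and then write the uploaded bytes into the returned (empty) file.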
Due 5:30PM - Monday, February 26, 2007
- Complete and submit your working code. Place a copy of your source code in the submit directory /home/submit/cs680-s07/username.
- Turn in a hard copy of your written report containing your results and analysis.
Note: No portion of your code may be copied from any other
source, including another textbook, a web page, or another
student (current or former). You must provide citations for any
sources you have used in designing and implementing your program.
Sami Rollins