Project 1: a P2P client

In this project, you'll integrate the tools you've developed in the labs for this course to build a simple peer-to-peer client for sharing recommendations about movies.

Due date: Thursday April 13 at the start of class. No late assignments will be accepted.

Points: 100 points total (breakdown below)

In this project, you'll build a simple p2p client. It will need to be able to do the following tasks:

In addition, there is some advanced functionality that you can build in if you want to get a good grade on the project (more details below).

Note also: You are welcome to use as much or as little of the code from the labs as you like. There are many places where it could be helpful, but if integration will make your life more difficult, feel free to start from scratch. Also, there are places where your code from CS 601 may be useful. You are also welcome to reuse as much of that as you wish.

Grading

Grades will be assigned as follows:

To turn in: Place a copy of your client in your submit directory. No hardcopy needed. Also, please be prepared to demo your client on April 13.

Building the client

Your client should maintain its list of movies in an XML database. (Update 3/28: you may use a non-XML internal representation, such as a SQL database, if you want. The requirement is that clients use XML to communicate with each other.)

It will communicate with other clients/peers by using RESTful URL requests and replies.

Your client will also need to have some sort of GUI that users can use to interact with it. You may either:

  1. Generate HTML and interact with the client via a browser.
  2. Build a standalone GUI with Qt, Tkinter, .NET, Cocoa, or whatever other widget toolkit you like.

The GUI should provide the user with:

Note: if you are concerned about time, I would suggest generating HTML. There are lots of good reasons to build a standalone GUI, but time savings is not necessarily one of them, contrary to expectations.

There are basically two different approaches you can take to building your client, and each has its pros and cons. For grading purposes, the only requirement is that your client have the necessary functionality and respond correctly to all messages in our protocol. Beyond that, you may build it however you see fit.

  1. Monolithic two-threaded client. This approach will make the most sense for folks using the standalone-GUI approach to things. You will probably want to build two separate threads into your client: one to listen for and service HTTP requests from other peers, and one to control the GUI and interact with the user. You may find the HTTP server you built in 601 to have some useful pieces for this.
  2. Single-interface server process. If you choose to interact with the client using a browser, you may find it easiest to build a single process that services HTTP GETs and either generates HTML (to send back to the user) or XML (to send to other peers). Again, your 601 code will be helpful if you use Java. If you use Python, the CGIHTTPServer class may be helpful. You can use this to call existing CGI scripts if you like, or refactor your existing scripts into your server. If you use C# or C++, you should be able to modify your Java code without too much work.
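
As a rough illustration of how the two approaches overlap, here is a minimal Python 3 sketch of a peer that answers HTTP GETs on a background thread and routes each request by its requestType parameter. The names route, PeerHandler, start_server, and fetch are hypothetical, the reply bodies are placeholders rather than full protocol messages, and treating a request without requestType as coming from the user's browser is an assumption of this sketch, not part of the protocol.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import urlsplit, parse_qs


def route(request_target):
    """Map a GET request target to (content_type, body). Bodies are placeholders."""
    query = urlsplit(request_target).query
    # parse_qs decodes %20 and returns a list per key; keep the first value.
    params = {key: values[0] for key, values in parse_qs(query).items()}
    request_type = params.get("requestType")
    if request_type is None:
        # Assumption: no requestType means the user's browser wants the GUI page.
        return "text/html", "<html><body>search form goes here</body></html>"
    if request_type == "GetInfo":
        return "text/xml", "<reply><type>SearchReply</type></reply>"
    return "text/xml", "<reply><type>UnknownReply</type></reply>"


class PeerHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # self.path is the request target, e.g. "/?requestType=GetInfo&genre=Horror"
        content_type, body = route(self.path)
        data = body.encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", content_type)
        self.send_header("Content-Length", str(len(data)))
        self.end_headers()
        self.wfile.write(data)

    def log_message(self, format, *args):
        pass  # keep the sketch quiet; delete this method to see request logs


def start_server(host="127.0.0.1", port=0):
    """Serve requests on a background thread; port 0 lets the OS pick a free port."""
    server = HTTPServer((host, port), PeerHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def fetch(host, port, target):
    """Tiny client for trying the server out from the same process."""
    conn = http.client.HTTPConnection(host, port, timeout=5)
    try:
        conn.request("GET", target)
        return conn.getresponse().read().decode("utf-8")
    finally:
        conn.close()
```

With the server running on its own thread, the main thread stays free for a standalone GUI. A real peer would bind to an address other peers can reach instead of 127.0.0.1.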

Registering your client

When your client starts up, the first thing it should do is register itself with the address server. For this project, we'll keep things simple and use a single centralized address server. To register itself, your client should use the following RESTful request:

http://boromir.cs.usfca.edu/~brooks/682/peerdb.cgi?requestType=register&name=username&address=ipnumber:port

(it's just a simple cgi script, so please be nice to it ...)

You will receive back the following message:

<reply>
  <type>RegisterReply</type>
  <success>true</success>
  <dateTime>Tue 8 Mar, 2006 11:01:02 </dateTime>
 </reply>
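
A minimal Python 3 sketch of building the registration URL and checking the reply might look like the following. The function names register_url and reply_succeeded are hypothetical, and encoding spaces as %20 (rather than +) is a choice made here to match the example URLs elsewhere in this handout.

```python
import xml.etree.ElementTree as ET
from urllib.parse import quote, urlencode

ADDRESS_SERVER = "http://boromir.cs.usfca.edu/~brooks/682/peerdb.cgi"


def register_url(name, address):
    """Build the register request; address is an "ip:port" string."""
    params = {"requestType": "register", "name": name, "address": address}
    # quote_via=quote encodes spaces as %20; safe=":" leaves ip:port readable.
    return ADDRESS_SERVER + "?" + urlencode(params, safe=":", quote_via=quote)


def reply_succeeded(xml_text, expected_type):
    """True if the reply has the expected <type> and <success> is true."""
    root = ET.fromstring(xml_text)
    return (root.findtext("type", "").strip() == expected_type
            and root.findtext("success", "").strip() == "true")
```

The URL can then be fetched with any HTTP client, for example urllib.request.urlopen, and the body passed to reply_succeeded.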

Similarly, to unregister, do:

http://boromir.cs.usfca.edu/~brooks/682/peerdb.cgi?requestType=unregister&name=

You will receive an unregisterReply in return.

To find out what other peers are currently active, send the following request to the address server:

http://boromir.cs.usfca.edu/~brooks/682/peerdb.cgi?requestType=findAllPeers
The address server will return XML representing the peer database:
<reply>
  <type>UserListReply</type>
  <success>true</success>
  <dateTime>Tue 8 Mar, 2005 11:01:02</dateTime>
  <users>
    <user>
      <name> Joe Smith </name>
      <address> 192.3.55.6:8000 </address>
    </user>
    <user>
      <name> Malcolm X </name>
      <address> 133.66.197.2:8000 </address>
    </user>
    <user>
      <name>bob jones</name>
      <address>132.6.55.1:8000</address>
    </user>
  </users>
</reply>
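
Note that names and addresses in the reply may be padded with whitespace. One way to read the peer list in Python 3 is sketched below; parse_peer_list is a hypothetical helper name, and skipping entries with a missing name or address is an assumption of this sketch.

```python
import xml.etree.ElementTree as ET


def parse_peer_list(xml_text):
    """Return a list of (name, address) pairs from a UserListReply."""
    root = ET.fromstring(xml_text)
    peers = []
    for user in root.iter("user"):
        # Strip the padding that may surround the text inside each element.
        name = (user.findtext("name") or "").strip()
        address = (user.findtext("address") or "").strip()
        if name and address:
            peers.append((name, address))
    return peers


SAMPLE_REPLY = """
<reply>
  <type>UserListReply</type>
  <success>true</success>
  <users>
    <user><name> Joe Smith </name><address> 192.3.55.6:8000 </address></user>
    <user><name>bob jones</name><address>132.6.55.1:8000</address></user>
  </users>
</reply>
"""

peers = parse_peer_list(SAMPLE_REPLY)
```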

Communicating with other peers

Your client should be able to send and respond to the following messages. Note: this protocol is the one that everyone will use. No exceptions or changes.

Request info about movies

A peer should be able to send a RESTful request to another peer asking for movies with particular characteristics. The peer should be able to specify actor, title, director, or genre, as well as combinations of these. The receiving peer should return a document containing all movies that match the specified criteria. For example:

http://valis.cs.usfca.edu:8000?requestType=GetInfo&actor=Jack%20Nicholson&address=123.1.1.1:8001

http://valis.cs.usfca.edu:8000?requestType=GetInfo&director=Scorsese&actor=Al%20Pacino&address=123.1.1.1:8001


http://valis.cs.usfca.edu:8000?requestType=GetInfo&genre=Horror&address=123.1.1.1:8001

The peer receiving the request must respond with an XML message in the following format:
<reply>
  <type>SearchReply</type>
  <success>true</success>
  <dateTime>Tue 8 Mar, 2005 11:01:02</dateTime>
  <data>
    <movies>
      <movie>
        <title>The Shining</title>
        <actors>
          <actor>Jack Nicholson</actor>
          <actor> Shelly Duvall </actor>
        </actors>
        <directors>
          <director> Stanley Kubrick </director>
        </directors>
        <genre> Horror </genre>
        <imageURL> http://www.amazon.com/... </imageURL>
        <rating> 5 </rating>
      </movie>
      ...
    </movies>
  </data>
</reply>

(If multiple movies match a search, they should each have a 'movie' element included.)
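
To make the matching and reply construction concrete, here is a minimal Python 3 sketch that works from a hypothetical in-memory movie list. It assumes that all supplied criteria must match (AND), that matching is a case-insensitive substring test (so director=Scorsese would match a full name), and that success is true whenever the search ran. None of these choices is fixed by the protocol. The imageURL element is omitted for brevity, and the helper names are illustrative only.

```python
import xml.etree.ElementTree as ET
from datetime import datetime

# Hypothetical in-memory data; a real client would load this from its database.
MOVIES = [
    {"title": "The Shining", "actors": ["Jack Nicholson", "Shelly Duvall"],
     "directors": ["Stanley Kubrick"], "genre": "Horror", "rating": 5},
    {"title": "Dr. Strangelove", "actors": ["Peter Sellers", "Slim Pickens"],
     "directors": ["Stanley Kubrick"], "genre": "Comedy", "rating": 5},
]


def matches(movie, criteria):
    """True when every search criterion matches this movie (assumed AND)."""
    values = {
        "title": [movie["title"]],
        "actor": movie["actors"],
        "director": movie["directors"],
        "genre": [movie["genre"]],
    }
    return all(
        any(wanted.lower() in value.lower() for value in values[field])
        for field, wanted in criteria.items()
        if field in values  # skip requestType, address, and other parameters
    )


def timestamp(now=None):
    """Format a time like the samples in this handout, e.g. Tue 8 Mar, 2005 11:01:02."""
    now = now or datetime.now()
    return f"{now:%a} {now.day} {now:%b}, {now:%Y %H:%M:%S}"


def search_reply(criteria, movies=MOVIES):
    """Build a SearchReply document containing every matching movie."""
    reply = ET.Element("reply")
    ET.SubElement(reply, "type").text = "SearchReply"
    ET.SubElement(reply, "success").text = "true"
    ET.SubElement(reply, "dateTime").text = timestamp()
    movies_el = ET.SubElement(ET.SubElement(reply, "data"), "movies")
    for movie in movies:
        if not matches(movie, criteria):
            continue
        movie_el = ET.SubElement(movies_el, "movie")
        ET.SubElement(movie_el, "title").text = movie["title"]
        actors_el = ET.SubElement(movie_el, "actors")
        for actor in movie["actors"]:
            ET.SubElement(actors_el, "actor").text = actor
        directors_el = ET.SubElement(movie_el, "directors")
        for director in movie["directors"]:
            ET.SubElement(directors_el, "director").text = director
        ET.SubElement(movie_el, "genre").text = movie["genre"]
        ET.SubElement(movie_el, "rating").text = str(movie["rating"])
    return ET.tostring(reply, encoding="unicode")
```

Calling search_reply({"requestType": "GetInfo", "actor": "Jack Nicholson"}) yields a reply with a single movie element for The Shining.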

The subelements of a movie element are:

Note: Please do NOT change your XML database to fit exactly this format. (It may change at any time). Instead, keep your own format and use XSLT to turn your search results into this format as needed.

Suggestions

Your peer should also be able to make suggestions to another peer based on the movies that are in its DB.

The format for suggestions is as follows:

http://valis.cs.usfca.edu:8000?requestType=suggestion&title=The%20Shining
The receiving peer should then return a message of type SuggestionReply that contains zero or more movie elements. For example:
<reply>
  <type>SuggestionReply</type>
  <success>true</success>
  <dateTime>Tue 8 Mar, 2005 11:01:02</dateTime>
  <data>
    <movies>
      <movie>
        <title>Dr. Strangelove</title>
        <actors>
          <actor>Peter Sellers</actor>
          <actor> Slim Pickens </actor>
        </actors>
        <directors>
          <director> Stanley Kubrick </director>
        </directors>
        <genre> Comedy </genre>
        <imageURL> http://www.amazon.com/... </imageURL>
        <rating> 5 </rating>
      </movie>
      ...
    </movies>
  </data>
</reply>

For this project, we'll keep the mechanism for generating suggestions simple. If the receiving peer has the movie in its database, it should return all movies that have the same director and one or more of the same actors. (Extending this to do something more interesting would be an excellent project 2).
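
The suggestion rule can be sketched in a few lines of Python 3. The movie list below is hypothetical sample data, and the sketch makes two assumptions that the rule does not spell out: "same director" is read as sharing at least one director, and the requested movie itself is left out of the suggestions.

```python
# Hypothetical sample data for illustration only.
MOVIES = [
    {"title": "Taxi Driver", "directors": ["Martin Scorsese"],
     "actors": ["Robert De Niro", "Jodie Foster"]},
    {"title": "Raging Bull", "directors": ["Martin Scorsese"],
     "actors": ["Robert De Niro", "Joe Pesci"]},
    {"title": "Goodfellas", "directors": ["Martin Scorsese"],
     "actors": ["Robert De Niro", "Ray Liotta", "Joe Pesci"]},
    {"title": "Heat", "directors": ["Michael Mann"],
     "actors": ["Al Pacino", "Robert De Niro"]},
    {"title": "The Shining", "directors": ["Stanley Kubrick"],
     "actors": ["Jack Nicholson", "Shelly Duvall"]},
]


def suggest(title, movies):
    """Return movies sharing a director and at least one actor with `title`."""
    seed = next((m for m in movies if m["title"].lower() == title.lower()), None)
    if seed is None:
        return []  # movie not in our database: zero suggestions
    return [
        m for m in movies
        if m is not seed  # assumption: do not suggest the movie itself
        and set(m["directors"]) & set(seed["directors"])
        and set(m["actors"]) & set(seed["actors"])
    ]


suggested_titles = [m["title"] for m in suggest("Taxi Driver", MOVIES)]
```

Here Heat is excluded because it shares an actor but not a director, and The Shining is excluded because it shares neither.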

Unknown messages

Your peer should also respond nicely to malformed or otherwise incorrect messages. The following XML should be returned to a requestor with an invalid URL:

<reply>
  <type>UnknownReply</type>
  <success>false</success>
  <dateTime>Tue 8 Mar, 2005 11:01:02</dateTime>
  <receivedRequest> args here </receivedRequest>

</reply>

receivedRequest should contain whatever arguments were sent to you.
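
Because the echoed arguments may contain characters such as & that are special in XML, it is worth letting a library do the escaping. A minimal Python 3 sketch follows; unknown_reply is a hypothetical helper name, and passing the raw query string as the echoed arguments is an assumption of this sketch.

```python
import xml.etree.ElementTree as ET
from datetime import datetime


def unknown_reply(received_args):
    """Build an UnknownReply that echoes the arguments we were sent."""
    now = datetime.now()
    reply = ET.Element("reply")
    ET.SubElement(reply, "type").text = "UnknownReply"
    ET.SubElement(reply, "success").text = "false"
    ET.SubElement(reply, "dateTime").text = (
        f"{now:%a} {now.day} {now:%b}, {now:%Y %H:%M:%S}"
    )
    # ElementTree escapes &, <, and > so the echoed text cannot break the XML.
    ET.SubElement(reply, "receivedRequest").text = received_args
    return ET.tostring(reply, encoding="unicode")
```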

Advanced stuff

Picture of other peers: Currently, the only time your peer finds out about other peers is when it first registers and calls findAllPeers. For this step, you will add to your peer the ability to dynamically keep track of other peers. When another peer does a search of your peer, send it an identifySelf message afterward: http://name.of.host?requestType=identifySelf&requesterAddress=ip:port

The remote peer should then respond to the address and port indicated in the request with:

<reply>
  <type>IdentifySelfReply</type>
  <success>true</success>
  <dateTime>Tue 8 Mar, 2005 11:01:02</dateTime>
  <user>
    <name> Joe Smith </name>
    <address> 192.3.55.6:8000 </address>
  </user>

</reply>

Your peer should be able to both send identifySelf requests and respond to them.

Forwarding search: When your peer receives a search request that it is unable to satisfy, it should try to search at least three peers. If any of them return a successful reply, your peer should construct a reply to the original requestor containing this information.
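
One possible shape for the forwarding step, sketched in Python 3: forward the original query string to a few known peers and gather the movie elements from every successful reply. The name forward_search is hypothetical, fetch stands for whatever function your client uses to GET a URL and return the body as text, and stopping at exactly three peers is this sketch's reading of "at least three".

```python
import xml.etree.ElementTree as ET


def forward_search(query, peer_addresses, fetch, max_peers=3):
    """Collect <movie> elements from up to max_peers other peers.

    query is the original query string, e.g. "requestType=GetInfo&genre=Horror";
    peer_addresses is a list of "ip:port" strings.
    """
    found = []
    for address in peer_addresses[:max_peers]:
        try:
            root = ET.fromstring(fetch("http://" + address + "?" + query))
        except (OSError, ET.ParseError):
            continue  # unreachable peer or malformed reply; try the next one
        if (root.findtext("success") or "").strip() == "true":
            found.extend(root.iter("movie"))
    return found
```

The returned elements can be appended to the movies element of the reply your peer sends back to the original requestor.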

Culling peer list. If your client tries to contact a peer that is no longer alive, it should remove that peer from its list.
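
A small Python 3 sketch of culling on failure is shown below. The name fetch_or_cull is hypothetical, peers is assumed to be a dict mapping peer names to "ip:port" addresses, and fetch is whatever function your client uses to GET a URL. Treating any OSError as "no longer alive" is a simplification; you may prefer to retry before removing a peer.

```python
def fetch_or_cull(peers, name, query, fetch):
    """Query one peer; remove it from the peers dict if it cannot be reached."""
    try:
        return fetch("http://" + peers[name] + "?" + query)
    except OSError:
        # Refused connections, timeouts, and urllib's URLError all derive from OSError.
        del peers[name]
        return None
```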

More expressive queries. Your peer should be able to send and respond to queries representing any logical formulation. For example, to find all movies featuring Steve Martin or Robin Williams, your request would look like: http://name.of.host?requestType=logicalInfoRequest&query=actor(Steve%20Martin)ORactor(Robin%20Williams)

We'll use the following representation:


Resources

Serving HTTP:

Back in the days of 601, you built an HTTP server as a project. You may find this useful. If you've forgotten how it worked, the project spec is here. If you're working in Java, you should be able to reuse a fair amount of your 601 code. The C# solution should look very much like the Java solution. (you only need to be able to handle HTTP GET).

If you're working in Python, here are some notes on HTTP serving with Python's CGIHTTPServer class.

Finding out your IP address and dealing with firewall issues:

If you're planning to work from home and you are using NAT and/or a firewall, you may need to do a little playing around to make sure that other processes can correctly find your peer, and that messages are not blocked by your firewall.

Also, please feel free to use the cs682 mailing list to discuss interoperation or clarification issues with each other - you shouldn't share code or solutions, but you're more than welcome to use the list to discuss things such as useful external resources, additional messages you plan to support, or techniques for dealing with NAT.