HTTP Server

Due date: Apr 2, 2009

Your goal is to produce a simple HTTP server that accepts multiple simultaneous requests. The server will respond to HTTP protocol GET requests and spit the requested file (HTML, image, etc...) back to the requesting browser. The server must also be able to execute Java code on the server side and send the results back to the browser. It is possible to write this server in very few lines, but they have to be the right lines. ;)

This project exercises your understanding of the following concepts and corresponding Java APIs:

  1. Java's ability to dynamically load classes
  2. Reading / writing text files / streams
  3. Socket programming and basic server / client strategies
  4. Multi-threaded programming

Project deliverables

You will make a server that listens at port 8080 and launches a thread to handle each client request. Your server must be thread safe in that multiple client connections should not interfere with each other's data.
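As a rough illustration of the thread-per-connection idea, here is a minimal sketch. The class name AcceptLoopSketch, its methods, and the placeholder page are hypothetical and not part of the required design. Each handler works only with its own local variables, which is one simple way to keep connections from interfering with each other.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

/** Hypothetical sketch: hand every accepted connection to its own thread. */
class AcceptLoopSketch {
    /** Accepts connections until the server socket is closed. */
    static void serve(ServerSocket server) {
        while (!server.isClosed()) {
            try {
                Socket client = server.accept();
                new Thread(() -> handle(client)).start();
            } catch (IOException e) {
                // accept() throws once the server socket is closed; the loop
                // condition then ends the method. Other failures are retried.
            }
        }
    }

    /** Placeholder handler: reads the request, sends a fixed page. */
    static void handle(Socket client) {
        try (Socket s = client) {
            BufferedReader in = new BufferedReader(
                    new InputStreamReader(s.getInputStream(), StandardCharsets.US_ASCII));
            String line;
            while ((line = in.readLine()) != null && !line.isEmpty()) {
                // a real handler would parse these lines
            }
            byte[] body = "<html><body>placeholder</body></html>"
                    .getBytes(StandardCharsets.US_ASCII);
            OutputStream out = s.getOutputStream();
            out.write(("HTTP/1.1 200 OK\r\nContent-Type: text/html\r\nContent-Length: "
                    + body.length + "\r\n\r\n").getBytes(StandardCharsets.US_ASCII));
            out.write(body);
            out.flush();
        } catch (IOException e) {
            // Only this client's thread ends; the accept loop is unaffected.
        }
    }
}
```

A main method could start it with AcceptLoopSketch.serve(new ServerSocket(8080)).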

A client may:

  1. request a file from the disk as a filename relative to a root directory defined as an argument to your server's main program (you can use a directory under your home directory like ~parrt/public_html for testing). When you run your server it should respond to browser requests to URLs like the following:

    http://localhost:8080/test.html

    by returning the contents of (assuming the root of ~parrt/public_html)

    ~parrt/public_html/test.html

    Note that you must handle .gif files and other binary data types properly by setting the HTTP Content-Type header.
  2. request that the server execute some Java code that spits text back to the client. The URL will look like

    http://localhost:8080/java/fully-qualified-classname

    and your server must create an instance of fully-qualified-classname and then call its method service() (defined in superclass ServerSideJava; see below). You should assume that every class is on the CLASSPATH that was visible to the server when it started. The command-line document root argument has NOTHING to do with which classes the server can see!
  3. request a directory listing via URL

    http://localhost:8080/java/http.Directory/dir

    which executes the http.Directory ServerSideJava code and prints out the files and directory listing for dir under the document root.
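One possible way to tell the three kinds of request apart is to look for the /java/ prefix and pull out the class name that follows it. The sketch below is only an illustration; RouteSketch and classNameOf are hypothetical names.

```java
/** Hypothetical sketch: decide whether a URL path names a Java class to run. */
class RouteSketch {
    /**
     * Returns the fully-qualified class name for paths of the form
     * /java/classname or /java/classname/anything, and null for every
     * other path (which the caller can treat as a plain file request).
     */
    static String classNameOf(String path) {
        final String prefix = "/java/";
        if (!path.startsWith(prefix)) {
            return null;
        }
        String rest = path.substring(prefix.length());
        int slash = rest.indexOf('/');
        String name = slash < 0 ? rest : rest.substring(0, slash);
        return name.isEmpty() ? null : name;
    }
}
```

With this, /java/http.Directory/dir yields http.Directory, and /test.html yields null.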

Errors must be handled in a sane fashion. Specifically, you must handle:

  1. bad request (you only handle GET requests not POST etc...)
  2. missing file
  3. missing class, or a class that is not a subclass of ServerSideJava
  4. missing or invalid directory for the directory listing code

You must send an HTML "page" back that contains the error similar to what you see from a real web server.
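An error "page" can be as small as a title and one line of detail. The following sketch shows one way to build it; ErrorPageSketch, errorPage, and escape are hypothetical names, and the escaping step is an extra precaution of mine so that a requested name containing < or > does not break the page.

```java
/** Hypothetical sketch: build a tiny HTML error page as a String. */
class ErrorPageSketch {
    /** Replaces the characters that would otherwise be read as HTML markup. */
    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    /** Returns a complete HTML document describing the error. */
    static String errorPage(String title, String detail) {
        return "<html><head><title>" + escape(title) + "</title></head>"
             + "<body><h1>" + escape(title) + "</h1>"
             + "<p>" + escape(detail) + "</p></body></html>";
    }
}
```

For example, errorPage("File not found", "/missing.html") gives a page whose heading is "File not found".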

HTTP GET commands

As a simple first step, you might implement a single-threaded server that listens at 8080 for a connection and just prints out whatever comes in upon connection. It is important to understand the WWW protocols, at least in concept. You will find something like the following:

GET / HTTP/1.1
Accept: */*
Accept-Language: en-us
Accept-Encoding: gzip, deflate
User-Agent: Mozilla/5.0 (Macintosh; U; PPC Mac OS X; en-us) AppleWebKit/412.6 (KHTML, like Gecko) Safari/412.2
Connection: keep-alive
Host: hrn53512:8080

Note that there is a blank line afterwards.

The first line is the most important as it is the command, GET. You only have to accept GET commands (i.e., ignore POST, etc...).
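Picking the command and the requested path out of that first line might look like the sketch below. RequestLineSketch and pathOfGet are hypothetical names; returning null is just one way to signal a bad request to the caller.

```java
/** Hypothetical sketch: accept only well-formed GET request lines. */
class RequestLineSketch {
    /**
     * Returns the requested path from a line such as "GET /test.html HTTP/1.1",
     * or null if the line is missing, malformed, or not a GET.
     */
    static String pathOfGet(String requestLine) {
        if (requestLine == null) {
            return null;
        }
        String[] parts = requestLine.trim().split("\\s+");
        if (parts.length != 3
                || !parts[0].equals("GET")
                || !parts[2].startsWith("HTTP/")) {
            return null;
        }
        return parts[1];
    }
}
```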

HTTP Response

When processing these requests, note that you must read the entire set of lines sent to you from the browser or your browser may hang and not read the result properly.
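Reading "the entire set of lines" amounts to reading until the blank line that ends the request headers. A minimal sketch, with the hypothetical names HeaderReaderSketch and readRequestLines:

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: consume the request up to the blank line. */
class HeaderReaderSketch {
    /**
     * Returns the request line and header lines, stopping at the first
     * empty line or at end of stream. The blank line itself is consumed
     * but not returned.
     */
    static List<String> readRequestLines(BufferedReader in) throws IOException {
        List<String> lines = new ArrayList<>();
        String line;
        while ((line = in.readLine()) != null && !line.isEmpty()) {
            lines.add(line);
        }
        return lines;
    }
}
```

The first element of the returned list is the GET line; the rest are headers that this project can ignore.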

For each GET request for a file, your server must load the file from disk and send it to the browser. You should set at least the Content-Type header in the response so the browser knows what kind of data it is.

See URLConnection for information on how to determine the type. Specifically, you will call:

public static String guessContentTypeFromName(String fname);

Once you have the type, you need to tell the querying browser. Add this to your response:

Content-Type: text/html
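Note that guessContentTypeFromName returns null when it does not recognize the file name, so you probably want a fallback. Here is a small sketch; ContentTypeSketch and contentTypeFor are hypothetical names, and application/octet-stream is my choice of default, not a requirement of the project.

```java
import java.net.URLConnection;

/** Hypothetical sketch: map a file name to a Content-Type value. */
class ContentTypeSketch {
    /** Returns the guessed MIME type, or a generic binary type if unknown. */
    static String contentTypeFor(String fileName) {
        String type = URLConnection.guessContentTypeFromName(fileName);
        return type != null ? type : "application/octet-stream";
    }
}
```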

In general, your response will be of the form (unless there is an error):

HTTP/1.1 200 OK
Content-Type: <MIME-type>
Content-Length: <length-in-bytes>

The trailing blank line indicates to the browser that now the data is coming. After this line, you should send text or binary data.
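Put together, sending a successful response could look like the sketch below. ResponseSketch and writeOk are hypothetical names. The headers are written as ASCII bytes with \r\n line endings, and the body is written as raw bytes so that binary data passes through untouched.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

/** Hypothetical sketch: write the status line, headers, blank line, body. */
class ResponseSketch {
    static void writeOk(OutputStream out, String contentType, byte[] body)
            throws IOException {
        String head = "HTTP/1.1 200 OK\r\n"
                    + "Content-Type: " + contentType + "\r\n"
                    + "Content-Length: " + body.length + "\r\n"
                    + "\r\n";                       // blank line ends the headers
        out.write(head.getBytes(StandardCharsets.US_ASCII));
        out.write(body);                            // text or binary, unmodified
        out.flush();
    }
}
```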

Browsers will typically be sending your server HTTP requests indicating they understand HTTP protocol version 1.1 at least. Response headers have been part of the protocol since HTTP/1.0; only the original HTTP/0.9 had none. For example, upon a simple request for

http://www.cs.usfca.edu/~parrt/mud3.gif

here is what Apache sends back:

HTTP/1.1 200 OK
Date: Thu, 18 Sep 2003 20:41:33 GMT
Server: Apache/2.0.40 (Red Hat Linux)
Last-Modified: Fri, 17 Jan 2003 21:59:12 GMT
ETag: "a981bb-36fd-48948c00"
Accept-Ranges: bytes
Content-Length: 14077
Connection: close
Content-Type: image/gif

<the mud3.gif data>

whereas HTTP/0.9 would simply send back the data.

Note that you cannot use a Reader to load binary files from the disk, because a Reader decodes bytes into characters and can corrupt the data. An InputStream reads raw bytes, so it is safe to use one to load any kind of file. For text files, you can assume pure ASCII text (8-bit text files) for this project.
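For instance, a file can be pulled into a byte array with a plain read loop. This is only a sketch; FileLoaderSketch and load are hypothetical names, and the 4096-byte buffer size is arbitrary.

```java
import java.io.ByteArrayOutputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

/** Hypothetical sketch: read any file, text or binary, as raw bytes. */
class FileLoaderSketch {
    static byte[] load(File f) throws IOException {
        try (InputStream in = new FileInputStream(f);
             ByteArrayOutputStream buf = new ByteArrayOutputStream()) {
            byte[] chunk = new byte[4096];
            int n;
            while ((n = in.read(chunk)) != -1) {   // -1 signals end of file
                buf.write(chunk, 0, n);
            }
            return buf.toByteArray();
        }
    }
}
```

The length of the returned array is the value to put in the Content-Length header.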

Executing server side Java

A basic web server can only send static HTML pages back to a browser, but for sites like Amazon.com you need to generate dynamic content. Dynamic content implies you have to run some code on the server to generate the output rather than just loading it from a file.

In order to run a Java class on the server, the classes must all answer the same message. I have created a class called ServerSideJava from which your server-side-java classes should derive. It defines a method called service() that you must implement in a subclass. Your server will create an instance of the subclass and then call this service method, asking it to print something to the OutputStream.

package http;

import java.io.*;

/** Any subclass of this will be executable by your http server */
public abstract class ServerSideJava {
    /** Invoked by your http server to print something meaningful
     *  to 'out'.
     *  The URL is the entire URL seen by the server; that is,
     *  the entire "file name" after the HTTP GET command your
     *  server sees.  For example,
     *  GET /index.html => URL=/index.html
     *  GET /java/http.TheDate => URL=/java/http.TheDate
     *  GET /java/http.Directory/a => URL=/java/http.Directory/a
     *
     *  The rootDir is the directory where HTML files are located and
     *  is passed in from command line argument.
     *
     *  Your client handler must pass the rootDir, the URL, and the
     *  OutputStream to this method.
     */
    public abstract void service(String rootDir,
                                 String URL,
                                 OutputStream out) throws IOException;
}

In other words, if I request

http://localhost:8080/java/http.TheDate

Then you must:

  1. Get a Class object corresponding to fully-qualified class http.TheDate.
  2. Create an instance of that class.
  3. If that class is not a kind of ServerSideJava object, send an error message such as
    Java class http.Server not ServerSideJava
    
    to the browser. Also handle the case where the class is not found.
  4. Invoke method service(document-root,URL,out) on the instance. The out variable is the output stream you opened on the socket going back to the client.

This amounts to:

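String className = ...compute from URL...;
Class<?> newClass = Class.forName(className);
// Class.newInstance() is deprecated since Java 9; go through the constructor instead
Object o = newClass.getDeclaredConstructor().newInstance();
((ServerSideJava)o).service(dirWhereHTMLFilesAre, incomingURL, outputStream);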
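A slightly fuller sketch of those steps, including the two error cases, might look like this. Everything here is illustrative: JavaLoaderSketch, run, and HelloPage are hypothetical names, and the ServerSideJava class in the block is a stand-in copy without the http package so that the example is self-contained. The sketch checks the type before creating the instance, and it returns an error message (or null on success) instead of writing an error page.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.PrintStream;

/** Stand-in for http.ServerSideJava so this sketch compiles on its own. */
abstract class ServerSideJava {
    public abstract void service(String rootDir, String URL, OutputStream out)
            throws IOException;
}

/** Hypothetical example page used to exercise the loader. */
class HelloPage extends ServerSideJava {
    public void service(String rootDir, String URL, OutputStream out) {
        PrintStream p = new PrintStream(out);
        p.println("<html><body>hello</body></html>");
        p.flush();
    }
}

/** Hypothetical sketch: load a class by name and run it if it qualifies. */
class JavaLoaderSketch {
    /** Returns null on success, or an error message to show the browser. */
    static String run(String className, String rootDir, String url,
                      OutputStream out) throws IOException {
        try {
            Class<?> c = Class.forName(className);
            if (!ServerSideJava.class.isAssignableFrom(c)) {
                return "Java class " + className + " not ServerSideJava";
            }
            ServerSideJava page =
                    (ServerSideJava) c.getDeclaredConstructor().newInstance();
            page.service(rootDir, url, out);
            return null;
        } catch (ClassNotFoundException e) {
            return "Java class " + className + " not found";
        } catch (ReflectiveOperationException e) {
            // e.g. abstract class, or no accessible no-argument constructor
            return "Java class " + className + " could not be instantiated";
        }
    }
}
```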

Here is what a simple ServerSideJava object looks like:

package http;

import java.util.Date;
import java.io.*;

public class TheDate extends ServerSideJava {
    public void service(String rootDir,
                        String URL,
                        OutputStream out)
        throws IOException
    {
        // here URL, rootDir parameters are not needed
        PrintStream p = new PrintStream(out);
        p.println("<html><body>The date is "+new Date()+"</body></html>");
        p.flush();
    }
}

We will test your server with a few ServerSideJava objects of our own!

As part of your project, you must build a class called http.Directory that derives from ServerSideJava and sends a list of files in the directory specified in the URL to the browser. In other words, if the URL is

http://localhost:8080/java/http.Directory

(with or without a trailing '/' character) you would return a list of all files and directories in the document root (which is passed in from the server to the client and was specified on the command-line). If the URL is

http://localhost:8080/java/http.Directory/a/b

or

http://localhost:8080/java/http.Directory/a/b/

you must return a list of files in relative directory a/b under the document root. So the directory to look at is encoded in the URL. You must parse it out.
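Parsing the directory out of the URL can be done with a little string trimming. The sketch below is one way to do it; DirectoryUrlSketch, PREFIX, and relativeDir are hypothetical names.

```java
/** Hypothetical sketch: extract the relative directory from the URL. */
class DirectoryUrlSketch {
    static final String PREFIX = "/java/http.Directory";

    /**
     * Returns the directory relative to the document root, without leading
     * or trailing slashes. An empty string means the document root itself.
     */
    static String relativeDir(String url) {
        String rest = url.startsWith(PREFIX) ? url.substring(PREFIX.length()) : url;
        return rest.replaceAll("^/+", "").replaceAll("/+$", "");
    }
}
```

The directory to list is then new File(rootDir, relativeDir(URL)), which should be checked with isDirectory() before listing.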

Notes

  1. Your http server must be able to spit back a complicated web page and have it displayed properly in a browser. A web page typically has many images, which the browser will also request from your server. Make sure this works. You can simply go to a USF webpage and "Save as..." from the browser to your local disk under your public_html directory. Then request this via your server.
  2. I have found that I must tell the socket to "shutdown" before closing streams and the socket otherwise I get an error dialog box about "connection reset by peer." I added this and the dialogs disappeared:
    socket.shutdownInput();
    socket.shutdownOutput();
    
  3. Use package http for your classes and use class http.Server as your main class.
  4. If you encounter an error such as a missing file, a missing Java class, or whatever, send back an error message as text to the browser. Don't worry about sending back the appropriate HTTP error code. For other exceptions like IOException, you can let the client processing thread die, but you can't kill the server.
  5. You will notice that your browser is asking for file /favicon.ico, which is a way for a browser to get a little icon to stick in the URL window. Go to java.sun.com and you'll see a little "Duke" character next to the URL. Handle it like any other missing file: send an error message back to the browser (which it will ignore).
  6. Your program's main class http.Server must take the document root as an argument. Your code must use this value; that is the only way we'll be able to specify directories during our testing.

Submission

You will submit a jar file called http.jar containing source and *.class files (include http.ServerSideJava and http.Directory) into the svn repository. You must place http.jar into http/trunk/lib in svn.

Please bring a printout of your project to class at 3:30.

Grading

I will run your program via

java -cp http.jar:parrt-classpath http.Server document-root

where document-root is a directory I will specify during testing--your program must read it from args[0]. If the arguments are not handled in exactly this manner or if your http.Server class fails to respond or compile etc..., I will deduct 10%!

Your grade is from 0..10. Server-side Java is worth 4 points, GET processing 4 points, and error handling 2 points. This project is important as it's 10% of your final grade.

Exceptions in your server that cause it to fail result in a 0 for that particular feature (GET, Java execution).

You may discuss this project with anybody you want and may look at any code on the internet except for a classmate's code. You should physically code this project yourself but can use all the help you find other than cutting-n-pasting code from a classmate or the web.