Project 4: Web Server (v1.0)

Starter repository on GitHub: https://classroom.github.com/a/KdbymZzs

HTTP Messages

An HTTP message consists of three main parts: a start line, header fields, and an optional body.

Header fields are separated from one another by newlines. However, instead of the usual \n escape sequence for a newline, standards-compliant HTTP servers and clients use \r\n. Servers should also ignore any empty lines at the beginning of a message.

The header fields and body are separated by a single blank line (i.e., a line that only contains \r\n).
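To make the framing rules concrete, here is a minimal C sketch that skips leading empty lines and locates the blank line that ends the header section. The helper names skip_leading_blank_lines and find_header_end are made up for illustration and are not part of the starter code; the sketch also assumes the whole header section is already sitting in one buffer.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns how many bytes of empty lines (bare "\r\n")
 * appear at the start of the message so the caller can skip past them. */
size_t skip_leading_blank_lines(const char *buf, size_t len)
{
    size_t off = 0;
    while (off + 2 <= len && buf[off] == '\r' && buf[off + 1] == '\n') {
        off += 2;
    }
    return off;
}

/* Hypothetical helper: returns the offset of the first byte after the
 * "\r\n\r\n" sequence that separates the header fields from the body,
 * or -1 if the blank line has not been received yet. */
long find_header_end(const char *buf, size_t len)
{
    for (size_t i = 0; i + 4 <= len; i++) {
        if (memcmp(buf + i, "\r\n\r\n", 4) == 0) {
            return (long) (i + 4);
        }
    }
    return -1;
}
```

A return value of -1 from find_header_end would typically mean the server should read more data from the socket before trying to parse the request.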

There are two types of HTTP messages: requests and responses.

HTTP Request Handling

Client applications send HTTP request messages to the server to retrieve content, send data, etc. In our case, we’re only concerned with GET requests, which retrieve information identified by a Request-URI. Here’s an example GET request sent from the Safari web browser, asking to retrieve /index.html.

GET /index.html HTTP/1.1
Host: localhost:8080
Upgrade-Insecure-Requests: 1
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_4) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/12.1 Safari/605.1.15
Accept-Language: en-us
Accept-Encoding: gzip, deflate
Connection: keep-alive

Notice that there is no body content here – nothing after the blank line. This is because the header fields contain all the information necessary to request a file from the server.
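Since only the start line matters for locating the file, one possible way to pull out the Request-URI is with sscanf. This is only a sketch: parse_request_line and URI_MAX are hypothetical names, and the buffer sizes are arbitrary choices rather than limits from the HTTP specification.

```c
#include <stdio.h>
#include <string.h>

#define URI_MAX 1024 /* Arbitrary limit chosen for this sketch */

/* Hypothetical helper: extracts the Request-URI from a start line such as
 * "GET /index.html HTTP/1.1". Returns 0 on success, or -1 if the line is
 * malformed or the method is something other than GET. */
int parse_request_line(const char *line, char uri[URI_MAX])
{
    char method[8];
    char version[16];

    /* The field widths keep sscanf from overflowing the buffers */
    if (sscanf(line, "%7s %1023s %15s", method, uri, version) != 3) {
        return -1;
    }

    if (strcmp(method, "GET") != 0) {
        return -1; /* This project only handles GET requests */
    }

    if (strncmp(version, "HTTP/", 5) != 0) {
        return -1;
    }

    return 0;
}
```

With the Safari request shown earlier, the start line would yield the URI /index.html; the remaining header fields can be skipped for this project.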

GET requests may specify directories as well as files. If the URI refers to a directory, the web server should append index.html to the requested path; e.g., a request for / should resolve to /index.html.
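The directory rule can be sketched in C as follows. The helper name resolve_uri is hypothetical, and the sketch makes a simplifying assumption: a URI counts as a directory only if it ends in a slash.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: copies the path to serve into out, appending
 * index.html when the URI looks like a directory. Assumption: a directory
 * request is one whose URI ends in '/'. Returns 0 on success, or -1 if
 * the result does not fit in out. */
int resolve_uri(const char *uri, char *out, size_t out_sz)
{
    size_t len = strlen(uri);
    const char *suffix = "";

    if (len > 0 && uri[len - 1] == '/') {
        suffix = "index.html";
    }

    int n = snprintf(out, out_sz, "%s%s", uri, suffix);
    if (n < 0 || (size_t) n >= out_sz) {
        return -1;
    }
    return 0;
}
```

A more thorough version might call stat() on the resolved path and test S_ISDIR, which would also catch directory URIs that lack a trailing slash.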

HTTP Responses

After the web server receives a request, it will send a response. Responses are formatted similarly to requests; here’s an example of a successful response (status code 200) that includes a body (‘Hello world!’):

HTTP/1.1 200 OK
Date: Tue, 05 May 2020 23:53:37 GMT
Content-Length: 12

Hello world!

In general, the server opens the file that was requested, reads it, and writes its contents into the body of the message. If the file doesn’t exist, the server will return a 404 status code and send an error message as the body content. For most responses, the Date header field is required and must be formatted as shown above.
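One way to produce a header block in this format is with strftime and snprintf. The sketch below is only an illustration: format_headers is a hypothetical helper, and it emits just the Date and Content-Length fields from the example response.

```c
#include <stdio.h>
#include <time.h>

/* Hypothetical helper: writes the status line, the Date and Content-Length
 * header fields, and the terminating blank line into buf. Returns the
 * number of bytes written, or -1 on failure. */
int format_headers(char *buf, size_t buf_sz, int status, const char *reason,
        long content_length, time_t now)
{
    /* gmtime() is not thread safe; a threaded server may prefer gmtime_r() */
    struct tm *utc = gmtime(&now);
    if (utc == NULL) {
        return -1;
    }

    /* Produces a date such as "Thu, 01 Jan 1970 00:00:00 GMT" */
    char date[64];
    if (strftime(date, sizeof(date), "%a, %d %b %Y %H:%M:%S GMT", utc) == 0) {
        return -1;
    }

    int n = snprintf(buf, buf_sz,
            "HTTP/1.1 %d %s\r\n"
            "Date: %s\r\n"
            "Content-Length: %ld\r\n"
            "\r\n",
            status, reason, date, content_length);
    if (n < 0 || (size_t) n >= buf_sz) {
        return -1;
    }
    return n;
}
```

For a missing file, the same helper could be called with 404 and "Not Found", with content_length set to the length of the error message that is sent as the body.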

Implementation

This mini-project is an extension of Lab 8. Use the chat server code from the lab to build your web server. You’ll need to make some modifications to do this:

To test your server, forward the remote HTTP port to your local machine. Here, I’m forwarding port 8080 on my VM to port 8080 on my local machine. Then I can simply navigate to http://localhost:8080 to test my code:

ssh snuggly-bunny -L 8080:localhost:8080

You can also do this through gojira instead (substitute your own IP address):

ssh gojira -L 8080:192.168.122.103:8080

Execution Flow

The following steps occur when resolving a web request:

  1. The client (browser) connects to the server.
  2. The client sends an HTTP request containing the Request-URI.
  3. The server locates the file; if it exists, the server determines the file size.
  4. The server sends the response headers back to the client with write().
  5. The server sends the file body to the client with sendfile().
  6. The client renders the web page, file, etc.
  7. The server waits for the next request. (Note: the connection is not necessarily closed.)

Grading