Java Servlets

Big picture

Re-examine the HTTP server project. Your program allowed browsers to fetch files remotely and also to launch Java code that ran on the server machine and sent HTML back whenever the browser accessed a special URL (/java/...). That Java code satisfied the ServerSideJava interface.

A servlet is the exact same concept: a piece of Java code that runs on the server in response to a browser request. Servlets support the request/response client/server model and can service HTTP GET and POST requests. They are a replacement for CGI scripts (cgi-bin), PHP, and similar server-side technologies.

Many dynamic pages on the net are generated by Java code on a server and spit back to the browser, which pretends everything is a static file.

Another way to think of a GET request is as a method call over the internet (this is the basic idea behind REST) where method parameters are URL parameters. The return value is simply the (usually XML) data spit back by the server. For example, here is how to access Yahoo's search engine via their REST interface to find information on the movie Serenity:

http://api.search.yahoo.com/WebSearchService/V1/webSearch?appid=YahooDemo&query=serenity+movie

One could view this as an implementation of this method call:

WebSearchService.webSearch("YahooDemo", "serenity+movie");

See Creating a REST Request for Yahoo! Search Web Services for more information. Again, though, REST is just a way to think about accessing dynamic data from the web.
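To make the "GET request as method call" view concrete, here is a minimal sketch in plain Java that turns a service URL plus named arguments into the GET URL shown earlier. RestUrl and build are hypothetical names invented for this illustration; they are not part of any Yahoo! or servlet API, and the sketch assumes Java 10 or later for the Charset overload of URLEncoder.encode.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper: turns a "method call" into the GET URL that carries it.
class RestUrl {
    static String build(String serviceUrl, Map<String, String> params) {
        StringJoiner query = new StringJoiner("&");
        for (Map.Entry<String, String> p : params.entrySet()) {
            // Form-style encoding: ' ' becomes '+', '&' becomes %26, and so on.
            query.add(URLEncoder.encode(p.getKey(), StandardCharsets.UTF_8) + "="
                    + URLEncoder.encode(p.getValue(), StandardCharsets.UTF_8));
        }
        return params.isEmpty() ? serviceUrl : serviceUrl + "?" + query;
    }

    public static void main(String[] args) {
        Map<String, String> params = new LinkedHashMap<>(); // keeps argument order
        params.put("appid", "YahooDemo");
        params.put("query", "serenity movie");
        System.out.println(build(
            "http://api.search.yahoo.com/WebSearchService/V1/webSearch", params));
    }
}
```

The method name lives in the URL path and each argument becomes one name=value pair, which is all the "method call over the internet" analogy amounts to on the wire.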

A POST request is almost the opposite of a GET request: it is meant to send data to the server from HTML forms (i.e., pages with <form> tags and submit buttons). POST pages normally process the data and then redirect the browser to another page that says anything from "thanks" to "here is your requested data" (like when doing airline reservations). You can think of GET services as display pages and POST services as processing pages.

You will have one servlet execute per page for your site; each servlet acquires the appropriate data and then computes an HTML page, which is sent back to the browser.

Web servers normally follow a model-view-controller (MVC) architecture where the web server is the controller, the pages collectively represent the view, and your core system usually embodies the model (such as EmailManager, ...). You can use Jetty or your own HTTP server for this project; I'd recommend Jetty. ;)

What is JSP? Java Server Pages are "inside out" servlets where, instead of Java with embedded print HTML statements, you have HTML with embedded Java code snippets. We will discuss these in more detail later.

In principle, "skins" or multiple web page "looks" are possible, but they require a strictly separated model and view. JSP encourages entangling model and view, thus preventing skins. StringTemplate, something I built during years of server development, strictly enforces separation of model and view. With that separation, you can radically change the look of your site with trivial effort [demo jguru's guest/user/premium look].

Where is the main method in a server? For most deployed applications, there isn't one. The first execution of one of your pages must launch the application initiation procedures. In our case, you will be launching Jetty, the web server, from a main method so you could in principle do whatever initialization you wanted before the first page is served.

HTTP is stateless. How can your server know when the same person comes back to the site? That is, when I click on a link on a site that takes me to another page on the same site, how does the server know that a particular GET request is from the same person? Further, how does the server track a chunk of data for each person logged into the site? For example, you want to track their name, ID, and pages visited perhaps. Sessions are used for that and we will worry about them in another lecture. But they exist, and every servlet can ask for a site visitor's session object across pages. These sessions are normally implemented via cookies, little pieces of key/value pair data stored on a person's client machine. Each request to site foo.com sends all foo.com cookies on the client machine to the server along with each GET/POST request.
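The cookie-based session idea can be sketched in plain Java without any servlet classes. Everything here is an assumption made for illustration: SessionStore, the cookie name SID, and the method names are invented, and a real servlet container does this work for you behind the session object it hands to each servlet.

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch of server-side sessions keyed by a cookie value.
class SessionStore {
    static final String COOKIE = "SID"; // made-up cookie name
    private final Map<String, Map<String, Object>> sessions = new ConcurrentHashMap<>();

    // Extract one cookie from a Cookie header value such as "a=1; b=2", or null.
    static String cookieValue(String cookieHeader, String name) {
        if (cookieHeader == null) return null;
        for (String pair : cookieHeader.split(";")) {
            String[] kv = pair.trim().split("=", 2);
            if (kv.length == 2 && kv[0].equals(name)) return kv[1];
        }
        return null;
    }

    // Returns the session id for this request, creating a new session when the
    // browser sent no known id. The server would then send the id back to the
    // browser in a "Set-Cookie: SID=<id>" response header.
    String sessionIdFor(String cookieHeader) {
        String id = cookieValue(cookieHeader, COOKIE);
        if (id == null || !sessions.containsKey(id)) {
            id = UUID.randomUUID().toString();
            sessions.put(id, new ConcurrentHashMap<>());
        }
        return id;
    }

    // Per-visitor data (name, ID, pages visited, ...) for a session id.
    Map<String, Object> data(String id) { return sessions.get(id); }

    public static void main(String[] args) {
        SessionStore store = new SessionStore();
        String id = store.sessionIdFor(null);                 // first visit: no cookie yet
        store.data(id).put("name", "parrt");
        String again = store.sessionIdFor("theme=dark; SID=" + id); // a later request
        System.out.println(id.equals(again) + " " + store.data(again).get("name"));
    }
}
```

The only thing stored on the client is the opaque id; the visitor's data stays in the server's map, which is why the server can recognize the same person across otherwise unrelated requests.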

Some details

See package doc:

http://java.sun.com/products/servlet/2.3/javadoc/index.html

Useful class summary:

A servlet (in our case, a subclass of HttpServlet) generates HTML that is sent back to the browser.

HTTP servlets have methods associated with the request types of the HTTP protocol: GET, POST, etc. For example, an HTTP GET request invokes the doGet() method.

The servlet life cycle:

  1. init
  2. service many times (doGet/doPost in our case)
  3. destroy
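The life cycle above, together with the dispatch from a GET request to doGet(), can be sketched in plain Java. MiniServlet and LifecycleDemo are hypothetical stand-ins written for this note, not the javax.servlet classes; in a real deployment the container makes these calls, not your code.

```java
import java.util.ArrayList;
import java.util.List;

class LifecycleDemo {
    // Drives one servlet instance through its whole life and records each step.
    static List<String> run() {
        List<String> log = new ArrayList<>();
        MiniServlet s = new MiniServlet() {
            @Override void init()    { log.add("init"); }
            @Override void destroy() { log.add("destroy"); }
            @Override String doGet(String path) {
                log.add("doGet " + path);
                return "<html>hi</html>";
            }
            @Override String doPost(String path, String body) {
                log.add("doPost " + path);
                return "<html>ok</html>";
            }
        };
        s.init();                        // 1. once, before the first request
        s.service("GET", "/a", null);    // 2. once per request, many times
        s.service("POST", "/b", "x=1");
        s.destroy();                     // 3. once, when the server shuts down
        return log;
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}

// Hypothetical stand-in for an HTTP servlet base class, reduced to plain Java.
abstract class MiniServlet {
    void init() {}
    void destroy() {}
    abstract String doGet(String path);
    abstract String doPost(String path, String body);

    // The container calls service(); it dispatches on the HTTP method.
    String service(String method, String path, String body) {
        switch (method) {
            case "GET":  return doGet(path);
            case "POST": return doPost(path, body);
            default:     return "405 Method Not Allowed";
        }
    }
}
```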

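There is one instance of each servlet per server, shared by every request thread, so watch out for threading issues. You can ask the container to run a servlet single-threaded (SingleThreadModel), but that is very slow. For an example of the danger of setting and getting instance variables from within a service method, see the Thread safety section below.

Later we'll talk about generating HTML properly and can discuss this in detail.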

How do they work? Unfortunately that really depends on the servlet container you are using. Some let you place your .java file inside a special directory and your server will be able to send URL requests to that code. The HTTP server compiles the code for you (and can replace an old one on a running server). With Jetty, it will find your servlet code from the CLASSPATH so make sure you start your app with the path set to see both jetty stuff and your stuff.

A servlet runs in the same address space as the server and typically represents a middle tier in a three tier system.

Simple Servlets

GET

Note that the servlet receives a request and response object. The request object contains information about the HTTP request, plus parameters, and other header info. The response object lets you set response headers, cookies, and lets you write to the output stream.

import java.io.*;
import javax.servlet.*;
import javax.servlet.http.*;

public class HelloServlet extends HttpServlet {
    public void doGet(HttpServletRequest request,
		      HttpServletResponse response)
	throws ServletException, IOException
    {
	response.setContentType("text/html");
	PrintWriter out = response.getWriter();

	out.println("<html>");
	out.println("<body>");
	out.println("<h1>Servlet test</h1>");

	out.println("Hello, there!");

	out.println("</body>");
	out.println("</html>");
    }
}

POST

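Used as the target of form processing. A POST can carry much more data than a GET because the parameters travel in the body of the request rather than in the URL. request.getParameter() retrieves both URL query parameters and form parameters sent in the POST body.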

<html>
<body>
<FORM METHOD=POST ACTION="/servlet/SimpleResponse">
<input type=text size=45 name=query value="type your query here">
<input type=submit value=SEARCH>
</FORM>
</body>
</html>

Servlet responding to form (note both doGet and doPost methods):

import javax.servlet.*;
import javax.servlet.http.*;
import java.io.*;

public class SimpleResponse extends HttpServlet {
    public void doPost(HttpServletRequest request,
		       HttpServletResponse response)
	throws ServletException, IOException
    {
	response.setContentType("text/html");
	PrintWriter out = response.getWriter();
	out.println("<html><body>");
	// the form above submits a single field named "query"
	String query = request.getParameter("query");
	out.println("You typed "+query);
	out.println("</body></html>");
	out.close();
    }

    public void doGet(HttpServletRequest request,
		      HttpServletResponse response)
	throws ServletException, IOException
    {
	response.setContentType("text/html");
	PrintWriter out = response.getWriter();
	out.println("<html><body>");
	String name = request.getParameter("name");
	out.println("Hello "+name+"<br><br>");
	out.println("You requested URL: "+
		    HttpUtils.getRequestURL(request)+"?"+request.getQueryString());
	out.println("</body></html>");
	out.close();
    }
}
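When the browser submits the form above, the query field arrives as name=value text in the body of the POST request, encoded the same way as URL arguments. The following plain-Java sketch shows roughly the parsing a container must do before getParameter can answer. FormBody is a hypothetical helper, not a servlet API class, and the sketch assumes Java 10 or later for the Charset overload of URLDecoder.decode.

```java
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: parses an application/x-www-form-urlencoded body.
class FormBody {
    static Map<String, String> parse(String body) {
        Map<String, String> params = new LinkedHashMap<>();
        if (body == null || body.isEmpty()) return params;
        for (String pair : body.split("&")) {
            String[] kv = pair.split("=", 2);
            String name = URLDecoder.decode(kv[0], StandardCharsets.UTF_8);
            String value = kv.length > 1
                    ? URLDecoder.decode(kv[1], StandardCharsets.UTF_8) : "";
            params.putIfAbsent(name, value); // keep the first value for a repeated name
        }
        return params;
    }

    public static void main(String[] args) {
        // What the browser would send if the user typed "serenity movie".
        System.out.println(parse("query=serenity+movie").get("query"));
    }
}
```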

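Use HttpUtils.getRequestURL(request) to reconstruct the URL minus the query string (the arguments). HttpUtils is deprecated as of Servlet 2.3, where request.getRequestURL() does the same job.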

Other useful methods, such as getRemoteHost(), are described in the ServletRequest documentation:

http://java.sun.com/products/servlet/2.3/javadoc/javax/servlet/ServletRequest.html

Argument URL encoding

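URL encoding converts characters that are special or meaningful within a URL into a percent sign followed by the character's two-digit hexadecimal code.

' ' -> %20, etc.

Use: java.net.URLEncoder.encode(...). Note that URLEncoder writes a space as '+' rather than %20; in a query string both decode to a space.

Here is an example that calls the doGet method with a space in the argument: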

http://localhost:8080/servlet/SimpleResponse?name=Terence%20Parr
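A short sketch of encoding and decoding the name argument used in the URL above. EncodeDemo and its method names are made up for this example; the java.net calls are standard, and the Charset overloads assume Java 10 or later.

```java
import java.net.URLDecoder;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

class EncodeDemo {
    // Form-style encoding as produced by java.net.URLEncoder (space becomes '+').
    static String formEncode(String s) {
        return URLEncoder.encode(s, StandardCharsets.UTF_8);
    }

    // Same, but with %20 for spaces. A literal '+' in the input has already
    // been encoded as %2B, so every remaining '+' stands for a space.
    static String percentEncode(String s) {
        return formEncode(s).replace("+", "%20");
    }

    static String decode(String s) {
        return URLDecoder.decode(s, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println(formEncode("Terence Parr"));     // Terence+Parr
        System.out.println(percentEncode("Terence Parr"));  // Terence%20Parr
        System.out.println(decode("Terence%20Parr"));       // Terence Parr
    }
}
```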

Jetty and Servlets

Jetty is a simple embeddable or standalone web server for static pages or dynamic pages like JSP or Servlets. For our purposes, it's probably easier to have your application embed the web server directly.

Simple file serving

Here is a simple program that starts up a web server at port 8080 and has a document root specified by args[0]:

import org.mortbay.http.*;
import org.mortbay.jetty.Server;
import org.mortbay.jetty.servlet.*;
import org.mortbay.log.*;

public class MyServer {
    public static void main(String[] args) throws Exception {
        String DOC_ROOT = args[0];      // document root directory
        Server server = new Server();
        server.addListener(":8080");    // listen on port 8080
        server.addWebApplication("/", DOC_ROOT);
        server.start();
    }
}

You will need the following additions to your CLASSPATH:

/home/public/cs601/jetty/javax.servlet.jar
/home/public/cs601/jetty/org.mortbay.jetty.jar
/home/public/cs601/jetty/log4j-1.2.12.jar
/home/public/cs601/jetty/jasper-compiler.jar
/home/public/cs601/jetty/jasper-runtime.jar
/home/public/cs601/jetty/xercesImpl.jar
/home/public/cs601/jetty/commons-logging.jar

which are the jars used by Jetty. Start your server like this:

java MyServer ~/foo

which will start serving files underneath ~/foo. URLs like host:8080/t.html will get file ~/foo/t.html from the disk.
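The mapping from a URL path to a file under the document root can be sketched as follows. This is a hypothetical illustration in plain Java, not Jetty's implementation; DocRoot and resolve are invented names. It also shows the check any file server needs so that a path containing ".." cannot escape the document root.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical sketch of URL-path-to-file mapping; not Jetty's code.
class DocRoot {
    // Returns the file a URL path refers to, or null if it would fall outside the root.
    static Path resolve(String docRoot, String urlPath) {
        Path root = Paths.get(docRoot).toAbsolutePath().normalize();
        String relative = urlPath.startsWith("/") ? urlPath.substring(1) : urlPath;
        Path file = root.resolve(relative).normalize(); // collapses "." and ".."
        return file.startsWith(root) ? file : null;
    }

    public static void main(String[] args) {
        String root = System.getProperty("user.home") + "/foo";
        System.out.println(resolve(root, "/t.html"));          // a file under ~/foo
        System.out.println(resolve(root, "/../secret.txt"));   // rejected
    }
}
```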

At this point, Jetty will serve files, but you are not recording requests to your server. Add the following method:

private static RequestLog getServerLogging() throws Exception {
    NCSARequestLog a = new NCSARequestLog("./request.log");
    a.setRetainDays(90);
    a.setAppend(true);
    a.setExtended(false);
    a.setLogTimeZone("GMT");
    a.start();
    return a;
}   

And add this line before the server.start():

server.setRequestLog(getServerLogging());

You will still get a log4j warning from Jetty at startup, but you can ignore it:

log4j:WARN No appenders could be found for logger (org.mortbay.util.Container).
log4j:WARN Please initialize the log4j system properly.

Serving Servlets

To get Jetty to handle servlets, update your server code to look like this:

Server server = new Server();
server.addListener(":8080");

// logging
server.setRequestLog(getServerLogging());

// Servlets
ServletHttpContext context = 
    (ServletHttpContext) server.getContext("/");
context.addServlet("Invoker","/servlet/*",
       "org.mortbay.jetty.servlet.Invoker");

// HTTP
server.addWebApplication("/", "./");

server.start();

Servlet HelloServlet would be visible at host:8080/servlet/HelloServlet.
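Conceptually, an invoker-style mapping takes the part of the URL after /servlet/ as a class name and loads that class from the CLASSPATH, which is why the CLASSPATH must include your own classes. The sketch below illustrates that idea with plain reflection. It is an assumption about the general technique, not Jetty's actual Invoker code, and InvokerSketch is an invented name; a real invoker would also verify that the loaded class is a servlet.

```java
// Hypothetical sketch of URL-to-class lookup; not Jetty's Invoker implementation.
class InvokerSketch {
    static final String PREFIX = "/servlet/";

    // "/servlet/HelloServlet" -> "HelloServlet"; null if the path is not under /servlet/.
    static String classNameFor(String urlPath) {
        if (!urlPath.startsWith(PREFIX) || urlPath.length() == PREFIX.length()) return null;
        return urlPath.substring(PREFIX.length());
    }

    // Loads the named class from the CLASSPATH and creates an instance,
    // or returns null if no such class can be found or constructed.
    static Object instantiate(String urlPath) {
        String name = classNameFor(urlPath);
        if (name == null) return null;
        try {
            return Class.forName(name).getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            return null; // class missing from CLASSPATH, or no usable constructor
        }
    }

    public static void main(String[] args) {
        System.out.println(classNameFor("/servlet/HelloServlet"));
        // java.lang.StringBuilder stands in for a servlet class here,
        // only because it is guaranteed to be on every CLASSPATH.
        System.out.println(instantiate("/servlet/java.lang.StringBuilder") != null);
    }
}
```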

Thread safety

The safety issue for servlets with instance variables can be illustrated with the following:

class PageServlet extends HttpServlet {
    String id;
    public void doGet(HttpServletRequest request,
          HttpServletResponse response)
    throws ServletException, IOException
    {
        id = request.getParameter("ID");
        ...
        out.println("Your id is "+id);
    }
    ...
}

One thread can set id and then be preempted by a second thread, which overwrites id. When the first thread resumes, it prints the second thread's value instead of its own. That is, if the URLs s?ID=1 and s?ID=2 result in simultaneous execution of this servlet, you risk seeing the same id value in both responses.
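The usual remedy is to keep per-request data in local variables, which live on each thread's own stack, rather than in instance variables shared by all threads. Here is a minimal plain-Java sketch of that fix. SafePage and SafePageDemo are hypothetical classes standing in for the servlet and for a container that runs many request threads against one instance.

```java
import java.util.concurrent.atomic.AtomicInteger;

class SafePageDemo {
    // Runs n concurrent "requests" against ONE shared SafePage instance and
    // returns how many of them got back an id other than the one they sent.
    static int mismatches(int n) {
        SafePage page = new SafePage();
        AtomicInteger wrong = new AtomicInteger();
        Thread[] threads = new Thread[n];
        for (int i = 0; i < n; i++) {
            String id = String.valueOf(i);
            threads[i] = new Thread(() -> {
                if (!page.render(id).equals("Your id is " + id)) wrong.incrementAndGet();
            });
            threads[i].start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return wrong.get();
    }

    public static void main(String[] args) {
        System.out.println("mismatches: " + mismatches(100));
    }
}

// Stand-in for the servlet: no instance fields, so nothing is shared.
class SafePage {
    String render(String idParam) {
        String id = idParam;   // local variable: private to the calling thread
        Thread.yield();        // invite a thread switch, as in the scenario above
        return "Your id is " + id;
    }
}
```

Because render() touches no shared state, no interleaving of threads can make one request see another request's id.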

JSP

JSP: a page of HTML with embedded Java from which the container automatically generates a servlet.

<html>
<body>
<h1>JSP test</h1>

Hello, <%=request.getParameter("name")%>.

</body>
</html>

You don't need a special URL mapping; a JSP-aware container such as Resin knows how to execute *.jsp files.
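To see why a JSP is an "inside out" servlet, here is a toy translator that turns page text with <%= ... %> expressions into the print statements a generated servlet body might contain. It is a deliberately simplified sketch under stated assumptions: JspSketch is an invented class, it handles only <%= %> expressions, and the code that Jasper or any other real JSP compiler generates looks different.

```java
import java.util.ArrayList;
import java.util.List;

// Toy illustration only; real JSP compilers handle far more than <%= %>.
class JspSketch {
    // Translates page text into Java statements for a hypothetical doGet body.
    static List<String> translate(String page) {
        List<String> code = new ArrayList<>();
        int pos = 0;
        while (pos < page.length()) {
            int open = page.indexOf("<%=", pos);
            if (open < 0) {                        // no more expressions: the rest is HTML
                code.add(print(page.substring(pos)));
                break;
            }
            if (open > pos) code.add(print(page.substring(pos, open)));
            int close = page.indexOf("%>", open);
            if (close < 0) throw new IllegalArgumentException("unclosed <%= at " + open);
            // The embedded Java expression is copied into the generated code as-is.
            code.add("out.print(" + page.substring(open + 3, close).trim() + ");");
            pos = close + 2;
        }
        return code;
    }

    // Plain HTML text becomes a print of a Java string literal.
    private static String print(String html) {
        String escaped = html.replace("\\", "\\\\").replace("\"", "\\\"")
                             .replace("\r", "\\r").replace("\n", "\\n");
        return "out.print(\"" + escaped + "\");";
    }

    public static void main(String[] args) {
        for (String line : translate("Hello, <%=request.getParameter(\"name\")%>.")) {
            System.out.println(line);
        }
    }
}
```

The HTML ends up inside Java string literals and the Java expression ends up as ordinary code, which is exactly the hand-written servlet style shown earlier, produced automatically.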

File Uploads

Servlets can also handle file uploads and so on. For example, answer the challenge here:

http://www.antlr.org/submit/challenge?type=grammar

and view the source of the resulting page; the form looks like:

<FORM METHOD=POST ACTION="/submit/process" ENCTYPE="multipart/form-data">
...
</FORM>