Serving HTTP in Python is a bit of a double-edged sword. On one hand, the available modules and classes (BaseHTTPServer, CGIHTTPServer, CGIHTTPRequestHandler) are very powerful. On the other hand, they are not particularly well documented. This will hopefully help you out.

One hint: don't forget that the source for all the Python libraries is available in /usr/lib/python2.3. Looking at the source to see how a module does things internally is often more helpful than anything else.

Using the BaseHTTPServer module.

Here is a simple program that processes HTTP requests, checks to see if the request has parameters attached, and then returns 'hello world' to the client.

In Python, you provide the HTTPServer class (in BaseHTTPServer) with a handler class that is responsible for servicing GET and POST requests. (We are only dealing with GET in this project.) The handler should inherit from both CGIHTTPRequestHandler and object. (See the comment in the code for an explanation of this.)

do_GET is invoked by the server whenever a GET request is received. This is where you'll do your work. If you like, you can invoke a separate CGI script at this point.

The first thing to notice is that this DOES NOT work the way that the cgi module does. In particular, stdout is not redirected back to the client. Instead, whatever you write to self.wfile is sent to the client. So, to send back a file, use self.copyfile, which is inherited from SimpleHTTPRequestHandler (in SimpleHTTPServer).

Also, in this code, I want to send back a string rather than a file. I could write the data out to a temp file, but that's messy, so I use the StringIO class. It provides an in-memory string buffer that behaves like an open file, so it can be manipulated just like one. In this case, I write to the buffer, use seek to reset the file pointer to the start, then call self.copyfile to move the 'file contents' onto the output stream.
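
The write-then-rewind pattern can be tried on its own, without a server. This is only a sketch under a few assumptions: it uses io.BytesIO (Python 2.6+ or 3) in place of StringIO, it calls shutil.copyfileobj directly (which is what copyfile does internally), and copy_string and out are hypothetical names, with out standing in for self.wfile.

```python
import shutil
from io import BytesIO  # assumption: Python 2.6+ or 3; the program below uses StringIO.StringIO


def copy_string(data, out):
    """Write data into an in-memory file, rewind it, and copy it to out."""
    buf = BytesIO()
    buf.write(data)
    buf.seek(0)  # without the rewind, the copy starts at the end and sends nothing
    shutil.copyfileobj(buf, out)


out = BytesIO()  # hypothetical stand-in for self.wfile
copy_string(b'hello world', out)
print(out.getvalue())
```

Leaving out the seek(0) call is the usual mistake here: the copy then begins at the end of the buffer and the client receives an empty body.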

self.path holds everything in the URL after the hostname and port: the path itself plus any query string. This is where you will extract arguments from.
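
As an alternative to splitting the string by hand, the standard library can take a value like self.path apart. The following is a minimal sketch, assuming Python 2.6+ or 3 (where parse_qs lives in urlparse or urllib.parse respectively); split_path is a hypothetical helper name and is not part of the program below.

```python
try:
    from urllib.parse import urlsplit, parse_qs  # Python 3
except ImportError:
    from urlparse import urlsplit, parse_qs      # Python 2.6+


def split_path(path):
    """Split a request path such as '/hello?name=world' into (path, args)."""
    parts = urlsplit(path)
    # parse_qs maps each key to a list of values, because a key may repeat.
    return parts.path, parse_qs(parts.query)


path, args = split_path('/hello?name=world&n=1&n=2')
print(path)
print(args)
```

Unlike a plain split on '&' and '=', parse_qs also decodes percent-escapes and tolerates repeated keys, which is why each value comes back as a list.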

Note that if path doesn't contain a '?', I call the superclass's do_GET method. The syntax for doing this is a little odd: super takes a class and an instance of that class, and returns a proxy object that forwards method calls to the next class in the inheritance chain. In order for this to work, your class must (directly or indirectly) inherit from object.
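
The fall-back-to-the-superclass pattern can be seen in isolation with two toy classes. This is just an illustration: FileHandler and QueryHandler are hypothetical names, and their do_GET methods take the path as an argument and return a string instead of talking to a socket.

```python
class FileHandler(object):
    """Stand-in for the library handler that serves files."""

    def do_GET(self, path):
        return 'file:' + path


class QueryHandler(FileHandler):
    """Handles paths with a query string itself, and defers everything else."""

    def do_GET(self, path):
        if '?' in path:
            return 'query:' + path.split('?', 1)[1]
        # super(QueryHandler, self) is a proxy that looks up do_GET on the
        # next class after QueryHandler, which here is FileHandler.
        return super(QueryHandler, self).do_GET(path)


handler = QueryHandler()
print(handler.do_GET('/page?x=1'))
print(handler.do_GET('/index.html'))
```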

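self.send_response() sends the status line and must be called before any headers. self.send_header() can then be used to send back any necessary HTTP headers, and self.end_headers() marks the end of them. Take a look at the BaseHTTPRequestHandler (in BaseHTTPServer) and the CGIHTTPRequestHandler (in CGIHTTPServer) for more usage.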
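
The order of the calls matters: status line, then headers, then the blank line that ends the headers, then the body. Here is a small self-contained sketch of that sequence, assuming Python 2.6+ or 3 (hence the fallback imports). HelloHandler and fetch_once are hypothetical names, and the server answers exactly one request on a port chosen by the operating system.

```python
import threading

try:
    from http.server import BaseHTTPRequestHandler, HTTPServer    # Python 3
    from http.client import HTTPConnection
except ImportError:
    from BaseHTTPServer import BaseHTTPRequestHandler, HTTPServer  # Python 2
    from httplib import HTTPConnection


class HelloHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        body = b'hello world'
        self.send_response(200)                             # 1. status line
        self.send_header('Content-Type', 'text/plain')      # 2. headers
        self.send_header('Content-Length', str(len(body)))
        self.end_headers()                                  # 3. end of headers
        self.wfile.write(body)                              # 4. body goes to wfile

    def log_message(self, format, *args):
        pass  # keep the example quiet


def fetch_once():
    """Serve a single request in a background thread and fetch it."""
    server = HTTPServer(('127.0.0.1', 0), HelloHandler)  # port 0: any free port
    thread = threading.Thread(target=server.handle_request)
    thread.start()
    try:
        conn = HTTPConnection('127.0.0.1', server.server_address[1], timeout=5)
        conn.request('GET', '/')
        response = conn.getresponse()
        result = (response.status, response.read())
        conn.close()
        return result
    finally:
        thread.join()
        server.server_close()


status, body = fetch_once()
print(status)
print(body)
```

Computing Content-Length from the body, rather than typing the number in, avoids the header and the body drifting apart when the text changes.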

saddr is the (host, port) tuple. An empty host string ('') means the server listens on all local network interfaces, not just localhost. When starting an HTTPServer, you should provide a saddr and a handler.
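
To see what address a server actually ends up bound to, you can create one and inspect server_address without serving anything. A minimal sketch, assuming Python 2 or 3 via the fallback import; bound_address is a hypothetical helper, and port 0 asks the operating system for any free port.

```python
try:
    from http.server import BaseHTTPRequestHandler, HTTPServer    # Python 3
except ImportError:
    from BaseHTTPServer import BaseHTTPRequestHandler, HTTPServer  # Python 2


def bound_address(host, port):
    """Bind a server to (host, port) and report the address it received."""
    server = HTTPServer((host, port), BaseHTTPRequestHandler)
    try:
        return server.server_address
    finally:
        server.server_close()  # release the port again


# '127.0.0.1' accepts connections from this machine only;
# passing '' instead would listen on every local interface.
host, port = bound_address('127.0.0.1', 0)
print(host)
print(port)
```

Binding to '127.0.0.1' is a reasonable choice while experimenting, since the server is then unreachable from other machines.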


import BaseHTTPServer
from CGIHTTPServer import CGIHTTPRequestHandler
from StringIO import StringIO


        
### In order for super() to work, myHandler must be a 'new-style' class.
### Old-style classes resemble C++ classes in that they need no common
### base class: they are a classobj, rather than a type.
### Although myHandler has a superclass, apparently CGIHTTPRequestHandler
### does not derive from object, so it is an old-style class.
### To get new-style class behavior, we also derive from object. A strange
### hack to preserve both methods of OO ...

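class myHandler(CGIHTTPRequestHandler, object):
    def do_GET(self):
        if '?' in self.path:
            # The request carries a query string: parse it and answer directly.
            args = self.parsePath(self.path)
            print args
            print 'doing get'
            body = 'hello world'
            f = StringIO()
            f.write(body)
            # The status line must go out before any headers.
            self.send_response(200)
            self.send_header("Content-type", "text/html")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            # Rewind the buffer so copyfile reads it from the start.
            f.seek(0)
            self.copyfile(f, self.wfile)
        else:
            # No query string: let the superclass serve the file as usual.
            super(myHandler, self).do_GET()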

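    def parsePath(self, pathString):
        # Everything after the first '?' is the query string.
        args = pathString.split('?', 1)[1]
        pairDict = {}
        for pair in args.split('&'):
            if not pair:
                continue
            # Split on the first '=' only; a key without a value maps to ''.
            item = pair.split('=', 1)
            if len(item) == 1:
                item.append('')
            pairDict[item[0]] = item[1]
        return pairDict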


# '' means all local interfaces; 8000 is the port to listen on.
saddr = ("", 8000)
m = BaseHTTPServer.HTTPServer(saddr, myHandler)
m.serve_forever()