Lab 3: XML Schema, modification, CGI



Due Tuesday 2/20 at start of class.
30 points


This lab will begin extending the code you developed last week, adding in the ability to modify the DOM tree. You'll also create a schema, get more experience traversing file structures and generating code, and deploy a simple server program.

Please place a copy of your XML schema and your code in the cs682 submit directory. Also place a README in the submit directory that gives the URL of your running resource and/or complete instructions for running your code.

  1. (5 points) Implement an XML schema for the photo database from last week's lab. Your schema should indicate that a picture has exactly one filename, which has an extension of either 3 or 4 letters (e.g. jpg or jpeg), exactly one directory, exactly one size, which is an integer, zero or more tags, exactly one date_modified, and exactly one EXIF element. (You do not need to include the EXIF subelements.) You must be able to validate your database using an XML Schema validator such as this one.
  2. (5 points) Modify the DOM parsing program from the previous lab to do the following: Take as input the filename of a picture, along with one or more tags. Read in your XML database with a DOM parser, modify the DOM tree to include the appropriate <tag> elements, and then write the DOM tree back out to a file.
  3. (10 points) Construct a program that takes as input a directory. It should traverse all subdirectories, and, for each directory containing images, build one (or more; up to you) HTML document containing, for each image, a thumbnail, a link to the full-size image, and a form containing a text box (for adding tags) and a submit button. Here is an example of the sort of thing I'm looking for; please feel free to make it much nicer-looking. You should also create an index page in the top-level directory that has a link to each of the pages in the subdirectories. (If you were careful in how you built your XML generator for lab 2, you can probably re-use a lot of it.)

    There are several ways to generate thumbnails. If you're using Python, the Python Imaging Library is very nice. I'm also a big fan of ImageMagick, which has standalone command-line tools, as well as libraries for most common languages. (Feel free to use other tools as well.) You can also extract the thumbnail stored in the EXIF data, if it's present.

    You will probably want to find a suitable library that can do image processing, as we'll be doing more of it in future labs.
  4. (10 points) Connect parts 2 and 3 using the server technology of your choice. When the user enters a tag in the text box and hits 'submit', a script on the server side should execute the code written in part 2 to update the tags for that picture.

    There are several possible ways to implement this, including the following:
    • CGI. This is a decent choice if you're using Python or Ruby. If you choose to use CGI, please use scorpio.cs.usfca.edu to test your scripts, rather than www.cs.usfca.edu. See here for some hints on getting CGI up and running. You are welcome to use Ajax/JavaScript on the front end, although it's not required.
    • Java Servlet. I recommend using a servlet container such as Jetty for this. Jetty is small and simple, and can be left running indefinitely.
    • .NET web forms. Talk to me if you're interested in this approach.
    • Other ... if you have access to your own server, you might choose to use some other server-side technology. Please talk to me if there are other approaches you're interested in.
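
To make part 2 more concrete, here is a minimal Python sketch of the read, modify, write cycle using the standard library's xml.dom.minidom. It is only one possible approach. The element names picture, filename, and tag come from the schema description in part 1, but the overall document layout (a root element containing picture elements, each with a filename child) is an assumption about your lab 2 database, and the helper names element_text and add_tags are made up for illustration.

```python
import sys
from xml.dom import minidom


def element_text(node):
    """Return the concatenated text-node children of an element, stripped."""
    return "".join(
        child.data for child in node.childNodes if child.nodeType == child.TEXT_NODE
    ).strip()


def add_tags(doc, filename, tags):
    """Append a <tag> child for each new tag to every matching <picture>.

    Assumes each <picture> has a <filename> child (an assumption about the
    lab 2 layout). Returns the number of <tag> elements added.
    """
    added = 0
    for picture in doc.getElementsByTagName("picture"):
        names = picture.getElementsByTagName("filename")
        if not names or element_text(names[0]) != filename:
            continue
        existing = {element_text(t) for t in picture.getElementsByTagName("tag")}
        for tag in tags:
            if tag in existing:
                continue  # do not add the same tag twice
            node = doc.createElement("tag")
            node.appendChild(doc.createTextNode(tag))
            picture.appendChild(node)
            existing.add(tag)
            added += 1
    return added


# Hypothetical command line: <script> <database.xml> <picture filename> <tag>...
if __name__ == "__main__" and len(sys.argv) > 3:
    db_path, filename, tags = sys.argv[1], sys.argv[2], sys.argv[3:]
    doc = minidom.parse(db_path)
    add_tags(doc, filename, tags)
    with open(db_path, "w", encoding="utf-8") as out:
        doc.writexml(out, encoding="utf-8")
```

Note that this sketch appends new tag elements at the end of the picture element. If your schema fixes the order of child elements, you would need to insert the new node at the position the schema expects (for example with insertBefore) so that the modified database still validates.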
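
For part 3, the directory traversal and page generation might be sketched in Python as follows. This is a rough outline under several assumptions, not a required design: the page name photos.html, the thumb_ filename prefix for thumbnails, the form field names filename and tag, and the form action URL /cgi-bin/add_tag.py are all hypothetical, and the thumbnails themselves are assumed to be generated separately with whichever imaging library you choose.

```python
import html
import os

IMAGE_EXTENSIONS = {".jpg", ".jpeg"}
FORM_ACTION = "/cgi-bin/add_tag.py"  # hypothetical URL of the part 4 script


def image_files(names):
    """Return the sorted image filenames in a listing, skipping thumbnails."""
    return sorted(
        n for n in names
        if os.path.splitext(n)[1].lower() in IMAGE_EXTENSIONS
        and not n.startswith("thumb_")  # assumed thumbnail naming scheme
    )


def render_page(title, images):
    """Build one HTML page: thumbnail, full-size link, and tag form per image."""
    rows = []
    for name in images:
        safe = html.escape(name, quote=True)
        rows.append(
            f'<div><a href="{safe}"><img src="thumb_{safe}" alt="{safe}"></a>\n'
            f'<form method="post" action="{FORM_ACTION}">\n'
            f'<input type="hidden" name="filename" value="{safe}">\n'
            f'<input type="text" name="tag"> <input type="submit" value="Add tag">\n'
            f"</form></div>"
        )
    body = "\n".join(rows)
    return (
        f"<html><head><title>{html.escape(title)}</title></head>\n"
        f"<body>\n{body}\n</body></html>\n"
    )


def build_gallery(root):
    """Write photos.html in each directory with images, plus a top-level index.

    Returns the page paths relative to root, using forward slashes.
    """
    pages = []
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # visit subdirectories in a predictable order
        images = image_files(filenames)
        if not images:
            continue
        page = os.path.join(dirpath, "photos.html")
        with open(page, "w", encoding="utf-8") as out:
            out.write(render_page(dirpath, images))
        pages.append(os.path.relpath(page, root).replace(os.sep, "/"))
    links = "\n".join(
        f'<li><a href="{html.escape(p, quote=True)}">{html.escape(p)}</a></li>'
        for p in pages
    )
    with open(os.path.join(root, "index.html"), "w", encoding="utf-8") as out:
        out.write(f"<html><body><ul>\n{links}\n</ul></body></html>\n")
    return pages
```

The hidden filename field is one way to let the server-side script in part 4 know which picture a submitted tag belongs to. Filenames containing spaces or other special characters would also need URL quoting in the links, which this sketch leaves out.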