Project 2: Building a book-shopping agent.

Part 1. Due 3/10/03 (1 week)

In this project, you'll be building a book-shopping agent. This agent will interface with Amazon's Web Services and retrieve data from their catalog. Amazon's Web Services returns book data using XML; therefore, a large part of this project will involve working with, parsing, and generating XML.

This project will involve integrating and learning about a fairly large amount of third-party software; therefore, I've broken it into pieces. The early pieces are like labs; the actual implementation is not that complicated, but there are several pieces to learn about, code to download and install, and formats to get familiar with. This first part of the project will cover talking to Amazon's database using HTTP and XML, sending their database an XSL style sheet to format your data, and setting up a parser to process the XML that is returned.

The second part of the project will involve using Xerces, an XML parser, to parse Amazon's output and generate Java objects. We'll then integrate this with SOAP, a more powerful interface than HTTP. Once we have this in place, the third part of the project will involve adding personalization to the client side, allowing our agent to filter results based on our preferences.

I'll be using Java for all of the example code, and referring to Xerces, an XML parser written in Java. You should use Java for this project; therefore, a "task 0" will be to get up to speed on Java programming.

One challenge with learning about Web Services is getting a handle on all the jargon and acronyms that are used - it can seem much more complicated than it actually is.

Resources

There are lots of resources available on the Web regarding XML, Java, and SOAP. Here are a few:

Part 1: Setup.

You'll want to start by downloading Amazon's Web Services developer's kit. It's available here (or you can find it from Amazon's home page). Once you get this, you'll also need to apply for a developer's token. This will let you access Amazon's server.

Take some time and read through the included documentation. It's relatively straightforward, apart from all the jargon.

Part 2: XML over HTTP.

The simplest way of getting Amazon's data is to send the server an HTTP request. The Amazon documentation provides some examples of how to do this. Here's one:

http://xml.amazon.com/onca/xml2?t=webservices-20&dev-t=DXXXO5TPKF2Q6I&ManufacturerSearch=palm&mode=electronics&type=lite&page=1&f=xml

The relevant parameters are:

The simplest way to try this out is to type some URLs into a browser. Any browser that can parse XML (Netscape, IE after 5.5, others?) will display the content of the results; others will leave in the formatting tags.

Try submitting the following queries to Amazon this way.

  1. Find the first 10 books by the author J.K. Rowling, ordered by price, lowest to highest.
  2. Find the first 10 albums by the Beatles, ordered by release date.
  3. Find the first 10 Sony digital cameras, sorted alphabetically.

Now that you've got the hang of how these URLs are put together, you've probably realized that it's a pain to do this by hand. So, for the remainder of this task, build a set of Java classes that can take input from the user as to the type of search and relevant parameters, construct a URL from these parameters, and then fetch the appropriate content from Amazon.
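To make the URL-construction step concrete, here is a minimal sketch of one way to assemble a query URL from a set of parameters. The base URL and the parameter names (t, dev-t, ManufacturerSearch, mode, type, page, f) come from the example request above; the class name AmazonQueryUrl, the build method, and the YOUR-DEV-TOKEN placeholder are made up for illustration. Treat it as a starting point, not as the required design.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

/** Hypothetical helper: builds a query URL from name/value pairs. */
final class AmazonQueryUrl {
    // Base URL copied from the example request in the handout.
    static final String BASE = "http://xml.amazon.com/onca/xml2";

    /**
     * searchType is the name of the search parameter (e.g. "ManufacturerSearch"),
     * searchValue is what to search for, and extras holds everything else
     * (mode, type, page, f, ...), in the order it should appear.
     */
    static String build(String associateTag, String devToken,
                        String searchType, String searchValue,
                        Map<String, String> extras) {
        // LinkedHashMap keeps insertion order, so the URL is predictable.
        Map<String, String> params = new LinkedHashMap<>();
        params.put("t", associateTag);
        params.put("dev-t", devToken);
        params.put(searchType, searchValue);
        params.putAll(extras);

        StringJoiner query = new StringJoiner("&", BASE + "?", "");
        for (Map.Entry<String, String> e : params.entrySet()) {
            query.add(encode(e.getKey()) + "=" + encode(e.getValue()));
        }
        return query.toString();
    }

    // Form-encodes a value, so a space becomes '+' and '&' becomes %26.
    private static String encode(String s) {
        return URLEncoder.encode(s, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        Map<String, String> extras = new LinkedHashMap<>();
        extras.put("mode", "electronics");
        extras.put("type", "lite");
        extras.put("page", "1");
        extras.put("f", "xml");
        // YOUR-DEV-TOKEN is a placeholder; substitute your own developer's token.
        System.out.println(build("webservices-20", "YOUR-DEV-TOKEN",
                                 "ManufacturerSearch", "palm", extras));
    }
}
```

Keeping the search type as an ordinary string parameter means a new kind of search does not require a new method, only a different name/value pair.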

This is the main part of this portion of the project, so think about extensibility and modularity. You're going to want to use this code for a number of different purposes, so try to design each class and method to be as generic as possible. In particular, separate the backend, which constructs a URL and receives data, from the front-end, which will display the data and handle user input. Later on, you may want to use the backend separately. Your code should start by prompting the user for a type of search, then give them an opportunity to fill in the relevant variables, which might vary depending on the type of search (for example, manufacturer searches require a mode). It's probably easiest to build a simple GUI for this. At this point, you can just dump the resulting XML onto a Canvas.
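One possible way to keep the backend separate from the front-end is sketched below: the backend knows only how to turn a URL into the text that came back, and has no GUI code in it. The names StreamOpener, CatalogBackend, overHttp, and fetch are hypothetical, and the sketch assumes the response can be read as UTF-8 text. The small StreamOpener interface is there so the backend can be tried out with a canned response, without a network connection or a developer's token.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.net.URI;
import java.nio.charset.StandardCharsets;

/** Hypothetical hook: how the backend obtains a byte stream for a URL. */
interface StreamOpener {
    InputStream open(String url) throws IOException;
}

/** Backend only: turns a URL into the text the server returned. */
final class CatalogBackend {
    private final StreamOpener opener;

    CatalogBackend(StreamOpener opener) {
        this.opener = opener;
    }

    /** A backend that really issues the HTTP request. */
    static CatalogBackend overHttp() {
        return new CatalogBackend(url -> URI.create(url).toURL().openStream());
    }

    /** Reads the whole response as text (assumed here to be UTF-8). */
    String fetch(String url) {
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(opener.open(url), StandardCharsets.UTF_8))) {
            StringBuilder body = new StringBuilder();
            String line;
            while ((line = in.readLine()) != null) {
                body.append(line).append('\n');
            }
            return body.toString();
        } catch (IOException e) {
            // Rethrow unchecked so callers are not forced to handle I/O details.
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Canned response, so this demo runs offline.
        CatalogBackend fake = new CatalogBackend(
            url -> new ByteArrayInputStream(
                "<ProductInfo/>".getBytes(StandardCharsets.UTF_8)));
        System.out.print(fake.fetch("http://example.invalid/"));
    }
}
```

A front-end (GUI, command line, or later a servlet) would then call fetch and decide for itself how to display the result.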

Part 3: Writing our own XSLT document

So far, we've just been using URLs to get some data back from Amazon, but we haven't thought at all about the content or the format of this data. Often, you'll want to transform XML from one representation to another (such as HTML). Remember, XML per se is just a language for specifying how content is arranged, that is, the semantics (what objects can contain other objects, how objects are related, etc.).

For this task, you'll construct an XSL style sheet for your data that indicates that search results should be organized in an HTML table. Use the "webserv-example.xsl" stylesheet provided with the Amazon kit as a template. It provides some rudimentary tables; each item gets placed in a separate table, which works, but is ugly. Fix this so that it displays 3 elements in each row. Next, augment your program from the XML over HTTP task to pass in a URL for your stylesheet in the 'f' parameter. Amazon will then send back HTML containing the data you asked for. Your Java code should then render this data. At this point, there are several possible decisions about how this data should be rendered. One possibility is to simply pass the resulting HTML on to a browser. Another is to use your code as a servlet: a piece of code invoked by a web server that returns HTML. (If you are already familiar with how servlets work and have access to a web server that supports them, feel free to implement this as a servlet. Otherwise, just do it as a separate client program.)
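If you'd like to experiment with the "3 elements in each row" grouping before pointing Amazon at your stylesheet, one option is to run a small stylesheet locally with the XSLT processor that ships with Java (javax.xml.transform). The sketch below is only that, a local experiment: in the assignment itself, Amazon's server applies your stylesheet. The element names ProductInfo, Details, and ProductName and the sample data are assumptions made for illustration; check them against the XML you actually get back and against webserv-example.xsl. The grouping uses a common XSLT 1.0 idiom: pick every third item to start a row, then pull in its next two siblings.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

/** Hypothetical local test harness for a "three cells per row" stylesheet. */
final class ThreePerRow {
    // Element names below are assumed, not taken from Amazon's documentation.
    static final String XSL =
        "<xsl:stylesheet version='1.0'"
      + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='html' indent='no'/>"
      + "<xsl:template match='/ProductInfo'>"
      + "<table>"
      // Items 1, 4, 7, ... each start a new row.
      + "<xsl:for-each select='Details[position() mod 3 = 1]'>"
      + "<tr>"
      // The row holds that item plus up to two following siblings.
      + "<xsl:for-each select='. | following-sibling::Details[position() &lt; 3]'>"
      + "<td><xsl:value-of select='ProductName'/></td>"
      + "</xsl:for-each>"
      + "</tr>"
      + "</xsl:for-each>"
      + "</table>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Made-up sample data with four items, so the last row is only partly filled.
    static final String SAMPLE =
        "<ProductInfo>"
      + "<Details><ProductName>Alpha</ProductName></Details>"
      + "<Details><ProductName>Beta</ProductName></Details>"
      + "<Details><ProductName>Gamma</ProductName></Details>"
      + "<Details><ProductName>Delta</ProductName></Details>"
      + "</ProductInfo>";

    /** Applies XSL to the given XML text and returns the resulting HTML. */
    static String toHtml(String xml) {
        try {
            Transformer t = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(XSL)));
            StringWriter out = new StringWriter();
            t.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("stylesheet or input is malformed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(toHtml(SAMPLE));
    }
}
```

With the four sample items, the result should contain two rows: one with three cells and one with a single cell.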

Next week, we'll work on parsing the XML that Amazon sends back, turning it into Java objects, and using SOAP to communicate with the server, rather than HTTP.