Part 2: Transforming Data into Different Representations.
Due 3/24/03
What to turn in: A URL pointing to the user's web page.
Also, email me your Agent code.
In this project, we'll be building a personalized book-shopping agent. Our agent will talk to Amazon's database to get info about books, as well as discover related books. We'll then filter this related information, based on a model of user preferences that we'll build up.
Part 1 focused on the retrieval of data; you put together the basic code to talk with Amazon's server via HTTP and pass in an XSLT file that transforms Amazon's XML output into HTML. In part 2, you'll build the tools to transform your data from one representation into another and implement the remaining infrastructure for your agent. In part 3, you'll apply this new data representation to a particular application: personalized recommendations.
From the user's point of view, the application will have two parts: a search area in which the user can enter search requests for a book and see detailed info about that book, and a similarity area, where the user can see books thought to be both similar to the requested book and likely to appeal to the user. Clicking on a book in the similarity area should bring up a detailed display about the book in the search area.
A rough example of what I'm looking for can be found here. Please feel free to make your pages prettier than mine!
The XSLT files I used to generate these pages are here and here.
Let's break this down in more detail.
The user will interact with the agent via a web page. This page should have two separate areas - you can implement these areas as frames, separate windows, or tables (among others); you can determine these details for yourself.
The primary area is the presentation area. In this area, a user should initially be presented with a form that allows them to fill in a book's author, title, or ISBN/ASIN. Detailed information for a particular book should also be presented here. Check out the example above for one way that this could be laid out.
The secondary area is the similarity area. In this pane/window, you'll present the user with other books like the one they searched for that they're likely to be interested in. Clicking on one of these books should bring up more detailed information about it in the presentation area.
Your agent will then send information back to the user's Web browser. There are two types of information you'll send back. Presentation information will get displayed in the presentation area; this will be details about a book, retrieved directly from Amazon as HTML. You will also send back similarity information, intended for the similarity area. In this case, your agent will need to generate HTML from an internal representation.
The second sort of request you'll send is a similarity request. This will be done using SOAP; Amazon will return XML-formatted data. Your agent will then parse this data and build up an internal representation of these similar books. (In part 3 we'll filter this data.) Your agent will then convert this information to HTML and send it back to the similarity area of the user's Web browser.
Here are the specific things you'll need to do for part 2:
Create a servlet that serves as a wrapper for your code from part 1. It should be able to receive and process an HTTP POST or GET containing a request for title, author, or ASIN/ISBN info, construct a URL representing this request that includes an XSLT file indicating how the results should be displayed, read back Amazon's reply, and pass this on to the Web browser.
In order to do this, you'll first need to wrap your agent code within a servlet. You'll want to extend the HttpServlet class, and make sure you implement the doPost and/or doGet methods. (I'd recommend handling both; technically, requests like this that get data sent back are supposed to use GET; in practice, people use both. Pick one and have the other method call that one.) This should be relatively straightforward.
If you need more information about how servlets work, you can look here.
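Whichever of doPost or doGet ends up doing the work, its core job is turning the form fields into a request URL that also names your XSLT file. Here's a minimal sketch of that step in plain Java, kept free of the servlet API so you can try it on its own. The endpoint and the parameter names (searchType, query, stylesheet) are placeholders I made up for illustration, not Amazon's; substitute the ones your part 1 code already uses.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

/** Builds a search URL from the form fields the servlet received. */
class SearchUrlBuilder {
    // Hypothetical endpoint; replace with the server address from part 1.
    static final String BASE_URL = "http://example.com/search";

    /**
     * Picks the first non-empty field among asin, title, and author, and
     * returns a URL carrying that search plus the stylesheet location.
     * The query parameter names here are assumptions for illustration.
     */
    static String build(Map<String, String> form, String xsltUrl) {
        String[] fields = {"asin", "title", "author"};
        for (String field : fields) {
            String value = form.get(field);
            if (value != null && !value.trim().isEmpty()) {
                return BASE_URL
                        + "?searchType=" + encode(field)
                        + "&query=" + encode(value.trim())
                        + "&stylesheet=" + encode(xsltUrl);
            }
        }
        throw new IllegalArgumentException("No title, author, or ASIN/ISBN supplied");
    }

    // URL-encodes one value so spaces and punctuation survive the trip.
    private static String encode(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always available", e);
        }
    }

    public static void main(String[] args) {
        Map<String, String> form = new LinkedHashMap<>();
        form.put("author", "Ursula K. Le Guin");
        System.out.println(build(form, "http://example.com/books.xsl"));
    }
}
```

Checking the ASIN/ISBN field first is just one reasonable choice: it's the most precise of the three, so if the user fills in several fields it seems like the one to trust.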
You'll use your existing code to fill in the information for the presentation area; the user will fill in some search criteria and you'll build up a URL and send a request. You'll also want to create your own XSLT stylesheet (feel free to tweak or ignore the ones above) so that the data is transformed into a representation that meets your needs.
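While you're writing your stylesheet, it can be handy to try it out locally before pointing Amazon at it. The sketch below applies a tiny stylesheet to some made-up XML using the javax.xml.transform classes that ship with Java. The element names (Books, Book, Title, Author) are invented for this example and are not Amazon's actual output format, so adjust the match and select expressions to whatever your part 1 code really gets back.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

/** Applies an XSLT stylesheet to XML locally, for quick experiments. */
class StylesheetPreview {
    // Made-up sample data; these element names are assumptions, not Amazon's schema.
    static final String SAMPLE_XML =
          "<Books>"
        + "<Book><Title>A Wizard of Earthsea</Title><Author>Ursula K. Le Guin</Author></Book>"
        + "<Book><Title>Dune</Title><Author>Frank Herbert</Author></Book>"
        + "</Books>";

    // Turns each Book element into one list item: bold title, then the author.
    static final String STYLESHEET =
          "<xsl:stylesheet version=\"1.0\" xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
        + "<xsl:output method=\"html\" indent=\"no\"/>"
        + "<xsl:template match=\"/Books\">"
        + "<ul><xsl:for-each select=\"Book\">"
        + "<li><b><xsl:value-of select=\"Title\"/></b> by <xsl:value-of select=\"Author\"/></li>"
        + "</xsl:for-each></ul>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    /** Runs the stylesheet over the XML and returns the resulting HTML text. */
    static String transform(String xml, String xslt) {
        try {
            Transformer t = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(xslt)));
            StringWriter out = new StringWriter();
            t.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("Stylesheet could not be applied", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(transform(SAMPLE_XML, STYLESHEET));
    }
}
```

This is only a way to check your stylesheet by hand; in the assignment itself the stylesheet is named in the request URL rather than applied by your agent.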
For similarity searches, we're going to want a bit more control over things. We're going to want to use SOAP to handle the requests, and we're going to process the data ourselves.
Using SOAP as a request mechanism gives us more flexibility and an API that more closely matches our needs. Rather than worrying about how book information is represented within a URL, we can use a SOAP object. Amazon's SOAP server only returns XML, but that's OK; we're going to want to process the data ourselves before passing it on to the client.
You'll need a SOAP toolkit to handle the underlying SOAP requests; a reasonable choice is Apache Axis. This provides an API on top of SOAP, so you won't need to directly construct or manipulate SOAP messages.
Whenever a user submits a request for book info, your agent should also perform a Similarity Search for that information. (If Amazon returns more than one book for a search, you should get similarity info for each of them.) Luckily, Amazon has provided some Java classes that illustrate how to build a Similarity Search; SimilaritySoap.java is a good one to look at.
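Here's one way the "similarity info for each of them" step might be organized. This is a sketch under assumptions: SimilarityLookup is a hypothetical interface standing in for whatever code of yours makes the SOAP call, and books are identified by plain ASIN strings to keep the example short. Merging the results and dropping duplicates is my own suggestion, not something Amazon's classes do for you.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Gathers similarity results for every book a search returned. */
class SimilarityCollector {
    /** Hypothetical stand-in for the code that performs one SOAP similarity request. */
    interface SimilarityLookup {
        List<String> similarTo(String asin);
    }

    /**
     * Runs one similarity lookup per search result and merges the answers,
     * keeping first-seen order and leaving out duplicates and the
     * searched-for books themselves.
     */
    static List<String> collect(List<String> searchResultAsins, SimilarityLookup lookup) {
        Set<String> merged = new LinkedHashSet<>();
        for (String asin : searchResultAsins) {
            merged.addAll(lookup.similarTo(asin));
        }
        merged.removeAll(searchResultAsins);
        return new ArrayList<>(merged);
    }
}
```

Keeping the lookup behind an interface also means you can test the merging logic with canned data before your SOAP code is working.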
This similarity info will be returned as an array of Details objects. For part 2, you'll just need to traverse this array and generate HTML displaying each book's Title, Author, ISBN, ASIN, and BrowseNode (that's what Amazon calls genre), plus a small GIF of the cover. This should then get sent to the similarity area. In part 3, we'll customize this list to include only books from particular BrowseNodes.
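The traversal described above might look something like the following. Treat it as a sketch: BookInfo is a hypothetical class standing in for the fields you'd read out of each Details object (I'm not reproducing Amazon's actual class here), and the table layout is just one option.

```java
import java.util.Arrays;
import java.util.List;

/** Renders a list of similar books as an HTML table for the similarity area. */
class SimilarityHtml {
    /** Hypothetical stand-in for the fields read from one Details object. */
    static class BookInfo {
        final String title, author, isbn, asin, browseNode, coverUrl;

        BookInfo(String title, String author, String isbn, String asin,
                 String browseNode, String coverUrl) {
            this.title = title;
            this.author = author;
            this.isbn = isbn;
            this.asin = asin;
            this.browseNode = browseNode;
            this.coverUrl = coverUrl;
        }
    }

    // Escapes characters that would otherwise be read as HTML markup.
    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;")
                .replace(">", "&gt;").replace("\"", "&quot;");
    }

    /** Produces one table row per book; assumes every field is non-null. */
    static String render(List<BookInfo> books) {
        StringBuilder html = new StringBuilder("<table>\n");
        for (BookInfo b : books) {
            html.append("<tr><td><img src=\"").append(escape(b.coverUrl))
                .append("\" alt=\"cover\"></td>")
                .append("<td><b>").append(escape(b.title)).append("</b><br>")
                .append(escape(b.author)).append("<br>")
                .append("ISBN: ").append(escape(b.isbn))
                .append(", ASIN: ").append(escape(b.asin)).append("<br>")
                .append("BrowseNode: ").append(escape(b.browseNode))
                .append("</td></tr>\n");
        }
        return html.append("</table>\n").toString();
    }

    public static void main(String[] args) {
        // Made-up sample values, not real catalog data.
        List<BookInfo> books = Arrays.asList(new BookInfo(
                "Cats & Dogs", "A. Writer", "0000000000", "0000000000",
                "Pets", "http://example.com/cover.gif"));
        System.out.print(render(books));
    }
}
```

Escaping matters more than it might seem: plenty of book titles contain an ampersand, and unescaped ones can produce broken HTML.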