CS 662: AI Programming
Assignment 6: Ontology Design

Assigned: October 11
Due: October 23.
30 points total.

To turn in: Place a copy of your ontology (both the .pprj and .owl files) in your Subversion repository in a subdirectory named assignment 6. Also, please turn in a hard copy of the design document.

Introduction

In this assignment, we'll assume that, based on your expertise and skill in AI, you have been hired by Spamazon, an online retailer of books and media, to help develop an ontology for their new music store. They would like you to design and implement a medium-sized ontology using Protege and the OWL plugin. In designing your ontology, you will be expected to follow the design principles discussed in class and outlined in the paper "Ontology Development 101", which is linked below. In grading the project, I will be particularly interested in the design decisions you make, and your reasons for making them.

To begin

To start, complete the Pizza tutorial that we started in class. You don't need to turn anything in for this, but it will familiarize you both with using Protege and with ontology design.

Also, you should read Ontology Development 101. This will explain the steps involved in developing an ontology.

Grading

The project will be graded as follows:

Ontology implementation

In grading your ontology, I will evaluate how well you have captured the ideas in your domain, and also how well you have used the features of Protege to do so. For example, do your class/subclass relationships make sense? Do you use conditions properly? Are your classes consistent? How completely have you captured the relevant issues for your domain? I am more interested in seeing you accurately and completely represent a smaller domain than in seeing you do a superficial job of representing something larger.

Design document

This is primarily a design project, rather than a programming project. You will most likely write little or no code; most of your 'programming' will be within Protege. The point of it is to give you experience designing and implementing a medium-scale ontology using a full-featured ontology development tool.

As a result, I would like you to prepare a document that discusses your design choices. I'll talk about what the document should contain throughout the project, but, in short, it should contain:

There is no minimum or maximum page limit on this document; however, I think it would be difficult to present this information thoroughly in fewer than five pages. It should be typed or prepared using a word processing program. You are welcome (and encouraged) to use pictures or diagrams to help explain your ideas. I would also encourage you to be precise in your language whenever possible.

Note: This document is almost half the grade for this project. As a result, I would strongly suggest that you do not wait until the last minute to write it. In fact, I would suggest writing it while you are doing the design, rather than after.

If you are concerned about your ability to express yourself clearly in English, I would even more strongly suggest starting early. I do not expect perfect grammar, but I do expect to be able to understand what you are saying. If you are concerned about your writing skills (whether or not you are a native English speaker), I would suggest that you take advantage of the services provided by the USF Learning and Writing Center. If you're interested, they will look at your work and help you express yourself more clearly. This sort of grammatical or stylistic assistance is permitted; however, you may not have someone else write this document for you or assist you with the technical content. (In other words, having your roommate proofread it is fine. Having your roommate write it for you is not.)

(Note: a secondary goal of this project is to give you experience explaining technical ideas in written English. This is an essential skill in almost any job - it doesn't matter how good a programmer you are if you can't explain what you've done.)

The assignment itself

The online retailer Spamazon has recently created an online music store, and they would like an ontology on the back end to help them recommend music to customers.

Pick an area of music that you are interested in and create an ontology describing that area. I would strongly recommend focusing deeply on a specific area rather than trying to cover a wide one broadly. For example, you're much more likely to be successful with a limited domain such as Mozart sonatas or Top 40 Hits of 2007 than with a broad category like Opera, Rock, or Pop. Within that area, your ontology should still have enough breadth to cover the concepts of interest in the domain.

Your ontology should (at a minimum) represent:

Depending on the area of music you choose, you will want other elements, such as instruments, composers (for classical music), chart position (for hit singles), BPM (for electronic music), guest artists (for hip-hop), and so on. These should be driven by your competency questions.

Specifying your domain and usage


To begin, you will want to specify your ontology's domain precisely. For example, who will be using your ontology? What will they use it for? What sorts of relationships will they want to know about? Be creative and specific here - don't just say "Spamazon shoppers." You should prepare a set of at least 10 competency questions that your ontology should be able to answer. These questions should provide a picture of the breadth of your ontology's scope. (In other words, don't have "Who wrote (song X)?" 10 times.)

Your design document should have a section enumerating your competency questions and discussing how they provide sufficient usage examples.

Class and property design

Your design document should also contain a description of the important classes and properties in your ontology. Please keep in mind that listing every single class one by one is not necessarily the best way to present this information. You may find that pictures are better than words at describing the classes and the relationships between them. Jambalaya might be a very effective tool to help you depict class/slot relationships.

In this section, I would also like you to describe any significant design choices that you made in constructing your ontology. For example, why did you decide to use a subclass rather than a property? Why did you decide to use a class rather than a string for a property's value? How did you deal with genre? I'm particularly interested in your thought process here.

For example, assume we were making an ontology about USF. This would be a poor explanation: "I made a class for Student and a class for Professor because there are Students and Professors in the domain." Notice that it doesn't say anything about what other modeling possibilities might exist, the advantages or disadvantages of this approach, or what the designer's thought process was.

This is a better explanation: "In modeling graduate students and undergraduate students, I chose to treat these both as subclasses of Student. I also considered using a property within student called typeOfStudent, but decided that this was not appropriate, because graduate students and undergraduate students each have other traits, such as their rank and the number of credits needed for full time status, that should be kept distinct."

You don't need to do this for every single class - I just want to know about any interesting design problems that came up in your modeling.
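
The trade-off in the USF example can also be seen in a small sketch. The Python below only illustrates the two modeling options; it is not part of the assignment, since your modeling happens in Protege. The class names, the rank and program attributes, and the credit values are all hypothetical.

```python
# Option 1: a single class with a typeOfStudent property.
# Traits that differ by type need special-case logic elsewhere.
class StudentWithType:
    def __init__(self, name, type_of_student):
        self.name = name
        self.type_of_student = type_of_student  # "graduate" or "undergraduate"


def credits_for_full_time(student):
    # Hypothetical credit values, for illustration only.
    return 9 if student.type_of_student == "graduate" else 12


# Option 2: subclasses of Student, each keeping its own traits.
class Student:
    def __init__(self, name):
        self.name = name


class UndergraduateStudent(Student):
    full_time_credits = 12  # hypothetical value

    def __init__(self, name, rank):
        super().__init__(name)
        self.rank = rank  # e.g. "sophomore"


class GraduateStudent(Student):
    full_time_credits = 9  # hypothetical value

    def __init__(self, name, program):
        super().__init__(name)
        self.program = program  # e.g. "MS"


alice = UndergraduateStudent("Alice", rank="sophomore")
bob = GraduateStudent("Bob", program="MS")
carol = StudentWithType("Carol", "graduate")
```

With subclasses, each kind of student carries only the traits that apply to it (bob has no rank at all), while the single-class version pushes those differences into conditionals. Your design document should spell out this kind of comparison.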

Adding instances

You should then populate your ontology with instances. You will want enough different instances to test the different classes and slots you've created. As mentioned above, 75 is probably a good estimate of the number needed. You are welcome to use external sources, such as Amazon or Allmusic.com, to collect information; be sure you give them appropriate credit in your design document.

You may create your instances by hand using the Protege wizard, or write a program using the Java API to create instances. Here is some information on how to use the Java API.

Note: You are not required to write code to enter instances; if your Java skills are not very strong, you may find it a very frustrating experience. This information is provided for people who want to play with this aspect of Protege.

As you add instances, you will most likely find that there are some weaknesses or problems with your ontology. Include a section in your design document that describes any problems you found as you were creating instances and how you modified your ontology to address these problems. Be as specific as you can.

Consistency Checking

As you build your ontology, you will need to check it for consistency. You will also need to infer relationships that are entailed by the axioms you've entered. A significant advantage of an ontology is that this process can be automated using a reasoner.

You'll be using an external program called Pellet to do this checking. Protege is able to talk to Pellet via an HTTP interface. The address of this server can be configured under the Owl->Preferences menu.

Your final ontology must be consistent.
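
To make "consistent" concrete: one common way an ontology becomes inconsistent is that an individual is asserted to belong to two classes that have been declared disjoint. The sketch below is a toy illustration of that single check in Python. It is not how Pellet works internally, and a real reasoner checks far more than this. The function name, class names, and sample data are hypothetical.

```python
from itertools import combinations


def find_disjointness_violations(disjoint_pairs, memberships):
    """Return (individual, class_a, class_b) for every individual that is
    asserted to be a member of two classes declared disjoint."""
    disjoint = {frozenset(pair) for pair in disjoint_pairs}
    violations = []
    for individual, classes in sorted(memberships.items()):
        # Check every pair of classes the individual belongs to.
        for a, b in combinations(sorted(classes), 2):
            if frozenset((a, b)) in disjoint:
                violations.append((individual, a, b))
    return violations


# Hypothetical ontology fragment: Album and Artist are declared disjoint.
disjoint_pairs = [("Album", "Artist")]
memberships = {
    "KindOfBlue": {"Album"},
    "MilesDavis": {"Artist", "Album"},  # modeling mistake
}

violations = find_disjointness_violations(disjoint_pairs, memberships)
for individual, a, b in violations:
    print(f"{individual} is in both {a} and {b}, which are disjoint")
```

An empty result from a check like this would not prove the ontology consistent; it only shows the kind of contradiction a reasoner reports.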

Querying

You should now be able to encode your competency questions using either the SPARQL Query panel, the Query tab, or the Jambalaya browser. Is your ontology able to answer all of the competency questions you originally posed? Are there other problems you can foresee?
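
As a rough illustration of what "encoding a competency question" means, the sketch below treats an ontology as a list of subject/property/value triples and answers the question "Which albums by a given artist were released in a given year?" In Protege you would express this as a query rather than in Python; the SPARQL in the comment shows one possible shape. The property names (hasArtist, releaseYear), the function names, and the sample instances are assumptions made for this example.

```python
# A possible SPARQL form of the same question (prefix declaration omitted,
# property names hypothetical):
#
#   SELECT ?album WHERE {
#     ?album :hasArtist :MilesDavis .
#     ?album :releaseYear 1960 .
#   }

# Illustrative instance data as (subject, property, value) triples.
triples = [
    ("KindOfBlue", "hasArtist", "MilesDavis"),
    ("KindOfBlue", "releaseYear", 1959),
    ("SketchesOfSpain", "hasArtist", "MilesDavis"),
    ("SketchesOfSpain", "releaseYear", 1960),
    ("GiantSteps", "hasArtist", "JohnColtrane"),
    ("GiantSteps", "releaseYear", 1960),
]


def subjects_with(triples, prop, value):
    """Return the set of subjects that have the given property value."""
    return {s for (s, p, o) in triples if p == prop and o == value}


def albums_by_artist_in_year(triples, artist, year):
    """Answer the competency question by intersecting two triple patterns."""
    by_artist = subjects_with(triples, "hasArtist", artist)
    in_year = subjects_with(triples, "releaseYear", year)
    return sorted(by_artist & in_year)


result = albums_by_artist_in_year(triples, "MilesDavis", 1960)
print(result)
```

If a competency question cannot be written as a combination of patterns over your classes and properties, that usually points to a missing class or property in the ontology.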

Summary

Finally, your document should summarize the capabilities of your ontology. Now that you've built it and tested it, what are its potential uses? What audiences would be interested in it? Most importantly, what are your ontology's strengths and weaknesses? Are there concepts or queries that it cannot answer? How would you improve it in version 2.0?

Writing Hints