CS 662: AI Programming
Assignment 6: Ontology Design
Assigned: October 11
Due: October 23.
30 points total.
To turn in: Place a copy of your ontology (both the
.pprj and .owl files) in your Subversion repository in a subdirectory
named assignment 6. Also, please turn in a hard copy of the design document.
Introduction
In this assignment, we'll assume that, based on your expertise and skill
in AI, you have been hired by Spamazon, an online retailer of books
and media, to help develop an ontology for their new music store. They
would like you to design and implement a medium-sized ontology using
Protege and the OWL plugin. In designing your ontology, you will be
expected to follow the design principles discussed in class and
outlined in the paper "Ontology Development 101", which is linked
below. In grading the project, I will be particularly interested in
the design decisions you make, and your reasons for making them.
To begin
To start, complete the
Pizza tutorial that we started in class. You don't need
to turn anything in for this, but it will familiarize you both
with using Protege and with ontology design.
Also, you should read
Ontology Development 101. This will explain the steps
involved in developing an ontology.
Grading
The project will be graded as follows:
- 60% - Ontology implementation
- 40% - Design document
Ontology implementation
In grading your ontology, I will evaluate how well you have captured
the ideas in your domain, and also how well you have used the features
of Protege to do so. For example, do your class/subclass relationships
make sense? Do you use conditions properly? Are your classes
consistent? How completely have you captured the relevant issues for
your domain? I am more interested in seeing you accurately and
completely represent a smaller domain than in seeing you do a
superficial job at representing something larger.
Design document
This is primarily a design project, rather than a programming
project. You will most likely write little or no code; most of your
'programming' will be within Protege. The point of it is to give you
experience designing and implementing a medium-scale ontology using a
full-featured ontology development tool.
As a result, I would like you to prepare a document that discusses your
design choices. I'll talk about what the document should contain
throughout the project, but, in short, it should contain:
- A description of the domain
- A discussion of the users of your ontology
- A set of competency questions
- A description of your class hierarchy.
- A discussion of the major properties in your ontology.
- A discussion of the design decisions you made
- An analysis of the strengths and weaknesses of your design
There is no minimum or maximum page limit on this document;
however, I think it would be difficult to present this information
thoroughly in fewer than five pages. It should be typed or prepared
using a word processing program. You are welcome (and encouraged) to
use pictures or diagrams to help explain your ideas. I would also
encourage you to be precise in your language whenever possible.
Note: This document is almost half the grade for this project. As a
result, I would strongly suggest that you do not wait until the last
minute to write it. In fact, I would suggest writing it while
you are doing the design, rather than after.
If you are concerned about your ability to express yourself clearly in
English, I would even more strongly suggest starting early. I do not
expect perfect grammar, but I do expect to be able to understand what
you are saying. If you are concerned about your writing skills
(whether or not you are a native English speaker) I would suggest that
you take advantage of the services provided by the
USF Learning and Writing Center. If you're interested, they will
look at your work and help you express yourself more clearly. This
sort of grammatical or stylistic assistance is permitted; however, you
may not have someone else write this document for you or assist you
with the technical content. (In other words, having your roommate
proofread it is fine. Having your roommate write it for you is not.)
(Note: a secondary goal of this project is to give you experience
explaining technical ideas in written English. This is an essential
skill in almost any job - it doesn't matter how good a programmer you
are if you can't explain what you've done.)
The assignment itself
The online retailer Spamazon has recently created an online music
store, and they would like an ontology on the back end to help them
recommend music to customers.
Pick an area of music that you are interested in and create an
ontology describing that area. I would strongly recommend focusing
more deeply on a specific area, rather than trying to broadly cover a
wide area. You want your ontology to have enough breadth to actually
cover the concepts of interest in the domain. For example, you're much
more likely to be successful with a limited domain such as Mozart
sonatas or Top 40 Hits of 2007 than a broad category like Opera or
Rock or Pop.
Your ontology should (at a minimum) represent:
- Songs, which have titles, running times, genres and prices
- Albums, which contain songs, and have running times and prices.
- Artists, both solo and group.
- Genres
Depending on the area of music you choose, you will want other
elements, such as instruments, composers (for classical music),
chart position (for hit singles), BPM (for electronic music), guest
artists (for hip-hop), and so on. These should be driven by your
competency questions.
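To make the minimum requirements concrete, here is a rough sketch of the same structure written as plain Python classes. This is only an illustration of the shape of the model, not Protege or OWL, and every class and field name in it is a placeholder. Note that it treats an album's running time as something derived from its songs; that is one possible design choice, not the required one.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class Genre:
    name: str


@dataclass
class Artist:
    name: str


@dataclass
class SoloArtist(Artist):
    pass


@dataclass
class Group(Artist):
    # A group is an artist made up of solo artists.
    members: List[SoloArtist] = field(default_factory=list)


@dataclass
class Song:
    title: str
    running_time_s: int  # running time in seconds
    genres: List[Genre]
    price: float
    artist: Artist


@dataclass
class Album:
    title: str
    price: float
    songs: List[Song] = field(default_factory=list)

    @property
    def running_time_s(self) -> int:
        # Derived from the songs instead of being stored separately.
        return sum(song.running_time_s for song in self.songs)


# Placeholder data, invented purely for illustration.
genre = Genre("ExampleGenre")
singer = SoloArtist("Example Singer")
band = Group("Example Band", members=[singer])
album = Album(
    "Example Album",
    price=9.99,
    songs=[
        Song("Song A", 200, [genre], 0.99, band),
        Song("Song B", 180, [genre], 0.99, band),
    ],
)
print(album.running_time_s)
```

In your ontology, each of these choices (derived versus stored running time, genre as a class versus a string) is exactly the kind of decision your design document should explain.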
Specifying your domain and usage
To begin, you will want to specify your ontology's domain
precisely. For example, who will be using your ontology? What will
they use it for? What sorts of relationships will they want to know
about? Be creative and specific here - don't just say "Spamazon
shoppers." You should prepare a set of at least 10 competency
questions that your ontology should be able to answer. These questions
should provide a picture of the breadth of your ontology's
scope. (In other words, don't have "Who wrote (song X)?" 10 times.)
Your design document should have a section enumerating your competency
questions and discussing how they provide sufficient usage examples.
Class and property design
Your design document should also contain a description of the
important classes and properties in your ontology. Please keep in mind that
a listing of every single class one by one is not necessarily the
best way to present this information. You may find that pictures are
better than words at describing the classes and the relationships
between them. Jambalaya might be a very effective tool to help you
depict class/slot relationships.
In this section, I would also like you to describe any significant
design choices that you made in constructing your ontology. For
example, why did you decide to use a subclass rather than a property? Why
did you decide to use a class rather than a string for a property's
value? How did you deal with genre? I'm particularly interested in
your thought process here.
For example, assume we were making an ontology about USF. This would
be a poor explanation: "I made a class for
Student and a class for Professor because there
are Students and Professors in the domain."
Notice that it doesn't say anything about what other modeling
possibilities might exist, the advantages or disadvantages of this
approach, or what the designer's thought process was.
This is a better explanation: "In modeling graduate students and
undergraduate students, I chose to treat these both as subclasses of
Student. I also considered using a property within Student called
typeOfStudent, but decided that this was not appropriate, because
graduate students and undergraduate students each have other traits,
such as their rank and the number of credits needed for full time
status, that should be kept distinct."
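The two options in that explanation can be sketched side by side. The Python below is only an analogy for the modeling choice, not how Protege represents it; the class names are hypothetical and the credit thresholds are made-up placeholder values.

```python
from dataclasses import dataclass


# Option A: one class, with the kind of student stored as a property value.
@dataclass
class StudentWithType:
    name: str
    type_of_student: str  # "graduate" or "undergraduate"


# Option B: subclasses, so each kind of student can carry its own traits.
@dataclass
class Student:
    name: str


@dataclass
class UndergraduateStudent(Student):
    FULL_TIME_CREDITS = 12  # placeholder value, not from the assignment


@dataclass
class GraduateStudent(Student):
    FULL_TIME_CREDITS = 9  # placeholder value, not from the assignment


def is_full_time(student: Student, credits: int) -> bool:
    # The threshold comes from the subclass; no string comparison is needed.
    return credits >= type(student).FULL_TIME_CREDITS


print(is_full_time(GraduateStudent("A"), 9))
print(is_full_time(UndergraduateStudent("B"), 9))
```

With Option A, the same rule would have to branch on the value of type_of_student, and every new trait that differs between the two kinds of student would add another branch.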
You don't need to do this for every single class - I just want to know
about any interesting design problems that came up in your modeling.
Adding instances
You should then populate your ontology with instances. You will want
enough different instances to test the different classes and slots
you've created. About 75 instances is probably a good estimate of
the number needed. You are welcome to use external sources, such as
Amazon or Allmusic.com, to collect information; be sure you
give them appropriate credit in your design document.
You may create your instances by hand using the Protege wizard, or write a
program using the Java API to create instances.
Here is some information on how to use the Java API.
Note: You are not required to write code to
enter instances; if your Java skills are not very strong, you may
find it a very frustrating experience. This information is provided
for people who want to play with this aspect of Protege.
As you add instances, you will most likely find that there are some
weaknesses or problems with your ontology. Include a section in your
design document that describes any problems you found as you were
creating instances and how you modified your ontology to address these
problems. Be as specific as you can.
Consistency Checking
As you build your ontology, you will need to check it for
consistency. You will also need to infer relationships that are
entailed by the axioms you've entered. A significant advantage of an
ontology is that this process can be automated using a
reasoner.
You'll be using an external program called Pellet to do this
checking. Protege is able to talk to Pellet via an HTTP
interface. The address of this server can be configured under the
OWL -> Preferences menu.
Your final ontology must be consistent.
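As a toy illustration of one kind of inconsistency, the Python sketch below flags any individual that has been asserted to belong to two classes declared disjoint. This is not how Pellet works and it covers only this single case; a real reasoner also checks everything entailed by your restrictions and property axioms. The class and individual names are hypothetical.

```python
from typing import Dict, FrozenSet, List, Set, Tuple

# Hypothetical class names; each frozenset is a pair declared disjoint.
DISJOINT: Set[FrozenSet[str]] = {frozenset({"SoloArtist", "Group"})}


def find_clashes(types: Dict[str, Set[str]]) -> List[Tuple[str, str, str]]:
    """Return (individual, class_a, class_b) for each disjointness violation."""
    clashes = []
    for individual, classes in sorted(types.items()):
        for pair in DISJOINT:
            # The pair is violated when the individual has both classes.
            if pair <= classes:
                class_a, class_b = sorted(pair)
                clashes.append((individual, class_a, class_b))
    return clashes


# Placeholder assertions: artist_2 is wrongly typed as both classes.
asserted = {
    "artist_1": {"SoloArtist"},
    "artist_2": {"SoloArtist", "Group"},
}
print(find_clashes(asserted))
```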
Querying
You should now be able to encode your competency questions using the
SPARQL Query panel, the Query tab, or the Jambalaya browser. Is your
ontology able to answer all of the competency questions you originally
posed? Are there other problems you can foresee?
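To see what "encoding a competency question" amounts to, here is a small Python sketch that answers the question "Which songs appear on albums by a given artist?" by matching patterns against subject-property-value triples, which is roughly the idea behind a SPARQL query. It is not SPARQL and does not use Protege; the property names (onAlbum, byArtist) and all of the data are invented for the example.

```python
from typing import List, Optional, Tuple

Triple = Tuple[str, str, str]

# Placeholder facts in (subject, property, value) form.
TRIPLES: List[Triple] = [
    ("song_a", "onAlbum", "album_1"),
    ("song_b", "onAlbum", "album_1"),
    ("song_c", "onAlbum", "album_2"),
    ("album_1", "byArtist", "artist_x"),
    ("album_2", "byArtist", "artist_y"),
]


def match(
    triples: List[Triple],
    s: Optional[str] = None,
    p: Optional[str] = None,
    o: Optional[str] = None,
) -> List[Triple]:
    """Return the triples that fit the pattern; None acts as a wildcard."""
    return [
        t
        for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


def songs_by_artist(triples: List[Triple], artist: str) -> List[str]:
    # Step 1: find the albums by this artist.
    albums = {s for s, _, _ in match(triples, p="byArtist", o=artist)}
    # Step 2: find the songs on any of those albums.
    return sorted(s for s, _, o in match(triples, p="onAlbum") if o in albums)


print(songs_by_artist(TRIPLES, "artist_x"))
```

If a competency question cannot be broken down into steps like these using the classes and properties you have defined, that is a sign the ontology is missing something.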
Summary
Finally, your document should summarize the capabilities of your
ontology. Now that you've built it and tested it, what are its
potential uses? What audiences would be interested in it? Most
importantly, what are your ontology's strengths and weaknesses? Are
there concepts that it cannot represent or queries that it cannot
answer? How would you improve it in version 2.0?
Writing Hints