CS662 Assignment 9: Utility, perceptrons, and programming potpourri.
Assigned: Tuesday, November 27.
Due: Monday, December 10 at 9:00 am. No late assignments accepted.
40 points total.
To turn in: For problems 1, 2, and 3, submit typed or handwritten
answers. For the coding problems, please submit all code to a folder
named assignment9 in your Subversion repository, and also submit a
hard copy of your code.
Question 1. Utility. (5 points)
(from R&N, p. 610): Tickets to a lottery cost $1. There are two
possible prizes: a $10 payoff, with probability 1/50, and a $1,000,000
payoff with probability 1/2,000,000. What is the expected value of a
lottery ticket? What is the optimal number of tickets to buy, assuming
your utility for money is linear?
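As a reminder of the mechanics, here is a minimal sketch of an expected-value calculation. The lottery in it is made up (its prices and odds are not the ones in this question), and the names expected_value, ticket_price, and prizes are only illustrative.

```python
from fractions import Fraction

def expected_value(outcomes):
    """Return the sum of payoff * probability over (payoff, probability) pairs."""
    return sum(Fraction(payoff) * Fraction(prob) for payoff, prob in outcomes)

# Hypothetical lottery, NOT the one in the question:
# a $5 prize with probability 1/10 and a $100 prize with probability 1/200.
ticket_price = 2
prizes = [(5, Fraction(1, 10)), (100, Fraction(1, 200))]

gross = expected_value(prizes)   # expected winnings per ticket
net = gross - ticket_price       # expected gain per ticket after paying for it
print(gross, net)
```

Using Fraction keeps the arithmetic exact, which makes it easy to check the result by hand.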
Question 2. Value of information. (10 points total)
Suppose that our route-finding agent is trying to suggest a route for
us to get from USF to Oakland. We want to minimize the expected
travel time. We know that, when the Bay Bridge is busy, it takes 1
hour to drive there, and when the Bay Bridge is not busy, it takes 30
minutes to drive there. We know that taking BART always takes 40
minutes. We also know that the Bay Bridge is busy 40% of the time.
a. (2 points) Without any other information, should we drive or take BART? Show
all necessary work.
b. (2 points) Suppose that we can spend five minutes checking a traffic website
to see if the bridge is actually busy. We know that 90% of the time
when the bridge is actually busy, the site will say it's busy. (P(site
| busy) = 0.9) We also know that 20% of the time the site will say the
bridge is busy when it actually isn't. (P(site | !busy) = 0.2)
Use Bayes' rule to determine the probability that the bridge is
actually busy if the site says it is. (P(busy | site)).
c. (3 points) If the site says the bridge is busy, what should we do?
What if the site says the bridge is not busy? Show all work.
d. (3 points) Use a value of information calculation to determine whether it is
worth it for us to spend five minutes checking the traffic website.
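The two calculations used in parts b through d, Bayes' rule and the value of information, can be sketched as follows. The numbers are hypothetical and deliberately differ from the ones in this question, and the helper names (posterior, best_cost) are assumptions made for illustration only.

```python
def posterior(prior, p_obs_given_event, p_obs_given_not):
    """Bayes' rule. Returns (P(event | obs), P(obs))."""
    evidence = p_obs_given_event * prior + p_obs_given_not * (1.0 - prior)
    return p_obs_given_event * prior / evidence, evidence

def best_cost(p_event, actions):
    """Lowest expected cost over actions; each action maps to
    (cost if the event holds, cost if it does not)."""
    return min(p_event * c_event + (1.0 - p_event) * c_not
               for c_event, c_not in actions.values())

# Hypothetical problem: action A costs 50 if the event holds and 20 otherwise;
# action B always costs 30. The event holds half the time.
prior = 0.5
actions = {"A": (50.0, 20.0), "B": (30.0, 30.0)}
p_yes_given_event, p_yes_given_not = 0.8, 0.4   # hypothetical sensor model

# Best we can do with no observation.
cost_without = best_cost(prior, actions)

# Best we can do after each possible observation, weighted by how likely it is.
post_yes, p_yes = posterior(prior, p_yes_given_event, p_yes_given_not)
post_no, p_no = posterior(prior, 1 - p_yes_given_event, 1 - p_yes_given_not)
cost_with = p_yes * best_cost(post_yes, actions) + p_no * best_cost(post_no, actions)

# The observation is worth getting only if this exceeds what it costs to obtain.
value_of_information = cost_without - cost_with
print(post_yes, value_of_information)
```

The last step is the comparison that matters: the expected saving from observing must be weighed against the price of making the observation.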
Question 3. Perceptrons. (5 points)
Do exercise 20.15a on p. 761 of Russell & Norvig. You may assume that
alpha = 0.1 and w0 = 0.
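For reference, here is a minimal sketch of the standard threshold perceptron learning rule, w_j <- w_j + alpha * (y - h(x)) * x_j, with alpha = 0.1. It makes several assumptions: all weights, including the bias weight w0, start at 0 (one reading of "w0 = 0"); the training data is the Boolean AND function, chosen only as an illustration and not taken from the exercise; and the function names are invented here.

```python
def predict(weights, x):
    """Threshold unit. x[0] is the constant bias input 1."""
    return 1 if sum(w * xi for w, xi in zip(weights, x)) > 0 else 0

def train_perceptron(examples, alpha=0.1, max_epochs=100):
    """Apply the perceptron rule until an epoch makes no mistakes."""
    weights = [0.0] * len(examples[0][0])   # assumption: every weight starts at 0
    for _ in range(max_epochs):
        mistakes = 0
        for x, y in examples:
            error = y - predict(weights, x)
            if error != 0:
                mistakes += 1
                # Move each weight toward the correct output.
                weights = [w + alpha * error * xi for w, xi in zip(weights, x)]
        if mistakes == 0:
            break
    return weights

# Illustrative data: Boolean AND, with a leading bias input of 1.
and_examples = [((1, 0, 0), 0), ((1, 0, 1), 0), ((1, 1, 0), 0), ((1, 1, 1), 1)]
weights = train_perceptron(and_examples)
print(weights)
```

When working the exercise by hand, each row of your table corresponds to one pass through the inner loop.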
Question 4. Value Iteration and Policy Iteration. (5 points each)
For this problem, you will implement the value iteration and policy
iteration algorithms. I've provided a representation for states, a
map, and the setup for two problems: the one shown in R&N (and done in
class), and a larger problem, the map of which can be found here. In
this second problem, the
agent moves in the intended direction with P=0.7, and in each of the
other 3 directions with P=0.1. Your task is to implement the value
iteration and policy iteration algorithms and verify that they work
with both problems. (I'd suggest doing the R&N problem first.)
You may assume R=-0.04 for all non-goal states, and gamma = 0.8.
Here's an example of what the code looks like running in the Python
interpreter:
>>> import mdp
>>> m = mdp.makeRNProblem()
>>> m.valueIteration()
>>> [(s.coords, s.utility) for s in m.states.values()]
[(0, 0), (1, 0.30052947656142465), (2, 0.47206207850545195), (3,
0.68209220953458682), (4, 0.18120337982169335), (5,
0.34406397771608599), (6, 0.09080843870176547), (7,
0.095490116585228102), (8, 0.18785929363720655), (9,
0.00024908649990546677), (10, 1.0), (11, -1.0)]
>>> m.policyIteration()
>>> [(s.coords, s.utility, s.policy) for s in m.states.values()]
[(0, 0, None), (1, 0.28005761520403155, 'right'), (2,
0.4690814072745027, 'right'), (3, 0.68184632776188669, 'right'), (4,
0.15435343031111029, 'up'), (5, 0.34377291077136857, 'up'), (6,
0.061864822644220767, 'up'), (7, 0.088791721072110752, 'right'), (8,
0.18680600621029542, 'up'), (9, -0.00075615039456027738, 'left'), (10,
1.0, None), (11, -1.0, None)]
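As a rough guide to the update that value iteration repeats, here is a self-contained sketch on a tiny made-up MDP. It does not use the provided mdp module. The dictionary layout, the function name value_iteration, and the three-state chain are all assumptions for illustration. It follows the R&N form U(s) = R(s) + gamma * max_a sum_s' P(s' | s, a) U(s'), with R = -0.04 and gamma = 0.8 as above.

```python
def value_iteration(states, transitions, terminals, reward=-0.04,
                    gamma=0.8, epsilon=1e-9):
    """transitions[s][a] is a list of (probability, next_state) pairs.
    terminals maps each terminal state to its fixed utility."""
    utility = {s: terminals.get(s, 0.0) for s in states}
    while True:
        new_utility = dict(utility)
        delta = 0.0
        for s in states:
            if s in terminals:
                continue   # terminal utilities stay fixed
            # Expected utility of the best action from s.
            best = max(sum(p * utility[s2] for p, s2 in outcomes)
                       for outcomes in transitions[s].values())
            new_utility[s] = reward + gamma * best
            delta = max(delta, abs(new_utility[s] - utility[s]))
        utility = new_utility
        if delta < epsilon:
            return utility

# Hypothetical three-state chain: a -> b -> goal, with deterministic moves.
states = ["a", "b", "goal"]
terminals = {"goal": 1.0}
transitions = {
    "a": {"right": [(1.0, "b")], "stay": [(1.0, "a")]},
    "b": {"right": [(1.0, "goal")], "left": [(1.0, "a")]},
}
utilities = value_iteration(states, transitions, terminals)
print(utilities)
```

Policy iteration reuses the same expected-utility sum, but alternates between evaluating a fixed policy and improving it, instead of taking the max on every sweep.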
Potpourri
There are four programming problems described below. You must do one
of them, which is worth 10 points.
In addition, you may do up to two others, for up to 5 points
extra credit each, to be applied directly to the score of your lowest
midterm or final.
- Q-learning. Implement Q-learning for the problems described
above. I've provided a lot of the
skeleton code. You just need to implement the update, action
selection, and outer learning function. You can assume that alpha =
0.2 and gamma = 0.8.
- Boosting. The Enron
dataset is an email dataset that's harder to classify than the
spam dataset. (I've also placed a copy in
/home/public/cs662/enronMail). It consists of emails placed into
different folders by the users; the task is to classify an email
by folder based on the content. There are two challenges: some
users have lots of folders, and some folders have only a few messages.
Here is a class to help
process the Enron dataset, and a fairly straightforward Naive Bayes
classifier. The classifier does quite well on the SpamAssassin
dataset (>99%), but quite badly on Enron when more than three
folders are used as categories.
Here is a quick example of how to use the code, training and testing
on three folders belonging to user 'lokay-m'. It also uses code from
the emailClustering problem below:
>>> import enronEmail
>>> import naiveBayes
>>> c = enronEmail.buildDocFrequency('../enronMail/lokay-m')
(c is a tuple - c[0] is the number of documents, and c[1] is a
WordHash of document frequencies.)
>>> import emailClustering
>>> mails = emailClustering.loadData(["../enronMail/lokay-m/articles", "../enronMail/lokay-m/corporate","../enronMail/lokay-m/personal"], c[1], c[0])
>>> f1 = enronEmail.AlnumFilter()
>>> f2 = enronEmail.StopwordFilter()
>>> f3 = enronEmail.HeaderFilter()
>>> [m.filter([f1,f2,f3]) for m in mails]
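(classifier is assumed to be an instance of the Naive Bayes classifier
from the naiveBayes module; its construction is not shown here.)
>>> classifier.train(mails)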
>>> for mail in mails:
...     print classifier.classify(mail), mail.location
...
../enronMail/lokay-m/articles ../enronMail/lokay-m/articles
../enronMail/lokay-m/articles ../enronMail/lokay-m/articles
../enronMail/lokay-m/personal ../enronMail/lokay-m/articles
(etc)
Again, this does pretty well for three folders, but poorly when we
go to 7 or 8 folders, or to multiple users.
Use boosting to train a set of these classifiers. Each classifier
should provide a weighted vote as to the classification, based on
the strength of its MAP hypothesis. I'd start by testing this on a
small subset of the folders, and then gradually increase the number as
you gain confidence. lokay-m is a good user to work with, as she has a
lot of email. You are welcome to add more filters, and use as few or
as many as you like.
- Clustering. The Enron dataset can also be used to do
clustering. Here is some
skeleton code for clustering based on document content. You will
need to implement the dist() method (use cosine similarity) and the
K-means clustering algorithm to group emails into clusters based on
their content. As with the boosting problem, you are welcome to add
more filters and use as few or as many as you like.
Note: Clustering is a computationally intensive process. I
strongly suggest testing this on small datasets until you are
absolutely sure it works correctly. You might also want to test it
on the SpamAssassin data.
- Backpropagation. The Machine Learning group at CMU has created
a database of face images in conjunction with Tom Mitchell's
excellent book Machine Learning. There's also a copy in
/home/public/cs662/faces. I've provided some code for you that does most of
the work of reading in images and creating a multilayer neural
network. You just need to implement the feedForward and backprop
methods.
Currently the network is set up to determine whether someone is
wearing sunglasses. You should also modify it to determine
pose: whether someone is looking up, left, right, or straight. You
will need to have two output units for this, and modify the
determineCorrectOutput function.
The backpropNN module uses NumPy,
which can be found in the SciPy package.
Note: Neural network training is also a computationally
intensive process. I strongly suggest training with small data sets
for a small number of epochs, possibly using the smaller images
rather than the full-size ones, until you are positive your net works
correctly.
Here's an example of how to use the code:
>>> import backpropNN
>>> nn = backpropNN.backpropNN(128*120,1,4,0.3)
>>> backpropNN.train(nn, 'faces/simple.list', 50)
loaded data
epoch: 0
epoch: 1
epoch: 2
epoch: 3
epoch: 4
epoch: 5
epoch: 6
epoch: 7
epoch: 8
epoch: 9
epoch: 10
epoch: 11
epoch: 12
epoch: 13
epoch: 14
epoch: 15
epoch: 16
epoch: 17
epoch: 18
epoch: 19
epoch: 20
epoch: 21
epoch: 22
epoch: 23
epoch: 24
epoch: 25
epoch: 26
epoch: 27
epoch: 28
epoch: 29
epoch: 30
epoch: 31
epoch: 32
epoch: 33
epoch: 34
epoch: 35
epoch: 36
epoch: 37
epoch: 38
epoch: 39
epoch: 40
epoch: 41
epoch: 42
epoch: 43
epoch: 44
epoch: 45
epoch: 46
epoch: 47
epoch: 48
epoch: 49
Expected: [0] Actual [ 0.04212802]
Expected: [0] Actual [ 0.08170461]
Expected: [1] Actual [ 0.93460783]
Expected: [0] Actual [ 0.13312776]
Expected: [1] Actual [ 0.93585449]
Expected: [0] Actual [ 0.04586887]
Expected: [0] Actual [ 0.04755792]
Expected: [1] Actual [ 0.93601399]
Expected: [1] Actual [ 0.93635389]
Expected: [0] Actual [ 0.042275]
Expected: [0] Actual [ 0.05816649]
Expected: [1] Actual [ 0.9358427]
Expected: [0] Actual [ 0.06290364]
Expected: [1] Actual [ 0.93552888]
Expected: [1] Actual [ 0.93496914]
Expected: [1] Actual [ 0.93582527]
Expected: [1] Actual [ 0.93145154]
Expected: [1] Actual [ 0.92570123]
Expected: [0] Actual [ 0.04221538]
Expected: [1] Actual [ 0.92934903]
Expected: [1] Actual [ 0.93533853]
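For the backpropagation option above, the forward pass through a network with one hidden layer of sigmoid units can be sketched in plain Python as follows. This is not the backpropNN module's code or API: the function names, the weight layout (bias weight first in each list), and the tiny network sizes are all assumptions made for illustration, and a NumPy version would use array operations instead of these loops.

```python
import math

def sigmoid(x):
    """Logistic activation, mapping any real input into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def feed_forward(inputs, hidden_weights, output_weights):
    """Return (hidden activations, output activations).
    Each weight list holds the bias weight first, then one weight per input."""
    hidden = [sigmoid(w[0] + sum(wi * xi for wi, xi in zip(w[1:], inputs)))
              for w in hidden_weights]
    outputs = [sigmoid(w[0] + sum(wi * hi for wi, hi in zip(w[1:], hidden)))
               for w in output_weights]
    return hidden, outputs

# Hypothetical tiny network: 3 inputs, 2 hidden units, 1 output unit,
# with every weight set to zero so the result is easy to check by hand.
inputs = [0.2, 0.5, 0.9]
hidden_weights = [[0.0, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 0.0]]
output_weights = [[0.0, 0.0, 0.0]]
hidden, outputs = feed_forward(inputs, hidden_weights, output_weights)
print(hidden, outputs)
```

The hidden activations are returned alongside the outputs because the backward pass needs them to compute each hidden unit's share of the error.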