# CS662 Assignment 9, utility, perceptrons, and programming potpourri.

Assigned: Tuesday, November 27.
Due: Monday, December 10 at 9:00 am. No late assignments accepted.
40 points total.

To turn in: For problems 1, 2, and 3, submit typed or handwritten answers. For the coding problems, please submit all code to a folder named assignment9 in your Subversion repository, and also submit a hard copy of your code.

Question 1. Utility. (5 points)
(From R&N, p. 610.) Tickets to a lottery cost \$1. There are two possible prizes: a \$10 payoff, with probability 1/50, and a \$1,000,000 payoff with probability 1/2,000,000. What is the expected value of a lottery ticket? What is the optimal number of tickets to buy, assuming your utility for money is linear?
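
The expected-value calculation has the same shape for any lottery: weight each payoff by its probability, sum, and subtract the ticket price. Here's a minimal sketch using a made-up lottery rather than the one in this question (the function name and all numbers are illustrative assumptions):

```python
from fractions import Fraction

def expected_net_value(ticket_cost, prizes):
    """Expected profit of one ticket: sum of p * payoff, minus the ticket cost."""
    gross = sum(Fraction(p) * payoff for payoff, p in prizes)
    return gross - ticket_cost

# Hypothetical lottery: a $2 ticket, $5 with probability 1/10, $100 with probability 1/100.
toy_prizes = [(5, Fraction(1, 10)), (100, Fraction(1, 100))]
toy_value = expected_net_value(2, toy_prizes)
print(toy_value)  # -1/2
```

With linear utility, the sign of this value is what decides whether buying tickets is worthwhile.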

Question 2. Value of information. (10 points total)

Suppose that our route-finding agent is trying to suggest a route for us to get from USF to Oakland. We want to minimize the expected travel time. We know that, when the Bay Bridge is busy, it takes 1 hour to drive there, and when the Bay Bridge is not busy, it takes 30 minutes to drive there. We know that taking BART always takes 40 minutes. We also know that the Bay Bridge is busy 40% of the time.

a. (2 points) Without any other information, should we drive or take BART? Show all necessary work.

b. (2 points) Suppose that we can spend five minutes checking a traffic website to see if the bridge is actually busy. We know that 90% of the time when the bridge is actually busy, the site will say it's busy. (P(site | busy) = 0.9) We also know that 20% of the time the site will say the bridge is busy when it actually isn't. (P(site | !busy) = 0.2)

Use Bayes' rule to determine the probability that the bridge is actually busy if the site says it is. (P(busy | site)).
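
As a reminder of the mechanics, here's a minimal sketch of Bayes' rule for a binary hypothesis, with P(E) expanded by total probability. The function name and the numbers are hypothetical, chosen so they differ from the ones in this question:

```python
from fractions import Fraction

def posterior(prior, p_evidence_given_h, p_evidence_given_not_h):
    """P(H | E) = P(E | H) P(H) / (P(E | H) P(H) + P(E | not H) P(not H))."""
    joint_h = p_evidence_given_h * prior
    joint_not_h = p_evidence_given_not_h * (1 - prior)
    return joint_h / (joint_h + joint_not_h)

# Hypothetical numbers: P(H) = 1/2, P(E | H) = 4/5, P(E | not H) = 2/5.
example = posterior(Fraction(1, 2), Fraction(4, 5), Fraction(2, 5))
print(example)  # 2/3
```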

c. (3 points) If the site says the bridge is busy, what should we do? What if the site says the bridge is not busy? Show all work.

d. (3 points) Use a value of information calculation to determine whether it is worth it for us to spend five minutes checking the traffic website.
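
A value-of-information calculation compares the best expected outcome when acting on the prior alone with the expected outcome when the action is chosen after seeing the evidence. Here's a minimal sketch in terms of expected cost, using a hypothetical two-action problem with a perfectly reliable report (all names and numbers are illustrative, not the ones from this question):

```python
def expected_cost(costs, p_state):
    """Expected cost of one action; costs and p_state are dicts keyed by state."""
    return sum(p_state[s] * costs[s] for s in p_state)

def best_cost(actions, p_state):
    """Lowest expected cost over all available actions."""
    return min(expected_cost(costs, p_state) for costs in actions.values())

def value_of_information(actions, prior, outcomes):
    """Expected saving from observing the evidence before acting.

    outcomes is a list of (P(evidence), posterior over states) pairs.
    """
    informed = sum(p_e * best_cost(actions, post) for p_e, post in outcomes)
    return best_cost(actions, prior) - informed

# Hypothetical problem: costs in minutes for two actions under two states.
actions = {"a": {"good": 10, "bad": 50}, "b": {"good": 30, "bad": 30}}
prior = {"good": 0.5, "bad": 0.5}
# A perfectly reliable report: each outcome reveals the state exactly.
outcomes = [(0.5, {"good": 1.0, "bad": 0.0}), (0.5, {"good": 0.0, "bad": 1.0})]
voi = value_of_information(actions, prior, outcomes)
print(voi)  # 10.0
```

The information is worth gathering only if this saving exceeds what it costs to obtain, measured in the same units.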

Question 3. Perceptrons. (5 points)

Do 20.15a on p. 761 of Russell & Norvig. You may assume that alpha = 0.1 and w0 = 0.
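
The perceptron learning rule adjusts each weight by alpha times the error times the corresponding input. Here's a minimal sketch on toy data (logical AND, not the textbook's examples). It assumes a step activation, all weights starting at zero, and a bias input fixed at 1; check the sign convention your edition of the text uses for the bias input. The function names are made up for illustration.

```python
def predict(weights, x):
    """Step-function perceptron; x[0] is the fixed bias input."""
    total = sum(w * xi for w, xi in zip(weights, x))
    return 1 if total >= 0 else 0

def train_perceptron(examples, alpha=0.1, max_epochs=100):
    """Repeat w_i += alpha * (target - output) * x_i until an epoch has no errors."""
    weights = [0.0] * len(examples[0][0])
    for _ in range(max_epochs):
        mistakes = 0
        for x, target in examples:
            error = target - predict(weights, x)
            if error != 0:
                mistakes += 1
                weights = [w + alpha * error * xi for w, xi in zip(weights, x)]
        if mistakes == 0:
            break
    return weights

# Toy data: logical AND. The first component of each input is the bias input.
and_examples = [((1, 0, 0), 0), ((1, 0, 1), 0), ((1, 1, 0), 0), ((1, 1, 1), 1)]
w = train_perceptron(and_examples)
print([predict(w, x) for x, _ in and_examples])
```

When working the exercise by hand, it helps to record the weights after each example, since every update depends on the weights left by the previous one.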

Question 4. Value Iteration and Policy Iteration. (5 points each)

For this problem, you will implement the value iteration and policy iteration algorithms. I've provided a representation for states, a map, and the setup for two problems - the one shown in R&N (and done in class), and a larger problem, the map of which can be found here. In this second problem, the agent moves in the intended direction with P=0.7, and in each of the other 3 directions with P=0.1. Your task is to implement the value iteration and policy iteration algorithms and verify that they work with both problems. (I'd suggest doing the R&N problem first.)

You may assume R=-0.04 for all non-goal states, and gamma = 0.8.

Here's an example of what the code looks like running in the Python interpreter:
```
>>> import mdp
>>> m = mdp.makeRNProblem()
>>> m.valueIteration()
>>> [(s.coords, s.utility) for s in m.states.values()]
[(0, 0), (1, 0.30052947656142465), (2, 0.47206207850545195), (3,
0.68209220953458682), (4, 0.18120337982169335), (5,
0.34406397771608599), (6, 0.09080843870176547), (7,
0.095490116585228102), (8, 0.18785929363720655), (9,
0.00024908649990546677), (10, 1.0), (11, -1.0)]
>>> m.policyIteration()
>>> [(s.coords, s.utility, s.policy) for s in m.states.values()]
[(0, 0, None), (1, 0.28005761520403155, 'right'), (2,
0.4690814072745027, 'right'), (3, 0.68184632776188669, 'right'), (4,
0.15435343031111029, 'up'), (5, 0.34377291077136857, 'up'), (6,
0.061864822644220767, 'up'), (7, 0.088791721072110752, 'right'), (8,
0.18680600621029542, 'up'), (9, -0.00075615039456027738, 'left'), (10,
1.0, None), (11, -1.0, None)]
```
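
For reference, here is a minimal standalone sketch of both algorithms on a hypothetical three-state chain, using R = -0.04 and gamma = 0.8 as above. It does not use the provided mdp module: the dictionary layout (T, R) and the function names are assumptions made for illustration, and policy evaluation is approximated with a fixed number of sweeps rather than solved exactly.

```python
GAMMA = 0.8

def q_value(s, a, U, T):
    """Expected utility of the successor state when taking action a in state s."""
    return sum(p * U[s2] for p, s2 in T[s][a])

def value_iteration(T, R, gamma=GAMMA, epsilon=1e-6):
    """Apply U(s) = R(s) + gamma * max_a sum_s' P(s'|s,a) U(s') until stable."""
    U = {s: 0.0 for s in R}
    while True:
        new_U = {}
        for s in R:
            if not T[s]:  # terminal state: utility is just its reward
                new_U[s] = R[s]
            else:
                new_U[s] = R[s] + gamma * max(q_value(s, a, U, T) for a in T[s])
        delta = max(abs(new_U[s] - U[s]) for s in R)
        U = new_U
        if delta < epsilon:
            return U

def greedy_policy(U, T):
    """Best action in each state under utilities U (None for terminal states)."""
    return {s: max(T[s], key=lambda a: q_value(s, a, U, T)) if T[s] else None
            for s in T}

def policy_iteration(T, R, gamma=GAMMA, eval_sweeps=50):
    """Alternate approximate policy evaluation with greedy policy improvement."""
    policy = {s: (next(iter(T[s])) if T[s] else None) for s in T}
    U = {s: 0.0 for s in R}
    while True:
        for _ in range(eval_sweeps):
            U = {s: R[s] + (gamma * q_value(s, policy[s], U, T) if T[s] else 0.0)
                 for s in R}
        improved = greedy_policy(U, T)
        if improved == policy:
            return U, policy
        policy = improved

# Hypothetical chain: state 2 is the goal; "right" succeeds with probability 0.8.
T = {
    0: {"left": [(1.0, 0)], "right": [(0.8, 1), (0.2, 0)]},
    1: {"left": [(1.0, 0)], "right": [(0.8, 2), (0.2, 1)]},
    2: {},
}
R = {0: -0.04, 1: -0.04, 2: 1.0}

U_vi = value_iteration(T, R)
U_pi, pi = policy_iteration(T, R)
print(pi)
```

Both algorithms should agree on the policy; the utilities can differ slightly, as in the transcript above, because each stops at a different point.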

### Potpourri

There are four programming problems described below. You must do one of them, which is worth 10 points.

In addition, you may do up to two others, for up to 5 points extra credit each, to be applied directly to the score of your lowest midterm or final.
• Q-learning. Implement Q-learning for the problems described above. I've provided a lot of the skeleton code. You just need to implement the update, action selection, and outer learning function. You can assume that alpha = 0.2 and gamma = 0.8.
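
The three pieces named here (update, action selection, outer learning loop) can be sketched as follows, using alpha = 0.2 and gamma = 0.8. This is a standalone Python 3 illustration, not the provided skeleton: the function names, the epsilon-greedy exploration scheme, and the toy corridor environment are all assumptions.

```python
import random
from collections import defaultdict

ALPHA, GAMMA = 0.2, 0.8

def q_update(Q, s, a, reward, s_next, next_actions):
    """One temporal-difference update of Q[(s, a)]."""
    best_next = max((Q[(s_next, a2)] for a2 in next_actions), default=0.0)
    Q[(s, a)] += ALPHA * (reward + GAMMA * best_next - Q[(s, a)])

def choose_action(Q, s, actions, epsilon, rng):
    """Epsilon-greedy: explore with probability epsilon, otherwise exploit."""
    if rng.random() < epsilon:
        return rng.choice(actions)
    return max(actions, key=lambda a: Q[(s, a)])

def learn(step, start, actions, episodes=500, epsilon=0.1, seed=0):
    """Outer loop; step(s, a) -> (reward, next_state, done) is a hypothetical environment."""
    rng = random.Random(seed)
    Q = defaultdict(float)
    for _ in range(episodes):
        s, done = start, False
        while not done:
            a = choose_action(Q, s, actions, epsilon, rng)
            reward, s_next, done = step(s, a)
            # No future value is available once the episode has ended.
            q_update(Q, s, a, reward, s_next, [] if done else actions)
            s = s_next
    return Q

# Toy corridor: states 0..3; reaching state 3 ends the episode with reward 1.
def corridor_step(s, a):
    s_next = max(0, s + (1 if a == "right" else -1))
    done = s_next == 3
    return (1.0 if done else -0.04), s_next, done

Q = learn(corridor_step, start=0, actions=["left", "right"])
print({s: max(["left", "right"], key=lambda a: Q[(s, a)]) for s in range(3)})
```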

• Boosting. The Enron dataset is an email dataset that's harder to classify than the spam dataset. (I've also placed a copy in /home/public/cs662/enronMail). It consists of emails placed into different folders by the users; the task is to classify an email by folder based on the content. There are two challenges: some users have lots of folders, and some folders have only a few messages.
Here is a class to help process the Enron dataset, and a fairly straightforward Naive Bayes classifier. The classifier does quite well on the SpamAssassin dataset (>99%), but quite badly on Enron when more than three folders are used as categories.

Here is a quick example of how to use the code, training and testing on three folders belonging to user 'lokay-m'. It also uses code from the emailClustering problem below:
```
>>> import enronEmail
>>> import naiveBayes
>>> c = enronEmail.buildDocFrequency('../enronMail/lokay-m')
>>> # c is a tuple: c[0] is the number of documents, and c[1] is a
>>> # WordHash of document frequencies.
>>> import emailClustering
>>> mails = emailClustering.loadData(["../enronMail/lokay-m/articles", "../enronMail/lokay-m/corporate", "../enronMail/lokay-m/personal"], c[1], c[0])
>>> f1 = enronEmail.AlnumFilter()
>>> f2 = enronEmail.StopwordFilter()
>>> [m.filter([f1, f2]) for m in mails]
>>> # classifier is an instance of the Naive Bayes classifier from naiveBayes (construction not shown).
>>> classifier.train(mails)
>>> for mail in mails:
...     print classifier.classify(mail), mail.location
...
../enronMail/lokay-m/articles ../enronMail/lokay-m/articles
../enronMail/lokay-m/articles ../enronMail/lokay-m/articles
../enronMail/lokay-m/personal ../enronMail/lokay-m/articles
(etc)
```
Again, this does pretty well for three folders, but poorly when we go to 7 or 8 folders or to multiple users.

Use boosting to train a set of these classifiers. Each classifier should provide a weighted vote as to the classification, based on the strength of its MAP hypothesis. I'd start by testing this on a small subset of the folders, and then gradually increase the number of folders as you gain confidence. lokay-m is a good user to work with, as she has a lot of email. You are welcome to add more filters, and use as few or as many as you like.
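
To make the mechanics concrete, here's a minimal sketch of one reweighting round and the final weighted vote. Note that it uses the standard AdaBoost vote weight, log((1 - error) / error), rather than the MAP-strength weighting asked for above; to follow the assignment you would swap in your own vote weights. The function names are hypothetical and this is independent of the provided classes.

```python
import math
from collections import defaultdict

def boosting_round(weights, correct):
    """One AdaBoost reweighting step.

    weights: current example weights (summing to 1).
    correct: booleans, True where the current classifier got the example right.
    Returns (normalized new weights, vote weight for this classifier).
    """
    error = sum(w for w, ok in zip(weights, correct) if not ok)
    if not 0 < error < 1:
        raise ValueError("weighted error must be strictly between 0 and 1")
    # Shrink the weight of correctly classified examples, then renormalize,
    # so the next classifier concentrates on the examples this one missed.
    scaled = [w * error / (1 - error) if ok else w
              for w, ok in zip(weights, correct)]
    total = sum(scaled)
    return [w / total for w in scaled], math.log((1 - error) / error)

def weighted_vote(predictions, vote_weights):
    """Return the label with the largest total vote weight."""
    tally = defaultdict(float)
    for label, z in zip(predictions, vote_weights):
        tally[label] += z
    return max(tally, key=tally.get)

# Four equally weighted examples; the classifier misses only the last one.
new_w, z = boosting_round([0.25] * 4, [True, True, True, False])
print(new_w, z)
print(weighted_vote(["articles", "personal", "personal"], [2.0, 0.7, 0.9]))  # articles
```

With many folders, a weak classifier's weighted error can exceed 0.5, which makes this vote weight negative, so watch for that as you add categories.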

• Clustering. The Enron dataset can also be used to do clustering. Here is some skeleton code for clustering based on document content. You will need to implement the dist() method (use cosine similarity) and the K-means clustering algorithm to group emails into clusters based on their content. As with the boosting problem, you are welcome to add more filters and use as few or as many as you like.
Note: Clustering is a computationally intensive process. I strongly suggest testing this on small datasets until you are absolutely sure it works correctly. You might also want to test it on the SpamAssassin data.
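
Here's a minimal sketch of a cosine-based distance and a basic K-means loop on a tiny made-up dataset. It represents each document as a plain {term: weight} dict, which is an assumption; the skeleton's own document class and dist() signature may differ, and the function names are illustrative.

```python
import math
import random
from collections import defaultdict

def cosine_distance(u, v):
    """1 - cosine similarity for sparse vectors stored as {term: weight} dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 1.0  # treat an empty vector as maximally distant
    return 1.0 - dot / (norm_u * norm_v)

def centroid(vectors):
    """Component-wise mean of a non-empty list of sparse vectors."""
    total = defaultdict(float)
    for vec in vectors:
        for t, w in vec.items():
            total[t] += w
    return {t: w / len(vectors) for t, w in total.items()}

def k_means(vectors, k, max_iters=20, seed=0):
    """Assign each vector to its nearest center, recompute centers, repeat until stable."""
    rng = random.Random(seed)
    centers = rng.sample(vectors, k)
    assignment = None
    for _ in range(max_iters):
        new_assignment = [
            min(range(k), key=lambda c: cosine_distance(vec, centers[c]))
            for vec in vectors
        ]
        if new_assignment == assignment:
            break
        assignment = new_assignment
        for c in range(k):
            members = [vec for vec, a in zip(vectors, assignment) if a == c]
            if members:  # keep the old center if a cluster empties out
                centers[c] = centroid(members)
    return assignment

# Made-up word counts for four tiny "emails" on two topics.
docs = [
    {"meeting": 1, "budget": 1},
    {"meeting": 2, "budget": 1},
    {"soccer": 1, "game": 1},
    {"soccer": 1, "game": 2},
]
labels = k_means(docs, k=2)
print(labels)
```

Every iteration computes a distance from each document to each center, which is why small test sets are a good idea while debugging.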

• Backpropagation. The Machine Learning group at CMU has created a database of face images in conjunction with Tom Mitchell's excellent book Machine Learning. There's also a copy in /home/public/cs662/faces. I've provided some code for you that does most of the work of reading in images and creating a multilayer neural network. You just need to implement the feedForward and backprop methods.

Currently the network is set up to determine whether someone is wearing sunglasses. You should also modify it to determine pose: whether someone is looking up, left, right, or straight. You will need to have two output units for this, and modify the determineCorrectOutput function.

The backpropNN module uses NumPy, which can be found in the SciPy package.

Note: Neural network training is also a computationally intensive process. I strongly suggest training with small data sets for a small number of epochs, possibly using the smaller images rather than the full-size ones, until you are positive your net works correctly.

Here's an example of how to use the code:
```
>>> import backpropNN
>>> nn = backpropNN.backpropNN(128*120,1,4,0.3)
>>> backpropNN.train(nn, 'faces/simple.list', 50)
epoch:  0
epoch:  1
epoch:  2
epoch:  3
epoch:  4
epoch:  5
epoch:  6
epoch:  7
epoch:  8
epoch:  9
epoch:  10
epoch:  11
epoch:  12
epoch:  13
epoch:  14
epoch:  15
epoch:  16
epoch:  17
epoch:  18
epoch:  19
epoch:  20
epoch:  21
epoch:  22
epoch:  23
epoch:  24
epoch:  25
epoch:  26
epoch:  27
epoch:  28
epoch:  29
epoch:  30
epoch:  31
epoch:  32
epoch:  33
epoch:  34
epoch:  35
epoch:  36
epoch:  37
epoch:  38
epoch:  39
epoch:  40
epoch:  41
epoch:  42
epoch:  43
epoch:  44
epoch:  45
epoch:  46
epoch:  47
epoch:  48
epoch:  49
Expected: [0] Actual [ 0.04212802]
Expected: [0] Actual [ 0.08170461]
Expected: [1] Actual [ 0.93460783]
Expected: [0] Actual [ 0.13312776]
Expected: [1] Actual [ 0.93585449]
Expected: [0] Actual [ 0.04586887]
Expected: [0] Actual [ 0.04755792]
Expected: [1] Actual [ 0.93601399]
Expected: [1] Actual [ 0.93635389]
Expected: [0] Actual [ 0.042275]
Expected: [0] Actual [ 0.05816649]
Expected: [1] Actual [ 0.9358427]
Expected: [0] Actual [ 0.06290364]
Expected: [1] Actual [ 0.93552888]
Expected: [1] Actual [ 0.93496914]
Expected: [1] Actual [ 0.93582527]
Expected: [1] Actual [ 0.93145154]
Expected: [1] Actual [ 0.92570123]
Expected: [0] Actual [ 0.04221538]
Expected: [1] Actual [ 0.92934903]
Expected: [1] Actual [ 0.93533853]
```
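
For orientation, here is a minimal sketch of the feed-forward and backpropagation steps for a network with one hidden layer of sigmoid units, trained on a toy problem (logical OR) instead of the face images. It is written in plain Python rather than NumPy and does not follow the backpropNN class's interface; the function names, weight layout, and learning rate are assumptions for illustration.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def feed_forward(x, w_hidden, w_out):
    """Return (hidden activations, output activations); each weight row ends with a bias."""
    hidden = [sigmoid(sum(w * v for w, v in zip(row, x + [1.0]))) for row in w_hidden]
    output = [sigmoid(sum(w * v for w, v in zip(row, hidden + [1.0]))) for row in w_out]
    return hidden, output

def backprop(x, target, w_hidden, w_out, rate):
    """One gradient step on squared error for a single example; updates weights in place."""
    hidden, output = feed_forward(x, w_hidden, w_out)
    # Output deltas: error times the sigmoid derivative o * (1 - o).
    delta_out = [(t - o) * o * (1 - o) for t, o in zip(target, output)]
    # Hidden deltas: push the output deltas back through the not-yet-updated output weights.
    delta_hidden = [
        h * (1 - h) * sum(d * row[j] for d, row in zip(delta_out, w_out))
        for j, h in enumerate(hidden)
    ]
    for row, d in zip(w_out, delta_out):
        for j, v in enumerate(hidden + [1.0]):
            row[j] += rate * d * v
    for row, d in zip(w_hidden, delta_hidden):
        for j, v in enumerate(x + [1.0]):
            row[j] += rate * d * v

# Toy problem: two inputs, two hidden units, one output unit, learning logical OR.
rng = random.Random(0)
w_hidden = [[rng.uniform(-0.5, 0.5) for _ in range(3)] for _ in range(2)]
w_out = [[rng.uniform(-0.5, 0.5) for _ in range(3)]]
data = [([0.0, 0.0], [0.0]), ([0.0, 1.0], [1.0]), ([1.0, 0.0], [1.0]), ([1.0, 1.0], [1.0])]

for _ in range(2000):
    for x, target in data:
        backprop(x, target, w_hidden, w_out, rate=0.5)

for x, target in data:
    print("Expected:", target, "Actual", feed_forward(x, w_hidden, w_out)[1])
```

Getting a tiny case like this working first makes it much easier to trust the same update rules on the full-size images.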