CS 680 Internet Systems Research
Spring 2005

Description
Survey of Internet systems research, including the anatomy of the web, search engine architecture and algorithms, information retrieval, crawling, personalization, contextual computing, collaborative environments, peer-to-peer systems, personal information management systems, and the semantic web.
Prerequisite: CS 662 Artificial Intelligence (see the instructor if you have not taken it and would still like to take the course).

USEFUL LINKS: Paper Commentary Blog, Guidelines for Commenting, Special Lecture Series, Course Wiki

FINAL REVIEW

Instructor
Dr. David Wolber
Office: Harney 539
Phone: (415) 422-6451
Email: wolber@usfca.edu
Homepage: www.cs.usfca.edu/~wolber
Office Hours: Mondays 3-4, Wednesdays 11-12, and by appointment.

Activities
Reading and Research-- Read seminal and current research papers to gain knowledge of the state-of-the-art and the process of research.

Produce a Documentary/Research Paper-- The goal of the projects is to produce publishable work in the form of a research paper or documentary video. The first project, encompassing the initial eight weeks of the semester, will produce a research paper along with other deliverables (code, wiki, etc.). The second project will produce a documentary video clip of 10-20 minutes (along with other deliverables).
 

Develop Innovative Software-- Though this course does not focus on software development, the research projects will require the development of software. Some projects will extend the existing Webtop system.

Develop Course Wiki/Blog -- The product of student commentaries will be a publicly available Internet Systems Research Blog and Wiki.

Course Structure
Prior to each class session: Each student will read the assigned paper and write commentaries following these guidelines. All commentaries should be cut-and-pasted into a single document, printed out, and brought to class.

Each class session: One hour will be devoted to the assigned paper, combining lecture and discussion. Students are required to participate, and part of the grade depends on this participation. Generally, the other hour will be devoted to the projects-- groups of 2 or 3 will meet, discuss, perform research on the web, give mini-reports to the class at large, develop software, etc. The professor will float between groups eavesdropping, suggesting, learning, etc. We will also have some guest speakers: Jim Pitkow from moreover.com, Eytan Adar from the HP Information Dynamics Lab, and Igor Ranitovic from the Internet Archive are already signed up.

The assigned papers allow for a breadth-first exploration of the field, while the projects allow for a more in-depth study of a particular area and an opportunity for creative work.

Students are also required to attend three research talks outside of class during the semester. These may be events in the Bay Area or talks in the USF Special Lecture Series. Your instructor will suggest a number of talks, and you may suggest others to him. Students will write commentaries on the talks as with the readings (see guidelines).

Projects
There will be two projects for the course, one due prior to spring break, and one at the end of the year.
Project I
Project II

Topics, Papers, Software, and Lectures
(Reading and Responses Due by class time on date given)

Topic Reading Lectures and Software
Introduction to Internet Systems Research  

Additional Resources
Motivation for Internet Research

Research Skills 

Research Resources

Search Engine Architecture and Algorithms

1/27/05
Authoritative sources in a hyperlinked environment
J Kleinberg - Cited by 1059  

2/1/05
The anatomy of a large-scale hypertextual Web search engine
S Brin, L Page - Cited by 1087

2/3/05
MIT Technology Review Article on Google/Microsoft, Jan 2005 (to be distributed)

2/8/05 (Note: originally was scheduled for 2/3/05)
Impact of Search Engines on Page Popularity
J Cho, S Roy (WWW 04 paper)

Additional resources

Discussion Questions

Personalization and Context

2/10 Speaker: Eytan Adar of HP Info Dynamics Lab

2/15
Personalized Search
J Pitkow, H Schuetze, T Cass, R Cooley, D Turnbull, A Edmonds, E Adar, T Breuel

Initial Project Presentations

2/17
Context in Web Search (blogging optional)
S Lawrence - Cited by 49

Initial Project Presentations

2/22
Exploring the Web with Reconnaissance Agents
H Lieberman, C Fry, L Weitzman - Cited by 33

Initial Project Presentations
Project Wiki Preliminary Grading Starts 12 Midnight

2/24 Context and Personalization Study Due

Additional resources

Google Personal from Google Labs

A9  what's cool about it?

Watson

 

PowerPoint on Recon Agent paper

Personal Information Management (PIM)

3/1
Vannevar Bush, As We May Think, 1945 (optional)

Xanalogical structure, needed now more than ever: parallel documents, deep links to content, deep …
TH Nelson - Cited by 21

3/3 Wolber SLS Talk on WebTop
WebTop Worksheet

Additional resources

Google Desktop

Webtop

The Brain

 

Semantic Web

3/8
The Semantic Web
T Berners-Lee, J Hendler, O Lassila - Cited by 1347

Berners-Lee Talk

3/10 SLS Talk: James Pitkow

3/15 Midterm Review and Project Workshop

3/17 Midterm

3/22 and 3/24 BREAK

3/31
Semantic Search
R Guha, R McCool, E Miller - Cited by 22


Speaker: Lada Adamic

4/5
How to Make a Semantic Web Browser
D Quan, DR Karger - Cited by 2 (Haystack)

Additional resources

Protege

Haystack

 

Crawling and Search Area Creation

4/7 Talk: Doug Cutting, Developer of Nutch/Lucene

4/12
Focused crawling: A new approach for topic-specific resource discovery

S Chakrabarti, M van den Berg, B Dom - Cited by 250

NOTE: You may read the Chakrabarti paper or any of the crawling ones listed in the "Additional Resources" link below.

4/14 Talk: Igor Ranitovic, Internet Archive

Additional Resources

Semantic Crawling: Scutter Definition  http://rdfweb.org/topic/Scutter

Collaborative Filtering and Social Networks

4/21 Please read one or more of the following papers and Wikify (at least scan them by 4/19)

Friends and Neighbors on the Web
LA Adamic, E Adar - Cited by 19

The Hidden Web
HA Kautz, B Selman, MA Shah - Cited by 97

Reputation Network Analysis for Email Filtering
J Golbeck, J Hendler - Cited by 5

Additional resources

 
Peer-to-Peer

Due 4/26
Looking up data in P2P systems
H Balakrishnan, MF Kaashoek, D Karger, R Morris, I … - Cited by 39

Due 4/26
Wired BitTorrent article

Due 4/28
Incentives Build Robustness in BitTorrent
B Cohen - Cited by 113

Due 4/28
Dissecting BitTorrent: Five Months in a Torrent’s Lifetime
M Izal, G Urvoy-Keller, EW Biersack, PA Felber, A … - Cited by 26

 

Additional Resources

 
Folksonomy

Due 5/3: Read material about folksonomy; some can be found in Additional Resources below.
Wiki by Tuesday at class!

Additional Resources

 

FINAL REVIEW

Grading

Midterm (15%): Material includes all papers, commentary, and in-class lecture/discussion
Final (20%): Material includes all papers, commentary, and in-class lecture/discussion
Project I (25%): Includes initial and final deliverables and presentations
Project II (25%): Includes initial and final deliverables and presentations
Weekly Participation (15%): Based on commentaries, in-class participation, in-class assignments, and special talk attendance
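
The five weights above sum to 100, so a course score can be estimated as a weighted average of the component scores. The sketch below is only an illustration of that arithmetic: it assumes each component is scored on a 0-100 scale, which the syllabus does not state, and the function name and example scores are hypothetical.

```python
# Weights (in percent) copied from the Grading section; they sum to 100.
WEIGHTS = {
    "Midterm": 15,
    "Final": 20,
    "Project I": 25,
    "Project II": 25,
    "Weekly Participation": 15,
}


def course_score(scores):
    """Return the weighted course score on a 0-100 scale.

    `scores` maps each component name to a score from 0 to 100
    (an assumed scale, not one given in the syllabus).
    """
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing scores for: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS) / 100


# Hypothetical scores, for illustration only.
example = {
    "Midterm": 80,
    "Final": 90,
    "Project I": 85,
    "Project II": 95,
    "Weekly Participation": 100,
}
print(course_score(example))  # prints 90.0
```

Because the two projects together carry half of the total weight, a change in a project score moves the result more than the same change in the midterm or participation score.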

Important Dates

Feb. 15, 17, 22: Project 1 Initial Deliverables, Presentation
March 15: Midterm
March 17: Project 1 Final Deliverables Due (everything on Wiki)
March 21-25: Spring Break
April 19: Project 2 Initial Deliverables
May 10, 12: Project 2 Due, Presentations
May 17: Final