Practice questions April 2004
Solutions to practice questions
See also the 2004 and 2006 exams!
Recommended book
Richard K. Belew, Finding Out About: A Cognitive Perspective on Search Engine Technology and the WWW, Cambridge University Press, 2001
FOA website
OHP Slides
Lecture 1 Introduction (slides) (handouts) (PPT)
Lecture 2 Texts & Zipf (slides) (handouts) (PPT)
Lecture 3 Zipf, stemming, stop words (slides) (handouts) (PPT)
Lecture 4 Matching (slides) (handouts) (PPT)
Lecture 5 Index (slides) (handouts) (PPT)
Lecture 6 LSI (slides) (handouts) (PPT)
Lecture 7 Query expansion (slides) (handouts) (PPT)
Lecture 8 Topic spotting (slides) (handouts) (PPT)
Lecture 9 PageRank (slides) (handouts) (PPT)
Lecture 10 Statistical models (slides) (handouts) (PPT)
Lecture 11 PCA (slides) (handouts) (PPT)
Lecture 12 Clustering (slides) (handouts) (PPT)
Lecture 13 K-means (slides) (handouts) (PPT)
Lecture 14 Sequence analysis, dynamic programming (slides) (handouts) (PPT)
Lecture 15 HMMs (slides) (handouts) (PPT)
2007 Exam and Solutions
May 2007 exam questions (PDF)
Solution 1 (PDF, XLS)
Solution 2 (PDF, XLS)
Solution 3 (PDF, XLS)
Solution 4 (PDF, XLS)
Software Tools
Text analysis tool from lecture 2: zipf.c
Stop word removal tool from lecture 3: stop.c
Index generator from lecture 4: index.c
Retrieval tool from lecture 5: retrieve.c
Clustering tool from lecture 10: agglom.c
K-means tool from lecture 11: k-means.c
Vector representation of docs tool: doc2vec.c
Data file from clustering examples: sa1.txt
Edit distance tool from lecture 13: edit-dist.c
Zipf tool for first lab: zipf2.c
K-means example (Excel spreadsheet)
Laboratories
Lab sheet 1 (week 5)
Lab sheet 2 (week 9)
Files for lab 2:
Lab2Data
k-means-2010.c
agglom-2010.c
Resources
Online Library of Literature
Porter Stemmer in C, Java, Perl, C#
Scott Weiss (JHU) (Porter stemmer)
Small Stop List: stopList(50)
Large Stop List: stopList(Brown)
WordNet