| Topics | Readings |
|---|---|
| Introduction to IR. Information vs. data retrieval. What do we want from IR? Introduction to evaluation. | vR, ch.1: "Introduction"; RB, ch.1: "Overview"; SJW, ch.1: "Overall introduction", ch.2: "Introduction"; Croft, W.B. (1995) "What do people want from information retrieval?", D-Lib Magazine, November; Belkin, N.J. & Croft, W.B. (1992) "Information filtering and information retrieval: Two sides of the same coin?", Communications of the ACM, v. 35, no. 12: 29-38. |
| IR concepts. Aboutness. Relevance. Rationalist vs. empiricist approaches (AI vs. Stats). | SJW, ch.3: "Introduction". Belkin, N.J. (1978) "Information concepts for information science", Journal of Documentation, v. 34, no. 1: 55-85. Hutchins, W.J. (1978) "The concept of 'aboutness' in subject indexing", Journal of Informatics, vol. 1, no. 1, April 1977, pp. 17-35 (also Aslib Proceedings, vol. 30: 172-181) (also SJW, 93-97). Saracevic, T. (1975) "Relevance: a review of and a framework for the thinking on the notion in information science", Journal of the American Society for Information Science, vol. 26: 321-343 (SJW, 143-165). |
| Document and query representation. Manual vs. automatic indexing. | J. D. Anderson & J. Perez-Carballo, “The nature of indexing: how humans and machines analyze messages and texts for retrieval. Part I: Research, and the nature of human indexing; Part II: Machine indexing, and the allocation of human versus machine effort”, Information Processing and Management, vol. 37 (2001), 231-254, 255-277. Furnas, G.W., Landauer, T. K., Gomez, L. M., Dumais, S. T. (1987) "The vocabulary problem in human-system communication", Communications of the ACM, 30(11), 964-971. |
| Automatic indexing. Lexical analysis. Weighting. Data structures. | vR, ch.2: “Automatic text analysis”; RB, ch.2: “Extracting lexical features”; SJW, ch.6, "Introduction" (esp. the section on Indexing). Salton, G. & Buckley, C. (1988) “Term weighting approaches in automatic text retrieval”, Information Processing and Management, vol. 24: 513-523 (SJW, pp. 323-328). Robertson, S. E. and Sparck Jones, K. (1997), “Simple, proven approaches to text retrieval”, University of Cambridge Computer Laboratory Technical Report no. 356, 1994 (updated 1996, 1997). For stemming code or a demo, see Martin Porter’s site. Mikheev, Andrei “Document Centered Approach to Text Normalization”, SIGIR 2000, Athens. |
| Models of IR. Interaction models. Indexing models. Language models. Topic models. User models. Relevance feedback. | SJW, intro to ch.5 ("Models") and ch.6 ("Techniques"). Cooper, W.S. "Getting beyond Boole", Information Processing and Management, vol. 24: 243-248 (also in SJW, 265-267); van Rijsbergen, C. J. "A new theoretical framework for information retrieval", SIGIR'86; Robertson, S.E. "The probability ranking principle in IR", Journal of Documentation, vol. 33: 294-304, 1977 (also in SJW, 281-286). Salton, G., Wong, A. & Yang, C.S. (1975) “A vector space model for automatic indexing”, Communications of the ACM, vol. 18: 613-620 (also in SJW, 273-280); N. J. Belkin, R. N. Oddy, and H. M. Brooks "ASK for information retrieval: Part I. Background and theory", Journal of Documentation, 38(2): 61-71, 1982; Saracevic, T. "Interactive models in information retrieval (IR): Progress, problems, proposal", in Proceedings of the 1996 ASIS Annual Meeting, Medford, NJ. Turtle, H. & Croft, W.B. (1990) “Inference networks for document retrieval”, SIGIR 1990, New York: ACM, 1-24; Ponte, J. and Croft, W.B. "A Language Modeling Approach to Information Retrieval", SIGIR'98; Lavrenko, V. "Language models", tutorial at SIGIR 2003. Robertson, S. and Sparck Jones, K. "Simple, proven approaches to text retrieval", Technical Report TR356, Cambridge University Computer Laboratory, 1997; Salton, G. and Buckley, C. "Term-weighting approaches in automatic text retrieval", Information Processing and Management, 24(5): 513-523, 1988. Deerwester, S., Dumais, S. T., Furnas, G. W., Landauer, T. K., Harshman, R. (1990) "Indexing by Latent Semantic Analysis", JASIST, 41(6), 391-407 (a less mathematical alternative to Deerwester's original LSA paper is: Landauer, T. K., Foltz, P. W., Laham, D. "An Introduction to Latent Semantic Analysis", Discourse Processes, 25, 259-284, 1998). Bates, Marcia J. “The Design of Browsing and Berrypicking Techniques for the Online Search Interface”, Online Review 13 (October 1989): 407-424. Robert Ash (1990) "Information Theory", Dover Publications, ISBN: 0486665216. |
| Information Retrieval as interaction. Evaluation of interactive systems. Intro to Statistics for IR. Case study: Rutgers at TREC, Interactive Track (here's the presentation from Interactive TREC 2002). | Preece et al, "Interaction design" and/or complementary website, esp. chapters on evaluation. Any book or online tutorial on Statistics (concentrate on hypothesis testing, t-tests, ANOVA, Chi-square, correlation, ...); Hull, D. (1993) "Using statistical testing in the evaluation of retrieval experiments", SIGIR'93; Buckley, C. and Voorhees, E.M. (2000) "Evaluating evaluation measure stability", SIGIR'00; Voorhees, E.M. and Buckley, C. (2002) "The effect of topic set size on retrieval experiment error", SIGIR'02. TREC, Rutgers' recent work at TREC (see Muresan's publications, and TREC 2003 webpage), and previous work (the whole 37 (3), May 2001 issue of Information Processing and Management, focussing on Interactive TREC, would be interesting for people who choose to do an evaluation for the final project). |
| User interfaces for IR systems. Part I: Interaction models. | Chapter 10: “User Interfaces and Visualization” by Marti Hearst in “Modern Information Retrieval”. Journal of the American Society for Information Science, vol. 43, issue 2, 1992, special issue on Human-Computer Interface: “Introduction and Overview” by Lunin and Harman, “Interfaces for end-user information seeking” by Gary Marchionini, “User-friendly systems instead of user-friendly front-ends” by Donna Harman, “Intelligent information retrieval: An introduction” by Susan Gauch, “Models for hypertext” by Mark F. Frisse and Steve B. Cousins; Muresan, G. and Harper, D. J. “Document Clustering and Language Models for System-Mediated Information Access”, ECDL’01, Darmstadt, p. 438-449. Journal of the American Society for Information Science and Technology, vol. 57, issue 6, 2006: selection of papers on "Perspectives on Search User Interfaces: Best Practices and Future Visions". Bates, M. (1990) “Where should the person stop and the information search interface start?”, Information Processing and Management, v. 26(5): 575-591. O’Day, V. L. and Jeffries, R. “Orienteering in an information landscape: how information seekers get from here to there”, InterCHI’93, Amsterdam. Brajnik, G., Mizzaro, S., Tasso, C. and Venuti, F. “Strategic Help in User Interfaces for Information Retrieval”, JASIST, 53(5), 2002, p. 343-358. Campbell, I. “Supporting Information Needs by Ostensive Definition in an Adaptive Information Space”, MIRO’95, or "The Ostensive Model of Developing Information Needs", CoLIS, Copenhagen, 1996. Ryen W. White, Ian Ruthven, Joemon M. Jose, C. J. Van Rijsbergen, "Evaluating implicit feedback models using searcher simulations", ACM Transactions on Information Systems, 23 (3): 325-361, July 2005. |
| User interfaces for IR systems. Part II: Tools and techniques. | Shneiderman, Ben, ch. “Information Search and Visualization” in “Designing the User Interface”, 3rd ed., 1997 (see webpage); Belkin, N.J., Marchetti, P.-G., Cool, C. (1993) "BRAQUE: Design of an interface to support user interaction in information retrieval", Information Processing and Management, 29 (3): 325-344; Chalmers, M. and Chitson, P. “Bead: Exploration in information visualization”, SIGIR’92, p. 330-337; Nowell, L.T., France, R.K., Hix, D., Heath, L.S., Fox, E.A. (1996) “Visualizing search results: Some alternatives to query-document similarity”, SIGIR’96, p. 67-75; Williamson, C., Shneiderman, B. (1992) “The Dynamic HomeFinder: Evaluating dynamic queries in a real-estate information exploration system”, SIGIR’92, p. 338-346; Lin, Xia “Map displays for information retrieval”, JASIS, 48(1), 1997, p. 40-54; George Robertson (2000) "The Task Gallery: a 3D window manager", SIGCHI. Korfhage, Robert R. “To see, or not to see - is that the query?”, SIGIR’91, p. 134-141; Gary Marchionini, “Interfaces for end-user information seeking”, JASIS, 43(2), 1992. Cutting, D. R., Pedersen, J. O., Karger, D. and Tukey, J. W. “Scatter/Gather: A cluster-based approach to browsing large document collections”, SIGIR’92, p. 318-329; Hearst, M. and Karadi, C. "Cat-a-Cone: An Interactive Interface for Specifying Searches and Viewing Retrieval Results using a Large Category Hierarchy", SIGIR'97, Philadelphia, PA; Chen, M., Hearst, M., Hong, J., and Lin, J. "Cha-Cha: A System for Organizing Intranet Search Results", in the Proceedings of the 2nd USENIX Symposium on Internet Technologies and Systems (USITS), Boulder, CO, 1999. |
| Human Computer Interaction (HCI) | Preece, J., Rogers, Y. and Sharp, H. (2002) – “Interaction Design – Beyond Human-Computer Interaction” (and associated webpage). |
| Information Visualization. | Spence, R. (2000) – “Information Visualization”, ISBN: 0201596261; Chen, C. (1999) – “Information Visualisation and Virtual Environments”, ISBN: 1852331364; Card, S. K., Mackinlay, J. D. and Shneiderman, B. (1999) – “Readings in Information Visualization: Using Vision to Think”, ISBN: 1558605339. Also, University of Maryland’s HCI Lab website, and InfoViz, a repository for IV. |
| Evaluation of IR systems. Experimental vs. operational IR systems. Evaluation of interactive IR systems. IR evaluation in context. | In Baeza-Yates & Ribeiro-Neto “Modern Information Retrieval”, ch.3: “Retrieval Evaluation”; in RB, ch.4: "Assessing the Retrieval". In JASIS, 47(1), January 1996, Special Issue: Evaluation of Information Retrieval: Tague-Sutcliffe, J. M. – “Some perspectives on the evaluation of information retrieval systems”; Blair, D. C. – “STAIRS redux: Thoughts on the STAIRS evaluation, ten years after”; Hersh, W. et al. – “A task-oriented approach to information retrieval evaluation”; Ellis, D. – “The dilemma of measurement in information retrieval research”; Beaulieu, M. et al. – “Evaluating interactive systems in TREC”. In Information Processing and Management, 31 (3), May-June 1995, Special issue: TREC: Harman, D. – “Overview of the Second Text Retrieval Conference (TREC-2)”; Sparck Jones, K. – “Reflections on TREC”; Robertson, S. E. et al. – “Large Test Collection Experiments on an Operational, Interactive System: Okapi at TREC”; Belkin, N. et al. – “Combining the Evidence of Multiple Query Representations for Information Retrieval”. In Information Processing and Management, 36 (1), January 2000, Special issue: TREC: Harman, D. – “Overview of the Sixth Text REtrieval Conference (TREC-6)”; Sparck Jones, K. – “Further reflections on TREC”; Robertson, S. E. et al. – “Experimentation as a way of life: Okapi at TREC”. Borlund, P. and Ingwersen, P. (1997) “The development of a method for the evaluation of interactive information retrieval systems”, Journal of Documentation, 53(3). In Information Processing and Management, 37 (3), May 2001, Special issue: Interactive TREC: Hersh, W. and Over, P. – “Interactivity at the Text Retrieval Conference (TREC)”; Over, P. – “The TREC interactive track: an annotated bibliography”; Hersh et al. – “Challenging conventional assumptions of automated information retrieval with real users: Boolean searching and batch retrieval evaluations”; Belkin, N. et al. – “Iterative exploration, design and evaluation of support for query reformulation in interactive information retrieval”; Allan, J. et al. – “Evaluating combinations of ranked lists and visualizations of inter-document similarity”; Wu, M. et al. – “Using clustering and classification approaches in interactive retrieval”; Larson, R. R. – “TREC interactive with Cheshire II”; Bodner, R. C. et al. – “The impact of text browsing on text retrieval performance”; Yang, K. – “Passage feedback with IRIS”. The Text Retrieval Conference (TREC) webpage. In TREC2001, Belkin et al. “Rutgers' TREC 2001 Interactive Track Experience”. Nicholas J. Belkin and Gheorghe Muresan, "Measuring Web Search Effectiveness: Rutgers at Interactive TREC", in Measuring Web Search Effectiveness: The User Perspective, workshop at WWW 2004, May 2004, New York (paper, presentation). Jakob Nielsen's Alertbox, March 1, 2004: "Risks of Quantitative Studies". Saracevic, T. “Evaluation of Evaluation in Information Retrieval”, SIGIR’95. Reid, J. “A Task-Oriented Non-Interactive Evaluation Methodology for Information Retrieval Systems”, Information Retrieval, 2(1), Feb 2000. Ying Sun, Paul B. Kantor "Cross-Evaluation: A new model for information system evaluation", JASIST, 57(5): 614-628, March 2006. |
| Structure. Document and query structure. Links. Categorization vs. clustering. Filtering. XML & INEX. | vR, ch.3: “Automatic classification”; Borlund, P. “Experimental Components for the evaluation of interactive information retrieval systems”, Journal of Documentation, vol. 56, no. 1, 2000, 71-90. Hearst, M. A. and Pedersen, J. O. “Reexamining the cluster hypothesis: scatter/gather on retrieval results”, SIGIR’96, Zurich, p. 76-84. Y. Kural, S. Robertson and S. Jones (2001) "Deciphering cluster representations", Information Processing and Management 37, 593-601. Anastasios Tombros, C.J. van Rijsbergen (2004) "Query-sensitive similarity measures for information retrieval" (invited paper), Knowledge and Information Systems, 6(5): 617-642, September 2004; Weili Wu, Hui Xiong, Shekhar, S. (Eds.) (2004) "Clustering and Information Retrieval", ISBN: 1-4020-7682-7. |
| IR on the Web. | Journal of the American Society for Information Science and Technology, 53(2), 2002: Special issue on Web research; Hao Chen and Susan Dumais (2000) “Bringing Order from Chaos: automatically categorizing search results”, SIGCHI; Susan Dumais and Hao Chen (2000) "Hierarchical classification of Web content", SIGIR; Susan Dumais, Edward Cutrell, Hao Chen (2001) "Optimizing search by showing results in context", SIGCHI; Ed H. Chi, Peter Pirolli, James Pitkow (2000) "The scent of a site: a system for analyzing and predicting information scent, usage, and usability of a Web site", SIGCHI. Larry Page, Sergey Brin et al. “PageRank: Bringing Order to the Web”, Stanford University report (the model behind Google). Wendy Lucas, Heikki Topi (2004) "Training for Web search: Will it get you in shape?", JASIST, 55(13): 1183-1198. Search Engine Watch. |
| Informetrics | Dietmar Wolfram, "Applied Informetrics for Information Retrieval Research", in New Directions in Information Management, no.36, Libraries Unlimited, 2003. |
| Artificial Intelligence. Machine Learning. Data Mining. | Stuart J. Russell, Peter Norvig () "Artificial Intelligence: A Modern Approach", Prentice Hall, ISBN: 0137903952. Tom M. Mitchell (1997) "Machine Learning", McGraw-Hill, ISBN: 0070428077 (scan of ch.1). R. O. Duda, P. E. Hart and D. G. Stork (2000) "Pattern Classification" (2nd ed.), John Wiley & Sons [DHS 2000] (scans of ch.1 and ch.5). Ian H. Witten, Eibe Frank (2000) "Data Mining: Practical Machine Learning Tools and Techniques with Java Implementations", Morgan Kaufmann, ISBN: 1558605525; Fabrizio Sebastiani (2002) "Machine Learning in Automated Text Categorization", ACM Computing Surveys, vol. 34, no. 1, March 2002, pp. 1–47. Burges' SVM tutorial. Cristianini, N., Shawe-Taylor, J. (2000) "An Introduction to Support Vector Machines", Cambridge University Press. More resources on John Platt’s webpage. |
| Natural Language Processing in IR. Information extraction. Document summarization. | Robert Krovetz and W. Bruce Croft (1992) "Lexical ambiguity and information retrieval", ACM Transactions on Information Systems, 10(2): 115-141. |
| Advanced topics: Multimedia IR (image, video, music, ...). | Eakins, J. P. and Graham, M. E. "Content-based Image Retrieval: A Report to the JISC Technology Applications Programme"; JASIST 55 (12), October 2004: "Perspectives on ... Music Information Retrieval". |
| Personalization and user modeling. Collaborative systems. Recommender systems. | Susan Dumais et al. (2003) "Stuff I've seen: a system for personal information retrieval and re-use", SIGIR, 72-79. Bruce, H. (2005) "Personal, anticipated information need", Information Research, 10(3), paper 232. |
| AI and IR. Agents. | |