Department of Mathematics
Theory
INFORMATION RETRIEVAL
DSE 4401
Syllabus
- 01 Introduction to Information Retrieval: Mathematical Basics, Vector Spaces and Similarity, Probabilities and Statistics, Text Analysis
- 02 Pre-processing: Document Processing, Stemming, String Matching, Basic NLP Tasks (POS Tagging, Shallow Parsing)
- 03 Overview of Text Retrieval Systems: System Architecture, Boolean Models, Inverted Indexes, Document Ranking, IR Evaluation
- 04 Retrieval Models and Implementation: Vector Space Models, TF-IDF Weighting, Retrieval Axioms, Implementation Issues, Probabilistic Models
- 05 Statistical Language Models: Okapi/BM25, Language Models, KL-divergence, Smoothing
- 06 Query Expansion and Feedback: Query Reformulation, Relevance Feedback, Pseudo-Relevance Feedback, Language Model Based Feedback
- 07 Web Search Engines: Models of the Web, Web Crawling
- 08 Static Ranking: PageRank, HITS, Query Log Analysis, Adversarial IR; Information Filtering: Adaptive Filtering, Collaborative Filtering, User Interfaces, Text Classification, Naïve Bayes, K-nearest Neighbors, Feature Selection, Semi-supervised Learning
- 09 Text Clustering: Vector-space Clustering, K-means, EM Algorithm, Text Shingling
- 10 Graph-Based Methods: WordNet, Document and Word Graphs, Network Analysis, Random Walks, Harmonic Functions
References
- Christopher D. Manning, Prabhakar Raghavan and Hinrich Schütze, “Introduction to Information Retrieval”, (2e), Cambridge University Press, 2015.
- B. Croft, D. Metzler and T. Strohman, “Search Engines: Information Retrieval in Practice”, (3e), MIT Press, 2016.
- ChengXiang Zhai, “Statistical Language Models for Information Retrieval” (Synthesis Lecture Series on Human Language Technologies), (2e), Morgan & Claypool Publishers, 2017.
Credits Structure
Lecture: 3
Tutorial: 0
Practical: 0
Total: 3