Meetings: MW 9:30 - 10:50
Managing Big Text Data
Davood Rafiei
Course eclass page
Text data has become pervasive and is found in many different shapes and forms.
As the data increases in size and variation, standard models of text and queries
are no longer effective or efficient. This departure has led to many interesting
models and algorithms for search and data exploration in the past couple of
This course studies some of those models and algorithms, with an in-depth analysis of the underlying principles that allow them to scale to large data collections and workloads.
Topics to be covered (tentative)
- Models of text and queries
- Top-k queries and indexes
- Similarity search
- Semantic models of text
- Example-based queries
- Probabilistic query semantics
- Querying knowledge graphs
- Natural language data (and interfaces)
Students are expected to have taken an introductory course in data management and/or information retrieval (e.g., CMPUT 291 or equivalent) and to be proficient in Linux and programming.
- (35%) - Assignments: problem sets, programming exercises, and research paper reviews
- (45%) - Term project (individual or groups of 2, depending on the class size)
- (15%) - Class presentation of a research paper
- (5%) - Participation in class discussions
Recommended books and resources