CMPUT 692 Home Page

CMPUT 692-A2
Modern Database Searches and Techniques
Fall 2022

Meetings: MW 11:00 - 12:20
Instructor: Davood Rafiei, ATH 436
Course eclass page

Text and natural language are pervasive and have become an integral part of modern database management ecosystems. As data increase in size and variety, standard models of text and queries are no longer effective or efficient. This departure has led to many interesting models and algorithms for search and data exploration in the past couple of years.

This course studies some of those models and algorithms with an in-depth analysis of the underlying principles that allow these models to be applied or scaled to large data collections and workloads.

Topics to be covered (tentative)

Models of text and queries
Top-k queries and indexes
Similarity search
Semantic models of text
Example based queries
Natural language data and interfaces
Probabilistic query semantics
Querying knowledge graphs

Course prerequisite

Students are expected to have taken an introductory course in data management and/or information retrieval (e.g., CMPUT 291 or equivalent), and to have some knowledge of probability and statistics and proficiency in Linux and programming.

Grading (tentative)

(35%) - Assignments: problem sets, programming exercises and research paper reviews
(45%) - Term project (individual or groups of 2, depending on the class size)
(15%) - Class presentation of a research paper
(5%) - Participation in class discussions

Recommended books and resources

Leskovec, Rajaraman, Ullman, Mining of massive datasets, 3rd ed., Cambridge UP, 2014.
Manning, Raghavan, Schütze, Introduction to information retrieval, Cambridge UP, 2009.
Hogan et al., Knowledge graphs, Morgan & Claypool, 2021.
Relevant research papers (tba)
