NSERC Undergraduate Summer 2020 Research Project

You need to be an undergraduate student and have a good GPA. Contact me by e-mail if interested.
(The deadline for the submission of necessary forms is in January, so make sure to contact me sufficiently early.)

Word Senses

The goal of this exciting project is word sense disambiguation. When you read an e-book, you can tap any word to see its dictionary definition. The problem is that many words have multiple senses. For example, suppose you are reading about the Bank of England, and the e-reader tells you that "bank" means "the land alongside a river". We are interested in developing a system that can determine the appropriate sense of a word from its context in the text. The project will involve implementing programs that extract and process information from web resources such as online dictionaries and Wiktionary. The ultimate goal of this research is to design a program that identifies the correct word sense as well as, or even better than, people do.
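
To make the task concrete, here is a minimal sketch of one classic baseline, a simplified Lesk-style gloss overlap, which picks the sense whose definition shares the most words with the surrounding sentence. It is only an illustration, not the method the project will use: the glosses, the stopword list, and the names content_words, pick_sense and BANK_GLOSSES are made up for this example, and a real system would draw its definitions from resources such as Wiktionary.

```python
# Simplified Lesk-style sketch: choose the sense whose gloss shares the most
# content words with the context. The glosses are hand-written placeholders,
# not taken from any real dictionary.
STOPWORDS = {"a", "an", "the", "of", "in", "to", "and", "that", "or", "by", "on", "is"}


def content_words(text):
    """Lowercase the text, strip surrounding punctuation, drop stopwords."""
    tokens = (t.strip(".,;:!?\"'()").lower() for t in text.split())
    return {t for t in tokens if t and t not in STOPWORDS}


def pick_sense(glosses, context):
    """Return the sense label whose gloss overlaps most with the context."""
    ctx = content_words(context)
    return max(glosses, key=lambda sense: len(content_words(glosses[sense]) & ctx))


# Hypothetical two-sense inventory for the word "bank".
BANK_GLOSSES = {
    "river": "the land alongside a river or lake",
    "finance": "an institution that accepts deposits, lends money and sets interest rates",
}

sentence = "The Bank of England raised interest rates to protect deposits."
chosen = pick_sense(BANK_GLOSSES, sentence)
print(chosen)
```

Word overlap of this kind fails as soon as the context and the definition use different vocabulary, which is one reason the problem remains a research topic.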

Required: strong programming skills.
Desirable: CMPUT 210 or 299; interest in language processing.


Computational Decipherment

Monoalphabetic substitution is a well-known method of enciphering a plaintext by converting it into a ciphertext of the same length using a key, which is equivalent to a permutation of the alphabet. The method is elegant and easy to use, requiring only the knowledge of a key whose length is no longer than the size of the alphabet, and it is resistant to brute-force decryption. This project will investigate the problem of automatically solving substitution ciphers using language models. The task is to recover the plaintext from the ciphertext without the key, given only a corpus representing the language of the plaintext. The project will involve implementing decipherment algorithms and automatically extracting related data from the web.
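
The sketch below illustrates the setting under simplifying assumptions: a lowercase 26-letter alphabet, a key written as a permuted alphabet, and a character bigram model with add-one smoothing trained on a toy corpus. The names encipher, decipher and train_bigram_model, the key, and the corpus are all invented for this example. It shows only how a language model can tell plausible plaintext from ciphertext; it does not search for the key, which is the actual subject of the project.

```python
import math
import string
from collections import Counter

ALPHABET = string.ascii_lowercase


def encipher(plaintext, key):
    """Replace each letter a..z with the letter at the same position in key."""
    return plaintext.translate(str.maketrans(ALPHABET, key))


def decipher(ciphertext, key):
    """Invert encipher by mapping each key letter back to a..z."""
    return ciphertext.translate(str.maketrans(key, ALPHABET))


def train_bigram_model(corpus):
    """Return a function that scores text by smoothed bigram log-probability."""
    vocab_size = len(ALPHABET) + 1  # letters plus the space character
    pair_counts = Counter(zip(corpus, corpus[1:]))
    context_counts = Counter(corpus[:-1])

    def log_prob(text):
        return sum(
            math.log((pair_counts[(a, b)] + 1) / (context_counts[a] + vocab_size))
            for a, b in zip(text, text[1:])
        )

    return log_prob


# Illustrative key: the alphabet in keyboard order (any permutation works).
key = "qwertyuiopasdfghjklzxcvbnm"
plaintext = "the cat sat on the mat"
ciphertext = encipher(plaintext, key)

# Toy corpus; a real experiment would use a large text collection.
score = train_bigram_model("the cat sat on the mat")

print(ciphertext)
print(score(plaintext) > score(ciphertext))  # plaintext looks more like the corpus
print(math.factorial(26))  # number of possible keys for a 26-letter alphabet
```

With 26! (roughly 4 × 10^26) possible keys, trying every key is impractical, so a solver has to use the language model score to guide its search through the key space.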

Required: strong programming skills.
Desirable: CMPUT 396 (Cryptography) and/or CMPUT 497 (NLP). Interest in music.