NSERC Summer Research Projects

Our research focuses on two key areas: (1) developing natural language interfaces for databases, and (2) data integration, curation, and cleaning. For more detail, please refer to our recent publications. Both research directions involve creating tools and techniques for automated information extraction, entity resolution, tabular data cleaning, natural language parsing, and translating natural language queries into code or SQL. In many of these tasks, we leverage large language models to achieve state-of-the-art results.

We are currently offering two NSERC-funded undergraduate summer research positions. Ideal candidates should have a strong background in computer science (preferably as 3rd- or 4th-year students), possess excellent programming skills (particularly in Python, PyTorch, and SQL), and demonstrate a genuine interest in tackling challenging research problems. Responsibilities include data collection and cleaning, designing and conducting experiments, performing in-depth analysis, and developing tools and prototypes. Collaboration with graduate students is a key aspect of the role, so strong teamwork and communication skills are essential. Successful candidates will contribute to the creation of innovative tools, valuable resources, and potentially co-authored research papers.

This is a research scholarship opportunity, meaning the project is inherently open-ended and encourages creative problem-solving. We are looking for candidates who are passionate about writing robust, efficient code and enjoy the process of building and refining prototypes. If you are excited about pushing the boundaries of natural language processing, database systems, and data integration, we encourage you to apply and join our dynamic research team.

This is an excellent opportunity to gain hands-on research experience, contribute to impactful projects, and develop skills that are highly relevant in both academic and industry settings. If you have a curiosity-driven mindset and a desire to explore cutting-edge technologies, we look forward to receiving your application!

If interested, please

The NSERC undergrad summer scholarship is open to Canadian citizens and permanent residents of Canada.

Some Past Projects

BareTQL: An interactive system for searching and extraction of open data tables

Many organizations and government bodies have been making their data available to the public. Despite progress in many aspects of table extraction and publishing, querying incomplete data in tables with little or no schema has remained a challenge. This work develops BareTQL, an interactive system for querying open data tables in the presence of these challenges. See the paper for more details.

Data Annotation Through Online Games

Facts and relationships that are extracted from the Web are often erroneous or inaccurate, and verifying them can be a tedious and sometimes boring task. What if we turn this task into an online game in which the verification happens behind the scenes as users play? This doesn't sound boring anymore. This is a project that Eddie Santos and Stephen Romansky (two summer students) did over a summer. Here is a link to the game page. James Moore (another summer student) put together an Android app for the game.