| Parsing |
A natural language sentence consists of a sequence of words.
The purpose of parsing is to uncover the relationships between the words.
I developed a parser called Minipar. Unlike most other parsers,
Minipar is principle-based. |
| Acquisition of Lexical Knowledge |
Every natural language has tens of thousands of
words. Knowledge about these words is called lexical knowledge. Many
NLP systems depend critically on lexical knowledge to function. The
acquisition of lexical knowledge presents a serious challenge due to the
large number of words and the many-to-many correspondence between words
and meanings. The goal of
my research in this area is to develop programs to automatically or
semi-automatically acquire lexical knowledge from text corpora. |
| Coreference |
The objective of coreference resolution is to determine
which words/phrases in a discourse (a piece of text or a segment of
conversation) refer to the same entity. For example, given the sentence
"John told Peter that he saw him at a conference three years
ago", a coreference resolver should be able to determine that
"he" probably
refers to John and "him" probably refers to Peter. |
| Question-Answering |
Given a query, an information retrieval system returns a set
of documents that may be relevant to the query. The goal of a
question answering system is to identify a phrase or a sentence in the
document collection that is the answer to the query.
The Q&A track in TREC is a
competitive evaluation of question-answering systems. |
| Word Sense Disambiguation |
Words in natural language often have different meanings in
different contexts. For example, the word 'bank' has different meanings in
'river bank' and 'bank account'. Word sense disambiguation (WSD) is the
task of determining the meaning of a word in its context. |
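
As a minimal sketch of the WSD task, the Python snippet below picks the sense of 'bank' whose gloss shares the most words with the surrounding context, a simplified form of the Lesk algorithm. The sense inventory, the glosses, the stopword list, and the function names are all hypothetical and written only for this illustration; it does not represent the method of any particular system.

```python
# Toy sense inventory: hypothetical glosses made up for this example.
SENSES = {
    "bank": {
        "river_side": "sloping land beside a river or lake",
        "financial_institution": "institution that holds money in an account and lends money",
    }
}

# Small, hand-picked stopword list (an assumption, not a standard resource).
STOPWORDS = {"a", "an", "the", "of", "in", "or", "and", "that", "to", "at", "on"}


def tokenize(text):
    """Lowercase, split on whitespace, and drop stopwords."""
    return {w for w in text.lower().split() if w not in STOPWORDS}


def disambiguate(word, context):
    """Return the sense label whose gloss overlaps most with the context."""
    context_words = tokenize(context) - {word}
    best_sense, best_overlap = None, -1
    for sense, gloss in SENSES[word].items():
        overlap = len(context_words & tokenize(gloss))
        if overlap > best_overlap:
            best_sense, best_overlap = sense, overlap
    return best_sense


print(disambiguate("bank", "river bank"))
print(disambiguate("bank", "bank account"))
```

With these toy glosses, 'river bank' matches the gloss that mentions a river and 'bank account' matches the gloss that mentions an account. A real system would need a full sense inventory and a far more robust measure of contextual similarity.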