Automatic Identification of Cognates and False Friends in French and English
Cognates are words in different languages that
have similar spelling and meaning. They can help a second-language learner
with vocabulary expansion and reading comprehension.
The learner also needs to pay attention to pairs of words that appear similar
but are in fact false friends: they have different meanings
in some or all contexts. In this paper, we propose a method to
automatically classify a pair of words as cognates or false friends.
We focus on French and English,
but the methods are applicable to other language pairs.
We use several measures of orthographic similarity
as features for classification.
We study the impact of selecting different features,
averaging them, and combining them through machine learning techniques.
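
As a rough illustration of the feature set-up described above, the following minimal sketch computes three orthographic similarity scores for a word pair and averages them. The abstract does not name the measures used, so the three chosen here (normalized edit distance, longest common subsequence ratio, and the Dice coefficient over character bigrams) are assumptions, as are all function names and the 0.5 threshold. The scores only indicate how alike two spellings are; they say nothing about meaning.

```python
from collections import Counter


def edit_distance(a, b):
    """Levenshtein distance between two strings (row-by-row dynamic programming)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                # deletion
                            curr[j - 1] + 1,            # insertion
                            prev[j - 1] + (ca != cb)))  # substitution or match
        prev = curr
    return prev[-1]


def normalized_edit_similarity(a, b):
    """1 minus the edit distance divided by the length of the longer word."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - edit_distance(a, b) / longest


def lcs_length(a, b):
    """Length of the longest common subsequence of two strings."""
    prev = [0] * (len(b) + 1)
    for ca in a:
        curr = [0]
        for j, cb in enumerate(b, 1):
            curr.append(prev[j - 1] + 1 if ca == cb else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def lcs_ratio(a, b):
    """Longest common subsequence length divided by the longer word's length."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else lcs_length(a, b) / longest


def bigram_dice(a, b):
    """Dice coefficient over multisets of character bigrams."""
    ba = [a[i:i + 2] for i in range(len(a) - 1)]
    bb = [b[i:i + 2] for i in range(len(b) - 1)]
    if not ba and not bb:
        # Words shorter than two characters have no bigrams to compare.
        return 1.0 if a == b else 0.0
    overlap = sum((Counter(ba) & Counter(bb)).values())
    return 2 * overlap / (len(ba) + len(bb))


def similarity_features(a, b):
    """Feature vector for one word pair (hypothetical choice of measures)."""
    return [normalized_edit_similarity(a, b), lcs_ratio(a, b), bigram_dice(a, b)]


def average_similarity(a, b):
    """Unweighted mean of the individual measures."""
    feats = similarity_features(a, b)
    return sum(feats) / len(feats)


def is_orthographically_similar(a, b, threshold=0.5):
    """Flag a pair as similar in spelling; the threshold is an arbitrary placeholder."""
    return average_similarity(a, b) >= threshold
```

The vector returned by `similarity_features` could equally be passed to a learned classifier instead of being averaged, which corresponds to the third option mentioned above. In practice a threshold or classifier would be fitted on labelled word pairs rather than fixed by hand.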