The following files contain stand-off stress
annotations for English CELEX (orthographic forms).
The stress marks were mapped from the phonetic forms onto
orthographic forms using the ALINE algorithm.
If you use this data in your work,
please cite the following paper:
Grzegorz Kondrak.
A New Algorithm for the Alignment of Phonetic Sequences.
NAACL 2000.
eow_loe - English words with stress marks
lex_str.p
The following files contain stand-off morphological
annotations for English CELEX (both orthographic and phonetic forms).
The morphological boundaries for the orthographic forms were
recursively extracted from the CELEX dictionary.
The morphological boundaries were then mapped from the orthographic forms onto
the phonetic forms using the ALINE algorithm.
If you use this data in your work,
please cite the following paper:
Grzegorz Kondrak.
A New Algorithm for the Alignment of Phonetic Sequences.
NAACL 2000.
eml_le - English lemmas with morpho breaks
emw_e - English words with morpho breaks
lex_mor.p
epl_e - transcribed English lemmas with morpho breaks
epw_e - transcribed English words with morpho breaks
pho_mor.p
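
The mapping of morphological boundaries from orthographic onto phonetic forms
described above can be illustrated with a minimal sketch. It is not the code
used to produce these files. It assumes that a letter-to-phoneme alignment is
already available (for example, one produced by ALINE) and is given as a list
of (letter chunk, phoneme chunk) pairs. The function names, the "+" boundary
mark, the one-character-per-phoneme transcription, and the example word are
all hypothetical and do not reflect the actual file format.

```python
def project_boundaries(alignment, ortho_breaks):
    """Project boundary offsets from the spelling onto the transcription.

    alignment    -- list of (letter_chunk, phoneme_chunk) string pairs;
                    either chunk may be empty (assumed representation)
    ortho_breaks -- character offsets in the spelling where a morpheme starts
    Returns the corresponding offsets in the transcription. Boundaries that
    fall inside a multi-letter chunk are dropped in this simplified sketch.
    """
    wanted = set(ortho_breaks)
    phon_breaks = []
    o_pos = p_pos = 0
    for letters, phonemes in alignment:
        # A boundary sitting at the start of this chunk maps to the
        # current position in the transcription.
        if o_pos in wanted:
            phon_breaks.append(p_pos)
            wanted.discard(o_pos)
        o_pos += len(letters)
        p_pos += len(phonemes)
    return phon_breaks


def insert_breaks(form, breaks, mark="+"):
    """Return form with mark inserted before each offset in breaks."""
    out = []
    for i, ch in enumerate(form):
        if i in breaks:
            out.append(mark)
        out.append(ch)
    return "".join(out)


# Hypothetical example: "walk+ed" with an illustrative transcription "wOkt".
alignment = [("w", "w"), ("al", "O"), ("k", "k"), ("e", ""), ("d", "t")]
phon_breaks = project_boundaries(alignment, [4])
print(insert_breaks("walked", [4]))       # walk+ed
print(insert_breaks("wOkt", phon_breaks)) # wOk+t
```

The same walk over an alignment, run in the opposite direction, would carry
marks from the phonetic form onto the orthographic form.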