Applying Many-to-Many Alignments and HMMs to Letter-to-Phoneme Conversion
Letter-to-phoneme conversion generally requires aligned training data
of letters and phonemes.
Typically, the alignments are limited to one-to-one alignments.
We present a novel technique of training with many-to-many alignments.
A letter chunking bigram prediction manages double letters and double
phonemes automatically as opposed to
preprocessing with fixed lists.
We also apply an HMM method
in conjunction with a local classification model to
predict a global phoneme sequence given a word.
The many-to-many alignments result in significant improvements
over the traditional one-to-one approach.
Our system achieves state-of-the-art performance on several languages
and data sets.