Automatic Generation of Pronunciation Dictionaries - for New, Unseen Languages by Voting Phoneme Recognizers in Nine Different Languages (Studienarbeit)
In this report we describe a data-driven approach for creating pronunciation dictionaries for a new, unseen target language by voting among phoneme recognizers in nine different languages other than the target language. In this process, recordings of the new language that are transcribed at the word level are decoded by the phoneme recognizers. This results in a hypothesis of nine phonemes per time frame, one from every language. We then describe two algorithms that map the decoded hypotheses to a pronunciation dictionary entry. These algorithms make use of a confusion-matrix-based distance measure between the phonemes of the phoneme recognizers and the phonemes of the target language whose dictionary is to be created. The confusion matrix is calculated with the help of 500 phonetically transcribed training utterances in the target language. The phoneme recognizers used in this work were derived from the context-independent speech recognizers of the GlobalPhone project. To improve the mapping of the phoneme recognizers' hypotheses to the dictionary entry, we incorporated confidence measures derived from word lattices into our algorithms. Using the proposed algorithms, we produced new pronunciation dictionaries for the target languages Swedish and Haitian Creole. The newly created dictionaries were evaluated by comparing the performance of large vocabulary continuous speech recognition systems trained with these dictionaries to reference systems trained with rule-based pronunciation dictionaries. The results of the evaluation show that the process in its current form does not produce pronunciation dictionaries that are accurate enough to train large vocabulary continuous speech recognizers. We therefore make suggestions for future work to address the error sources of the process.
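To make the voting idea concrete, the following is a minimal sketch of one way per-frame voting with a confusion matrix could look. It is not one of the two algorithms of this report, whose details are not given here. Several things are assumptions made purely for illustration: the function names `vote_frame` and `frames_to_pronunciation`, the representation of the confusion matrix as row-normalized probabilities used directly as vote weights, the collapsing of repeated frame labels into a single phoneme, and the toy data with two source languages instead of nine. Confidence measures from word lattices are left out.

```python
from collections import defaultdict
from itertools import groupby


def vote_frame(frame_hyps, confusion):
    """Pick a target-language phoneme for one time frame.

    frame_hyps: dict mapping a recognizer language to the phoneme it
        hypothesized in this frame (hypothetical representation).
    confusion: dict mapping (language, source_phoneme) to a dict of
        target_phoneme -> probability (assumed row-normalized).
    Returns the target phoneme with the highest summed weight, or None
    if no recognizer output has a confusion-matrix entry.
    """
    scores = defaultdict(float)
    for lang, phone in frame_hyps.items():
        for target, prob in confusion.get((lang, phone), {}).items():
            scores[target] += prob
    if not scores:
        return None
    # Iterate in sorted order so that ties are broken deterministically.
    return max(sorted(scores), key=lambda t: scores[t])


def frames_to_pronunciation(frames, confusion):
    """Vote in every frame, drop undecided frames, merge repeated labels."""
    voted = [vote_frame(f, confusion) for f in frames]
    voted = [p for p in voted if p is not None]
    return [phone for phone, _ in groupby(voted)]


# Toy data (invented for illustration, not taken from the report).
confusion = {
    ("DE", "a"): {"a": 0.8, "e": 0.2},
    ("FR", "a"): {"a": 0.6, "o": 0.4},
    ("DE", "t"): {"t": 0.9, "d": 0.1},
    ("FR", "d"): {"d": 0.7, "t": 0.3},
}
frames = [
    {"DE": "a", "FR": "a"},
    {"DE": "a", "FR": "a"},
    {"DE": "t", "FR": "d"},
]

print(frames_to_pronunciation(frames, confusion))
```

In this sketch the two recognizers disagree in the last frame, and the summed confusion weights decide the vote. A real system would also need to handle phoneme durations and the alignment of frames to word boundaries, which this example ignores.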