Structured Acoustic Models for Speech Recognition [chapter]

Slav Petrov
2011 Coarse-to-Fine Natural Language Processing  
We present a maximally streamlined approach to learning HMM-based acoustic models for automatic speech recognition. In our approach, an initial monophone HMM is iteratively refined using a split-merge EM procedure which makes no assumptions about subphone structure or context-dependent structure, and which uses only a single Gaussian per HMM state. Despite the much simplified training process, our acoustic model achieves state-of-the-art results on phone classification (where it outperforms almost all other methods) and competitive performance on phone recognition (where it outperforms standard CD triphone / subphone / GMM approaches). We also present an analysis of what is and is not learned by our system.
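The abstract does not spell out how states are split or merged, so the following is only a minimal sketch of what one split step and one merge step could look like for single-Gaussian states. It assumes scalar (1-D) Gaussians, a symmetric mean perturbation on splitting, and moment matching on merging; the names `split_state` and `merge_states` and the parameter `eps` are hypothetical, and the EM re-estimation that would run between the two steps is omitted.

```python
def split_state(mean, var, weight, eps=0.01):
    """Split one single-Gaussian state into two slightly perturbed copies.

    Assumption: the children share the parent's variance, each takes half
    of its weight, and their means are nudged apart by eps standard
    deviations so that EM can later pull them in different directions.
    """
    delta = eps * var ** 0.5
    return [(mean + delta, var, weight / 2.0),
            (mean - delta, var, weight / 2.0)]


def merge_states(a, b):
    """Merge two single-Gaussian states into one by moment matching.

    Assumption: the merged Gaussian keeps the total weight and matches the
    mean and variance of the two-component mixture it replaces.
    """
    (m1, v1, w1), (m2, v2, w2) = a, b
    w = w1 + w2
    m = (w1 * m1 + w2 * m2) / w
    # Mixture variance = weighted second moment minus squared mean.
    v = (w1 * (v1 + m1 ** 2) + w2 * (v2 + m2 ** 2)) / w - m ** 2
    return (m, v, w)


# Split a unit-variance state, then merge the children back together.
children = split_state(mean=0.0, var=1.0, weight=1.0)
merged = merge_states(children[0], children[1])
```

In a full split-merge procedure, a merge would normally be applied only to splits that turned out not to help, for example those whose removal costs little likelihood; that selection criterion is not shown here.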
doi:10.1007/978-3-642-22743-1_4 fatcat:ijvlyays4zhjfk2sl72ya3ltlm