Data Selection and Adaptation for Naturalness in HMM-Based Speech Synthesis

Erica Cooper, Alison Chang, Yocheved Levitan, Julia Hirschberg
Interspeech 2016
Can we identify metrics for selecting the best utterances in a found-data corpus for voice training, or for excluding utterances that will detract from the quality of the voice? Can we select a subset of training utterances from a corpus of found data to produce a better voice than one trained on all of the data? Can we adapt a voice towards the best utterances in a corpus, to improve the quality of the voice?
doi:10.21437/interspeech.2016-502 dblp:conf/interspeech/CooperCLH16 fatcat:o3enwb34sne5jejf6rs7ajg3mi