Additive Modeling of English F0 Contour for Speech Synthesis

S. Sakai
Proceedings. (ICASSP '05). IEEE International Conference on Acoustics, Speech, and Signal Processing, 2005.  
In this paper, we present an approach to fundamental frequency contour modeling of English for speech synthesis, based on a statistical learning technique called Additive Models that was successfully applied to the modeling of Japanese F0 contour previously. In an attempt to model English F0 contour, we defined a three-layer additive model consisting of an intonational phrase component, a word-level component representing lexical stress types, and a pitch-accent component related to accented syllables. These component functions are estimated simultaneously using a backfitting algorithm derived from a regularized least-squares error criterion specified on the model with regard to the training data. The proposed method was trained and tested using the widely used ToBI-labeled speech corpus and promising results were obtained.
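
The backfitting idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: it assumes, purely for illustration, that each of the three components is a set of per-label offsets (a binned phrase position, a stress type, an accent label) with a ridge penalty `lam` on every offset. The function name `backfit`, the label sets, and the toy F0 values are all hypothetical.

```python
from collections import defaultdict

def backfit(y, factors, lam=1.0, n_iter=10000, tol=1e-12):
    """Backfit an additive model y_i ~ sum_j effect_j[label_j(i)].

    y       : list of floats (here: centered toy F0 values)
    factors : list of label lists, one per component, each of len(y)
    lam     : ridge penalty applied to every label effect (assumed regularizer)
    Returns one dict per component mapping label -> estimated effect.
    """
    effects = [defaultdict(float) for _ in factors]
    for _ in range(n_iter):
        max_change = 0.0
        for j, labels in enumerate(factors):
            # Partial residual: subtract every component except component j.
            sums, counts = defaultdict(float), defaultdict(int)
            for i, label in enumerate(labels):
                others = sum(effects[k][factors[k][i]]
                             for k in range(len(factors)) if k != j)
                sums[label] += y[i] - others
                counts[label] += 1
            # Ridge-regularized least-squares update for each label of component j.
            for label in sums:
                new = sums[label] / (counts[label] + lam)
                max_change = max(max_change, abs(new - effects[j][label]))
                effects[j][label] = new
        if max_change < tol:  # stop once a full sweep changes nothing noticeable
            break
    return [dict(e) for e in effects]

# Toy data invented for illustration only (not taken from the paper or any corpus).
phrase = ["early", "early", "mid", "mid", "late", "late"] * 2
stress = ["primary", "none"] * 6
accent = ["H*", "H*", "none", "L*", "none", "none",
          "none", "H*", "L*", "none", "H*", "none"]
y = [5.4, 5.1, 5.0, 4.8, 4.9, 4.7, 5.2, 5.3, 4.9, 4.9, 5.1, 4.6]

# Center the target so the penalized offsets describe deviations from the mean.
mean_y = sum(y) / len(y)
yc = [v - mean_y for v in y]

effects = backfit(yc, [phrase, stress, accent], lam=0.5)
fitted = [effects[0][p] + effects[1][s] + effects[2][a]
          for p, s, a in zip(phrase, stress, accent)]
```

Each sweep updates one component against the residual left by the other two, which is block coordinate descent on the penalized squared error. With a positive `lam` that objective is strictly convex, so the sweeps settle on a single joint solution even though the three label sets overlap.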
doi:10.1109/icassp.2005.1415104 dblp:conf/icassp/Sakai05 fatcat:crgyurc7bnccbk5iuf5v4lm47u