Dynamic Prosody Generation for Speech Synthesis Using Linguistics-Driven Acoustic Embedding Selection
2020
Interspeech 2020
Recent advances in Text-to-Speech (TTS) have improved quality and naturalness to near-human levels. What is still lacking for human-like communication is the dynamic variation and adaptability of human speech in more complex scenarios. This work attempts to achieve a more dynamic and natural intonation in TTS systems, particularly for stylistic speech such as the newscaster speaking style. We propose a novel way of exploiting
doi:10.21437/interspeech.2020-1411
dblp:conf/interspeech/TyagiNRDL20
fatcat:5hnx77bkgnay5ed5ffijqu47vi