Parameter generation algorithm considering Modulation Spectrum for HMM-based speech synthesis

Shinnosuke Takamichi, Tomoki Toda, Alan W. Black, Satoshi Nakamura
2015 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)  
This paper proposes a novel parameter generation algorithm for high-quality speech generation in Hidden Markov Model (HMM)-based speech synthesis. One of the biggest issues causing significant quality degradation is the over-smoothing effect often observed in generated parameter trajectories. Global Variance (GV) is known as a feature well correlated with the over-smoothing effect, and a metric on the GV of the generated parameters is effectively used as a penalty term in conventional parameter generation. However, the quality of the synthetic speech is still far from that of natural speech. Recently, we have found that the Modulation Spectrum (MS) of the generated parameters, which can also be regarded as an extension of the GV, is more sensitively correlated with the over-smoothing effect than the GV. This paper incorporates a metric on the MS as a new penalty term in the proposed parameter generation algorithm. The experimental results demonstrate that the proposed parameter generation algorithm considering the MS yields significant improvements in synthetic speech quality compared to the conventional parameter generation algorithm considering the GV.
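To make the two features named in the abstract concrete, the following is a minimal sketch, not the paper's implementation. It assumes the GV is the per-dimension variance of a parameter trajectory over time and the MS is the log power spectrum of each dimension's mean-removed trajectory along the time axis. The function names, the FFT length `n_fft`, the small log floor, and the moving-average filter used to imitate over-smoothing are all illustrative assumptions.

```python
import numpy as np

def global_variance(traj):
    """Per-dimension variance over time; traj has shape (T, D)."""
    return traj.var(axis=0)

def modulation_spectrum(traj, n_fft=256):
    """Log power spectrum of each dimension's trajectory over time.

    Assumed definition for illustration: remove the temporal mean,
    zero-pad to n_fft frames, and take the log squared magnitude of
    the real FFT. Returns an array of shape (n_fft // 2 + 1, D).
    """
    centered = traj - traj.mean(axis=0)
    spec = np.fft.rfft(centered, n=n_fft, axis=0)
    return np.log(np.abs(spec) ** 2 + 1e-10)  # small floor avoids log(0)

def moving_average(traj, width=9):
    """Crude stand-in for over-smoothing: boxcar filter along time."""
    kernel = np.ones(width) / width
    cols = [np.convolve(traj[:, d], kernel, mode="same")
            for d in range(traj.shape[1])]
    return np.stack(cols, axis=1)

# Synthetic "natural" trajectory (random data, not speech parameters).
rng = np.random.default_rng(0)
natural = rng.standard_normal((200, 3))
smoothed = moving_average(natural)

gv_nat, gv_smooth = global_variance(natural), global_variance(smoothed)
ms_nat, ms_smooth = modulation_spectrum(natural), modulation_spectrum(smoothed)

print("GV natural :", gv_nat)
print("GV smoothed:", gv_smooth)
print("mean MS, upper half of bins (natural) :", ms_nat[64:].mean())
print("mean MS, upper half of bins (smoothed):", ms_smooth[64:].mean())
```

In this toy setting, smoothing lowers both the GV and the high modulation-frequency part of the MS. The GV collapses each dimension to a single number, whereas the MS keeps one value per modulation-frequency bin, which is one way to read the abstract's description of the MS as an extension of the GV.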
doi:10.1109/icassp.2015.7178764 dblp:conf/icassp/TakamichiTBN15 fatcat:up7kn7aqbvhbvjr5swrnawapxy