A copy of this work was available on the public web and has been preserved in the Wayback Machine. The capture dates from 2021; you can also visit the original URL.
In a typical voice conversion system, a vocoder is commonly used for speech-to-features analysis and features-to-speech synthesis. However, the vocoder can be a source of speech quality degradation. This paper presents a novel approach to voice conversion using WaveNet for non-parallel training data. Instead of reconstructing speech with intermediate features, the proposed approach uses WaveNet to map the Phonetic Posterior-Grams (PPGs) to the waveform samples directly. In this way, we avoid
doi:10.21437/interspeech.2019-1514 dblp:conf/interspeech/TianC019 fatcat:ebmoa63td5gbzlruatksmamnqq