A copy of this work was available on the public web and has been preserved in the Wayback Machine (capture from 2021; file type: application/pdf).
Learning to Model Prosodic and Spectral Features for Non-parallel Emotive Speech Conversion
2021
Proceedings of the Canadian Conference on Artificial Intelligence
Emotion conversion in speech has attracted recent attention owing to its importance in human-machine interaction and the current high quality of speech synthesis. Most existing approaches rely on parallel data, which is not available in many real-time applications. We propose a non-parallel emotion conversion approach based on the cycle generative adversarial network (cycleGAN) framework. We introduce new variants of cycleGAN that use recurrent neural networks and multi-kernel convolutional
doi:10.21428/594757db.930ce165
fatcat:bwji7p3cfbd2zcahhh2tcfheea