DiffSinger: Singing Voice Synthesis via Shallow Diffusion Mechanism
[article]
2022
arXiv
pre-print
Code: https://github.com/MoonInTheRiver/DiffSinger. This work was previously titled "Diffsinger: Diffusion acoustic model for singing voice synthesis". ...
In this work, we propose DiffSinger, an acoustic model for SVS based on the diffusion probabilistic model. ...
Conclusion: In this work, we proposed DiffSinger, an acoustic model for SVS based on the diffusion probabilistic model. ...
arXiv:2105.02446v6
fatcat:vx3hxjcfnrbcnflbr6zpffte34
DiffSinger: Singing Voice Synthesis via Shallow Diffusion Mechanism
2022
Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence and the Twenty-Eighth Innovative Applications of Artificial Intelligence Conference
In this work, we propose DiffSinger, an acoustic model for SVS based on the diffusion probabilistic model. ...
Singing voice synthesis (SVS) systems are built to synthesize high-quality and expressive singing voice, in which the acoustic model generates the acoustic features (e.g., mel-spectrogram) given a music ...
Conclusion: In this work, we proposed DiffSinger, an acoustic model for SVS based on the diffusion probabilistic model. ...
doi:10.1609/aaai.v36i10.21350
fatcat:t6yu52iwqbfd7mj47otyehb3by
A Survey on Recent Deep Learning-driven Singing Voice Synthesis Systems
[article]
2021
arXiv
pre-print
Singing voice synthesis (SVS) is a task that aims to generate audio signals according to musical scores and lyrics. ...
We intend to summarize their deployed model architectures and identify the strengths and limitations for each of the introduced systems. ...
DiffSinger: Denoising Diffusion Probabilistic Model + Neural Vocoder. To further enhance the accuracy and robustness of the acoustic model's mel-spectrogram predictions, DiffSinger [8] utilizes a generative ...
arXiv:2110.02511v1
fatcat:4ou5xepnjbg2todhfu3vrn7p44
SingGAN: Generative Adversarial Network For High-Fidelity Singing Voice Generation
[article]
2022
arXiv
pre-print
Deep generative models have achieved significant progress in speech synthesis to date, while high-fidelity singing voice synthesis is still an open problem for its long continuous pronunciation, rich high-frequency ...
In this work, we propose SingGAN, a generative adversarial network designed for high-fidelity singing voice synthesis. ...
[12], three popular GAN-based models for fast and high-quality audio synthesis. 5) DiffWave [13], the recently proposed diffusion probabilistic model for speech synthesis, and we use 6 iterations during ...
arXiv:2110.07468v3
fatcat:tvajeszbwrabvp6ewzujwvbntq
DiffGAN-TTS: High-Fidelity and Efficient Text-to-Speech with Denoising Diffusion GANs
[article]
2022
arXiv
pre-print
A two-stage training scheme is proposed, with a basic TTS acoustic model trained at stage one providing valuable prior information for a DDPM trained at stage two. ...
Denoising diffusion probabilistic models (DDPMs) are expressive generative models that have been used to solve a variety of speech synthesis problems. ...
Diffsinger: Diffusion acoustic model for singing voice synthesis. arXiv preprint arXiv:2105.02446, 2021a. ...
arXiv:2201.11972v1
fatcat:v5wn34qda5ap5csccywxtyoozu