DiffSinger: Singing Voice Synthesis via Shallow Diffusion Mechanism [article]

Jinglin Liu, Chengxi Li, Yi Ren, Feiyang Chen, Zhou Zhao
2022 arXiv   pre-print
Codes: https://github.com/MoonInTheRiver/DiffSinger. The old title of this work: "Diffsinger: Diffusion acoustic model for singing voice synthesis".  ...  In this work, we propose DiffSinger, an acoustic model for SVS based on the diffusion probabilistic model.  ...  Conclusion In this work, we proposed DiffSinger, an acoustic model for SVS based on diffusion probabilistic model.  ... 
arXiv:2105.02446v6 fatcat:vx3hxjcfnrbcnflbr6zpffte34

DiffSinger: Singing Voice Synthesis via Shallow Diffusion Mechanism

Jinglin Liu, Chengxi Li, Yi Ren, Feiyang Chen, Zhou Zhao
2022 Proceedings of the AAAI Conference on Artificial Intelligence
In this work, we propose DiffSinger, an acoustic model for SVS based on the diffusion probabilistic model.  ...  Singing voice synthesis (SVS) systems are built to synthesize high-quality and expressive singing voice, in which the acoustic model generates the acoustic features (e.g., mel-spectrogram) given a music  ...  Conclusion In this work, we proposed DiffSinger, an acoustic model for SVS based on diffusion probabilistic model.  ... 
doi:10.1609/aaai.v36i10.21350 fatcat:t6yu52iwqbfd7mj47otyehb3by

A Survey on Recent Deep Learning-driven Singing Voice Synthesis Systems [article]

Yin-Ping Cho, Fu-Rong Yang, Yung-Chuan Chang, Ching-Ting Cheng, Xiao-Han Wang, Yi-Wen Liu
2021 arXiv   pre-print
Singing voice synthesis (SVS) is a task that aims to generate audio signals according to musical scores and lyrics.  ...  We intend to summarize their deployed model architectures and identify the strengths and limitations for each of the introduced systems.  ...  DiffSinger: Denoising Diffusion Probabilistic Model + Neural Vocoder To further enhance the acoustic model's prediction accuracy and robustness of Mel-spectrograms, DiffSinger [8] utilizes a generative  ... 
arXiv:2110.02511v1 fatcat:4ou5xepnjbg2todhfu3vrn7p44

SingGAN: Generative Adversarial Network For High-Fidelity Singing Voice Generation [article]

Rongjie Huang, Chenye Cui, Feiyang Chen, Yi Ren, Jinglin Liu, Zhou Zhao, Baoxing Huai, Zhefeng Wang
2022 arXiv   pre-print
Deep generative models have achieved significant progress in speech synthesis to date, while high-fidelity singing voice synthesis is still an open problem for its long continuous pronunciation, rich high-frequency  ...  In this work, we propose SingGAN, a generative adversarial network designed for high-fidelity singing voice synthesis.  ...  [12], three popular GAN-based models for fast and high-quality audio synthesis. 5) DiffWave [13], the recently proposed diffusion probabilistic model for speech synthesis, and we use 6 iterations during  ... 
arXiv:2110.07468v3 fatcat:tvajeszbwrabvp6ewzujwvbntq

DiffGAN-TTS: High-Fidelity and Efficient Text-to-Speech with Denoising Diffusion GANs [article]

Songxiang Liu, Dan Su, Dong Yu
2022 arXiv   pre-print
A two-stage training scheme is proposed, with a basic TTS acoustic model trained at stage one providing valuable prior information for a DDPM trained at stage two.  ...  Denoising diffusion probabilistic models (DDPMs) are expressive generative models that have been used to solve a variety of speech synthesis problems.  ...  Diffsinger: Diffusion acoustic model for singing voice synthesis. arXiv preprint arXiv:2105.02446, 2021a.  ... 
arXiv:2201.11972v1 fatcat:v5wn34qda5ap5csccywxtyoozu