Evaluating Long-form Text-to-Speech: Comparing the Ratings of Sentences and Paragraphs [article]

Rob Clark, Hanna Silen, Tom Kenter, Ralph Leith
2019 arXiv   pre-print
In this paper, we investigate three different ways of evaluating the naturalness of long-form text-to-speech synthesis.  ...  Text-to-speech systems are typically evaluated on single sentences.  ...  Introduction Traditionally, text-to-speech (TTS) systems are trained on corpora of isolated sentences.  ... 
arXiv:1909.03965v1 fatcat:5izucgp26fhrrmywsnqvihkf7a

Choice of Voices

Julia Cambre, Jessica Colnago, Jim Maddock, Janice Tsai, Jofish Kaye
2020 Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems  
The advancement of text-to-speech (TTS) voices and a rise of commercial TTS platforms allow people to easily experience TTS voices across a variety of technologies, applications, and form factors.  ...  We conclude with considerations for selecting text-to-speech voices for long-form content.  ...  We also thank our reviewers for their time and valuable feedback.  ... 
doi:10.1145/3313831.3376789 dblp:conf/chi/CambreCMTK20 fatcat:gk5spzgybjf67noyphx72cxyqa

Evaluating Long-form Text-to-Speech: Comparing the Ratings of Sentences and Paragraphs

Rob Clark, Hanna Silen, Tom Kenter, Ralph Leith
2019 10th ISCA Speech Synthesis Workshop   unpublished
In this paper, we investigate three different ways of evaluating the naturalness of long-form text-to-speech synthesis.  ...  Text-to-speech systems are typically evaluated on single sentences.  ...  Specific thanks to Xingyang Cai, Anna Greenwood, Mateusz Westa, Dina Kelesi and Leilani Kurtak-McDonald for help with evaluation tools and voice building.  ... 
doi:10.21437/ssw.2019-18 fatcat:dwoulbvl5jfkxod5q6fe6gpewq

Smatts: Standard Malay Text To Speech System

Othman O. Khalifa, Zakiah Hanim Ahmad, Teddy Surya Gunawan
2007 Zenodo  
This paper presents a rule-based text-to-speech (TTS) Synthesis System for Standard Malay, namely SMaTTS.  ...  As for the evaluation tests, a set of Diagnostic Rhyme Test (DRT) word list was compiled and several experiments have been performed to evaluate the quality of the synthesized speech by analyzing the Mean  ...  EVALUATION OF TEXT-TO-SPEECH SYSTEMS In order to evaluate text-to-speech systems, Klatt specifies some criteria.  ... 
doi:10.5281/zenodo.1079187 fatcat:ajju6zh3nbh5fkkl2onuvqhwlu

A Hakka Text-To-Speech System [chapter]

Hsiu-Min Yu, Hsin-Te Hwang, Dong-Yi Lin, Sin-Horng Chen
2006 Lecture Notes in Computer Science  
In this paper, the implementation of a Hakka text-to-speech (TTS) system is presented.  ...  The system is designed based on the same principle of developing a Mandarin and a Min-Nan TTS systems proposed previously.  ...  The system was designed based on the same principle of developing a Mandarin and a Min-Nan/Taiwanese TTS systems proposed previously. Experimental results confirmed that the system performed well.  ... 
doi:10.1007/11939993_28 fatcat:6xq2a3bzczb3zpvxolevjlqv2u

Text analysis and language identification for polyglot text-to-speech synthesis

Harald Romsdorfer, Beat Pfister
2007 Speech Communication  
The challenge for a text analysis component of a text-to-speech synthesis system is to derive from mixed-lingual sentences the correct polyglot phone sequence and all information necessary to generate natural  ...  In multilingual countries, text-to-speech synthesis systems often have to deal with texts containing inclusions of multiple other languages in the form of phrases, words, or even parts of words.  ...  Acknowledgements We cordially thank Alexis Wilpert and Yan Bi for providing the Chinese example sentences.  ... 
doi:10.1016/j.specom.2007.04.006 fatcat:u6u5j7qlebf5vdzoxwjwvotb2y

Adapting Prosody in a Text-to-Speech System [chapter]

Janez Stergar, Caglayan Erdem
2010 Products and Services; from R&D to Final Solutions  
One of the major problems in text-to-speech synthesis systems is the automatic generation of a natural and intelligible prosody.  ...  parameters given by sound duration and intonation contours enables a TTS backend to produce natural-sounding, high quality, synthetic speech (Edgington et  ... 
doi:10.5772/10398 fatcat:ijc4hpifmfetdn74hk52lepx2q

Triple M: A Practical Text-to-speech Synthesis System With Multi-guidance Attention And Multi-band Multi-time LPCNet [article]

Shilun Lin, Fenglong Xie, Li Meng, Xinhui Li, Li Lu
2021 arXiv   pre-print
In this work, a robust and efficient text-to-speech (TTS) synthesis system named Triple M is proposed for large-scale online application.  ...  Compared with single attention mechanism, multi-guidance attention not only brings better naturalness to long sentence synthesis, but also reduces the word error rate by 26.8%. 2) A new efficient multi-band  ...  In this way, new properties can be assigned to the text-to-speech system without modifying the online service.  ... 
arXiv:2102.00247v4 fatcat:isvzpo2scrhh3g3e5n3boso4bi

Speech-T: Transducer for Text to Speech and Beyond

Jiawei Chen, Xu Tan, Yichong Leng, Jin Xu, Guihua Wen, Tao Qin, Tie-Yan Liu
2021 Neural Information Processing Systems  
Considering that monotonic alignments are also critical to text to speech (TTS) synthesis and streaming TTS is also an important application scenario, in this work, we explore the possibility of applying  ...  supports streaming TTS with good voice quality; and 3) enjoys the benefit of joint modeling TTS and ASR in a single network.  ...  of text to speech (TTS) synthesis [26, 30, 22, 1, 6, 16, 13, 20, 21, 11] .  ... 
dblp:conf/nips/ChenTLXWQL21 fatcat:tdfdfj7b7feotbk32kagbshtai

Modeling prosody patterns for Chinese expressive text-to-speech synthesis

Zhiyong Wu, Lianhong Cai, Helen M. Meng
2010 2010 7th International Symposium on Chinese Spoken Language Processing  
This paper proposes an approach for modeling the prosody patterns of the acoustic features for Chinese expressive text-to-speech (TTS) synthesis.  ...  Keywords: expressive text-to-speech (TTS); prosody pattern; non-linear perturbation model  ...  The results also indicate that the method can be successfully extended to new speakers. VI. CONCLUSIONS AND FUTURE WORK This work aims to enhance expressivity of text-to-speech (TTS) outputs.  ... 
doi:10.1109/iscslp.2010.5684494 dblp:conf/iscslp/WuCM10 fatcat:lpy6zmgc6zdbfgtwppjqqqoxhe

A set of corpus-based text-to-speech synthesis technologies for Mandarin Chinese

Fu-Chiang Chou, Chiu-Yu Tseng, Lin-Shan Lee
2002 IEEE Transactions on Speech and Audio Processing  
This paper presents a set of corpus-based text-to-speech synthesis technologies for Mandarin Chinese.  ...  Hierarchical prosodic structure for an arbitrary desired text sentence is then generated based on the identification of different levels of break indices, and the prosodic feature sets and appropriate  ...  In this paper, a new set of text-to-speech synthesis technologies for Mandarin Chinese is presented.  ... 
doi:10.1109/tsa.2002.803437 fatcat:7x6s3g4fvbeprbx4ajtbzyieme

Exploring Efficient Neural Architectures for Linguistic–Acoustic Mapping in Text-To-Speech

Santiago Pascual, Joan Serrà, Antonio Bonafonte
2019 Applied Sciences  
Conversion from text to speech relies on the accurate mapping from linguistic to acoustic symbol sequences, for which current practice employs recurrent statistical models such as recurrent neural networks  ...  Our results show that the proposed decoder networks are competitive in terms of distortion when compared to a recurrent baseline, whilst being significantly faster in terms of CPU and GPU inference time  ...  Acknowledgments: We deeply thank the participants of the subjective evaluation. Conflicts of Interest: The authors declare no conflict of interest.  ... 
doi:10.3390/app9163391 fatcat:xwifhwme7rapnbry2tmyhful4m

An RNN-based prosodic information synthesizer for Mandarin text-to-speech

Sin-Horng Chen, Shaw-Hwa Hwang, Yih-Ru Wang
1998 IEEE Transactions on Speech and Audio Processing  
A new RNN-based prosodic information synthesizer for Mandarin Chinese text-to-speech (TTS) is proposed in this paper.  ...  Index Terms: Mandarin, pitch contour, prosodic information synthesizer, recurrent neural network, text-to-speech.  ...  ACKNOWLEDGMENT The authors thank the Telecommunication Laboratories, MOTC, for supplying the speech data base, and Academia Sinica for supplying the lexicon.  ... 
doi:10.1109/89.668817 fatcat:4pco67wysfhzzjo57vjhhk2p4u

A unit selection text-to-speech synthesis system optimized for use with screen readers

Aimilios Chalamandaris, Sotiris Karabetsos, Pirros Tsiakoulis, Spyros Raptis
2010 IEEE transactions on consumer electronics  
Currently, unit-selection text-to-speech technology is the common approach for near-natural speech synthesis systems.  ...  This work describes the design and the implementation approaches for the efficient integration of this technology into screen reading environments.  ...  Chrysses and Mr. M. Alexandrakis for their contribution during user requirements design stage, as well as during the beta testing evaluation phase.  ... 
doi:10.1109/tce.2010.5606343 fatcat:4sys7zthuzdapdtp6mupemsd5m

Is text-to-speech synthesis ready for use in computer-assisted language learning?

Zöe Handley
2009 Speech Communication  
Text-to-Speech (TTS) synthesis, the generation of speech from text input, offers another means of providing spoken language input to learners in Computer-Assisted Language Learning (CALL) environments.  ...  In this paper, the aforementioned aspects of the quality of the output of four state-of-the-art French TTS synthesis systems are evaluated with respect to their use in the three different roles that TTS  ...  A further distinction is made between Text-to-Speech (TTS) synthesis systems and Concept-to-Speech (CTS) synthesis systems.  ... 
doi:10.1016/j.specom.2008.12.004 fatcat:lyif3pdq3zhmtj3bop24ag2efm