Hierarchical pitch target model for Mandarin speech

Zhiping Zhang, Xinhao Wang, Yansuo Yu, Xihong Wu
2010 2010 7th International Symposium on Chinese Spoken Language Processing  
Finally, the learned tonal patterns of syllables and the intonation patterns from different prosodic layers are illustrated, and the synthesis experiments show the effectiveness of the presented model.  ...  In this study, a hierarchical pitch target model is proposed to analyze the underlying factors of tones and intonation in Mandarin pitch, which can be applied in speech synthesis systems.  ...  As expected, simulations on Mandarin speech corpus have revealed the rationality of the learned tonal patterns and intonation patterns from fluent speech.  ... 
doi:10.1109/iscslp.2010.5684865 dblp:conf/iscslp/ZhangWYW10 fatcat:jklk72eeejgxvdhamnmn4ffkby

An RNN-based prosodic information synthesizer for Mandarin text-to-speech

Sin-Horng Chen, Shaw-Hwa Hwang, Yih-Ru Wang
1998 IEEE Transactions on Speech and Audio Processing  
A new RNN-based prosodic information synthesizer for Mandarin Chinese text-to-speech (TTS) is proposed in this paper.  ...  Index Terms: Mandarin, pitch contour, prosodic information synthesizer, recurrent neural network, text-to-speech.  ...  His general research interests are Mandarin speech recognition and the application of neural networks in speech processing.  ... 
doi:10.1109/89.668817 fatcat:4pco67wysfhzzjo57vjhhk2p4u

A Novel Prosodic-Information Synthesizer Based on Recurrent Fuzzy Neural Network for the Chinese TTS System

C.-T. Lin, R.-C. Wu, J.-Y. Chang, S.-F. Liang
2004 IEEE Transactions on Systems Man and Cybernetics Part B (Cybernetics)  
In this paper, a new technique for the Chinese text-to-speech (TTS) system is proposed. Our major effort focuses on the prosodic information generation.  ...  Index Terms: Chinese text-to-speech system, fuzzy inference engine, prosodic information, recurrent neural network, sandhi rules, speech synthesizer.  ...  Cheng-Hsiung Tsai and Shean-Yih Lin for their help with developing the Chinese TTS system.  ... 
doi:10.1109/tsmcb.2003.811518 pmid:15369074 fatcat:fbn6zbzb7zf3lezdiu3bzar4bi

Hierarchical stress modeling and generation in mandarin for expressive Text-to-Speech

Ya Li, Jianhua Tao, Keikichi Hirose, Xiaoying Xu, Wei Lai
2015 Speech Communication  
Automatic stress prediction is helpful for both speech synthesis and natural speech understanding. This paper proposes a novel hierarchical Mandarin stress modeling method.  ...  The top level emphasizes stressed syllables, while the bottom level focuses on unstressed syllables for the first time due to its importance in both naturalness and expressiveness of synthetic speech.  ...  Therefore, the sentence-level stressed syllables should be fully investigated for speech technology, especially for speech synthesis which aims to produce human-like natural speech.  ... 
doi:10.1016/j.specom.2015.05.003 fatcat:nttxqdwm7jb2tmbuxrsvipe27i

A set of corpus-based text-to-speech synthesis technologies for Mandarin Chinese

Fu-Chiang Chou, Chiu-Yu Tseng, Lin-Shan Lee
2002 IEEE Transactions on Speech and Audio Processing  
This paper presents a set of corpus-based text-to-speech synthesis technologies for Mandarin Chinese.  ...  Hierarchical prosodic structure for an arbitrary desired text sentence is then generated based on the identification of different levels of break indices, and the prosodic feature sets and appropriate  ...  In this paper, a new set of text-to-speech synthesis technologies for Mandarin Chinese is presented.  ... 
doi:10.1109/tsa.2002.803437 fatcat:7x6s3g4fvbeprbx4ajtbzyieme

Pre-Trained Text Representations for Improving Front-End Text Processing in Mandarin Text-to-Speech Synthesis

Bing Yang, Jiaqi Zhong, Shan Liu
2019 Interspeech 2019  
In this paper, we propose a novel method to improve the performance and robustness of the front-end text processing modules of Mandarin text-to-speech (TTS) synthesis.  ...  Specifically, we get an absolute improvement of 0.013 and 0.027 in F1 score for prosodic word prediction and prosodic phrase prediction respectively, and an absolute improvement of 2.44% in polyphone disambiguation  ...  It usually contains three main components, including text normalization, prosodic structure prediction and grapheme-to-phoneme  ...  [Figure 1: A typical pipeline of text processing of a Mandarin text-to-speech.]
doi:10.21437/interspeech.2019-1418 dblp:conf/interspeech/YangZL19 fatcat:3kgp3gwfkrhzhl6lyu7hxbhase

BLSTM-CRF Based End-to-End Prosodic Boundary Prediction with Context Sensitive Embeddings in a Text-to-Speech Front-End

Yibin Zheng, Jianhua Tao, Zhengqi Wen, Ya Li
2018 Interspeech 2018  
The word embedding layer is employed to learn the task-specific embeddings for prosodic boundary prediction.  ...  In this paper, we propose a language-independent end-to-end architecture for prosodic boundary prediction based on BLSTM-CRF.  ...  Experiments and result analysis Dataset We evaluate the proposed methods on both Mandarin and English dataset. These two datasets are both recorded for speech synthesis task.  ... 
doi:10.21437/interspeech.2018-1472 dblp:conf/interspeech/ZhengTWL18 fatcat:r4v4bhjawnbabnxtunbpmj7bcm

Tone Learning in Low-Resource Bilingual TTS

Ruolan Liu, Xue Wen, Chunhui Lu, Xiao Chen
2020 Interspeech 2020  
We present a system for low-resource multi-speaker crosslingual text-to-speech synthesis.  ...  The Mandarin training data is limited to 15 minutes of speech by a female Mandarin speaker.  ...  This is often seen in human 2nd-language learners and is closely related to the reproduction of one's native prosodic patterns in 2nd language.  ... 
doi:10.21437/interspeech.2020-2180 dblp:conf/interspeech/LiuWLC20 fatcat:z6ymxpqpevevjjif3a6j4xfzy4

Prosodic Modeling for Isolated Mandarin Words and its Application

Hung-Kuang Shih, Chen-Yu Chiang, Yih-Ru Wang, Sin-Horng Chen
2008 2008 6th International Symposium on Chinese Spoken Language Processing  
Based on the prosodic model, a learning system for Mandarin word prosody pronunciation is designed and implemented for nonnative speakers.  ...  In this paper, a new approach to syllable-based modeling of F0 contour, duration and energy for isolated Mandarin words is proposed.  ...  The user can learn to speak Mandarin word via imitating the synthesized speech. The system will record the user's speech and on-line extract its prosodic features for display.  ... 
doi:10.1109/chinsl.2008.ecp.70 dblp:conf/iscslp/ShihCWC08 fatcat:xxpdgmirrjhbpnnujjgpjcepqq

Modeling of Speaking Rate Influences on Mandarin Speech Prosody and Its Application to Speaking Rate-controlled TTS

Sin-Horng Chen, Chiao-Hua Hsieh, Chen-Yu Chiang, Hsi-Chun Hsiao, Yih-Ru Wang, Yuan-Fu Liao, Hsiu-Min Yu
2014 IEEE/ACM Transactions on Audio Speech and Language Processing  
text, and prosodic tags representing the prosodic structure of speech.  ...  rates, to describe the influences of speaking rate on Mandarin speech prosody is proposed.  ...  ACKNOWLEDGMENT The authors would like to thank the ACLCLP for providing the Treebank Corpus.  ... 
doi:10.1109/taslp.2014.2321482 fatcat:pu43xyrqajddpeaqzlihu6nfyq

Unsupervised Prosodic Labeling of Speech Synthesis Databases Using Context-Dependent HMMs

Chen-Yu YANG, Zhen-Hua LING, Li-Rong DAI
2014 IEICE transactions on information and systems  
In this paper, an automatic and unsupervised method using context-dependent hidden Markov models (CD-HMMs) is proposed for the prosodic labeling of speech synthesis databases.  ...  The performance of the proposed method is evaluated on Mandarin speech synthesis databases and two prosodic descriptors are investigated, i.e., the prosodic phrase boundary and the emphasis expression.  ...  ., Hefei, China, for providing the speech database with manual annotations.  ... 
doi:10.1587/transinf.e97.d.1449 fatcat:jffrchvvkrb7lisbicayaxb2au

Effects of native language experience on perceptual learning of Cantonese lexical tones

Alexander L. Francis, Valter Ciocca, Lian Ma
2004 Journal of the Acoustical Society of America  
Acoustic information extracted from the speech waveform is mapped into inputs for HLsyn.  ...  This analysis by synthesis approach is a method to develop a more precise picture of the planning stage during speech production where the acoustic phonetics must be carefully planned and modified to achieve  ... 
doi:10.1121/1.4783669 fatcat:nmcgpgtav5ajvnu7p6jb5y4t24

Spectral and prosodic transformations of hearing-impaired Mandarin speech

Cheng-Lung Lee, Wen-Whei Chang, Yuan-Chuan Chiang
2006 Speech Communication  
This paper studies the combined use of spectral and prosodic conversions to enhance the hearing-impaired Mandarin speech.  ...  The analysis-synthesis system is based on a sinusoidal representation of the speech production mechanism.  ...  With reference to the sinusoidal framework, speech parameters included in the prosodic conversion are P 0 (m), P v (m), and the synthesis frame interval.  ... 
doi:10.1016/j.specom.2005.08.001 fatcat:kobibx7w4rerrhukce4hnj4zxi

Exploiting Prosody Hierarchy and Dynamic Features for Pitch Modeling and Generation in HMM-Based Speech Synthesis

Chi-Chun Hsia, Chung-Hsien Wu, Jung-Yun Wu
2010 IEEE Transactions on Audio, Speech, and Language Processing  
This paper proposes a method for modeling and generating pitch in hidden Markov model (HMM)-based Mandarin speech synthesis by exploiting prosody hierarchy and dynamic pitch features.  ...  Index Terms: Dynamic features, hidden Markov model (HMM)-based speech synthesis, pitch modeling and generation, prosody hierarchy.  ...  Kawahara for helping with the STRAIGHT analysis/synthesis program, as well as Dr. Tokuda for providing the HTS speech synthesis program.  ... 
doi:10.1109/tasl.2010.2040791 fatcat:jbhyzk3uavcbtp2wai63rqo6ee

Modeling Prosodic Phrasing with Multi-Task Learning in Tacotron-based TTS [article]

Rui Liu, Berrak Sisman, Feilong Bao, Guanglai Gao, Haizhou Li
2020 arXiv pre-print
In this paper, we extend the Tacotron-based speech synthesis framework to explicitly model the prosodic phrase breaks.  ...  To our best knowledge, this is the first implementation of multi-task learning for Tacotron based TTS with a prosodic phrasing model.  ...  Databases Speech Data: We use the TsingHua-Corpus of Speech Synthesis (TH-CoSS) [39] for Chinese.  ... 
arXiv:2008.05284v1 fatcat:cqeky4hzu5fx3aavg26ql7eni4