A Chinese text-to-speech system based on part-of-speech analysis, prosodic modeling and non-uniform units

Fu-Chiang Chou, Chiu-Yu Tseng, Keh-Jiann Chen, Lin-Shan Lee
1997 IEEE International Conference on Acoustics, Speech, and Signal Processing  
This paper presents a new Chinese text-to-speech system that produces very natural and intelligible synthetic Mandarin speech based on part-of-speech analysis, prosodic modeling and non-uniform units. The distinguishing features and key technologies of the system can be summarized as follows: (1) A text analysis module for word identification and tagging was developed based on part-of-speech modeling and heuristic rules to achieve very high accuracy. (2) The required prosodic parameters ... the synthetic speech are derived from a two-stage procedure. The prosodic structures of the input texts are first derived from a statistical model trained on a large speech database, and the prosodic parameters are then determined according to these structures. (3) A specially designed speech segment inventory constructed with non-uniform and pitch-dependent units is used to improve the fluency and intelligibility of the system.
doi:10.1109/icassp.1997.596087 dblp:conf/icassp/ChouTCL97 fatcat:cupyo4raejenbj4sovjeg5um2i