Statistical approach to voice quality control in esophageal speech enhancement

Kenzo Yamamoto, Tomoki Toda, Hironori Doi, Hiroshi Saruwatari, Kiyohiro Shikano
2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
This paper describes a voice quality control method in statistical esophageal speech enhancement. Esophageal speech is produced by one of the alternative speaking methods for laryngectomees. Its naturalness and intelligibility are much lower than those of natural voices and its voice quality sounds similar even if uttered by different laryngectomees. These issues are alleviated by a statistical voice conversion method from esophageal speech into normal speech (ES-to-Speech) based on eigenvoices. This method is capable of determining converted voice quality using a few target voice samples. In this paper, we propose ES-to-Speech using regression techniques to make it possible to manually control the converted voice quality by manipulating a few intuitively controllable parameters even if no target voice sample is available. The effectiveness of the proposed method is confirmed by experimental evaluations.
doi:10.1109/icassp.2012.6287949 dblp:conf/icassp/YamamotoTDSS12 fatcat:3lhajz6ov5fj5g3dleaju4phj4