Phoneme-to-Articulatory Mapping Using Bidirectional Gated RNN

Théo Biasutto-Lervat, Slim Ouni
Interspeech 2018
Deriving articulatory dynamics from the acoustic speech signal has been addressed in several speech production studies. In this paper, we investigate whether it is possible to predict articulatory dynamics from phonetic information alone, without the acoustic speech signal. Such input may be considered acoustically poor, since it probably carries no explicit coarticulation information, but we expect the phonetic sequence to provide compact yet rich knowledge. Motivated by the recent success of deep learning techniques in acoustic-to-articulatory inversion, we experimented with bidirectional gated recurrent neural network architectures. We trained these models on an EMA corpus and obtained good performance, similar to state-of-the-art articulatory inversion from LSF features, while using only the phoneme labels and durations.
doi:10.21437/interspeech.2018-1202 dblp:conf/interspeech/Biasutto-Lervat18 fatcat:anxzcopwdrbzfkd5gxtbnmaa3y