Phonetic recognition by recurrent neural networks working on audio and visual information

P. Cosi, M. Dugatto, F. Ferrero, E. Magno Caldognetto, K. Vagges
1996 Speech Communication  
A phonetic classification scheme based on a feed-forward recurrent back-propagation neural network working on audio and visual information is described. The speech signal is processed by an auditory model producing spectral-like parameters, while the visual signal is processed by specialised hardware, called ELITE, which computes lip and jaw kinematic parameters. Results are given for various speaker-dependent and speaker-independent phonetic recognition experiments on the Italian plosive consonants.
doi:10.1016/0167-6393(96)00034-9 fatcat:twjse5vl3jf77iefvqre3rsike