A copy of this work was available on the public web and has been preserved in the Wayback Machine. The capture dates from 2018; you can also visit the original URL.
End-to-End Visual Speech Recognition with LSTMs
2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
Traditional visual speech recognition systems consist of two stages: feature extraction and classification. Recently, several deep learning approaches have been presented which automatically extract features from the mouth images, aiming to replace the feature extraction stage. However, research on joint learning of features and classification is very limited. In this work, we present an end-to-end visual speech recognition system based on Long Short-Term Memory (LSTM) networks. To the best of our [...]
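The end-to-end idea in the abstract — learning features directly from mouth-image pixels and classifying the sequence with an LSTM in a single model — can be sketched as follows. This is an illustrative NumPy forward pass only, not the paper's architecture: all dimensions, the single dense "encoder" layer, and the random weights are assumptions for the sketch.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)

# Hypothetical dimensions (not from the paper): 10 frames of 32x32 mouth ROIs.
T, H, W = 10, 32, 32
feat_dim, hid_dim, n_classes = 64, 32, 5

frames = rng.standard_normal((T, H * W))

# "Encoder": one dense layer standing in for the learned feature extractor,
# so features come straight from the pixels rather than a hand-crafted stage.
W_enc = rng.standard_normal((H * W, feat_dim)) * 0.01

# LSTM parameters, fused for the four gates (input, forget, cell, output).
W_x = rng.standard_normal((feat_dim, 4 * hid_dim)) * 0.01
W_h = rng.standard_normal((hid_dim, 4 * hid_dim)) * 0.01
b = np.zeros(4 * hid_dim)

# Classifier head on the final hidden state.
W_out = rng.standard_normal((hid_dim, n_classes)) * 0.01

h = np.zeros(hid_dim)
c = np.zeros(hid_dim)
for x_img in frames:
    x = np.tanh(x_img @ W_enc)          # per-frame features from raw pixels
    z = x @ W_x + h @ W_h + b           # all four gate pre-activations at once
    i, f, g, o = np.split(z, 4)
    i, f, o = sigmoid(i), sigmoid(f), sigmoid(o)
    c = f * c + i * np.tanh(g)          # cell-state update
    h = o * np.tanh(c)                  # hidden-state update

# Softmax over the classifier logits gives per-class probabilities.
logits = h @ W_out
probs = np.exp(logits - logits.max())
probs /= probs.sum()
print(probs.shape)  # (5,)
```

Because the encoder and the LSTM sit in one differentiable graph, training (omitted here) would backpropagate the classification loss through both, which is the joint feature/classifier learning the abstract contrasts with two-stage pipelines.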
doi:10.1109/icassp.2017.7952625
dblp:conf/icassp/PetridisLP17
fatcat:vg2mewdsibgflktoeibzu6fvrq