Speech Recognition Using Historian Multimodal Approach
2019
The Egyptian Journal of Language Engineering
This paper proposes an Audio-Visual Speech Recognition (AVSR) model that uses both audio and visual speech information to improve recognition accuracy in clean and noisy environments. Mel-frequency cepstral coefficients (MFCC) and the Discrete Cosine Transform (DCT) are used to extract features from the audio and visual speech signals, respectively. Classification is performed on the combined feature vector using one of the main Deep Neural Network (DNN) architectures, Bidirectional
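The feature-combination step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how the DCT coefficients are selected or how the two feature streams are combined, so the block below assumes that a small square of low-frequency 2-D DCT coefficients is kept from a mouth-region image and that fusion is plain concatenation with the MFCC vector of one audio frame. The names `visual_features`, `fuse`, `mouth_roi`, and `mfcc_frame` are hypothetical, and the MFCC values are a zero-filled placeholder rather than real features.

```python
import math


def dct2_1d(x):
    """Unnormalized DCT-II of a 1-D sequence."""
    n = len(x)
    return [
        sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
        for k in range(n)
    ]


def dct2_2d(block):
    """Separable 2-D DCT-II: transform each row, then each column."""
    rows = [dct2_1d(r) for r in block]
    cols = [dct2_1d(list(c)) for c in zip(*rows)]
    # cols is indexed [column][frequency]; transpose back to [u][v].
    return [list(r) for r in zip(*cols)]


def visual_features(block, keep=3):
    """Assumption: keep the top-left keep x keep low-frequency coefficients."""
    coeffs = dct2_2d(block)
    return [coeffs[u][v] for u in range(keep) for v in range(keep)]


def fuse(audio_feats, visual_feats):
    """Assumption: feature-level fusion by simple concatenation."""
    return list(audio_feats) + list(visual_feats)


# Hypothetical inputs: a 4x4 grayscale mouth region and a 13-value MFCC frame.
mouth_roi = [[float((r * 4 + c) % 7) for c in range(4)] for r in range(4)]
mfcc_frame = [0.0] * 13  # placeholder, not real MFCCs

fused = fuse(mfcc_frame, visual_features(mouth_roi, keep=3))
print(len(fused))  # 13 audio values + 9 visual values
```

In a full system, one such fused vector per time step would form the input sequence to the classifier; the classifier itself is left out here because the abstract is cut off before naming it in full.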
doi:10.21608/ejle.2019.59164
fatcat:ylyu5apzuzakvefkxxycavcxei