Viseme Recognition using lip curvature and Neural Networks to detect Bangla Vowels

Nahid Akhter, Amitabha Chakrabarty
Automatic speech recognition plays an important role in human-computer interaction and can be applied in vital applications such as crime-fighting and helping the hearing-impaired. This paper presents a new method for recognizing Bengali visemes that combines image-based lip segmentation techniques, the curvature of both the inner and outer lips, and neural networks. The method is divided into three steps. The first step is lip segmentation, which uses a [...]ion of the red exclusion method, HSV space, and CIE spaces to produce illumination-invariant images; the inner and outer lips are then extracted separately using a new curve-fitting technique. The second step is feature extraction, which uses the quadratic curve coefficients of the inner and outer lip contours. Finally, viseme recognition is performed using a neural network. A dataset was created with 171 lip images of Bangla visemes spoken by different speakers under different lighting conditions. The proposed method achieved a viseme recognition rate of 87.3%. Because it uses a non-iterative method, in contrast to conventional methods, the algorithm was found to be faster at detecting lip contours.
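
The feature extraction step described above can be illustrated with a minimal sketch. It assumes each lip contour is available as a set of (x, y) points and fits y = a*x^2 + b*x + c by least squares with NumPy; the function names, the representation of contours as point arrays, and the way coefficients are concatenated into one vector are assumptions made for illustration, not details taken from the paper.

```python
import numpy as np


def quadratic_coefficients(points):
    """Least-squares fit of y = a*x^2 + b*x + c to contour points.

    `points` is an (N, 2) array-like of (x, y) pixel coordinates
    (hypothetical input format). Returns the array [a, b, c].
    """
    pts = np.asarray(points, dtype=float)
    x, y = pts[:, 0], pts[:, 1]
    # np.polyfit returns coefficients from highest degree to lowest.
    return np.polyfit(x, y, deg=2)


def lip_feature_vector(contours):
    """Concatenate the quadratic coefficients of several contours.

    `contours` is a list of (N, 2) point sets, for example one per
    inner or outer lip curve (an assumed layout, not the paper's).
    """
    return np.concatenate([quadratic_coefficients(c) for c in contours])


if __name__ == "__main__":
    # Synthetic contours sampled from known parabolas.
    xs = np.linspace(-2.0, 2.0, 9)
    outer = np.column_stack([xs, 0.5 * xs**2 - 1.0])
    inner = np.column_stack([xs, 0.2 * xs**2 - 0.3])
    print(lip_feature_vector([outer, inner]))
```

A vector built this way has three values per contour and could serve as the input to a classifier such as a neural network; how the paper normalizes or orders these coefficients is not stated in the abstract.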