Linear discriminant analysis for speechreading

G. Potamianos, H.P. Graf
1998 IEEE Second Workshop on Multimedia Signal Processing (Cat. No.98EX175)  
This paper investigates the use of Fisher-Rao linear discriminant analysis (LDA) as a means of visual feature extraction for hidden Markov model based automatic speechreading. For every video frame, a three-dimensional region of interest containing the speaker's mouth over a sequence of adjacent frames is lexicographically arranged into a data vector. Such vectors are then projected onto the space of the most discriminant "eigensequences", estimated by means of LDA on a training set of image sequence vectors, labeled from a set of a-priori chosen classes. The resulting projections, as well as their first and second derivatives over time, are used as features for automatic speechreading. The proposed method is applied to single-speaker, multi-speaker, and speaker-independent visual-only recognition tasks, consistently outperforming principal component analysis and discrete wavelet transform based visual features. Specific issues relevant to LDA are also discussed, namely, class selection, automatic data class labeling, and dimensionality reduction prior to LDA.
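
The feature pipeline summarized in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the edge padding at sequence boundaries, the ridge term added to the within-class scatter, the use of `np.gradient` for the time derivatives, and the synthetic data are all assumptions made for illustration only.

```python
import numpy as np

def stack_frames(frames, context=1):
    """Turn a (T, H, W) mouth-ROI video into (T, (2*context+1)*H*W) data vectors."""
    T = frames.shape[0]
    # Assumption: repeat the edge frames so every frame has a full neighbourhood.
    padded = np.pad(frames, ((context, context), (0, 0), (0, 0)), mode="edge")
    # Each row is the flattened 3-D block of adjacent frames centred on frame t.
    return np.stack([padded[t:t + 2 * context + 1].ravel() for t in range(T)])

def fit_lda(X, y, n_components, ridge=1e-3):
    """Return (mean, W); the columns of W are the leading discriminant directions."""
    mean = X.mean(axis=0)
    d = X.shape[1]
    Sw = np.zeros((d, d))  # within-class scatter
    Sb = np.zeros((d, d))  # between-class scatter
    for c in np.unique(y):
        Xc = X[y == c]
        mc = Xc.mean(axis=0)
        Sw += (Xc - mc).T @ (Xc - mc)
        diff = (mc - mean)[:, None]
        Sb += len(Xc) * (diff @ diff.T)
    # Assumption: a small ridge keeps Sw invertible when samples are scarce.
    Sw += ridge * np.eye(d)
    evals, evecs = np.linalg.eig(np.linalg.solve(Sw, Sb))
    order = np.argsort(evals.real)[::-1][:n_components]
    return mean, evecs[:, order].real

def add_derivatives(P):
    """Append first and second time derivatives (finite differences) to P."""
    d1 = np.gradient(P, axis=0)
    d2 = np.gradient(d1, axis=0)
    return np.hstack([P, d1, d2])

# Synthetic stand-in for labeled training video: 40 frames of 4x4 pixels, 3 classes.
rng = np.random.default_rng(0)
labels = np.repeat([0, 1, 2, 0], 10)
frames = rng.normal(size=(40, 4, 4)) + labels[:, None, None]

X = stack_frames(frames, context=1)        # data vectors, one per frame
mean, W = fit_lda(X, labels, n_components=2)
feats = add_derivatives((X - mean) @ W)    # projections plus their derivatives
```

With three classes, LDA yields at most two useful directions, so `feats` has six columns per frame here. In a real system these rows would be passed to the hidden Markov model as observations.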
doi:10.1109/mmsp.1998.738938 dblp:conf/mmsp/PotamianosG98 fatcat:6a4ot7ebwzgwbfsslj65snpb2i