Macro-cuboïd based probabilistic matching for lip-reading digits

Samuel Pachoud, Shaogang Gong, Andrea Cavallaro
2008 IEEE Conference on Computer Vision and Pattern Recognition
In this paper, we present a spatio-temporal feature representation and a probabilistic matching function to recognise lip movements from pronounced digits. Our model (1) automatically selects spatio-temporal features extracted from 10 digit model templates and (2) matches them with probe video sequences. Spatio-temporal features embed lip movements from pronouncing digits and contain more discriminative information than spatial features alone. A model template for each digit is represented by a set of spatio-temporal features at multiple scales. A probabilistic sequence matching function automatically segments a probe video sequence and matches the most likely sequence of digits recognised in the probe sequence. We demonstrate the proposed approach using the CUAVE [23] database and compare our representational scheme with three alternative methods, based on optical flow, intensity gradient and block matching, respectively. The evaluation shows that the proposed approach outperforms the others in recognition accuracy and is robust in coping with variations in probe sequences.
doi:10.1109/cvpr.2008.4587734 dblp:conf/cvpr/PachoudGC08 fatcat:5xhbqfk65bhtpnbswxvvs5vygu