Automated extraction of signs from continuous sign language sentences using Iterated Conditional Modes

Sunita Nayak, Sudeep Sarkar, Barbara Loeding
2009 IEEE Conference on Computer Vision and Pattern Recognition
Recognition of signs in sentences requires a training set constructed from signs found in continuous sentences. Currently, this is done manually, which is a tedious process. In this work, we consider a framework where the modeler provides only multiple video sequences of sign language sentences, constructed to contain the vocabulary of interest, and the models of the recurring signs are learned automatically. Specifically, we automatically extract the parts of the signs that are present in most occurrences of the sign in context. These parts of the signs, which are stable with respect to adjacent signs, are referred to as signemes. Each video is first transformed into a multidimensional time series representation capturing the motion and shape aspects of the sign. We then extract signemes from multiple sentences concurrently, using Iterated Conditional Modes (ICM). We show results by learning multiple instances of 10 different signs from a set of 136 sign language sentences. We classify the extracted signemes as correct, partially correct, or incorrect depending on whether both the start and end locations are correct, only one of them is correct, or both are incorrect, respectively. Of the 136 extracted video signemes, 98 were correct, 20 were partially correct, and 18 were incorrect. To demonstrate the generality of the unsupervised modeling idea, we also show the ability to automatically extract common spoken words in audio. We consider the spoken English glosses corresponding to the sign language sentences and extract the audio counterparts of the signs. Of the 136 such instances, we recovered 127 correct, 8 partially correct, and 1 incorrect representation of the words.
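The core idea of the abstract — aligning recurring subsequences across multiple time series by revisiting one sequence at a time — can be illustrated with a minimal ICM-style sketch. The code below is an illustration under simplifying assumptions, not the paper's method: each sentence is a 1-D NumPy array standing in for the multidimensional motion/shape features, the signeme width is fixed in advance (the paper treats start positions and widths probabilistically), and initialization is deterministic rather than the random restarts an ICM search would normally use. The function names `extract_signemes` and `alignment_cost` are hypothetical.

```python
import numpy as np

def alignment_cost(series_list, starts, width):
    """Sum of pairwise squared distances between the selected windows."""
    wins = [s[t:t + width] for s, t in zip(series_list, starts)]
    return sum(float(np.sum((wins[i] - wins[j]) ** 2))
               for i in range(len(wins)) for j in range(i + 1, len(wins)))

def extract_signemes(series_list, width, n_iters=50):
    """ICM-style search for a common fixed-width subsequence in each series.

    Each sweep revisits every series and moves its window to the position
    closest to the mean of the other windows. For squared distance this is
    the exact coordinate-wise minimizer of `alignment_cost` with the other
    windows held fixed -- the update rule of Iterated Conditional Modes --
    so the cost never increases and the search stops at a fixed point.
    """
    starts = [0] * len(series_list)  # deterministic init for the sketch
    for _ in range(n_iters):
        changed = False
        for i, s in enumerate(series_list):
            # Mean of the currently selected windows in all other series.
            others = np.mean([series_list[j][starts[j]:starts[j] + width]
                              for j in range(len(series_list)) if j != i],
                             axis=0)
            # Exhaustively score every candidate start in series i.
            costs = [float(np.sum((s[t:t + width] - others) ** 2))
                     for t in range(len(s) - width + 1)]
            best = int(np.argmin(costs))
            changed |= best != starts[i]
            starts[i] = best
        if not changed:  # no coordinate update improved the cost
            break
    return starts
```

Because each update solves the one-coordinate subproblem exactly, the alignment cost decreases monotonically, which is the property that makes ICM a convenient (if local) optimizer for this kind of multi-sequence alignment.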
doi:10.1109/cvpr.2009.5206599 dblp:conf/cvpr/NayakSL09