Real-Time Dynamic Sign Language Recognition Based on Hierarchical Matching Strategy

Liang Wenle, Huang Yuanyuan, Hu Zuojin
2017 International Journal of Signal Processing, Image Processing and Pattern Recognition  
Dynamic sign language can be described by its trajectory and its key hand actions. However, a large body of statistical data shows that most commonly used signs can be recognized from their trajectory curves alone. Therefore, this paper proposes a hierarchical matching recognition strategy for dynamic sign language. First, the gesture trajectory is obtained with somatosensory equipment such as Kinect. Based on the point density of the trajectory, a key frame detection algorithm is designed and used to extract the key gestures. Then the dynamic time warping (DTW) algorithm is optimized and used for the first-level matching, i.e. trajectory matching. If a recognition result can be obtained at this level, the recognition process finishes; otherwise the process proceeds to the second level, i.e. key frame matching, to obtain the final recognition result. Experiments show that this algorithm not only has good real-time performance but also achieves higher recognition accuracy.

... cameras is necessary, which is very inconvenient. In recent years, with the appearance of depth cameras, research on gesture recognition based on three-dimensional data has developed greatly. Microsoft launched the Kinect somatosensory camera in 2010, and using depth information to recognize sign language has since become a trend. Jang et al. proposed a system that uses Kinect to acquire depth information for gesture recognition. Building on the continuously adaptive mean shift algorithm, they used depth probability and updated the depth histogram to track the hand position [5]. Chai et al. used Kinect to obtain 3D features of hand gestures and performed dynamic sign language recognition by matching the 3D gesture trace, with an average recognition rate of 83.51% [6]. Marin used Kinect to locate the hand area and obtained finer information through Leap Motion; a support vector machine was then used as the classifier, and the recognition rate reached 91.28% [7]. At present, using Kinect depth information to recognize dynamic sign language has become the mainstream approach.

... key action, the feature points are relatively dense. According to this statistical rule, a key frame detection algorithm based on the point density of the gesture trace is proposed. Suppose there is a gesture trace curve P.
If we want to define the point density of point ...

... the depth and skin color information, the hand area can be detected and its features extracted. As shown in Figure 4, each key hand-shape can be regarded as an N-dimensional feature vector ...
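
The two-level strategy described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: it uses the classic, unoptimized DTW recurrence rather than the optimized variant the paper mentions, and the names `dtw_distance`, `key_frame_distance`, `recognize`, the `margin` threshold, and the template layout (a dict with `trajectory` and `key_frames` entries) are all assumptions made for illustration. The rule for deciding that the first level is inconclusive (best DTW distance not clearly smaller than the runner-up) is likewise a guess, since the source does not state the criterion.

```python
import math


def dtw_distance(a, b):
    """Classic DTW cost between two sequences of 3D points (no optimization)."""
    n, m = len(a), len(b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])
            # Extend the cheapest of the three admissible predecessor alignments.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def key_frame_distance(frames_a, frames_b):
    """Hypothetical second-level score: mean Euclidean distance between
    paired key hand-shape feature vectors (extra unpaired frames are ignored)."""
    pairs = list(zip(frames_a, frames_b))
    if not pairs:
        return float("inf")
    return sum(math.dist(x, y) for x, y in pairs) / len(pairs)


def recognize(query, templates, margin=0.8):
    """Return (label, level), where level is 1 or 2.

    Level 1 matches trajectories with DTW. If the best distance is not
    clearly below the runner-up (assumed rule: best < margin * second),
    level 2 compares key frames among the close candidates only.
    """
    scored = sorted(
        (dtw_distance(query["trajectory"], t["trajectory"]), label)
        for label, t in templates.items()
    )
    best_dist, best_label = scored[0]
    if len(scored) == 1 or best_dist < margin * scored[1][0]:
        return best_label, 1
    # Ambiguous first level: keep every template whose distance is near the best.
    candidates = [label for dist, label in scored if best_dist >= margin * dist]
    winner = min(
        candidates,
        key=lambda lb: key_frame_distance(query["key_frames"], templates[lb]["key_frames"]),
    )
    return winner, 2
```

With this layout, a query whose trajectory is close to exactly one template is resolved at level 1, while two templates with nearly identical trajectories fall through to the key frame comparison. Restricting the second level to the close candidates keeps the more expensive hand-shape comparison off the common path, which is in the spirit of the real-time goal stated in the abstract.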
doi:10.14257/ijsip.2017.10.7.03 fatcat:6dobiugug5allk5q5me3xi32lu