
Multi-modal gesture recognition challenge 2013

Sergio Escalera, Jordi Gonzàlez, Xavier Baró, Miguel Reyes, Oscar Lopes, Isabelle Guyon, Vassilis Athitsos, Hugo Escalante
2013 Proceedings of the 15th ACM International Conference on Multimodal Interaction - ICMI '13
We made available a large video database of 13,858 gestures from a lexicon of 20 Italian gesture categories recorded with a Kinect™ camera, providing the audio, skeletal model, user mask, RGB and depth  ...  In order to promote research advances in this field, we organized a challenge on multi-modal gesture recognition.  ...  Data also include distracter gestures to make the recognition task challenging. The modalities provided included audio, RGB, depth maps, user masks, and skeletal model.  ... 
doi:10.1145/2522848.2532595 dblp:conf/icmi/EscaleraGBRLGAE13 fatcat:wqa52yfw5bbnrpfqxeijv2ehly
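
A minimal sketch of what one multi-modal sample from such a dataset could look like in code; the MultiModalSample class, field names, and shapes are illustrative assumptions, not the challenge's actual file format:

```python
from dataclasses import dataclass
import numpy as np

@dataclass
class MultiModalSample:
    """One recorded gesture sequence with the five ChaLearn 2013 modalities.
    Shapes are illustrative; the real challenge data ships as video/audio files."""
    audio: np.ndarray       # (n_audio_samples,) mono waveform
    rgb: np.ndarray         # (T, H, W, 3) colour frames
    depth: np.ndarray       # (T, H, W) depth maps in millimetres
    user_mask: np.ndarray   # (T, H, W) boolean foreground masks
    skeleton: np.ndarray    # (T, n_joints, 3) Kinect joint positions
    label: str              # one of the 20 Italian gesture categories
```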

Multi-modal Gesture Recognition Using Skeletal Joints and Motion Trail Model [chapter]

Bin Liang, Lihong Zheng
2015 Lecture Notes in Computer Science  
This paper proposes a novel approach to multi-modal gesture recognition by using skeletal joints and motion trail model. The approach includes two modules, i.e., spotting and recognition.  ...  The proposed approach is evaluated on the 2014 ChaLearn Multi-modal Gesture Recognition Challenge dataset.  ...  A multi-modal gesture recognition system is developed in [17] for detecting as well as recognizing gestures. The system adopts audio, RGB video, and skeleton joint models.  ... 
doi:10.1007/978-3-319-16178-5_44 fatcat:3byivkbqizhvhb72d5lm6vtzkm
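
The spotting module has to segment candidate gestures before the recognition module classifies them. As a hedged illustration of the general idea (not the paper's algorithm), one can threshold frame-to-frame skeletal motion energy; spot_gestures and its parameters below are hypothetical:

```python
import numpy as np

def spot_gestures(skeleton, energy_thresh=0.02, min_len=10):
    """Return (start, end) frame ranges where the skeleton is moving.

    skeleton: (T, n_joints, 3) array of joint positions.
    A frame is 'active' when the mean joint displacement from the
    previous frame exceeds energy_thresh; runs of active frames
    longer than min_len form one candidate gesture segment.
    """
    motion = np.linalg.norm(np.diff(skeleton, axis=0), axis=2).mean(axis=1)
    active = motion > energy_thresh
    segments, start = [], None
    for t, a in enumerate(active):
        if a and start is None:
            start = t
        elif not a and start is not None:
            if t - start >= min_len:
                segments.append((start, t))
            start = None
    if start is not None and len(active) - start >= min_len:
        segments.append((start, len(active)))
    return segments
```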

3D skeletal movement-enhanced emotion recognition networks

Jiaqi Shi, Chaoran Liu, Carlos Toshinori Ishi, Hiroshi Ishiguro
2021 APSIPA Transactions on Signal and Information Processing  
The combined model utilizes audio signals, text information, and skeletal data.  ...  In this paper, we extract three-dimensional skeleton information from videos and apply the method to the IEMOCAP database to add a new modality.  ...  In this way, we obtained the position data of the joints in the 3D coordinate system from the original videos.  ... 
doi:10.1017/atsip.2021.11 fatcat:bnrelrqbxnhani7sjb44ljpbou
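
Combining audio, text, and skeleton streams is often done by fusing per-modality class posteriors at the decision level. A minimal late-fusion sketch, assuming each sub-network already outputs a probability vector (the uniform weighting is an assumption, not the paper's trained fusion):

```python
import numpy as np

def late_fusion(posteriors, weights=None):
    """Combine per-modality class posteriors into one prediction.

    posteriors: list of (n_classes,) probability vectors, e.g. one
    each from the audio, text, and skeleton sub-networks.
    """
    p = np.stack(posteriors)                    # (n_modalities, n_classes)
    w = np.ones(len(posteriors)) if weights is None else np.asarray(weights)
    fused = (w[:, None] * p).sum(axis=0) / w.sum()
    return int(np.argmax(fused)), fused
```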

Multimodal Dynamic Networks for Gesture Recognition

Di Wu, Ling Shao
2014 Proceedings of the ACM International Conference on Multimedia - MM '14  
In this paper, we propose a novel bi-modal (audio and skeleton joints) dynamic network for gesture recognition.  ...  First, state-of-the-art dynamic Deep Belief Networks are deployed to extract high-level audio and skeletal joint representations.  ...  Conversely, audio and visual data for gesture recognition have correlations at a "mid-level", such as phonemes and joint motions; it can be difficult to relate joint spatio-temporal information to audio waveforms  ... 
doi:10.1145/2647868.2654969 dblp:conf/mm/WuS14 fatcat:6lgacabl25gjnlkcabwpuzoopi
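
The quoted "mid-level" argument motivates fusing learned per-modality representations rather than raw signals. A toy sketch of that fusion point, with random weights standing in for trained DBN encoders and all dimensions chosen for illustration only:

```python
import numpy as np

rng = np.random.default_rng(0)

def hidden(x, W, b):
    """One sigmoid layer standing in for a per-modality DBN encoder."""
    return 1.0 / (1.0 + np.exp(-(x @ W + b)))

# Illustrative dimensions: 39-d MFCC audio frame, 60-d skeleton frame.
W_audio, b_audio = rng.normal(size=(39, 64)), np.zeros(64)
W_skel,  b_skel  = rng.normal(size=(60, 64)), np.zeros(64)

audio_frame = rng.normal(size=39)
skel_frame  = rng.normal(size=60)

# Mid-level fusion: concatenate the two hidden representations, then a
# shared classifier would map the joint feature to gesture classes.
joint = np.concatenate([hidden(audio_frame, W_audio, b_audio),
                        hidden(skel_frame,  W_skel,  b_skel)])
print(joint.shape)  # (128,) fused mid-level feature
```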

Online RGB-D gesture recognition with extreme learning machines

Xi Chen, Markus Koskela
2013 Proceedings of the 15th ACM International Conference on Multimodal Interaction - ICMI '13
In this paper, we propose a method for online gesture recognition using RGB-D data from a Kinect sensor.  ...  The outputs from the classifiers are aggregated to provide the final classification results for the gestures. We test our method on the ChaLearn multi-modal gesture challenge data.  ...  In this paper, we use multi-modal data obtained from a Kinect sensor for online gesture recognition.  ... 
doi:10.1145/2522848.2532591 dblp:conf/icmi/ChenK13 fatcat:oig6e6v7kfbrbigk2fsmqay3v4
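
An extreme learning machine keeps its hidden layer random and solves only the output weights in closed form, which is what makes it cheap enough for online recognition. A generic numpy sketch (layer sizes and the ridge term are assumptions, not the paper's settings):

```python
import numpy as np

class ELM:
    """Single-hidden-layer extreme learning machine."""
    def __init__(self, n_in, n_hidden, n_classes, seed=0):
        rng = np.random.default_rng(seed)
        self.W = rng.normal(size=(n_in, n_hidden))   # random, never trained
        self.b = rng.normal(size=n_hidden)
        self.beta = np.zeros((n_hidden, n_classes))  # solved in fit()

    def _h(self, X):
        return np.tanh(X @ self.W + self.b)

    def fit(self, X, y, ridge=1e-3):
        H = self._h(X)                               # (n, n_hidden)
        T = np.eye(self.beta.shape[1])[y]            # one-hot targets
        # Regularised least squares for the output weights only.
        self.beta = np.linalg.solve(H.T @ H + ridge * np.eye(H.shape[1]),
                                    H.T @ T)
        return self

    def predict(self, X):
        return np.argmax(self._h(X) @ self.beta, axis=1)
```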

ModDrop: adaptive multi-modal gesture recognition [article]

Natalia Neverova and Christian Wolf and Graham W. Taylor and Florian Nebout
2015 arXiv   pre-print
We present a method for gesture detection and localisation based on multi-scale and multi-modal deep learning.  ...  Each visual modality captures spatial information at a particular spatial scale (such as motion of the upper body or a hand), and the whole system operates at three temporal scales.  ...  Acknowledgement This work has been partly financed through the French grant Interabot, a project of type "Investissements d'Avenir / Briques Génériques du Logiciel Embarqué".  ... 
arXiv:1501.00102v2 fatcat:p5a7xrp6zfc4jf2tuihvoq2j74

ModDrop: Adaptive Multi-Modal Gesture Recognition

Natalia Neverova, Christian Wolf, Graham Taylor, Florian Nebout
2016 IEEE Transactions on Pattern Analysis and Machine Intelligence  
We present a method for gesture detection and localisation based on multi-scale and multi-modal deep learning.  ...  Each visual modality captures spatial information at a particular spatial scale (such as motion of the upper body or a hand), and the whole system operates at three temporal scales.  ...  Acknowledgement This work has been partly financed through the French grant Interabot, a project of type "Investissements d'Avenir / Briques Génériques du Logiciel Embarqué".  ... 
doi:10.1109/tpami.2015.2461544 fatcat:d6dp5cdmfbgd3awuwfimtqrqdm
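
ModDrop's central trick is to zero out whole modality channels with some probability during training so the fused network degrades gracefully when a sensor drops out at test time. A minimal sketch of that masking step (the drop probability and dict layout are illustrative):

```python
import numpy as np

def moddrop_mask(features, drop_prob=0.1, rng=None):
    """Zero out entire modalities at training time, ModDrop-style.

    features: dict mapping modality name -> feature array.
    Each modality is independently dropped with drop_prob, which
    forces the fusion layers to tolerate missing channels.
    """
    rng = rng or np.random.default_rng()
    out = {}
    for name, x in features.items():
        keep = rng.random() >= drop_prob
        out[name] = x if keep else np.zeros_like(x)
    return out

# Example: one training step's inputs, with each stream at risk of dropping.
feats = {"rgb": np.ones(128), "depth": np.ones(128), "audio": np.ones(40)}
masked = moddrop_mask(feats, drop_prob=0.1)
```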

A Multi-scale Approach to Gesture Detection and Recognition

Natalia Neverova, Christian Wolf, Giulio Paci, Giacomo Sommavilla, Graham W. Taylor, Florian Nebout
2013 2013 IEEE International Conference on Computer Vision Workshops  
We propose a generalized approach to human gesture recognition based on multiple data modalities such as depth video, articulated pose and speech.  ...  Our experiments on the 2013 Challenge on Multimodal Gesture Recognition dataset have demonstrated that using multiple modalities at several spatial and temporal scales leads to a significant increase in  ...  Conclusion We have described a generalized method for gesture and near-range action recognition from a combination of range video data, audio and articulated pose.  ... 
doi:10.1109/iccvw.2013.69 dblp:conf/iccvw/Neverova0PSTN13 fatcat:lerju4ym75bmjjinrwp74iaedi
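
Operating at several temporal scales amounts to classifying the same feature stream through synchronized windows of different lengths. A small sketch of the window extraction (the scale lengths are illustrative, not the paper's):

```python
import numpy as np

def multiscale_windows(seq, center, scales=(9, 17, 33)):
    """Cut windows of several temporal lengths around one frame.

    seq: (T, d) feature sequence; center: frame index; scales are
    assumed odd. Returns one (scale, d) window per temporal scale,
    zero-padded at the sequence boundaries so every window is full
    length and the scales stay aligned on the same centre frame.
    """
    T, d = seq.shape
    windows = []
    for s in scales:
        half = s // 2
        w = np.zeros((s, d))
        lo, hi = center - half, center + half + 1
        src_lo, src_hi = max(lo, 0), min(hi, T)
        w[src_lo - lo: src_lo - lo + (src_hi - src_lo)] = seq[src_lo:src_hi]
        windows.append(w)
    return windows
```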

Challenges in Multi-modal Gesture Recognition [chapter]

Sergio Escalera, Vassilis Athitsos, Isabelle Guyon
2017 Gesture Recognition  
We published papers using this technology and other, more conventional methods, including regular video cameras, to record data, thus providing a good overview of the uses of machine learning and computer vision  ...  We also overview recent state-of-the-art work on gesture recognition based on a proposed taxonomy for gesture recognition, discussing challenges and future lines of research.  ...  We thank our co-organizers of ChaLearn gesture and action recognition challenges: Miguel Reyes, Jordi Gonzalez, Xavier Baro, Jamie Shotton, Victor Ponce, Miguel Angel Bautista, and Hugo Jair Escalante.  ... 
doi:10.1007/978-3-319-57021-1_1 fatcat:vfeijghqtvffllogw2tium3pwa

Skeleton-Based Emotion Recognition Based on Two-Stream Self-Attention Enhanced Spatial-Temporal Graph Convolutional Network

Jiaqi Shi, Chaoran Liu, Carlos Toshinori Ishi, Hiroshi Ishiguro
2020 Sensors  
Although the gesture modality plays an important role in expressing emotion, it is seldom considered in the field of emotion recognition.  ...  as a static graph, and the self-attention part dynamically constructs more connections between the joints and provides supplementary information.  ...  Skeletal Data Extraction We extract dynamic body skeleton data from the raw video of a large existing open-source emotional database to add a new modality representing gestures.  ... 
doi:10.3390/s21010205 pmid:33396917 pmcid:PMC7795329 fatcat:pqxk3jadnrhhhkpzngfn2cuiva
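
The self-attention stream scores pairwise joint affinities that supplement the fixed skeletal adjacency. A minimal scaled dot-product attention over the joints of one frame; the dimensions and weight matrices are illustrative, not the paper's architecture:

```python
import numpy as np

def joint_self_attention(X, Wq, Wk):
    """Dynamic joint-to-joint connection weights via self-attention.

    X: (n_joints, d) per-joint features for one frame.
    Returns an (n_joints, n_joints) attention matrix that could be
    added to the static skeleton adjacency in a graph convolution.
    """
    Q, K = X @ Wq, X @ Wk
    scores = Q @ K.T / np.sqrt(Q.shape[1])
    e = np.exp(scores - scores.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)        # row-wise softmax

rng = np.random.default_rng(0)
X = rng.normal(size=(25, 64))                      # 25 joints, 64-d features
A_dyn = joint_self_attention(X, rng.normal(size=(64, 16)),
                             rng.normal(size=(64, 16)))
```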

ChaLearn multi-modal gesture recognition 2013

Sergio Escalera, Cristian Sminchisescu, Richard Bowden, Stan Sclaroff, Jordi Gonzàlez, Xavier Baró, Miguel Reyes, Isabelle Guyon, Vassilis Athitsos, Hugo Escalante, Leonid Sigal, Antonis Argyros
2013 Proceedings of the 15th ACM International Conference on Multimodal Interaction - ICMI '13
The MMGR Grand Challenge focused on the recognition of continuous natural gestures from multi-modal data (including RGB, depth, user mask, skeletal model, and audio).  ...  We organized a Grand Challenge and Workshop on Multi-Modal Gesture Recognition.  ...  We thank the Kaggle submission website for its wonderful support, together with the committee members and participants of the ICMI 2013 Multi-modal Gesture Recognition workshop for their support, reviews and  ... 
doi:10.1145/2522848.2532597 dblp:conf/icmi/EscaleraGBRGAESASBS13 fatcat:tx5jk4bdjjaohk3n4brrj62e4y

Multi-scale Deep Learning for Gesture Detection and Localization [chapter]

Natalia Neverova, Christian Wolf, Graham W. Taylor, Florian Nebout
2015 Lecture Notes in Computer Science  
We present a method for gesture detection and localization based on multi-scale and multi-modal deep learning.  ...  Each visual modality captures spatial information at a particular spatial scale (such as motion of the upper body or a hand), and the whole system operates at two temporal scales.  ...  Acknowledgments This work was partially funded by a French grant INTER-ABOT, call "Investissements d'Avenir / Briques Génériques du Logiciel Embarqué", and by a French region "Rhône-Alpes" through the  ... 
doi:10.1007/978-3-319-16178-5_33 fatcat:re3s2b2h25dg3nnvreql2rhl2a

Using Appearance-Based Hand Features for Dynamic RGB-D Gesture Recognition

Xi Chen, Markus Koskela
2014 2014 22nd International Conference on Pattern Recognition  
Gesture recognition using RGB-D sensors currently plays an important role in many fields such as human-computer interfaces, robotics control, and sign language recognition.  ...  In this paper we propose an online gesture recognition method for multimodal RGB-D data.  ...  The winners' recognition system was based on audio and skeletal information. They used MFCC features and Gaussian HMMs for audio, and a Dynamic Time Warping based classifier for the skeletons.  ... 
doi:10.1109/icpr.2014.79 dblp:conf/icpr/ChenK14 fatcat:4nxyie4p3zbndby4pok2dd3ekq
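
Dynamic Time Warping, as used by the challenge winners for the skeleton stream, aligns two sequences of different lengths before comparing them. The textbook O(nm) dynamic programme, shown generically rather than as the winners' exact implementation:

```python
import numpy as np

def dtw_distance(a, b):
    """Dynamic Time Warping distance between two feature sequences.

    a: (n, d), b: (m, d). Euclidean local cost; a smaller value means
    the two timing-warped skeleton trajectories are more similar.
    """
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = np.linalg.norm(a[i - 1] - b[j - 1])
            D[i, j] = cost + min(D[i - 1, j],      # insertion
                                 D[i, j - 1],      # deletion
                                 D[i - 1, j - 1])  # match
    return D[n, m]
```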

Learning Deep and Wide: A Spectral Method for Learning Deep Networks

Ling Shao, Di Wu, Xuelong Li
2014 IEEE Transactions on Neural Networks and Learning Systems  
Only the skeletal modality and the audio modality are considered.  ...  The experimental results on bi-modal time series data, i.e., audio and skeletal joints data, show that the multimodal DBN+HMM framework can learn a good model of the joint space of multiple sensory inputs  ...  An illustration of the RGB, depth (with user segmentation) and skeletal modalities is shown in Fig. 6.  ... 
doi:10.1109/tnnls.2014.2308519 pmid:25420251 fatcat:4mnl6tv2xnf3jpzwhp76cvl4ti
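
In a DBN+HMM hybrid, the network emits per-frame class posteriors and the HMM smooths them into a temporally consistent label path. A generic Viterbi decoder over such log posteriors (the transition model here is the caller's assumption, not learned):

```python
import numpy as np

def viterbi(log_emissions, log_trans, log_prior):
    """Most likely state path given per-frame log posteriors.

    log_emissions: (T, n_states), e.g. log DBN outputs per frame.
    log_trans: (n_states, n_states) log transition matrix.
    log_prior: (n_states,) log initial distribution.
    """
    T, S = log_emissions.shape
    delta = log_prior + log_emissions[0]
    back = np.zeros((T, S), dtype=int)
    for t in range(1, T):
        scores = delta[:, None] + log_trans          # (from, to)
        back[t] = np.argmax(scores, axis=0)
        delta = scores[back[t], np.arange(S)] + log_emissions[t]
    path = [int(np.argmax(delta))]
    for t in range(T - 1, 0, -1):                    # backtrack
        path.append(int(back[t, path[-1]]))
    return path[::-1]
```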

Action Recognition Based On Conceptors Of Skeleton Joint Trajectories

2016 Revista de la Facultad de Ingeniería  
Skeletal data is more exact than RGB video, and it eliminates the occlusions caused by the limbs of the actor.  ...  With the tremendous popularity of the Kinect, recognizing human actions or gestures from skeletal data has become more feasible.  ...  It is a multi-modal dataset recorded with the Kinect camera. This dataset includes RGB video streams, depth images, user masks, skeletal data and audio data.  ... 
doi:10.21311/002.31.4.02 fatcat:oyf6ieus7rbtzjiqf2xad233ze
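
A conceptor summarizes a trajectory through the correlation structure of the states it induces in a fixed random reservoir, via Jaeger's closed form C = R(R + α⁻²I)⁻¹. A minimal numpy sketch (reservoir size and aperture α are illustrative):

```python
import numpy as np

def conceptor(states, aperture=10.0):
    """Conceptor matrix C = R (R + aperture^-2 I)^-1 for one trajectory.

    states: (T, n) reservoir states collected while a joint-trajectory
    sequence was fed into a fixed random recurrent network.
    """
    T, n = states.shape
    R = states.T @ states / T                        # state correlation matrix
    return R @ np.linalg.inv(R + aperture ** -2 * np.eye(n))

# Actions can then be scored by how well a test trajectory's states fit
# each stored class conceptor, e.g. via the evidence z @ C @ z per state z.
```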