Renaud Péteri, Georges Quénot, Philippe Joly (MTAP Guest Editors). Publisher's note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations. ...
doi:10.1007/s11042-020-10502-7 fatcat:vrxeidz7lnf2dm7d6z6pefkawq
Analyzing and recognizing human actions in videos has received considerable attention for many years in the computer vision community. Work on this topic is motivated by several potential applications (video monitoring, automatic video indexing, crowd analysis, human-machine interaction, etc.). The wide variability of human actions makes it difficult to design generic methods. Many proposed approaches are based on discriminative supervised models [3, 4, 5, 6]. Other studies focus on generative probabilistic models, and are based on LDA or semi-supervised LDA. However, generative models fail to match already known actions occurring in videos, and it is moreover difficult to semantically analyze the discovered topics.
doi:10.5244/c.29.118 dblp:conf/bmvc/BeaudryPM15 fatcat:ulkjyhspszaojmfzzatby7smba
Lecture Notes in Computer Science
This paper presents four spatio-temporal wavelet decompositions for characterizing dynamic textures. The main goal of this work is to compare the influence of spatial and temporal variables in the wavelet decomposition scheme. Its novelty is to establish a comparison between the only existing method and three other spatio-temporal decompositions. The four decomposition schemes are presented and successfully applied to a large dynamic texture database. The construction of feature descriptors is tackled, as well as their relevance, and the performance of the methods is discussed. Finally, future prospects are outlined.
doi:10.1007/978-3-642-02172-5_41 fatcat:jzxubw3subho3ki4xklygytsam
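As a rough illustration of how a temporal variable can enter a wavelet decomposition, the following minimal sketch applies one Haar level along the time axis of a small video and summarizes the detail subband by its mean absolute value. The Haar wavelet, the temporal-only scheme, the energy descriptor, and the names `temporal_haar` and `subband_energy` are all assumptions made for illustration. The abstract does not specify which wavelets or descriptors the four schemes use.

```python
from typing import List, Tuple

Video = List[List[List[float]]]  # indexed as video[t][y][x]


def haar_pairs(values: List[float]) -> Tuple[List[float], List[float]]:
    """One Haar level on an even-length sequence: (averages, differences)."""
    approx = [(values[i] + values[i + 1]) / 2.0 for i in range(0, len(values), 2)]
    detail = [(values[i] - values[i + 1]) / 2.0 for i in range(0, len(values), 2)]
    return approx, detail


def temporal_haar(video: Video) -> Tuple[Video, Video]:
    """Apply one Haar level along the time axis only, pixel by pixel.

    Assumes an even number of frames (hypothetical simplification).
    """
    n_t, n_y, n_x = len(video), len(video[0]), len(video[0][0])
    approx = [[[0.0] * n_x for _ in range(n_y)] for _ in range(n_t // 2)]
    detail = [[[0.0] * n_x for _ in range(n_y)] for _ in range(n_t // 2)]
    for y in range(n_y):
        for x in range(n_x):
            a, d = haar_pairs([video[t][y][x] for t in range(n_t)])
            for k in range(n_t // 2):
                approx[k][y][x] = a[k]
                detail[k][y][x] = d[k]
    return approx, detail


def subband_energy(band: Video) -> float:
    """Mean absolute coefficient value, used here as a toy texture feature."""
    values = [abs(v) for frame in band for row in frame for v in row]
    return sum(values) / len(values)
```

In this sketch a static video yields zero temporal detail energy, while a flickering one does not. A spatial step would apply `haar_pairs` along rows and columns of each frame in the same way.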
Lecture Notes in Computer Science
Minh-Phuong Tran, Renaud Péteri, Maitine Bergounioux ...
doi:10.1007/978-3-642-31298-4_17 fatcat:75htk4uyinainm3j4zduxlkgsi
Traitement du signal
This article addresses human action recognition in videos. The method presented is based on estimating the optical flow in each sequence in order to extract critical points that characterize the motion. Multi-scale trajectories of interest are then generated from these points and characterized in the frequency domain. The final video descriptor is obtained by fusing these trajectory features with additional information on motion orientation and contours. Experimental results show that, on several video datasets, the proposed method achieves classification rates among the highest in the literature. Unlike recent strategies that require dense grids of interest points, the method considers only the critical points of the motion, which lowers the computational cost and gives a more qualitative characterization of each sequence. Finally, the perspectives of this work are discussed, in particular the recognition of so-called complex actions.
Keywords: action recognition, critical points, frequency characterization of trajectories.
doi:10.3166/ts.32.265-286 fatcat:g3j7deolx5gwzp5bxrsh4eie4m
This paper presents a new approach for segmenting a video sequence containing dynamic textures. The proposed method is based on a 2D+T curvelet transform and an octree hierarchical representation. The curvelet transform makes it possible to outline spatio-temporal structures of a given scale and orientation. The octree structure, based on motion coherence, enables a better spatio-temporal segmentation than a direct application of the 2D+T curvelet transform. Our segmentation method is successfully applied on video sequences of dynamic textures. Future prospects are finally outlined.
Index Terms: 2D+T discrete curvelet transform, video segmentation, octree structure, dynamic textures.
doi:10.1109/icip.2009.5413352 dblp:conf/icip/DuboisPM09 fatcat:62nypquiajhfpaj4epih3svkcq
The research context of this work is dynamic texture analysis and characterization. Many dynamic textures can be modeled as a large-scale propagating wave and local oscillating phenomena. The Morphological Component Analysis (MCA) algorithm is used to retrieve these components using a well-chosen dictionary. We define a new strategy for adaptive thresholding in the MCA framework, which greatly reduces the computation time when applied to videos. Tests on synthetic and real image sequences illustrate the efficiency of the proposed method, and future prospects are finally outlined.
doi:10.1109/icpr.2010.553 dblp:conf/icpr/DuboisPM10 fatcat:rax3bfqopvhv7evlbwpc6oyjvy
Over the past few years, the action recognition task has drawn considerable interest, leading to intensive research. This is mainly due to the variety of related applications, from autonomous cars to human behavior analysis. Up to now, most research has aimed to identify various sport actions, as in the UCF-101 dataset, but, due to the exponentially growing number of online videos and the need for ever greater accuracy, the need for finer analysis arises. In this working note, results for the MediaEval 2019 Sports Video Annotation "Detection of Strokes in Table Tennis" task are presented. Since displacement flow appears to be one of the most useful cues for stroke identification in sport videos, especially to differentiate quite similar strokes, this proposal relies on a combination of spatial information and identification of the optical flow's singularities. As a result, the regions of video frames most relevant to the classification task are detected.
dblp:conf/mediaeval/CalandrePM19 fatcat:fj7kdvwj45aclkithc2k7ntif4
The paper addresses the problem of recognizing actions in video with low inter-class variability, such as table tennis strokes. Two-stream "twin" convolutional neural networks are used, with 3D convolutions on both RGB data and optical flow. Actions are recognized by classifying temporal windows. We introduce 3D attention modules and examine their impact on classification efficiency. In the context of the study of athletes' performance, a corpus of the particular actions of table tennis strokes is considered. The use of attention blocks in the network speeds up the training step and improves the classification scores by up to 5% with our twin model. We visualize the impact on the obtained features and notice a correlation between attention and player movements and position. A score comparison between a state-of-the-art action classification method and the proposed approach with attention blocks is performed on the corpus. The proposed model with attention blocks outperforms the previous model without them and our baseline.
arXiv:2012.05342v1 fatcat:y4mqrd3q4bgdlnp54hmmztkmum
This paper focuses on human action recognition in video sequences. A method based on optical flow estimation is presented, in which critical points of the flow field are extracted. Multi-scale trajectories are generated from those points and are characterized in the frequency domain. Finally, a sequence is described by fusing this frequency information with motion orientation and shape information. Experiments show that this method has recognition rates among the highest in the state of the art on the KTH dataset. Contrary to recent dense sampling strategies, the proposed method only requires critical points of the motion flow field, thus permitting a lower computation time and a better sequence description. Results and perspectives are then discussed.
Index Terms: action recognition in videos, critical points, frequency analysis of motion trajectories.
doi:10.1109/icip.2014.7025289 dblp:conf/icip/BeaudryPM14 fatcat:2lmenanrvnhprdsfeqzcwnhgi4
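To make the notion of a flow-field critical point more concrete, here is a minimal sketch of the textbook classification of a critical point of a 2D vector field from its local Jacobian, using the trace and determinant. This is an assumption made for illustration, not the authors' detection procedure, which the abstract does not describe. The function name `classify_critical_point` and the tolerance `eps` are hypothetical.

```python
def classify_critical_point(a: float, b: float, c: float, d: float,
                            eps: float = 1e-9) -> str:
    """Classify a critical point from the flow Jacobian [[a, b], [c, d]].

    Uses the standard trace/determinant criteria for a linearized
    2D vector field around a point where the flow vanishes.
    """
    trace = a + d
    det = a * d - b * c
    disc = trace * trace - 4.0 * det  # sign decides real vs complex eigenvalues
    if abs(det) <= eps:
        return "degenerate"
    if det < 0:
        return "saddle"
    if disc >= 0:
        # Real eigenvalues of the same sign: motion converges or diverges.
        return "source node" if trace > 0 else "sink node"
    if abs(trace) <= eps:
        return "center"  # purely rotational motion
    return "source focus" if trace > 0 else "sink focus"
```

For instance, a pure rotation with Jacobian [[0, -1], [1, 0]] is classified as a center, while [[1, 0], [0, -1]] is a saddle.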
The research context of this article is the recognition and description of dynamic textures. In image processing, the wavelet transform has been successfully used for characterizing static textures. To the best of our knowledge, only two works use a spatio-temporal multiscale decomposition based on the tensor product for dynamic texture recognition. One contribution of this article is to analyse and compare the ability of the 2D+T curvelet transform, a geometric multiscale decomposition, for characterizing dynamic textures in image sequences. Two approaches using the 2D+T curvelet transform are presented and compared using three new large databases. A second contribution is the construction of these three publicly available benchmarks of increasing complexity. Existing benchmarks are either too small, not available or not always constructed using a reference database. ... as well as their relevance, and the performance of the different methods is discussed. Finally, future prospects are outlined.
doi:10.1007/s11760-013-0532-4 fatcat:3yqe7xgz7bdjhezpf33ihdp4tm
The research context of this work is dynamic texture analysis and characterization. Many dynamic textures can be modeled as large-scale propagating wavefronts and local oscillating phenomena. After introducing a formal model for dynamic textures, the Morphological Component Analysis (MCA) approach with a well-chosen dictionary is used to retrieve the components of dynamic textures. We define two new strategies for adaptive thresholding in the MCA framework, which greatly reduce the computation time when applied to videos. Tests on real image sequences illustrate the efficiency of the proposed method. An application to global motion estimation is proposed, and future prospects are finally outlined.
doi:10.1109/tcsvt.2011.2159430 fatcat:jdpjpqtmyve5pne3jk6o2wofti
The availability of very high spatial resolution satellite images over urban areas is recent. It potentially represents an important contribution to urban mapping at scales of the order of 1:10,000. The very high spatial resolution of these new sensors enables a true representation of streets on a map, but in return generates a significant increase in artefacts. In this article, a complete method for extracting the urban street network from very high spatial resolution images is first proposed. An innovative protocol for quantitatively assessing the results by comparison with human interpretation is presented. The method is then applied and assessed on an urban scene from the Quickbird satellite. Future developments of the method, which would exploit the potential of these new sensors more fully, are finally discussed.
Keywords: remote sensing, very high spatial resolution, urban mapping, street networks, active contours, multi-scale analysis, quantitative evaluation.
doi:10.3166/rig.14.485-504 fatcat:jvy3z7g7jffrjgjuy4xdom7j74
Human action recognition in video is one of the key problems in visual data interpretation. Despite intensive research, the recognition of actions with low inter-class variability remains a challenge. This paper presents a new Siamese Spatio-Temporal Convolutional neural network (SSTC) for this purpose. When applied to table tennis, it is possible to detect and recognize 20 table tennis strokes. The model has been trained on a specific dataset, TTStroke-21, recorded in natural condition ... ess) at the Faculty of Sports of the University of Bordeaux. Our model takes as inputs an RGB image sequence and its computed optical flow. After 3 spatio-temporal convolutions, data are fused in a fully connected layer of the proposed siamese network architecture. Our method reaches an accuracy of 91.4% against 43.1% for our baseline.
doi:10.1109/cbmi.2018.8516488 dblp:conf/cbmi/MartinBPM18 fatcat:3l7rqo2n7vgrnppdaqkqp4fi74
This work presents a table tennis stroke classification approach using a siamese spatio-temporal convolutional neural network (SSTCNN). The videos are recorded at 120 frames per second with players performing in natural conditions. The frames are extracted, resized and processed to compute the optical flow. From the optical flow, a region of interest (ROI) is inferred. The SSTCNN is then fed with the RGB and optical flow ROI streams to give a probabilistic classification over all the table tennis strokes.
Optical flow estimator. As previously shown, flow estimators can have a strong impact on the classification, so we tested classification using two different flow estimators: DeepFlow and Dense Inverse Search (DIS).
dblp:conf/mediaeval/MartinBMPM19 fatcat:ey4ra6s7ynaijc3jhv6ir4sjv4
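One plausible way to infer a region of interest from optical flow is sketched below: take the bounding box of all pixels whose flow magnitude exceeds a threshold. This is only an assumption for illustration. The abstract does not say how the ROI is actually derived, and the function name `roi_from_flow`, the `(dx, dy)` tuple layout, and the default threshold are hypothetical.

```python
import math
from typing import List, Optional, Tuple

Flow = List[List[Tuple[float, float]]]  # flow[y][x] = (dx, dy) displacement


def roi_from_flow(flow: Flow,
                  threshold: float = 1.0) -> Optional[Tuple[int, int, int, int]]:
    """Return (x_min, y_min, x_max, y_max) around strongly moving pixels.

    A pixel counts as moving when its flow magnitude is strictly above
    `threshold`. Returns None when no pixel moves enough.
    """
    xs: List[int] = []
    ys: List[int] = []
    for y, row in enumerate(flow):
        for x, (dx, dy) in enumerate(row):
            if math.hypot(dx, dy) > threshold:
                xs.append(x)
                ys.append(y)
    if not xs:
        return None
    return min(xs), min(ys), max(xs), max(ys)
```

The same crop coordinates could then be applied to both the RGB frame and the flow field so that the two input streams stay aligned.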