3D attention mechanism for fine-grained classification of table tennis strokes using a Twin Spatio-Temporal Convolutional Neural Networks [article]

Pierre-Etienne Martin, Renaud Péteri, Julien Morlier
2020 arXiv   pre-print
Two stream, "twin" convolutional neural networks are used with 3D convolutions both on RGB data and optical flow. Actions are recognized by classification of temporal windows.  ...  The paper addresses the problem of recognition of actions in video with low inter-class variability such as Table Tennis strokes.  ...  Twin Spatio-Temporal Convolutional Neural Network (TSTCNN): In order to perform action classification in videos, we use a two-stream convolutional neural network (twin) with attention mechanism.  ...
arXiv:2012.05342v1 fatcat:y4mqrd3q4bgdlnp54hmmztkmum

3D attention mechanism for fine-grained classification of table tennis strokes using a Twin Spatio-Temporal Convolutional Neural Networks

Pierre-Etienne Martin, Jenny Benois-Pineau, Renaud Peteri, Julien Morlier
2021 2020 25th International Conference on Pattern Recognition (ICPR)  
Two stream, "twin" convolutional neural networks are used with 3D convolutions both on RGB data and optical flow. Actions are recognized by classification of temporal windows.  ...  The paper addresses the problem of recognition of actions in video with low inter-class variability such as Table Tennis strokes.  ...  Twin Spatio-Temporal Convolutional Neural Network (TSTCNN): In order to perform action classification in videos, we use a two-stream convolutional neural network (twin) with attention mechanism.  ...
doi:10.1109/icpr48806.2021.9412742 fatcat:jwykfzvyyfhj5hy6hwphkkw3mi

Sports Video: Fine-Grained Action Detection and Classification of Table Tennis Strokes from Videos for MediaEval 2021 [article]

Pierre-Etienne Martin
2021 arXiv   pre-print
The Sports Video task is part of the MediaEval 2021 benchmark. This task tackles fine-grained action detection and classification from videos. The focus is on recordings of table tennis games.  ...  Running since 2019, the task has offered a classification challenge from untrimmed video recorded in natural conditions with known temporal boundaries for each stroke.  ...  Fine grained sport action recognition with Twin spatio-temporal convolutional neural networks. Multim.  ...
arXiv:2112.11384v1 fatcat:5qidqoysgvfshoirktbcmlsedu

3D Convolutional Networks for Action Recognition: Application to Sport Gesture Recognition [article]

Pierre-Etienne Martin, J Benois-Pineau, R Péteri, A Zemmari, J Morlier
2022 arXiv   pre-print
3D convolutional networks are a good means to perform tasks such as video segmentation into coherent spatio-temporal chunks and classification of them with regard to a target taxonomy.  ...  In the chapter we are interested in the classification of continuous video takes with repeatable actions, such as strokes of table tennis.  ...  our solution to the problem of fine-grained action recognition in video with a 3D CNN we call TSTCNN, a twin spatio-temporal CNN.  ...
arXiv:2204.08460v1 fatcat:7ks2orf4m5ge5ggmafbeghbrsu

Classification of Strokes in Table Tennis with a Three Stream Spatio-Temporal CNN for MediaEval 2020

Pierre-Etienne Martin, Jenny Benois-Pineau, Boris Mansencal, Renaud Péteri, Julien Morlier
2020 MediaEval Benchmarking Initiative for Multimedia Evaluation  
This work presents a method for classifying table tennis strokes using spatio-temporal convolutional neural networks.  ...  A three stream spatio-temporal convolutional neural network using combination of those modalities and 3D attention mechanisms is presented in order to perform classification.  ...  Model architecture: The model was similar to the Twin Spatio-Temporal Convolutional Neural Network (TSTCNN) with attention mechanisms presented in [15].  ...
dblp:conf/mediaeval/MartinBMPM20 fatcat:kl3tz5ag6ragdkeyysjpwv2qga

HCMUS at MediaEval 2020: Ensembles of Temporal Deep Neural Networks for Table Tennis Strokes Classification Task

Hai Nguyen-Truong, San Cao, N. A. Khoa Nguyen, Bang-Dang Pham, Hieu Dao, Minh-Quan Le, Hoang-Phuc Nguyen-Dinh, Hai-Dang Nguyen, Minh-Triet Tran
2020 MediaEval Benchmarking Initiative for Multimedia Evaluation  
In this task, we, the HCMUS Team, perform multiple experiments, which include a combination of models such as SlowFast, Optical Flow, DensePose, R2+1, Channel-Separated Convolutional Networks, to classify  ...  In total, we submit eight runs corresponding to five different models with different sets of hyperparameters in each of our models.  ...  Twin Spatio-Temporal Convolutional Neural Networks (TSTCNN): In this task, we also use the Twin Spatio-Temporal Convolutional Neural Networks (TSTCNN) [10] and conduct experiments on it with our minor  ...
dblp:conf/mediaeval/Nguyen-TruongCN20 fatcat:w4dyczo7gzfcbip2mqjmvycxxa

An overview of Human Action Recognition in sports based on Computer Vision

Kristina Host, Marina Ivašić-Kos
2022 Heliyon  
Human Action Recognition (HAR) is a challenging task used in sports such as volleyball, basketball, soccer, and tennis to detect players and recognize their actions and teams' activities during training  ...  As an action that can occur in the sports field refers to a set of physical movements performed by a player in order to complete a task using their body or interacting with objects or other persons, actions  ...  In [45], a Twin Spatio-temporal Convolutional Neural Network, which takes as inputs an RGB image sequence and its computed Optical Flow, is proposed.  ...
doi:10.1016/j.heliyon.2022.e09633 pmid:35706961 pmcid:PMC9189896 fatcat:5o4x4ywsanfcfo6wvkoc2g6lym

Sports Video Classification: Classification of Strokes in Table Tennis for MediaEval 2020

Pierre-Etienne Martin, Jenny Benois-Pineau, Boris Mansencal, Renaud Péteri, Laurent Mascarilla, Jordan Calandre, Julien Morlier
2020 MediaEval Benchmarking Initiative for Multimedia Evaluation  
Fine grained action classification has raised new challenges compared to classical action classification problem.  ...  Sport video analysis is a very popular research topic, due to the variety of application areas, ranging from multimedia intelligent devices with user-tailored digests, up to analysis of athletes' performances  ...  Neural Network.  ... 
dblp:conf/mediaeval/MartinBMPMCM20 fatcat:m4cvhjdhvzfr5gbstx4qe6kvai

2020 Index IEEE Transactions on Image Processing Vol. 29

2020 IEEE Transactions on Image Processing  
., +, TIP 2020 3612-3625 Learning Latent Global Network for Skeleton-Based Action Prediction. Learning Rich Part Hierarchies With Progressive Attention Networks for Fine-Grained Image Recognition.  ...  ., +, TIP 2020 15-28 A Context Knowledge Map Guided Coarse-to-Fine Action Recognition. Ji, Y., +, TIP 2020 2742-2752 A Spatio-Temporal Multi-Scale Binary Descriptor.  ... 
doi:10.1109/tip.2020.3046056 fatcat:24m6k2elprf2nfmucbjzhvzk3m

An Efficient Spatio-Temporal Pyramid Transformer for Action Detection [article]

Yuetian Weng, Zizheng Pan, Mingfei Han, Xiaojun Chang, Bohan Zhuang
2022 arXiv   pre-print
To this end, we present an efficient hierarchical Spatio-Temporal Pyramid Transformer (STPT) for action detection, building upon the fact that the early self-attention layers in Transformers still focus  ...  Specifically, we propose to use local window attention to encode rich local spatio-temporal representations in the early stages while applying global attention modules to capture long-term space-time dependencies  ...  To date, the majority of action detection methods [66, 32, 38, 54, 12] are driven by 3D convolutional neural networks (CNNs), e.g., C3D [56], I3D [10], to encode video segment features from video  ...
arXiv:2207.10448v1 fatcat:wfcaqo5idncqloio6nk2a5ooh4

A Fine Grained Research Over Human Action Recognition

2019 VOLUME-8 ISSUE-10, AUGUST 2019, REGULAR ISSUE  
Further, a fine-grained survey is also accomplished under every phase based on the individual strategies  ...  Unlike the earlier ones, this paper provides a detailed survey according to the basic working methodology of Human action recognition system.  ...  Further, a fine-grained analysis is accomplished over the deep learning approaches with respect to the convolution dimension.  ... 
doi:10.35940/ijitee.a4677.119119 fatcat:tacsukuctjehde4vzub5gzvfqu

Learning Neural Textual Representations for Citation Recommendation

Binh Thanh Kieu, Inigo Jauregi Unanue, Son Bao Pham, Hieu Xuan Phan, Massimo Piccardi
2021 2020 25th International Conference on Pattern Recognition (ICPR)  
3D Attention Mechanism for Fine-Grained Classification of Table Tennis Strokes Using a Twin Spatio-Temporal Convolutional Neural Networks DAY 3 -Jan 14, 2021 Martins Camboim de Sá, Jáder; Luis  ...  Modeling in a Spatio-Temporal Graph Convolutional Network for Action Recognition DAY 2 -Jan 13, 2021 Song, Siyang; Sanchez, Enrique; Shen, Linlin; Valstar, Michel 589 Self-Supervised Learning  ... 
doi:10.1109/icpr48806.2021.9412725 fatcat:3vge2tpd2zf7jcv5btcixnaikm

Three-Stream 3D/1D CNN for Fine-Grained Action Classification and Segmentation in Table Tennis [article]

Pierre-Etienne Martin
2021 pre-print
This paper proposes a fusion method of modalities extracted from video through a three-stream network with spatio-temporal and temporal convolutions for fine-grained action classification in sport.  ...  The network consists of three branches with attention blocks. Features are fused at the latest stage of the network using bilinear layers.  ...  The target application of our research is fine-grained action recognition in sports with the aim of improving athletes' performance.  ...
doi:10.1145/3475722.3482793 arXiv:2109.14306v1 fatcat:ttwu552ct5hdzalvnamyl2pt2e

Deep Learning for Visual Tracking: A Comprehensive Survey [article]

Seyed Mojtaba Marvasti-Zadeh, Li Cheng, Hossein Ghanei-Yakhdan, and Shohreh Kasaei
2019 arXiv   pre-print
visual tracking, network objective, network output, and the exploitation of correlation filter advantages.  ...  First, the fundamental characteristics, primary motivations, and contributions of DL-based methods are summarized from six key aspects of: network architecture, network exploitation, network training for  ...  Although convolutional neural networks (CNNs) have been dominant networks initially, the broad range of architectures such as Siamese neural networks (SNNs), recurrent neural networks (RNNs), auto-encoders  ... 
arXiv:1912.00535v1 fatcat:v5ikqi2cpbblhgtkiu6z6l5anq

Enhanced Video Classification System Using a Block-Based Motion Vector

Jayasree K, Sumam Mary Idicula
2020 Information  
When shot level features and keyframe features, along with motion vectors, were used, 86% correct classification was achieved, which was comparable with the existing methods.  ...  Convolutional neural networks (CNNs) are used [5] for large scale video classification.  ...  In [6], recurrent convolutional neural networks (RCNNs) were used for video classification tasks, which are good at learning relations from input sequences.  ...
doi:10.3390/info11110499 fatcat:vlbz6mqu3jej7pnk3nq62u575m