CycDA: Unsupervised Cycle Domain Adaptation from Image to Video [article]

Wei Lin, Anna Kukleva, Kunyang Sun, Horst Possegger, Hilde Kuehne, Horst Bischof
2022 arXiv   pre-print
To address these challenges, we propose Cycle Domain Adaptation (CycDA), a cycle-based approach for unsupervised image-to-video domain adaptation by leveraging the joint spatial information in images and  ...  Although action recognition has achieved impressive results over recent years, both collection and annotation of video training data are still time-consuming and cost intensive.  ...  On case C (blue) which consists of stage 1 (class-agnostic spatial domain alignment) and stage 2 (spatio-temporal learning), image-to-video action recognition has varying performance on different action  ...
arXiv:2203.16244v2 fatcat:qoebr44epzafbfyi3zuv6us4x4

Action recognition using context and appearance distribution features

Xinxiao Wu, Dong Xu, Lixin Duan, Jiebo Luo
2011 CVPR 2011  
We first propose a new spatio-temporal context distribution feature of interest points for human action recognition.  ...  Accordingly, an action video can be represented by two types of distribution features: 1) multiple GMM distributions of spatio-temporal context; 2) GMM distribution of local video appearance.  ...  In contrast, our method AFMKL is specifically proposed for action recognition and all the samples are assumed to be from the same domain.  ... 
doi:10.1109/cvpr.2011.5995624 dblp:conf/cvpr/WuXDL11 fatcat:p5o3sn7jnvatfosg7fscjwejhi

Local Descriptors for Spatio-temporal Recognition [chapter]

Ivan Laptev, Tony Lindeberg
2006 Lecture Notes in Computer Science  
In particular, we compare motion representations in terms of spatio-temporal jets, position dependent histograms, position independent histograms, and principal component analysis computed for either spatio-temporal  ...  , consistent with previously reported findings regarding spatial recognition.  ...  For the principal component analysis of spatio-temporal gradient fields, the affine contrast normalization is performed at the level of scale normalized image volumes.  ... 
doi:10.1007/11676959_8 fatcat:ysfir4jkdrdkfjb3otpkyjse2y

Skeleton-Based Action Recognition with Synchronous Local and Non-local Spatio-temporal Learning and Frequency Attention [article]

Guyue Hu, Bo Cui, Shan Yu
2019 arXiv   pre-print
Moreover, existing methods are limited to the spatio-temporal domain and ignore information in the frequency domain.  ...  Benefiting from its succinctness and robustness, skeleton-based action recognition has recently attracted much attention.  ...  Because there are no previous methods with the capability of mining patterns in the frequency domain for skeleton-based action recognition, we only compare our method to the ones in the spatio-temporal  ...
arXiv:1811.04237v3 fatcat:7zceaqr7e5aj7m6veuxfom4y5e

Unsupervised Domain Adaptation for Video Transformers in Action Recognition [article]

Victor G. Turrisi da Costa, Giacomo Zara, Paolo Rota, Thiago Oliveira-Santos, Nicu Sebe, Vittorio Murino, Elisa Ricci
2022 arXiv   pre-print
On the other hand, the performance of a model in action recognition is heavily affected by domain shift. In this paper, we propose a simple and novel UDA approach for video action recognition.  ...  Our approach leverages recent advances on spatio-temporal transformers to build a robust source model that better generalises to the target domain.  ...  [52] proposed a method for spatio-temporal contrastive domain adaptation. Lastly, Turrisi da Costa et al.  ... 
arXiv:2207.12842v1 fatcat:fcb73cw6ovfcpdbxsqhhhibfvq

MiCT: Mixed 3D/2D Convolutional Tube for Human Action Recognition

Yizhou Zhou, Xiaoyan Sun, Zheng-Jun Zha, Wenjun Zeng
2018 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition  
Human actions in videos are three-dimensional (3D) signals. Recent attempts use 3D convolutional neural networks (CNNs) to explore spatio-temporal information for human action recognition.  ...  Compared with state-of-the-art approaches for action recognition on UCF101 and HMDB51, our MiCT-Net yields the best performance.  ...  We would like to thank all the reviewers for their insightful comments. We also appreciate the efforts of Dr. Steven Lin in the scientific manuscript editing of this paper.  ... 
doi:10.1109/cvpr.2018.00054 dblp:conf/cvpr/ZhouSZZ18 fatcat:bp7ropve3vdwdgql2gt7syklh4

Local velocity-adapted motion events for spatio-temporal recognition

Ivan Laptev, Barbara Caputo, Christian Schüldt, Tony Lindeberg
2007 Computer Vision and Image Understanding  
An experimental evaluation on a large video database with human actions demonstrates the advantage of the proposed scheme for event-based motion representation in combination with SVM classification.  ...  The particular advantage of event-based representations and velocity adaptation is further emphasized when recognizing human actions in unconstrained scenes with complex and non-stationary backgrounds.  ...  Acknowledgements We would like to thank Christian Schüldt for his help with experiments on SVM action recognition as well as for the acquisition of human action database in Figure 11.  ...
doi:10.1016/j.cviu.2006.11.023 fatcat:dca6zqupmjbi3ipde3mr2twere

Unsupervised Domain Adaptation for Spatio-Temporal Action Localization [article]

Nakul Agarwal, Yi-Ting Chen, Behzad Dariush, Ming-Hsuan Yang
2020 arXiv   pre-print
To address this, we focus on the hard and novel task of generalizing training models to test samples without access to any labels from the latter for spatio-temporal action localization by proposing an  ...  In order to minimize the domain shift, three domain adaptation modules at image level (temporal and spatial) and instance level (temporal) are designed and integrated.  ...  Introduction Existing video action understanding datasets are not designed for developing and evaluating domain adaptation algorithms in the context of spatio-temporal action localization.  ... 
arXiv:2010.09211v1 fatcat:6z4et4sc7rbfpa2knbnyrpim24

Dominant spatio-temporal modulations and energy tracking in videos: Application to interest point detection for action recognition

Christos Georgakis, Petros Maragos, Georgios Evangelopoulos, Dimitrios Dimitriadis
2012 2012 19th IEEE International Conference on Image Processing  
In this paper, we propose a generalization of such approaches in the 3D spatio-temporal domain and explore the potential of incorporating the Dominant Component Analysis scheme for interest point detection  ...  and human action recognition in videos.  ...  Notably, dense sampling at a regular spatio-temporal grid was shown to be superior for recognition on two action datasets, Hollywood2 and UCF.  ... 
doi:10.1109/icip.2012.6466966 dblp:conf/icip/GeorgakisMED12 fatcat:paimjdjh35dzdjpaf7xchn5yxm

Study of Human Action Recognition Based on Improved Spatio-temporal Features

Xiao-Fei Ji, Qian-Qian Wu, Zhao-Jie Ju, Yang-Yang Wang
2014 International Journal of Automation and Computing  
Most of the existing action recognition methods mainly utilize spatio-temporal descriptors of single interest point ignoring their potential integral information, such as spatial distribution information  ...  By combining local spatio-temporal feature and global positional distribution information (PDI) of interest points, a novel motion descriptor is proposed in this paper.  ...  Spatio-temporal interest points are those points where the local neighborhood has a significant variation in both the spatial and the temporal domain.  ...
doi:10.1007/s11633-014-0831-4 fatcat:xjqpxu4j3nhj7o7qikbaq7jzme

Channel-Temporal Attention for First-Person Video Domain Adaptation [article]

Xianyuan Liu, Shuo Zhou, Tao Lei, Haiping Lu
2021 arXiv   pre-print
However, UDA for first-person action recognition is an under-explored problem, with lack of datasets and limited consideration of first-person video characteristics.  ...  Firstly, we propose two small-scale first-person video domain adaptation datasets: ADL_small and GTEA-KITCHEN.  ...  Such spatio-temporal information can benefit domain adaptation for action recognition.  ... 
arXiv:2108.07846v2 fatcat:rdbmvtjg3bbdjcwemjvbc5ajmi

ST-Adapter: Parameter-Efficient Image-to-Video Transfer Learning for Action Recognition [article]

Junting Pan, Ziyi Lin, Xiatian Zhu, Jing Shao, Hongsheng Li
2022 arXiv   pre-print
To solve this problem, we propose a new Spatio-Temporal Adapter (ST-Adapter) for parameter-efficient fine-tuning per video task.  ...  Extensive experiments on video action recognition tasks show that our ST-Adapter can match or even outperform the strong full fine-tuning strategy and state-of-the-art video models, whilst enjoying the  ...  Spatio-temporal attention For more dedicated structural modeling in the time dimension with ViTs, a mainstream approach in the video domain is to develop various spatio-temporal attention mechanisms by  ... 
arXiv:2206.13559v2 fatcat:6vl7zv2ezfhzdmfa4yw6x7i6ra

Towards To-a-T Spatio-Temporal Focus for Skeleton-Based Action Recognition [article]

Lipeng Ke, Kuan-Chuan Peng, Siwei Lyu
2022 arXiv   pre-print
To address these problems, we propose the To-a-T Spatio-Temporal Focus (STF), a skeleton-based action recognition framework that utilizes the spatio-temporal gradient to focus on relevant spatio-temporal  ...  Graph Convolutional Networks (GCNs) have been widely used to model the high-order dynamic dependencies for skeleton-based action recognition.  ...  in the spatio-temporal domain.  ... 
arXiv:2202.02314v1 fatcat:4p2fsqt3pbhbnlrzsgjc5wei2q

Self-Supervised Video Pose Representation Learning for Occlusion-Robust Action Recognition

Di Yang, Yaohui Wang, Antitza Dantcheva, Lorenzo Garattoni, Gianpiero Francesca, Francois Bremond
2021 2021 16th IEEE International Conference on Automatic Face and Gesture Recognition (FG 2021)  
This is achieved by minimizing the mutual information of the same pose sequence pruned into different spatio-temporal subgraphs.  ...  Action recognition based on human pose has witnessed increasing attention due to its robustness to changes in appearances, environments, and view-points.  ...  We are grateful to Inria-Sophia Antipolis "NEF" computation cluster for providing resources and support.  ... 
doi:10.1109/fg52635.2021.9667032 fatcat:i5lvlv2vmja4zexthcy27m6rjm

Probabilistic Feature Extraction from Multivariate Time Series Using Spatio-Temporal Constraints [chapter]

Michał Lewandowski, Dimitrios Makris, Jean-Christophe Nebel
2011 Lecture Notes in Computer Science  
In addition we provide quantitative results on a classification application, i.e. view-invariant action recognition, where imposing spatiotemporal constraints is essential.  ...  Performance analysis reveals that our spatio-temporal framework outperforms the state of the art.  ...  The proposed extension is easily adaptable to any variant of GPLVM, for instance BC-GPLVM or GPDM.  ... 
doi:10.1007/978-3-642-20847-8_15 fatcat:ru35l7fm4vd2vc4hialscntkl4