Adversarial Background-Aware Loss for Weakly-supervised Temporal Activity Localization [article]

Kyle Min, Jason J. Corso
2020 arXiv   pre-print
Despite recent advances, existing methods for weakly-supervised temporal activity localization struggle to recognize when an activity is not occurring.  ...  no activity occurs (i.e. background features) from activity-related features for each video.  ...  Acknowledgement We thank Stephan Lemmer, Victoria Florence, Nathan Louis, and Christina Jung for their valuable feedback and comments. This research was, in part, supported by NIST grant 60NANB17D191.  ... 
arXiv:2007.06643v1 fatcat:lzdfu3i4kreghoa2av4x2afbui

ACGNet: Action Complement Graph Network for Weakly-supervised Temporal Action Localization [article]

Zichen Yang, Jie Qin, Di Huang
2021 arXiv   pre-print
Weakly-supervised temporal action localization (WTAL) in untrimmed videos has emerged as a practical but challenging task since only video-level labels are available.  ...  By this means, the segment-level features are more discriminative and robust to spatial-temporal variations, contributing to higher localization accuracies.  ...  Adversarial Background-Aware Loss for Weakly-Supervised Temporal Activity Localization. In  ...  Carreira, J.; and Zisserman, A. 2017. Quo Vadis, Action Recognition?  ... 
arXiv:2112.10977v1 fatcat:r5xj6pnstzelzbhysluzscf5vq

Equivalent Classification Mapping for Weakly Supervised Temporal Action Localization [article]

Tao Zhao, Junwei Han, Le Yang, Dingwen Zhang
2020 arXiv   pre-print
Weakly supervised temporal action localization is a newly emerging yet widely studied topic in recent years.  ...  The existing methods can be categorized into two localization-by-classification pipelines, i.e., the pre-classification pipeline and the post-classification pipeline.  ...  Consequently, weakly supervised temporal action localization can alleviate the burdensome and expensive human annotation.  ... 
arXiv:2008.07728v2 fatcat:ez52jjtpgjf4hoacardqjyd5ym

2021 Index IEEE Transactions on Image Processing Vol. 30

2021 IEEE Transactions on Image Processing  
The Author Index contains the primary entry for each item, listed under the first author's name.  ...  ., +, TIP 2021 5154-5167 Multi-Hierarchical Category Supervision for Weakly-Supervised Temporal Action Localization.  ...  ., +, TIP 2021 5920-5932 Modeling Sub-Actions for Weakly Supervised Temporal Action Localization.  ... 
doi:10.1109/tip.2022.3142569 fatcat:z26yhwuecbgrnb2czhwjlf73qu

Deep Learning-based Action Detection in Untrimmed Videos: A Survey [article]

Elahe Vahdani, Yingli Tian
2021 arXiv   pre-print
The task of temporal activity detection in untrimmed videos aims to localize the temporal boundary of actions and classify the action categories.  ...  This paper provides an extensive overview of deep learning-based algorithms to tackle temporal action detection in untrimmed videos with different supervision levels including fully-supervised, weakly-supervised  ...  They adopt a self-supervised iterative approach for training boundary-aware models from short videos by decomposing a trimmed video into ActionBytes and generate pseudo-labels to train a CNN to localize  ... 
arXiv:2110.00111v1 fatcat:ven4rijqmnbyxflrf6wyxfpex4

2021 Index IEEE Transactions on Multimedia Vol. 23

2021 IEEE transactions on multimedia  
The Author Index contains the primary entry for each item, listed under the first author's name.  ...  ., +, TMM 2021 1316-1329 SAL: Selection and Attention Losses for Weakly Supervised Semantic Segmentation.  ...  ., +, TMM 2021 4426-4440 SAL: Selection and Attention Losses for Weakly Supervised Semantic Segmentation.  ... 
doi:10.1109/tmm.2022.3141947 fatcat:lil2nf3vd5ehbfgtslulu7y3lq

A Survey on Temporal Action Localization

Huifen Xia, Yongzhao Zhan
2020 IEEE Access  
In addition, we summarize temporal action localization from two aspects: fully-supervised learning and weakly-supervised learning.  ...  Temporal action localization is one of the most crucial and challenging problems for video understanding in computer vision.  ...  In addition, some other research on W-TAL were inspired by weakly supervised object detection, such as interactive annotation and generative adversarial training.  ... 
doi:10.1109/access.2020.2986861 fatcat:rsndgkzhi5fm5l6nmfmgfvugby

Self-supervised Learning for Semi-supervised Temporal Language Grounding [article]

Fan Luo, Shaoxiang Chen, Jingjing Chen, Zuxuan Wu, Yu-Gang Jiang
2021 arXiv   pre-print
Previous works either tackle this task in a fully-supervised setting that requires a large amount of temporal annotations or in a weakly-supervised setting that usually cannot achieve satisfactory performance  ...  Given a text description, Temporal Language Grounding (TLG) aims to localize temporal boundaries of the segments that contain the specified semantics in an untrimmed video.  ...  TALL: temporal activity localization via language query.  ... 
arXiv:2109.11475v2 fatcat:2qmfaum4off4dmxzbvgpgj2hty

Weakly-Supervised Action Localization by Generative Attention Modeling [article]

Baifeng Shi, Qi Dai, Yadong Mu, Jingdong Wang
2020 arXiv   pre-print
Weakly-supervised temporal action localization is a problem of learning an action localization model with only video-level action labeling available.  ...  The general framework largely relies on the classification activation, which employs an attention model to identify the action-related frames and then categorizes them into different classes.  ...  The temporal class activation maps (TCAM) [30, 60] are utilized to produce the top-down, class-aware attention maps.  ... 
arXiv:2003.12424v2 fatcat:swoumfyvwngefoujfucgzmyiqa

Abnormal event detection by a weakly supervised temporal attention network

Xiangtao Zheng, Yichao Zhang, Yunpeng Zheng, Fulin Luo, Xiaoqiang Lu
2021 CAAI Transactions on Intelligence Technology  
In this study, a Temporal Attention Network (TANet) is proposed to capture both the specific categories and temporal locations of abnormal events in a weakly supervised manner.  ...  An event recognition module is exploited to predict the event scores for each video segment while a temporal attention module is proposed to learn a temporal attention value.  ...  Islam and Radke [42] proposed a balanced binary cross-entropy loss and a metric loss to learn discriminative action representations for weakly supervised temporal action localization.  ... 
doi:10.1049/cit2.12068 fatcat:tv72n5lzyzezdi3emajgwiv7ja

TinyVIRAT: Low-resolution Video Action Recognition [article]

Ugur Demir, Yogesh S Rawat, Mubarak Shah
2020 arXiv   pre-print
The proposed method also consists of a weakly trained attention mechanism which helps in focusing on the activity regions in the video.  ...  We propose a novel method for recognizing tiny actions in videos which utilizes a progressive generative approach to improve the quality of low-resolution actions.  ...  The attention map is trained in a weakly supervised setting which is guided by the action label of the video without requiring localization bounding boxes.  ... 
arXiv:2007.07355v1 fatcat:qhfzx7h5jnfjbl7rdtfodzf25a

A Decade Survey of Content Based Image Retrieval using Deep Learning [article]

Shiv Ram Dubey
2020 arXiv   pre-print
The taxonomy used in this survey covers different supervision, different networks, different descriptor type and different retrieval type.  ...  The insights are also presented for the benefit of the researchers to observe the progress and to make the best choices.  ...  The adversarial neural network is also employed for cross-modal retrieval such as adversarial cross-modal retrieval (ACMR) [169], self-supervised adversarial hashing (SSAH) [126], attention-aware deep  ... 
arXiv:2012.00641v1 fatcat:2zcho2szpzcc3cs6uou3jpcley

2020 Index IEEE Transactions on Image Processing Vol. 29

2020 IEEE Transactions on Image Processing  
Wang, G., +, TIP 2020 1802-1814 Category-Aware Spatial Constraint for Weakly Supervised Detection. Dynamic Scene Deblurring by Depth Guided Model.  ...  Sinno, Z., +, TIP 2020 2536-2551 Greedy algorithms High-Quality Proposals for Weakly Supervised Object Detection.  ... 
doi:10.1109/tip.2020.3046056 fatcat:24m6k2elprf2nfmucbjzhvzk3m

Kinematic-Structure-Preserved Representation for Unsupervised 3D Human Pose Estimation [article]

Jogendra Nath Kundu, Siddharth Seth, Rahul M V, Mugalodi Rakesh, R. Venkatesh Babu, Anirban Chakraborty
2020 arXiv   pre-print
Furthermore, devoid of unstable adversarial setup, we re-utilize the decoder to formalize an energy-based loss, which enables us to learn from in-the-wild videos, beyond laboratory settings.  ...  Though weakly-supervised models have been proposed to address this shortcoming, performance of such models relies on availability of paired supervision on some related tasks, such as 2D pose or multi-view  ...  Here, L_prior denotes the supervised loss directly on p_3D and p_2D for the synthetically rendered images on randomly selected backgrounds, as discussed before.  ... 
arXiv:2006.14107v1 fatcat:r3c3m6ugx5bxhfrepsake366t4

Self-Supervised 3D Human Pose Estimation via Part Guided Novel Image Synthesis [article]

Jogendra Nath Kundu, Siddharth Seth, Varun Jampani, Mugalodi Rakesh, R. Venkatesh Babu, Anirban Chakraborty
2020 arXiv   pre-print
As a result, the learned model not only inculcates task-bias but also dataset-bias because of its strong reliance on the annotated samples, which also holds true for weakly-supervised models.  ...  Furthermore, we demonstrate state-of-the-art weakly-supervised 3D pose estimation performance on both Human3.6M and MPI-INF-3DHP datasets.  ...  Compared to recent self-supervised approaches that either rely on videos with static background [45] or work with the assumption that temporally close frames have similar background [22], our framework  ... 
arXiv:2004.04400v1 fatcat:gxjrtfgkpff2peilkm3lhvx2oq