Adversarial Background-Aware Loss for Weakly-supervised Temporal Activity Localization
[article]
2020
arXiv
pre-print
Despite recent advances, existing methods for weakly-supervised temporal activity localization struggle to recognize when an activity is not occurring. ...
no activity occurs (i.e. background features) from activity-related features for each video. ...
Acknowledgement We thank Stephan Lemmer, Victoria Florence, Nathan Louis, and Christina Jung for their valuable feedback and comments. This research was, in part, supported by NIST grant 60NANB17D191. ...
arXiv:2007.06643v1
fatcat:lzdfu3i4kreghoa2av4x2afbui
ACGNet: Action Complement Graph Network for Weakly-supervised Temporal Action Localization
[article]
2021
arXiv
pre-print
Weakly-supervised temporal action localization (WTAL) in untrimmed videos has emerged as a practical but challenging task since only video-level labels are available. ...
By this means, the segment-level features are more discriminative and robust to spatial-temporal variations, contributing to higher localization accuracies. ...
Adversarial Background-Aware Loss for Weakly-Supervised Temporal Activity Localization. In ...
Carreira, J.; and Zisserman, A. 2017. Quo Vadis, Action Recognition? ...
arXiv:2112.10977v1
fatcat:r5xj6pnstzelzbhysluzscf5vq
Equivalent Classification Mapping for Weakly Supervised Temporal Action Localization
[article]
2020
arXiv
pre-print
Weakly supervised temporal action localization is a newly emerging yet widely studied topic in recent years. ...
The existing methods can be categorized into two localization-by-classification pipelines, i.e., the pre-classification pipeline and the post-classification pipeline. ...
Consequently, weakly supervised temporal action localization can alleviate the burdensome and expensive human annotation. ...
arXiv:2008.07728v2
fatcat:ez52jjtpgjf4hoacardqjyd5ym
2021 Index IEEE Transactions on Image Processing Vol. 30
2021
IEEE Transactions on Image Processing
The Author Index contains the primary entry for each item, listed under the first author's name. ...
., +, TIP 2021 5154-5167 Multi-Hierarchical Category Supervision for Weakly-Supervised Temporal Action Localization. ...
., +, TIP 2021 5920-5932 Modeling Sub-Actions for Weakly Supervised Temporal Action Localization. ...
doi:10.1109/tip.2022.3142569
fatcat:z26yhwuecbgrnb2czhwjlf73qu
Deep Learning-based Action Detection in Untrimmed Videos: A Survey
[article]
2021
arXiv
pre-print
The task of temporal activity detection in untrimmed videos aims to localize the temporal boundary of actions and classify the action categories. ...
This paper provides an extensive overview of deep learning-based algorithms to tackle temporal action detection in untrimmed videos with different supervision levels including fully-supervised, weakly-supervised ...
They adopt a self-supervised iterative approach for training boundary-aware models from short videos by decomposing a trimmed video into ActionBytes and generate pseudo-labels to train a CNN to localize ...
arXiv:2110.00111v1
fatcat:ven4rijqmnbyxflrf6wyxfpex4
2021 Index IEEE Transactions on Multimedia Vol. 23
2021
IEEE Transactions on Multimedia
The Author Index contains the primary entry for each item, listed under the first author's name. ...
., +, TMM 2021 1316-1329 SAL: Selection and Attention Losses for Weakly Supervised Semantic Segmentation. ...
., +, TMM 2021 4426-4440 SAL: Selection and Attention Losses for Weakly Supervised Semantic Segmentation. ...
doi:10.1109/tmm.2022.3141947
fatcat:lil2nf3vd5ehbfgtslulu7y3lq
A Survey on Temporal Action Localization
2020
IEEE Access
In addition, we summarize temporal action localization from two aspects: fully-supervised learning and weakly-supervised learning. ...
Temporal action localization is one of the most crucial and challenging problems for video understanding in computer vision. ...
In addition, some other research on W-TAL was inspired by weakly supervised object detection, such as interactive annotation and generative adversarial training. ...
doi:10.1109/access.2020.2986861
fatcat:rsndgkzhi5fm5l6nmfmgfvugby
Self-supervised Learning for Semi-supervised Temporal Language Grounding
[article]
2021
arXiv
pre-print
Previous works either tackle this task in a fully-supervised setting that requires a large amount of temporal annotations or in a weakly-supervised setting that usually cannot achieve satisfactory performance ...
Given a text description, Temporal Language Grounding (TLG) aims to localize temporal boundaries of the segments that contain the specified semantics in an untrimmed video. ...
Shuf-
TALL: temporal activity localization via language query. ...
arXiv:2109.11475v2
fatcat:2qmfaum4off4dmxzbvgpgj2hty
Weakly-Supervised Action Localization by Generative Attention Modeling
[article]
2020
arXiv
pre-print
Weakly-supervised temporal action localization is a problem of learning an action localization model with only video-level action labeling available. ...
The general framework largely relies on the classification activation, which employs an attention model to identify the action-related frames and then categorizes them into different classes. ...
The temporal class activation maps (TCAM) [30, 60] are utilized to produce the top-down, class-aware attention maps. ...
arXiv:2003.12424v2
fatcat:swoumfyvwngefoujfucgzmyiqa
Abnormal event detection by a weakly supervised temporal attention network
2021
CAAI Transactions on Intelligence Technology
In this study, a Temporal Attention Network (TANet) is proposed to capture both the specific categories and temporal locations of abnormal events in a weakly supervised manner. ...
An event recognition module is exploited to predict the event scores for each video segment while a temporal attention module is proposed to learn a temporal attention value. ...
Islam and Radke [42] proposed a balanced binary cross-entropy loss and a metric loss to learn discriminative action representations for weakly supervised temporal action localization. ...
doi:10.1049/cit2.12068
fatcat:tv72n5lzyzezdi3emajgwiv7ja
TinyVIRAT: Low-resolution Video Action Recognition
[article]
2020
arXiv
pre-print
The proposed method also consists of a weakly trained attention mechanism which helps in focusing on the activity regions in the video. ...
We propose a novel method for recognizing tiny actions in videos which utilizes a progressive generative approach to improve the quality of low-resolution actions. ...
The attention map is trained in a weakly supervised setting which is guided by the action label of the video without requiring localization bounding boxes. ...
arXiv:2007.07355v1
fatcat:qhfzx7h5jnfjbl7rdtfodzf25a
A Decade Survey of Content Based Image Retrieval using Deep Learning
[article]
2020
arXiv
pre-print
The taxonomy used in this survey covers different supervision, different networks, different descriptor type and different retrieval type. ...
The insights are also presented for the benefit of the researchers to observe the progress and to make the best choices. ...
The adversarial neural network is also employed for cross-modal retrieval such as adversarial cross-modal retrieval (ACMR) [169], self-supervised adversarial hashing (SSAH) [126], attention-aware deep ...
arXiv:2012.00641v1
fatcat:2zcho2szpzcc3cs6uou3jpcley
2020 Index IEEE Transactions on Image Processing Vol. 29
2020
IEEE Transactions on Image Processing
Wang, G., +, TIP 2020 1802-1814 Category-Aware Spatial Constraint for Weakly Supervised Detection. Dynamic Scene Deblurring by Depth Guided Model. ...
Sinno, Z., +, TIP 2020 2536-2551
Greedy algorithms
High-Quality Proposals for Weakly Supervised Object Detection. ...
doi:10.1109/tip.2020.3046056
fatcat:24m6k2elprf2nfmucbjzhvzk3m
Kinematic-Structure-Preserved Representation for Unsupervised 3D Human Pose Estimation
[article]
2020
arXiv
pre-print
Furthermore, devoid of unstable adversarial setup, we re-utilize the decoder to formalize an energy-based loss, which enables us to learn from in-the-wild videos, beyond laboratory settings. ...
Though weakly-supervised models have been proposed to address this shortcoming, performance of such models relies on availability of paired supervision on some related tasks, such as 2D pose or multi-view ...
Here, $L_{prior}$ denotes the supervised loss directly on $p_{3D}$ and $p_{2D}$ for the synthetically rendered images on randomly selected backgrounds, as discussed before. ...
arXiv:2006.14107v1
fatcat:r3c3m6ugx5bxhfrepsake366t4
Self-Supervised 3D Human Pose Estimation via Part Guided Novel Image Synthesis
[article]
2020
arXiv
pre-print
As a result, the learned model not only inculcates task-bias but also dataset-bias because of its strong reliance on the annotated samples, which also holds true for weakly-supervised models. ...
Furthermore, we demonstrate state-of-the-art weakly-supervised 3D pose estimation performance on both Human3.6M and MPI-INF-3DHP datasets. ...
Compared to recent self-supervised approaches that either rely on videos with static background [45] or work with the assumption that temporally close frames have similar background [22], our framework ...
arXiv:2004.04400v1
fatcat:gxjrtfgkpff2peilkm3lhvx2oq