Weakly Supervised Temporal Action Localization Using Deep Metric Learning
[article]
2020
arXiv
pre-print
To this end, we propose a weakly supervised temporal action localization method that only requires video-level action instances as supervision during training. ...
We propose a classification module to generate action labels for each segment in the video, and a deep metric learning module to learn the similarity between different action instances. ...
Most of these techniques use temporal annotations during training, while we aim to use only video-level labels for action localization. Deep Metric Learning. ...
arXiv:2001.07793v1
fatcat:qzhzxvtzkva5ziqxyfzgvg36pi
A Survey on Temporal Action Localization
2020
IEEE Access
In addition, we summarize temporal action localization from two aspects: fully-supervised learning and weakly-supervised learning. ...
Temporal action localization has made significant progress recently, especially with the development of deep learning, and demand for temporal action localization in untrimmed videos is growing. ...
Then we review the recent developments of temporal action localization from fully-supervised learning to weakly-supervised learning methods. ...
doi:10.1109/access.2020.2986861
fatcat:rsndgkzhi5fm5l6nmfmgfvugby
Audio-Visual Event Localization in Unconstrained Videos
[chapter]
2018
Lecture Notes in Computer Science
We collect an Audio-Visual Event (AVE) dataset to systemically investigate three temporal localization tasks: supervised and weakly-supervised audio-visual event localization, and cross-modality localization ...
Our experiments support the following findings: joint modeling of auditory and visual modalities outperforms independent modeling, the learned attention can capture semantics of sounding objects, temporal ...
., Tencent and the support of NVIDIA Corporation with the donation of the GPUs used for this research. ...
doi:10.1007/978-3-030-01216-8_16
fatcat:t4dbgoypsnaixmrtrrtrwcv2sy
Abnormal event detection by a weakly supervised temporal attention network
2021
CAAI Transactions on Intelligence Technology
In this study, a Temporal Attention Network (TANet) is proposed to capture both the specific categories and temporal locations of abnormal events in a weakly supervised manner. ...
An event recognition module is exploited to predict the event scores for each video segment while a temporal attention module is proposed to learn a temporal attention value. ...
Islam and Radke [42] proposed a balanced binary cross-entropy loss and a metric loss to learn discriminative action representations for weakly supervised temporal action localization. ...
doi:10.1049/cit2.12068
fatcat:tv72n5lzyzezdi3emajgwiv7ja
Weak Supervision and Referring Attention for Temporal-Textual Association Learning
[article]
2020
arXiv
pre-print
Therefore we provide a Weak-Supervised alternative with our proposed Referring Attention mechanism to learn temporal-textual association (dubbed WSRA). ...
We validate our WSRA through extensive experiments for temporally grounding by languages, demonstrating that it outperforms the state-of-the-art weakly-supervised methods notably. ...
Weakly-Supervised Action Localization can be thought of as a specific example of weakly supervised video learning with video-level labels [52, 43, 32, 36]. ...
arXiv:2006.11747v2
fatcat:bpqa6chthfgjhatmsgqq5t2dym
Deep Learning-based Action Detection in Untrimmed Videos: A Survey
[article]
2021
arXiv
pre-print
This paper provides an extensive overview of deep learning-based algorithms to tackle temporal action detection in untrimmed videos with different supervision levels including fully-supervised, weakly-supervised ...
Moreover, the commonly used action detection benchmark datasets and evaluation metrics are described, and the performance of the state-of-the-art methods is compared. ...
Weakly-supervised Action Detection. A weakly-supervised learning scheme requires coarse-grained or noisy labels during the training phase. ...
arXiv:2110.00111v1
fatcat:ven4rijqmnbyxflrf6wyxfpex4
Weakly Supervised Object Localization and Detection: A Survey
[article]
2021
arXiv
pre-print
and standard evaluation metrics that are widely used in this field. ...
As an emerging and challenging problem in the computer vision community, weakly supervised object localization and detection plays an important role for developing new generation computer vision systems ...
On the other hand, [107], [132], [136], [166] develop methods to localize temporal actions in the given untrimmed videos, where the main goal is to predict the temporal boundary of each action ...
arXiv:2104.07918v1
fatcat:dwl6sjfzibdilnvjnrbifp4uke
Towards Train-Test Consistency for Semi-supervised Temporal Action Localization
[article]
2020
arXiv
pre-print
Recently, Weakly-supervised Temporal Action Localization (WTAL) has been densely studied but there is still a large gap between weakly-supervised models and fully-supervised models. ...
The inconsistent strategy makes it hard to explicitly supervise the action localization model with temporal boundary annotations at training time. ...
Weakly-supervised Temporal Action Localization. Obtaining the temporal annotations for full supervision is still the bottleneck if we go to a larger scale. ...
arXiv:1910.11285v3
fatcat:2acircsxhjferlmolj5eu3i7mq
Weakly-Supervised Action Segmentation with Iterative Soft Boundary Assignment
[article]
2018
arXiv
pre-print
In this work, we address the task of weakly-supervised human action segmentation in long, untrimmed videos. ...
, and a novel training strategy for weakly-supervised sequence modeling, named Iterative Soft Boundary Assignment (ISBA), to align action sequences and update the network in an iterative fashion. ...
This idea is inspired from taking random steps used in deep reinforcement learning. ...
arXiv:1803.10699v1
fatcat:pwrl3cfahvgstpbcfzfj7pynxu
AdapNet: Adaptability Decomposing Encoder-Decoder Network for Weakly Supervised Action Recognition and Localization
[article]
2019
arXiv
pre-print
As a challenging problem for high-level video understanding, weakly supervised action recognition and localization in untrimmed videos has attracted intensive research attention. ...
An encoder-decoder based structure is carefully designed and jointly optimized to facilitate effective action classification and temporal localization. ...
In addition, RNN is also widely used for temporal action localization. Compared with fully supervised methods, weakly supervised action localization is less studied. ...
arXiv:1911.11961v1
fatcat:qxptpsr5djcd5nnpagxnqvnrpu
Weakly-Supervised Action Localization with Expectation-Maximization Multi-Instance Learning
[article]
2020
arXiv
pre-print
Weakly-supervised action localization requires training a model to localize the action segments in a video given only video-level action labels. ...
It can be solved under the Multiple Instance Learning (MIL) framework, where a bag (video) contains multiple instances (action segments). ...
Related Work. Weakly-Supervised Action Localization. Weakly supervised action localization learns to localize activities inside videos when only action class labels are available. ...
arXiv:2004.00163v2
fatcat:gaendu464zaldljoiyhihrizvq
Audio-Visual Event Localization in Unconstrained Videos
[article]
2018
arXiv
pre-print
We collect an Audio-Visual Event (AVE) dataset to systemically investigate three temporal localization tasks: supervised and weakly-supervised audio-visual event localization, and cross-modality localization ...
Our experiments support the following findings: joint modeling of auditory and visual modalities outperforms independent modeling, the learned attention can capture semantics of sounding objects, temporal ...
For supervised and weakly-supervised event localization, we use overall accuracy as an evaluation metric. ...
arXiv:1803.08842v1
fatcat:diaiqzbuivh5bpox35ccpdb3ii
Action Localization through Continual Predictive Learning
[article]
2020
arXiv
pre-print
This self-supervised framework is not as complicated as other approaches but is very effective in learning robust visual representations for both labeling and localization. ...
In this paper, we present a new approach based on continual learning that uses feature-level predictions for self-supervision. ...
Unsupervised action localization approaches have not been explored to the same extent as supervised and weakly-supervised approaches. ...
arXiv:2003.12185v1
fatcat:sb5t6r2tnberdd6crdl3pejoke
Weakly Supervised Temporal Adjacent Network for Language Grounding
[article]
2021
arXiv
pre-print
To this end, we introduce a novel weakly supervised temporal adjacent network (WSTAN) for temporal language grounding. ...
In this work, we are dedicated to weakly supervised TLG, where multiple description sentences are given to an untrimmed video without temporal boundary labels. ...
Therefore, the idea is applicable to weakly supervised spatial/temporal grounding problems, e.g. spatial visual grounding, temporal action localization, and semantic segmentation. ...
arXiv:2106.16136v1
fatcat:i7nwetztpraf5bjwxfrn2gpa2a
RefineLoc: Iterative Refinement for Weakly-Supervised Action Localization
[article]
2020
arXiv
pre-print
In this paper, we propose RefineLoc, a novel weakly-supervised temporal action localization method. ...
RefineLoc shows competitive results with the state-of-the-art in weakly-supervised temporal localization. ...
Weakly-supervised Temporal Action Localization. ...
arXiv:1904.00227v3
fatcat:cdiabbsi25ffdgouvqmgqssxqm