PathTrack: Fast Trajectory Annotation with Path Supervision
[article]
2017
arXiv
pre-print
Tracking approaches can benefit training on such large-scale datasets, as did object recognition. ...
In our novel path supervision the annotator loosely follows the object with the cursor while watching the video, providing a path annotation for each object in the sequence. ...
Overall, our framework is ideal for fast video annotation, which is desirable for generating large training sets, as we demonstrate in the next section. ...
arXiv:1703.02437v2
fatcat:zhafiggz4fcy5anngsabllrqvy
PathTrack: Fast Trajectory Annotation with Path Supervision
2017
2017 IEEE International Conference on Computer Vision (ICCV)
Tracking approaches can benefit training on such large-scale datasets, as did object recognition. ...
In our novel path supervision the annotator loosely follows the object with the cursor while watching the video, providing a path annotation for each object in the sequence. ...
Overall, our framework is ideal for fast video annotation, which is desirable for generating large training sets, as we demonstrate in the next section. ...
doi:10.1109/iccv.2017.40
dblp:conf/iccv/ManenGDG17
fatcat:mcdzgvgurrgsjmjxmvbmujxmbq
What Do I Annotate Next? An Empirical Study of Active Learning for Action Localization
[chapter]
2018
Lecture Notes in Computer Science
As a result, we collect Kinetics-Localization, a novel large-scale dataset for temporal action localization, which contains more than 15K YouTube videos. ...
Despite tremendous progress achieved in temporal action localization, state-of-the-art methods still struggle to train accurate models when annotated data is scarce. ...
However, despite those great achievements, a crucial limitation persists, namely the dependence of these models on large-scale annotated data for training. ...
doi:10.1007/978-3-030-01252-6_13
fatcat:f4sue6nhkzehnculgoypv3f7k4
Weakly Supervised Semantic Segmentation using Web-Crawled Videos
[article]
2018
arXiv
pre-print
videos to simulate strong supervision for semantic segmentation. ...
Although the entire procedure does not require any additional supervision, the segmentation annotations obtained from videos are sufficiently strong to learn a model for semantic segmentation. ...
be avoided to construct a large-scale video data. ...
arXiv:1701.00352v3
fatcat:v2l7aspoo5eplou2maacgmpysa
Weakly Supervised Semantic Segmentation Using Web-Crawled Videos
2017
2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
videos to simulate strong supervision for semantic segmentation. ...
Although the entire procedure does not require any additional supervision, the segmentation annotations obtained from videos are sufficiently strong to learn a model for semantic segmentation. ...
be avoided to construct a large-scale video data. ...
doi:10.1109/cvpr.2017.239
dblp:conf/cvpr/HongYKLH17
fatcat:723ug43aknhw7gpm36spes4ose
Data-Driven Crowd Understanding: A Baseline for a Large-Scale Crowd Dataset
2016
IEEE transactions on multimedia
In this paper, we contribute a large-scale benchmark dataset collected from the Shanghai 2010 World Expo. ...
It includes 2630 annotated video sequences captured by 245 surveillance cameras, far larger than any public dataset. ...
Our large-scale annotated training set makes it possible for us to develop a data-driven approach as a baseline for our crowd understanding dataset.
doi:10.1109/tmm.2016.2542585
fatcat:gtdbjyj7z5elleblqsr6slm7z4
A Survey on Machine Learning Techniques for Auto Labeling of Video, Audio, and Text Data
[article]
2021
arXiv
pre-print
In this survey paper, we provide a review of previous techniques that focuses on optimized data annotation and labeling for video, audio, and text data. ...
However, large amounts of annotated data are still demanded to build robust models and improve the prediction accuracy of the model. ...
Using this developed video annotation tool, users can generate a large amount of ground truth data for videos. [38] proposes a tool for tagging and annotation. ...
arXiv:2109.03784v1
fatcat:uu55zfmtajcvdjekxeaue76izy
Deep learning vs. kernel methods: Performance for emotion prediction in videos
2015
2015 International Conference on Affective Computing and Intelligent Interaction (ACII)
of affective movie content analysis frameworks as long as very large datasets annotated along affective dimensions are not available. ...
) is introduced, for which (ii) the performance of the Convolutional Neural Networks (CNN) through supervised finetuning, the Support Vector Machines for Regression (SVR) and the combination of both (Transfer ...
We further would like to thank Xingxian Li for his help on the modification of the GTrace program. ...
doi:10.1109/acii.2015.7344554
dblp:conf/acii/BaveyeDCC15
fatcat:codzdwtynbeahefggftgfrppv4
Unsupervised Adversarial Visual Level Domain Adaptation for Learning Video Object Detectors from Images
[article]
2018
arXiv
pre-print
Having a robust video object detector is an essential component for video understanding and curating large-scale automated annotations in videos. ...
Though image object detectors have shown rapid progress in recent years with the release of multiple large-scale static image datasets, object detection on videos still remains an open problem due to scarcity ...
An immediate direction of effort can be to annotate video datasets. However, annotating large-scale video datasets demands humongous manual labor, time and cost. ...
arXiv:1810.02074v1
fatcat:4qqfyhuuj5e5da3zfacbl4roau
Hierarchical Deep Co-segmentation of Primary Objects in Aerial Videos
[article]
2018
arXiv
pre-print
In this paper, we propose a large-scale dataset with 500 aerial videos and manually annotated primary objects. ...
From this dataset, we find most aerial videos contain large-scale scenes, small primary objects as well as consistently varying scales and viewpoints. ...
The main reasons are twofold: 1) the heuristic rules and learning frameworks may not perfectly fit the characteristics of aerial videos, and 2) there is a lack of large-scale aerial video datasets for ...
arXiv:1806.10274v2
fatcat:tqymoteu5zfv3o6ghin2dumoii
Learning to Segment Human by Watching YouTube
[article]
2018
arXiv
pre-print
In the second step, the video-context derived human masks are used as direct labels to train CNN. ...
Inspired by this, based on popular deep convolutional neural networks (CNN), we explore a very-weakly supervised learning framework for human segmentation task, where only an imperfect human detector is ...
These limitations hinder the development of semantic segmentation which generally requires large-scale data for training. ...
arXiv:1710.01457v2
fatcat:g56faqjgvvh6zefrwxjtiaqprm
YouTube-VOS: Sequence-to-Sequence Video Object Segmentation
[chapter]
2018
Lecture Notes in Computer Science
End-to-end sequential learning to explore spatial-temporal features for video segmentation is largely limited by the scale of available video segmentation datasets, i.e., even the largest video segmentation ...
To solve this problem, we build a new large-scale video object segmentation dataset called YouTube Video Object Segmentation dataset (YouTube-VOS). ...
Thus large scale training data such as our dataset is essential to learn spatial-temporal representation for video object segmentation. ...
doi:10.1007/978-3-030-01228-1_36
fatcat:jxbeuhclmjgvzosjkyi43r7goq
YouTube-VOS: Sequence-to-Sequence Video Object Segmentation
[article]
2018
arXiv
pre-print
End-to-end sequential learning to explore spatial-temporal features for video segmentation is largely limited by the scale of available video segmentation datasets, i.e., even the largest video segmentation ...
To solve this problem, we build a new large-scale video object segmentation dataset called YouTube Video Object Segmentation dataset (YouTube-VOS). ...
Thus large scale training data such as our dataset is essential to learn spatial-temporal representation for video object segmentation. ...
arXiv:1809.00461v1
fatcat:ufu4eo2mlrakplypne5njogkpm
Efficient video annotation with visual interpolation and frame selection guidance
[article]
2020
arXiv
pre-print
We introduce a unified framework for generic video annotation with bounding boxes. Video annotation is a longstanding problem, as it is a tedious and time-consuming process. ...
Moreover, we also show 10% annotation time improvement over a state-of-the-art method for video annotation with bounding boxes [25]. ...
The place for a large scale general purpose video dataset is still vacant and efficient video annotation methods are required to create those. Video annotation. ...
arXiv:2012.12554v1
fatcat:o6z22kew4fhdjeb2ot2ogpy5zq
Online multi-label active annotation
2008
Proceeding of the 16th ACM international conference on Multimedia - MM '08
To address this problem, in this paper, we propose a scalable framework for annotation-based video search, as well as a novel approach to enable large-scale semantic concept annotation, that is, online ...
However, due to the complexity of both video data and semantic concepts, existing techniques on automatic video annotation are still not able to handle large-scale video set and large-scale concept set ...
INTRODUCTION We will propose a novel semantic annotation scheme to enable content-based video search, which is scalable to large-scale video samples as well as large-scale semantic concepts. ...
doi:10.1145/1459359.1459379
dblp:conf/mm/HuaQ08
fatcat:va4bdm5vvjehtnp7bfxd2bqhhe