
Surveillance Video Parsing with Single Frame Supervision [article]

Si Liu, Changhu Wang, Ruihe Qian, Han Yu, Renda Bao
2016 arXiv   pre-print
In this paper, we develop a Single frame Video Parsing (SVP) method which requires only one labeled frame per video in training stage.  ...  Surveillance video parsing, which segments the video frames into several labels, e.g., face, pants, left-leg, has wide applications.  ...  A Single frame supervised video Parsing (SVP) network is learned from the extremely sparsely labeled videos. During testing, a parsing window is slided along the video.  ... 
arXiv:1611.09587v1 fatcat:esoh7rx2q5csjc7pwsmlbupgji

Surveillance Video Parsing with Single Frame Supervision

Si Liu, Changhu Wang, Ruihe Qian, Han Yu, Renda Bao, Yao Sun
2017 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)  
In this paper, we develop a Single frame Video Parsing (SVP) method which requires only one labeled frame per video in training stage.  ...  Surveillance video parsing, which segments the video frames into several labels, e.g., face, pants, left-leg, has wide applications [41, 8].  ...  As shown in Figure 1, the labeled frame per training video (red bounding box) is fed into the proposed Single frame supervised Video Parsing (SVP) network.  ... 
doi:10.1109/cvpr.2017.114 dblp:conf/cvpr/LiuWQYBS17 fatcat:3yqf4z3zifgz5iptwf5l4k57ka

Semi-Supervised Image-to-Video Adaptation for Video Action Recognition

Rohan Munshi
2021 International Journal for Research in Applied Science and Engineering Technology  
Such human action recognition is based on evidence gathered from videos. It has a lot of applications including surveillance, video indexing, biometrics, telehealth, and human-computer interaction.  ...  Such systematic classification can facilitate researchers to spot the acceptable ways on the market to deal with every one of the challenges visaged and their limitations.  ...  Vector is projected to lower dimensions using PCA. 1) Single Viewpoint IV. ANALYSIS Action parsing in videos with complicated scenes is a noteworthy however difficult task in computer vision.  ... 
doi:10.22214/ijraset.2021.37355 fatcat:r7rxhd2p4vg4loq5ou7b5vqbea

Unsupervised Semantic Action Discovery from Video Collections [article]

Ozan Sener and Amir Roshan Zamir and Chenxia Wu and Silvio Savarese and Ashutosh Saxena
2016 arXiv   pre-print
In this paper, we consider instructional videos where there are tens of millions of them on the Internet. We propose a method for parsing a video into such semantic steps in an unsupervised way.  ...  It typically has an underlying structure, with a starting point, ending, and certain objective steps between them.  ...  In this paper, we present a unified model, considering both of the modalities, in order to parse human activities into activity steps with no form of supervision other than requiring videos to be the same  ... 
arXiv:1605.03324v1 fatcat:p3zpvrjrujhe5gnw5hc2grnjwq

ReMotENet: Efficient Relevant Motion Event Detection for Large-scale Home Surveillance Videos [article]

Ruichi Yu, Hongcheng Wang, Larry S. Davis
2018 arXiv   pre-print
Meanwhile, it exploits the properties of home surveillance videos, e.g., relevant motion is sparse both spatially and temporally, and enhances 3D ConvNets with a spatial-temporal attention model and reference-frame  ...  ., person and vehicles) in large scale home surveillance videos.  ...  The fruitful discussion with MD Mahmudul Hasan, Upal Mahbub and Jan Neumann is highly appreciated. Special thanks go to all Comcast TPS volunteers who donated their home videos.  ... 
arXiv:1801.02031v1 fatcat:ksfahktdlvhuvhzxks4atoebrq

Attribute-based people search in surveillance environments

Daniel A. Vaquero, Rogerio S. Feris, Duan Tran, Lisa Brown, Arun Hampapur, Matthew Turk
2009 2009 Workshop on Applications of Computer Vision (WACV)  
We are not aware of any similar surveillance system capable of automatically finding people in video based on their fine-grained body parts and attributes.  ...  To evaluate the performance of our approach, we present extensive experiments on a set of images collected from the Internet, on infrared imagery, and on two-and-a-half months of video from a real surveillance  ...  The description in this section deals with attribute extraction from individual video frames or static images.  ... 
doi:10.1109/wacv.2009.5403131 dblp:conf/wacv/VaqueroFTBHT09 fatcat:g75llxaltngzzoi3ch5uxtzft4

Video Event Recognition for Surveillance Applications (VERSA) [article]

Stephen O'Hara
2010 arXiv   pre-print
VERSA provides a general-purpose framework for defining and recognizing events in live or recorded surveillance video streams.  ...  Doing so requires the definition of certain fundamental spatial and temporal relationships and a high-level syntax for specifying frame templates and query parameters.  ...  Frame Sketch and Canvas Area The Frame Sketch and Canvas Area is where the user defines a single Frame Template for a key frame of video in the Event Template.  ... 
arXiv:1007.3772v1 fatcat:apuv2tifhzbudmtwyzbxutka3y

Parsing Videos of Actions with Segmental Grammars

Hamed Pirsiavash, Deva Ramanan
2014 2014 IEEE Conference on Computer Vision and Pattern Recognition  
We illustrate the effectiveness of our approach over common baselines on a new half-million frame dataset of continuous YouTube videos.  ...  We describe simple grammars that capture hierarchical temporal structure while admitting inference with a finite-state-machine. This makes parsing linear time, constant storage, and naturally online.  ...  Fully-supervised learning: Assume we are given training data of videos with ground-truth parses {D n , P n } and a manually-specified set of production rules Γ.  ... 
doi:10.1109/cvpr.2014.85 dblp:conf/cvpr/PirsiavashR14 fatcat:z4un5zgej5gstl6hqe7uzlnau4

Generating Highlights of Cricket Video using Commentators and Spectators Voice

2019 International journal of recent technology and engineering  
The experimental results have shown good performance when compared with human generated summary.  ...  Video Summarization is the process of creating a small video describing the actual video within short duration(s).  ...  In [10] [11] [12] a novel approach for exploring surveillance video data. Initially object movement was recorded i.e. by extracting single shot image which is a combination of multiple frames.  ... 
doi:10.35940/ijrte.d4261.118419 fatcat:dkkalpu2xze25j5oiclxqilcgm

Joint Video and Text Parsing for Understanding Events and Answering Queries

2014 IEEE Multimedia  
Video parsing and text parsing produce two parse graphs from the input video and text respectively.  ...  Based on the probabilistic model, we propose a joint parsing system consisting of three modules: video parsing, text parsing and joint inference.  ...  Amer, Dan Xie and Sinisa Todorovic for their help in automatic video parsing.  ... 
doi:10.1109/mmul.2014.29 fatcat:bunjekcxezhffkinjx2zet2afm

Joint Video and Text Parsing for Understanding Events and Answering Queries [article]

Kewei Tu, Meng Meng, Mun Wai Lee, Tae Eun Choe, Song-Chun Zhu
2014 arXiv   pre-print
Video parsing and text parsing produce two parse graphs from the input video and text respectively.  ...  Based on the probabilistic model, we propose a joint parsing system consisting of three modules: video parsing, text parsing and joint inference.  ...  Amer, Dan Xie and Sinisa Todorovic for their help in automatic video parsing.  ... 
arXiv:1308.6628v2 fatcat:7y5wjg3irrgodbeal6hmoz32ea

Semantic event representation and recognition using syntactic attribute graph grammar

Liang Lin, Haifeng Gong, Li Li, Liang Wang
2009 Pattern Recognition Letters  
With this representation, one observed event can be parsed into an "event parse graph", and all possible variability of one event can be modeled into an "event And-Or graph", in a syntactic way.  ...  This grammar models the variability of semantic events by a set of meaningful "event components" with the spatio-temporal constraints.  ...  Introduction Video understanding is a hot research topic in recent years, with many applications, such as visual surveillance, video browsing and content-based video indexing.  ... 
doi:10.1016/j.patrec.2008.02.023 fatcat:g6ptjmczhffrbkuaodgicj6voy

2019 Index IEEE Transactions on Circuits and Systems for Video Technology Vol. 29

2019 IEEE transactions on circuits and systems for video technology (Print)  
., +, TCSVT Dec. 2019 3608-3621 Perceiving Motion From Dynamic Memory for Vehicle Detection in Surveil-Blind Video Quality Assessment With Weakly Supervised Learning and Resampling Strategy.  ...  Kang, D., +, TCSVT May 2019 1408-1422 Blind Video Quality Assessment With Weakly Supervised Learning and Resampling Strategy.  ... 
doi:10.1109/tcsvt.2019.2959179 fatcat:2bdmsygnonfjnmnvmb72c63tja

Online Dominant and Anomalous Behavior Detection in Videos

Mehrsan Javan Roshtkhari, Martin D. Levine
2013 2013 IEEE Conference on Computer Vision and Pattern Recognition  
We present a novel approach for video parsing and simultaneous online learning of dominant and anomalous behaviors in surveillance videos.  ...  In this paper, video events are learnt at each pixel without supervision using densely constructed spatio-temporal video volumes. Furthermore, the volumes are organized into large contextual graphs.  ...  At each temporal sample t, a single image is added to the already observed frames and a new video sequence, the query, Q, is formed.  ... 
doi:10.1109/cvpr.2013.337 dblp:conf/cvpr/RoshtkhariL13 fatcat:buity54tm5h5npbhiskrep6e5q

Hardware Architecture for Video Authentication Using Sensor Pattern Noise

Amit Pande, Shaxun Chen, Prasant Mohapatra, Joseph Zambreno
2014 IEEE transactions on circuits and systems for video technology (Print)  
A prototype implementation on a Xilinx Virtex-6 FPGA device was optimized with a resulting throughput of 167 MB/s, processing 30 640 × 480 video frames in 0.17 s.  ...  Camera identification and authentication have formed the basis of image/video forensics in legal proceedings.  ...  With 100 frames of a video, we obtain an average accuracy of 60% with db2 while this is 63% with the db8. To obtain a reasonable accuracy, we need a larger number of video frames.  ... 
doi:10.1109/tcsvt.2013.2276869 fatcat:ju3aa7le3nccheoabr2mzhfuc4