13,071 Hits in 4.8 sec

Gradient Forward-Propagation for Large-Scale Temporal Video Modelling [article]

Mateusz Malinowski and Dimitrios Vytiniotis and Grzegorz Swirszcz and Viorica Patraucean and Joao Carreira
2021 arXiv   pre-print
action recognition video datasets such as HMDB51, UCF101, and the large-scale Kinetics-600.  ...  In this paper, we build upon Sideways, which avoids blocking by propagating approximate gradients forward in time, and we propose mechanisms for temporal integration of information based on different variants  ...  We thank Carl Doersch, Tom Hennigan, Jacob Menick, Simon Osindero, and Andrew Zisserman for their advice throughout the duration of the project and the anonymous CVPR reviewers for their feedback on the  ... 
arXiv:2106.08318v2 fatcat:xyxli2a2xvgy3maw6uz4bh54kq
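The snippet above describes Sideways-style training, in which approximate gradients are propagated forward in time rather than waiting for a synchronous backward pass. Below is a minimal, hypothetical sketch of that idea on a toy two-layer model (PyTorch assumed; none of the names come from the paper's code): the head's gradient with respect to the bottom layer's activation at time t-1 is applied to the activation produced at time t, relying on the temporal smoothness of video features.

```python
import torch
import torch.nn as nn

f1, f2 = nn.Linear(16, 16), nn.Linear(16, 4)        # two-stage toy network
opt = torch.optim.SGD(list(f1.parameters()) + list(f2.parameters()), lr=1e-2)
pending_grad = None                                  # dL/da carried over from step t-1

for t in range(200):
    frame = torch.randn(1, 16)                       # stand-in for per-frame features
    target = torch.randn(1, 4)

    a = f1(frame)                                    # bottom-layer activation at time t
    if pending_grad is not None:
        # Approximation: apply the gradient computed for a_{t-1} to a_t,
        # i.e. the gradient signal travels "forward in time".
        opt.zero_grad()
        a.backward(pending_grad)
        opt.step()                                   # updates f1 only

    a_in = a.detach().requires_grad_(True)           # head runs on a detached copy
    loss = ((f2(a_in) - target) ** 2).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()                                       # updates f2 now
    pending_grad = a_in.grad.detach()                # forwarded to the next time step
```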

Temporal Modeling Approaches for Large-scale Youtube-8M Video Understanding [article]

Fu Li, Chuang Gan, Xiao Liu, Yunlong Bian, Xiang Long, Yandong Li, Zhichao Li, Jie Zhou, Shilei Wen
2017 arXiv   pre-print
Experiment results on the challenging Youtube-8M dataset demonstrate that our proposed temporal modeling approaches can significantly improve existing temporal modeling approaches in the large-scale video  ...  Our system contains three major components: two-stream sequence model, fast-forward sequence model and temporal residual neural networks.  ...  Conclusions In this work, we have proposed three temporal modeling approaches to address the challenging large-scale video recognition task.  ... 
arXiv:1707.04555v1 fatcat:nmapxga24fd7vhi2gefsc7tsy4
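The snippet lists a temporal residual neural network among the system's three components. As a rough illustration only (toy dimensions, PyTorch assumed, not the authors' architecture), a temporal residual block over a sequence of per-frame features could look like this:

```python
import torch
import torch.nn as nn

class TemporalResidualBlock(nn.Module):
    """1-D temporal convolution with a residual connection over frame features."""
    def __init__(self, dim, kernel_size=3):
        super().__init__()
        self.conv = nn.Conv1d(dim, dim, kernel_size, padding=kernel_size // 2)
        self.relu = nn.ReLU()

    def forward(self, x):                 # x: (batch, time, dim)
        h = x.transpose(1, 2)             # Conv1d expects (batch, dim, time)
        h = self.relu(self.conv(h)).transpose(1, 2)
        return x + h                      # residual shortcut across the temporal conv

feats = torch.randn(2, 32, 256)           # 2 videos, 32 frames, 256-d features
out = TemporalResidualBlock(256)(feats)   # same shape as the input
```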

Interactive intrinsic video editing

Nicolas Bonneel, Kalyan Sunkavalli, James Tompkin, Deqing Sun, Sylvain Paris, Hanspeter Pfister
2014 ACM Transactions on Graphics  
We use a multi-scale parallelized solver to reconstruct the reflectance and illumination from these gradients while enforcing spatial and temporal reflectance constraints and user annotations.  ...  However, these algorithms cannot be easily extended to videos for two reasons: first, naïvely applying algorithms designed for single images to videos produce results that are temporally incoherent; second  ...  Acknowledgements We thank the SIGGRAPH reviewers for their feedback. This work was partially supported by NSF grants CGV-1111415, IIS-1110955, OIA-1125087, and LIMA -Région Rhône-Alpes.  ... 
doi:10.1145/2661229.2661253 fatcat:mbzg2zekgjdy3l4mvzxyr4fbkq
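The snippet mentions reconstructing reflectance and illumination from gradients while enforcing constraints, which is a linear least-squares problem. The 1-D toy below is my own construction (the paper solves full video volumes with a multi-scale parallelized solver); it only shows the structure: match target finite differences while a soft term pins the solution toward a reference.

```python
import numpy as np
from scipy.sparse import diags, eye, vstack
from scipy.sparse.linalg import lsqr

n = 100
r = np.linspace(0.0, 1.0, n)                        # reference signal (stand-in)
g = np.diff(r) + 0.01 * np.random.randn(n - 1)      # noisy target gradients

# Forward-difference operator D so that (D x)[i] = x[i+1] - x[i].
D = diags([-np.ones(n - 1), np.ones(n - 1)], [0, 1], shape=(n - 1, n))
lam = 0.1                                           # weight of the soft data term
A = vstack([D, lam * eye(n)]).tocsr()
b = np.concatenate([g, lam * r])
x = lsqr(A, b)[0]                                   # least-squares reconstruction
```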

Deep Video Matting via Spatio-Temporal Alignment and Aggregation [article]

Yanan Sun, Guanzhi Wang, Qiao Gu, Chi-Keung Tang, Yu-Wing Tai
2021 arXiv   pre-print
in reasoning temporal domain and lack of large-scale video matting datasets.  ...  The other contribution consists of a large-scale video matting dataset with groundtruth alpha mattes for quantitative evaluation and real-world high-resolution videos with trimaps for qualitative evaluation  ...  The other challenge for video matting is the necessary input of a dense trimap for each frame, making it difficult to generate high quality large-scale video matting benchmarks.  ... 
arXiv:2104.11208v1 fatcat:o4m2ckvt3bblbdyz2uac55rjm4

Stochastic Backpropagation: A Memory Efficient Strategy for Training Video Models [article]

Feng Cheng, Mingze Xu, Yuanjun Xiong, Hao Chen, Xinyu Li, Wei Li, Wei Xia
2022 arXiv   pre-print
It is based on the finding that gradients from incomplete execution for backpropagation can still effectively train the models with minimal accuracy loss, which is attributable to the high redundancy of video  ...  and temporal action detection.  ...  But in general graph models, as all the top nodes will propagate gradients to the bottom nodes, the gradients' calculation for Figure 4.  ... 
arXiv:2203.16755v1 fatcat:uq3zd76ijzar5etuigndch7scu
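As a hedged illustration of the idea quoted above (keeping the full forward pass but letting gradients flow back through only a random subset of frames), here is a toy PyTorch fragment; it shows the gradient-dropping mechanism only, not the activation-memory savings the paper engineers:

```python
import torch
import torch.nn as nn

backbone = nn.Linear(64, 64)                  # stand-in for a per-frame encoder
head = nn.Linear(64, 10)
keep_ratio = 0.5

frames = torch.randn(16, 64)                  # 16 frames of features
feats = backbone(frames)
keep = torch.rand(feats.shape[0]) < keep_ratio
# Detached frames still contribute values to the prediction, but gradients
# only flow back to the backbone through the kept subset.
feats = torch.where(keep.unsqueeze(1), feats, feats.detach())
logits = head(feats.mean(dim=0, keepdim=True))
loss = logits.sum()
loss.backward()                               # backbone grads come from ~half the frames
```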

Particle Video: Long-Range Motion Estimation Using Point Trajectories

Peter Sand, Seth Teller
2008 International Journal of Computer Vision  
This paper describes a new approach to motion estimation in video. We represent video motion using a set of particles.  ...  The resulting motion representation is useful for a variety of applications and cannot be directly obtained using existing methods such as optical flow or feature tracking.  ...  If the pixel has nearly the same color in a large scale image as in all smaller scale images, it is a large scale pixel (Fig. 4).  ... 
doi:10.1007/s11263-008-0136-6 fatcat:6spvemw24fc3fnyhmvwcglgqz4
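The last excerpt describes a per-pixel scale test. A loose sketch of such a test (grayscale image, Gaussian blurs standing in for the smaller-scale images, an arbitrary tolerance; not the paper's exact procedure) could be:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def large_scale_mask(img, sigmas=(1.0, 2.0, 4.0), tol=0.02):
    """Mark pixels whose value stays nearly the same across coarser scales."""
    stable = np.ones(img.shape, dtype=bool)
    for s in sigmas:
        blurred = gaussian_filter(img, sigma=s)   # coarser-scale version of the image
        stable &= np.abs(blurred - img) < tol
    return stable

img = np.random.rand(64, 64)       # stand-in for a grayscale frame
mask = large_scale_mask(img)       # True where the pixel is "large scale"
```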

Semantic Video Segmentation by Gated Recurrent Flow Propagation [article]

David Nilsson, Cristian Sminchisescu
2017 arXiv   pre-print
The temporal, gated recurrent flow propagation component of our model can be plugged into any static semantic segmentation architecture and turn it into a weakly supervised video processing one.  ...  Our model combines a convolutional architecture and a spatio-temporal transformer recurrent layer that are able to temporally propagate labeling information by means of optical flow, adaptively gated based  ...  on back-propagating gradient from the 2 STGRU units closest to the loss.  ... 
arXiv:1612.08871v2 fatcat:mcef73ca4zdufbkbzdqsa6siwq
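The snippet describes propagating labels along optical flow and gating them adaptively. The fragment below is a simplified stand-in (PyTorch assumed; the gate here is a crude confidence heuristic, not the paper's learned STGRU gate): warp the previous frame's label map with the flow, then blend it with the current static prediction.

```python
import torch
import torch.nn.functional as F

def warp(prev_labels, flow):
    """prev_labels: (1, C, H, W); flow: (1, 2, H, W) in pixels (dx, dy)."""
    _, _, H, W = prev_labels.shape
    ys, xs = torch.meshgrid(torch.arange(H), torch.arange(W), indexing="ij")
    grid_x = (xs + flow[:, 0]) / (W - 1) * 2 - 1     # normalise to [-1, 1]
    grid_y = (ys + flow[:, 1]) / (H - 1) * 2 - 1
    grid = torch.stack([grid_x, grid_y], dim=-1)     # (1, H, W, 2)
    return F.grid_sample(prev_labels, grid, align_corners=True)

prev = torch.rand(1, 21, 64, 64)      # previous frame's per-class scores
flow = torch.zeros(1, 2, 64, 64)      # optical flow (zeros as a stand-in)
curr = torch.rand(1, 21, 64, 64)      # static network's scores for this frame
gate = torch.sigmoid(curr.mean(1, keepdim=True))    # crude confidence gate
fused = gate * curr + (1 - gate) * warp(prev, flow)
```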

Play and Rewind

Hanwang Zhang, Meng Wang, Richang Hong, Tat-Seng Chua
2016 Proceedings of the 2016 ACM on Multimedia Conference - MM '16  
optimization model, resulting in severe information loss.  ...  In this paper, we propose a novel unsupervised video hashing framework called Self-Supervised Temporal Hashing (SSTH) that is able to capture the temporal nature of videos in an end-to-end learning-to-hash  ...  Datasets We used two challenging large-scale video datasets for unsupervised training and retrieval: FCVID. It is the Fudan-Columbia Video Dataset [16].  ... 
doi:10.1145/2964284.2964308 dblp:conf/mm/ZhangWHC16 fatcat:jmuptaw46jht3ipuau7qrvqsry
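The snippet names Self-Supervised Temporal Hashing, which turns an RNN's video representation into a binary code. Here is a loose sketch of that binarization step with a straight-through estimator (toy sizes; not the SSTH encoder-decoder itself):

```python
import torch
import torch.nn as nn

class BinaryHash(torch.autograd.Function):
    @staticmethod
    def forward(ctx, h):
        return torch.sign(h)                 # {-1, +1} codes
    @staticmethod
    def backward(ctx, grad_out):
        return grad_out                      # straight-through: pass gradients unchanged

rnn = nn.GRU(input_size=128, hidden_size=64, batch_first=True)
frames = torch.randn(2, 30, 128)             # 2 videos, 30 frames of CNN features
_, h = rnn(frames)                           # final hidden state: (1, 2, 64)
codes = BinaryHash.apply(h.squeeze(0))       # 64-bit binary code per video
```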

Deep Video Matting via Spatio-Temporal Alignment and Aggregation

Yanan Sun, Guanzhi Wang, Qiao Gu, Chi-Keung Tang, Yu-Wing Tai
2021 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)  
For a definitive version of this work, please refer to the published version.  ...  The other challenge for video matting is the necessary input of a dense trimap for each frame, making it difficult to generate high quality large-scale video matting benchmarks.  ...  Datasets Composited Dataset While there exist high-quality and large-scale datasets for image matting [33, 44] , only a few video matting datasets with ground truth alpha mattes are available which  ... 
doi:10.1109/cvpr46437.2021.00690 fatcat:kth5z3uekjdsngqdw72ge3pzse

Videoshop: A new framework for spatio-temporal video editing in gradient domain

Hongcheng Wang, Ning Xu, Ramesh Raskar, Narendra Ahuja
2007 Graphical Models  
This paper proposes a new framework for video editing in gradient domain.  ...  The spatio-temporal gradient fields of target videos are modified and/or mixed to generate a new gradient field which is usually not integrable.  ...  in Eq. (9) via loopy belief propagation [28] across a graphical model.  ... 
doi:10.1016/j.gmod.2006.06.002 fatcat:zpgdtt3hbba6tgkvrwi5c6ltvi
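The snippet says the spatio-temporal gradient fields of the target videos are modified and/or mixed, yielding a field that is usually not integrable. A toy mixing rule (my own choice: keep the stronger gradient at each location, as in mixed-gradient Poisson cloning) is shown below; recovering the edited video from `mixed` then requires a least-squares or Poisson-style solve as in the paper.

```python
import numpy as np

a = np.random.rand(8, 32, 32)          # video A: (time, height, width)
b = np.random.rand(8, 32, 32)          # video B
grads_a = np.stack(np.gradient(a))     # (3, T, H, W): d/dt, d/dy, d/dx
grads_b = np.stack(np.gradient(b))
mag_a = np.linalg.norm(grads_a, axis=0)
mag_b = np.linalg.norm(grads_b, axis=0)
# Mixed spatio-temporal gradient field: generally not the gradient of any video.
mixed = np.where((mag_a > mag_b)[None], grads_a, grads_b)
```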

Artistic Style Transfer for Videos and Spherical Images

Manuel Ruder, Alexey Dosovitskiy, Thomas Brox
2018 International Journal of Computer Vision  
Doing this for a video sequence single-handedly is beyond imagination.  ...  We present two computational approaches that transfer the style from one image (for example, a painting) to a whole video sequence.  ...  Therefore, we developed a multi-pass algorithm processing the video in alternating temporal directions using both forward and backward flow.  ... 
doi:10.1007/s11263-018-1089-z fatcat:c3my424tovcpbh2fygm7o4mvnq
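The last excerpt mentions a multi-pass algorithm that processes the video in alternating temporal directions using forward and backward flow. The skeleton below is only schematic; `warp_forward`, `warp_backward`, and `blend` are placeholder callables, not the authors' API.

```python
def multi_pass(frames, warp_forward, warp_backward, blend, num_passes=4):
    """Sweep over stylised frames, alternating direction, blending each frame
    with its flow-warped neighbour from the current sweep."""
    frames = list(frames)
    for p in range(num_passes):
        forward = (p % 2 == 0)
        order = range(1, len(frames)) if forward else range(len(frames) - 2, -1, -1)
        warp = warp_forward if forward else warp_backward
        for i in order:
            j = i - 1 if forward else i + 1
            frames[i] = blend(frames[i], warp(frames[j], j, i))  # pull from neighbour
    return frames
```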

Occlusion-Aware Video Object Inpainting [article]

Lei Ke, Yu-Wing Tai, Chi-Keung Tang
2021 arXiv   pre-print
In particular, the shape completion module models long-range object coherence while the flow completion module recovers accurate flow with sharp motion boundary, for propagating temporally-consistent texture  ...  To facilitate this new research, we construct the first large-scale video object inpainting benchmark YouTube-VOI to provide realistic occlusion scenarios with both occluded and visible object masks available  ...  Training models for occlusion reasoning requires a large number and variety of occluded video objects with amodal mask annotations.  ... 
arXiv:2108.06765v1 fatcat:dltn4glbjvfq7haccbhhmsznpm

Ada-VSR: Adaptive Video Super-Resolution with Meta-Learning [article]

Akash Gupta, Padmaja Jonnalagedda, Bir Bhanu, Amit K. Roy-Chowdhury
2021 arXiv   pre-print
Most of the existing works in supervised spatio-temporal video super-resolution (STVSR) heavily rely on a large-scale external dataset consisting of paired low-resolution low-frame rate (LR-LFR) and high-resolution  ...  Specifically, meta-learning is employed to obtain adaptive parameters, using a large-scale external dataset, that can adapt quickly to the novel condition (degradation model) of the given test video during  ...  The objective of internal learning is to finetune the model for the video instance to improve the spatio-temporal super-resolution with only a few gradient steps.  ... 
arXiv:2108.02832v1 fatcat:ftu5ripp4rcqpht44v3rvwqah4
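The snippet's "internal learning" step, fine-tuning on the test video itself for only a few gradient steps, can be sketched as follows (the model, degradation operator, and hyperparameters below are placeholders, not Ada-VSR's):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

model = nn.Conv2d(3, 3, 3, padding=1)          # stand-in for a meta-learned SR network
opt = torch.optim.Adam(model.parameters(), lr=1e-4)
test_frames = torch.rand(8, 3, 64, 64)         # the given low-resolution test video

for step in range(10):                         # "only a few gradient steps"
    lr_frames = F.avg_pool2d(test_frames, 2)   # further-degraded copies (assumed degradation)
    pred = F.interpolate(model(lr_frames), scale_factor=2, mode="bilinear")
    loss = F.l1_loss(pred, test_frames)        # self-supervised: original frames as targets
    opt.zero_grad(); loss.backward(); opt.step()
```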

SCNN: A General Distribution based Statistical Convolutional Neural Network with Application to Video Object Detection [article]

Tianchen Wang, Jinjun Xiong, Xiaowei Xu, Yiyu Shi
2019 arXiv   pre-print
By introducing a parameterized canonical model to model correlated data and defining corresponding operations as required for CNN training and inference, we show that SCNN can process multiple frames of  ...  We use a CNN based video object detection as an example to illustrate the usefulness of the proposed SCNN as a general network model.  ...  Rather than to use a linear parameterized form as obtained by ICA, a direction for future improvement will be to use the nonlinear parameterized distribution that can model large-scale spatial correlation  ... 
arXiv:1903.07663v1 fatcat:5kqikqhplzd6doxieihlov3gfe
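Both SCNN entries describe defining CNN operations over parameterized distributions rather than point values. A very rough sketch for a single linear map, under Gaussian and per-feature independence assumptions of my own (not the paper's general-distribution formulation): the mean passes through the weights, the diagonal variance through the squared weights.

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.standard_normal((4, 8))          # weights of a linear layer (conv is analogous)
b = rng.standard_normal(4)

mu_in = rng.standard_normal(8)           # per-feature mean of the input distribution
var_in = rng.random(8)                   # per-feature variance (independent features)

mu_out = W @ mu_in + b                   # E[W x + b]
var_out = (W ** 2) @ var_in              # Var[W x + b] under independence
```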

SCNN: A General Distribution Based Statistical Convolutional Neural Network with Application to Video Object Detection

Tianchen Wang, Jinjun Xiong, Xiaowei Xu, Yiyu Shi
2019 Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence (AAAI-19)  
By introducing a parameterized canonical model to model correlated data and defining corresponding operations as required for CNN training and inference, we show that SCNN can process multiple frames of  ...  We use a CNN based video object detection as an example to illustrate the usefulness of the proposed SCNN as a general network model.  ...  Rather than to use a linear parameterized form as obtained by ICA, a direction for future improvement will be to use the nonlinear parameterized distribution that can model large-scale spatial correlation  ... 
doi:10.1609/aaai.v33i01.33015321 fatcat:bu2pgze2hvevbdymlg76j655ai
Showing results 1 — 15 out of 13,071 results