Gradient Forward-Propagation for Large-Scale Temporal Video Modelling
[article]
2021
arXiv
pre-print
action recognition video datasets such as HMDB51, UCF101, and the large-scale Kinetics-600. ...
In this paper, we build upon Sideways, which avoids blocking by propagating approximate gradients forward in time, and we propose mechanisms for temporal integration of information based on different variants ...
We thank Carl Doersch, Tom Hennigan, Jacob Menick, Simon Osindero, and Andrew Zisserman for their advice throughout the duration of the project and the anonymous CVPR reviewers for their feedback on the ...
arXiv:2106.08318v2
fatcat:xyxli2a2xvgy3maw6uz4bh54kq
Temporal Modeling Approaches for Large-scale Youtube-8M Video Understanding
[article]
2017
arXiv
pre-print
Experiment results on the challenging Youtube-8M dataset demonstrate that our proposed temporal modeling approaches can significantly improve existing temporal modeling approaches in the large-scale video ...
Our system contains three major components: two-stream sequence model, fast-forward sequence model and temporal residual neural networks. ...
Conclusions In this work, we have proposed three temporal modeling approaches to address the challenging large-scale video recognition task. ...
arXiv:1707.04555v1
fatcat:nmapxga24fd7vhi2gefsc7tsy4
Interactive intrinsic video editing
2014
ACM Transactions on Graphics
We use a multi-scale parallelized solver to reconstruct the reflectance and illumination from these gradients while enforcing spatial and temporal reflectance constraints and user annotations. ...
However, these algorithms cannot be easily extended to videos for two reasons: first, naïvely applying algorithms designed for single images to videos produces results that are temporally incoherent; second ...
Acknowledgements We thank the SIGGRAPH reviewers for their feedback. This work was partially supported by NSF grants CGV-1111415, IIS-1110955, OIA-1125087, and LIMA -Région Rhône-Alpes. ...
doi:10.1145/2661229.2661253
fatcat:mbzg2zekgjdy3l4mvzxyr4fbkq
Deep Video Matting via Spatio-Temporal Alignment and Aggregation
[article]
2021
arXiv
pre-print
in reasoning temporal domain and lack of large-scale video matting datasets. ...
The other contribution consists of a large-scale video matting dataset with groundtruth alpha mattes for quantitative evaluation and real-world high-resolution videos with trimaps for qualitative evaluation ...
The other challenge for video matting is the necessary input of a dense trimap for each frame, making it difficult to generate high quality large-scale video matting benchmarks. ...
arXiv:2104.11208v1
fatcat:o4m2ckvt3bblbdyz2uac55rjm4
Stochastic Backpropagation: A Memory Efficient Strategy for Training Video Models
[article]
2022
arXiv
pre-print
It is based on the finding that gradients from incomplete execution for backpropagation can still effectively train the models with minimal accuracy loss, which is attributable to the high redundancy of video ...
and temporal action detection. ...
But in general graph models, as all the top nodes will propagate gradients to the bottom nodes, the gradients' calculation for Figure 4. ...
arXiv:2203.16755v1
fatcat:uq3zd76ijzar5etuigndch7scu
Particle Video: Long-Range Motion Estimation Using Point Trajectories
2008
International Journal of Computer Vision
This paper describes a new approach to motion estimation in video. We represent video motion using a set of particles. ...
The resulting motion representation is useful for a variety of applications and cannot be directly obtained using existing methods such as optical flow or feature tracking. ...
If the pixel has nearly the same color in a large scale image as in all smaller scale images, it is a large scale pixel (Fig. 4). ...
doi:10.1007/s11263-008-0136-6
fatcat:6spvemw24fc3fnyhmvwcglgqz4
Semantic Video Segmentation by Gated Recurrent Flow Propagation
[article]
2017
arXiv
pre-print
The temporal, gated recurrent flow propagation component of our model can be plugged into any static semantic segmentation architecture and turn it into a weakly supervised video processing one. ...
Our model combines a convolutional architecture and a spatio-temporal transformer recurrent layer that are able to temporally propagate labeling information by means of optical flow, adaptively gated based ...
on back-propagating gradient from the 2 STGRU units closest to the loss. ...
arXiv:1612.08871v2
fatcat:mcef73ca4zdufbkbzdqsa6siwq
Play and Rewind
2016
Proceedings of the 2016 ACM on Multimedia Conference - MM '16
optimization model, resulting in severe information loss. ...
In this paper, we propose a novel unsupervised video hashing framework called Self-Supervised Temporal Hashing (SSTH) that is able to capture the temporal nature of videos in an end-to-end learning-to-hash ...
Datasets We used two challenging large-scale video datasets for unsupervised training and retrieval:
FCVID. It is the Fudan-Columbia Video Dataset [16]. ...
doi:10.1145/2964284.2964308
dblp:conf/mm/ZhangWHC16
fatcat:jmuptaw46jht3ipuau7qrvqsry
Deep Video Matting via Spatio-Temporal Alignment and Aggregation
2021
2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
For a definitive version of this work, please refer to the published version. ...
The other challenge for video matting is the necessary input of a dense trimap for each frame, making it difficult to generate high quality large-scale video matting benchmarks. ...
Datasets
Composited Dataset. While there exist high-quality and large-scale datasets for image matting [33, 44], only a few video matting datasets with ground truth alpha mattes are available which ...
doi:10.1109/cvpr46437.2021.00690
fatcat:kth5z3uekjdsngqdw72ge3pzse
Videoshop: A new framework for spatio-temporal video editing in gradient domain
2007
Graphical Models
This paper proposes a new framework for video editing in gradient domain. ...
The spatio-temporal gradient fields of target videos are modified and/or mixed to generate a new gradient field which is usually not integrable. ...
in Eq. (9) via loopy belief propagation [28] across a graphical model. ...
doi:10.1016/j.gmod.2006.06.002
fatcat:zpgdtt3hbba6tgkvrwi5c6ltvi
Artistic Style Transfer for Videos and Spherical Images
2018
International Journal of Computer Vision
Doing this for a video sequence single-handedly is beyond imagination. ...
We present two computational approaches that transfer the style from one image (for example, a painting) to a whole video sequence. ...
Therefore, we developed a multi-pass algorithm processing the video in alternating temporal directions using both forward and backward flow. ...
doi:10.1007/s11263-018-1089-z
fatcat:c3my424tovcpbh2fygm7o4mvnq
Occlusion-Aware Video Object Inpainting
[article]
2021
arXiv
pre-print
In particular, the shape completion module models long-range object coherence while the flow completion module recovers accurate flow with sharp motion boundary, for propagating temporally-consistent texture ...
To facilitate this new research, we construct the first large-scale video object inpainting benchmark YouTube-VOI to provide realistic occlusion scenarios with both occluded and visible object masks available ...
Training models for occlusion reasoning requires a large number and variety of occluded video objects with amodal mask annotations. ...
arXiv:2108.06765v1
fatcat:dltn4glbjvfq7haccbhhmsznpm
Ada-VSR: Adaptive Video Super-Resolution with Meta-Learning
[article]
2021
arXiv
pre-print
Most of the existing works in supervised spatio-temporal video super-resolution (STVSR) heavily rely on a large-scale external dataset consisting of paired low-resolution low-frame rate (LR-LFR) and high-resolution ...
Specifically, meta-learning is employed to obtain adaptive parameters, using a large-scale external dataset, that can adapt quickly to the novel condition (degradation model) of the given test video during ...
The objective of internal learning is to finetune the model for the video instance to improve the spatio-temporal super-resolution with only a few gradient steps. ...
arXiv:2108.02832v1
fatcat:ftu5ripp4rcqpht44v3rvwqah4
SCNN: A General Distribution based Statistical Convolutional Neural Network with Application to Video Object Detection
[article]
2019
arXiv
pre-print
By introducing a parameterized canonical model to model correlated data and defining corresponding operations as required for CNN training and inference, we show that SCNN can process multiple frames of ...
We use a CNN based video object detection as an example to illustrate the usefulness of the proposed SCNN as a general network model. ...
Rather than using a linear parameterized form as obtained by ICA, a direction for future improvement will be to use a nonlinear parameterized distribution that can model large-scale spatial correlation ...
arXiv:1903.07663v1
fatcat:5kqikqhplzd6doxieihlov3gfe
SCNN: A General Distribution Based Statistical Convolutional Neural Network with Application to Video Object Detection
2019
Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence (AAAI-19)
By introducing a parameterized canonical model to model correlated data and defining corresponding operations as required for CNN training and inference, we show that SCNN can process multiple frames of ...
We use a CNN based video object detection as an example to illustrate the usefulness of the proposed SCNN as a general network model. ...
Rather than using a linear parameterized form as obtained by ICA, a direction for future improvement will be to use a nonlinear parameterized distribution that can model large-scale spatial correlation ...
doi:10.1609/aaai.v33i01.33015321
fatcat:bu2pgze2hvevbdymlg76j655ai
Showing results 1 — 15 out of 13,071 results