Filters








926 Hits in 4.2 sec

High Fidelity Video Prediction with Large Stochastic Recurrent Neural Networks [article]

Ruben Villegas, Arkanath Pathak, Harini Kannan, Dumitru Erhan, Quoc V. Le, Honglak Lee
2019 arXiv   pre-print
In this work, we question if such handcrafted architectures are necessary and instead propose a different approach: finding minimal inductive bias for video prediction while maximizing network capacity  ...  Predicting future video frames is extremely challenging, as there are many factors of variation that make up the dynamics of how frames change through time.  ...  We first consider the Stochastic Video Generation (SVG) architecture presented in Denton and Fergus [2018] , a stochastic video prediction model that is entirely made up of standard neural network layers  ... 
arXiv:1911.01655v1 fatcat:znetgaaf2fcq5j6d4jxrjitbq4

Latent Neural Differential Equations for Video Generation [article]

Cade Gordon, Natalie Parde
2021 arXiv   pre-print
Generative Adversarial Networks have recently shown promise for video generation, building off of the success of image generation while also addressing a new challenge: time.  ...  We study the effects of Neural Differential Equations to model the temporal dynamics of video generation.  ...  High fidelity video prediction with large stochastic recurrent neural net- works.  ... 
arXiv:2011.03864v3 fatcat:qgjcowrcg5b6pn3ltdqyjyvlle

Accurate Grid Keypoint Learning for Efficient Video Prediction [article]

Xiaojie Gao, Yueming Jin, Qi Dou, Chi-Wing Fu, Pheng-Ann Heng
2021 arXiv   pre-print
Extensive experiments verify that our method outperforms the state-ofthe-art stochastic video prediction methods while saves more than 98% of computing resources.  ...  Second, we introduce a 2D binary map to represent the detected grid keypoints and then suggest propagating keypoint locations with stochasticity by selecting entries in the discrete grid space, thus preserving  ...  Comparison with Existing Methods We compared our model with several state-of-the-art image-based stochastic video prediction approaches using image-autoregressive recurrent networks, including two variants  ... 
arXiv:2107.13170v1 fatcat:2ooq3fmo7reuhlzvv2yv5znhky

2021 Index IEEE Transactions on Pattern Analysis and Machine Intelligence Vol. 43

2022 IEEE Transactions on Pattern Analysis and Machine Intelligence  
., +, TPAMI Nov. 2021 3850-3862 Evaluation of Saccadic Scanpath Prediction: Subjective Assessment Database and Recurrent Neural Network Based Metric. Xia, C., +, TPAMI Dec.  ...  Niu, Y., +, TPAMI Jan. 2021 347-359 Video Anomaly Detection with Sparse Coding Inspired Deep Neural Networks.  ...  Grammars A Generalized Earley Parser for Human Activity Parsing and Prediction. Qi, S., +, TPAMI Aug. 2021 Damen, D., +, TPAMI Nov. 2021 4125-4141  ... 
doi:10.1109/tpami.2021.3126216 fatcat:h6bdbf2tdngefjgj76cudpoyia

Predicting Video with VQVAE [article]

Jacob Walker, Ali Razavi, Aäron van den Oord
2021 arXiv   pre-print
With VQ-VAE we compress high-resolution videos into a hierarchical set of multi-scale discrete latent variables.  ...  In recent years, the task of video prediction-forecasting future video given past video frames-has attracted attention in the research community.  ...  “Folded Recurrent Neural Networks for Future Video Prediction.” In: ECCV. 2018. [35] Aäron van den Oord, Nal Kalchbrenner, and Koray Kavukcuoglu. “Pixel Recurrent Neural Networks.”  ... 
arXiv:2103.01950v1 fatcat:fmbirgg4bnh25akw2nm2yfnxjm

Convolutional neural networks automate detection for tracking of submicron-scale particles in 2D and 3D

Jay M. Newby, Alison M. Schaefer, Phoebe T. Lee, M. Gregory Forest, Samuel K. Lai
2018 Proceedings of the National Academy of Sciences of the United States of America  
The neural network tracker provides unprecedented automation and accuracy, with exceptionally low false positive and false negative rates on both 2D and 3D simulated videos and 2D experimental videos of  ...  to train the network on a diverse portfolio of video conditions.  ...  We designed our network to be recurrent in time so that past and future observations are used to predict particle locations.  ... 
doi:10.1073/pnas.1804420115 pmid:30135100 fatcat:scdhejrxc5clfayhujrjce6gda

Stochastic Video Generation with a Learned Prior [article]

Emily Denton, Rob Fergus
2018 arXiv   pre-print
Video frames are generated by drawing samples from this prior and combining them with a deterministic estimate of the future frame.  ...  Generating video frames that accurately predict future world states is challenging. Existing approaches either fail to capture the full distribution of outcomes, or yield blurry generations, or both.  ...  Several additional works train stochastic recurrent neural networks to model speech, handwriting, natural language (Chung et al., 2015; Fraccaro et al., 2016; Bowman et al., 2016) , perform counterfactual  ... 
arXiv:1802.07687v2 fatcat:z4d3hjpxwnhtno2zyzn4l5ortu

Learning Orthographic Structure With Sequential Generative Neural Networks

Alberto Testolin, Ivilin Stoianov, Alessandro Sperduti, Marco Zorzi
2015 Cognitive Science  
Here, we investigated a sequential version of the restricted Boltzmann machine (RBM), a stochastic recurrent neural network that extracts high-order structure from sensory data through unsupervised generative  ...  The model was compared to an extended version of simple recurrent networks, augmented with a stochastic process that allows autonomous generation of sequences, and to non-connectionist probabilistic models  ...  Simple recurrent networks are feed-forward neural networks composed by three layers.  ... 
doi:10.1111/cogs.12258 pmid:26073971 fatcat:k47jad5555ewzbwopstezdozi4

Image and Video Compression with Neural Networks: A Review

Siwei Ma, Xinfeng Zhang, Chuanmin Jia, Zhenghui Zhao, Shiqi Wang, Shanshe Wanga
2019 IEEE transactions on circuits and systems for video technology (Print)  
The evolution and development of neural network based compression methodologies are introduced for images and video respectively.  ...  In this paper, we provide a systematic, comprehensive and up-to-date review of neural network based image and video compression techniques.  ...  ) and Recurrent Neural Networks (RNN).  ... 
doi:10.1109/tcsvt.2019.2910119 fatcat:ibwmmewdlfcexjxfetsxzga52y

Stochastic Latent Residual Video Prediction [article]

Jean-Yves Franceschi
2020 arXiv   pre-print
Most works in the literature are based on stochastic image-autoregressive recurrent networks, which raises several performance and applicability issues.  ...  However, no such model for stochastic video prediction has been proposed in the literature yet, due to design and training difficulties.  ...  High fidelity video prediction with large stochastic recurrent neural networks. In Wallach, H., Larochelle, H., Beygelzimer, A., d'Alché Buc, F., Fox, E., and Garnett, R.  ... 
arXiv:2002.09219v4 fatcat:wble6c57gzewpcir7bq27dd3nq

Unsupervised Learning of Object Structure and Dynamics from Videos [article]

Matthias Minderer, Chen Sun, Ruben Villegas, Forrester Cole, Kevin Murphy, Honglak Lee
2020 arXiv   pre-print
Extracting and predicting object structure and dynamics from videos without supervision is a major challenge in machine learning.  ...  Our method improves upon unstructured representations both for pixel-level video prediction and for downstream tasks requiring object-level understanding of motion dynamics.  ...  Stochastic dynamics model To model the dynamics in the video, we use a variational recurrent neural network (VRNN) [8] .  ... 
arXiv:1906.07889v3 fatcat:elss3ab5vnh2be77fud2nqdgmq

Recurrent Attention Models for Depth-Based Person Identification [article]

Albert Haque, Alexandre Alahi, Li Fei-Fei
2016 arXiv   pre-print
Formulated as a reinforcement learning task, our model is based on a combination of convolutional and recurrent neural networks with the goal of identifying small, discriminative regions indicative of  ...  By combining a sparsification technique with a reinforcement learning objective, our recurrent attention model attends to small spatio-temporal regions with high fidelity while avoiding areas with little  ...  In the next section, we describe our model and how we balance this trade-off by employing visual "glimpses" [50] which process small 4D regions with high fidelity and grow to larger regions with lower  ... 
arXiv:1611.07212v1 fatcat:btybowthaba4jnwju2gh3y6yzu

Recurrent Attention Models for Depth-Based Person Identification

Albert Haque, Alexandre Alahi, Li Fei-Fei
2016 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)  
Formulated as a reinforcement learning task, our model is based on a combination of convolutional and recurrent neural networks with the goal of identifying small, discriminative regions indicative of  ...  By combining a sparsification technique with a reinforcement learning objective, our recurrent attention model attends to small spatio-temporal regions with high fidelity while avoiding areas with little  ...  In the next section, we describe our model and how we balance this trade-off by employing visual "glimpses" [50] which process small 4D regions with high fidelity and grow to larger regions with lower  ... 
doi:10.1109/cvpr.2016.138 dblp:conf/cvpr/HaqueAF16 fatcat:joqq2u7borba3ipum37yt7ob2q

Interpretable Deep Neural Networks for Dimensional and Categorical Emotion Recognition in-the-wild [article]

Xia Yicheng, Dimitrios Kollias
2019 arXiv   pre-print
This project focuses on extending the emotion recognition database, and training the CNN + RNN emotion recognition neural networks with emotion category representation and valence \& arousal representation  ...  The inner-relationship between two emotion representations and the interpretability of the neural networks are investigated.  ...  Section 2.2 goes through the idea of the Convolutional Neural Networks and Recurrent Neural Networks.  ... 
arXiv:1910.05784v2 fatcat:pupd36l3xngrbcv4bwkoujo2lm

A Review on Deep Learning Techniques for Video Prediction [article]

Sergiu Oprea, Pablo Martinez-Gonzalez, Alberto Garcia-Garcia, John Alejandro Castro-Vargas, Sergio Orts-Escolano, Jose Garcia-Rodriguez, Antonis Argyros
2020 arXiv   pre-print
We firstly define the video prediction fundamentals, as well as mandatory background concepts and the most used datasets.  ...  In light of the success of deep learning in computer vision, deep-learning-based video prediction emerged as a promising research direction.  ...  neural networks, recurrent networks, and generative models.  ... 
arXiv:2004.05214v2 fatcat:weerbkanmjb4dn6wkn5o4b5aia
« Previous Showing results 1 — 15 out of 926 results