927 Hits in 3.0 sec

Future Semantic Segmentation with Convolutional LSTM [article]

Seyed shahabeddin Nabavi, Mrigank Rochan, Yang, Wang
2018 arXiv   pre-print
We propose a novel model that uses convolutional LSTM (ConvLSTM) to encode the spatiotemporal information of observed frames for future prediction.  ...  We consider the problem of predicting semantic segmentation of future frames in a video.  ...  We thank NVIDIA for donating some of the GPUs used in this work.  ... 
arXiv:1807.07946v1 fatcat:emv7greii5amrd7ykczdhir4zy

Sequence-to-Sequence Video Prediction by Learning Hierarchical Representations

Kun Fan, Chungin Joung, Seungjun Baek
2020 Applied Sciences  
Convolutional Long Short-Term Memory (ConvLSTM) is used in combination with skip connections so as to separately capture the sequential structures of multiple levels of hierarchy of features.  ...  Video prediction which maps a sequence of past video frames into realistic future video frames is a challenging task because it is difficult to generate realistic frames and model the coherent relationship  ...  [37] extended the ConvLSTM to build neural networks using predictive coding to predict future frames.  ... 
doi:10.3390/app10228288 fatcat:6exnlpr5nvatlfz5bh7d2hmway

Semantic Segmentation of Video Sequences with Convolutional LSTMs [article]

Andreas Pfeuffer and Karina Schulz and Klaus Dietmayer
2019 arXiv   pre-print
Most of the semantic segmentation approaches have been developed for single image segmentation, and hence, video sequences are currently segmented by processing each frame of the video sequence separately  ...  Nowadays, state-of-the-art segmentation approaches rarely use the classical encoder-decoder structure, but use multi-branch architectures.  ...  [25, 26] ) do not often use the classical encoder-decoder structure any more, but use several branches, for example for different resolutions, and combine these branches at the end of the network.  ... 
arXiv:1905.01058v1 fatcat:53iuvplif5edjb3uttkdqx2hti

Video Prediction at Multiple Scales with Hierarchical Recurrent Networks [article]

Ani Karapetyan, Angel Villar-Corrales, Andreas Boltres, Sven Behnke
2022 arXiv   pre-print
competitive performance for video frame prediction.  ...  scenes or action recognition datasets, consistently outperforming popular approaches for video frame prediction.  ...  Then, the model combines the predicted structured representations and the seed frames in order to forecast the future video frames.  ... 
arXiv:2203.09303v1 fatcat:5gl5zhcexvggzcronrdhx54ore

Decomposing Motion and Content for Natural Video Sequence Prediction [article]

Ruben Villegas, Jimei Yang, Seunghoon Hong, Xunyu Lin, Honglak Lee
2018 arXiv   pre-print
We propose a deep neural network for the prediction of future frames in natural video sequences.  ...  By independently modeling motion and content, predicting the next frame reduces to converting the extracted content features into the next frame content by the identified motion features, which simplifies  ...  (Section 4.3). • Combination Layers and Decoder takes the outputs from both encoder pathways and residual connections, d t , s t , and r t , and combines them to produce a pixel-level prediction of the  ... 
arXiv:1706.08033v2 fatcat:pq3eirkmn5cehplol4zmtbhska

Don't Forget The Past: Recurrent Depth Estimation from Monocular Video [article]

Vaishakh Patil, Wouter Van Gansbeke, Dengxin Dai, Luc Van Gool
2020 arXiv   pre-print
Thus far, depth is mostly estimated independently for a single frame at a time, even if the method starts from video input.  ...  We integrate the corresponding networks with a ConvLSTM such that the spatiotemporal structures of depth across frames can be exploited to yield a more accurate depth estimation.  ...  X = f encoder (I), (1) D = f decoder (X), (2) where X is the summarized representation by the encoder, which is compact and will be used to pass information across video frames for depth estimation from  ... 
arXiv:2001.02613v2 fatcat:4siqtdbr75bt3effxe2pyoczqe

Segmenting the Future [article]

Hsu-kuang Chiu, Ehsan Adeli, Juan Carlos Niebles
2019 arXiv   pre-print
While prior work attempts to predict future video pixels, anticipate activities or forecast future scene semantic segments from segmentation of the preceding frames, methods that predict future semantic  ...  In this paper, we propose a temporal encoder-decoder network architecture that encodes RGB frames from the past and decodes the future semantic segmentation.  ...  The architecture of ConvLSTM is based on the bidirectional ConvLSTM temporal module and uses the asymmetric Resnet101-FCN encoder-decoder backbone.  ... 
arXiv:1904.10666v2 fatcat:rpfw54hrebh2tc6bee3btyeqz4

Cubic LSTMs for Video Prediction [article]

Hehe Fan, Linchao Zhu, Yi Yang
2019 arXiv   pre-print
Predicting future frames in videos has become a promising direction of research for both computer vision and robot learning communities.  ...  ., a spatial branch for capturing moving objects, a temporal branch for processing motions, and an output branch for combining the first two branches to generate predicted frames.  ...  loss functions, and then combine the predicted motion and stationary content to construct future frames.  ... 
arXiv:1904.09412v1 fatcat:go3yfe3iazhyrb5ij5tiqverau

Don't Forget The Past: Recurrent Depth Estimation from Monocular Video

Vaishakh Patil, Wouter Van Gansbeke, Dengxin Dai, Luc Van Gool
2020 IEEE Robotics and Automation Letters  
While videos are used in the training stage of self-supervised depth prediction methods for computing the view-synthesis loss across neighboring frames [5] , [6] , [7] , [15] , they ignore the intrinsic  ...  More specifically, the learning process at frame t starts with spatial convolutions with the encoder to get X t , which is followed by temporal convolutions with the ConvLSTM H t , C t = f ConvLSTM ((X  ... 
doi:10.1109/lra.2020.3017478 fatcat:2rh3aqcrkbgjxncnbbkenc4n5a

Looking Ahead: Anticipating Pedestrians Crossing with Future Frames Prediction [article]

Mohamed Chaabane, Ameni Trabelsi, Nathaniel Blanchard, Ross Beveridge
2020 arXiv   pre-print
Our end-to-end model consists of two stages: the first stage is an encoder/decoder network that learns to predict future video frames.  ...  Specifically, our model uses previous video frames, recorded from the perspective of the vehicle, to predict if a pedestrian will cross in front of the vehicle.  ...  The first stage consists in processing N input video frames using an encoder/decoder network to predict N future frames.  ... 
arXiv:1910.09077v2 fatcat:vgvfduyqc5gixixsnfzxzm3vei

Efficient and Information-Preserving Future Frame Prediction and Beyond

Wei Yu, Yichao Lu, Steve Easterbrook, Sanja Fidler
2020 International Conference on Learning Representations  
Applying resolution-preserving blocks is a common practice to maximize information preservation in video prediction, yet their high memory consumption greatly limits their application scenarios.  ...  Our competitive results indicate the potential of using CrevNet as a generative pre-training strategy to guide downstream tasks.  ...  at 10 Hz with resolution of 1242×375.We first resize each frame to 416×128 and finetune our best model on the video prediction task solely.The combinations of features extracted by our two-way autoencoder  ... 
dblp:conf/iclr/YuLEF20 fatcat:hxsjccr2bbas7fom4j4flfk76e

Convolutional Tensor-Train LSTM for Spatio-temporal Learning [article]

Jiahao Su, Wonmin Byeon, Jean Kossaifi, Furong Huang, Jan Kautz, Animashree Anandkumar
2020 arXiv   pre-print
This is accomplished through a novel tensor train module that performs prediction by combining convolutional features across time.  ...  and datasets, including the multi-steps video prediction on the Moving-MNIST-2and KTH action datasets as well as early activity recognition on the Something-Something V2 dataset.  ...  Multi-frame video prediction: Moving-MNIST-2 dataset.  ... 
arXiv:2002.09131v5 fatcat:2fsgpf6h2ngqtj2j25ptj626na

Future Frame Prediction Using Convolutional VRNN for Anomaly Detection [article]

Yiwei Lu, Mahesh Kumar Krishna Reddy, Seyed shahabeddin Nabavi and Yang Wang
2019 arXiv   pre-print
convolutional LSTM (ConvLSTM).  ...  To the best of our knowledge, this is the first work that considers temporal information in future frame prediction based anomaly detection framework from the model perspective.  ...  We thank NVIDIA for donating some of the GPUs used in this work. Ped1 Ped2 Avenue Figure 5 . Examples of anomaly detection on three datasets.  ... 
arXiv:1909.02168v2 fatcat:64djavv3fbbu3e5zvehvwibyqi

PredRNN: A Recurrent Neural Network for Spatiotemporal Predictive Learning [article]

Yunbo Wang, Haixu Wu, Jianjin Zhang, Zhifeng Gao, Jianmin Wang, Philip S. Yu, Mingsheng Long
2022 arXiv   pre-print
We further propose a new curriculum learning strategy to force PredRNN to learn long-term dynamics from context frames, which can be generalized to most sequence-to-sequence models.  ...  Our approach is shown to obtain highly competitive results on five datasets for both action-free and action-conditioned predictive learning scenarios.  ...  Since both the encoding and forecasting parts of ConvLSTM use the same set of model parameters, the one-step training in the encoding phase may prevent the forecaster from learning the jumpy frame dependencies  ... 
arXiv:2103.09504v4 fatcat:al5ij37d3nhj7nynglu7rod5k4

Precipitation Nowcasting with Star-Bridge Networks [article]

Yuan Cao, Qiuying Li, Hongming Shan, Zhizhong Huang, Lei Chen, Leiming Ma, Junping Zhang
2019 arXiv   pre-print
Existing deep learning-based algorithms use a single network to process various rainfall intensities together, compromising the predictive accuracy.  ...  Precipitation nowcasting, which aims to precisely predict the short-term rainfall intensity of a local region, is gaining increasing attention in the artificial intelligence community.  ...  More specifically, they proposed encoder-decoder based convolutional long short-term memory (ConvLSTM), which is broadly used in various video tasks.  ... 
arXiv:1907.08069v2 fatcat:ggklxfd2cfbb3muskesvpuvdwa
« Previous Showing results 1 — 15 out of 927 results