Towards Accurate Generative Models of Video: A New Metric & Challenges
[article]
2019
arXiv
pre-print
To this end, we propose Fréchet Video Distance (FVD), a new metric for generative models of video, and StarCraft 2 Videos (SCV), a benchmark of game play from custom StarCraft 2 scenarios that challenge ...
We contribute a large-scale human study, which confirms that FVD correlates well with qualitative human judgment of generated videos, and provide initial benchmark results on SCV. ...
Conclusion We introduced the Fréchet Video Distance (FVD), a new evaluation metric for generative models of video, and an important step towards better evaluation of models for video generation. ...
arXiv:1812.01717v2
fatcat:aab3klrxwvantmc5ayoccbvaxa
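The Fréchet distance underlying FVD can be sketched in plain NumPy. This is a hedged illustration, not the reference implementation: it fits a Gaussian to each of two sets of precomputed video-level features (e.g., from a pretrained video network, whose extraction is assumed and not shown) and compares the two Gaussians.

```python
import numpy as np

def _sqrtm_psd(a):
    # Matrix square root of a symmetric PSD matrix via eigendecomposition.
    w, v = np.linalg.eigh(a)
    w = np.clip(w, 0.0, None)
    return (v * np.sqrt(w)) @ v.T

def frechet_distance(feats_real, feats_gen):
    # feats_*: (N, D) arrays of video-level features from some
    # pretrained network (the feature extractor is not shown here).
    mu1, mu2 = feats_real.mean(0), feats_gen.mean(0)
    s1 = np.cov(feats_real, rowvar=False)
    s2 = np.cov(feats_gen, rowvar=False)
    # Tr((S1 S2)^(1/2)) via the PSD similarity trick:
    # (S1 S2) is similar to S1^(1/2) S2 S1^(1/2), which is PSD.
    r = _sqrtm_psd(s1)
    tr_covmean = np.trace(_sqrtm_psd(r @ s2 @ r))
    diff = mu1 - mu2
    return float(diff @ diff + np.trace(s1) + np.trace(s2) - 2.0 * tr_covmean)
```

With identical feature sets the distance is zero; shifting every feature by a constant moves only the mean term.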
Markov Decision Process for Video Generation
[article]
2019
arXiv
pre-print
To address this, we reformulate the problem of video generation as a Markov Decision Process (MDP). ...
We identify two pathological cases of temporal inconsistencies in video generation: video freezing and video looping. ...
The authors thank Sergey Tulyakov and Masaki Saito for helpful clarifications. ...
arXiv:1909.12400v1
fatcat:iqf5jzevj5bbjoulgx2oxsgxry
Action-conditioned Benchmarking of Robotic Video Prediction Models: a Comparative Study
[article]
2019
arXiv
pre-print
In this paper, we are proposing a new metric to compare different video prediction models based on this argument. ...
However, a comprehensive method for determining the fitness of different video prediction models at guiding the selection of actions is yet to be developed. ...
Acknowledgements This work is partially supported by the Portuguese Foundation for Science and Technology (FCT) project [UID/EEA/50009/2019]. ...
arXiv:1910.02564v1
fatcat:vd2djvrgffbxxl6twijvwby2mm
DVC-P: Deep Video Compression with Perceptual Optimizations
[article]
2021
arXiv
pre-print
Experimental results demonstrate that, compared with the baseline DVC, our proposed method can generate videos with higher perceptual quality achieving 12.27% reduction in a perceptual BD-rate equivalent ...
Specifically, a discriminator network and a mixed loss are employed to help our network trade off among distortion, perception and rate. ...
The GOP size is 10, and the first 100 frames are tested for each sequence.
1) Perceptual Video Quality Metric: We test the perceptual quality of decoded videos with FVD. ...
arXiv:2109.10849v2
fatcat:yybw353jm5cxhet3kgjczuxddy
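The BD-rate figure quoted above is the standard Bjøntegaard delta: fit log-rate as a polynomial of the quality score for each codec, integrate both fits over the shared quality range, and convert the average log-rate gap into a percentage. A minimal sketch, with illustrative variable names, assuming the usual four rate/quality anchor points per codec:

```python
import numpy as np

def bd_rate(rates_ref, qual_ref, rates_test, qual_test):
    # Bjøntegaard delta rate: average % bitrate change of the test
    # codec vs. the reference at equal quality (negative = savings).
    log_ref = np.log(np.asarray(rates_ref, dtype=float))
    log_test = np.log(np.asarray(rates_test, dtype=float))
    # Cubic fit of log-rate as a function of the quality score.
    p_ref = np.polyfit(qual_ref, log_ref, 3)
    p_test = np.polyfit(qual_test, log_test, 3)
    lo = max(min(qual_ref), min(qual_test))
    hi = min(max(qual_ref), max(qual_test))
    # Integrate both fits over the overlapping quality interval.
    int_ref = np.diff(np.polyval(np.polyint(p_ref), [lo, hi]))[0]
    int_test = np.diff(np.polyval(np.polyint(p_test), [lo, hi]))[0]
    avg_log_diff = (int_test - int_ref) / (hi - lo)
    return (np.exp(avg_log_diff) - 1.0) * 100.0
```

For example, a codec that needs half the bitrate of the reference at every quality level yields a BD-rate of -50%.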
Latent Video Transformer
[article]
2020
arXiv
pre-print
The video generation task can be formulated as a prediction of future video frames given some past frames. Recent generative models for videos face the problem of high computational requirements. ...
Some models require up to 512 Tensor Processing Units for parallel training. In this work, we address this problem via modeling the dynamics in a latent space. ...
They also acknowledge Vage Egiazarian for thoughtful discussions of the model and the experiments. ...
arXiv:2006.10704v1
fatcat:lzq7cokewzdrzcqemo7hmv3za4
Scaling Autoregressive Video Models
[article]
2020
arXiv
pre-print
In contrast, we show that conceptually simple autoregressive video generation models based on a three-dimensional self-attention mechanism achieve competitive results across multiple metrics on popular ...
Due to the statistical complexity of video, the high degree of inherent stochasticity, and the sheer amount of data, generating natural video remains a challenging task. ...
We would also like to thank Chelsea Finn and Tom Kwiatkowski for thoughtful comments on an earlier draft. ...
arXiv:1906.02634v3
fatcat:eu3jxenc5jg3dfvfxin3hdpnm4
Transframer: Arbitrary Frame Prediction with Generative Models
[article]
2022
arXiv
pre-print
We present a general-purpose framework for image modelling and vision tasks based on probabilistic frame prediction. ...
Transframer is the state-of-the-art on a variety of video generation benchmarks, is competitive with the strongest models on few-shot view synthesis, and can generate coherent 30-second videos from a single ...
.: Towards accurate generative models of video: A new metric & challenges. ICLR Workshops (2019). ...
arXiv:2203.09494v3
fatcat:4hfy5x53vbdv3bknwxr5kt67va
Novel View Video Prediction Using a Dual Representation
[article]
2021
arXiv
pre-print
Moreover, our method relies only on RGB frames to learn a dual representation, which is used to generate the video from a novel viewpoint. ...
We address the problem of novel view video prediction; given a set of input video clips from a single/multiple views, our network is able to predict the video from a novel view. ...
[20] Thomas Unterthiner, Sjoerd van Steenkiste, Karol Kurach, Raphael Marinier, Marcin Michalski, and Sylvain Gelly, "Towards accurate generative models of video: A new metric & challenges," arXiv ...
arXiv:2106.03956v1
fatcat:lbxbfbjynja3hnznwwc66z5zmq
Revisiting Hierarchical Approach for Persistent Long-Term Video Prediction
[article]
2021
arXiv
pre-print
(i.e., thousands of frames), setting a new standard of video prediction with orders of magnitude longer prediction time than existing approaches. ...
We evaluate our method on three challenging datasets involving car driving and human dancing, and demonstrate that it can generate complicated scene structures and motions over a very long time horizon ...
For the video-level evaluation, we adopt FVD that measures a Fréchet distance between the ground-truth videos and the generated ones in a video representation space. ...
arXiv:2104.06697v1
fatcat:thkaq2a53fhyzedof52lzw7xtm
StyleGAN-V: A Continuous Video Generator with the Price, Image Quality and Perks of StyleGAN2
[article]
2022
arXiv
pre-print
We can generate arbitrarily long videos at an arbitrarily high frame rate, while prior work struggles to generate even 64 frames at a fixed rate. ...
This decreases the training cost and provides richer learning signal to the generator, making it possible to train directly on 1024^2 videos for the first time. ...
Fréchet Video Distance (FVD) [68] serves as the main metric for video synthesis, but there is no complete official implementation for it (see §4 and Appx C). ...
arXiv:2112.14683v3
fatcat:qnsoi4xsgbglfgmumpioveuvjq
StyleVideoGAN: A Temporal Generative Model using a Pretrained StyleGAN
[article]
2021
arXiv
pre-print
After training, our model can not only generate new portrait videos for the training subject, but also for any random subject which can be embedded in the StyleGAN space. ...
We present a novel approach to the video synthesis problem that helps to greatly improve visual quality and drastically reduce the amount of training data and resources necessary for generating videos. ...
We also thank Pramod Rao for his invaluable support in conducting the experiments for our evaluation section. This work was supported by the ERC Consolidator Grant 4DReply (770784). ...
arXiv:2107.07224v2
fatcat:2lbxp7tenjbc7febnov6d57ql4
S-Flow GAN
[article]
2019
arXiv
pre-print
Our work offers a new method for domain translation from semantic label maps and Computer Graphic (CG) simulation edge map images to photo-realistic images. ...
We train a Generative Adversarial Network (GAN) in a conditional way to generate a photo-realistic version of a given CG scene. ...
FVD is a metric for evaluating video generation models and uses a modified version of FID. We calculated the FVD score for our generated video (Ours-vid) w.r.t. the Oracle (real video) and did the same ...
arXiv:1905.08474v2
fatcat:3thpv32pefgh3fwa6k2ddzii5u
Stochastic Image-to-Video Synthesis using cINNs
[article]
2021
arXiv
pre-print
In contrast to common stochastic image-to-video synthesis, such a model does not merely generate arbitrary videos progressing the initial image. ...
for controlled video synthesis. ...
Dynamic Texture FVD (DTFVD) In Sec. 4.3 of our main paper, we introduced a dedicated FVD metric for the domain of dynamics textures, the Dynamic Texture Fréchet Video Distance (DTFVD). ...
arXiv:2105.04551v2
fatcat:sye4z4og6vfghardh3edxsc7pi
Playable Video Generation
[article]
2021
arXiv
pre-print
We propose a novel framework for PVG that is trained in a self-supervised manner on a large dataset of unlabelled videos. ...
In PVG, we aim at allowing a user to control the generated video by selecting a discrete action at every time step as when playing a video game. ...
From this observation, we propose a new task, Playable Video Generation (PVG), illustrated in Fig. 1a. ...
arXiv:2101.12195v1
fatcat:rl2xllly2zb4ddhrbumt5elgrm
CCVS: Context-aware Controllable Video Synthesis
[article]
2021
arXiv
pre-print
This presentation introduces a self-supervised learning approach to the synthesis of new video clips from old ones, with several new key elements for improved spatial resolution and realism: It conditions ...
by affording simple mechanisms for handling multimodal ancillary information for controlling the synthesis process (e.g., a few sample frames, an audio track, a trajectory in image space) and taking into ...
We thank the reviewers for useful comments. ...
arXiv:2107.08037v2
fatcat:xmpykxzkz5cxrppejzlc5u6i6i
Showing results 1 — 15 out of 102 results