Learning to Forecast Videos of Human Activity with Multi-granularity Models and Adaptive Rendering
[article]
2017
arXiv
pre-print
We propose an approach for forecasting video of complex human activity involving multiple people. Direct pixel-level prediction is too simple to handle the appearance variability in complex activities. Hence, we develop novel intermediate representations. An architecture combining a hierarchical temporal model for predicting human poses and encoder-decoder convolutional neural networks for rendering target appearances is proposed. Our hierarchical model captures interactions among people by
arXiv:1712.01955v1
fatcat:22a5mvzl55g4bdbjj367jup22e