Z-Forcing: Training Stochastic Recurrent Networks
[article]
2017
arXiv
pre-print
Many efforts have been devoted to training generative latent variable models with autoregressive decoders, such as recurrent neural networks (RNN). ...
In addition to maximizing the variational lower bound, we ease training of the latent variables by adding an auxiliary cost which forces them to reconstruct the state of the backward recurrent network. ...
We show that mixing stochastic forward pass, conditional prior and backward recognition network helps build effective stochastic recurrent models. ...
arXiv:1711.05411v2
fatcat:bhvbpbh6bne3ddzdovsbxlfo3q
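The auxiliary cost described in the Z-Forcing abstract above, where the latent variables are pushed to reconstruct the state of the backward recurrent network, can be sketched roughly as follows. This is a minimal illustrative PyTorch-style sketch, not the authors' implementation: the module names, GRU cells, linear auxiliary head, and MSE reconstruction term are all assumptions made for brevity.

# Minimal sketch of a Z-Forcing-style auxiliary cost (illustrative names and choices only).
import torch
import torch.nn as nn
import torch.nn.functional as F

class ZForcingSketch(nn.Module):
    def __init__(self, x_dim, h_dim, z_dim):
        super().__init__()
        self.fwd_rnn = nn.GRU(x_dim, h_dim, batch_first=True)   # forward (generative) RNN
        self.bwd_rnn = nn.GRU(x_dim, h_dim, batch_first=True)   # backward recognition RNN
        self.post = nn.Linear(2 * h_dim, 2 * z_dim)             # q(z_t | h_t, b_t)
        self.prior = nn.Linear(h_dim, 2 * z_dim)                # conditional prior p(z_t | h_t)
        self.aux = nn.Linear(z_dim, h_dim)                      # auxiliary head: z_t -> b_t

    def forward(self, x):                                       # x: (B, T, x_dim)
        h, _ = self.fwd_rnn(x)                                  # forward states (B, T, h_dim)
        b, _ = self.bwd_rnn(torch.flip(x, dims=[1]))            # run on the reversed sequence
        b = torch.flip(b, dims=[1])                             # align backward states with time

        mu_q, logvar_q = self.post(torch.cat([h, b], dim=-1)).chunk(2, dim=-1)
        mu_p, logvar_p = self.prior(h).chunk(2, dim=-1)
        z = mu_q + torch.randn_like(mu_q) * torch.exp(0.5 * logvar_q)  # reparameterized sample

        # KL(q || p) between diagonal Gaussians, summed over latent dimensions
        kl = 0.5 * (logvar_p - logvar_q
                    + (logvar_q.exp() + (mu_q - mu_p) ** 2) / logvar_p.exp() - 1).sum(-1)

        # Auxiliary cost: predict the backward state b_t from z_t (b detached here for simplicity)
        aux_loss = F.mse_loss(self.aux(z), b.detach())
        return z, kl.mean(), aux_loss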
Stochastic WaveNet: A Generative Latent Variable Model for Sequential Data
[article]
2018
arXiv
pre-print
It has been demonstrated that model capacity can be significantly enhanced by introducing stochastic latent variables in the hidden states of recurrent neural networks. ...
We argue that Stochastic WaveNet enjoys powerful distribution modeling capacity and the advantage of parallel training from dilated convolutions. ...
., 2016) and Z-forcing (Goyal et al., 2017) offer more powerful versions with augmented inference networks which better capture the correlation between the stochastic latent variables and the whole ...
arXiv:1806.06116v1
fatcat:4ijbqexxhbfi5fharoheqsmgau
Variational Bi-LSTMs
[article]
2017
arXiv
pre-print
Recurrent neural networks like long short-term memory (LSTM) are important architectures for sequential prediction tasks. LSTMs (and RNNs in general) model sequences along the forward time direction. ...
In the training of Bi-LSTMs, the forward and backward paths are learned independently. ...
We note that recently proposed methods like TwinNet (Serdyuk et al., 2017) and Z-forcing are similar in spirit to this idea. ...
arXiv:1711.05717v1
fatcat:z7tj5zgupzf4zcvs2ykpgzggei
Re-examination of the Role of Latent Variables in Sequence Modeling
[article]
2019
arXiv
pre-print
However, opposite results are also observed in other domains, where standard recurrent networks often outperform stochastic models. ...
With latent variables, stochastic recurrent models have achieved state-of-the-art performance in modeling sound-wave sequences. ...
For instance, the authors [12] report that an SRNN trained by Z-Forcing lags behind a baseline RNN in language modeling. ...
arXiv:1902.01388v2
fatcat:fnn4yxzfijfhvch6vdk5of2iai
Semi-Implicit Stochastic Recurrent Neural Networks
[article]
2020
arXiv
pre-print
Stochastic recurrent neural networks with latent random variables of complex dependency structures have been shown to be more successful in modeling sequential data than deterministic deep models. ...
A semi-implicit stochastic recurrent neural network (SIS-RNN) is developed to enrich inferred model posteriors that may have no analytic density functions, as long as independent random samples can be generated ...
In this paper, we break the Gaussian assumption and propose a semi-implicit stochastic recurrent neural network (SIS-RNN) that is capable of inferring implicit posteriors for sequential data while maintaining ...
arXiv:1910.12819v2
fatcat:xatf7zqqdzg3vkz3r4ki5mdglm
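The "semi-implicit" idea in the SIS-RNN abstract above hinges on posteriors whose density has no analytic form but from which independent samples can still be drawn. A rough, hypothetical sketch of such an implicit posterior sampler (PyTorch-style; the layer sizes and names are placeholders, not taken from the paper):

# Illustrative implicit-posterior sampler: density-free, but trivially sampleable.
import torch
import torch.nn as nn

class ImplicitPosterior(nn.Module):
    """Maps an RNN hidden state plus random noise to a latent sample z.
    The resulting marginal over z has no closed-form density, yet sampling is cheap."""
    def __init__(self, h_dim, noise_dim, z_dim):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(h_dim + noise_dim, 128), nn.ReLU(),
            nn.Linear(128, z_dim),
        )
        self.noise_dim = noise_dim

    def sample(self, h, n_samples=1):
        # h: (B, h_dim) -> z: (n_samples, B, z_dim)
        eps = torch.randn(n_samples, h.size(0), self.noise_dim, device=h.device)
        h_rep = h.unsqueeze(0).expand(n_samples, -1, -1)
        return self.net(torch.cat([h_rep, eps], dim=-1))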
STCN: Stochastic Temporal Convolutional Networks
[article]
2019
arXiv
pre-print
Convolutional architectures have recently been shown to be competitive on many sequence modelling tasks when compared to the de-facto standard of recurrent neural networks (RNNs), while providing computational ...
In this work, we propose stochastic temporal convolutional networks (STCNs), a novel architecture that combines the computational advantages of temporal convolutional networks (TCN) with the representational ...
., 2017) has demonstrated that temporal convolutional networks (TCNs) can also achieve at least competitive performance without relying on recurrence, and hence reducing the computational cost for training ...
arXiv:1902.06568v1
fatcat:lg6ljlejn5av5axbf7of5vhw7q
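The computational advantage of TCNs mentioned in the STCN abstract above comes from dilated causal convolutions, which process a whole sequence in parallel while the receptive field grows exponentially with depth. A minimal illustrative sketch (PyTorch-style; channel count, kernel size, and depth are assumptions, and the stochastic latent layers of STCN are omitted):

# Minimal dilated causal convolution stack (illustrative only).
import torch
import torch.nn as nn

class CausalConv1d(nn.Module):
    def __init__(self, channels, kernel_size, dilation):
        super().__init__()
        self.pad = (kernel_size - 1) * dilation          # left-pad so output at t sees only inputs <= t
        self.conv = nn.Conv1d(channels, channels, kernel_size, dilation=dilation)

    def forward(self, x):                                # x: (B, C, T)
        return self.conv(nn.functional.pad(x, (self.pad, 0)))

class TCNSketch(nn.Module):
    """Stack of dilated causal convs; receptive field doubles at each layer."""
    def __init__(self, channels=64, kernel_size=2, n_layers=6):
        super().__init__()
        self.layers = nn.ModuleList(
            [CausalConv1d(channels, kernel_size, dilation=2 ** i) for i in range(n_layers)]
        )

    def forward(self, x):
        for layer in self.layers:
            x = torch.relu(layer(x)) + x                 # residual connection
        return x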
Deep Bayesian Natural Language Processing
2019
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics: Tutorial Abstracts
Z-forcing: Training stochastic recurrent networks. In Advances in Neural Information Processing Systems 30, pages 6713-6723. ...
net, stochastic recurrent neural network, regularized recurrent neural network, stochastic learning & normalizing flows, VAE with VampPrior, skip recurrent neural network, temporal difference VAE, Markov recurrent ...
doi:10.18653/v1/p19-4006
dblp:conf/acl/Chien19
fatcat:bj6qf6cpkffz3oxinswh5fy4ry
Learning Dynamics Model in Reinforcement Learning by Incorporating the Long Term Future
[article]
2019
arXiv
pre-print
The latter distribution is modeled by a recurrent neural network with stochastic latent variables z_{1:T}. We train the model using variational inference. ...
In particular, we make use of the so-called "Z-forcing" idea (Goyal et al., 2017): we consider training a conditional generative model p_ζ(b | z) of backward states b given the inferred latent variables ...
arXiv:1903.01599v2
fatcat:qarnknfy5vcntl7v4b3suyw5ne
Modeling the Long Term Future in Model-Based Reinforcement Learning
2019
International Conference on Learning Representations
The latter distribution is modeled by a recurrent neural network with stochastic latent variables z_{1:T}. We train the model using variational inference. ...
In particular, we make use of the so-called "Z-forcing" idea (Goyal et al., 2017): ... max_ζ E_{q_θ(z|b,h)}[log p_ζ(b | z)] (6). The loss above acts as a training regularizer that enforces latent variables ...
We took our model trained using imitation learning as in section 5.1. ...
dblp:conf/iclr/KeSTGBPB19
fatcat:er4yijxcrjfwrb2u6pidefr42q
Competency Assessment for Autonomous Agents using Deep Generative Models
[article]
2022
arXiv
pre-print
By combining the strengths of conditional variational autoencoders with recurrent neural networks, the deep generative world model can probabilistically forecast trajectories over long horizons to task ...
They are: • Deterministic RNN: A deterministic recurrent neural network that is trained with the same depth and comparable number of parameters to the recurrent VAE model. ...
One work has used z-forcing [12] techniques to force the latent space to contain information from the future states in order to predict long-term dynamics [24] . ...
arXiv:2203.12670v1
fatcat:33rwpysexrfxto24wzwkxutlde
A Hybrid Convolutional Variational Autoencoder for Text Generation
2017
Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing
previously introduced VAE model for text where both the encoder and decoder are RNNs, we propose a novel hybrid architecture that blends fully feed-forward convolutional and deconvolutional components with a recurrent ...
Specifically, using a sample from q(z|x) to reconstruct the input instead of a deterministic z forces the model to map an input to a region of the space rather than to a single point. ...
Note that LSTM networks make use of Layer Nor- malization (Ba et al., 2016) which has been shown to make training of such networks easier. ...
doi:10.18653/v1/d17-1066
dblp:conf/emnlp/SemeniutaSB17
fatcat:qzk3vol6m5egtpvykxwjp2mnuu
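The point made in this abstract about reconstructing from a sample of q(z|x) rather than from a deterministic code is the standard VAE reparameterization step. A minimal sketch under the usual diagonal-Gaussian assumption (illustrative only, not the authors' implementation):

# Reparameterized sampling from q(z|x) = N(mu, diag(exp(logvar))).
import torch

def sample_z(mu, logvar):
    """Draw z ~ q(z|x) so that gradients flow through mu and logvar."""
    eps = torch.randn_like(mu)
    return mu + eps * torch.exp(0.5 * logvar)

# Decoding from such a sample (rather than from mu alone) forces the decoder to cover a
# region of latent space around mu, which is the effect described in the abstract.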
Sequential Latent Spaces for Modeling the Intention During Diverse Image Captioning
[article]
2019
arXiv
pre-print
[17] propose Z-forcing. ...
Following the approach in [17], using the same LSTM for all networks leads to inferior results (Tab. 3 row 2, Z-forcing). (4) Using a constant Gaussian distribution per word. ...
arXiv:1908.08529v1
fatcat:5bkvxkkiq5di3lichwwxlf4d7i
A Hybrid Convolutional Variational Autoencoder for Text Generation
[article]
2017
arXiv
pre-print
exhibits several attractive properties such as faster run time and convergence, ability to better handle long sequences and, more importantly, it helps to avoid some of the major difficulties posed by training ...
previously introduced VAE model for text where both the encoder and decoder are RNNs, we propose a novel hybrid architecture that blends fully feed-forward convolutional and deconvolutional components with a recurrent ...
Specifically, using a sample from q(z|x) to reconstruct the input instead of a deterministic z forces the model to map an input to a region of the space rather than to a single point. ...
arXiv:1702.02390v1
fatcat:klxamzr4qjaudfs5anpmsa3nxy
Relational State-Space Model for Stochastic Multi-Object Systems
[article]
2020
arXiv
pre-print
Real-world dynamical systems often consist of multiple stochastic subsystems that interact with each other. ...
This paper introduces the relational state-space model (R-SSM), a sequential hierarchical latent variable model that makes use of graph neural networks (GNNs) to simulate the joint state transitions of ...
The idea of using auxiliary costs to train deep SSMs has been explored in Z-forcing (Goyal et al., 2017; Ke et al., 2019) , which predicts the future summaries directly rather than contrastingly. ...
arXiv:2001.04050v1
fatcat:l6uqwkzwjraudm73zstu5mu4mu
Benchmarking Generative Latent Variable Models for Speech
[article]
2022
arXiv
pre-print
Stochastic latent variable models (LVMs) achieve state-of-the-art performance on natural image generation but are still inferior to deterministic models on speech. ...
Stochastic recurrent neural network (SRNN) The SRNN (Fraccaro et al., 2016) is similar to the VRNN but differs by separating the stochastic latent variables from the deterministic representations (figure ...
The FH-VAE, with disjoint latent variables and discriminative objective, Z-forcing, with an auxiliary task, and the VQ-VAE, with a quantized latent space and autoregressive prior fitted after training, ...
arXiv:2202.12707v2
fatcat:z7b6gzr4s5bdhcsukmuajyvkoy
Showing results 1 — 15 out of 44 results