
Z-Forcing: Training Stochastic Recurrent Networks [article]

Anirudh Goyal, Alessandro Sordoni, Marc-Alexandre Côté, Nan Rosemary Ke, Yoshua Bengio
2017 arXiv   pre-print
Many efforts have been devoted to training generative latent variable models with autoregressive decoders, such as recurrent neural networks (RNNs).  ...  In addition to maximizing the variational lower bound, we ease training of the latent variables by adding an auxiliary cost which forces them to reconstruct the state of the backward recurrent network.  ...  We show that mixing the stochastic forward pass, conditional prior and backward recognition network helps build effective stochastic recurrent models.  ...
arXiv:1711.05411v2 fatcat:bhvbpbh6bne3ddzdovsbxlfo3q
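
The auxiliary cost described in this excerpt is, in spirit, a reconstruction term that makes the latent variable predict the backward RNN's hidden state. The sketch below is a minimal illustration, not the authors' code; `aux_net`, the dimensions, the fixed-variance MSE surrogate, and the stop-gradient on the target are all assumptions made for the example.

```python
import torch
import torch.nn as nn

# Hypothetical sizes for the sketch.
z_dim, b_dim = 32, 64

# Small decoder that tries to reconstruct the backward-RNN state from z.
aux_net = nn.Sequential(nn.Linear(z_dim, 128), nn.Tanh(), nn.Linear(128, b_dim))

def zforcing_aux_loss(z, b):
    """MSE surrogate for -log p(b | z) under a fixed-variance Gaussian."""
    b_pred = aux_net(z)                          # predict the backward state from z
    return ((b_pred - b.detach()) ** 2).mean()   # stop-grad through the target (an assumption here)

# Toy usage: a batch of 8 latents and matching backward states.
z = torch.randn(8, z_dim, requires_grad=True)
b = torch.randn(8, b_dim)
aux = zforcing_aux_loss(z, b)   # added to the variational lower bound with a weighting coefficient
aux.backward()                  # gradients flow into z (and aux_net)
```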

Stochastic WaveNet: A Generative Latent Variable Model for Sequential Data [article]

Guokun Lai, Bohan Li, Guoqing Zheng, Yiming Yang
2018 arXiv   pre-print
It has been demonstrated that model capacity can be significantly enhanced by introducing stochastic latent variables in the hidden states of recurrent neural networks.  ...  We argue that Stochastic WaveNet enjoys powerful distribution modeling capacity and the advantage of parallel training from dilated convolutions.  ...  ., 2016) and Z-forcing (Goyal et al., 2017) offer more powerful versions with augmented inference networks which better capture the correlation between the stochastic latent variables and the whole  ... 
arXiv:1806.06116v1 fatcat:4ijbqexxhbfi5fharoheqsmgau
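
The "parallel training from dilated convolutions" claim refers to WaveNet-style causal convolutions, which compute every timestep at once instead of unrolling a recurrence. A minimal sketch of such a stack follows; the channel count, kernel size, and depth are illustrative assumptions, not the paper's configuration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CausalConv1d(nn.Module):
    """1-D convolution that only sees past timesteps (left padding only)."""
    def __init__(self, channels, kernel_size, dilation):
        super().__init__()
        self.pad = (kernel_size - 1) * dilation
        self.conv = nn.Conv1d(channels, channels, kernel_size, dilation=dilation)

    def forward(self, x):                        # x: (batch, channels, time)
        return self.conv(F.pad(x, (self.pad, 0)))

# Exponentially growing dilation gives a large receptive field while every
# timestep is computed in parallel, with no sequential recurrence at training time.
stack = nn.ModuleList([CausalConv1d(16, kernel_size=2, dilation=2 ** i) for i in range(4)])

x = torch.randn(8, 16, 100)                      # toy batch: 8 sequences of length 100
for layer in stack:
    x = torch.relu(layer(x))                     # output keeps the original length 100
```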

Variational Bi-LSTMs [article]

Samira Shabanian, Devansh Arpit, Adam Trischler, Yoshua Bengio
2017 arXiv   pre-print
Recurrent neural networks like long short-term memory (LSTM) are important architectures for sequential prediction tasks. LSTMs (and RNNs in general) model sequences along the forward time direction.  ...  In the training of Bi-LSTMs, the forward and backward paths are learned independently.  ...  We note that recently proposed methods like TwinNet (Serdyuk et al., 2017) and Z-forcing are similar in spirit to this idea.  ... 
arXiv:1711.05717v1 fatcat:z7tj5zgupzf4zcvs2ykpgzggei

Re-examination of the Role of Latent Variables in Sequence Modeling [article]

Zihang Dai, Guokun Lai, Yiming Yang, Shinjae Yoo
2019 arXiv   pre-print
However, opposite results are also observed in other domains, where standard recurrent networks often outperform stochastic models.  ...  With latent variables, stochastic recurrent models have achieved state-of-the-art performance in modeling sound-wave sequences.  ...  For instance, the authors [12] report that an SRNN trained by Z-Forcing lags behind a baseline RNN in language modeling.  ...
arXiv:1902.01388v2 fatcat:fnn4yxzfijfhvch6vdk5of2iai

Semi-Implicit Stochastic Recurrent Neural Networks [article]

Ehsan Hajiramezanali, Arman Hasanzadeh, Nick Duffield, Krishna Narayanan, Mingyuan Zhou, Xiaoning Qian
2020 arXiv   pre-print
Stochastic recurrent neural networks with latent random variables of complex dependency structures have been shown to be more successful in modeling sequential data than deterministic deep models.  ...  A semi-implicit stochastic recurrent neural network (SIS-RNN) is developed to enrich inferred model posteriors that may have no analytic density functions, as long as independent random samples can be generated  ...  In this paper, we break the Gaussian assumption and propose a semi-implicit stochastic recurrent neural network (SIS-RNN) that is capable of inferring implicit posteriors for sequential data while maintaining  ...
arXiv:1910.12819v2 fatcat:xatf7zqqdzg3vkz3r4ki5mdglm
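
A bare-bones illustration of the semi-implicit idea in the excerpt: the posterior has no analytic density, yet independent samples are cheap to draw by pushing noise through a network to get mixing parameters and then sampling a conditional Gaussian. The names and sizes below (`mixer`, `noise_dim`, and so on) are invented for the sketch and are not the SIS-RNN architecture.

```python
import torch
import torch.nn as nn

noise_dim, z_dim = 16, 8

# Implicit mixing distribution: noise pushed through a network gives the mean
# of a conditional Gaussian; the resulting marginal over z has no closed form.
mixer = nn.Sequential(nn.Linear(noise_dim, 64), nn.ReLU(), nn.Linear(64, z_dim))
log_sigma = nn.Parameter(torch.zeros(z_dim))

def sample_semi_implicit(batch_size):
    eps = torch.randn(batch_size, noise_dim)    # injected noise
    mu = mixer(eps)                             # implicit mixing variable
    return mu + torch.exp(log_sigma) * torch.randn(batch_size, z_dim)

z = sample_semi_implicit(8)   # easy independent samples despite the intractable density
```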

STCN: Stochastic Temporal Convolutional Networks [article]

Emre Aksan, Otmar Hilliges
2019 arXiv   pre-print
Convolutional architectures have recently been shown to be competitive on many sequence modelling tasks when compared to the de-facto standard of recurrent neural networks (RNNs), while providing computational  ...  In this work, we propose stochastic temporal convolutional networks (STCNs), a novel architecture that combines the computational advantages of temporal convolutional networks (TCN) with the representational  ...  ., 2017) has demonstrated that temporal convolutional networks (TCNs) can also achieve at least competitive performance without relying on recurrence, and hence reducing the computational cost for training  ... 
arXiv:1902.06568v1 fatcat:lg6ljlejn5av5axbf7of5vhw7q

Deep Bayesian Natural Language Processing

Jen-Tzung Chien
2019 Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics: Tutorial Abstracts  
Z-forcing: Training stochastic recurrent networks. In Advances in Neural Information Processing Systems 30, pages 6713-6723.  ...  stochastic recurrent neural network, regularized recurrent neural network, stochastic learning & normalizing flows, VAE with VampPrior, skip recurrent neural network, temporal difference VAE, Markov recurrent  ...
doi:10.18653/v1/p19-4006 dblp:conf/acl/Chien19 fatcat:bj6qf6cpkffz3oxinswh5fy4ry

Learning Dynamics Model in Reinforcement Learning by Incorporating the Long Term Future [article]

Nan Rosemary Ke, Amanpreet Singh, Ahmed Touati, Anirudh Goyal, Yoshua Bengio, Devi Parikh, Dhruv Batra
2019 arXiv   pre-print
The latter distribution is modeled by a recurrent neural network with stochastic latent variables z_{1:T}. We train the model using variational inference.  ...  In particular, we make use of the so-called "Z-forcing" idea (Goyal et al., 2017): we consider training a conditional generative model p_ζ(b | z) of backward states b given the inferred latent variables  ...
arXiv:1903.01599v2 fatcat:qarnknfy5vcntl7v4b3suyw5ne

Modeling the Long Term Future in Model-Based Reinforcement Learning

Nan Rosemary Ke, Amanpreet Singh, Ahmed Touati, Anirudh Goyal, Yoshua Bengio, Devi Parikh, Dhruv Batra
2019 International Conference on Learning Representations  
The latter distribution is modeled by a recurrent neural network with stochastic latent variables z_{1:T}. We train the model using variational inference.  ...  In particular, we make use of the so-called "Z-forcing" idea (Goyal et al., 2017): max_ζ E_{q_θ(z|b,h)}[log p_ζ(b | z)]  (6). The loss above will act as a training regularizer that enforces the latent variables  ...  We took our model trained using imitation learning as in section 5.1.  ...
dblp:conf/iclr/KeSTGBPB19 fatcat:er4yijxcrjfwrb2u6pidefr42q

Competency Assessment for Autonomous Agents using Deep Generative Models [article]

Aastha Acharya, Rebecca Russell, Nisar R. Ahmed
2022 arXiv   pre-print
By combining the strengths of conditional variational autoencoders with recurrent neural networks, the deep generative world model can probabilistically forecast trajectories over long horizons to task  ...  They are: • Deterministic RNN: A deterministic recurrent neural network that is trained with the same depth and comparable number of parameters to the recurrent VAE model.  ...  One work has used z-forcing [12] techniques to force the latent space to contain information from the future states in order to predict long-term dynamics [24] .  ... 
arXiv:2203.12670v1 fatcat:33rwpysexrfxto24wzwkxutlde

A Hybrid Convolutional Variational Autoencoder for Text Generation

Stanislau Semeniuta, Aliaksei Severyn, Erhardt Barth
2017 Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing  
previously introduced VAE model for text where both the encoder and decoder are RNNs, we propose a novel hybrid architecture that blends fully feed-forward convolutional and deconvolutional components with a recurrent  ...  Specifically, using a sample from q(z|x) to reconstruct the input instead of a deterministic z forces the model to map an input to a region of the space rather than to a single point.  ...  Note that LSTM networks make use of Layer Normalization (Ba et al., 2016), which has been shown to make training of such networks easier.  ...
doi:10.18653/v1/d17-1066 dblp:conf/emnlp/SemeniutaSB17 fatcat:qzk3vol6m5egtpvykxwjp2mnuu
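
The excerpt's point, that reconstructing from a sample of q(z|x) rather than a deterministic code maps each input to a region of latent space, is the usual reparameterized sampling step. Below is a generic sketch under that reading, not the paper's hybrid architecture; the encoder and dimensions are stand-ins.

```python
import torch
import torch.nn as nn

x_dim, z_dim = 100, 20
enc = nn.Linear(x_dim, 2 * z_dim)   # stand-in encoder producing mean and log-variance

def sample_q_z_given_x(x):
    mu, logvar = enc(x).chunk(2, dim=-1)
    eps = torch.randn_like(mu)
    return mu + torch.exp(0.5 * logvar) * eps   # stochastic code, not a point estimate

x = torch.randn(4, x_dim)
z1 = sample_q_z_given_x(x)   # a different z on each call, so the decoder is trained
z2 = sample_q_z_given_x(x)   # on a whole region of latent space around mu
```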

Sequential Latent Spaces for Modeling the Intention During Diverse Image Captioning [article]

Jyoti Aneja, Harsh Agrawal, Dhruv Batra, Alexander Schwing
2019 arXiv   pre-print
[17] propose Z-forcing.  ...  Following the approach in [17], using the same LSTM for all networks leads to inferior results (Tab. 3, row 2, Z-forcing). (4) Using a constant Gaussian distribution per word.  ...
arXiv:1908.08529v1 fatcat:5bkvxkkiq5di3lichwwxlf4d7i

A Hybrid Convolutional Variational Autoencoder for Text Generation [article]

Stanislau Semeniuta, Aliaksei Severyn, Erhardt Barth
2017 arXiv   pre-print
exhibits several attractive properties such as faster run time and convergence, the ability to better handle long sequences and, more importantly, it helps to avoid some of the major difficulties posed by training  ...  previously introduced VAE model for text where both the encoder and decoder are RNNs, we propose a novel hybrid architecture that blends fully feed-forward convolutional and deconvolutional components with a recurrent  ...  Specifically, using a sample from q(z|x) to reconstruct the input instead of a deterministic z forces the model to map an input to a region of the space rather than to a single point.  ...
arXiv:1702.02390v1 fatcat:klxamzr4qjaudfs5anpmsa3nxy

Relational State-Space Model for Stochastic Multi-Object Systems [article]

Fan Yang, Ling Chen, Fan Zhou, Yusong Gao, Wei Cao
2020 arXiv   pre-print
Real-world dynamical systems often consist of multiple stochastic subsystems that interact with each other.  ...  This paper introduces the relational state-space model (R-SSM), a sequential hierarchical latent variable model that makes use of graph neural networks (GNNs) to simulate the joint state transitions of  ...  The idea of using auxiliary costs to train deep SSMs has been explored in Z-forcing (Goyal et al., 2017; Ke et al., 2019), which predicts the future summaries directly rather than contrastively.  ...
arXiv:2001.04050v1 fatcat:l6uqwkzwjraudm73zstu5mu4mu

Benchmarking Generative Latent Variable Models for Speech [article]

Jakob D. Havtorn, Lasse Borgholt, Søren Hauberg, Jes Frellsen, Lars Maaløe
2022 arXiv   pre-print
Stochastic latent variable models (LVMs) achieve state-of-the-art performance on natural image generation but are still inferior to deterministic models on speech.  ...  Stochastic recurrent neural network (SRNN) The SRNN (Fraccaro et al., 2016) is similar to the VRNN but differs by separating the stochastic latent variables from the deterministic representations (figure  ...  The FH-VAE, with disjoint latent variables and discriminative objective, Z-forcing, with an auxiliary task, and the VQ-VAE, with a quantized latent space and autoregressive prior fitted after training,  ... 
arXiv:2202.12707v2 fatcat:z7b6gzr4s5bdhcsukmuajyvkoy
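
As a loose sketch of the separation this excerpt describes, the step below keeps a deterministic recurrence d_t apart from stochastic latents z_t, with the prior over z_t conditioned on both d_t and z_{t-1}. The layer sizes, the single-step form, and the function names are assumptions for illustration, not the benchmarked SRNN.

```python
import torch
import torch.nn as nn

x_dim, d_dim, z_dim = 10, 32, 8

gru = nn.GRUCell(x_dim, d_dim)                    # deterministic path d_t
prior_net = nn.Linear(d_dim + z_dim, 2 * z_dim)   # parameters of p(z_t | z_{t-1}, d_t)

def srnn_step(x_t, d_prev, z_prev):
    d_t = gru(x_t, d_prev)                        # deterministic update first
    mu, logvar = prior_net(torch.cat([d_t, z_prev], dim=-1)).chunk(2, dim=-1)
    z_t = mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)   # stochastic update
    return d_t, z_t

d, z = torch.zeros(4, d_dim), torch.zeros(4, z_dim)
for _ in range(5):                                # unroll a toy sequence of length 5
    d, z = srnn_step(torch.randn(4, x_dim), d, z)
```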
Showing results 1 — 15 out of 44 results