
Communication-Efficient Stochastic Gradient MCMC for Neural Networks

Chunyuan Li, Changyou Chen, Yunchen Pu, Ricardo Henao, Lawrence Carin
2019 Proceedings of the AAAI Conference on Artificial Intelligence  
Bayesian methods such as Stochastic Gradient Markov Chain Monte Carlo (SG-MCMC) offer an elegant framework to reason about model uncertainty in neural networks.  ...  We propose accelerating SG-MCMC under the master-worker framework: workers asynchronously and in parallel share responsibility for gradient computations, while the master collects the final samples.  ...  While the theory for the staleness of stochastic gradients in SG-MCMC has been developed recently, we focus on studying more efficient algorithms to reduce the communication cost.  ... 
doi:10.1609/aaai.v33i01.33014173 fatcat:kalywcqzmfelzalau3ekz6zaz4
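
To make the master-worker pattern in the entry above concrete, here is a minimal, illustrative sketch (not the authors' implementation): a toy Bayesian linear model where "workers" own data shards and supply stochastic gradients while the "master" applies SGLD updates and keeps the samples. The round-robin scheduling and all names below are simplifying assumptions; the paper's scheme is asynchronous.

```python
# Illustrative sketch only: master-side SGLD fed by per-worker stochastic gradients.
import numpy as np

rng = np.random.default_rng(0)

# Toy Bayesian linear model: y = X w + noise, with a N(0, I) prior on w.
X = rng.normal(size=(1000, 5))
w_true = rng.normal(size=5)
y = X @ w_true + 0.1 * rng.normal(size=1000)

def stochastic_grad(w, Xb, yb, n_total, noise_var=0.1**2):
    """Minibatch estimate of the gradient of the log posterior."""
    lik_grad = Xb.T @ (yb - Xb @ w) / noise_var
    return (n_total / len(yb)) * lik_grad - w   # rescaled likelihood grad + prior grad

n_workers, eps, n_iters = 4, 1e-5, 2000
shards = np.array_split(np.arange(len(y)), n_workers)   # one data shard per worker
w, samples = np.zeros(5), []

for t in range(n_iters):
    k = t % n_workers                                    # stand-in for an async worker reply
    idx = rng.choice(shards[k], size=32, replace=False)
    g = stochastic_grad(w, X[idx], y[idx], len(y))       # gradient computed by "worker" k
    w = w + 0.5 * eps * g + np.sqrt(eps) * rng.normal(size=5)   # master-side SGLD update
    samples.append(w.copy())

print("posterior mean estimate:", np.mean(samples[n_iters // 2:], axis=0))
```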

Bayesian graph convolutional neural networks via tempered MCMC

Rohitash Chandra, Ayush Bhagat, Manavendra Maharana, Pavel N. Krivitsky
2021 IEEE Access  
In past implementations of Langevin-gradients for Bayesian neural networks [36], [74], [75], stochastic gradient descent (SGD) was used with a user-chosen learning rate (i.e., a constant ν1).  ...  Recent work in this area, where Langevin MCMC methods have been used for neural networks, includes the use of parallel tempering MCMC for simple neural networks for pattern classification and time series prediction  ... 
doi:10.1109/access.2021.3111898 fatcat:kwwwa7vdcrgv3hm5ainkkmpiba
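
The Langevin-gradient proposal mentioned in this entry combines a gradient step with Gaussian noise and a Metropolis-Hastings correction. Below is a minimal sketch on a toy Gaussian target; `log_post` and `grad_log_post` are illustrative assumptions, not the paper's graph-convolutional model.

```python
# Minimal Langevin-gradient Metropolis-Hastings (MALA-style) sketch on a toy target.
import numpy as np

rng = np.random.default_rng(1)
prec = np.array([1.0, 0.25])                    # toy posterior: independent Gaussians

def log_post(w):
    return -0.5 * np.sum(prec * w**2)

def grad_log_post(w):
    return -prec * w

def log_q(w_to, w_from, eps):
    """Log density of the Langevin proposal N(w_from + (eps/2) grad, eps I)."""
    mean = w_from + 0.5 * eps * grad_log_post(w_from)
    return -0.5 * np.sum((w_to - mean) ** 2) / eps

eps, w, samples = 0.5, np.zeros(2), []
for _ in range(5000):
    prop = w + 0.5 * eps * grad_log_post(w) + np.sqrt(eps) * rng.normal(size=2)
    log_alpha = (log_post(prop) - log_post(w)
                 + log_q(w, prop, eps) - log_q(prop, w, eps))   # MH correction
    if np.log(rng.uniform()) < log_alpha:
        w = prop
    samples.append(w.copy())

print("sample variances (target: [1, 4]):", np.var(samples[1000:], axis=0).round(2))
```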

Bayesian graph convolutional neural networks via tempered MCMC [article]

Rohitash Chandra, Ayush Bhagat, Manavendra Maharana, Pavel N. Krivitsky
2021 arXiv   pre-print
Deep learning models, such as convolutional neural networks, have long been applied to image and multi-media tasks, particularly those with structured data.  ...  Graph convolutional neural networks have recently gained attention in the field of deep learning, taking advantage of graph-based data representation with automatic feature extraction via convolutions  ...  In past implementations of Langevin-gradients for Bayesian neural networks [36, 75, 74], stochastic gradient descent (SGD) was used (footnote URLs: https://pytorch.org/ and https://pytorch-geometric.readthedocs.io/en/latest)  ... 
arXiv:2104.08438v1 fatcat:ot3wt2mobzggxn3zusrpkxaj6u

Asynchronous Stochastic Gradient MCMC with Elastic Coupling [article]

Jost Tobias Springenberg, Aaron Klein, Stefan Falkner, Frank Hutter
2016 arXiv   pre-print
We consider parallel asynchronous Markov Chain Monte Carlo (MCMC) sampling for problems where we can leverage (stochastic) gradients to define continuous dynamics which explore the target distribution.  ...  We outline a solution strategy for this setting based on stochastic gradient Hamiltonian Monte Carlo sampling (SGHMC) which we alter to include an elastic coupling term that ties together multiple MCMC  ...  Figure 2: Comparison between different SGMCMC samplers for sampling from the posterior over neural network weights for a fully connected network on MNIST (left) and a residual network on CIFAR-10 (right)  ... 
arXiv:1612.00767v2 fatcat:s5qk5sq4grb7nj44exvmo72iki
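
As a rough illustration of the elastic-coupling idea in this entry, the sketch below runs several SGHMC-style chains whose momentum updates include an extra force pulling each chain toward a shared center variable. The coupling strength, friction, and toy Gaussian target are assumptions; this is not the authors' exact dynamics (which also analyze the bias the coupling introduces).

```python
# Schematic sketch: SGHMC chains with an elastic force toward a shared center variable.
import numpy as np

rng = np.random.default_rng(2)

def grad_U(theta):                       # gradient of the negative log posterior (toy target)
    return theta / np.array([1.0, 4.0])

n_chains, eta, alpha, rho = 4, 0.01, 0.1, 0.05
theta = rng.normal(size=(n_chains, 2))   # one parameter vector per chain
v = np.zeros((n_chains, 2))              # SGHMC momenta
center = theta.mean(axis=0)              # shared center variable
samples = []

for _ in range(5000):
    for i in range(n_chains):
        # Friction + gradient + noise (SGHMC), plus the elastic coupling force.
        v[i] = ((1 - alpha) * v[i]
                - eta * grad_U(theta[i])
                - eta * rho * (theta[i] - center)
                + np.sqrt(2 * alpha * eta) * rng.normal(size=2))
        theta[i] += v[i]
    center += rho * (theta.mean(axis=0) - center)    # center drifts toward the chain average
    samples.append(theta.mean(axis=0))

print("mean over chains and iterations:", np.mean(samples[1000:], axis=0).round(2))
```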

Distributed Bayesian Learning with Stochastic Natural-gradient Expectation Propagation and the Posterior Server [article]

Leonard Hasenclever, Stefan Webb, Thibaut Lienart, Sebastian Vollmer, Balaji Lakshminarayanan, Charles Blundell, Yee Whye Teh
2017 arXiv   pre-print
We demonstrate SNEP and the posterior server on distributed Bayesian learning of logistic regression and neural networks.  ...  Firstly, we propose stochastic natural gradient expectation propagation (SNEP), a novel alternative to expectation propagation (EP), a popular variational inference algorithm.  ...  Both black-box variational methods and stochastic gradient MCMC methods can be applied to neural networks yielding uncertainty estimates.  ... 
arXiv:1512.09327v4 fatcat:mt4d7wujqbcztpb5rp7zngfzs4

Surrogate-assisted parallel tempering for Bayesian neural learning [article]

Rohitash Chandra, Konark Jain, Arpit Kapoor, Ashray Aman
2020 arXiv   pre-print
complexity of large neural network models.  ...  However, certain challenges remain given large neural network models and big data.  ...  Dietmar Muller and Danial Azam for discussions and support during the course of this research project. We sincerely thank the editors and anonymous reviewers for their valuable comments.  ... 
arXiv:1811.08687v3 fatcat:yzsduvrojjaajihutzyrcnz5fy

WHAI: Weibull Hybrid Autoencoding Inference for Deep Topic Modeling [article]

Hao Zhang, Bo Chen, Dandan Guo, Mingyuan Zhou
2020 arXiv   pre-print
(WHAI) for deep latent Dirichlet allocation, which infers posterior samples via a hybrid of stochastic-gradient MCMC and autoencoding variational Bayes.  ...  deep neural network, and a stochastic-downward deep generative model based on a hierarchy of Weibull distributions.  ...  Algorithm: hybrid stochastic-gradient MCMC and autoencoding variational inference for WHAI. Set the mini-batch size m and the number of layers L; initialize the encoder parameters Ω and the model parameters {Φ^(l)}, l = 1, ..., L; for iter = 1, ...  ... 
arXiv:1803.01328v2 fatcat:6bvohpvnwjaibocikfjn32bu7e
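
WHAI's encoder uses Weibull distributions because a Weibull draw can be written as a deterministic, differentiable transform of a uniform variable, which is what makes autoencoding (reparameterized) inference possible. Below is a minimal sketch of that reparameterization; the parameter values are illustrative assumptions.

```python
# Weibull reparameterization sketch: x = lam * (-log(1 - u))**(1/k), u ~ Uniform(0, 1).
from math import gamma

import numpy as np

rng = np.random.default_rng(3)

def sample_weibull(k, lam, rng, size=None):
    """Reparameterized Weibull(k, lam) sample via the inverse CDF."""
    u = rng.uniform(size=size)
    return lam * (-np.log1p(-u)) ** (1.0 / k)

k, lam = 2.0, 1.5                     # shape and scale, e.g. outputs of an encoder network
x = sample_weibull(k, lam, rng, size=100_000)
# The Weibull mean is lam * Gamma(1 + 1/k); compare it with the sample mean.
print(x.mean().round(3), round(lam * gamma(1 + 1.0 / k), 3))
```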

Impact of Parameter Sparsity on Stochastic Gradient MCMC Methods for Bayesian Deep Learning [article]

Meet P. Vadera, Adam D. Cobb, Brian Jalaian, Benjamin M. Marlin
2022 arXiv   pre-print
We use stochastic gradient MCMC methods as the core Bayesian inference method and consider a variety of approaches for selecting sparse network structures.  ...  Recent research has seen the investigation of a number of approximate Bayesian inference methods for deep neural networks, building on both the variational Bayesian and Markov chain Monte Carlo (MCMC)  ...  Sparsity and Neural Networks: Identifying sparse neural network structures has an extended history in the machine learning community [26, 18], and a wide variety of methods for learning sparse neural network  ... 
arXiv:2202.03770v1 fatcat:6vv6oku6irdwrdasnt7wpwpage
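
A simple way to picture the setup studied in this entry is to run SGLD over a model in which a fixed binary mask keeps pruned weights at zero, so sampling happens only over the retained parameters. The sketch below does this for a toy sparse linear model; the mask is derived from the ground truth purely for illustration, standing in for an actual structure-selection method.

```python
# Illustrative sketch: SGLD restricted to a sparse structure via a fixed binary mask.
import numpy as np

rng = np.random.default_rng(4)

X = rng.normal(size=(2000, 20))
w_true = np.zeros(20)
w_true[:5] = rng.normal(size=5)                  # only five weights are active
y = X @ w_true + rng.normal(size=2000)

mask = (np.abs(w_true) > 0).astype(float)        # stand-in for a pruning / selection criterion

def stochastic_grad(w, Xb, yb, n_total):
    return (n_total / len(yb)) * Xb.T @ (yb - Xb @ w) - w   # unit noise, N(0, I) prior

eps, w, samples = 1e-4, np.zeros(20), []
for t in range(3000):
    idx = rng.choice(len(y), size=64, replace=False)
    g = stochastic_grad(w, X[idx], y[idx], len(y))
    w = w + 0.5 * eps * g + np.sqrt(eps) * rng.normal(size=20)
    w *= mask                                    # pruned weights stay exactly zero
    samples.append(w.copy())

print("posterior means of retained weights:", np.mean(samples[1500:], axis=0)[:5].round(2))
print("true active weights:                ", w_true[:5].round(2))
```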

Langevin-gradient parallel tempering for Bayesian neural learning [article]

Rohitash Chandra, Konark Jain, Ratneel V. Deo, Sally Cripps
2018 arXiv   pre-print
Second, we make within-chain sampling schemes more efficient by using Langevin gradient information in forming Metropolis-Hastings proposal distributions.  ...  Bayesian neural learning features a rigorous approach to estimation and uncertainty quantification via the posterior distribution of weights that represent knowledge of the neural network.  ...  Acknowledgements We would like to thank the Artemis high performance computing support at the University of Sydney and Arpit Kapoor for providing technical support.  ... 
arXiv:1811.04343v1 fatcat:lxgnqwhjurcb7o45m4mlneh7ku
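
The within-chain Langevin-gradient proposal in this entry follows the same pattern as the MALA sketch shown earlier; the other ingredient is the parallel-tempering swap between replicas at different temperatures. Below is a minimal sketch of the swap step, with plain random-walk moves within chains for brevity (a simplification of the paper's method).

```python
# Parallel tempering sketch: tempered within-chain MH moves plus neighbour swaps.
import numpy as np

rng = np.random.default_rng(5)

def log_post(w):                       # toy bimodal posterior
    return np.logaddexp(-0.5 * (w - 3.0) ** 2, -0.5 * (w + 3.0) ** 2)

betas = [1.0, 0.5, 0.2]                # inverse temperatures; beta = 1 is the target chain
chains = [0.0 for _ in betas]
samples = []

for t in range(20000):
    for i, beta in enumerate(betas):   # tempered within-chain MH update
        prop = chains[i] + rng.normal()
        if np.log(rng.uniform()) < beta * (log_post(prop) - log_post(chains[i])):
            chains[i] = prop
    if t % 10 == 0:                    # attempt a swap between neighbouring replicas
        i = int(rng.integers(len(betas) - 1))
        log_a = (betas[i] - betas[i + 1]) * (log_post(chains[i + 1]) - log_post(chains[i]))
        if np.log(rng.uniform()) < log_a:
            chains[i], chains[i + 1] = chains[i + 1], chains[i]
    samples.append(chains[0])

print("target-chain mean (about 0 for the symmetric mixture):", round(np.mean(samples[5000:]), 2))
```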

Multi-variance replica exchange stochastic gradient MCMC for inverse and forward Bayesian physics-informed neural network [article]

Guang Lin, Yating Wang, Zecheng Zhang
2021 arXiv   pre-print
The physics-informed neural network (PINN) has been successfully applied in solving a variety of nonlinear non-convex forward and inverse problems.  ...  In this work, we propose a multi-variance replica exchange stochastic gradient Langevin diffusion method to tackle the challenge of multiple local optima in the optimization and the challenge of the  ...  To combine simulated tempering with traditional MCMC methods, a replica stochastic gradient MCMC (reSG-MCMC) method was recently proposed [11].  ... 
arXiv:2107.06330v1 fatcat:ucaajjbxnfegvok37ev6nytkcq
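
A rough, assumption-laden sketch of the replica-exchange Langevin idea in this entry: two chains run at different temperatures (and hence different injected-noise variances), with occasional exchange attempts. The swap test below uses full-batch energies and omits the bias correction that replica-exchange SG-MCMC methods introduce to handle stochastic-gradient noise, so it is only schematic.

```python
# Schematic replica-exchange Langevin sampling with a cold and a hot chain.
import numpy as np

rng = np.random.default_rng(6)

def U(w):                               # toy double-well energy (negative log posterior)
    return 0.25 * (w ** 2 - 4.0) ** 2

def grad_U(w):
    return w * (w ** 2 - 4.0)

eps = 1e-3
taus = [1.0, 5.0]                       # low-temperature (exploit) and high-temperature (explore)
w = np.array([0.1, 0.1])
samples = []

for t in range(50000):
    for i, tau in enumerate(taus):      # Langevin step; the hotter chain gets larger noise
        w[i] += -eps * grad_U(w[i]) + np.sqrt(2 * eps * tau) * rng.normal()
    if t % 50 == 0:                     # occasional exchange attempt between the replicas
        log_a = (1.0 / taus[0] - 1.0 / taus[1]) * (U(w[0]) - U(w[1]))
        if np.log(rng.uniform()) < log_a:
            w[0], w[1] = w[1], w[0]
    samples.append(w[0])

print("low-temperature chain mean (about 0 for the symmetric double well):",
      round(np.mean(samples[10000:]), 2))
```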

Overcoming barriers to scalability in variational quantum Monte Carlo [article]

Tianchen Zhao, Saibal De, Brian Chen, James Stokes, Shravan Veerapaneni
2021 arXiv   pre-print
In particular, we demonstrate the GPU-scalability of VQMC for solving up to ten-thousand dimensional combinatorial optimization problems.  ...  VQMC overcomes the curse of dimensionality by performing alternating steps of Monte Carlo sampling from a parametrized quantum state followed by gradient-based optimization.  ...  Communication between the computing units is necessary only when we need to update the parameters of the neural network, e.g. during a stochastic gradient descent update.  ... 
arXiv:2106.13308v2 fatcat:upydum5npzcxnjjq6qhblzgf54
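
The communication pattern described in this entry, local sampling and evaluation on each device with cross-device communication only at the gradient-based parameter update, can be pictured with the following local simulation. There is no real multi-GPU communication here; the "all-reduce" is just an average over per-shard gradients.

```python
# Local simulation of data-parallel SGD: communication happens only at the update.
import numpy as np

rng = np.random.default_rng(7)

X = rng.normal(size=(4096, 10))
w_true = rng.normal(size=10)
y = X @ w_true + 0.1 * rng.normal(size=4096)

n_workers = 4
shards = np.array_split(np.arange(len(y)), n_workers)   # each "device" owns one shard
w, lr = np.zeros(10), 0.1

for step in range(500):
    local_grads = []
    for s in shards:                                     # purely local work, no communication
        Xs, ys = X[s], y[s]
        local_grads.append(-Xs.T @ (ys - Xs @ w) / len(s))
    g = np.mean(local_grads, axis=0)                     # all-reduce-style average (the only communication)
    w -= lr * g                                          # synchronized parameter update

print("parameter error after training:", round(np.linalg.norm(w - w_true), 4))
```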

Revisiting Bayesian autoencoders with MCMC

Rohitash Chandra, Mahir Jain, Manavendra Maharana, Pavel N. Krivitsky
2022 IEEE Access  
This paper presents Bayesian autoencoders powered by MCMC sampling, implemented using parallel computing and a Langevin-gradient proposal distribution.  ...  Bayesian inference via Markov Chain Monte Carlo (MCMC) sampling has faced several limitations for large models; however, recent advances in parallel computing and advanced proposal schemes have opened  ...  We note that the Langevin-gradient proposal distribution has been effective for novel Bayesian neural networks in relatively small and large neural network architectures of up to several thousand model parameters  ... 
doi:10.1109/access.2022.3163270 fatcat:eibediyxyzc45hn6hyssbllxcy

Revisiting Bayesian Autoencoders with MCMC [article]

Rohitash Chandra, Mahir Jain, Manavendra Maharana, Pavel N. Krivitsky
2021 arXiv   pre-print
In this paper, we present Bayesian autoencoders powered by MCMC sampling, implemented using parallel computing and a Langevin-gradient proposal scheme.  ...  Bayesian inference via MCMC methods has faced limitations, but recent advances with parallel computing and advanced proposal schemes that incorporate gradients have opened routes less travelled.  ...  We note that the Langevin-gradient proposal distribution has been effective for novel Bayesian neural networks in relatively small and large neural network architectures of up to several thousand model parameters  ... 
arXiv:2104.05915v1 fatcat:6gr6lxe2eja2ljsqrvyg3kpeii

Deep Autoencoding Topic Model with Scalable Hybrid Bayesian Inference [article]

Hao Zhang, Bo Chen, Yulai Cong, Dandan Guo, Hongwei Liu, Mingyuan Zhou
2020 arXiv   pre-print
In order to provide scalable posterior inference for the parameters of the generative network, we develop topic-layer-adaptive stochastic gradient Riemannian MCMC that jointly learns simplex-constrained  ...  variational encoder that deterministically propagates information upward via a deep neural network, followed by a Weibull-distribution-based stochastic downward generative model.  ...  Algorithm 1: hybrid stochastic-gradient MCMC and autoencoding variational inference, described in Section II-B.  ... 
arXiv:2006.08804v1 fatcat:px4gousafnehtf3w55tzeohweu

An Adaptive Empirical Bayesian Method for Sparse Deep Learning

Wei Deng, Xiao Zhang, Faming Liang, Guang Lin
2019 Advances in Neural Information Processing Systems  
The proposed method works by alternately sampling from an adaptive hierarchical posterior distribution using stochastic gradient Markov Chain Monte Carlo (MCMC) and smoothly optimizing the hyperparameters  ...  Empirical applications of the proposed method lead to the state-of-the-art performance on MNIST and Fashion MNIST with shallow convolutional neural networks (CNN) and the state-of-the-art compression performance  ...  Yunfan Li and the reviewers for their insightful comments.  ... 
pmid:33244209 pmcid:PMC7687285 fatcat:vxhhhiq32zd7tecr6iio4s5tme
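
The alternating scheme in this entry, sampling weights with SG-MCMC while smoothly optimizing hyperparameters, can be illustrated with a much-simplified sketch: SGLD over the weights of a toy linear model, interleaved with a smoothed stochastic-approximation update of a single Gaussian prior precision. The specific prior and update rule are assumptions, not the paper's adaptive hierarchical prior.

```python
# Simplified alternation of SGLD weight sampling and a prior-precision update.
import numpy as np

rng = np.random.default_rng(8)

X = rng.normal(size=(2000, 10))
w_true = rng.normal(scale=0.3, size=10)
y = X @ w_true + rng.normal(size=2000)

eps, rho = 1e-4, 0.01          # SGLD step size and hyperparameter step size
w = np.zeros(10)
lam = 1.0                      # prior precision: w ~ N(0, lam^-1 I)

for t in range(5000):
    idx = rng.choice(len(y), size=64, replace=False)
    Xb, yb = X[idx], y[idx]
    # SGLD step on the weights, conditioned on the current hyperparameter.
    g = (len(y) / len(yb)) * Xb.T @ (yb - Xb @ w) - lam * w
    w = w + 0.5 * eps * g + np.sqrt(eps) * rng.normal(size=10)
    # Smoothed empirical-Bayes update: move lam toward its conditional optimum
    # d / ||w||^2, the maximizer of the Gaussian prior density for the current sample.
    lam += rho * (len(w) / np.sum(w ** 2) - lam)

print("learned prior precision:", round(lam, 1),
      "  reference d/||w_true||^2:", round(len(w_true) / np.sum(w_true ** 2), 1))
```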
Showing results 1 — 15 out of 1,479 results