Impact of Parameter Sparsity on Stochastic Gradient MCMC Methods for Bayesian Deep Learning

Meet P. Vadera, Adam D. Cobb, Brian Jalaian, Benjamin M. Marlin
2022 arXiv preprint
Bayesian methods hold significant promise for improving the uncertainty quantification ability and robustness of deep neural network models. Recent research has investigated a number of approximate Bayesian inference methods for deep neural networks, building on both the variational Bayesian and Markov chain Monte Carlo (MCMC) frameworks. A fundamental issue with MCMC methods is that the improvements they enable are obtained at the expense of increased computation time and model storage costs. In this paper, we investigate the potential of sparse network structures to flexibly trade off model storage costs and inference run time against predictive performance and uncertainty quantification ability. We use stochastic gradient MCMC methods as the core Bayesian inference method and consider a variety of approaches for selecting sparse network structures. Surprisingly, our results show that certain classes of randomly selected substructures can perform as well as substructures derived from state-of-the-art iterative pruning methods while drastically reducing model training times.
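To make the setup concrete, the sketch below shows one way to restrict a stochastic gradient MCMC sampler to a fixed sparse substructure: a binary mask is chosen up front (here at random, mirroring the randomly selected substructures the abstract contrasts with pruning-derived ones) and re-applied after each stochastic gradient Langevin dynamics (SGLD) update. This is a minimal illustrative sketch under assumed conventions, not the paper's implementation; the function name `masked_sgld_step`, the mask construction, and the step size are all assumptions.

```python
import torch

def masked_sgld_step(params, masks, lr=1e-5):
    """One SGLD update restricted to a fixed sparse substructure.

    Assumes each p.grad already holds a minibatch estimate of the
    gradient of the negative log-posterior (an illustrative convention,
    not taken from the paper). masks are binary tensors: 1 = kept weight.
    """
    with torch.no_grad():
        for p, m in zip(params, masks):
            noise = torch.randn_like(p) * (2.0 * lr) ** 0.5  # Langevin noise, std = sqrt(2*lr)
            p.add_(-lr * p.grad + noise)  # gradient step plus injected Gaussian noise
            p.mul_(m)                     # keep pruned weights at exactly zero

# Hypothetical usage: random masks keeping ~10% of the weights, then one step.
model = torch.nn.Linear(784, 10)
masks = [(torch.rand_like(p) < 0.1).float() for p in model.parameters()]
with torch.no_grad():
    for p, m in zip(model.parameters(), masks):
        p.mul_(m)  # apply the mask to the initialization as well

x, y = torch.randn(32, 784), torch.randint(0, 10, (32,))
loss = torch.nn.functional.cross_entropy(model(x), y)  # stands in for the negative log-posterior
loss.backward()
masked_sgld_step(list(model.parameters()), masks)
```

Iterating this step yields a chain of sparse weight samples; predictions would then be averaged over samples collected along the chain, as is standard for SG-MCMC posterior predictive estimates.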
arXiv:2202.03770v1