### Online Model Selection Based on the Variational Bayes

Masa-aki Sato
2001 Neural Computation
The Bayesian framework provides a principled way of model selection. This framework estimates a probability distribution over an ensemble of models, and the prediction is done by averaging over the ensemble of models. Accordingly, the uncertainty of the models is taken into account, and complex models with more degrees of freedom are penalized. However, integration over model parameters is often intractable, and some approximation scheme is needed. Recently, a powerful approximation scheme,
called the variational Bayes (VB) method, has been proposed. This approach defines the free energy for a trial probability distribution, which approximates a joint posterior probability distribution over model parameters and hidden variables. The exact maximization of the free energy gives the true posterior distribution. The VB method uses factorized trial distributions. The integration over model parameters can be done analytically, and an iterative expectation-maximization-like algorithm, whose convergence is guaranteed, is derived. In this article, we derive an online version of the VB algorithm and prove its convergence by showing that it is a stochastic approximation for finding the maximum of the free energy. By combining sequential model selection procedures, the online VB method provides a fully online learning method with a model selection mechanism. In preliminary experiments using synthetic data, the online VB method was able to adapt the model structure to dynamic environments.

... stochastic variable. For a set of observed data, $X\{T\} = \{x(t) \mid t = 1, \ldots, T\}$, and a prior probability distribution $P(\theta_m \mid M_m)$, the Bayesian method calculates the posterior probability over the parameter,

$$P(\theta_m \mid X\{T\}, M_m) = \frac{P(X\{T\} \mid \theta_m, M_m)\, P(\theta_m \mid M_m)}{P(X\{T\} \mid M_m)}.$$

Here, the data likelihood is defined by