On the role of data in PAC-Bayes bounds [article]

Gintare Karolina Dziugaite, Kyle Hsu, Waseem Gharbieh, Gabriel Arpino, Daniel M. Roy
2020 arXiv   pre-print
For so-called linear PAC-Bayes risk bounds based on the empirical risk of a fixed posterior kernel, it is possible to minimize the expected value of the bound by choosing the prior to be the expected posterior  ...  The dominant term in PAC-Bayes bounds is often the Kullback-Leibler divergence between the posterior and prior.  ...  The use of a data-dependent prior and in particular this one based on a run of SGD on an initial segment of data is new. The paper studies minimizing high-probability PAC-Bayes bounds in expectation.  ... 
arXiv:2006.10929v2 fatcat:2nkrcd66efao3fwcmgsamuqug4

PAC-Bayes with Backprop [article]

Omar Rivasplata, Vikram M Tankasali, Csaba Szepesvari
2019 arXiv   pre-print
We explore the family of methods "PAC-Bayes with Backprop" (PBB) to train probabilistic neural networks by minimizing PAC-Bayes bounds.  ...  We present two training objectives, one derived from a previously known PAC-Bayes bound, and a second one derived from a novel PAC-Bayes bound.  ...  We focus on the family of methods 'PAC-Bayes with Backprop' (PBB) which derives training objectives based on PAC-Bayes upper bounds on the risk.  ... 
arXiv:1908.07380v5 fatcat:vtwkysp75na6bj5j26fb4o7iwu

PAC-Bayes Analysis Beyond the Usual Bounds [article]

Omar Rivasplata, Ilja Kuzborskij, Csaba Szepesvari, John Shawe-Taylor
2020 arXiv   pre-print
In this setting the unknown quantity of interest is the expected risk of the data-dependent randomized predictor, for which upper bounds can be derived via a PAC-Bayes analysis, leading to PAC-Bayes bounds  ...  We clarify the role of the requirements of fixed 'data-free' priors, bounded losses, and i.i.d. data.  ...  We emphasize that the "usual assumptions" on which PAC-Bayes bounds are based, namely, (a) data-free prior, (b) bounded loss, and (c) i.i.d. data, played a role only in the technique used for controlling  ... 
arXiv:2006.13057v3 fatcat:e3abu75fhjeyrgow344obqjv7m

Learning under Model Misspecification: Applications to Variational and Ensemble methods

Andrés R. Masegosa
2020 Neural Information Processing Systems  
In this work, we present a novel analysis of the generalization performance of Bayesian model averaging under model misspecification and i.i.d. data using a new family of second-order PAC-Bayes bounds.  ...  Using novel second-order PAC-Bayes bounds, we derive a new family of Bayesian-like algorithms, which can be implemented as variational and ensemble methods.  ...  So, the findings of this work can be of help to develop more accurate and safer predictive models in machine learning, which could ease the adoption of this technology.  ... 
dblp:conf/nips/Masegosa20 fatcat:3otaaklalvgpdnd5whimm47dhe

A PAC-Bayesian Margin Bound for Linear Classifiers: Why SVMs work

Ralf Herbrich, Thore Graepel
2000 Neural Information Processing Systems  
The result is obtained in a PAC-Bayesian framework and is based on geometrical arguments in the space of linear classifiers.  ...  We present a bound on the generalisation error of linear classifiers in terms of a refined margin quantity on the training set.  ...  Acknowledgements We would like to thank David McAllester, John Shawe-Taylor, Bob Williamson, Olivier Chapelle, John Langford, Alex Smola and Bernhard Schölkopf for interesting discussions and useful suggestions on  ... 
dblp:conf/nips/HerbrichG00 fatcat:rgc7ru73i5bfhi2zdmk6laaif4

Tighter risk certificates for neural networks [article]

María Pérez-Ortiz, Omar Rivasplata, John Shawe-Taylor, Csaba Szepesvári
2021 arXiv   pre-print
We also re-implement a previously used training objective based on a classical PAC-Bayes bound, to compare the properties of the predictors learned using the different training objectives.  ...  We compute risk certificates for the learnt predictors, based on part of the data used to learn the predictors.  ...  We propose, and experiment with, two new PBB training objectives: one derived from the PAC-Bayes-quadratic bound of Rivasplata et al. (2019), and one derived from the PAC-Bayes-lambda bound of Thiemann  ... 
arXiv:2007.12911v3 fatcat:efoankqx6vbwvdeh34k76mijdm

PAC-Bayes-Empirical-Bernstein Inequality

Ilya O. Tolstikhin, Yevgeny Seldin
2013 Neural Information Processing Systems  
We present a PAC-Bayes-Empirical-Bernstein inequality. The inequality is based on a combination of the PAC-Bayesian bounding technique with an Empirical Bernstein bound.  ...  The PAC-Bayes-Empirical-Bernstein inequality is an interesting example of an application of the PAC-Bayesian bounding technique to self-bounding functions.  ...  Acknowledgments The authors are thankful to Anton Osokin for useful discussions and to the anonymous reviewers for their comments.  ... 
dblp:conf/nips/TolstikhinS13 fatcat:ca6dk3nbmvfsfhlsglpp56f2ky

A Refined MCMC Sampling from RKHS for PAC-Bayes Bound Calculation

Li Tang, Zheng Zhao, Xiu-Jun Gong
2014 Journal of Computers  
The experimental results on two artificial data sets show that the simulation is reasonable and effective in practice.  ...  PAC-Bayes risk bound integrating theories of Bayesian paradigm and structural risk minimization for stochastic classifiers has been considered as a framework for deriving some of the tightest generalization  ...  PAC-Bayes Bound and Its Application on Linear Classifier We recall the PAC-Bayes bound for the binary classification problems presented in [3] [4] [5] .  ... 
doi:10.4304/jcp.9.4.930-937 fatcat:lgf6ltbgk5ha7bmlbmeqe255ky

Entropy-SGD optimizes the prior of a PAC-Bayes bound: Generalization properties of Entropy-SGD and data-dependent priors [article]

Gintare Karolina Dziugaite, Daniel M. Roy
2019 arXiv   pre-print
Entropy-SGD works by optimizing the bound's prior, violating the hypothesis of the PAC-Bayes theorem that the prior is chosen independently of the data.  ...  In order to obtain a valid generalization bound, we rely on a result showing that data-dependent priors obtained by stochastic gradient Langevin dynamics (SGLD) yield valid PAC-Bayes bounds provided the  ...  Acknowledgments This research was carried out in part while the authors were visiting the Simons Institute for the Theory of Computing at UC Berkeley.  ... 
arXiv:1712.09376v3 fatcat:l3fssx5csbhedcrtl2ojaaznle

How Tight Can PAC-Bayes be in the Small Data Regime? [article]

Andrew Y. K. Foong, Wessel P. Bruinsma, David R. Burt, Richard E. Turner
2022 arXiv   pre-print
We focus on the case of i.i.d. data with a bounded loss and consider the generic PAC-Bayes theorem of Germain et al.  ...  In this paper, we investigate the question: Given a small number of datapoints, for example N = 30, how tight can PAC-Bayes and test set bounds be made?  ...  The choice of prior is crucial in PAC-Bayes, and the role of data-dependent priors (DDPs) has been gaining increased attention.  ... 
arXiv:2106.03542v4 fatcat:62bruslsmvebxb4sgovnh2ous4

Data-dependent PAC-Bayes priors via differential privacy [article]

Gintare Karolina Dziugaite, Daniel M. Roy
2019 arXiv   pre-print
The Probably Approximately Correct (PAC) Bayes framework (McAllester, 1999) can incorporate knowledge about the learning algorithm and (data) distribution through the use of distribution-dependent priors  ...  As an application of this result, we show that a Gaussian prior mean chosen via stochastic gradient Langevin dynamics (SGLD; Welling and Teh, 2011) leads to a valid PAC-Bayes bound given control of the  ...  This research was carried out in part while the authors were visiting the Simons Institute for the Theory of Computing at UC Berkeley. GKD was additionally supported by an EPSRC studentship.  ... 
arXiv:1802.09583v2 fatcat:rgppt5chlrcflo7jpi2iizgp5y

Chromatic PAC-Bayes Bounds for Non-IID Data: Applications to Ranking and Stationary β-Mixing Processes [article]

Liva Ralaivola, Guillaume Stempfel
2010 arXiv   pre-print
In this work, we propose the first (to the best of our knowledge) PAC-Bayes generalization bounds for classifiers trained on data exhibiting interdependencies.  ...  The approach undertaken to establish our results is based on the decomposition of a so-called dependency graph that encodes the dependencies within the data, in sets of independent data, thanks to graph  ...  Acknowledgment This work is partially supported by the IST Program of the EC, under the FP7 Pascal 2 Network of Excellence, ICT-216886-NOE.  ... 
arXiv:0909.1933v2 fatcat:jszzwhwlujflzem3awmlvxrbky

PAC-Bayes Analysis of Sentence Representation [article]

Kento Nozawa, Issei Sato
2019 arXiv   pre-print
Moreover, we propose novel sentence vector learning algorithms on the basis of our PAC-Bayes analysis.  ...  We analyze learning sentence vectors from a transfer learning perspective by using a PAC-Bayes bound that enables us to understand existing heuristics.  ...  We also thank developers of scikit-learn, gensim, and PyTorch. KN was supported by JSPS KAKENHI Grant Number 18J20470. IS was supported by JSPS KAKENHI Grant Number 17H04693.  ... 
arXiv:1902.04247v2 fatcat:n7rzqtsvjze6vnkdkjczy6ojqe

Generalization bounds for deep learning [article]

Guillermo Valle-Pérez, Ard A. Louis
2020 arXiv   pre-print
We focus on generalization error upper bounds, and introduce a categorisation of bounds depending on assumptions on the algorithm and data.  ...  This bound is, by one definition, optimal up to a multiplicative constant in the asymptotic limit of large training sets, as long as the learning curve follows a power law, which is typically found in  ...  For the PAC-Bayes bound, in (a) the exponent is estimated from a linear fit to the log of the PAC-Bayes bound vs log m.  ... 
arXiv:2012.04115v2 fatcat:i7xjvhjlhfdhrjgtsi27pngn4e

User-friendly introduction to PAC-Bayes bounds [article]

Pierre Alquier
2021 arXiv   pre-print
Since the original PAC-Bayes bounds of D.  ...  In statistical learning theory, there is a set of tools designed to understand the generalization ability of such procedures: PAC-Bayesian or PAC-Bayes bounds.  ...  of the Approximate Bayesian Inference team at RIKEN AIP.  ... 
arXiv:2110.11216v4 fatcat:ck4oiea6c5gd7ejpst37zwgcuu