1,260 Hits in 4.8 sec

Progress in Self-Certified Neural Networks [article]

Maria Perez-Ortiz, Omar Rivasplata, Emilio Parrado-Hernandez, Benjamin Guedj, John Shawe-Taylor
2021 arXiv   pre-print
We empirically compare (on 4 classification datasets) classical test set bounds for deterministic predictors and a PAC-Bayes bound for randomised self-certified predictors.  ...  In this context, learning and certification strategies based on PAC-Bayes bounds are especially attractive due to their ability to leverage all data to learn a posterior and simultaneously certify its  ...  Data-dependent PAC-Bayes priors We experiment with Gaussian PAC-Bayes priors Q_0 with a diagonal covariance matrix centered at (i) random weights (uninformed data-free priors) and (ii) learnt weights (  ... 
arXiv:2111.07737v3 fatcat:2ixoizo5pnemjc3jhzo4rthh3y
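The snippet above experiments with Gaussian PAC-Bayes priors Q_0 with diagonal covariance; the KL term between two such Gaussians, which dominates these bounds, has a closed form. A minimal sketch (function and variable names are illustrative, not from the paper):

```python
import numpy as np

def kl_diag_gaussians(mu_q, var_q, mu_p, var_p):
    """KL(Q || P) for Gaussians with diagonal covariances.

    Closed form per coordinate:
    0.5 * (log(var_p/var_q) + (var_q + (mu_q - mu_p)^2)/var_p - 1).
    """
    mu_q, var_q = np.asarray(mu_q, float), np.asarray(var_q, float)
    mu_p, var_p = np.asarray(mu_p, float), np.asarray(var_p, float)
    return 0.5 * np.sum(
        np.log(var_p / var_q) + (var_q + (mu_q - mu_p) ** 2) / var_p - 1.0
    )

# KL is zero when the posterior equals the prior, and grows as the
# learnt posterior mean drifts away from the prior mean.
print(kl_diag_gaussians([0.0, 0.0], [1.0, 1.0], [0.0, 0.0], [1.0, 1.0]))  # 0.0
```

Centering the prior at learnt weights (as in the paper's data-dependent priors) shrinks the mean-difference term, tightening the certificate.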

On the role of data in PAC-Bayes bounds [article]

Gintare Karolina Dziugaite, Kyle Hsu, Waseem Gharbieh, Gabriel Arpino, Daniel M. Roy
2020 arXiv   pre-print
For so-called linear PAC-Bayes risk bounds based on the empirical risk of a fixed posterior kernel, it is possible to minimize the expected value of the bound by choosing the prior to be the expected posterior  ...  The dominant term in PAC-Bayes bounds is often the Kullback–Leibler divergence between the posterior and prior.  ...  Finally, we evaluate minimizing a PAC-Bayes bound with our data-dependent priors as a learning algorithm.  ... 
arXiv:2006.10929v2 fatcat:2nkrcd66efao3fwcmgsamuqug4

Tighter risk certificates for neural networks [article]

María Pérez-Ortiz, Omar Rivasplata, John Shawe-Taylor, Csaba Szepesvári
2021 arXiv   pre-print
We present two training objectives, used here for the first time in connection with training neural networks. These two training objectives are derived from tight PAC-Bayes bounds.  ...  We also re-implement a previously used training objective based on a classical PAC-Bayes bound, to compare the properties of the predictors learned using the different training objectives.  ...  We propose, and experiment with, two new PBB training objectives: one derived from the PAC-Bayes-quadratic bound of Rivasplata et al. (2019), and one derived from the PAC-Bayes-lambda bound of Thiemann  ... 
arXiv:2007.12911v3 fatcat:efoankqx6vbwvdeh34k76mijdm

PAC-Bayes risk bounds for sample-compressed Gibbs classifiers

François Laviolette, Mario Marchand
2005 Proceedings of the 22nd international conference on Machine learning - ICML '05  
We extend the PAC-Bayes theorem to the sample-compression setting where each classifier is represented by two independent sources of information: a compression set which consists of a small subset of the  ...  The new PAC-Bayes theorem states that a Gibbs classifier defined on a posterior over sample-compressed classifiers can have a smaller risk bound than any such (deterministic) sample-compressed classifier  ...  This PAC-Bayes bound (see Theorem 1) depends both on the empirical risk (i.e., training errors) of the Gibbs classifier and on "how far" the data-dependent posterior Q is from the data-independent prior  ... 
doi:10.1145/1102351.1102412 dblp:conf/icml/LavioletteM05 fatcat:p46ngski3veelb34lo3666oipa

PAC-Bayes Analysis Beyond the Usual Bounds [article]

Omar Rivasplata, Ilja Kuzborskij, Csaba Szepesvari, John Shawe-Taylor
2020 arXiv   pre-print
In this setting the unknown quantity of interest is the expected risk of the data-dependent randomized predictor, for which upper bounds can be derived via a PAC-Bayes analysis, leading to PAC-Bayes bounds  ...  We present three bounds that illustrate the use of data-dependent priors, including one for the unbounded square loss.  ...  PAC-Bayes bounds with d-stable data-dependent priors Next we discuss an approach to convert any PAC-Bayes bound with a usual 'data-free' prior into a bound with a stable data-dependent prior, which is  ... 
arXiv:2006.13057v3 fatcat:e3abu75fhjeyrgow344obqjv7m

Meta-Learning by Adjusting Priors Based on Extended PAC-Bayes Theory [article]

Ron Amit, Ron Meir
2019 arXiv   pre-print
We present a framework for meta-learning that is based on generalization error bounds, allowing us to extend various PAC-Bayes bounds to meta-learning.  ...  Thus, prior knowledge is incorporated through setting an experience-dependent prior for novel tasks.  ...  ACKNOWLEDGMENTS We thank Asaf Cassel, Guy Tennenholtz, Baruch Epstein, Daniel Soudry, Elad Hoffer and Tom Zahavy for helpful discussions of this work, and the anonymous reviewers for their helpful comments  ... 
arXiv:1711.01244v8 fatcat:uvbegjw6erezjebbinjrrejzpe

Data Dependent Priors in PAC-Bayes Bounds [chapter]

John Shawe-Taylor, Emilio Parrado-Hernández, Amiran Ambroladze
2010 Proceedings of COMPSTAT'2010  
Prior learnt from a subset of patterns, in the direction w_r; posterior as in the PAC-Bayes bound; new bound proportional to KL(Q‖P)  ...  [slide-table fragment: per-dataset Bound and CE values, omitted]  ...  Prior-SVM: an optimisation problem determines the prior direction, and the SVM is then solved using only the remaining points; new bound proportional to ‖µw − ηw_r‖²  ... 
doi:10.1007/978-3-7908-2604-3_21 dblp:conf/compstat/Shawe-TaylorPA10 fatcat:wqc2hnlzfnbqtnwz2fld5npfqq

PAC-Bayes with Backprop [article]

Omar Rivasplata, Vikram M Tankasali, Csaba Szepesvari
2019 arXiv   pre-print
We present two training objectives, one derived from a previously known PAC-Bayes bound, and a second one derived from a novel PAC-Bayes bound.  ...  We explore the family of methods "PAC-Bayes with Backprop" (PBB) to train probabilistic neural networks by minimizing PAC-Bayes bounds.  ...  To clarify, PAC-Bayes bounds require the prior to be a 'data-free' (i.e. non data-dependent) distribution.  ... 
arXiv:1908.07380v5 fatcat:vtwkysp75na6bj5j26fb4o7iwu
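The PBB snippets above train probabilistic networks by minimizing a PAC-Bayes bound directly. A minimal sketch of one such bound usable as a training objective or risk certificate (the classical McAllester form, shown for illustration; it is not the exact quadratic or lambda objective from the paper):

```python
import math

def mcallester_bound(emp_risk, kl, n, delta=0.05):
    """Classical PAC-Bayes bound on the risk of the randomized predictor:

        L(Q) <= L_hat(Q) + sqrt((KL(Q||P) + ln(2*sqrt(n)/delta)) / (2n))

    emp_risk: empirical risk L_hat(Q), kl: KL(Q||P) between posterior
    and data-free prior, n: sample size, delta: confidence parameter.
    """
    complexity = (kl + math.log(2.0 * math.sqrt(n) / delta)) / (2.0 * n)
    return emp_risk + math.sqrt(complexity)

# The certificate tightens with more data and loosens as the posterior
# moves further from the prior (larger KL).
print(mcallester_bound(emp_risk=0.05, kl=100.0, n=60000))
```

Minimizing such a bound over the posterior's parameters (with a differentiable surrogate for the 0-1 empirical risk) is the essence of the "PAC-Bayes with Backprop" family.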

Entropy-SGD optimizes the prior of a PAC-Bayes bound: Generalization properties of Entropy-SGD and data-dependent priors [article]

Gintare Karolina Dziugaite, Daniel M. Roy
2019 arXiv   pre-print
In order to obtain a valid generalization bound, we rely on a result showing that data-dependent priors obtained by stochastic gradient Langevin dynamics (SGLD) yield valid PAC-Bayes bounds provided the  ...  Entropy-SGD works by optimizing the bound's prior, violating the hypothesis of the PAC-Bayes theorem that the prior is chosen independently of the data.  ...  Acknowledgments This research was carried out in part while the authors were visiting the Simons Institute for the Theory of Computing at UC Berkeley.  ... 
arXiv:1712.09376v3 fatcat:l3fssx5csbhedcrtl2ojaaznle

Generalization bounds for deep learning [article]

Guillermo Valle-Pérez, Ard A. Louis
2020 arXiv   pre-print
Finally, we comment on why this function-based bound performs significantly better than current parameter-based PAC-Bayes bounds.  ...  Extensive empirical analysis demonstrates that our marginal-likelihood PAC-Bayes bound fulfills desiderata 1-3 and 5.  ...  For PAC-Bayes we will consider stochastic learning algorithms.  ... 
arXiv:2012.04115v2 fatcat:i7xjvhjlhfdhrjgtsi27pngn4e

PAC Bayesian Performance Guarantees for Deep (Stochastic) Networks in Medical Imaging [article]

Anthony Sicilia, Xingchen Zhao, Anastasia Sosnovskikh, Seong Jae Hwang
2021 arXiv   pre-print
In this work, we explore recent advances using the PAC-Bayesian framework to provide bounds on generalization error for large (stochastic) networks.  ...  We observe the resultant bounds are competitive compared to a simpler baseline, while also being more explainable and alleviating the need for holdout sets.  ...  For PAC-Bayes bounds, the number of models sampled is either 1000 (in Fig. 1a) or 100 (in Fig. 1b,c,d). Data Splits for Self-Bounded Learning.  ... 
arXiv:2104.05600v2 fatcat:nmpd7gdclnhf5hvmjrxbljnrga

A Limitation of the PAC-Bayes Framework [article]

Roi Livni, Shay Moran
2021 arXiv   pre-print
In this manuscript we present a limitation for the PAC-Bayes framework. We demonstrate an easy learning task that is not amenable to a PAC-Bayes analysis.  ...  PAC-Bayes is a useful framework for deriving generalization bounds which was introduced by McAllester ('98).  ...  Acknowledgements The authors would like to acknowledge Steve Hanneke for suggesting and encouraging them to write this manuscript.  ... 
arXiv:2006.13508v3 fatcat:xcl2qs3wzvezfd5yolr3enl5ee

A Primer on PAC-Bayesian Learning [article]

Benjamin Guedj
2019 arXiv   pre-print
Generalised Bayesian learning algorithms are increasingly popular in machine learning, due to their PAC generalisation properties and flexibility.  ...  The present paper aims at providing a self-contained survey on the resulting PAC-Bayes framework and some of its main theoretical and algorithmic developments.  ...  The author warmly thanks Omar Rivasplata for his careful reading and suggestions.  ... 
arXiv:1901.05353v3 fatcat:vy73fwwanvfofbp3azhsrdq5v4
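The primer above surveys the main PAC-Bayes theorems; for orientation, one tight and widely used form is the Seeger–Maurer "kl" bound (stated here in common notation, which may differ from the paper's): with probability at least 1 − δ over an i.i.d. sample of size n, simultaneously for all posteriors Q,

```latex
\mathrm{kl}\!\left(\hat{L}(Q)\,\middle\|\,L(Q)\right)
  \le \frac{\mathrm{KL}(Q\|P) + \ln\frac{2\sqrt{n}}{\delta}}{n},
\qquad
\mathrm{kl}(q\|p) = q\ln\frac{q}{p} + (1-q)\ln\frac{1-q}{1-p},
```

where P is the data-free prior, L̂(Q) the empirical risk, and L(Q) the true risk of the randomised (Gibbs) predictor; inverting the binary KL in its second argument yields an upper bound on L(Q).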

PAC-Bayesian Policy Evaluation for Reinforcement Learning [article]

Mahdi Milani Fard, Joelle Pineau, Csaba Szepesvari
2012 arXiv   pre-print
This paper introduces the first PAC-Bayesian bound for the batch reinforcement learning problem with function approximation.  ...  PAC-Bayesian methods overcome this problem by providing bounds that hold regardless of the correctness of the prior distribution.  ...  A general PAC-Bayes bound We begin by first stating a general PAC-Bayes bound. In the next section, we use this result to derive our main bound for the approximation error in an RL setting.  ... 
arXiv:1202.3717v1 fatcat:eeqovahf3jfn3pq2gl4yizw47y

Data-dependent PAC-Bayes priors via differential privacy [article]

Gintare Karolina Dziugaite, Daniel M. Roy
2019 arXiv   pre-print
We show how an ε-differentially private data-dependent prior yields a valid PAC-Bayes bound, and then show how non-private mechanisms for choosing priors can also yield generalization bounds.  ...  The Probably Approximately Correct (PAC) Bayes framework (McAllester, 1999) can incorporate knowledge about the learning algorithm and (data) distribution through the use of distribution-dependent priors  ...  Acknowledgments The authors would like to thank Olivier Catoni, Pascal Germain, Mufan Li, David McAllester, Alexander Rakhlin, and John Shawe-Taylor for helpful discussions.  ... 
arXiv:1802.09583v2 fatcat:rgppt5chlrcflo7jpi2iizgp5y
Showing results 1 – 15 out of 1,260 results