On the role of data in PAC-Bayes bounds
[article]
2020
arXiv
pre-print
For so-called linear PAC-Bayes risk bounds based on the empirical risk of a fixed posterior kernel, it is possible to minimize the expected value of the bound by choosing the prior to be the expected posterior ...
The dominant term in PAC-Bayes bounds is often the Kullback--Leibler divergence between the posterior and prior. ...
The use of a data-dependent prior and in particular this one based on a run of SGD on an initial segment of data is new. The paper studies minimizing high-probability PAC-Bayes bounds in expectation. ...
arXiv:2006.10929v2
fatcat:2nkrcd66efao3fwcmgsamuqug4
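As context for the "linear" bounds and the dominant KL term this abstract refers to, a standard McAllester-style PAC-Bayes bound (a sketch of the usual form, not necessarily the exact bound studied in this paper) reads:

```latex
% With probability at least 1 - \delta over an i.i.d. sample of size m,
% simultaneously for all posteriors \rho over hypotheses h:
\mathbb{E}_{h \sim \rho}[L(h)]
  \;\le\; \mathbb{E}_{h \sim \rho}[\hat{L}(h)]
  \;+\; \sqrt{\frac{\mathrm{KL}(\rho \,\|\, \pi) + \ln\frac{2\sqrt{m}}{\delta}}{2m}}
```

Here \pi is the prior; the abstract's observation is that choosing the prior close to the expected posterior shrinks the dominant KL(\rho || \pi) term.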
PAC-Bayes with Backprop
[article]
2019
arXiv
pre-print
We explore the family of methods "PAC-Bayes with Backprop" (PBB) to train probabilistic neural networks by minimizing PAC-Bayes bounds. ...
We present two training objectives, one derived from a previously known PAC-Bayes bound, and a second one derived from a novel PAC-Bayes bound. ...
We focus on the family of methods 'PAC-Bayes with Backprop' (PBB) which derives training objectives based on PAC-Bayes upper bounds on the risk. ...
arXiv:1908.07380v5
fatcat:vtwkysp75na6bj5j26fb4o7iwu
PAC-Bayes Analysis Beyond the Usual Bounds
[article]
2020
arXiv
pre-print
In this setting the unknown quantity of interest is the expected risk of the data-dependent randomized predictor, for which upper bounds can be derived via a PAC-Bayes analysis, leading to PAC-Bayes bounds ...
We clarify the role of the requirements of fixed 'data-free' priors, bounded losses, and i.i.d. data. ...
We emphasize that the "usual assumptions" on which PAC-Bayes bounds are based, namely, (a) data-free prior, (b) bounded loss, and (c) i.i.d. data, played a role only in the technique used for controlling ...
arXiv:2006.13057v3
fatcat:e3abu75fhjeyrgow344obqjv7m
Learning under Model Misspecification: Applications to Variational and Ensemble methods
2020
Neural Information Processing Systems
In this work, we present a novel analysis of the generalization performance of Bayesian model averaging under model misspecification and i.i.d. data using a new family of second-order PAC-Bayes bounds. ...
Using novel second-order PAC-Bayes bounds, we derive a new family of Bayesian-like algorithms, which can be implemented as variational and ensemble methods. ...
So, the findings of this work can be of help to develop more accurate and safer predictive models in machine learning, which could ease the adoption of this technology. ...
dblp:conf/nips/Masegosa20
fatcat:3otaaklalvgpdnd5whimm47dhe
A PAC-Bayesian Margin Bound for Linear Classifiers: Why SVMs work
2000
Neural Information Processing Systems
The result is obtained in a PAC-Bayesian framework and is based on geometrical arguments in the space of linear classifiers. ...
We present a bound on the generalisation error of linear classifiers in terms of a refined margin quantity on the training set. ...
Acknowledgements We would like to thank David McAllester, John Shawe-Taylor, Bob Williamson, Olivier Chapelle, John Langford, Alex Smola and Bernhard Schölkopf for interesting discussions and useful suggestions on ...
dblp:conf/nips/HerbrichG00
fatcat:rgc7ru73i5bfhi2zdmk6laaif4
Tighter risk certificates for neural networks
[article]
2021
arXiv
pre-print
We also re-implement a previously used training objective based on a classical PAC-Bayes bound, to compare the properties of the predictors learned using the different training objectives. ...
We compute risk certificates for the learnt predictors, based on part of the data used to learn the predictors. ...
We propose, and experiment with, two new PBB training objectives: one derived from the PAC-Bayes-quadratic bound of Rivasplata et al. (2019), and one derived from the PAC-Bayes-lambda bound of Thiemann ...
arXiv:2007.12911v3
fatcat:efoankqx6vbwvdeh34k76mijdm
PAC-Bayes-Empirical-Bernstein Inequality
2013
Neural Information Processing Systems
We present a PAC-Bayes-Empirical-Bernstein inequality. The inequality is based on a combination of the PAC-Bayesian bounding technique with an Empirical Bernstein bound. ...
The PAC-Bayes-Empirical-Bernstein inequality is an interesting example of an application of the PAC-Bayesian bounding technique to self-bounding functions. ...
Acknowledgments The authors are thankful to Anton Osokin for useful discussions and to the anonymous reviewers for their comments. ...
dblp:conf/nips/TolstikhinS13
fatcat:ca6dk3nbmvfsfhlsglpp56f2ky
A Refined MCMC Sampling from RKHS for PAC-Bayes Bound Calculation
2014
Journal of Computers
The experimental results on two artificial data sets show that the simulation is reasonable and effective in practice. ...
The PAC-Bayes risk bound, which integrates the Bayesian paradigm with structural risk minimization for stochastic classifiers, has been considered a framework for deriving some of the tightest generalization ...
PAC-Bayes Bound and Its Application on Linear Classifier: We recall the PAC-Bayes bound for the binary classification problems presented in [3]-[5]. ...
doi:10.4304/jcp.9.4.930-937
fatcat:lgf6ltbgk5ha7bmlbmeqe255ky
Entropy-SGD optimizes the prior of a PAC-Bayes bound: Generalization properties of Entropy-SGD and data-dependent priors
[article]
2019
arXiv
pre-print
Entropy-SGD works by optimizing the bound's prior, violating the hypothesis of the PAC-Bayes theorem that the prior is chosen independently of the data. ...
In order to obtain a valid generalization bound, we rely on a result showing that data-dependent priors obtained by stochastic gradient Langevin dynamics (SGLD) yield valid PAC-Bayes bounds provided the ...
Acknowledgments This research was carried out in part while the authors were visiting the Simons Institute for the Theory of Computing at UC Berkeley. ...
arXiv:1712.09376v3
fatcat:l3fssx5csbhedcrtl2ojaaznle
How Tight Can PAC-Bayes be in the Small Data Regime?
[article]
2022
arXiv
pre-print
We focus on the case of i.i.d. data with a bounded loss, and consider the generic PAC-Bayes theorem of Germain et al. ...
In this paper, we investigate the question: Given a small number of datapoints, for example N = 30, how tight can PAC-Bayes and test set bounds be made? ...
The choice of prior is crucial in PAC-Bayes, and the role of data-dependent priors (DDPs) has been gaining increased attention. ...
arXiv:2106.03542v4
fatcat:62bruslsmvebxb4sgovnh2ous4
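The tightness question above can be made numerically concrete; a minimal sketch (using a McAllester-style bound as a stand-in, not the Germain et al. theorem the paper actually analyzes, with illustrative values for the empirical risk and KL term):

```python
import math

def pac_bayes_bound(emp_risk, kl, m, delta=0.05):
    """McAllester-style PAC-Bayes upper bound on the expected risk:
    empirical risk plus a complexity term of order sqrt(KL / m)."""
    complexity = math.sqrt((kl + math.log(2 * math.sqrt(m) / delta)) / (2 * m))
    return emp_risk + complexity

# With only m = 30 datapoints, even a modest KL of 1 nat leaves a
# sizeable gap above an empirical risk of 0.1.
print(round(pac_bayes_bound(emp_risk=0.1, kl=1.0, m=30), 3))  # ~0.426
```

Driving the KL term down, e.g. via the data-dependent priors the abstract mentions, is what keeps such a bound non-vacuous at this sample size.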
Data-dependent PAC-Bayes priors via differential privacy
[article]
2019
arXiv
pre-print
The Probably Approximately Correct (PAC) Bayes framework (McAllester, 1999) can incorporate knowledge about the learning algorithm and (data) distribution through the use of distribution-dependent priors ...
As an application of this result, we show that a Gaussian prior mean chosen via stochastic gradient Langevin dynamics (SGLD; Welling and Teh, 2011) leads to a valid PAC-Bayes bound given control of the ...
This research was carried out in part while the authors were visiting the Simons Institute for the Theory of Computing at UC Berkeley. GKD was additionally supported by an EPSRC studentship. ...
arXiv:1802.09583v2
fatcat:rgppt5chlrcflo7jpi2iizgp5y
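The SGLD procedure referenced here (Welling and Teh, 2011) perturbs each gradient step with Gaussian noise, so the iterates sample from a posterior-like distribution rather than converge to a point. A minimal one-dimensional sketch on a quadratic loss, with illustrative step size:

```python
import math
import random

def sgld_step(w, grad, eps, rng):
    """One stochastic gradient Langevin dynamics update:
    half a gradient step plus N(0, eps) Gaussian noise."""
    return w - 0.5 * eps * grad + math.sqrt(eps) * rng.gauss(0.0, 1.0)

# Minimise U(w) = w^2 (grad = 2w): the iterates hover in a noisy
# cloud around the minimum at 0 instead of converging exactly.
rng = random.Random(0)
w = 5.0
for _ in range(2000):
    w = sgld_step(w, 2.0 * w, eps=0.01, rng=rng)
print(round(w, 3))
```

In the paper's setting it is the mean of a Gaussian prior that is chosen by such a run, with differential privacy controlling how much the prior depends on the data.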
Chromatic PAC-Bayes Bounds for Non-IID Data: Applications to Ranking and Stationary β-Mixing Processes
[article]
2010
arXiv
pre-print
In this work, we propose the first, to the best of our knowledge, PAC-Bayes generalization bounds for classifiers trained on data exhibiting interdependencies. ...
The approach undertaken to establish our results is based on the decomposition of a so-called dependency graph, which encodes the dependencies within the data, into sets of independent data, thanks to graph ...
Acknowledgment This work is partially supported by the IST Program of the EC, under the FP7 Pascal 2 Network of Excellence, ICT-216886-NOE. ...
arXiv:0909.1933v2
fatcat:jszzwhwlujflzem3awmlvxrbky
PAC-Bayes Analysis of Sentence Representation
[article]
2019
arXiv
pre-print
Moreover, we propose novel sentence vector learning algorithms on the basis of our PAC-Bayes analysis. ...
We analyze learning sentence vectors from a transfer learning perspective by using a PAC-Bayes bound that enables us to understand existing heuristics. ...
We also thank developers of scikit-learn, gensim, and PyTorch. KN was supported by JSPS KAKENHI Grant Number 18J20470. IS was supported by JSPS KAKENHI Grant Number 17H04693. ...
arXiv:1902.04247v2
fatcat:n7rzqtsvjze6vnkdkjczy6ojqe
Generalization bounds for deep learning
[article]
2020
arXiv
pre-print
We focus on generalization error upper bounds, and introduce a categorisation of bounds depending on assumptions on the algorithm and data. ...
This bound is, by one definition, optimal up to a multiplicative constant in the asymptotic limit of large training sets, as long as the learning curve follows a power law, which is typically found in ...
For the PAC-Bayes bound, in (a) the exponent is estimated from a linear fit to the log of the PAC-Bayes bound vs log m. ...
arXiv:2012.04115v2
fatcat:i7xjvhjlhfdhrjgtsi27pngn4e
User-friendly introduction to PAC-Bayes bounds
[article]
2021
arXiv
pre-print
Since the original PAC-Bayes bounds of D. ...
In statistical learning theory, there is a set of tools designed to understand the generalization ability of such procedures: PAC-Bayesian or PAC-Bayes bounds. ...
of the Approximate Bayesian Inference team at RIKEN AIP. ...
arXiv:2110.11216v4
fatcat:ck4oiea6c5gd7ejpst37zwgcuu
Showing results 1 — 15 out of 1,845 results