Pareto Invariant Risk Minimization
[article]
2022
arXiv
pre-print
To remedy the above issues, we reformulate IRM as a multi-objective optimization problem, and propose a new optimization scheme for IRM, called PAreto Invariant Risk Minimization (PAIR). ...
Despite the success of invariant risk minimization (IRM) in tackling the Out-of-Distribution generalization problem, IRM can compromise the optimality when applied in practice. ...
Introduction: There is surging evidence showing that machine learning models using empirical risk minimization (ERM) (Vapnik, 1991) are prone to exploiting shortcuts, or spurious features, and thus can ...
arXiv:2206.07766v1
fatcat:g3pjcyxn5ndn3lcywajhvs5joi
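Note: the abstract above describes recasting IRM as a multi-objective problem. Schematically (our notation, not the paper's; PAIR's actual preference-guided solver is more involved than a fixed weighting):

    % Multi-objective view of IRM (illustrative): jointly minimize the
    % empirical risk term and the invariance penalty, and seek
    % Pareto-optimal trade-offs rather than one fixed scalarization.
    \min_{\theta} \; \big(\, \mathcal{L}_{\mathrm{ERM}}(\theta),\; \mathcal{L}_{\mathrm{inv}}(\theta) \,\big)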
Sparse Invariant Risk Minimization
2022
International Conference on Machine Learning
In this paper, we propose a simple yet effective paradigm named Sparse Invariant Risk Minimization (SparseIRM) to address this contradiction. ...
Invariant Risk Minimization (IRM) is an emerging invariant feature extracting technique to help generalization with distributional shift. ...
(Ahuja et al., 2020a; Jin et al., 2020) provide new perspectives by introducing game theory and regret minimization into invariant risk minimization. ...
dblp:conf/icml/ZhouLZZ22
fatcat:w5nbjcgqefe6tdeligy6buf6cm
Invariant Risk Minimization
[article]
2020
arXiv
pre-print
We introduce Invariant Risk Minimization (IRM), a learning paradigm to estimate invariant correlations across multiple training distributions. ...
To achieve this goal, IRM learns a data representation such that the optimal classifier, on top of that data representation, matches for all training distributions. ...
This is the ubiquitous Empirical Risk Minimization (ERM) principle [50]. ...
arXiv:1907.02893v3
fatcat:recyacztqzgqjf3ln3gprlwade
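Note: as a concrete illustration of the idea quoted above (a representation whose optimal classifier is shared across training environments), here is a minimal sketch of the IRMv1 gradient-penalty surrogate introduced in this paper, assuming PyTorch, binary labels, and helper names of our own choosing:

    import torch
    import torch.nn.functional as F

    def irmv1_penalty(logits, y):
        # Squared gradient of the environment risk w.r.t. a frozen "dummy"
        # classifier w = 1.0 stacked on top of the representation; it
        # vanishes when w = 1.0 is already optimal for this environment.
        w = torch.tensor(1.0, requires_grad=True)
        risk = F.binary_cross_entropy_with_logits(logits * w, y)
        grad, = torch.autograd.grad(risk, [w], create_graph=True)
        return grad.pow(2)

    def irm_objective(model, envs, lam=1.0):
        # Sum of per-environment risks plus the weighted invariance penalty.
        total = 0.0
        for x, y in envs:  # one (inputs, labels) pair per training environment
            logits = model(x).squeeze(-1)
            total = total + F.binary_cross_entropy_with_logits(logits, y)
            total = total + lam * irmv1_penalty(logits, y)
        return total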
Empirical Risk Minimization
[chapter]
2016
Encyclopedia of Machine Learning and Data Mining
doi:10.1007/978-1-4899-7502-7_79-1
fatcat:2sgqbsowcvcpxadlvudemmf7xu
Average Stability is Invariant to Data Preconditioning. Implications to Exp-concave Empirical Risk Minimization
[article]
2017
arXiv
pre-print
Several important implications of our findings include: a) We demonstrate that the excess risk of empirical risk minimization (ERM) is controlled by the preconditioned stability rate. ...
that includes a regularization for analyzing the sample complexity of generalized linear models. ...
Acknowledgments We thank Iliya Tolstikhin for pointing out the alternative proof of Corollary 1 using local Rademacher complexities. ...
arXiv:1601.04011v4
fatcat:adut577725gilppaayglbrsbp4
Nonlinear Invariant Risk Minimization: A Causal Approach
[article]
2021
arXiv
pre-print
Prior work addressing this, either explicitly or implicitly, attempted to find a data representation that has an invariant relationship with the target. ...
This is done by leveraging a diverse set of training environments to reduce the effect of spurious features and build an invariant predictor. ...
Ahuja et al. (2020a) study the problem from the perspective of game theory, with an approach termed invariant risk minimization games (IRMG). ...
arXiv:2102.12353v5
fatcat:jqixnfnfnrbv7mh4jpsfrx44pq
mixup: Beyond Empirical Risk Minimization
[article]
2018
arXiv
pre-print
In this work, we propose mixup, a simple learning principle to alleviate these issues. In essence, mixup trains a neural network on convex combinations of pairs of examples and their labels. ...
Equation (1) is known as the Empirical Risk Minimization (ERM) principle (Vapnik, 1998). ...
To learn using VRM, we sample the vicinal distribution to construct a dataset $\mathcal{D}_\nu := \{(\tilde{x}_i, \tilde{y}_i)\}_{i=1}^{m}$, and minimize the empirical vicinal risk: $R_\nu(f) = \frac{1}{m} \sum_{i=1}^{m} \ell\big(f(\tilde{x}_i), \tilde{y}_i\big)$. ...
arXiv:1710.09412v2
fatcat:zpwavulzc5gbbea4g2xlgbutnu
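Note: a minimal NumPy sketch of the mixup rule quoted above, using one shared mixing coefficient per batch (as in the paper's reference implementation) and one-hot labels so they can be convexly combined; the function name is ours:

    import numpy as np

    def mixup_batch(x, y, alpha=0.2, rng=None):
        # x: (batch, ...) inputs; y: (batch, classes) one-hot labels.
        # Draw lam ~ Beta(alpha, alpha) and blend each example with a
        # randomly chosen partner from the same batch.
        rng = rng or np.random.default_rng()
        lam = rng.beta(alpha, alpha)
        perm = rng.permutation(len(x))
        x_mix = lam * x + (1.0 - lam) * x[perm]
        y_mix = lam * y + (1.0 - lam) * y[perm]
        return x_mix, y_mix

The mixed pairs then feed into any standard ERM training loop, which makes the method a drop-in, data-level change.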
Training genetic programming classifiers by vicinal-risk minimization
2014
Genetic Programming and Evolvable Machines
For a given training-set size, there is a trade-off between the empirical risk and the complexity of the discriminating function [4]. ...
We demonstrate that VRM has a number of attractive properties and demonstrate that it has a better correlation with generalization error compared to empirical risk minimization so is more likely to lead ...
error (either empirical or vicinal risk). ...
doi:10.1007/s10710-014-9222-4
fatcat:poubitheevfhxnbidae3h6q3z4
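Note: for reference, the vicinal risk contrasted with the empirical risk in the snippets above, written with the classical Gaussian vicinity functions of Chapelle et al.; the notation here is illustrative:

    % Vicinal risk with a Gaussian vicinity around each training point
    % (sigma is a smoothing parameter; sigma -> 0 recovers the empirical risk):
    R_\nu(f) = \frac{1}{m} \sum_{i=1}^{m} \int \ell\big(f(x), y_i\big)\,
               \mathcal{N}\big(x \mid x_i, \sigma^2 I\big)\, dx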
Robust 1-Bit Compressed Sensing via Hinge Loss Minimization
[article]
2018
arXiv
pre-print
While such a risk minimization strategy is very natural to learn binary output models, such as in classification, its capacity to estimate a specific signal vector is largely unexplored. ...
e.g., the square or logistic loss, which are at least locally strongly convex. ...
m, it is very convenient to work with a scaling invariant complexity parameter. ...
arXiv:1804.04846v2
fatcat:jgy74pfvebcopcmw7dd6g3qzce
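Note: in schematic form, the hinge-loss strategy discussed above minimizes the empirical hinge risk over a structured signal set; this rendering is our paraphrase of the setup, with K standing for the constraint set and y_i = sign⟨a_i, x⟩ the 1-bit measurements (constants and normalizations differ in the paper):

    % Hinge-loss empirical risk minimization over a structured set K (schematic):
    \hat{x} \in \arg\min_{x \in K} \; \frac{1}{m} \sum_{i=1}^{m}
             \max\{0,\; 1 - y_i \langle a_i, x \rangle\}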
Direct Gibbs posterior inference on risk minimizers: construction, concentration, and calibration
[article]
2022
arXiv
pre-print
Loss functions provide an alternative link, where the quantity of interest is defined, or at least could be defined, as a minimizer of the corresponding risk, or expected loss. ...
In this case, one can obtain what is commonly referred to as a Gibbs posterior distribution by using the empirical risk function directly. ...
The authors also thank the editors of this Handbook, Alastair Young in particular, for the invitation to make a contribution. ...
arXiv:2203.09381v2
fatcat:3p5v5weszjfkpcodivyacfydwi
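Note: the construction quoted above has a compact form: the Gibbs posterior tempers the prior by the exponentiated negative empirical risk. The notation below is the standard one, not copied from the paper; omega is the learning-rate parameter that the calibration step tunes:

    % Gibbs posterior built directly from the empirical risk R_n(theta):
    \pi_n(\theta) \;\propto\; \exp\{-\omega\, n\, R_n(\theta)\}\, \pi(\theta),
    \qquad R_n(\theta) = \frac{1}{n} \sum_{i=1}^{n} \ell(\theta; X_i)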
Optimal Sketching Bounds for Exp-concave Stochastic Minimization
[article]
2019
arXiv
pre-print
Our main computational result is a fast implementation of a sketch-to-precondition approach in the context of exp-concave empirical risk minimization. ...
In particular, our statistical analysis highlights a novel and natural relationship between algorithmic stability of empirical risk minimization and ridge leverage scores, which play significant role in ...
Furthermore, does achieving such bounds require incorporating sketching methodologies directly into the learning algorithm itself or can such bounds be proved for generic regularized empirical risk minimization ...
arXiv:1805.08268v7
fatcat:p4upor3ezbbvzgnccs2nlv6ciu
Minimizing Convex Functions with Integral Minimizers
[article]
2020
arXiv
pre-print
Given a separation oracle SO for a convex function f that has an integral minimizer inside a box with radius R, we show how to find an exact minimizer of f using at most (a) O(n(n + log R)) calls to SO and poly(n, log R) arithmetic operations, or (b) O(n log(nR)) calls to SO and exp(n) · poly(log R) arithmetic operations. ...
A special thanks to Daniel Dadush for pointing out the implication of the Grötschel-Lovász-Schrijver approach to our problem and many other insightful comments, and to Thomas Rothvoss for many helpful ...
arXiv:2007.01445v4
fatcat:3wcqykchlnfg5ikcgt6ujwfwu4
Domain Adaptation via Bregman divergence minimization
2021
Scientia Iranica. International Journal of Science and Technology
via domain invariant representation. ...
However, when the learning data (source domain) have a different distribution than the testing data (target domain), the FLDA-based models may not work well, and performance degrades dramatically ...
Marginal distribution adaptation: The existing dimensionality reduction methods obtain a linear combination of features that characterize or separate two or more classes of objects or events. ...
doi:10.24200/sci.2021.51486.2210
fatcat:xlwuk7kikffirkr5t6yesn7ylq
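Note: for context, the divergence being minimized above is the Bregman divergence generated by a strictly convex, differentiable function phi; domain-adaptation methods of this family minimize it between source and target feature distributions in the learned subspace:

    % Bregman divergence generated by phi (strictly convex, differentiable):
    D_\phi(p, q) = \phi(p) - \phi(q) - \langle \nabla\phi(q),\, p - q \rangle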
Reparameterized Variational Divergence Minimization for Stable Imitation
[article]
2020
arXiv
pre-print
We contribute a reparameterization trick for adversarial imitation learning to alleviate the optimization challenges of the promising f-divergence minimization framework. ...
We unfortunately find that f-divergence minimization through reinforcement learning is susceptible to numerical instabilities. ...
Ghasemipour, S. K. S., Zemel, R., and Gu, S. A divergence minimization perspective on imitation learning methods. arXiv preprint arXiv:1911.02256, 2019. ...
arXiv:2006.10810v1
fatcat:zobyenu2rbfqfbjlk22yhus74u
Statistical Inference for Bayesian Risk Minimization via Exponentially Tilted Empirical Likelihood
[article]
2021
arXiv
pre-print
We show that the Bayesian posterior obtained by combining this surrogate empirical likelihood and the prior is asymptotically close to a normal distribution centered at the empirical risk minimizer with ...
Our surrogate empirical likelihood is carefully constructed by using the first order optimality condition of the empirical risk minimization as the moment condition. ...
risks (i.e., $R_n(\hat{\theta}_S, 0)$, where $\hat{\theta}_S$ is the constrained empirical risk minimizer on model $S$) while at the same time not having large complexities. ...
arXiv:2109.07792v1
fatcat:tfl47krurvg5znsncy6ksoxg6u
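Note: the moment condition mentioned above is the first-order optimality condition of risk minimization; in standard notation (ours, not the paper's):

    % The risk minimizer theta* zeroes the expected gradient of the loss;
    % this estimating equation is the moment condition fed to the
    % exponentially tilted empirical likelihood.
    \mathbb{E}\big[\nabla_\theta\, \ell(\theta; X)\big]\big|_{\theta = \theta^\ast} = 0,
    \qquad \theta^\ast = \arg\min_\theta \mathbb{E}\,\ell(\theta; X)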
Showing results 1 — 15 out of 33,546 results