Beyond backpropagation: implicit gradients for bilevel optimization [article]

Nicolas Zucchet, João Sacramento
2022 arXiv   pre-print
This paper reviews gradient-based techniques to solve bilevel optimization problems.  ...  Bilevel optimization is a general way to frame the learning of systems that are implicitly defined through a quantity that they minimize.  ...  We thank Benjamin Scellier, Johannes von Oswald and Simon Schug for their detailed comments on this manuscript.  ... 
arXiv:2205.03076v1 fatcat:rdv5cjiconcp3fn6ra5dhmkguq

Learning Sparsity-Promoting Regularizers using Bilevel Optimization [article]

Avrajit Ghosh, Michael T. McCann, Madeline Mitchell, Saiprasad Ravishankar
2022 arXiv   pre-print
Training involves solving a challenging bilevel optimization problem; we derive an expression for the gradient of the training loss using the closed-form solution of the denoising problem and provide  ...  an accompanying gradient descent algorithm to minimize it.  ...  We thank Jeffrey Fessler and Caroline Crockett, University of Michigan, for helpful discussions and their comments on this work.  ... 
arXiv:2207.08939v1 fatcat:tjuop5o2yjdyhiidjxkzjhveua

Bilevel Optimization: Convergence Analysis and Enhanced Design [article]

Kaiyi Ji, Junjie Yang, Yingbin Liang
2021 arXiv   pre-print
For deterministic bilevel optimization, we provide a comprehensive convergence rate analysis for two popular algorithms respectively based on approximate implicit differentiation (AID) and iterative differentiation  ...  Bilevel optimization has arisen as a powerful tool for many machine learning problems such as meta-learning, hyperparameter optimization, and reinforcement learning.  ...  ., 2017) as bilevel optimization, and proposed iMAML via implicit gradient.  ... 
arXiv:2010.07962v3 fatcat:yweiwflrl5am5f5yxqtq6oa4ty

Betty: An Automatic Differentiation Library for Multilevel Optimization [article]

Sang Keun Choe, Willie Neiswanger, Pengtao Xie, Eric Xing
2022 arXiv   pre-print
We take an initial step towards closing this gap by introducing Betty, a high-level software library for gradient-based multilevel optimization.  ...  Multilevel optimization has been widely adopted as a mathematical foundation for a myriad of machine learning problems, such as hyperparameter optimization, meta-learning, and reinforcement learning, to  ...  However, since it is specifically developed for bilevel meta-learning problems, extending it beyond two-level hierarchy is not straightforward.  ... 
arXiv:2207.02849v1 fatcat:rkdkasfml5fi5irzmrn25ydxxa

Bilevel methods for image reconstruction [article]

Caroline Crockett, Jeffrey A. Fessler
2021 arXiv   pre-print
This review discusses methods for learning parameters for image reconstruction problems using bilevel formulations.  ...  One can view the bilevel problem as formalizing hyperparameter optimization, as bridging machine learning and cost function based optimization methods, or as a method to learn variables best suited to  ...  The authors would like to thank Lindon Roberts for helpful email discussion of [95].  ... 
arXiv:2109.09610v1 fatcat:lfr5e2posbe43otwvgqjn5xhiq

Deep Inverse Optimization [article]

Yingcong Tan, Andrew Delong, Daria Terekhov
2018 arXiv   pre-print
We demonstrate that by backpropagating through the interior point algorithm we can learn the coefficients determining the cost vector and the constraints, independently or jointly, for both non-parametric  ...  Our method, called deep inverse optimization, is to unroll an iterative optimization process and then use backpropagation to learn parameters that generate the observations.  ...  The gradients for updating the coefficients of the optimization problem are derived through implicit differentiation.  ... 
arXiv:1812.00804v1 fatcat:5svkudvgyfeefcuzcmvhvtsqpu

Data Summarization via Bilevel Optimization [article]

Zalán Borsos, Mojmír Mutný, Marco Tagliasacchi, Andreas Krause
2021 arXiv   pre-print
Coresets are weighted subsets of the data that provide approximation guarantees for the optimization objective.  ...  In this work, we propose a generic coreset construction framework that formulates the coreset selection as a cardinality-constrained bilevel optimization problem.  ...  We use first-order methods based on implicit gradients for solving our proposed bilevel optimization problems due to their flexibility and scalability.  ... 
arXiv:2109.12534v1 fatcat:f5yewtrb3nehfcs2s5i6jxbfgy

FEDNEST: Federated Bilevel, Minimax, and Compositional Optimization [article]

Davoud Ataee Tarzanagh, Mingchen Li, Christos Thrampoulidis, Samet Oymak
2022 arXiv   pre-print
We establish provable convergence rates for FEDNEST in the presence of heterogeneous data and introduce variations for bilevel, minimax, and compositional optimization.  ...  However, many contemporary ML problems -- including adversarial robustness, hyperparameter tuning, and actor-critic -- fall under nested bilevel programming that subsumes minimax and compositional optimization  ...  ., 2019) gave a similar analysis for a truncated backpropagation approach. Non-asymptotic complexity analysis for bilevel optimization has also been explored.  ... 
arXiv:2205.02215v2 fatcat:vs6vcmftrnasbhpm2jyun7tuka

The least-control principle for learning at equilibrium [article]

Alexander Meulemans, Nicolas Zucchet, Seijin Kobayashi, Johannes von Oswald, João Sacramento
2022 arXiv   pre-print
Here, we present a new principle for learning such systems with a temporally- and spatially-local rule.  ...  In practice, our principle leads to strong performance matching that of leading gradient-based learning methods when applied to an array of problems involving recurrent neural networks and meta-learning  ...  Richards, Rafal Bogacz, and members of the Senn and Bogacz labs for discussions.  ... 
arXiv:2207.01332v1 fatcat:7ebqaqd6ubcfjpogppkwuufgri

Truncated Back-propagation for Bilevel Optimization [article]

Amirreza Shaban, Ching-An Cheng, Nathan Hatch, Byron Boots
2019 arXiv   pre-print
Bilevel optimization has been recently revisited for designing and analyzing algorithms in hyperparameter tuning and meta learning tasks.  ...  We find that optimization with the approximate gradient computed using few-step back-propagation often performs comparably to optimization with the exact gradient, while requiring far less memory and half  ...  Relationship with implicit differentiation: The gradient estimate h_{T-K} is related to implicit differentiation, which is a classical first-order approach to solving bilevel optimization problems [12,  ... 
arXiv:1810.10667v2 fatcat:2elvu3xlbnc2fmi3z56452gvra

Nonsmooth Implicit Differentiation for Machine Learning and Optimization [article]

Jérôme Bolte
2022 arXiv   pre-print
To show the sharpness of our assumptions, we present numerical experiments showcasing the extremely pathological gradient dynamics one can encounter when applying implicit algorithmic differentiation without  ...  We provide several applications such as training deep equilibrium networks, training neural nets with conic optimization layers, or hyperparameter-tuning for nonsmooth Lasso-type models.  ...  (c) The backpropagation can be seen as an oracle (in the optimization sense) for a conservative Jacobian.  ... 
arXiv:2106.04350v2 fatcat:mg6yewqf7ffztkwpzb3it2v3yy

Control-Oriented Model-Based Reinforcement Learning with Implicit Differentiation

Evgenii Nikishin, Romina Abachi, Rishabh Agarwal, Pierre-Luc Bacon
To alleviate this problem, we propose an end-to-end approach for model learning which directly optimizes the expected returns using implicit differentiation.  ...  We treat a value function that satisfies the Bellman optimality operator induced by the model as an implicit function of model parameters and show how to differentiate the function.  ...  for insightful discussions; Pierluca D'Oro, David Brandfonbrener, Valentin Thomas, and Timur Garipov for providing useful suggestions on the early draft of the paper; Compute Canada for computational  ... 
doi:10.1609/aaai.v36i7.20758 fatcat:wzlxxkpo25hkpisv2xotwkgfiy

Flexible Differentiable Optimization via Model Transformations [article]

Akshay Sharma, Mathieu Besançon, Joaquim Dias Garcia, Benoît Legat
2022 arXiv   pre-print
DiffOpt offers both forward and reverse differentiation modes, enabling multiple use cases from hyperparameter optimization to backpropagation and sensitivity analysis, bridging constrained optimization  ...  We introduce DiffOpt.jl, a Julia library to differentiate through the solution of convex optimization problems with respect to arbitrary parameters present in the objective and/or constraints.  ...  The authors would like to thank all contributors and users of the package for their feedback and improvements, and in particular Invenia Technical Computing for support, feedback on the API and documentation  ... 
arXiv:2206.06135v1 fatcat:2ouu7xgo4fdrbgx7s2bpefnbre

EvoGrad: Efficient Gradient-Based Meta-Learning and Hyperparameter Optimization [article]

Ondrej Bohdal, Yongxin Yang, Timothy Hospedales
2021 arXiv   pre-print
Gradient-based meta-learning and hyperparameter optimization have seen significant progress recently, enabling practical end-to-end training of neural networks together with many hyperparameters.  ...  EvoGrad estimates hypergradient with respect to hyperparameters without calculating second-order gradients, or storing a longer computational graph, leading to significant improvements in efficiency.  ...  Related work Gradient-based meta-learning solves a bilevel optimization problem where validation loss is optimized with respect to the meta-knowledge by backpropagating through the update of the model  ... 
arXiv:2106.10575v2 fatcat:tee6jdlbbbfwjph36lbe7rsele

Bilevel Optimization for Machine Learning: Algorithm Design and Convergence Analysis [article]

Kaiyi Ji
2021 arXiv   pre-print
For the first class, two popular types of gradient-based algorithms have been proposed for hypergradient estimation via approximate implicit differentiation (AID) and iterative differentiation (ITD).  ...  There are generally two classes of bilevel optimization formulations for machine learning: 1) problem-based bilevel optimization, whose inner-level problem is formulated as finding a minimizer of a given  ...  For example, [104] reformulated the model-agnostic meta-learning (MAML) [30] as problem-based bilevel optimization, and proposed iMAML via implicit gradient.  ... 
arXiv:2108.00330v1 fatcat:eczdcyzovrcoplozdo6zfdrth4