Learning A Minimax Optimizer: A Pilot Study

Jiayi Shen, Xiaohan Chen, Howard Heaton, Tianlong Chen, Jialin Liu, Wotao Yin, Zhangyang Wang
2021 International Conference on Learning Representations  
Solving continuous minimax optimization is of extensive practical interest, yet notoriously unstable and difficult.  ...  The decoupled design is found to facilitate learning, particularly when the min and max variables are highly asymmetric.  ...  ∼ U[0.9, 1] • Matrix Game: $\min_x \max_y x^\top A y$, $A \in \mathbb{R}^{5 \times 5}$, $A_{i,j} \sim \mathrm{Bernoulli}(0.5)$ • U[−1, 1] On all three problems, we compare Twin-L2O with several state-of-the-art algorithms: Gradient Descent Ascent  ... 
dblp:conf/iclr/ShenCHC0YW21 fatcat:fikryxn2unherjegjdvo4ffqb4
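
As a point of reference for the Gradient Descent Ascent baseline named in the snippet, here is a minimal sketch of plain GDA on the matrix game above (this is the baseline, not Twin-L2O itself; the step size and iteration count are illustrative):

```python
# Plain Gradient Descent Ascent on the bilinear matrix game min_x max_y x^T A y.
import numpy as np

rng = np.random.default_rng(0)
A = rng.binomial(1, 0.5, size=(5, 5)).astype(float)  # A_ij ~ Bernoulli(0.5)
x = rng.uniform(-1, 1, size=5)
y = rng.uniform(-1, 1, size=5)

lr = 0.05
for _ in range(1000):
    gx = A @ y          # gradient of x^T A y with respect to x
    gy = A.T @ x        # gradient with respect to y
    x, y = x - lr * gx, y + lr * gy  # descend in x, ascend in y

print("final |x^T A y| =", abs(x @ A @ y))
```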

Conservative Objective Models for Effective Offline Model-Based Optimization [article]

Brandon Trabucco, Aviral Kumar, Xinyang Geng, Sergey Levine
2021 arXiv   pre-print
In this paper, we aim to solve data-driven model-based optimization (MBO) problems, where the goal is to find a design input that maximizes an unknown objective function provided access to only a static  ...  Structurally, COMs resemble adversarial training methods used to overcome adversarial examples.  ...  actual function via supervised regression (with no conservative term like COMs) and then optimizes this learned proxy via gradient ascent.  ... 
arXiv:2107.06882v1 fatcat:xjysqud46jbczboei62usv6rdy
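
The snippet's naive baseline (fit a proxy of the unknown objective by supervised regression on the static dataset, then maximize the learned proxy by gradient ascent) can be sketched as follows; the quadratic-feature proxy and toy objective are stand-ins of mine, not the paper's setup:

```python
# Fit a proxy by least-squares regression, then gradient-ascend the proxy.
import numpy as np

rng = np.random.default_rng(1)
true_f = lambda x: -np.sum((x - 0.5) ** 2, axis=-1)  # hidden objective (toy stand-in)

# Static offline dataset of (design, score) pairs.
X = rng.uniform(-1, 1, size=(200, 2))
y = true_f(X)

# Quadratic-feature proxy: f_hat(x) = w0 + w1.x + w2.(x*x)
Phi = np.hstack([np.ones((200, 1)), X, X ** 2])
w = np.linalg.lstsq(Phi, y, rcond=None)[0]

def grad_proxy(x):
    # d/dx [w0 + w1.x + w2.x^2] = w1 + 2*w2*x, per coordinate
    return w[1:3] + 2 * w[3:5] * x

x = rng.uniform(-1, 1, size=2)
for _ in range(200):
    x += 0.05 * grad_proxy(x)  # ascend the learned proxy

print("candidate design:", x, "true score:", true_f(x))
```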

Natural evolution strategies and quantum approximate optimization [article]

Tianchen Zhao, Giuseppe Carleo, James Stokes, Shravan Veerapaneni
2020 arXiv   pre-print
Recent work of Gomes et al. [2019] on combinatorial optimization using neural quantum states is pedagogically reviewed in this context, emphasizing the connection with natural evolution strategies.  ...  In particular it is found that natural evolution strategies can achieve state-of-the-art approximation ratios for Max-Cut, at the expense of increased computation time.  ...  of the following variational upper bound, via Riemannian gradient descent in the geometry induced by the Fubini-Study metric: $\min_{x \in X} f(x) \le \min_{\theta \in \mathbb{R}^d} \mathbb{E}_{x \sim |\psi_\theta|^2}\left[ f(x) \right]$. (14) The choice to restrict  ... 
arXiv:2005.04447v1 fatcat:ssvtl2aqlbgjda44267t5luxia
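
A minimal classical sketch of the evolution-strategies side of bound (14): minimize $\mathbb{E}_{x \sim p_\theta}[f(x)]$ over a product-Bernoulli distribution using the plain score-function gradient. The papers use the natural gradient (Fisher-preconditioned) and a neural-quantum-state ansatz; the toy graph and hyperparameters here are mine:

```python
# Score-function (REINFORCE-style) minimization of E_{x~p_theta}[f(x)]
# for a toy Max-Cut objective, with a product-Bernoulli search distribution.
import numpy as np

rng = np.random.default_rng(2)
edges = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]          # toy graph
f = lambda x: -sum(x[i] != x[j] for i, j in edges)        # minus the cut size

theta = np.zeros(4)                                       # logits, p = sigmoid(theta)
lr = 0.1
for _ in range(500):
    p = 1 / (1 + np.exp(-theta))
    xs = (rng.random((64, 4)) < p).astype(float)          # 64 samples per step
    fs = np.array([f(x) for x in xs])
    fs -= fs.mean()                                       # variance-reduction baseline
    score = xs - p                                        # d/dtheta log p_theta(x)
    theta -= lr * (fs[:, None] * score).mean(axis=0)      # descend E[f]

best = (1 / (1 + np.exp(-theta)) > 0.5).astype(int)
print("cut size:", -f(best))
```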

On Solving Minimax Optimization Locally: A Follow-the-Ridge Approach [article]

Yuanhao Wang, Guodong Zhang, Jimmy Ba
2019 arXiv   pre-print
It is tempting to apply gradient descent to solve minimax optimization given its popularity and success in supervised learning.  ...  We show theoretically that the algorithm addresses the notorious rotational behaviour of gradient dynamics, and is compatible with preconditioning and positive momentum.  ...  In particular, we note that the concept of Nash equilibrium or local Nash does not reflect the order between the min-player and the max-player and may not exist even  ... 
arXiv:1910.07512v2 fatcat:5cmual4pnffxrl6dzzspkmbegm
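
One common statement of the Follow-the-Ridge correction is that the max-player adds a term $\eta_x H_{yy}^{-1} H_{yx} \nabla_x f$ so it tracks the ridge $y^*(x)$ as the min-player moves; below is a minimal sketch assuming that form of the update, on a toy quadratic where the Hessian blocks are constant (this is my reading of the method, not the authors' code):

```python
# Follow-the-Ridge-style update on f(x, y) = x*y - y^2/2, whose ridge is y*(x) = x.
# The Hessian correction keeps the max-player on the ridge, suppressing the
# rotation that plain gradient descent ascent exhibits.
import numpy as np

def grads(x, y):
    return y, x - y          # df/dx, df/dy

H_yy, H_yx = -1.0, 1.0        # constant Hessian blocks for this quadratic f

x, y, lr = 1.0, -0.5, 0.1
for _ in range(200):
    gx, gy = grads(x, y)
    correction = lr * (1.0 / H_yy) * H_yx * gx  # eta_x * H_yy^{-1} H_yx grad_x f
    x = x - lr * gx
    y = y + lr * gy + correction

print("x, y, ridge gap y - x:", x, y, y - x)
```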

Understanding adversarial training: Increasing local stability of supervised models through robust optimization

Uri Shaham, Yutaro Yamada, Sahand Negahban
2018 Neurocomputing  
We propose a general framework for increasing local stability of Artificial Neural Nets (ANNs) using Robust Optimization (RO).  ...  We show that adversarial training of ANNs is in fact robustification of the network optimization, and that our proposed framework generalizes previous approaches for increasing local stability of ANNs.  ...  The steepest ascent with respect to the $\ell_2$ ball coincides with the direction of the gradient $\nabla J_{\theta, y_i}(x_i)$.  ... 
doi:10.1016/j.neucom.2018.04.027 fatcat:myqq7cv77fbyrhuqbihbtl734a
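
The snippet's observation, that for an $\ell_2$ uncertainty set the steepest-ascent perturbation is the normalized loss gradient, in minimal form; logistic regression is used so $\nabla_x J$ is available in closed form, and the model and $\epsilon$ are illustrative, not from the paper:

```python
# One l2 steepest-ascent (fast-gradient-method-style) adversarial perturbation.
import numpy as np

rng = np.random.default_rng(3)
theta = rng.normal(size=5)

def loss_and_grad_x(theta, x, y):
    z = theta @ x
    p = 1 / (1 + np.exp(-z))
    loss = -(y * np.log(p) + (1 - y) * np.log(1 - p))
    return loss, (p - y) * theta   # d loss / d x for logistic regression

x, y, eps = rng.normal(size=5), 1.0, 0.1
_, gx = loss_and_grad_x(theta, x, y)
x_adv = x + eps * gx / np.linalg.norm(gx)   # steepest ascent on the l2 ball

print("clean loss:", loss_and_grad_x(theta, x, y)[0])
print("adv loss:  ", loss_and_grad_x(theta, x_adv, y)[0])
```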

NOVAS: Non-convex Optimization via Adaptive Stochastic Search for End-to-End Learning and Control [article]

Ioannis Exarchos and Marcus A. Pereira and Ziyi Wang and Evangelos A. Theodorou
2021 arXiv   pre-print
We study the proposed optimization module's properties and benchmark it against two existing alternatives on a synthetic energy-based structured prediction task, and further showcase its use in stochastic optimal control applications.  ...  Unrolling gradient descent approximates the arg min operator with a fixed number of gradient descent iterations during the forward pass and interprets these as an unrolled compute graph that can be differentiated  ... 
arXiv:2006.11992v3 fatcat:r4tgjunt7nhznbpncjocvvxmk4
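
A minimal sketch of the unrolling idea the snippet contrasts NOVAS against (NOVAS itself uses adaptive stochastic search): approximate $\arg\min$ by K gradient steps and differentiate through the iterations. The inner objective is a toy quadratic of mine, chosen so the exact answer is known:

```python
# Unrolled inner gradient descent as a differentiable approximation of argmin.
# For g(z; x) = 0.5*(z - x)^2 the exact argmin is z*(x) = x with dz*/dx = 1.
import numpy as np

def unrolled_argmin(x, z0=0.0, alpha=0.3, K=10):
    z, dz_dx = z0, 0.0
    for _ in range(K):
        g = z - x                                # grad_z g(z; x)
        z = z - alpha * g                        # one inner gradient step
        dz_dx = dz_dx - alpha * (dz_dx - 1.0)    # chain rule through the step
    return z, dz_dx

z, dz = unrolled_argmin(2.0)
print(f"z_K = {z:.4f} (exact 2), dz_K/dx = {dz:.4f} (exact 1)")
```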

Topology Attack and Defense for Graph Neural Networks: An Optimization Perspective [article]

Kaidi Xu, Hongge Chen, Sijia Liu, Pin-Yu Chen, Tsui-Wei Weng, Mingyi Hong, Xue Lin
2019 arXiv   pre-print
Moreover, leveraging our gradient-based attack, we propose the first optimization-based adversarial training for GNNs.  ...  In this paper, we first present a novel gradient-based attack method that addresses the difficulty of tackling discrete graph data.  ...  This yields two new topology attacks: projected gradient descent (PGD) topology attack and min-max topology attack.  ... 
arXiv:1906.04214v3 fatcat:rkawhcjq3ngnfa3hmh53uyixvi
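
A generic sketch of the PGD attack pattern described in the snippet: relax each discrete edge flip to $s \in [0, 1]$, ascend a surrogate attack loss, and project back to the budgeted box. The surrogate loss and the crude budget projection are simplifications of mine, not the paper's GNN loss or its exact projection step:

```python
# Projected gradient ascent on a relaxed edge-perturbation vector.
import numpy as np

rng = np.random.default_rng(4)
m, budget, lr = 20, 3.0, 0.1
w = rng.normal(size=m)                     # toy "attack gradient" direction

attack_loss = lambda s: w @ s              # stand-in for the GNN attack loss

s = np.zeros(m)
for _ in range(100):
    s = s + lr * w                         # gradient ascent on the surrogate
    s = np.clip(s, 0.0, 1.0)               # project to the box [0, 1]^m
    if s.sum() > budget:                   # crude budget projection (simplified)
        s *= budget / s.sum()

flips = np.argsort(-s)[: int(budget)]      # discretize: flip top-scoring edges
print("edges to flip:", flips, "relaxed loss:", attack_loss(s))
```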

Adversarial Regularization as Stackelberg Game: An Unrolled Optimization Approach [article]

Simiao Zuo, Chen Liang, Haoming Jiang, Xiaodong Liu, Pengcheng He, Jianfeng Gao, Weizhu Chen, Tuo Zhao
2022 arXiv   pre-print
Existing works usually formulate the method as a zero-sum game, which is solved by alternating gradient descent/ascent algorithms.  ...  The leader's strategic information is captured by the Stackelberg gradient, which is obtained using an unrolling algorithm.  ...  The min and max problems are solved using alternating gradient descent/ascent.  ... 
arXiv:2104.04886v3 fatcat:sld3r2tbrbctfcns4jhltrtf5e
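
A one-step-unrolling sketch of the Stackelberg idea in the snippet: the leader differentiates through the follower's gradient-ascent update, so its gradient picks up a correction term beyond plain alternating GDA. This single-step simplification and the toy objective are mine, not the paper's algorithm:

```python
# Leader-follower updates where the leader's gradient is taken through the
# follower's unrolled ascent step (d/dx of f(x, y_next(x))).
import numpy as np

# Toy zero-sum objective f(x, y) = x*y - 0.5*y**2.
fx  = lambda x, y: y            # df/dx
fy  = lambda x, y: x - y        # df/dy
fxy = 1.0                       # d2f/dydx, constant for this f

x, y, eta = 1.0, 0.0, 0.1
for _ in range(300):
    y_next = y + eta * fy(x, y)                  # follower: one ascent step
    # Stackelberg (unrolled) leader gradient: d/dx f(x, y_next(x))
    stackelberg_grad = fx(x, y_next) + eta * fxy * fy(x, y_next)
    x = x - eta * stackelberg_grad
    y = y_next

print("x, y:", x, y)
```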

Understanding Overparameterization in Generative Adversarial Networks [article]

Yogesh Balaji, Mohammadmahdi Sajedi, Neha Mukund Kalibhat, Mucong Ding, Dominik Stöger, Mahdi Soltanolkotabi, Soheil Feizi
2021 arXiv   pre-print
In contrast, the unsupervised setting and GANs in particular involve non-convex concave mini-max optimization problems that are often trained using Gradient Descent/Ascent (GDA).  ...  The common approach to solve the above optimization problem is to run a Gradient Descent Ascent (GDA) algorithm.  ... 
arXiv:2104.05605v1 fatcat:gmmytxvfy5g2lckcjessarzki4

Minnorm training: an algorithm for training over-parameterized deep neural networks [article]

Yamini Bansal, Madhu Advani, David D Cox, Andrew M Saxe
2018 arXiv   pre-print
In particular, we show faster convergence to the max-margin hyperplane in a shallow network (compared to vanilla gradient descent); faster convergence to the minimum-norm solution in a linear chain (compared  ...  To solve this constrained optimization problem, our method employs Lagrange multipliers that act as integrators of error over training and identify 'support vector'-like examples.  ...  The Minnorm algorithm with minibatch gradient descent performs best.  ... 
arXiv:1806.00730v2 fatcat:ftis25smhrbnzhhlpxyrshlduu
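
The "multipliers as error integrators" mechanism can be sketched with a bare primal-dual loop on a linear max-margin toy (not the paper's deep-network training): each $\lambda_i$ accumulates example $i$'s constraint violation over training, and the examples that end with $\lambda_i > 0$ are the "support vector"-like ones:

```python
# Primal-dual updates for min ||w||^2 subject to margin constraints y_i w.x_i >= 1.
import numpy as np

rng = np.random.default_rng(5)
X = np.vstack([rng.normal(2, 1, (20, 2)), rng.normal(-2, 1, (20, 2))])
y = np.array([1.0] * 20 + [-1.0] * 20)

w, lam, eta = np.zeros(2), np.zeros(40), 0.01
for _ in range(5000):
    margins = y * (X @ w)
    w -= eta * (2 * w - (lam * y) @ X)                # primal descent on the Lagrangian
    lam = np.maximum(0.0, lam + eta * (1 - margins))  # dual ascent: integrate violations

print("support-vector-like examples:", np.where(lam > 1e-6)[0])
print("min margin:", (y * (X @ w)).min())
```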

Submodular Mini-Batch Training in Generative Moment Matching Networks [article]

Jun Qi
2017 arXiv   pre-print
Thus, the problem is reformulated as a mini-max optimization problem with a cardinality constraint, as shown in (6), where C refers to the maximum number of elements in the subset A: $\min_w \max_{A \subseteq V}$  ...  Deep generative models (DGMs) [14] characterize the distribution of observations with a deeper structure of hidden variables under non-linear transformations.  ... 
arXiv:1707.05721v3 fatcat:qmsbivrunzgihntwf7oba2w6bm
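
For the inner maximization over $A \subseteq V$ with $|A| \le C$, the standard routine is greedy selection by marginal gain, which carries the classic $(1 - 1/e)$ guarantee for monotone submodular $F$; here is a sketch with a toy coverage function of mine (the paper's mini-batch objective and the outer min over $w$ are not reproduced):

```python
# Greedy maximization of a monotone submodular function under |A| <= C.
def greedy_subset(V, F, C):
    """Pick at most C elements of V greedily by marginal gain of F."""
    A = set()
    for _ in range(C):
        best = max((v for v in V if v not in A),
                   key=lambda v: F(A | {v}) - F(A), default=None)
        if best is None or F(A | {best}) - F(A) <= 0:
            break
        A.add(best)
    return A

# Toy coverage function: each element covers a set of items; F(A) = |union covered|.
cover = {0: {1, 2}, 1: {2, 3}, 2: {4}, 3: {1, 2, 3}}
F = lambda A: len(set().union(*(cover[v] for v in A))) if A else 0
print(greedy_subset(set(cover), F, C=2))   # e.g. {3, 2}
```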

Contrastive Similarity Matching for Supervised Learning [article]

Shanshan Qin, Nayantara Mudur, Cengiz Pehlevan
2020 arXiv   pre-print
Contrastive similarity matching can be interpreted as an energy-based learning algorithm, but with significant differences from others in how a contrastive function is constructed.  ...  We formulate this idea using a contrastive similarity matching objective function and derive from it deep neural networks with feedforward, lateral, and feedback connections, and neurons that exhibit biologically-plausible  ...  In the second part of the algorithm, we update the synaptic weights by a gradient descent-ascent on (16) with $y_t$ fixed.  ... 
arXiv:2002.10378v5 fatcat:vi7p4mejdjc23onelasln2zffm

Adversarial Graph Perturbations for Recommendations at Scale

Huiyuan Chen, Kaixiong Zhou, Kwei-Herng Lai, Xia Hu, Fei Wang, Hao Yang
2022 Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval  
Our AdvGraph is mainly based on min-max robust optimization, where a universal graph perturbation is obtained through an inner maximization while the outer optimization aims to compute the model parameters  ...  For example, many prior studies inject adversarial perturbations into either node features or hidden layers of GNNs. However, perturbing graph structures has been far less studied in recommendations.  ...  The idea of AdvGraph is to compute graph universal perturbations and model parameters by solving a min-max optimization.  ... 
doi:10.1145/3477495.3531763 fatcat:hhkyne2ok5br7n5ajal5h76fbi
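
A generic sketch of the universal-perturbation min-max loop the snippet describes, on a toy linear model rather than a GNN over graph structure: one shared delta is updated by inner ascent across all examples, and the outer step fits the parameters against that worst-case delta (all names and constants here are illustrative):

```python
# Alternating inner max over a shared (universal) perturbation and outer min
# over model parameters, with a logistic loss on a toy linear model.
import numpy as np

rng = np.random.default_rng(6)
X = rng.normal(size=(100, 5))
y = np.sign(X @ np.array([1., -1., 0.5, 0., 2.]))

w = np.zeros(5)
delta = np.zeros(5)                      # one *universal* perturbation for all inputs
eta, eps = 0.05, 0.3

def grad_w(w, X, y):                     # logistic-loss gradient for the toy model
    p = 1 / (1 + np.exp(-(X @ w)))
    return X.T @ (p - (y + 1) / 2) / len(y)

for _ in range(200):
    # Inner maximization: ascend the loss w.r.t. the shared delta, then project.
    p = 1 / (1 + np.exp(-((X + delta) @ w)))
    g_delta = w * np.mean(p - (y + 1) / 2)       # d loss / d delta, shared across rows
    delta = np.clip(delta + eta * g_delta, -eps, eps)
    # Outer minimization: update parameters on the perturbed inputs.
    w -= eta * grad_w(w, X + delta, y)

print("universal delta:", delta)
```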

Adversarial Deep Learning for Robust Detection of Binary Encoded Malware

Abdullah Al-Dujaili, Alex Huang, Erik Hemberg, Una-May O'Reilly
2018 2018 IEEE Security and Privacy Workshops (SPW)  
Continuous-valued methods that are robust to adversarial examples of images have been developed using saddle-point optimization formulations.  ...  Thus, our adversarial learning composes (1) and (2) as: $\theta^* \in \arg\min_{\theta \in \mathbb{R}^p} \, \mathbb{E}_{(x,y) \sim D} \big[ \max_{\bar{x} \in S(x)} L(\theta, \bar{x}, y) \big]$, (3) where the inner max is the adversarial loss and the outer min is the adversarial learning. Solving (3) involves an inner non-concave maximization  ...  $\max\{L(\theta, x^*, 1) \mid x^* \in \{x^k, x\}\}$; Multi-Step Bit Gradient Ascent (BGA$_k$): $x_j^{t+1} = x_j^t \oplus \mathbf{1}\big[(1 - 2x_j^t)\, \partial_{x_j^t} L \ge \tfrac{1}{\sqrt{m}} \, \lVert \nabla_x L(\theta, x^t, y) \rVert_2 \big] \vee x_j$, $0 \le j < m$, $0 \le t < k$; $x^{\mathrm{adv}} \in \arg\max\{L(\theta$  ... 
doi:10.1109/spw.2018.00020 dblp:conf/sp/Al-DujailiHHO18 fatcat:wf6ptin53bcyrkfwkxgjzur7cy
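
A sketch of the multi-step Bit Gradient Ascent (BGA$_k$) update as reconstructed above, with a toy differentiable loss standing in for the malware detector: flip bit $j$ when its ascent component $(1 - 2x_j)\,\partial_{x_j} L$ clears the $\tfrac{1}{\sqrt{m}}\lVert\nabla_x L\rVert_2$ threshold, then OR with the original bits so no feature present in the real binary is removed:

```python
# Multi-step Bit Gradient Ascent on binary feature vectors.
import numpy as np

rng = np.random.default_rng(7)
m, k = 16, 5
w = rng.normal(size=m)
grad_L = lambda x: w                      # toy loss L = w.x, so the gradient is constant

x0 = (rng.random(m) < 0.3).astype(int)    # original binary's feature bits
x = x0.copy()
for _ in range(k):
    g = grad_L(x)
    thresh = np.linalg.norm(g) / np.sqrt(m)
    flip = ((1 - 2 * x) * g >= thresh).astype(int)
    x = (x ^ flip) | x0                   # flip qualifying bits, keep original bits set

print("added bits:", np.where((x == 1) & (x0 == 0))[0])
```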

Riemannian Hamiltonian methods for min-max optimization on manifolds [article]

Andi Han, Bamdev Mishra, Pratik Jawanpuria, Pawan Kumar, Junbin Gao
2022 arXiv   pre-print
In this paper, we study the min-max optimization problems on Riemannian manifolds.  ...  We introduce a Riemannian Hamiltonian function, minimization of which serves as a proxy for solving the original min-max problems.  ...  In this section, we provide the details of the Riemannian gradient descent ascent (RGDA) [26] and Riemannian corrected extra-gradient (RCEG) [79] algorithms for min-max optimization on  ... 
arXiv:2204.11418v1 fatcat:l75m6per3zffznxearsacz6oxm
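
The Hamiltonian-as-proxy idea in its Euclidean special case (the paper's setting is Riemannian; this flattening is mine): minimize $\mathcal{H}(x, y) = \tfrac{1}{2}\lVert\nabla f(x, y)\rVert^2$ by gradient descent. For the bilinear $f(x, y) = xy$, where plain GDA orbits the solution, descending $\mathcal{H}$ goes straight to the stationary point:

```python
# Hamiltonian gradient descent for f(x, y) = x*y.
# grad_x f = y and grad_y f = x, so H = 0.5 * (y**2 + x**2) and grad H = (x, y).
import numpy as np

grad_H = lambda x, y: (x, y)

x, y, lr = 1.0, 1.0, 0.1
for _ in range(100):
    hx, hy = grad_H(x, y)
    x, y = x - lr * hx, y - lr * hy

print(f"x = {x:.4f}, y = {y:.4f}")   # both -> 0
```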