Fast Approximate Natural Gradient Descent in a Kronecker-factored Eigenbasis [article]

Thomas George, César Laurent, Xavier Bouthillier, Nicolas Ballas, Pascal Vincent
2021 arXiv   pre-print
It consists in tracking a diagonal variance, not in parameter coordinates, but in a Kronecker-factored eigenbasis, in which the diagonal approximation is likely to be more effective.  ...  Optimization algorithms that leverage gradient covariance information, such as variants of natural gradient descent (Amari, 1998), offer the prospect of yielding more effective descent directions.  ...  Code EKFAC can be experimented with in PyTorch using the NNGeometry library (George, 2021) available at  ... 
arXiv:1806.03884v2 fatcat:g3uxj6myuncjjg3ofzynp6xd4u

Estimating Model Uncertainty of Neural Networks in Sparse Information Form [article]

Jongseok Lee, Matthias Humt, Jianxiang Feng, Rudolph Triebel
2020 arXiv   pre-print
As a result, we show that the information form can be scalably applied to represent model uncertainty in DNNs.  ...  We present a sparse representation of model uncertainty for Deep Neural Networks (DNNs) where the parameter posterior is approximated with an inverse formulation of the Multivariate Normal Distribution  ...  Jianxiang Feng is supported by the Munich School for Data Science (MUDS) and Rudolph Triebel is a member of MUDS.  ... 
arXiv:2006.11631v1 fatcat:2fwwrpi7ere2djavxzcz627xmy

Learning Multiresolution Matrix Factorization and its Wavelet Networks on Graphs [article]

Truong Son Hy, Risi Kondor
2021 arXiv   pre-print
Multiresolution Matrix Factorization (MMF) is unusual amongst fast matrix factorization algorithms in that it does not make a low rank assumption.  ...  In this paper we propose a learnable version of MMF that carefully optimizes the factorization with a combination of reinforcement learning and Stiefel manifold optimization through backpropagating errors  ...  This makes it possible to optimize the greed objective by simple gradient descent, but larger rotations would yield more expressive factorizations and better approximations.  ... 
arXiv:2111.01940v1 fatcat:4h3sqg6du5elbf6msx7lvdkuuq

A Random Matrix Theory Approach to Damping in Deep Learning [article]

Diego Granziol, Nicholas Baskerville
2022 arXiv   pre-print
We conjecture that the inherent difference in generalisation between adaptive and non-adaptive gradient methods in deep learning stems from the increased estimation noise in the flattest directions of  ...  We experimentally demonstrate our learner to be very insensitive to the initialised value and to allow for extremely fast convergence in conjunction with continued stable training and competitive generalisation  ...  a number of products m P can be seen as a low rank inverse approximation to H_batch, or direct inversion of Kronecker factored approximations [Martens and Grosse, 2015].  ... 
arXiv:2011.08181v5 fatcat:nyb2ckkmdndvfi4knz6ub6wqrm

Numerical estimation of the relative entropy of entanglement

Yuriy Zinchenko, Shmuel Friedland, Gilad Gour
2010 Physical Review A. Atomic, Molecular, and Optical Physics  
In low dimensions the implementation of the algorithm in MATLAB provides an estimation for the REE with an absolute error smaller than 10^−3.  ...  Our algorithm is based on a practical semi-definite cutting plane approach.  ...  If we have a smooth objective function with no constraints, one natural choice is to follow the steepest descent direction (the direction of the negative gradient) at each iteration, until a significant  ... 
doi:10.1103/physreva.82.052336 fatcat:t3y6a7ppgfclln56pgkjqtnpbe

Filtering variational quantum algorithms for combinatorial optimization [article]

David Amaro, Carlo Modica, Matthias Rosenkranz, Mattia Fiorentini, Marcello Benedetti, Michael Lubasch
2021 arXiv   pre-print
Using random weighted MaxCut problems, we numerically analyze our methods and show that they perform better than the original VQE algorithm and the Quantum Approximate Optimization Algorithm (QAOA).  ...  Additionally we explore the use of causal cones to reduce the number of qubits required on a quantum computer.  ...  (B7) is replaced by −F_t, the new VQE gradient evaluated at the point |ψ(θ)⟩ = |ψ_{t−1}⟩ coincides with the F-VQE gradient in Eq. (6) up to a positive multiplicative factor.  ... 
arXiv:2106.10055v2 fatcat:uwazf2bmyba6nmhkdnrsclqhui

Provably efficient variational generative modeling of quantum many-body systems via quantum-probabilistic information geometry [article]

Faris M. Sbahi, Antonio J. Martinez, Sahil Patel, Dmitri Saberi, Jae Hyeon Yoo, Geoffrey Roeder, Guillaume Verdon
2022 arXiv   pre-print
With the aim of addressing such intractabilities, we introduce a generalization of quantum natural gradient descent to parameterized mixed states, as well as provide a robust first-order approximating  ...  Our first-order algorithm is derived using a novel quantum generalization of the classical mirror descent duality.  ...  All numerical simulations in this paper were performed using our open source QHBM library 37, built on a combination of TensorFlow Quantum [2] and TensorFlow Probability [197].  ... 
arXiv:2206.04663v1 fatcat:fprc72k4hrgytde3gml7tc6zqi

Numerical Methods for Electronic Structure Calculations of Materials

Yousef Saad, James R. Chelikowsky, Suzanne M. Shontz
2010 SIAM Review  
In particular, the last two decades saw a flurry of activity in developing effective software.  ...  The paper is intended for a diverse scientific computing audience. For this reason, we assume the reader does not have an extensive background in the related physics.  ...  or burden is that associated with the need to orthogonalize a given basis which approximates the desired eigenbasis.  ... 
doi:10.1137/060651653 fatcat:wowbhjdezjeodlo3ovigwgbugu

Fast inference in generalized linear models via expected log-likelihoods [article]

Alexandro D. Ramirez, Liam Paninski
2013 arXiv   pre-print
This paper discusses an approximation of the likelihood in these models that can greatly facilitate computation.  ...  Generalized linear models play an essential role in a wide variety of statistical applications.  ...  Pnevmatikakis, A. Pakman, and W. Truccolo for helpful comments and discussions.  ... 
arXiv:1305.5712v1 fatcat:qp6df6wbmrarffgxvsm23h6xi4

Fast inference in generalized linear models via expected log-likelihoods

Alexandro D. Ramirez, Liam Paninski
2013 Journal of Computational Neuroscience  
This paper discusses an approximation of the likelihood in these models that can greatly facilitate computation.  ...  Generalized linear models play an essential role in a wide variety of statistical applications.  ...  Pnevmatikakis, A. Pakman, and W.  ... 
doi:10.1007/s10827-013-0466-4 pmid:23832289 pmcid:PMC4374573 fatcat:acyrdlljlnfaroxffdloordlsm

Algorithm for initializing a generalized fermionic Gaussian state on a quantum computer [article]

Michael P. Kaicher, Simon B. Jäger, Frank K. Wilhelm
2021 arXiv   pre-print
We present a simple gradient-descent-based algorithm that can be used as an optimization subroutine in combination with imaginary time evolution, which by construction guarantees a monotonic decrease of  ...  Using this result we find a closed expression for the energy functional and its gradient of a general fermionic quantum many-body Hamiltonian.  ...  It should be noted, that imaginary time evolution is in itself a gradient descent method with respect to the natural inner product, i.e. the Fubini-Study metric [47] [48] [49] .  ... 
arXiv:2105.13047v2 fatcat:ha7can2shrgn3mp5ajwlt7xt2y

Earth mover's distances on discrete surfaces

Justin Solomon, Raif Rustamov, Leonidas Guibas, Adrian Butscher
2014 ACM Transactions on Graphics  
A number of additional applications of our machinery to geometry problems in graphics are presented.  ...  In particular, we uncover a class of smooth distances on a surface transitioning from a purely spectral distance to the geodesic distance between points; these distances also can be extended to the volume  ...  Acknowledgments The authors thank Nick Alger for discussions about ADMM, Andy Nguyen for discussions about path planning in §7, and Keenan Crane for providing an implementation of [Crane et al. 2013]  ... 
doi:10.1145/2601097.2601175 fatcat:gmh4yuqgmncuffeo6gse2iqcey

A Survey of Uncertainty in Deep Neural Networks [article]

Jakob Gawlikowski, Cedrique Rovile Njieutcheu Tassi, Mohsin Ali, Jongseok Lee, Matthias Humt, Jianxiang Feng, Anna Kruspe, Rudolph Triebel, Peter Jung, Ribana Roscher, Muhammad Shahzad, Wen Yang (+2 others)
2022 arXiv   pre-print
It is intended to give anyone interested in uncertainty estimation in neural networks a broad overview and introduction, without presupposing prior knowledge in this field.  ...  Many researchers have been working on understanding and quantifying uncertainty in a neural network's prediction.  ...  Shang, “Adaptive thermostats for noisy gradient systems,” SIAM Journal on Scientific Computing, vol. 38, no. 2, pp.  ... 
arXiv:2107.03342v3 fatcat:cex5j3xq5fdijjdtdbt2ixralm

Learning Laplacian Matrix from Graph Signals with Sparse Spectral Representation

Pierre Humbert, Batiste Le Bars, Laurent Oudre, Argyris Kalogeratos, Nicolas Vayatis
2021 Journal of machine learning research  
Based on a 3-step alternating procedure, both algorithms rely on standard minimization methods (such as manifold gradient descent or linear programming) and have lower complexity compared to state-of-the-art  ...  Finally, we present a probabilistic interpretation of the proposed optimization program as a Factor Analysis Model.  ...  At each iteration, the manifold gradient descent computes the Riemannian gradient (12) that gives a direction in the tangent space.  ... 
dblp:journals/jmlr/HumbertBOKV21 fatcat:vzyqunmesvarrjay42jk6w3mie

Tight Convergence Rate Bounds for Optimization Under Power Law Spectral Conditions [article]

Maksim Velikanov, Dmitry Yarotsky
2022 arXiv   pre-print
: vanilla Gradient Descent, Steepest Descent, Heavy Ball, and Conjugate Gradients.  ...  For large (effectively infinite-dimensional) problems, this part of the spectrum can often be naturally represented or approximated by power law distributions.  ...  w * over the eigenbasis of A.  ... 
arXiv:2202.00992v1 fatcat:psn5idb4ofh27dbpjtpzgdz5ka