1,280 Hits in 5.3 sec

Relative gradient optimization of the Jacobian term in unsupervised deep learning [article]

Luigi Gresele, Giancarlo Fissore, Adrián Javaloy, Bernhard Schölkopf, Aapo Hyvärinen
2020 arXiv   pre-print
Based on relative gradients, we exploit the matrix structure of neural network parameters to compute updates efficiently even in high-dimensional spaces; the computational cost of the training is quadratic  ...  Learning expressive probabilistic models correctly describing the data is a ubiquitous problem in machine learning.  ...  Acknowledgments A.H. was supported by a Fellowship from CIFAR, and by the DATAIA convergence institute as part of the "Programme d'Investissement d'Avenir", (ANR-17-CONV-0003) operated by Inria.  ... 
arXiv:2006.15090v2 fatcat:jl537lnhdjbhbk3nt5vpd47bru
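The relative-gradient idea in the snippet above can be illustrated for a single square linear layer. A minimal numpy sketch, showing only the log|det W| term of a flow likelihood (illustrative dimensions; not the authors' implementation):

```python
import numpy as np

rng = np.random.default_rng(0)
d = 5
W = np.eye(d) + 0.1 * rng.standard_normal((d, d))

# Euclidean gradient of log|det W| w.r.t. W is inv(W).T -- it needs a
# cubic-cost matrix inverse.
euclid_grad = np.linalg.inv(W).T

# Relative gradient: right-multiply the Euclidean gradient by W^T W.
# For the log-det term this collapses to W itself, so no inverse is
# needed and the update costs only matrix products.
relative_grad = euclid_grad @ W.T @ W
assert np.allclose(relative_grad, W)
```

Because (W^-1)^T W^T = I, the relative update direction for the log-det term is exactly W, which is what makes the per-update cost quadratic rather than cubic in the layer width.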

Saddlepoints in Unsupervised Least Squares [article]

Samuel Gerber
2021 arXiv   pre-print
This paper sheds light on the risk landscape of unsupervised least squares in the context of deep auto-encoding neural nets.  ...  Within this context we discuss regularization of auto-encoders, in particular bottleneck, denoising and contraction auto-encoding and propose a new optimization strategy that can be framed as particular  ...  The relatively small reach of the spiral example requires a fairly deep network to achieve an accurate fit.  ... 
arXiv:2104.05000v1 fatcat:ammgw4nvvvflrfnhtkljmaefdy

Exact solutions to the nonlinear dynamics of learning in deep linear neural networks [article]

Andrew M. Saxe, James L. McClelland, Surya Ganguli
2014 arXiv   pre-print
on the weights, very deep networks incur only a finite, depth-independent delay in learning speed relative to shallow networks.  ...  Despite the widespread practical success of deep learning methods, our theoretical understanding of the dynamics of learning in deep neural networks remains quite sparse.  ...  B Optimal discrete time learning rates In Section 2 we state results on the optimal learning rate as a function of depth in a deep linear network, which we derive here.  ... 
arXiv:1312.6120v3 fatcat:emocqsm7irffzca7gwgivnyrq4
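The setting studied in this paper can be simulated in a few lines: gradient descent on a two-layer linear network W2 @ W1 fitting a random linear map. A minimal sketch with illustrative dimensions and learning rate (not taken from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)
d = 3
target = rng.standard_normal((d, d))       # linear input-output map to learn
W1 = 0.1 * rng.standard_normal((d, d))     # small initialization, near the
W2 = 0.1 * rng.standard_normal((d, d))     # saddle point at the origin

def loss(W1, W2):
    return 0.5 * np.sum((W2 @ W1 - target) ** 2)

init_loss = loss(W1, W2)
lr = 0.02
for _ in range(5000):
    err = W2 @ W1 - target
    # Simultaneous gradient steps on 0.5 * ||W2 W1 - target||_F^2
    g1, g2 = W2.T @ err, err @ W1.T
    W1, W2 = W1 - lr * g1, W2 - lr * g2

final_loss = loss(W1, W2)
assert final_loss < init_loss  # learning escapes the near-saddle at the origin
```

Despite the linearity of each layer, the composed loss is non-convex in (W1, W2), which is why the learning dynamics show plateaus and transitions rather than a single exponential decay.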

Higher Order Contractive Auto-Encoder [chapter]

Salah Rifai, Grégoire Mesnil, Pascal Vincent, Xavier Muller, Yoshua Bengio, Yann Dauphin, Xavier Glorot
2011 Lecture Notes in Computer Science  
We explicitly encourage the latent representation to contract the input space by regularizing the norm of the Jacobian (analytically) and the Hessian (stochastically) of the encoder's output with respect  ...  From a manifold learning perspective, balancing this regularization with the auto-encoder's reconstruction objective yields a representation that varies most when moving along the data manifold in input  ...  This approximate optimization will be carried out with a stochastic gradient descent technique. - J_f(x) = ∂f/∂x(x) denotes the d_h × d_x Jacobian matrix of f evaluated at x.  ... 
doi:10.1007/978-3-642-23783-6_41 fatcat:c2bgqwwlpbd4rk2ernl7vf5eme
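For a one-layer sigmoid encoder, the Jacobian penalty in the snippet has a closed form: J_f(x) = diag(h * (1 - h)) @ W. A minimal numpy sketch of the contractive penalty (illustrative sizes; the Hessian term and the stochastic approximation are omitted):

```python
import numpy as np

rng = np.random.default_rng(0)
d_x, d_h = 8, 4
W = 0.1 * rng.standard_normal((d_h, d_x))  # encoder weights
b = np.zeros(d_h)
x = rng.standard_normal(d_x)

sigmoid = lambda a: 1.0 / (1.0 + np.exp(-a))
h = sigmoid(W @ x + b)                     # encoder output f(x)

# d_h x d_x Jacobian of the sigmoid encoder: J_f(x) = diag(h * (1 - h)) @ W
J = (h * (1.0 - h))[:, None] * W

# Contractive penalty: squared Frobenius norm of the Jacobian, added
# (scaled by a hyper-parameter) to the reconstruction loss.
penalty = np.sum(J ** 2)
```

Because the row scaling factors h_j(1 - h_j) factor out, the penalty can also be computed as sum_j (h_j(1 - h_j))^2 * ||W_j||^2 without materializing J, which is what makes the analytic regularizer cheap.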

The Manifold Tangent Classifier

Salah Rifai, Yann N. Dauphin, Pascal Vincent, Yoshua Bengio, Xavier Muller
2011 Neural Information Processing Systems  
singular vectors of the Jacobian of a representation mapping.  ...  This representation learning algorithm can be stacked to yield a deep architecture, and we combine it with a domain knowledge-free version of the TangentProp algorithm to encourage the classifier to be  ...  Acknowledgments The authors would like to acknowledge the support of the following agencies for research funding and computing support: NSERC, FQRNT, Calcul Québec and CIFAR.  ... 
dblp:conf/nips/RifaiDVBM11 fatcat:4q3jlpzi35erhkxue7dfkk43ta

Vision-Aided Absolute Trajectory Estimation Using an Unsupervised Deep Network with Online Error Correction [article]

E. Jared Shamwell, Sarah Leung, William D. Nothwang
2018 arXiv   pre-print
We present an unsupervised deep neural network approach to the fusion of RGB-D imagery with inertial measurements for absolute trajectory estimation.  ...  The network learns to integrate IMU measurements and generate hypothesis trajectories which are then corrected online according to the Jacobians of scaled image projection errors with respect to a spatial  ...  The main contributions of VIOLearner are its unsupervised learning of scaled trajectory, online error correction based on the use of intermediate gradients, and ability to combine uncalibrated, loosely  ... 
arXiv:1803.05850v1 fatcat:ypambrewvbca5nq4lyyhrg7ete

Practical Recommendations for Gradient-Based Training of Deep Architectures [chapter]

Yoshua Bengio
2012 Lecture Notes in Computer Science  
This chapter is meant as a practical guide with recommendations for some of the most commonly used hyper-parameters, in particular in the context of learning algorithms based on backpropagated gradient  ...  and gradient-based optimization.  ...  Frederic Bastien, and Sina Honari, as well as for the financial support of NSERC, FQRNT, CIFAR, and the Canada Research Chairs.  ... 
doi:10.1007/978-3-642-35289-8_26 fatcat:k6lsp2fxv5ei3efgkmf5p5okyy

Practical recommendations for gradient-based training of deep architectures [article]

Yoshua Bengio
2012 arXiv   pre-print
This chapter is meant as a practical guide with recommendations for some of the most commonly used hyper-parameters, in particular in the context of learning algorithms based on back-propagated gradient  ...  and gradient-based optimization.  ...  Frederic Bastien, and Sina Honari, as well as for the financial support of NSERC, FQRNT, CIFAR, and the Canada Research Chairs.  ... 
arXiv:1206.5533v2 fatcat:xbtvaaby2jfjjae4hvwyxks7yu

Understanding the difficulty of training deep feedforward neural networks

Xavier Glorot, Yoshua Bengio
2010 Journal of machine learning research  
Finally, we study how activations and gradients vary across layers and during training, with the idea that training may be more difficult when the singular values of the Jacobian associated with each layer  ...  the superiority of deeper vs less deep architectures.  ...  gradients and training dynamics in deep architectures.  ... 
dblp:journals/jmlr/GlorotB10 fatcat:qfrlj2iewfbazhz5pi3dyorpfa
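The observation in the snippet about per-layer Jacobian singular values motivates the "normalized" (Glorot/Xavier) initialization proposed in this paper, which can be sketched as:

```python
import numpy as np

rng = np.random.default_rng(0)

def glorot_uniform(fan_in, fan_out):
    """Glorot/Xavier initialization: scale by both fan-in and fan-out so
    that activation variances (forward) and gradient variances (backward)
    stay roughly constant across layers."""
    limit = np.sqrt(6.0 / (fan_in + fan_out))
    return rng.uniform(-limit, limit, size=(fan_out, fan_in))

W = glorot_uniform(256, 128)
# Variance of U(-a, a) is a^2 / 3 = 2 / (fan_in + fan_out)
```

The limit sqrt(6 / (fan_in + fan_out)) is chosen so the uniform distribution has variance 2 / (fan_in + fan_out), the compromise between the forward and backward variance-preservation conditions derived in the paper.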

Gradients as Features for Deep Representation Learning [article]

Fangzhou Mu, Yingyu Liang, Yin Li
2020 arXiv   pre-print
We address the challenging problem of deep representation learning: the efficient adaptation of a pre-trained deep network to different tasks.  ...  Our key innovation is the design of a linear model that incorporates both gradient and activation of the pre-trained network.  ...  The authors would also like to acknowledge the support provided by the University of Wisconsin-Madison Office of the Vice Chancellor for Research and Graduate Education with funding from the Wisconsin  ... 
arXiv:2004.05529v1 fatcat:gfccfgce2ncodjluczt3crmqh4

Latent Space Arc Therapy Optimization [article]

Noah Bice, Mohamad Fakhreddine, Ruiqi Li, Dan Nguyen, Christopher Kabat, Pamela Myers, Niko Papanikolaou, Neil Kirby
2021 arXiv   pre-print
In this work, arc therapy overparameterization is addressed by reducing the effective dimension of treatment plans with unsupervised deep learning.  ...  Traditionally, heuristics such as fluence-map-optimization-informed segment initialization use locally optimal solutions to begin the search of the full arc therapy plan space from a reasonable starting  ...  the relative contribution of the KL term.  ... 
arXiv:2106.05846v1 fatcat:hgdepv3q2jhz3fux7cyirx3v5q

Vision-based Pose Optimization Using Learned Metrics

Shaopeng Li, Yong Xian, Tao Zhang, Bangjie Li, Daqiao Zhang, Weilin Guo
2020 IEEE Access  
This distance is involved in the residual calculation of Gauss-Newton, and the Jacobian corresponding to this distance can be solved analytically.  ...  A pose optimization method based on learned metrics is proposed to improve the convexity of the optimization. The neural networks were designed and trained on the respective collected datasets.  ...  POSE ESTIMATION WITH DEEP LEARNING Deep learning is applied to pose estimation in different ways.  ... 
doi:10.1109/access.2020.3021824 fatcat:qyl7i7yc75d5jnz4b6nytai27e
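The Gauss-Newton residual step mentioned in the snippet can be sketched on a toy linear least-squares problem (names and dimensions below are illustrative, not from the paper):

```python
import numpy as np

# One Gauss-Newton step: delta = -(J^T J)^{-1} J^T r, where r(p) is the
# residual vector and J its Jacobian w.r.t. the pose parameters p.
rng = np.random.default_rng(0)
J = rng.standard_normal((6, 3))            # Jacobian of residuals w.r.t. pose
p_true = np.array([1.0, -2.0, 0.5])
r = J @ (np.zeros(3) - p_true)             # residual at initial guess p = 0

delta = -np.linalg.solve(J.T @ J, J.T @ r)
# For a linear residual, a single step recovers the true parameters.
assert np.allclose(delta, p_true)
```

For a learned (nonlinear) metric the residual is only locally linear, so the same step is iterated; the paper's point is that a metric learned to be more convex makes these iterations converge from farther initial poses.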

Feature-Metric Registration: A Fast Semi-Supervised Approach for Robust Point Cloud Registration Without Correspondences

Xiaoshui Huang, Guofeng Mei, Jian Zhang
2020 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)  
We train the proposed method in a semi-supervised or unsupervised approach, which requires limited or no registration label data.  ...  In contrast to the geometric projection error, the advantage of the feature-metric projection error is its robustness to noise, outliers and density differences.  ...  As in the approach applied to classical 2D images, the chain rule is used to split the Jacobian into two partial terms: the gradient of the image and the Jacobian of the warp.  ... 
doi:10.1109/cvpr42600.2020.01138 dblp:conf/cvpr/HuangM020 fatcat:akyya3jslvajrc6e52zth7kdwu

A Signal Propagation Perspective for Pruning Neural Networks at Initialization [article]

Namhoon Lee, Thalaiyasingam Ajanthan, Stephen Gould, Philip H. S. Torr
2020 arXiv   pre-print
Furthermore, we empirically study the effect of supervision for pruning and demonstrate that our signal propagation perspective, combined with unsupervised pruning, can be useful in various scenarios where  ...  In this work, by noting connection sensitivity as a form of gradient, we formally characterize initialization conditions to ensure reliable connection sensitivity measurements, which in turn yields effective  ...  ACKNOWLEDGMENTS This work was supported by the ERC grant ERC-2012-AdG 321162-HELIOS, EPSRC grant Seebibyte EP/M013774/1, EPSRC/MURI grant EP/N019474/1 and the Australian Research Council Centre of Excellence  ... 
arXiv:1906.06307v2 fatcat:nc4pzt4g3rgllecuhrrb6pquz4

A Deep Learning Framework for Unsupervised Affine and Deformable Image Registration [article]

Bob D. de Vos, Floris F. Berendsen, Max A. Viergever, Hessam Sokooti, Marius Staring, Ivana Isgum
2018 arXiv   pre-print
To circumvent the need for predefined examples, and thereby to increase convenience of training ConvNets for image registration, we propose the Deep Learning Image Registration (DLIR) framework for unsupervised  ...  After a ConvNet has been trained with the DLIR framework, it can be used to register pairs of unseen images in one shot.  ...  The statements contained herein are solely those of the authors and do not represent or imply concurrence or endorsement by NCI.  ... 
arXiv:1809.06130v2 fatcat:kup746qmw5aqracopg2jgglsfi
Showing results 1 — 15 out of 1,280 results