
Revisiting Natural Gradient for Deep Networks [article]

Razvan Pascanu, Yoshua Bengio
2014 arXiv   pre-print
We evaluate natural gradient, an algorithm originally proposed in Amari (1997), for learning deep models. The contributions of this paper are as follows.  ...  We show the connection between natural gradient and three other recently proposed methods for training deep models: Hessian-Free (Martens, 2010), Krylov Subspace Descent (Vinyals and Povey, 2012) and TONGA  ...  ., 2012) and Guillaume Desjardins and Yann Dauphin for their insightful comments. We would also like to thank NSERC, Compute Canada, and Calcul Québec for providing computational resources.  ... 
arXiv:1301.3584v7 fatcat:xfqaqmtthzckdpl6ttd6tf56i4

Revisiting Recurrent Neural Networks for robust ASR

Oriol Vinyals, Suman V. Ravuri, Daniel Povey
2012 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)  
We apply pretraining principles used for Deep Neural Networks (DNNs) and second-order optimization techniques to train an RNN.  ...  In this paper, we show how new training principles and optimization techniques for neural networks can be used for different network structures.  ...  Nelson Morgan for useful discussions on using neural networks in the pipeline of speech recognition.  ... 
doi:10.1109/icassp.2012.6288816 dblp:conf/icassp/VinyalsRP12 fatcat:23y2ze5xyjci3kulkxudzih2yy

Initializing Perturbations in Multiple Directions for Fast Adversarial Training [article]

Xunguang Wang, Ship Peng Xu, Eric Ke Wang
2021 arXiv   pre-print
Recent developments in the field of Deep Learning have demonstrated that Deep Neural Networks (DNNs) are vulnerable to adversarial examples.  ...  Adversarial Training, one of the most direct and effective methods, minimizes the losses of perturbed-data to learn robust deep networks against adversarial attacks.  ...  INTRODUCTION DEEP Neural Networks (DNNs) have achieved great success in a variety of applications, mainly including Computer Vision [1]-[3], Speech Recognition [4] and Natural Language Processing  ... 
arXiv:2005.07606v2 fatcat:jtqfbmd6ofel7d7yu5feozv6l4

L2M: Practical posterior Laplace approximation with optimization-driven second moment estimation [article]

Christian S. Perone, Roberto Pereira Silveira, Thomas Paula
2021 arXiv   pre-print
Uncertainty quantification for deep neural networks has recently evolved through many techniques.  ...  We hope our method can open new research directions on using quantities already computed by optimizers for uncertainty estimation in deep neural networks.  ...  We revisited Laplace approximation, a classical approach for posterior approximation that is computationally attractive, and showed that we can construct a posterior approximation with the gradient raw  ... 
arXiv:2107.04695v1 fatcat:sveyll64kbhpragzn7hs233jsq

Path-SGD: Path-Normalized Optimization in Deep Neural Networks [article]

Behnam Neyshabur, Ruslan Salakhutdinov, Nathan Srebro
2015 arXiv   pre-print
We revisit the choice of SGD for training deep neural networks by reconsidering the appropriate geometry in which to optimize the weights.  ...  We argue for a geometry invariant to rescaling of weights that does not affect the output of the network, and suggest Path-SGD, which is an approximate steepest descent method with respect to a path-wise  ...  We thank Hao Tang for insightful discussions.  ... 
arXiv:1506.02617v1 fatcat:isriewanhrcyvgg5x2ycglji54

Gating Revisited: Deep Multi-layer RNNs That Can Be Trained [article]

Mehmet Ozgur Turkoglu, Stefano D'Aronco, Jan Dirk Wegner, Konrad Schindler
2021 arXiv   pre-print
We propose a new STAckable Recurrent cell (STAR) for recurrent neural networks (RNNs), which has fewer parameters than widely used LSTM and GRU while being more robust against vanishing or exploding gradients  ...  We investigate the training of multi-layer RNNs and examine the magnitude of the gradients as they propagate through the network in the "vertical" direction.  ...  ACKNOWLEDGMENTS We thank the Swiss Federal Office for Agriculture (FOAG) for partially funding this Research project through the Deep-Field Project.  ... 
arXiv:1911.11033v4 fatcat:j7kg75b2cfhb7ka5g6yz4il7ue

Revisiting Deep Intrinsic Image Decompositions [article]

Qingnan Fan, Jiaolong Yang, Gang Hua, Baoquan Chen, David Wipf
2018 arXiv   pre-print
While invaluable for many computer vision applications, decomposing a natural image into intrinsic reflectance and shading layers represents a challenging, underdetermined inverse problem.  ...  We then apply flexibly supervised loss layers that are customized for each source of ground truth labels.  ...  Their rendered images often lack realism, and traditional deep networks trained on these data may perform poorly on more natural examples [22].  ... 
arXiv:1701.02965v8 fatcat:ninkdtaacnckzcz7bk2kdgjnem

Training of Deep Neural Networks based on Distance Measures using RMSProp [article]

Thomas Kurbiel, Shahrzad Khaleghian
2017 arXiv   pre-print
Furthermore we show that when appropriately initialized these kinds of neural networks suffer much less from the vanishing and exploding gradient problem than traditional neural networks even for deep  ...  The vanishing gradient problem was a major obstacle for the success of deep learning. In recent years it was gradually alleviated through multiple different techniques.  ...  INITIALIZATION As in neural networks based on dot products [5] , a sensible initialization of the weights is crucial for convergence, especially when training deep neural networks.  ... 
arXiv:1708.01911v1 fatcat:hqnpwu67qvcbbdtxdsnfpvd36m

Revisiting Batch Normalization For Practical Domain Adaptation [article]

Yanghao Li, Naiyan Wang, Jianping Shi, Jiaying Liu, Xiaodi Hou
2016 arXiv   pre-print
By modulating the statistics in all Batch Normalization layers across the network, our approach achieves deep adaptation effect for domain adaptation tasks.  ...  Deep neural networks (DNN) have shown unprecedented success in various computer vision applications such as image classification and object detection.  ...  RevGrad incorporates a gradient reversal layer in the deep model to encourage learning domain-invariant features. Deep CORAL extends CORAL to perform end-to-end adaptation in DNN.  ... 
arXiv:1603.04779v4 fatcat:7ip74ozq2ngszacj4mf4clpep4

Revisiting Batch Normalization for Training Low-latency Deep Spiking Neural Networks from Scratch [article]

Youngeun Kim, Priyadarshini Panda
2021 arXiv   pre-print
BNTT allows us to train deep SNN architectures from scratch, for the first time, on complex datasets with just few 25-30 time-steps.  ...  Spiking Neural Networks (SNNs) have recently emerged as an alternative to deep learning owing to sparse, asynchronous and binary event (or spike) driven processing, that can yield huge energy efficiency  ...  In this paper, we revisit Batch Normalization (BN) for more advanced SNN training. The BN layer [15] has been used extensively in deep learning to accelerate the training process of ANNs.  ... 
arXiv:2010.01729v5 fatcat:jm3encegxfgvhnivmava5safj4

Deep Reinforcement Learning with Surrogate Agent-Environment Interface [article]

Song Wang, Yu Jing
2017 arXiv   pre-print
We introduce surrogate probability action and develop the probability surrogate action deterministic policy gradient (PSADPG) algorithm based on SAEI.  ...  The experiments show PSADPG achieves the performance of DQN in certain tasks with the stochastic optimal policy nature in the initial training stage.  ...  With the development of artificial neural network, deep reinforcement learning is able to handle realistic real world problem.  ... 
arXiv:1709.03942v3 fatcat:dwkuqtzwc5gwxoa5ayganehrgy

Distributed Hessian-Free Optimization for Deep Neural Network [article]

Xi He and Dheevatsa Mudigere and Mikhail Smelyanskiy and Martin Takáč
2017 arXiv   pre-print
With this objective, we revisit Hessian-free optimization method for deep networks.  ...  Training deep neural network is a high dimensional and a highly non-convex optimization problem.  ...  Conclusion In this paper, we revisited HF optimization for deep neural network, proposed a distributed variant with analysis.  ... 
arXiv:1606.00511v2 fatcat:7rhtl7merbgfzagu65fi3ljapi

Wider or Deeper: Revisiting the ResNet Model for Visual Recognition [article]

Zifeng Wu, Chunhua Shen, Anton van den Hengel
2016 arXiv   pre-print
Investigations into deep residual networks have also suggested that they may not in fact be operating as a single deep network, but rather as an ensemble of many relatively shallow networks.  ...  The trend towards increasingly deep neural networks has been driven by a general observation that increasing depth increases the performance of a network.  ...  Note that we do not apply post-processing with CRFs, which can smooth the output but is too slow in practice, especially for large images.  ... 
arXiv:1611.10080v1 fatcat:xtonddmixjd2vpf6am7kpxz26m

Is Simple Better? Revisiting Non-Linear Matrix Factorization for Learning Incomplete Ratings

Vaibhav Krishna, Tian Guo, Nino Antulov-Fantulin
2018 2018 IEEE International Conference on Data Mining Workshops (ICDMW)  
Matrix factorization techniques have been widely used as a method for collaborative filtering for recommender systems.  ...  Secondly, the architecture built is compared with deep-learning algorithms like Restricted Boltzmann Machine and state-of-the-art Deep Matrix factorization techniques.  ...  ACKNOWLEDGMENT The authors gratefully acknowledge Dijana Tolic for useful directions and comments regarding NMF and Deep Semi-NMF approach.  ... 
doi:10.1109/icdmw.2018.00183 dblp:conf/icdm/KrishnaGA18 fatcat:fpimimjpufa33bpyywkf33moxu

Special Issue on Machine Vision

Tae-Kyun Kim, Stefanos Zafeiriou, Ben Glocker, Stefan Leutenegger
2019 International Journal of Computer Vision  
The papers presented in this issue offer a snapshot of some of the best work in the field, on the topics of (1) learning quantised representations by deep neural networks, (2) 3D shape  ...  In total, 12 papers were accepted for inclusion in this special issue.  ...  Stochastic Quantization for Learning Accurate Low-bit Deep Neural Networks Yinpeng Dong, Renkun Ni, Jianguo Li, Yurong Chen, Hang Su, and Jun Zhu 3.  ... 
doi:10.1007/s11263-019-01201-4 fatcat:pclxcarmzfejdj5tkih6t54vwi