
Convolutional Neural Network Training with Distributed K-FAC [article]

J. Gregory Pauloski, Zhao Zhang, Lei Huang, Weijia Xu, Ian T. Foster
2020 arXiv   pre-print
We investigate here a scalable K-FAC design and its applicability in convolutional neural network (CNN) training at scale.  ...  Training neural networks with many processors can reduce time-to-solution; however, it is challenging to maintain convergence and efficiency at large scales.  ...  For example, with the classic image classification problem, a batch size of 32K is considered large for convolutional neural network training with the ImageNet-1k dataset.  ... 
arXiv:2007.00784v1 fatcat:tacioznilvh7locxcqr6mejtt4

Distributed Second-Order Optimization using Kronecker-Factored Approximations

Jimmy Ba, Roger B. Grosse, James Martens
2017 International Conference on Learning Representations  
As more computational resources become available, machine learning researchers train ever larger neural networks on millions of data points using stochastic gradient descent (SGD).  ...  Finally, we show that our distributed K-FAC method speeds up training of various state-of-the-art ImageNet classification models by a factor of two compared to an improved form of Batch Normalization.  ...  Without any changes to the algorithm, distributed K-FAC can be used to train neural networks that have BN layers.  ... 
dblp:conf/iclr/BaGM17 fatcat:kdqwtffwgravvgnrr6r2wirc7e

An Evaluation of Fisher Approximations Beyond Kronecker Factorization

César Laurent, Thomas George, Xavier Bouthillier, Nicolas Ballas, Pascal Vincent
2018 International Conference on Learning Representations  
We study two coarser approximations on top of a Kronecker factorization (K-FAC) of the Fisher Information Matrix, to scale up Natural Gradient to deep and wide Convolutional Neural Networks (CNNs).  ...  Experiments on the VGG11 and ResNet50 architectures show the technique can substantially speed up both K-FAC and a baseline with Batch Normalization in wall-clock time, yielding faster convergence to similar  ...  INTRODUCTION AND PREVIOUS WORK Deep Neural Networks, especially Convolutional Neural Networks, are the state-of-the-art machine learning approach in many application areas, including image recognition  ... 
dblp:conf/iclr/LaurentGBBV18 fatcat:u3hfg4kp2ranlmmvli3hd5anyu

DNSS2: improved ab initio protein secondary structure prediction using advanced deep learning architectures [article]

Jie Hou, Zhiye Guo, Jianlin Cheng
2019 bioRxiv   pre-print
Results: The major improvements over the DNSS1 method include (i) designing and integrating six advanced one-dimensional deep convolutional/recurrent/residual/memory/fractal/inception networks to predict  ...  DNSS2 was systematically benchmarked on two independent test datasets with eight state-of-the-art tools and consistently ranked as one of the best methods.  ...  Six different deep neural network architectures were evaluated in the study, including convolutional neural network (CNN) (Krizhevsky, et al., 2012), recurrent convolutional neural network (RCNN) (Liang  ... 
doi:10.1101/639021 fatcat:wbttv5kszrg2zniteeejyjzpke

Eigenvalue Corrected Noisy Natural Gradient [article]

Juhan Bae, Guodong Zhang, Roger Grosse
2018 arXiv   pre-print
Variational Bayesian neural networks combine the flexibility of deep learning with Bayesian uncertainty estimation.  ...  Noisy K-FAC is an instance of noisy natural gradient that fits a matrix-variate Gaussian posterior with minor changes to ordinary K-FAC.  ...  Similar to applying K-FAC on convolutional layers with Kronecker factors, EK-FAC can be extended to convolutional layers. We compare the results with SGD (with momentum), K-FAC, and noisy K-FAC.  ... 
arXiv:1811.12565v1 fatcat:qp37ticqxbc65et4pzuhwk5kxu

Facial Action Coding and Hybrid Deep Learning Architectures for Autism Detection

A. Saranya, R. Anandan
2022 Intelligent Automation and Soft Computing  
For feature extraction, DEEPFACENET uses the FACS-integrated Convolutional Neural Network (FACS-CNN) and a hybrid Deep Learning LSTM (Long Short-Term Memory) for the classification and detection of autism  ...  Compared with the Multi-Layer Perceptron (48.67%), Convolutional Neural Networks (67.75%), and Long Short-Term Memory (71.56%), the suggested model showed a considerable increase in recognition rate (92%); from this proposed  ...  Acknowledgement: The author, with a deep sense of gratitude, thanks the supervisor for his guidance and constant support rendered during this research.  ... 
doi:10.32604/iasc.2022.023445 fatcat:fjk3cmyqijdwlbbr22jibobq74

Optimizing Deep Convolutional Neural Network for Facial Expression Recognition

Umesh B. Chavan, Dinesh Kulkarni
2020 European Journal of Engineering Research and Science  
In this project, we have designed a deep learning Convolutional Neural Network (CNN) for facial expression recognition and developed the model in Theano and Caffe for the training process.  ...  We designed a large, deep convolutional neural network to classify 40,000 images in the data-set into one of seven categories (disgust, fear, happy, angry, sad, neutral, surprise).  ...  Compared with the CPU, it has clear advantages. Experiments show that the stream processor is well suited to convolutional neural networks.  ... 
doi:10.24018/ejers.2020.5.2.495 fatcat:euv4bc7cvjcqnaevup43xjzbaq

Training Deep Networks for Facial Expression Recognition with Crowd-Sourced Label Distribution [article]

Emad Barsoum, Cha Zhang, Cristian Canton Ferrer, Zhengyou Zhang
2016 arXiv   pre-print
In this paper, we demonstrate how to learn a deep convolutional neural network (DCNN) from noisy labels, using facial expression recognition as an example.  ...  An enhanced FER+ data set with multiple labels for each face image will also be shared with the research community.  ...  In this paper, we adopt the latest deep convolutional neural networks (DCNN) architecture, and evaluate the effectiveness of four different schemes to train emotion recognition on crowd-sourced labels.  ... 
arXiv:1608.01041v2 fatcat:ed62tsiiuna45llquwienn4xke

A Kronecker-factored approximate Fisher matrix for convolution layers [article]

Roger Grosse, James Martens
2016 arXiv   pre-print
In our experiments, approximate natural gradient descent with KFC was able to train convolutional networks several times faster than carefully tuned SGD.  ...  Second-order optimization methods such as natural gradient descent have the potential to speed up training of neural networks by correcting for the curvature of the loss function.  ... 
arXiv:1602.01407v2 fatcat:m4gaqeqdyngfrjuaxvps6c4lve
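Several of the results above build on K-FAC, which approximates a layer's Fisher information matrix as a Kronecker product of the input-activation second-moment matrix and the output-gradient second-moment matrix. As a rough illustration only (not code from any of the listed papers; the dimensions, variable names, and damping constant are invented for this toy sketch), the preconditioned update for a single fully connected layer can be computed like this:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy linear layer: outputs = W @ activations
n_in, n_out, batch = 4, 3, 256

a = rng.standard_normal((n_in, batch))   # layer inputs (activations)
g = rng.standard_normal((n_out, batch))  # backpropagated output gradients

# Kronecker factors: second moments of inputs and of output gradients
A = a @ a.T / batch                      # (n_in, n_in)
G = g @ g.T / batch                      # (n_out, n_out)

# Damping keeps the factor inverses well conditioned
eps = 1e-2
A_inv = np.linalg.inv(A + eps * np.eye(n_in))
G_inv = np.linalg.inv(G + eps * np.eye(n_out))

grad_W = g @ a.T / batch                 # ordinary gradient of the loss w.r.t. W

# K-FAC preconditioned step: (A ⊗ G)^{-1} vec(grad) == vec(G^{-1} grad A^{-1})
nat_grad = G_inv @ grad_W @ A_inv
print(nat_grad.shape)                    # (3, 4)
```

Because (A ⊗ G)^{-1} = A^{-1} ⊗ G^{-1}, the update only ever inverts the two small factor matrices rather than the full (n_in·n_out)-dimensional Fisher matrix, which is what makes this family of methods scale.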

Three Mechanisms of Weight Decay Regularization [article]

Guodong Zhang, Chaoqi Wang, Bowen Xu, Roger Grosse
2018 arXiv   pre-print
We empirically investigate weight decay for three optimization algorithms (SGD, Adam, and K-FAC) and a variety of network architectures.  ...  Our results provide insight into how to improve the regularization of neural networks.  ...  Figure 5: Relationship between K-FAC GN norm and Jacobian norm for practical deep neural networks. Each point corresponds to a network trained to 100% training accuracy.  ... 
arXiv:1810.12281v1 fatcat:l2zpoupsa5eqjlt5zi6p6cvpiq

Measuring Uncertainty through Bayesian Learning of Deep Neural Network Structure [article]

Zhijie Deng, Yucen Luo, Jun Zhu, Bo Zhang
2021 arXiv   pre-print
Bayesian neural networks (BNNs) augment deep networks with uncertainty quantification by Bayesian treatment of the network weights.  ...  Instead of building structure from scratch inefficiently, we draw inspiration from neural architecture search to represent the network structure.  ...  which is also known as the L2 regularization of neural networks.  ... 
arXiv:1911.09804v3 fatcat:p7tvfrpnpfawhnmcuawp2mh62i

Scalable trust-region method for deep reinforcement learning using Kronecker-factored approximation [article]

Yuhuai Wu, Elman Mansimov, Shun Liao, Roger Grosse, Jimmy Ba
2017 arXiv   pre-print
We extend the framework of natural policy gradient and propose to optimize both the actor and the critic using Kronecker-factored approximate curvature (K-FAC) with trust region; hence we call our method  ...  With the proposed methods, we are able to achieve higher rewards and a 2- to 3-fold improvement in sample efficiency on average, compared to previous state-of-the-art on-policy actor-critic methods.  ...  We indirectly tested how accurate the Kronecker-factored approximation to the curvature is by measuring the exact KL changes during training, while performing trust region optimization using a Kronecker-factored  ... 
arXiv:1708.05144v2 fatcat:qrqnmir6czazfajc7vh4r62y3u

Facial Expression Recognition Research Based on Deep Learning [article]

Yongpei Zhu, Hongwei Fan, Kehong Yuan
2019 arXiv   pre-print
With the development of deep learning, the structure of convolutional neural networks is becoming more and more complex and the performance of object recognition is getting better.  ...  In this paper, we design and train a convolutional neural network for expression recognition, and explore the classification mechanism of the network.  ...  Model design and training of CNN: In order to analyze the mechanism of the convolutional neural network, a neural network with shallow layers is designed in this section. The training of deep learning convolutional  ... 
arXiv:1904.09737v3 fatcat:wshlfqk4ufhfvogce4ybgplbye

Facial Attribute Capsules for Noise Face Super Resolution [article]

Jingwei Xin, Nannan Wang, Xinrui Jiang, Jie Li, Xinbo Gao, Zhifeng Li
2020 arXiv   pre-print
In this paper, we propose a Facial Attribute Capsules Network (FACN) to deal with the problem of high-scale super-resolution of noisy face images.  ...  The diverse FACs could better combine the face prior information to generate face images with fine-grained semantic attributes.  ...  Deep convolutional neural network (CNN) based face SR methods have received significant attention in recent years. Dong et al.  ... 
arXiv:2002.06518v1 fatcat:zs6355bui5aqpfric55kmv5rry

Emotion Recognition System from Speech and Visual Information based on Convolutional Neural Networks

Nicolae-Catalin Ristea, Liviu Cristian Dutu, Anamaria Radoi
2019 2019 International Conference on Speech Technology and Human-Computer Dialogue (SpeD)  
In this paper, we propose a system that is able to recognize emotions with a high accuracy rate and in real time, based on deep Convolutional Neural Networks.  ...  Experimental results show the effectiveness of the proposed scheme for emotion recognition and the importance of combining visual with audio data.  ...  Firstly, training deep convolutional neural networks requires large volumes of annotated data.  ... 
doi:10.1109/sped.2019.8906538 dblp:conf/sped/RisteaDR19 fatcat:ivjhman7ybfntpv2akmvfaqsua