Global Sparse Momentum SGD for Pruning Very Deep Neural Networks
[article]
2019
arXiv
pre-print
In this paper, we propose a novel momentum-SGD-based optimization method to reduce the network complexity by on-the-fly pruning. ...
Deep Neural Networks (DNNs) are powerful but computationally expensive and memory intensive, impeding their practical use on resource-constrained front-end devices. ...
Acknowledgement We sincerely thank all the reviewers for their comments. This work was supported by the National Key ...
arXiv:1909.12778v3
fatcat:hh4wnkm2xvdj3nianidyliol6e
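The snippet above names the mechanism but not the update rule. Below is a minimal NumPy sketch of one way a momentum-SGD step can prune on the fly; the first-order importance score |w·grad|, the global top-k selection, and all hyperparameters are illustrative assumptions, not the paper's exact recipe.

```python
import numpy as np

def gsm_step(w, grad, buf, k, lr=0.01, momentum=0.9, wd=1e-4):
    # first-order importance of each weight: |w * dL/dw|
    importance = np.abs(w * grad).ravel()
    active = np.zeros(w.size, dtype=bool)
    active[np.argsort(importance)[-k:]] = True        # global top-k keep their loss gradient
    active = active.reshape(w.shape)
    update = wd * w + np.where(active, grad, 0.0)     # passive weights receive weight decay only
    buf = momentum * buf + update                     # shared momentum buffer
    return w - lr * buf, buf

rng = np.random.default_rng(0)
w, buf = rng.normal(size=(4, 4)), np.zeros((4, 4))
grad = rng.normal(size=(4, 4))                        # stand-in for a real loss gradient
w, buf = gsm_step(w, grad, buf, k=4)
```
Over many steps, the passive weights decay toward zero, which is what yields the pruned network without a separate pruning phase.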
Truly Sparse Neural Networks at Scale
[article]
2022
arXiv
pre-print
To achieve this goal, we introduce three novel contributions, specially designed for sparse neural networks: (1) a parallel training algorithm and its corresponding sparse implementation from scratch, ...
Recently, sparse training methods have started to be established as a de facto approach for training and inference efficiency in artificial neural networks. Yet, this efficiency is just in theory. ...
Acknowledgement We thank the Google Cloud Platform Research Credits program for granting us the necessary resources to run the extremely large sparse MLP experiments. ...
arXiv:2102.01732v2
fatcat:xw4pnoj5zfafvilmk34odczt5m
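For readers unfamiliar with sparse-from-scratch training, here is a hedged sketch of a generic prune-and-regrow step of the kind this line of work builds on; the drop fraction, random regrowth, and dense mask representation are assumptions for illustration, not the paper's parallel sparse implementation.

```python
import numpy as np

def prune_and_regrow(w, mask, drop_frac=0.3, rng=np.random.default_rng(0)):
    magnitudes = np.abs(w[mask])
    n_drop = int(drop_frac * magnitudes.size)
    if n_drop == 0:
        return w, mask
    threshold = np.sort(magnitudes)[n_drop - 1]
    drop = mask & (np.abs(w) <= threshold)            # weakest existing connections
    mask = mask & ~drop
    empty = np.flatnonzero(~mask)                     # candidate positions to regrow
    regrow = rng.choice(empty, size=min(n_drop, empty.size), replace=False)
    mask.ravel()[regrow] = True
    return np.where(mask, w, 0.0), mask               # regrown connections start at zero

rng = np.random.default_rng(1)
mask = rng.random((8, 8)) < 0.2                       # ~20% density, kept fixed in size
w = np.where(mask, rng.normal(size=(8, 8)), 0.0)
w, mask = prune_and_regrow(w, mask)
```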
Exploring Structural Sparsity of Deep Networks via Inverse Scale Spaces
[article]
2022
arXiv
pre-print
forward selection methods for learning structural sparsity in deep networks. ...
The great success of deep neural networks is built upon their over-parameterization, which smooths the optimization landscape without degrading the generalization ability. ...
For ResNet-50, we find an interesting phenomenon: most layers inside the block can be pruned to a very sparse level. ...
arXiv:1905.09449v5
fatcat:ac4ox2cojrha3nkksi4nqjxkxm
DessiLBI: Exploring Structural Sparsity of Deep Networks via Differential Inclusion Paths
[article]
2020
arXiv
pre-print
Over-parameterization is ubiquitous nowadays in training neural networks to benefit both optimization in seeking global optima and generalization in reducing prediction error. ...
Such a differential inclusion scheme has a simple discretization, proposed as Deep structurally splitting Linearized Bregman Iteration (DessiLBI), whose global convergence analysis in deep learning is ...
Gradient descent finds global minima of deep neural networks. 2018. arXiv:1811.03804. Zhang, Y. ...
arXiv:2007.02010v1
fatcat:sg2ozh6fijeqpmeuhovip6g6ee
Impact of Parameter Sparsity on Stochastic Gradient MCMC Methods for Bayesian Deep Learning
[article]
2022
arXiv
pre-print
Bayesian methods hold significant promise for improving the uncertainty quantification ability and robustness of deep neural network models. ...
Recent research has seen the investigation of a number of approximate Bayesian inference methods for deep neural networks, building on both the variational Bayesian and Markov chain Monte Carlo (MCMC) ...
A Appendix A.1 SGHMC in sparse neural networks In Algorithm 1, we present the SGHMC algorithm we use for neural networks with sparse substructure. ...
arXiv:2202.03770v1
fatcat:6vv6oku6irdwrdasnt7wpwpage
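As a rough illustration of what sampling over a sparse substructure involves, here is a minimal SGHMC-style update restricted to a fixed mask; the friction and step-size values are assumptions, and this is not the Algorithm 1 referenced in the snippet.

```python
import numpy as np

def sghmc_step(theta, v, grad, mask, lr=1e-3, friction=0.05,
               rng=np.random.default_rng(0)):
    noise = rng.normal(scale=np.sqrt(2.0 * friction * lr), size=theta.shape)
    v = v - lr * grad - friction * v + noise
    v = np.where(mask, v, 0.0)                 # no momentum on pruned coordinates
    theta = np.where(mask, theta + v, 0.0)     # pruned weights stay exactly zero
    return theta, v

rng = np.random.default_rng(1)
mask = rng.random((5, 5)) < 0.3                # fixed sparse substructure
theta = np.where(mask, rng.normal(size=(5, 5)), 0.0)
v = np.zeros_like(theta)
grad = rng.normal(size=(5, 5))                 # stand-in for a minibatch gradient
theta, v = sghmc_step(theta, v, grad, mask)
```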
Directional Pruning of Deep Neural Networks
[article]
2020
arXiv
pre-print
In the light of the fact that the stochastic gradient descent (SGD) often finds a flat minimum valley in the training loss, we propose a novel directional pruning method which searches for a sparse minimizer ...
This work was completed while Guang Cheng was a member of the Institute for Advanced Study, Princeton in the fall of 2019. ...
arXiv:2006.09358v2
fatcat:dadslgdh7becnnorheoilkkesa
Data-Driven Sparse Structure Selection for Deep Neural Networks
[article]
2018
arXiv
pre-print
Deep convolutional neural networks have unleashed their extraordinary power on various tasks. ...
In this paper, we propose a simple and effective framework to learn and prune deep models in an end-to-end manner. ...
for deep neural networks. ...
arXiv:1707.01213v3
fatcat:shtsddglafdchdmqueylrkvunq
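The following sketch illustrates the general idea of data-driven structure selection via sparsified scaling factors: each channel or block gets a scalar gate trained with an l1 penalty, and a proximal (soft-thresholding) step can drive a gate exactly to zero so the attached structure can be removed. The gate granularity and step sizes are assumptions, not the paper's exact formulation.

```python
import numpy as np

def prox_l1(x, thresh):
    # soft-thresholding: the proximal operator of the l1 norm
    return np.sign(x) * np.maximum(np.abs(x) - thresh, 0.0)

def update_gates(gates, grad, lr=0.1, l1=0.01):
    # gradient step on the data loss followed by the l1 proximal step
    return prox_l1(gates - lr * grad, lr * l1)

gates = np.ones(8)                                          # one gate per channel/block
grad = np.random.default_rng(0).normal(scale=0.5, size=8)   # stand-in gradient
gates = update_gates(gates, grad)
prunable = np.flatnonzero(gates == 0.0)                     # structures that can be dropped
```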
The Lottery Ticket Hypothesis: Finding Sparse, Trainable Neural Networks
[article]
2019
arXiv
pre-print
Neural network pruning techniques can reduce the parameter counts of trained networks by over 90%, decreasing storage requirements and improving computational performance of inference without compromising ...
However, contemporary experience is that the sparse architectures produced by pruning are difficult to train from the start, which would similarly improve training performance. ...
Channel pruning for accelerating very deep neural networks. In International Conference on Computer Vision (ICCV), volume 2, pp. 6, 2017.
Geoffrey Hinton, Oriol Vinyals, and Jeff Dean. ...
arXiv:1803.03635v5
fatcat:hycg3kxjqbdbbpqz2lq7l252ca
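A compact sketch of the iterative magnitude pruning with rewinding that the hypothesis is tested with; the training routine here is a stand-in stub, and the 20% per-round pruning rate is an assumed illustrative value.

```python
import numpy as np

def fake_train(w, mask, rng=np.random.default_rng(0)):
    # stand-in for real training: perturb only the unmasked weights
    return w + np.where(mask, rng.normal(scale=0.1, size=w.shape), 0.0)

rng = np.random.default_rng(1)
w_init = rng.normal(size=(16, 16))               # original initialization, kept for rewinding
mask = np.ones_like(w_init, dtype=bool)
for _ in range(3):                               # iterative magnitude pruning rounds
    w_trained = fake_train(np.where(mask, w_init, 0.0), mask)
    kept = np.abs(w_trained[mask])
    cutoff = np.quantile(kept, 0.2)              # drop the weakest ~20% per round
    mask = mask & (np.abs(w_trained) > cutoff)
ticket = np.where(mask, w_init, 0.0)             # "winning ticket": sparse net rewound to init
```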
Dynamic Sparse Training: Find Efficient Sparse Network From Scratch With Trainable Masked Layers
[article]
2020
arXiv
pre-print
We demonstrate that our dynamic sparse training algorithm can easily train very sparse neural network models with little performance loss using the same number of training epochs as dense models. ...
We present a novel network pruning algorithm called Dynamic Sparse Training that can jointly find the optimal network parameters and sparse network structure in a unified optimization process with trainable ...
due to the over-parameterization of deep neural networks. ...
arXiv:2005.06870v1
fatcat:vtvnnw2smvbf5cqxtyhcuyin2a
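To make the phrase "trainable masked layers" concrete, here is a hedged PyTorch sketch of a linear layer whose binary mask |w| > t uses a learnable threshold t, with a sigmoid surrogate carrying gradients (a straight-through-style estimator). The surrogate sharpness and the exact estimator are assumptions, not necessarily the paper's formulation.

```python
import torch
import torch.nn as nn

class MaskedLinear(nn.Module):
    def __init__(self, in_features, out_features, k=100.0):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(out_features, in_features) * 0.1)
        self.threshold = nn.Parameter(torch.zeros(1))   # learnable pruning threshold
        self.k = k                                      # sharpness of the soft surrogate

    def forward(self, x):
        soft = torch.sigmoid(self.k * (self.weight.abs() - self.threshold))
        hard = (self.weight.abs() > self.threshold).float()
        mask = hard + soft - soft.detach()              # hard mask forward, soft gradient backward
        return x @ (self.weight * mask).t()

layer = MaskedLinear(8, 4)
out = layer(torch.randn(2, 8))
out.sum().backward()                                    # gradients reach both weight and threshold
```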
Data-Driven Sparse Structure Selection for Deep Neural Networks
[chapter]
2018
Lecture Notes in Computer Science
Deep convolutional neural networks have unleashed their extraordinary power on various tasks. ...
In this paper, we propose a simple and effective framework to learn and prune deep models in an end-to-end manner. ...
for deep neural networks. ...
doi:10.1007/978-3-030-01270-0_19
fatcat:qgg6xbjmjbeytp5x7tju6dkvrq
Training Deep Neural Networks via Branch-and-Bound
[article]
2021
arXiv
pre-print
A computationally efficient solver based on BPGrad has been proposed to train deep neural networks. ...
In this paper, we propose BPGrad, a novel approximate algorithm for deep neural network training, based on adaptive estimates of the feasible region via branch-and-bound. ...
Earlier work on training neural networks [47] showed that it is difficult to find the global optimum because, in the worst case, even learning a simple 3-node neural network is NP-complete. ...
arXiv:2104.01730v2
fatcat:wyzu7hguuba4ziwnxbwxhokcae
Dynamic Model Pruning with Feedback
[article]
2020
arXiv
pre-print
Deep neural networks often have millions of parameters. ...
signal to reactivate prematurely pruned weights, we obtain a performant sparse model in a single training pass (retraining is not needed, but can further improve performance). ...
Highly overparametrized deep neural networks show impressive results on machine learning tasks. ...
arXiv:2006.07253v1
fatcat:5t5ijn7hzfh3xiuy6hbsvjvdju
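An illustrative sketch of the feedback idea the snippet alludes to: gradients are evaluated on the pruned model but applied to a dense copy of the weights, so prematurely pruned weights can recover and re-enter the top-k mask. The magnitude criterion and the reallocate-every-step schedule are assumptions, not the authors' implementation.

```python
import numpy as np

def dpf_step(w_dense, grad_of_masked, k, lr=0.05):
    w_dense = w_dense - lr * grad_of_masked              # error feedback lands on dense weights
    flat = np.abs(w_dense).ravel()
    mask = np.zeros(w_dense.size, dtype=bool)
    mask[np.argsort(flat)[-k:]] = True                   # recompute the top-k magnitude mask
    mask = mask.reshape(w_dense.shape)
    w_sparse = np.where(mask, w_dense, 0.0)              # model actually used in forward/backward
    return w_dense, w_sparse, mask

rng = np.random.default_rng(0)
w_dense = rng.normal(size=(6, 6))
grad = rng.normal(size=(6, 6))                           # stand-in: gradient at the masked weights
w_dense, w_sparse, mask = dpf_step(w_dense, grad, k=9)
```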
Sparse Weight Activation Training
[article]
2020
arXiv
pre-print
Neural network training is computationally and memory intensive. ...
Sparse training can reduce the burden on emerging hardware platforms designed to accelerate sparse computations, but it can affect network convergence. ...
Channel pruning for accelerating very deep neural networks. In Proceedings of the IEEE International Conference on Computer Vision, pages 1389-1397, 2017.
[22] C. A. R. Hoare. Algorithm 65: Find. ...
arXiv:2001.01969v3
fatcat:twbzslvhgnh7jk4sauhbqk6ql4
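Below is a small sketch of the top-k magnitude sparsification primitive that sparse weight/activation training relies on; where exactly it is applied in the forward and backward passes is the paper's contribution and is not reproduced here.

```python
import numpy as np

def topk_sparsify(x, keep_frac):
    # keep the largest-magnitude fraction of entries, zero the rest
    k = max(1, int(keep_frac * x.size))
    cutoff = np.sort(np.abs(x).ravel())[-k]              # k-th largest magnitude
    return np.where(np.abs(x) >= cutoff, x, 0.0)

rng = np.random.default_rng(0)
weights = topk_sparsify(rng.normal(size=(16, 16)), keep_frac=0.1)
activations = topk_sparsify(rng.normal(size=(4, 16)), keep_frac=0.5)
hidden = activations @ weights.T                          # both operands are now sparse
```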
LOss-Based SensiTivity rEgulaRization: towards deep sparse neural networks
[article]
2020
arXiv
pre-print
LOBSTER (LOss-Based SensiTivity rEgulaRization) is a method for training neural networks having a sparse topology. ...
Parameters with low sensitivity, i.e. having little impact on the loss when perturbed, are shrunk and then pruned to sparsify the network. ...
In Sec. 2 we review the relevant literature concerning sparse neural architectures. Next, in Sec. 3 we describe our method for training a neural network such that its topology is sparse. ...
arXiv:2011.09905v1
fatcat:mkgomcochfda5iojhep4fn5dbi
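As a rough illustration of loss-based sensitivity regularization, the following sketch shrinks weights in proportion to how little the loss reacts to them and prunes those that fall below a threshold; the insensitivity function, regularization strength, and pruning threshold are assumptions, not the paper's exact rule.

```python
import numpy as np

def sensitivity_step(w, grad, lr=0.01, reg=0.05, prune_thresh=1e-3):
    sensitivity = np.abs(grad)                             # impact of perturbing each weight
    insensitivity = 1.0 - sensitivity / (sensitivity.max() + 1e-12)
    w = w - lr * grad - reg * insensitivity * w            # shrink mainly the insensitive weights
    return np.where(np.abs(w) < prune_thresh, 0.0, w)      # prune tiny survivors

rng = np.random.default_rng(0)
w = rng.normal(size=(8, 8))
grad = rng.normal(size=(8, 8)) * (rng.random((8, 8)) < 0.5)  # some weights barely affect the loss
w = sensitivity_step(w, grad)
```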
A Bregman Learning Framework for Sparse Neural Networks
[article]
2022
arXiv
pre-print
In contrast to established methods for sparse training, the proposed family of algorithms constitutes a regrowth strategy for neural networks that is solely optimization-based, without additional heuristics ...
Our Bregman learning framework starts the training with very few initial parameters, successively adding only significant ones to obtain a sparse and expressive network. ...
Additionally, we thank the Cluster of Excellence "Engineering of Advanced Materials" (EAM) and the "Competence Unit for Scientific Computing" (CSC) at the University of Erlangen-Nürnberg for their financial support ...
arXiv:2105.04319v3
fatcat:tyiiilombrdybi7rkfjicgdscm
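To illustrate how an optimization-based regrowth strategy can start from very few parameters, here is a schematic linearized Bregman iteration with an l1 regularizer; the step sizes, shrinkage parameter, and stand-in gradient are assumptions, not the paper's algorithm. Gradients accumulate in an auxiliary variable, and a weight becomes nonzero only after its accumulated evidence clears the shrinkage threshold, so the network grows from an almost empty start.

```python
import numpy as np

def soft_threshold(v, lam):
    return np.sign(v) * np.maximum(np.abs(v) - lam, 0.0)

def linbreg_step(v, grad, lr=0.1, lam=1.0):
    v = v - lr * grad                 # accumulate gradient evidence in the auxiliary variable
    w = soft_threshold(v, lam)        # only sufficiently supported weights become nonzero
    return v, w

rng = np.random.default_rng(0)
v = np.zeros((6, 6))                  # start with an (almost) empty network
w = np.zeros_like(v)
for _ in range(20):
    grad = rng.normal(size=v.shape) * 0.3 - 0.2 * w   # stand-in for a real loss gradient
    v, w = linbreg_step(v, grad)
density = np.mean(w != 0)             # the active set grows gradually over iterations
```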
Showing results 1 — 15 out of 573 results