13,570 Hits in 2.2 sec

Pruning from Scratch [article]

Yulong Wang, Xiaolu Zhang, Lingxi Xie, Jun Zhou, Hang Su, Bo Zhang, Xiaolin Hu
2019 arXiv   pre-print
Therefore, we propose a novel network pruning pipeline which allows pruning from scratch.  ...  We empirically show that more diverse pruned structures can be directly pruned from randomly initialized weights, including potential models with better performance.  ...  Our Solution: Pruning from Scratch Based on the above analysis, we propose a new pipeline named pruning from scratch.  ... 
arXiv:1909.12579v1 fatcat:f2guvau2yvdcfnvptj2el5e7oy

Pruning from Scratch

Yulong Wang, Xiaolu Zhang, Lingxi Xie, Jun Zhou, Hang Su, Bo Zhang, Xiaolin Hu
Therefore, we propose a novel network pruning pipeline which allows pruning from scratch with little training overhead.  ...  We empirically show that more diverse pruned structures can be directly pruned from randomly initialized weights, including potential models with better performance.  ...  Our Solution: Pruning from Scratch Based on the above analysis, we propose a new pipeline named pruning from scratch.  ... 
doi:10.1609/aaai.v34i07.6910 fatcat:i4lwvfocevgghalfmqxadjqel4

Learning Pruned Structure and Weights Simultaneously from Scratch: an Attention based Approach [article]

Qisheng He, Ming Dong, Loren Schwiebert, Weisong Shi
2021 arXiv   pre-print
the dense network and the sparse network are tracked so that the pruned structure is simultaneously learned from randomly initialized weights.  ...  pruning methods.  ...  In the "from scratch" column, × indicates pruning from a pre-trained model, and ✓ indicates pruning from scratch.  ... 
arXiv:2111.02399v1 fatcat:jviiaiki75aabdjt3qnyz5ey5y

Roulette: A Pruning Framework to Train A Sparse Neural Network From Scratch

Qiaoling Zhong, Zhibin Zhang, Qiang Qiu, Xueqi Cheng
2021 IEEE Access  
In this paper, we present a pruning framework Roulette to train a sparse network from scratch.  ...  Due to space and inference time restrictions, finding an efficient and sparse sub-network from a dense and over-parameterized network is critical for deploying neural networks on edge devices.  ...  a pruning framework Roulette to train a sparse network on multiple GPUs from scratch.  ... 
doi:10.1109/access.2021.3065406 fatcat:nkw4dogqjrbh7ajrex67konl6m

Rethinking the Value of Network Pruning [article]

Zhuang Liu, Mingjie Sun, Tinghui Zhou, Gao Huang, Trevor Darrell
2019 arXiv   pre-print
For pruning algorithms which assume a predefined target network architecture, one can get rid of the full pipeline and directly train the target network from scratch.  ...  A typical pruning algorithm is a three-stage pipeline, i.e., training (a large model), pruning and fine-tuning.  ...  Scratch-E/B means pre-training the pruned model from scratch on classification and transferring it to detection.  ... 
arXiv:1810.05270v2 fatcat:berzw3fmvbfnhcgdvornkwionu
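The snippet above describes the typical three-stage pipeline: train a large model, prune it, then fine-tune. A minimal sketch of that loop on a toy linear model follows; the data, the 70% sparsity level, and the `fit` helper are illustrative assumptions, not details from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))
true_w = np.zeros(10)
true_w[:3] = [2.0, -1.5, 1.0]          # sparse ground truth
y = X @ true_w                          # noiseless targets

def fit(X, y, w, mask, steps=500, lr=0.05):
    """Gradient descent on squared error; `mask` freezes pruned weights at zero."""
    for _ in range(steps):
        grad = X.T @ (X @ w - y) / len(y)
        w = (w - lr * grad) * mask
    return w

# Stage 1: train the dense model.
w = fit(X, y, np.zeros(10), np.ones(10))
# Stage 2: prune the 70% smallest-magnitude weights.
thresh = np.quantile(np.abs(w), 0.7)
mask = (np.abs(w) > thresh).astype(float)
# Stage 3: fine-tune the surviving weights with the mask fixed.
w_pruned = fit(X, y, w * mask, mask)
```

On this noiseless toy problem the three surviving weights recover the sparse ground truth, which is the best case the "train from scratch instead" debate in this paper is about.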

A Closer Look at Structured Pruning for Neural Network Compression [article]

Elliot J. Crowley, Jack Turner, Amos Storkey, Michael O'Boyle
2019 arXiv   pre-print
from scratch---consistently outperform pruned networks; (ii) if one takes the architecture of a pruned network and then trains it from scratch it is significantly more competitive.  ...  Furthermore, these architectures are easy to approximate: we can prune once and obtain a family of new, scalable network architectures that can simply be trained from scratch.  ...  This project has received funding from the European Union's Horizon 2020 research and innovation programme under grant agreement No. 732204 (Bonseyes).  ... 
arXiv:1810.04622v3 fatcat:qkthpxwnivb3ronzfova7zts5y

Channel Pruning for Accelerating Very Deep Neural Networks [article]

Yihui He, Xiangyu Zhang, Jian Sun
2017 arXiv   pre-print
In this paper, we introduce a new channel pruning method to accelerate very deep convolutional neural networks. Given a trained CNN model, we propose an iterative two-step algorithm to effectively prune  ...  Our pruned VGG-16 achieves the state-of-the-art results by 5x speed-up along with only 0.3% increase of error.  ...  Table 4. Comparisons with training from scratch, under 4× acceleration: From scratch 11.9 / 1.8; From scratch (uniformed) 12.5 / 2.4; Ours 18.0 / 7.9; Ours (fine-tuned) 11.1 / 1.0.  ... 
arXiv:1707.06168v2 fatcat:tswkbanmtzdbjcnk2hh3m642ci
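This paper prunes channels with an iterative two-step algorithm (LASSO-based channel selection plus least-squares reconstruction). As a much simpler illustration of channel pruning in general, here is a sketch that ranks output channels by L1 norm; the ranking criterion and function name are assumptions of this example, not the paper's method:

```python
import numpy as np

def prune_channels_l1(conv_w, keep_ratio):
    """Rank output channels of a conv weight tensor (out, in, kh, kw)
    by L1 norm and keep the strongest `keep_ratio` fraction.
    Returns the slimmed tensor and the kept-channel indices."""
    norms = np.abs(conv_w).reshape(conv_w.shape[0], -1).sum(axis=1)
    n_keep = max(1, int(round(keep_ratio * conv_w.shape[0])))
    keep = np.sort(np.argsort(norms)[-n_keep:])   # indices of strongest channels
    return conv_w[keep], keep

# 4 output channels; only channels 1 and 3 carry non-zero weight.
w = np.zeros((4, 2, 3, 3))
w[1], w[3] = 1.0, 0.5
slim, kept = prune_channels_l1(w, 0.5)
```

A real pipeline would also slice the input dimension of the *next* layer to match the removed channels, which is what makes structured pruning yield actual speed-ups.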

On the Effect of Pruning on Adversarial Robustness [article]

Artur Jordao, Helio Pedrini
2021 arXiv   pre-print
Our analyses reveal that pruning structures (filters and/or layers) from convolutional networks increases not only generalization but also robustness to adversarial images (natural images with content modified  ...  Pruning is a well-known mechanism for reducing the computational cost of deep convolutional networks.  ...  ., outperforming the best training from scratch (Scratch-B) by 1.28 p.p.  ... 
arXiv:2108.04890v2 fatcat:ies2w7dvj5gwbfdk5zyda2u2uu

The Difficulty of Training Sparse Neural Networks [article]

Utku Evci, Fabian Pedregosa, Aidan Gomez, Erich Elsen
2020 arXiv   pre-print
Recent work has shown that sparse ResNet-50 architectures trained on ImageNet-2012 dataset converge to solutions that are significantly worse than those found by pruning.  ...  Additionally, our attempts to find a decreasing objective path from "bad" solutions to the "good" ones in the sparse subspace fail.  ...  A different approach is to reuse the sparsity pattern found through pruning and train a sparse network from scratch.  ... 
arXiv:1906.10732v3 fatcat:orsu2pd27jd3bncbnkrdavn4ue

Network Pruning That Matters: A Case Study on Retraining Variants [article]

Duong H. Le, Binh-Son Hua
2021 arXiv   pre-print
Our results emphasize the cruciality of the learning rate schedule in pruned network retraining - a detail often overlooked by practitioners during the implementation of network pruning.  ...  By leveraging the right learning rate schedule in retraining, we demonstrate a counter-intuitive phenomenon in that randomly pruned networks could even achieve better performance than methodically pruned  ...  Configurations of Model and Pruned Model are both from the original paper. The results of "Scratch-E" and "Scratch-B" on ImageNet are taken directly from work of Liu et al. (2019) .  ... 
arXiv:2105.03193v1 fatcat:2beuippssffstjvhl3gpzdcsbu

The State of Sparsity in Deep Neural Networks [article]

Trevor Gale, Erich Elsen, Sara Hooker
2019 arXiv   pre-print
., 2018) at scale and show that unstructured sparse architectures learned through pruning cannot be trained from scratch to the same test set performance as a model trained with joint sparsification and  ...  ., 2017b) shown to yield high compression rates on smaller datasets perform inconsistently, and that simple magnitude pruning approaches achieve comparable or better results.  ...  Experimental Framework For magnitude pruning, we used the TensorFlow model pruning library. We implemented variational dropout and ℓ0 regularization from scratch.  ... 
arXiv:1902.09574v1 fatcat:bqnagzhdyjg6rea7kvfsirvdom
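The snippet mentions simple magnitude pruning, the baseline this paper finds surprisingly strong. A minimal global-threshold sketch follows; the function name and the exact tie-breaking behavior are assumptions for illustration, not the paper's implementation:

```python
import numpy as np

def magnitude_prune(weights, sparsity):
    """Zero out the smallest-magnitude entries so that roughly a
    `sparsity` fraction of the weights are removed (global threshold).
    Entries tied with the threshold are also pruned."""
    flat = np.abs(weights).ravel()
    k = int(sparsity * flat.size)           # number of weights to remove
    if k == 0:
        return weights.copy(), np.ones_like(weights, dtype=bool)
    threshold = np.partition(flat, k - 1)[k - 1]   # k-th smallest magnitude
    mask = np.abs(weights) > threshold
    return weights * mask, mask

w = np.array([[0.5, -0.1], [0.05, -0.9]])
pruned, mask = magnitude_prune(w, 0.5)
# the two smallest-magnitude entries (-0.1 and 0.05) are now zero
```

In practice this is applied gradually during training, with the sparsity target ramped up over many steps rather than imposed at once.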

Small Network for Lightweight Task in Computer Vision: A Pruning Method Based on Feature Representation

Yisu Ge, Shufang Lu, Fei Gao, Paolo Gastaldo
2021 Computational Intelligence and Neuroscience  
Different from other pruning approaches, the proposed strategy is guided by the practical task and eliminates the irrelevant filters in the network.  ...  In this paper, a pruning algorithm for the lightweight task is proposed, and a pruning strategy based on feature representation is investigated.  ...  Size-EX selects the filters based on the weight size of filters, and scratch-EX is trained from scratch.  ... 
doi:10.1155/2021/5531023 pmid:33959156 pmcid:PMC8075670 fatcat:gwzannnunnh5ri3icarhyhzr5e

Adversarial Robustness vs Model Compression, or Both? [article]

Shaokai Ye, Kaidi Xu, Sijia Liu, Jan-Henrik Lambrechts, Huan Zhang, Aojun Zhou, Kaisheng Ma, Yanzhi Wang, Xue Lin
2021 arXiv   pre-print
training a small model from scratch even with inherited initialization from the large model cannot achieve both adversarial robustness and high standard accuracy.  ...  Code is available at  ...  Weight Pruning vs Training from Scratch An ongoing debate about pruning is whether weight pruning is actually needed and why not just training a small network from scratch.  ... 
arXiv:1903.12561v5 fatcat:vujngm6rh5ge7h5ebqh6l2j42a

Progressive Gradient Pruning for Classification, Detection and Domain Adaptation [article]

Le Thanh Nguyen-Meidine, Eric Granger, Madhu Kiran, Louis-Antoine Blais-Morin, Marco Pedersoli
2020 arXiv   pre-print
However, these techniques involve numerous steps and complex optimisations because some only prune after training CNNs, while others prune from scratch during training by integrating sparsity constraints  ...  Filter pruning techniques have recently shown promising results for the compression and acceleration of convolutional NNs (CNNs).  ...  from scratch during training.  ... 
arXiv:1906.08746v4 fatcat:u245y2qjenbkjk7upsolqsox4q

On the Orthogonality of Knowledge Distillation with Other Techniques: From an Ensemble Perspective [article]

SeongUk Park, KiYoon Yoo, Nojun Kwak
2020 arXiv   pre-print
This analytical explanation is provided from the perspective of the implicit data augmentation property of knowledge distillation.  ...  Developing an efficient model includes several strategies such as network architecture search, pruning, quantization, knowledge distillation, utilizing cheap convolution, regularization, and also includes  ...  Thus, we expect that given the same batch of inputs, a model trained with KD will return more diversified outputs than a model trained from scratch.  ... 
arXiv:2009.04120v2 fatcat:645ngid2erh6hev73mdkd7cvle
Showing results 1 — 15 out of 13,570 results