Pruning at a Glance: Global Neural Pruning for Model Compression
[article]
2019
arXiv
pre-print
To address these limitations, we propose a novel and simple pruning method that compresses neural networks by removing entire filters and neurons according to a global threshold across the network without ...
The resulting model is compact, non-sparse, with the same accuracy as the non-compressed model, and most importantly requires no special infrastructure for deployment. ...
Using these settings, we produce two models: "Ours-minError", which aims for the minimum error (accuracy closest to the baseline), and "Ours-maxCompr", which aims for a high compression percentage ...
arXiv:1912.00200v2
fatcat:h65bjw65obhyhmbcioh63fp3dm
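The first result describes removing entire filters and neurons whose importance falls below a single threshold shared across the whole network. A minimal sketch of that idea, assuming L1 filter norms as the importance score and a quantile as the global threshold (the paper's exact criterion and its handling of batch norm and dependent layers are not shown in the snippet, and filters are only zeroed here rather than physically removed):

```python
import torch
import torch.nn as nn

def global_filter_prune(model: nn.Module, sparsity: float = 0.5):
    """Zero out conv filters whose L1 norm falls below one threshold
    computed globally across all conv layers (illustrative sketch only)."""
    # Collect the L1 norm of every output filter in every conv layer.
    norms = []
    for m in model.modules():
        if isinstance(m, nn.Conv2d):
            norms.append(m.weight.detach().abs().sum(dim=(1, 2, 3)))
    all_norms = torch.cat(norms)

    # One global threshold: the `sparsity`-quantile of all filter norms.
    threshold = torch.quantile(all_norms, sparsity)

    # Mask filters below the threshold (a stand-in for removing them).
    for m in model.modules():
        if isinstance(m, nn.Conv2d):
            keep = m.weight.detach().abs().sum(dim=(1, 2, 3)) >= threshold
            m.weight.data *= keep.view(-1, 1, 1, 1).float()

if __name__ == "__main__":
    net = nn.Sequential(nn.Conv2d(3, 16, 3), nn.ReLU(), nn.Conv2d(16, 32, 3))
    global_filter_prune(net, sparsity=0.5)
```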
A Signal Propagation Perspective for Pruning Neural Networks at Initialization
[article]
2020
arXiv
pre-print
Network pruning is a promising avenue for compressing deep neural networks. ...
Our modifications to the existing pruning at initialization method lead to improved results on all tested network models for image classification tasks. ...
We would also like to acknowledge the Royal Academy of Engineering and FiveAI, and thank Richard Hartley, Puneet Dokania and Amartya Sanyal for helpful discussions. ...
arXiv:1906.06307v2
fatcat:nc4pzt4g3rgllecuhrrb6pquz4
Stochastic Model Pruning via Weight Dropping Away and Back
[article]
2020
arXiv
pre-print
Compared to the Bayesian approaches that stochastically train a compact model for pruning, we directly aim at stochastic gradual pruning. ...
However, most successful DNNs have an extremely complex structure, leading to extensive research on model compression. As a significant area of progress in model compression, traditional gradual pruning ...
Taking a glance at the Traditional Pruning process in Figure 1 (a)&(b), we consider it to correspond to an extension of IST (retrain several steps and prune), as marked below (2). ...
arXiv:1812.02035v2
fatcat:glz34x4v6rhfldyivk3kxcwds4
Enabling Retrain-free Deep Neural Network Pruning using Surrogate Lagrangian Relaxation
[article]
2021
arXiv
pre-print
Network pruning is a widely used technique to reduce computation cost and model size for deep neural networks. ...
It also achieves a high model accuracy even at the hard-pruning stage without retraining (reduces the traditional three-stage pruning to two-stage). ...
A glance at the YOLOv3-tiny results shows that the advantage of SLR grows as the compression rate increases. ...
arXiv:2012.10079v2
fatcat:u7lisfgksffobfdsyjsxtwb4ie
DARB: A Density-Adaptive Regular-Block Pruning for Deep Neural Networks
2020
Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence (AAAI-20)
Neural network pruning, as one of the mainstream model compression techniques, is under extensive study to reduce the model size and thus the amount of computation. ...
As a further optimization, we propose a density-adaptive regular-block (DARB) pruning that can effectively take advantage of the intrinsic characteristics of neural networks, and thereby outperform prior ...
It seems huge at first glance. However, the weight index in DARB just indicates its position within the block. ...
doi:10.1609/aaai.v34i04.6000
fatcat:ky3gpxc6mrc6tmbcphfucohbpa
DARB: A Density-Aware Regular-Block Pruning for Deep Neural Networks
[article]
2019
arXiv
pre-print
Neural network pruning, as one of the mainstream model compression techniques, is under extensive study to reduce the number of parameters and computations. ...
As a further optimization, we propose a density-adaptive regular-block (DARB) pruning that outperforms prior structured pruning work with high pruning ratio and decoding efficiency. ...
It seems huge at first glance. However, the weight index in DARB just indicates its position within the block. ...
arXiv:1911.08020v2
fatcat:6za4w2gsenal3azmgj7emqfjwi
Comprehensive SNN Compression Using ADMM Optimization and Activity Regularization
[article]
2020
arXiv
pre-print
These methods can be applied in either a single way for moderate compression or a joint way for aggressive compression. ...
To this end, we realize a comprehensive SNN compression through three steps. First, we formulate the connection pruning and weight quantization as a constrained optimization problem. ...
Finally, we present Figure 8 to summarize the accuracy results of Tables II-IV at a glance. Connection Pruning. ...
arXiv:1911.00822v3
fatcat:cmr43cefyrb2ldmmhjxhfotjxm
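Step one in the snippet above casts connection pruning as a constrained optimization problem. A common way to solve such formulations is ADMM with a projection step onto the sparsity constraint; the sketch below shows that generic pattern only (the penalty rho, the per-layer budget k, and the omission of weight quantization and SNN-specific activity regularization are simplifying assumptions, not the paper's exact procedure):

```python
import torch

def project_topk(w: torch.Tensor, k: int) -> torch.Tensor:
    """Euclidean projection onto {z : ||z||_0 <= k}: keep the k
    largest-magnitude entries, zero the rest."""
    z = torch.zeros_like(w)
    idx = w.abs().flatten().topk(min(k, w.numel())).indices
    z.view(-1)[idx] = w.view(-1)[idx]
    return z

def admm_prune_round(model, loss_fn, data_loader, optimizer, Z, U,
                     rho=1e-3, k=1000):
    """One ADMM round for pruning-as-constrained-optimization (sketch):
    W-update by SGD on the augmented loss, Z-update by projection onto
    the sparsity constraint, then the dual update of U."""
    W = [p for p in model.parameters() if p.dim() > 1]
    # W-update: SGD on loss + (rho/2) * sum_i ||W_i - Z_i + U_i||^2
    for x, y in data_loader:
        loss = loss_fn(model(x), y)
        for w, z, u in zip(W, Z, U):
            loss = loss + (rho / 2) * ((w - z + u) ** 2).sum()
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    # Z-update (projection) and dual update.
    with torch.no_grad():
        for i, w in enumerate(W):
            Z[i] = project_topk(w + U[i], k)
            U[i] = U[i] + w - Z[i]
    return Z, U

# Typical initialization (illustrative):
#   W0 = [p for p in model.parameters() if p.dim() > 1]
#   Z  = [p.detach().clone() for p in W0]
#   U  = [torch.zeros_like(p) for p in W0]
```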
Layer-Wise Compressive Training for Convolutional Neural Networks
2018
Future Internet
Convolutional Neural Networks (CNNs) are brain-inspired computational models designed to recognize patterns. ...
This characteristic is a serious concern for the deployment on resource-constrained embedded-systems, where compression stages are needed to meet the stringent hardware constraints. ...
At a glance, the algorithm is composed of three main stages denoted with different colors: pre-training (light red), setup (yellow), and optimization (blue). ...
doi:10.3390/fi11010007
fatcat:6ftczvjqdbhzjinyal2ilw76du
Unified Visual Transformer Compression
[article]
2022
arXiv
pre-print
We formulate a budget-constrained, end-to-end optimization framework, targeting jointly learning model weights, layer-wise pruning ratios/masks, and skip configurations, under a distillation loss. ...
This paper proposes a unified ViT compression framework that seamlessly assembles three effective techniques: pruning, layer skipping, and knowledge distillation. ...
Model Compression: Pruning. ...
arXiv:2203.08243v1
fatcat:5rrj5vn53zdahejoxtfaoda6me
Prune Your Model Before Distill It
[article]
2022
arXiv
pre-print
Based on this result, we propose an end-to-end neural network compression scheme where the student network is formed based on the pruned teacher and then apply the "prune, then distill" strategy. ...
In this work, we propose the novel framework "prune, then distill," which prunes the model first to make it more transferable and then distills it to the student. ...
At first glance, a powerful teacher with higher accuracy may show better distillation results; however, Cho and Hariharan [4] showed that the less-trained teacher teaches better when the student network ...
arXiv:2109.14960v2
fatcat:5blewtk4pjcl3kadc5ciznl3sq
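The "prune, then distill" framework described above prunes the teacher first and then distills it into a student derived from the pruned teacher. A minimal sketch of that two-stage pattern, assuming unstructured magnitude pruning and the standard softened-logit distillation loss (the temperature T and mixing weight alpha are illustrative choices, not the paper's settings):

```python
import torch
import torch.nn.functional as F

def magnitude_prune_(model: torch.nn.Module, sparsity: float = 0.8):
    """Stage 1 (sketch): unstructured magnitude pruning of the teacher."""
    for p in model.parameters():
        if p.dim() > 1:
            k = int(sparsity * p.numel())
            if k > 0:
                threshold = p.detach().abs().flatten().kthvalue(k).values
                p.data *= (p.detach().abs() > threshold).float()

def distill_step(student, pruned_teacher, x, y, optimizer, T=4.0, alpha=0.9):
    """Stage 2 (sketch): one training step where soft targets come from
    the already-pruned teacher, mixed with the usual cross-entropy loss."""
    with torch.no_grad():
        t_logits = pruned_teacher(x)
    s_logits = student(x)
    kd = F.kl_div(F.log_softmax(s_logits / T, dim=1),
                  F.softmax(t_logits / T, dim=1),
                  reduction="batchmean") * (T * T)
    ce = F.cross_entropy(s_logits, y)
    loss = alpha * kd + (1 - alpha) * ce
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```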
Efficient Visual Recognition with Deep Neural Networks: A Survey on Recent Advances and New Directions
[article]
2021
arXiv
pre-print
This paper attempts to provide a systematic summary via a comprehensive survey which can serve as a valuable reference and inspire both researchers and practitioners who work on visual recognition problems ...
Though recognition accuracy is usually the first concern for new progress, efficiency is actually rather important and sometimes critical for both academic research and industrial applications. ...
Although they may look different at first glance, they all align the coordinate dimension to the CNN channel dimension to make the computation efficient and simultaneously model all points to avoid ...
arXiv:2108.13055v2
fatcat:nf3lymdbvzgl7otl7gjkk5qitq
Exposing Hardware Building Blocks to Machine Learning Frameworks
[article]
2020
arXiv
pre-print
This need for real-time processing can be seen in industries ranging from developing neural-network-based pre-distorters for enhanced mobile broadband to designing FPGA-based triggers in major scientific ...
In this thesis, we explore how niche domains can benefit vastly if we look at neurons as a unique boolean function of the form f:B^I→ B^O, where B = {0,1}. ...
Implementing Neural Networks on FPGAs: To aid our understanding of how a neural network is optimized for inference on an FPGA, we glance at the HLS-RFNoC workflow. ...
arXiv:2004.05898v1
fatcat:g5a5fly4szfkdlw5kppysx2kia
Time-Correlated Sparsification for Communication-Efficient Federated Learning
[article]
2021
arXiv
pre-print
This is achieved by exchanging local model updates with the help of a parameter server (PS). ...
Hence, TCS seeks a certain correlation between the sparse representations used at consecutive iterations in FL, so that the overhead due to encoding and transmission of the sparse representation can be ...
See Fig. 1 for an illustration of the FL model across N clients, each with a local dataset. At the beginning of iteration t, each client pulls the current global model parameters θ_t from the PS. ...
arXiv:2101.08837v1
fatcat:ne7m6al675gqvoncyrbnvbe6mq
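The TCS snippet above says the sparse representations used at consecutive FL iterations should be correlated so that the encoding and transmission overhead stays low. A rough sketch of one way to realize that, assuming a top-k sparsifier that reuses most of the previous round's support (the reuse fraction and the selection of fresh coordinates are illustrative assumptions, not the paper's exact scheme):

```python
import torch

def tcs_sparsify(update: torch.Tensor, prev_mask: torch.Tensor,
                 k: int, reuse_frac: float = 0.9):
    """Time-correlated top-k sparsification (sketch): most kept coordinates
    are reused from last round's boolean mask `prev_mask`, and only a small
    budget of fresh coordinates is selected anew, so the positions that must
    be encoded and sent to the PS change slowly across iterations."""
    flat = update.flatten()
    k_old = int(reuse_frac * k)
    k_new = k - k_old

    # Reuse the largest-magnitude coordinates within last round's support.
    old_idx = prev_mask.flatten().nonzero(as_tuple=True)[0]
    if old_idx.numel() > 0 and k_old > 0:
        top = flat[old_idx].abs().topk(min(k_old, old_idx.numel())).indices
        keep_old = old_idx[top]
    else:
        keep_old = old_idx.new_empty(0)

    # Fill the remaining budget with fresh coordinates outside that support.
    masked = flat.abs().clone()
    masked[keep_old] = float("-inf")
    keep_new = masked.topk(k_new).indices if k_new > 0 else keep_old.new_empty(0)

    idx = torch.cat([keep_old, keep_new])
    mask = torch.zeros_like(flat, dtype=torch.bool)
    mask[idx] = True
    sparse_update = flat * mask
    return sparse_update.view_as(update), mask.view_as(update)

# First round (illustrative): prev_mask = torch.zeros_like(update, dtype=torch.bool)
```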
ConfusionFlow: A model-agnostic visualization for temporal analysis of classifier confusion
[article]
2020
arXiv
pre-print
We further assess the scalability of ConfusionFlow and present a use case in the context of neural network pruning. ...
Many classification models exist, and choosing the right one for a given task is difficult. ...
Successful pruning results in a compressed model that retains its accuracy while requiring fewer computations and less memory. ...
arXiv:1910.00969v3
fatcat:u4f2bl45z5b7fdelbls25su2cu
Discounting and VBM
2015
Journal of Child and Adolescent Behaviour
At first sight, this result seems to contradict previous findings, but it does not; DLPFC undergoes normal pruning across adolescence; however, the more grey matter participants had at the onset of the ...
We observed a global decrease in grey matter volume over the course of adolescence. ...
Bernal-Casas wishes to thank relatives and friends for their continued moral and financial support. ...
doi:10.4172/2375-4494.1000251
fatcat:cvfye7jm3fhtlnzwcccuiulbia
Showing results 1 — 15 out of 437 results