
Dynamic Network Surgery for Efficient DNNs [article]

Yiwen Guo, Anbang Yao, Yurong Chen
2016 arXiv   pre-print
In this paper, we propose a novel network compression method called dynamic network surgery, which can remarkably reduce the network complexity by performing on-the-fly connection pruning.  ...  Without any accuracy loss, our method can efficiently compress the number of parameters in LeNet-5 and AlexNet by factors of 108× and 17.7× respectively, proving that it outperforms the recent pruning  ...  Figure 3: The Exclusive-OR (XOR) classification problem (a) without noise and (b) with noise. Figure 4: Dynamic network surgery on a three-layer neural network for the XOR problem.  ... 
arXiv:1608.04493v2 fatcat:3truyx2r4vdylif6sa5rpjrwwy
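The snippet above describes pruning with splicing, i.e. connections can be both removed and recovered during training. A minimal pure-Python sketch of that mask-update idea follows; the thresholds, list-of-lists weight layout, and function name are illustrative assumptions, not the paper's exact procedure:

```python
def update_mask(weights, mask, t_lo, t_hi):
    # Prune connections whose magnitude fell below t_lo; splice back
    # (re-enable) masked connections whose magnitude grew above t_hi.
    # Weights between the two thresholds keep their previous mask value.
    new_mask = []
    for w_row, m_row in zip(weights, mask):
        row = []
        for w, m in zip(w_row, m_row):
            if abs(w) < t_lo:
                row.append(0)          # prune
            elif abs(w) > t_hi:
                row.append(1)          # splice (recover)
            else:
                row.append(m)          # keep previous decision
        new_mask.append(row)
    return new_mask

# Toy 2x3 weight matrix with an all-ones initial mask.
weights = [[0.05, -0.9, 0.3],
           [1.2, -0.02, 0.6]]
mask = update_mask(weights, [[1, 1, 1], [1, 1, 1]], t_lo=0.1, t_hi=0.5)
# The forward pass would use the masked (effective) weights.
effective = [[w * m for w, m in zip(wr, mr)]
             for wr, mr in zip(weights, mask)]
```

Because the mask, not the weight tensor, records which connections are dead, a wrongly pruned connection can re-enter the network once its magnitude grows back past the upper threshold.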

CNN implementation in Resource Limited FPGAs - Key Concepts and Techniques

José Rosa, Monica Figueiredo, Luis Bento
2021 Zenodo  
Convolutional Neural Network (CNN) is a type of algorithm used to solve complex problems with superior performance compared to traditional computational methods.  ...  Field Programmable Gate Array (FPGA) is a good option for implementing CNNs at the edge, since even the lowest-cost FPGAs have good energy efficiency and sufficient throughput to enable real-time applications  ...  One- or two-bit quantized neural networks trade accuracy for high model compression [51], making them good candidates for real-time deep learning implementations on FPGAs and ASICs due to their bitwise efficiency  ... 
doi:10.5281/zenodo.5080239 fatcat:s2r3zgus7jfdnfupin3dtbmeym

Enable Deep Learning on Mobile Devices: Methods, Systems, and Applications

Han Cai, Ji Lin, Yujun Lin, Zhijian Liu, Haotian Tang, Hanrui Wang, Ligeng Zhu, Song Han
2022 ACM Transactions on Design Automation of Electronic Systems  
To reduce the large design cost of these manual solutions, we discuss the AutoML framework for each of them, such as neural architecture search (NAS) and automated pruning and quantization.  ...  We start by introducing popular model compression methods, including pruning, factorization, and quantization, as well as compact model design.  ...  On the other hand, model compression usually starts from a pre-defined/pre-trained deep network and compresses it for more efficient deployment.  ... 
doi:10.1145/3486618 fatcat:h6xwv2slo5eklift2fl24usine

StressedNets: Efficient Feature Representations via Stress-induced Evolutionary Synthesis of Deep Neural Networks [article]

Mohammad Javad Shafiee, Brendan Chwyl, Francis Li, Rongyan Chen, Michelle Karg, Christian Scharfenberger, Alexander Wong
2018 arXiv   pre-print
are imposed upon the synapses of a deep neural network during training to induce stress and steer the synthesis process towards the production of more efficient deep neural networks over successive generations  ...  One particularly promising strategy for addressing the complexity issue is the notion of evolutionary synthesis of deep neural networks, which was demonstrated to successfully produce highly efficient deep  ...  (NSERC) for their financial support.  ... 
arXiv:1801.05387v1 fatcat:3r53yksey5e4vjfh7qq7bpltbm

Pruning and Quantization for Deep Neural Network Acceleration: A Survey [article]

Tailin Liang, John Glossner, Lei Wang, Shaobo Shi, Xiaotong Zhang
2021 arXiv   pre-print
This paper provides a survey on two types of network compression: pruning and quantization. Pruning can be categorized as static if it is performed offline or dynamic if it is performed at run-time.  ...  We compare current techniques, analyze their strengths and weaknesses, present compressed network accuracy results on a number of frameworks, and provide practical guidance for compressing networks.  ...  Deep Neural Networks (DNNs) have shown extraordinary abilities in complicated applications such as image classification, object detection, voice synthesis, and semantic segmentation [138]  ... 
arXiv:2101.09671v3 fatcat:a34q7ca24zbylmjrddlkt3ggai
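This survey covers quantization alongside pruning; as a companion, here is a minimal sketch of symmetric uniform post-training quantization. The per-tensor scale, signed integer range, and 8-bit default are illustrative assumptions, not a specific scheme from the survey:

```python
def quantize(values, num_bits=8):
    # Symmetric uniform quantization: map floats to signed integers in
    # [-(2**(num_bits-1) - 1), 2**(num_bits-1) - 1] via a per-tensor scale.
    qmax = 2 ** (num_bits - 1) - 1
    scale = max(abs(v) for v in values) / qmax
    q = [round(v / scale) for v in values]
    return q, scale

def dequantize(q, scale):
    # Recover approximate float values from the integer codes.
    return [v * scale for v in q]

weights = [0.6, -1.0, 0.25, 0.75]
q, scale = quantize(weights, num_bits=8)
restored = dequantize(q, scale)
```

The round-trip error of each weight is bounded by roughly half the scale, which is why 8-bit quantization usually costs little accuracy while 1- or 2-bit schemes must trade accuracy for compression.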


Adam Page, Ali Jafari, Colin Shea, Tinoosh Mohsenin
2017 ACM Journal on Emerging Technologies in Computing Systems  
The sparsification techniques include feature compression partition, structured filter pruning, and dynamic feature pruning.  ...  In particular, deep convolutional neural networks have been shown to dominate on several popular public benchmarks such as the ImageNet database.  ... 
doi:10.1145/3005448 fatcat:quxiy72jtrfipdpeup75mhiizm

Auto Deep Compression by Reinforcement Learning Based Actor-Critic Structure [article]

Hamed Hakkak
2018 arXiv   pre-print
Model compression is an effective technique for deploying neural network models on devices with limited computing and low power.  ...  With a 4-fold reduction in FLOPs, the accuracy is 2.8% higher than the manual compression model for VGG-16 on ImageNet.  ...  The compared pruning strategies differ by less than 2×. A uniform policy applies the same compression ratio to each layer.  ... 
arXiv:1807.02886v1 fatcat:jw66hxc3zjfefml3asyfpx3q5y
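The uniform policy mentioned in the snippet above is the usual baseline that learned (e.g. reinforcement-learning) compression policies are compared against. A tiny sketch of what "same compression ratio for each layer" means in practice; the layer names and sizes are made up for illustration:

```python
def uniform_prune_counts(layer_sizes, ratio):
    # A uniform baseline prunes the same fraction of weights in every
    # layer, regardless of each layer's sensitivity. A learned policy
    # would instead pick a per-layer ratio.
    return {name: int(n * ratio) for name, n in layer_sizes.items()}

# Hypothetical layer sizes (number of weights per layer).
sizes = {"conv1": 1000, "conv2": 4000, "fc": 10000}
pruned = uniform_prune_counts(sizes, ratio=0.5)
```

Because early convolutional layers are often more sensitive to pruning than large fully connected layers, a per-layer policy can usually reach the same overall compression with less accuracy loss than this uniform baseline.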

Fast inference of deep neural networks in FPGAs for particle physics

J. Duarte, S. Han, P. Harris, S. Jindariani, E. Kreinar, B. Kreis, J. Ngadiuba, M. Pierini, R. Rivera, N. Tran, Z. Wu
2018 Journal of Instrumentation  
We present a case study for neural network inference in FPGAs focusing on a classifier for jet substructure which would enable, among many other physics scenarios, searches for new dark sector particles  ...  Machine learning methods are ubiquitous and have proven to be very powerful in LHC physics, and particle physics as a whole.  ...  Neural network training is run on CERN and Amazon AWS GPU resources. Amazon AWS GPU resources are provided through Fermilab as part of a DOE "Field Work Proposal", and in particular,  ... 
doi:10.1088/1748-0221/13/07/p07027 fatcat:bq7km3h5gbbe3l5sltzhkn3eeq

Bringing AI To Edge: From Deep Learning's Perspective [article]

Di Liu, Hao Kong, Xiangzhong Luo, Weichen Liu, Ravi Subramaniam
2020 arXiv   pre-print
To bridge the gap, a plethora of deep learning techniques and optimization methods have been proposed in the past years: light-weight deep learning models, network compression, and efficient neural architecture  ...  This paper surveys the representative and latest deep learning techniques that are useful for edge intelligence systems, including hand-crafted models, model compression, hardware-aware neural architecture  ...  This research was conducted in collaboration with HP Inc. and supported by National Research Foundation (NRF) Singapore and the Singapore Government through the Industry Alignment Fund-Industry Collaboration  ... 
arXiv:2011.14808v1 fatcat:g6ib7v7cxbdglihkizw5ldsxcu

Learning Efficient Deep Feature Representations via Transgenerational Genetic Transmission of Environmental Information During Evolutionary Synthesis of Deep Neural Networks

M. J. Shafiee, E. Barshan, F. Li, B. Chwyl, M. Karg, C. Scharfenberger, A. Wong
2017 2017 IEEE International Conference on Computer Vision Workshops (ICCVW)  
One strategy for addressing the complexity issue is the evolutionary deep intelligence framework, which has been demonstrated to enable the synthesis of highly efficient deep neural networks that retain  ...  synapses during training to favor the synthesis of more efficient deep neural networks over successive generations.  ...  Future work includes investigating alternative intra-generational environmental stresses, as well as dynamic strategies for adapting the degree of intra-generational environmental stress based on the intrinsic  ... 
doi:10.1109/iccvw.2017.120 dblp:conf/iccvw/ShafieeBLCKSW17 fatcat:jfqjnwnmprcjtbqgem2nrihbmq

Recent Advances in Convolutional Neural Network Acceleration [article]

Qianru Zhang, Meng Zhang, Tinghuan Chen, Zhifei Sun, Yuzhe Ma, Bei Yu
2018 arXiv   pre-print
In recent years, convolutional neural networks (CNNs) have shown great performance in various fields such as image classification, pattern recognition, and multi-media compression.  ...  We propose a taxonomy in terms of three levels, i.e. structure level, algorithm level, and implementation level, for acceleration methods.  ...  Network Pruning Network pruning originates as a method to reduce the size and over-fitting of a neural network.  ... 
arXiv:1807.08596v1 fatcat:jx66ekaofjhqzdbaueal476bvi

SENTEI: Filter-Wise Pruning with Distillation towards Efficient Sparse Convolutional Neural Network Accelerators

Masayuki SHIMODA, Youki SADA, Ryosuke KURAMOCHI, Shimpei SATO, Hiroki NAKAHARA
2020 IEICE transactions on information and systems  
Therefore, our technique realizes a hardware-aware network with comparable accuracy. key words: sparse convolutional neural network, filter-wise pruning, distillation, FPGA * SENTEI is a Japanese word that  ...  Additionally, the speedup and power efficiency of our FPGA implementation were 33.2× and 87.9× higher than those of the mobile GPU.  ...  for Evolutional Science and Technology (CREST), and the New Energy and Industrial Technology Development Organization (NEDO).  ... 
doi:10.1587/transinf.2020pap0013 fatcat:ezpelksin5c4lkqy73hoocc4mq
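Filter-wise pruning, as in the entry above, removes whole convolutional filters rather than individual weights, so the remaining layer stays dense and hardware-friendly. A toy sketch of the common L1-norm ranking criterion; the 1-D "filters", keep ratio, and function name are illustrative assumptions rather than SENTEI's exact method:

```python
def prune_filters(filters, keep_ratio):
    # Rank each filter by the L1 norm of its weights and keep only the
    # top fraction. Dropping whole filters shrinks the layer's output
    # channels, unlike element-wise pruning which leaves sparse tensors.
    ranked = sorted(range(len(filters)),
                    key=lambda i: sum(abs(w) for w in filters[i]),
                    reverse=True)
    keep = sorted(ranked[:max(1, int(len(filters) * keep_ratio))])
    return [filters[i] for i in keep], keep

# Four toy 1-D "filters"; keep the half with the largest L1 norms.
filters = [[0.1, -0.1], [1.0, 0.5], [0.02, 0.03], [-0.4, 0.6]]
kept, kept_idx = prune_filters(filters, keep_ratio=0.5)
```

Since whole output channels disappear, the next layer's corresponding input channels can be deleted too, which is what makes this style of pruning map well onto FPGA accelerators.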

RSNN: A Software/Hardware Co-optimized Framework for Sparse Convolutional Neural Networks on FPGAs

Weijie You, Chang Wu
2020 IEEE Access  
To balance the computation load on different Processing Units (PUs), we propose a software-based load-balance-aware pruning technique as well as a kernel merging method.  ...  INDEX TERMS Accelerator, convolutional neural network, FPGA, sparse neural network.  ...  When compared with unstructured pruning methods such as Deep Compression [15], our pruning method can reduce the idle computation time and achieve higher computation efficiency.  ... 
doi:10.1109/access.2020.3047144 fatcat:bsdfleunjndabolzjrey7q7tk4

Optimizing the Deep Neural Networks by Layer-Wise Refined Pruning and the Acceleration on FPGA

Hengyi Li, Xuebin Yue, Zhichen Wang, Zhilei Chai, Wenwen Wang, Hiroyuki Tomiyama, Lin Meng, M. Hassaballah
2022 Computational Intelligence and Neuroscience  
To accelerate the practical applications of artificial intelligence, this paper proposes a highly efficient layer-wise refined pruning method for deep neural networks at the software level and accelerates  ...  As for the VGG network, 87.05% of parameters and 75.78% of Floating-Point Operations are pruned with only 0.74% accuracy loss for VGG13BN on CIFAR10.  ...  Deep neural networks (DNNs) have been applied for multiple tasks such as target classification, detection, and recognition.  ... 
doi:10.1155/2022/8039281 pmid:35694575 pmcid:PMC9177312 fatcat:xukxq54fl5ha7e6xzqykv7rw3e

Good Subnetworks Provably Exist: Pruning via Greedy Forward Selection [article]

Mao Ye, Chengyue Gong, Lizhen Nie, Denny Zhou, Adam Klivans, Qiang Liu
2020 arXiv   pre-print
Recent empirical works show that large deep neural networks are often highly redundant and that one can find much smaller subnetworks without a significant drop in accuracy.  ...  However, most existing methods of network pruning are empirical and heuristic, leaving it open whether good subnetworks provably exist, how to find them efficiently, and whether network pruning can be provably  ...  ., Pedram, A., Horowitz, M. A., and Dally, W. J. EIE: efficient inference engine on compressed deep neural network. ACM SIGARCH Computer Architecture News, 44(3):243-254, 2016a.  ... 
arXiv:2003.01794v3 fatcat:wsfcq6em4zd4domik3sp7ltdk4
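Greedy forward selection, as described in the entry above, builds a subnetwork bottom-up instead of pruning top-down: start empty and repeatedly add the unit that most reduces the loss. A toy sketch of that loop; the unit outputs, the additive loss, and the budget are illustrative assumptions, not the paper's actual objective:

```python
def greedy_forward_select(candidates, loss, budget):
    # Start from an empty subnetwork and repeatedly add the candidate
    # unit whose inclusion yields the lowest loss, until the size budget
    # is reached. `loss` maps a list of chosen units to a score.
    chosen = []
    while len(chosen) < budget:
        best = min((c for c in candidates if c not in chosen),
                   key=lambda c: loss(chosen + [c]))
        chosen.append(best)
    return chosen

# Illustrative loss: distance of the summed unit outputs to a target.
target = 1.0
outputs = {"a": 0.9, "b": 0.4, "c": 0.05, "d": -0.3}
loss = lambda units: abs(sum(outputs[u] for u in units) - target)
subnet = greedy_forward_select(list(outputs), loss, budget=2)
```

The contrast with backward pruning is the point: instead of deciding which of many redundant units to delete, each greedy step only has to identify the single most useful unit to keep.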