
Towards Accurate Quantization and Pruning via Data-free Knowledge Transfer [article]

Chen Zhu, Zheng Xu, Ali Shafahi, Manli Shu, Amin Ghiasi, Tom Goldstein
2020 arXiv   pre-print
We study data-free quantization and pruning by transferring knowledge from trained large networks to compact networks.  ...  When large-scale training data is available, one can obtain compact and accurate networks that can be deployed effectively in resource-constrained environments through quantization and pruning.  ...  Data-free Quantization and Pruning via Adversarial Training: Inspired by [MS19], we exploit adversarial training in a knowledge distillation setting [HVD15] for data-free quantization and  ...
arXiv:2010.07334v1 fatcat:q45d2vgdybcxxomx6h2ble4gxm
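
The snippet describes adversarial training inside a knowledge-distillation loop for data-free compression. Below is a minimal sketch of that pattern, assuming PyTorch and illustrative model objects (`generator`, `teacher`, `student`) and hyperparameters; it is not the authors' code, only the generator-maximizes/student-minimizes disagreement scheme the abstract refers to.

```python
# Hedged sketch of adversarial data-free distillation: the generator synthesizes
# inputs on which teacher and student disagree, the compact student then matches
# the teacher on those inputs. Batch size, z_dim, and temperature are assumptions.
import torch
import torch.nn.functional as F

def kd_divergence(student_logits, teacher_logits, T=4.0):
    """KL divergence between temperature-softened teacher and student outputs."""
    p_t = F.softmax(teacher_logits / T, dim=1)
    log_p_s = F.log_softmax(student_logits / T, dim=1)
    return F.kl_div(log_p_s, p_t, reduction="batchmean") * T * T

def data_free_step(generator, teacher, student, g_opt, s_opt, z_dim=128, batch=64):
    teacher.eval()
    # 1) Generator step: maximize teacher-student disagreement on synthetic inputs.
    z = torch.randn(batch, z_dim)
    x = generator(z)
    with torch.no_grad():
        t_logits = teacher(x)
    g_loss = -kd_divergence(student(x), t_logits)
    g_opt.zero_grad(); g_loss.backward(); g_opt.step()

    # 2) Student step: minimize the same divergence on freshly generated samples.
    z = torch.randn(batch, z_dim)
    x = generator(z).detach()
    with torch.no_grad():
        t_logits = teacher(x)
    s_loss = kd_divergence(student(x), t_logits)
    s_opt.zero_grad(); s_loss.backward(); s_opt.step()
    return g_loss.item(), s_loss.item()
```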

Efficient Synthesis of Compact Deep Neural Networks [article]

Wenhan Xia, Hongxu Yin, Niraj K. Jha
2020 arXiv   pre-print
In this paper, we review major approaches for automatically synthesizing compact, yet accurate, DNN/LSTM models suitable for real-world applications.  ...  Long short-term memories (LSTMs) are a type of recurrent neural network that has also found widespread use in the context of sequential data modeling.  ...  One such method is DeepInversion [46], an image synthesis methodology that enables data-free knowledge transfer. One of its many applications is data-free pruning.  ...
arXiv:2004.08704v1 fatcat:g6gu7ng2zjda7minnahitn455a
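
The DeepInversion idea mentioned in the snippet synthesizes images directly from a trained network. A minimal sketch of that style of inversion follows, assuming PyTorch; the loss weighting, step count, and input resolution are illustrative assumptions rather than the published recipe.

```python
# Hedged sketch of DeepInversion-style synthesis: optimize random inputs so the
# frozen teacher's per-layer BatchNorm statistics match the stored running
# statistics, while cross-entropy pushes the images toward chosen labels.
import torch
import torch.nn as nn
import torch.nn.functional as F

def bn_stat_loss(module, inputs, losses):
    x = inputs[0]
    mean = x.mean(dim=[0, 2, 3])
    var = x.var(dim=[0, 2, 3], unbiased=False)
    losses.append(F.mse_loss(mean, module.running_mean) +
                  F.mse_loss(var, module.running_var))

def deep_inversion(teacher, labels, steps=2000, lr=0.05, bn_weight=10.0):
    teacher.eval()
    x = torch.randn(labels.size(0), 3, 224, 224, requires_grad=True)
    opt = torch.optim.Adam([x], lr=lr)
    bn_losses = []
    hooks = [m.register_forward_hook(
                 lambda mod, inp, out: bn_stat_loss(mod, inp, bn_losses))
             for m in teacher.modules() if isinstance(m, nn.BatchNorm2d)]
    for _ in range(steps):
        bn_losses.clear()
        opt.zero_grad()
        loss = F.cross_entropy(teacher(x), labels) + bn_weight * sum(bn_losses)
        loss.backward()
        opt.step()
    for h in hooks:
        h.remove()
    return x.detach()
```

The synthesized batch can then serve as a stand-in for training data in data-free pruning or distillation, which is the application the snippet highlights.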

A Survey on Efficient Convolutional Neural Networks and Hardware Acceleration

Deepak Ghimire, Dayoung Kil, Seong-heum Kim
2022 Electronics  
In this review, to improve the efficiency of deep learning research, we focus on three aspects: quantized/binarized models, optimized architectures, and resource-constrained systems.  ...  The learning capability of convolutional neural networks (CNNs) originates from a combination of various feature extraction layers that fully utilize a large amount of data.  ...  Acknowledgments: We appreciate our reviewers and editors for their precious time in providing valuable comments and improving our paper.  ... 
doi:10.3390/electronics11060945 fatcat:bxxgccwkujatzh4onkzh5lgspm
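
For the quantized/binarized models this review covers, a short sketch of the two most basic operations is given below, assuming PyTorch; the bit-width, symmetric rounding scheme, and XNOR-Net-style scaling are illustrative choices, not the survey's specific methods.

```python
# Hedged sketch: per-tensor uniform quantize/dequantize and simple binarization.
import torch

def quantize_dequantize(w: torch.Tensor, bits: int = 8):
    qmax = 2 ** (bits - 1) - 1                       # symmetric signed range
    scale = w.abs().max() / qmax + 1e-12             # avoid division by zero
    q = torch.clamp(torch.round(w / scale), -qmax, qmax)
    return q * scale, q.to(torch.int8), scale        # dequantized, codes, scale

def binarize(w: torch.Tensor):
    # Sign of the weights scaled by their mean absolute value (XNOR-Net style).
    alpha = w.abs().mean()
    return alpha * torch.sign(w)
```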

Efficient Deep Learning in Network Compression and Acceleration [chapter]

Shiming Ge
2018 Digital Systems  
It is important to design or develop efficient methods to support deep learning toward enabling its scalable deployment, particularly for embedded devices such as mobile, Internet of Things (IoT), and  ...  I will describe the central ideas behind each approach and explore the similarities and differences between different methods. Finally, I will present some future directions in this field.  ...  Acknowledgements: This work was partially supported by grants from the National Key Research and Development Plan (2016YFC0801005), the National Natural Science Foundation of China (61772513), and the International  ...
doi:10.5772/intechopen.79562 fatcat:ya65wwhk5neppgxrut5phd42dy

Bringing AI To Edge: From Deep Learning's Perspective [article]

Di Liu, Hao Kong, Xiangzhong Luo, Weichen Liu, Ravi Subramaniam
2020 arXiv   pre-print
search and adaptive deep learning models.  ...  However, the development of edge intelligence systems encounters some challenges, and one of these challenges is the computational gap between computation-intensive deep learning algorithms and less-capable  ...  This research was conducted in collaboration with HP Inc. and supported by National Research Foundation (NRF) Singapore and the Singapore Government through the Industry Alignment Fund-Industry Collaboration  ... 
arXiv:2011.14808v1 fatcat:g6ib7v7cxbdglihkizw5ldsxcu

A Survey on Green Deep Learning [article]

Jingjing Xu, Wangchunshu Zhou, Zhiyi Fu, Hao Zhou, Lei Li
2021 arXiv   pre-print
The target is to yield novel results with lightweight and efficient technologies. Many technologies can be used to achieve this goal, like model compression and knowledge distillation.  ...  We classify these approaches into four categories: (1) compact networks, (2) energy-efficient training strategies, (3) energy-efficient inference approaches, and (4) efficient data usage.  ...  Therefore, many studies focus on transferring knowledge via initialization from low-layer features and mid-layer features.  ... 
arXiv:2111.05193v2 fatcat:t2blz24y2jakteeeawqqogbkpy
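
The snippet notes work on transferring knowledge via initialization from low- and mid-layer features. A minimal sketch of that warm-start idea follows, assuming PyTorch and that both models expose a `layers` list; the attribute name and layer count are illustrative assumptions.

```python
# Hedged sketch: initialize a shallower student by copying the teacher's lowest
# layers before efficiency-oriented training (requires matching layer shapes).
import torch.nn as nn

def init_student_from_teacher(student: nn.Module, teacher: nn.Module, n_layers: int):
    for s_layer, t_layer in zip(student.layers[:n_layers], teacher.layers[:n_layers]):
        s_layer.load_state_dict(t_layer.state_dict())
    return student
```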

The NLP Cookbook: Modern Recipes for Transformer based Deep Learning Architectures [article]

Sushant Singh, Ausif Mahmood
2021 arXiv   pre-print
Consequently, some of the recent NLP architectures have utilized concepts of transfer learning, pruning, quantization, and knowledge distillation to achieve moderate model sizes while keeping nearly similar  ...  Additionally, to mitigate the data size challenge raised by language models from a knowledge extraction perspective, Knowledge Retrievers have been built to extricate explicit data documents from a large  ...  For the above-outlined objectives, three training strategies are proposed: (i) Auxiliary Knowledge Transfer: intermediary transfer via a linear combination of all layer transfer losses  ...
arXiv:2104.10640v3 fatcat:ctuyddhm3baajk5uqrynwdap44
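
The "auxiliary knowledge transfer" strategy in the snippet combines per-layer transfer losses linearly. A minimal sketch of such a combination is shown below, assuming PyTorch; the one-to-one layer mapping and uniform weights are illustrative assumptions.

```python
# Hedged sketch: weighted sum of per-layer losses between student and teacher
# hidden states (lists of tensors of matching shape, one entry per mapped layer).
import torch.nn.functional as F

def layerwise_transfer_loss(student_hiddens, teacher_hiddens, weights=None):
    weights = weights or [1.0] * len(student_hiddens)
    loss = 0.0
    for w, h_s, h_t in zip(weights, student_hiddens, teacher_hiddens):
        loss = loss + w * F.mse_loss(h_s, h_t)
    return loss
```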

The NLP Cookbook: Modern Recipes for Transformer Based Deep Learning Architectures

Sushant Singh, Ausif Mahmood
2021 IEEE Access  
Consequently, some of the recent NLP architectures have utilized concepts of transfer learning, pruning, quantization, and knowledge distillation to achieve moderate model sizes while keeping nearly similar  ...  Additionally, to mitigate the data size challenge raised by language models from a knowledge extraction perspective, Knowledge Retrievers have been built to extricate explicit data documents from a large  ...  For the above-outlined objectives, three training strategies are proposed: (i) Auxiliary Knowledge Transfer: intermediary transfer via a linear combination of all layer transfer losses  ...
doi:10.1109/access.2021.3077350 fatcat:gchmms4m2ndvzdowgrvro3w6z4

Edge-Cloud Polarization and Collaboration: A Comprehensive Survey [article]

Jiangchao Yao, Shengyu Zhang, Yang Yao, Feng Wang, Jianxin Ma, Jianwei Zhang, Yunfei Chu, Luo Ji, Kunyang Jia, Tao Shen, Anpeng Wu, Fengda Zhang (+6 others)
2021 arXiv   pre-print
Influenced by the great success of deep learning via cloud computing and the rapid development of edge chips, research in artificial intelligence (AI) has shifted to both of the computing paradigms, i.e.,  ...  progress in developing more advanced AI models on cloud servers that surpass traditional deep learning models owing to model innovations (e.g., Transformers, Pretrained families), the explosion of training data  ...  G-META [322] uses local subgraphs to transfer subgraph-specific information and learn transferable knowledge faster via meta gradients with only a handful of nodes or edges in the new task.  ...
arXiv:2111.06061v2 fatcat:qhbyomrom5ghvikjlqkqb7eayq

EdgeAI: A Vision for Deep Learning in IoT Era [article]

Kartikeya Bhardwaj, Naveen Suda, Radu Marculescu
2019 arXiv   pre-print
Specifically, we discuss the existing directions in computation-aware deep learning and describe two new challenges in the IoT era: (1) Data-independent deployment of learning, and (2) Communication-aware  ...  A relevant prior work is Data-Free Knowledge Distillation (DFKD) [8], which also relies on some metadata.  ...  Major challenges in EdgeAI: (a) Computation-aware model compression: key techniques include Pruning, Quantization, and Knowledge Distillation (KD) [4], [5]; (b) Data-independent model compression: compress  ...
arXiv:1910.10356v1 fatcat:6df62csanbcldaf5q6y47wymt4

Efficient Visual Recognition with Deep Neural Networks: A Survey on Recent Advances and New Directions [article]

Yang Wu, Dingheng Wang, Xiaotong Lu, Fan Yang, Guoqi Li, Weisheng Dong, Jianbo Shi
2021 arXiv   pre-print
We investigate not only from the model but also the data point of view (which is not the case in existing surveys), and focus on the three most studied data types (images, videos, and points).  ...  This paper attempts to provide a systematic summary via a comprehensive survey which can serve as a valuable reference and inspire both researchers and practitioners who work on visual recognition problems  ...  via transfer learning: fine-tuning transfers the learned knowledge of the source task to a related task with domain adaptation [118]-[122]; pre-trained models with fine-tuning are generally better than  ...
arXiv:2108.13055v2 fatcat:nf3lymdbvzgl7otl7gjkk5qitq
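
The fine-tuning recipe the snippet refers to (reuse a source-task model, adapt it to a related target task) can be sketched as below, assuming PyTorch; the torchvision ResNet-18 backbone and the freeze-the-backbone choice are illustrative assumptions, not the survey's prescription.

```python
# Hedged sketch of transfer learning by fine-tuning: load a pre-trained backbone,
# optionally freeze it, and replace the classification head for the target task.
import torch.nn as nn
from torchvision import models

def build_finetune_model(num_target_classes: int, freeze_backbone: bool = True):
    model = models.resnet18(weights="IMAGENET1K_V1")
    if freeze_backbone:
        for p in model.parameters():
            p.requires_grad = False
    model.fc = nn.Linear(model.fc.in_features, num_target_classes)  # new head trains
    return model
```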

Compression of Deep Learning Models for Text: A Survey [article]

Manish Gupta, Puneet Agrawal
2021 arXiv   pre-print
In this survey, we discuss six different types of methods (Pruning, Quantization, Knowledge Distillation, Parameter Sharing, Tensor Decomposition, and Sub-quadratic Transformer-based methods) for compression  ...  (GPT-2) [94], Multi-task Deep Neural Network (MT-DNN) [73], Extra-Long Network (XLNet) [134], Text-to-text transfer transformer (T5) [95], T-NLG [98] and GShard [63].  ...  Then, we conduct knowledge transfer from this teacher to MobileBERT using feature map transfer and attention transfer across all layers.  ...
arXiv:2008.05221v4 fatcat:6frf2wzi7zganaqgkuvy4szgmq
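
The feature map transfer and attention transfer mentioned for MobileBERT can be sketched as two per-layer terms, assuming PyTorch; tensor shapes, the mixing weight, and the epsilon are illustrative assumptions rather than the paper's exact formulation.

```python
# Hedged sketch: per-layer feature-map transfer (MSE on hidden states) plus
# attention transfer (KL between per-head attention distributions).
import torch
import torch.nn.functional as F

def feature_map_transfer(h_student, h_teacher):
    return F.mse_loss(h_student, h_teacher)

def attention_transfer(att_student, att_teacher, eps=1e-12):
    # att_*: (batch, heads, seq, seq) attention probabilities
    return F.kl_div((att_student + eps).log(), att_teacher, reduction="batchmean")

def layer_distillation_loss(h_s, h_t, a_s, a_t, beta=1.0):
    return feature_map_transfer(h_s, h_t) + beta * attention_transfer(a_s, a_t)
```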

Enable Deep Learning on Mobile Devices: Methods, Systems, and Applications

Han Cai, Ji Lin, Yujun Lin, Zhijian Liu, Haotian Tang, Hanrui Wang, Ligeng Zhu, Song Han
2022 ACM Transactions on Design Automation of Electronic Systems  
To reduce the large design cost of these manual solutions, we discuss the AutoML framework for each of them, such as neural architecture search (NAS) and automated pruning and quantization.  ...  We start by introducing popular model compression methods, including pruning, factorization, quantization, as well as compact model design.  ...  [265] propose a data-free pruning method to remove redundant neurons.  ...
doi:10.1145/3486618 fatcat:h6xwv2slo5eklift2fl24usine
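
The data-free pruning cited in the snippet removes redundant neurons without any training data. A minimal sketch of that idea, assuming PyTorch and a single fully connected layer pair, is given below; the similarity tolerance and the layer handling are illustrative simplifications.

```python
# Hedged sketch: if two neurons have near-identical incoming weights, drop one
# and fold its outgoing weights into the other, with no data involved.
import torch

def merge_similar_neurons(w_in: torch.Tensor, w_out: torch.Tensor, tol: float = 1e-2):
    """w_in: (hidden, in_features) incoming weights; w_out: (out_features, hidden)."""
    keep = list(range(w_in.size(0)))
    i = 0
    while i < len(keep):
        j = i + 1
        while j < len(keep):
            a, b = keep[i], keep[j]
            if torch.norm(w_in[a] - w_in[b]) < tol:
                w_out[:, a] += w_out[:, b]   # fold neuron b's contribution into a
                keep.pop(j)                  # drop neuron b
            else:
                j += 1
        i += 1
    return w_in[keep], w_out[:, keep]
```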

CLIP-Q: Deep Network Compression Learning by In-parallel Pruning-Quantization

Frederick Tung, Greg Mori
2018 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition  
In this paper, we combine network pruning and weight quantization in a single learning framework that performs pruning and quantization jointly, and in parallel with fine-tuning.  ...  This allows us to take advantage of the complementary nature of pruning and quantization and to recover from premature pruning errors, which is not possible with current two-stage approaches.  ...  Acknowledgements This work was supported by the Natural Sciences and Engineering Research Council of Canada.  ... 
doi:10.1109/cvpr.2018.00821 dblp:conf/cvpr/TungM18 fatcat:ooq2o22m7badzn5ch2fik35j6i
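
The abstract describes pruning and quantization performed jointly, in parallel with fine-tuning, so that premature pruning decisions can be revised. The sketch below captures that pattern under stated assumptions (magnitude thresholding and a uniformly spaced codebook instead of the paper's Bayesian threshold and k-means centers); it is not the CLIP-Q implementation.

```python
# Hedged sketch: each step, re-derive a pruning mask and a small codebook from
# the full-precision weights, run the forward pass with the pruned/quantized
# copy, and keep updating the full-precision weights (straight-through), so
# pruned weights can later re-enter the network.
import torch

def prune_and_quantize(w_fp: torch.Tensor, sparsity: float = 0.5, levels: int = 4):
    flat = w_fp.abs().flatten()
    k = max(1, int(sparsity * flat.numel()))
    thresh = flat.kthvalue(k).values
    mask = w_fp.abs() > thresh
    kept = w_fp[mask]
    centers = torch.linspace(kept.min().item(), kept.max().item(), levels)
    idx = torch.argmin((kept.unsqueeze(1) - centers).abs(), dim=1)
    w_q = torch.zeros_like(w_fp)
    w_q[mask] = centers[idx]
    return w_q, mask

# Illustrative use inside a training step:
#   w_q, mask = prune_and_quantize(layer.weight.data)
#   forward with w_q, backward, then update layer.weight (full precision).
```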

Comprehensive SNN Compression Using ADMM Optimization and Activity Regularization [article]

Lei Deng, Yujie Wu, Yifan Hu, Ling Liang, Guoqi Li, Xing Hu, Yufei Ding, Peng Li, Yuan Xie
2020 arXiv   pre-print
First, we formulate connection pruning and weight quantization as a constrained optimization problem.  ...  Model compression has been proposed as a promising technique to improve running efficiency via parameter and operation reduction.  ...  , and activity regularization, including "Pruning & Regularization", "Quantization & Regularization", "Pruning & Quantization", and "Pruning & Quantization & Regularization".  ...
arXiv:1911.00822v3 fatcat:cmr43cefyrb2ldmmhjxhfotjxm
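
The constrained-optimization framing in the snippet is typically handled with ADMM: alternate a gradient update of the weights with a projection onto the constraint set (sparsity or a quantization grid) and a dual update. A minimal sketch of that loop, assuming PyTorch, is below; the top-k projection, step sizes, and penalty rho are illustrative assumptions, not the paper's exact algorithm.

```python
# Hedged sketch of one ADMM-style step for pruning-constrained training.
import torch

def project_topk(z: torch.Tensor, k: int):
    # Projection onto the set of tensors with at most k nonzeros.
    k = min(k, z.numel())
    out = torch.zeros_like(z)
    idx = z.abs().flatten().topk(k).indices
    out.view(-1)[idx] = z.view(-1)[idx]
    return out

def admm_step(w, u, task_grad, rho=1e-3, lr=1e-2, k=1000):
    z = project_topk(w + u, k)                       # constraint (projection) step
    w = w - lr * (task_grad + rho * (w - z + u))     # primal step on task + penalty
    u = u + w - z                                    # dual update
    return w, u, z
```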