Towards Accurate Quantization and Pruning via Data-free Knowledge Transfer
[article]
2020
arXiv
pre-print
We study data-free quantization and pruning by transferring knowledge from trained large networks to compact networks. ...
When large scale training data is available, one can obtain compact and accurate networks to be deployed in resource-constrained environments effectively through quantization and pruning. ...
Data-free Quantization and Pruning
Data-free via Adversarial Training: Inspired by [MS19], we exploit adversarial training in a knowledge distillation setting [HVD15] for data-free quantization and ... (a minimal sketch of this idea follows this entry)
arXiv:2010.07334v1
fatcat:q45d2vgdybcxxomx6h2ble4gxm
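The entry above describes data-free quantization and pruning driven by adversarial training in a knowledge distillation setting. Below is a minimal, hypothetical sketch of that general idea, assuming PyTorch and user-supplied `generator`, `student`, and `teacher` modules; it is not the paper's exact method or loss.

```python
# Hypothetical sketch of data-free adversarial knowledge distillation:
# a generator synthesizes inputs that maximize teacher-student disagreement,
# while the compact student is trained to match the teacher on those inputs.
# Model names and hyperparameters are illustrative, not from the paper.
import torch
import torch.nn.functional as F

def kd_divergence(student_logits, teacher_logits, T=4.0):
    """KL divergence between softened teacher and student distributions."""
    p_s = F.log_softmax(student_logits / T, dim=1)
    p_t = F.softmax(teacher_logits / T, dim=1)
    return F.kl_div(p_s, p_t, reduction="batchmean") * (T * T)

def train_step(generator, student, teacher, opt_g, opt_s,
               z_dim=128, batch=64, device="cpu"):
    teacher.eval()

    # 1) Generator step: produce inputs on which the student disagrees with the teacher.
    z = torch.randn(batch, z_dim, device=device)
    fake = generator(z)
    loss_g = -kd_divergence(student(fake), teacher(fake).detach())
    opt_g.zero_grad()
    loss_g.backward()
    opt_g.step()

    # 2) Student step: imitate the teacher on freshly generated inputs.
    with torch.no_grad():
        fake = generator(torch.randn(batch, z_dim, device=device))
        t_out = teacher(fake)
    loss_s = kd_divergence(student(fake), t_out)
    opt_s.zero_grad()
    loss_s.backward()
    opt_s.step()
    return loss_g.item(), loss_s.item()
```

In practice the `student` here would be the quantized or pruned network, and additional regularizers are commonly added to keep the generated inputs realistic.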
Efficient Synthesis of Compact Deep Neural Networks
[article]
2020
arXiv
pre-print
In this paper, we review major approaches for automatically synthesizing compact, yet accurate, DNN/LSTM models suitable for real-world applications. ...
Long short-term memories (LSTMs) are a type of recurrent neural network that have also found widespread use in the context of sequential data modeling. ...
One such method is DeepInversion [46] , an image synthesis methodology that enables data-free knowledge transfer. One of its many applications is data-free pruning. ...
arXiv:2004.08704v1
fatcat:g6gu7ng2zjda7minnahitn455a
A Survey on Efficient Convolutional Neural Networks and Hardware Acceleration
2022
Electronics
In this review, to improve the efficiency of deep learning research, we focus on three aspects: quantized/binarized models, optimized architectures, and resource-constrained systems. ...
The learning capability of convolutional neural networks (CNNs) originates from a combination of various feature extraction layers that fully utilize a large amount of data. ...
Acknowledgments: We appreciate our reviewers and editors for their precious time in providing valuable comments and improving our paper. ...
doi:10.3390/electronics11060945
fatcat:bxxgccwkujatzh4onkzh5lgspm
Efficient Deep Learning in Network Compression and Acceleration
[chapter]
2018
Digital Systems
It is important to design or develop efficient methods to support deep learning toward enabling its scalable deployment, particularly for embedded devices such as mobile, Internet of Things (IoT), and ...
I will describe the central ideas behind each approach and explore the similarities and differences between different methods. Finally, I will present some future directions in this field. ...
Acknowledgements This work was partially supported by grants from National Key Research and Development Plan (2016YFC0801005), National Natural Science Foundation of China (61772513), and the International ...
doi:10.5772/intechopen.79562
fatcat:ya65wwhk5neppgxrut5phd42dy
Bringing AI To Edge: From Deep Learning's Perspective
[article]
2020
arXiv
pre-print
... search and adaptive deep learning models. ...
However, the development of edge intelligence systems encounters some challenges, and one of these challenges is the computational gap between computation-intensive deep learning algorithms and less-capable ...
This research was conducted in collaboration with HP Inc. and supported by National Research Foundation (NRF) Singapore and the Singapore Government through the Industry Alignment Fund-Industry Collaboration ...
arXiv:2011.14808v1
fatcat:g6ib7v7cxbdglihkizw5ldsxcu
A Survey on Green Deep Learning
[article]
2021
arXiv
pre-print
The target is to yield novel results with lightweight and efficient technologies. Many technologies can be used to achieve this goal, like model compression and knowledge distillation. ...
We classify these approaches into four categories: (1) compact networks, (2) energy-efficient training strategies, (3) energy-efficient inference approaches, and (4) efficient data usage. ...
Therefore, many studies focus on transferring knowledge via initialization from low-layer features and mid-layer features. ...
arXiv:2111.05193v2
fatcat:t2blz24y2jakteeeawqqogbkpy
The NLP Cookbook: Modern Recipes for Transformer based Deep Learning Architectures
[article]
2021
arXiv
pre-print
Consequently, some of the recent NLP architectures have utilized concepts of transfer learning, pruning, quantization, and knowledge distillation to achieve moderate model sizes while keeping nearly similar ...
Additionally, to mitigate the data size challenge raised by language models from a knowledge extraction perspective, Knowledge Retrievers have been built to extricate explicit data documents from a large ...
For the above-outlined objectives, three training strategies are proposed: (i) Auxiliary Knowledge Transfer: intermediary transfer via a linear combination of all layer transfer losses ...
arXiv:2104.10640v3
fatcat:ctuyddhm3baajk5uqrynwdap44
The NLP Cookbook: Modern Recipes for Transformer Based Deep Learning Architectures
2021
IEEE Access
Consequently, some of the recent NLP architectures have utilized concepts of transfer learning, pruning, quantization, and knowledge distillation to achieve moderate model sizes while keeping nearly similar ...
Additionally, to mitigate the data size challenge raised by language models from a knowledge extraction perspective, Knowledge Retrievers have been built to extricate explicit data documents from a large ...
For the above-outlined objectives, three training strategies are proposed: (i) Auxiliary Knowledge Transfer: intermediary transfer via a linear combination of all layer transfer losses ...
doi:10.1109/access.2021.3077350
fatcat:gchmms4m2ndvzdowgrvro3w6z4
Edge-Cloud Polarization and Collaboration: A Comprehensive Survey
[article]
2021
arXiv
pre-print
Influenced by the great success of deep learning via cloud computing and the rapid development of edge chips, research in artificial intelligence (AI) has shifted to both of the computing paradigms, i.e ...
... progress in developing more advanced AI models on cloud servers that surpass traditional deep learning models owing to model innovations (e.g., Transformers, Pretrained families), explosion of training data ...
G-META [322] uses local subgraphs to transfer subgraph-specific information and learn transferable knowledge faster via meta gradients with only a handful of nodes or edges in the new task. ...
arXiv:2111.06061v2
fatcat:qhbyomrom5ghvikjlqkqb7eayq
EdgeAI: A Vision for Deep Learning in IoT Era
[article]
2019
arXiv
pre-print
Specifically, we discuss the existing directions in computation-aware deep learning and describe two new challenges in the IoT era: (1) Data-independent deployment of learning, and (2) Communication-aware ...
A relevant prior work is Data-Free Knowledge Distillation (DFKD) [8] which also relies on some metadata. ...
Major challenges in EdgeAI: (a) Computation-aware model compression: key techniques include Pruning, Quantization, and Knowledge Distillation (KD) [4], [5]; (b) Data-independent model compression: compress ...
arXiv:1910.10356v1
fatcat:6df62csanbcldaf5q6y47wymt4
Efficient Visual Recognition with Deep Neural Networks: A Survey on Recent Advances and New Directions
[article]
2021
arXiv
pre-print
We investigate not only the model point of view but also the data point of view (which is not the case in existing surveys), and focus on the three most studied data types (images, videos, and points). ...
This paper attempts to provide a systematic summary via a comprehensive survey which can serve as a valuable reference and inspire both researchers and practitioners who work on visual recognition problems ...
Via transfer learning (fine-tuning): transfer the learned knowledge of the source task to a related task with domain adaptation [118]-[122]; pre-trained models with fine-tuning are generally better than ...
arXiv:2108.13055v2
fatcat:nf3lymdbvzgl7otl7gjkk5qitq
Compression of Deep Learning Models for Text: A Survey
[article]
2021
arXiv
pre-print
In this survey, we discuss six different types of methods (Pruning, Quantization, Knowledge Distillation, Parameter Sharing, Tensor Decomposition, and Sub-quadratic Transformer based methods) for compression ...
(GPT-2) [94], Multi-task Deep Neural Network (MT-DNN) [73], Extra-Long Network (XLNet) [134], Text-to-text transfer transformer (T5) [95], T-NLG [98] and GShard [63]. ...
Then, we conduct knowledge transfer from this teacher to MobileBERT using feature map transfer and attention transfer across all layers (a sketch of such transfer losses follows this entry). ...
arXiv:2008.05221v4
fatcat:6frf2wzi7zganaqgkuvy4szgmq
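The snippet above mentions feature map transfer and attention transfer across all layers, as in MobileBERT-style distillation. A rough sketch of what such layer-wise transfer losses can look like, assuming PyTorch and matched lists of per-layer hidden states and attention maps; the shapes and weighting are illustrative assumptions, not taken from the paper.

```python
# Hypothetical sketch of layer-wise feature-map transfer and attention transfer.
import torch
import torch.nn.functional as F

def feature_map_transfer(student_hidden, teacher_hidden):
    """MSE between per-layer hidden states (lists of [batch, seq, dim] tensors)."""
    return sum(F.mse_loss(s, t) for s, t in zip(student_hidden, teacher_hidden))

def attention_transfer(student_attn, teacher_attn, eps=1e-12):
    """KL divergence between per-layer attention distributions
    (lists of [batch, heads, seq, seq] tensors normalized on the last dim)."""
    loss = 0.0
    for s, t in zip(student_attn, teacher_attn):
        loss = loss + F.kl_div((s + eps).log(), t, reduction="batchmean")
    return loss

def distillation_loss(student_out, teacher_out, alpha=1.0, beta=1.0):
    """Combine both terms; student_out/teacher_out are (hidden_list, attn_list)."""
    s_hidden, s_attn = student_out
    t_hidden, t_attn = teacher_out
    return alpha * feature_map_transfer(s_hidden, t_hidden) + \
           beta * attention_transfer(s_attn, t_attn)
```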
Enable Deep Learning on Mobile Devices: Methods, Systems, and Applications
2022
ACM Transactions on Design Automation of Electronic Systems
To reduce the large design cost of these manual solutions, we discuss the AutoML framework for each of them, such as neural architecture search (NAS) and automated pruning and quantization. ...
We start from introducing popular model compression methods, including pruning, factorization, quantization, as well as compact model design. ...
[265] propose a data-free pruning method to remove redundant neurons (see the sketch after this entry). ...
doi:10.1145/3486618
fatcat:h6xwv2slo5eklift2fl24usine
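Reference [265] in the entry above is described as a data-free pruning method that removes redundant neurons. One common data-free idea in this vein is to merge pairs of units whose incoming weights are nearly identical, folding one unit's outgoing weights into the other's. The sketch below, assuming PyTorch and two consecutive fully connected layers, is a hypothetical simplification and not necessarily the exact procedure of [265].

```python
# Hypothetical sketch of data-free neuron pruning by merging similar neurons:
# if two units in a layer have near-identical incoming weights, one can be
# removed and its outgoing weights added to the other's, with no training data.
# The similarity threshold is an illustrative choice.
import torch

@torch.no_grad()
def merge_similar_neurons(fc1, fc2, threshold=1e-2):
    """fc1, fc2: consecutive torch.nn.Linear layers (fc1 feeds fc2)."""
    W1 = fc1.weight          # [hidden, in]  incoming weights of hidden units
    W2 = fc2.weight          # [out, hidden] outgoing weights of hidden units
    keep = torch.ones(W1.shape[0], dtype=torch.bool)

    for i in range(W1.shape[0]):
        if not keep[i]:
            continue
        for j in range(i + 1, W1.shape[0]):
            if keep[j] and torch.norm(W1[i] - W1[j]) < threshold:
                # Fold unit j's contribution into unit i, then drop unit j.
                W2[:, i] += W2[:, j]
                keep[j] = False

    fc1.weight = torch.nn.Parameter(W1[keep])
    if fc1.bias is not None:
        fc1.bias = torch.nn.Parameter(fc1.bias[keep])
    fc1.out_features = int(keep.sum())
    fc2.weight = torch.nn.Parameter(W2[:, keep])
    fc2.in_features = int(keep.sum())
    return fc1, fc2
```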
CLIP-Q: Deep Network Compression Learning by In-parallel Pruning-Quantization
2018
2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition
In this paper, we combine network pruning and weight quantization in a single learning framework that performs pruning and quantization jointly, and in parallel with fine-tuning. ...
This allows us to take advantage of the complementary nature of pruning and quantization and to recover from premature pruning errors, which is not possible with current two-stage approaches (a rough sketch of this in-parallel pattern follows this entry). ...
Acknowledgements This work was supported by the Natural Sciences and Engineering Research Council of Canada. ...
doi:10.1109/cvpr.2018.00821
dblp:conf/cvpr/TungM18
fatcat:ooq2o22m7badzn5ch2fik35j6i
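CLIP-Q, per the entry above, prunes and quantizes jointly while fine-tuning, keeping full-precision weights so that earlier pruning and quantization decisions can be revisited. The sketch below is a rough, hypothetical illustration of that in-parallel pattern, assuming PyTorch with a simple magnitude mask and uniform quantizer; it is not the paper's clipping and partitioning scheme.

```python
# Hypothetical sketch of in-parallel pruning + quantization during fine-tuning:
# full-precision weights are kept and updated, and at every step a pruning mask
# and a uniform quantizer are re-derived from them, so earlier decisions can be
# revisited. Sparsity level and bit-width are illustrative choices.
import torch

def prune_and_quantize(weight, sparsity=0.5, bits=4):
    """Return a pruned + uniformly quantized copy of a full-precision weight tensor."""
    # Magnitude pruning: zero out the smallest-magnitude fraction of weights.
    k = int(sparsity * weight.numel())
    if k > 0:
        threshold = weight.abs().flatten().kthvalue(k).values
        mask = (weight.abs() > threshold).float()
    else:
        mask = torch.ones_like(weight)

    # Uniform quantization of the surviving weights to 2**bits levels.
    w = weight * mask
    w_min, w_max = w.min(), w.max()
    scale = (w_max - w_min) / (2 ** bits - 1) + 1e-12
    w_q = torch.round((w - w_min) / scale) * scale + w_min
    return w_q * mask

class PQLinear(torch.nn.Linear):
    """Linear layer that applies pruning + quantization on the forward pass,
    while gradients update the underlying full-precision weights."""
    def forward(self, x):
        w_q = prune_and_quantize(self.weight)
        # Straight-through estimator: forward uses w_q, backward flows to self.weight.
        w = self.weight + (w_q - self.weight).detach()
        return torch.nn.functional.linear(x, w, self.bias)
```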
Comprehensive SNN Compression Using ADMM Optimization and Activity Regularization
[article]
2020
arXiv
pre-print
First, we formulate the connection pruning and weight quantization as a constrained optimization problem. ...
Model compression has been proposed as a promising technique to improve the running efficiency via parameter and operation reduction. ...
... connection pruning, weight quantization, and activity regularization, including "Pruning & Regularization", "Quantization & Regularization", "Pruning & Quantization", and "Pruning & Quantization & Regularization" (an ADMM-style update is sketched after this entry). ...
arXiv:1911.00822v3
fatcat:cmr43cefyrb2ldmmhjxhfotjxm
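The last entry formulates connection pruning and weight quantization as a constrained optimization problem solved with ADMM. Below is a simplified, hypothetical sketch of one ADMM-style iteration for a quantization constraint (approximate primal step, projection onto the quantized set, dual update), assuming the loss gradient is supplied externally; it does not reflect the paper's full spiking-specific algorithm.

```python
# Hypothetical sketch of an ADMM-style update for compressing a weight tensor W
# under a quantization constraint: W is split into an auxiliary variable Z that
# must lie in the quantized set, and a scaled dual variable U couples the two.
# The projection, learning rate, and rho value are illustrative.
import torch

def project_to_quantized(w, bits=2):
    """Project onto the set of tensors whose entries take 2**bits uniform levels."""
    w_min, w_max = w.min(), w.max()
    scale = (w_max - w_min) / (2 ** bits - 1) + 1e-12
    return torch.round((w - w_min) / scale) * scale + w_min

def admm_step(W, Z, U, loss_grad, lr=1e-3, rho=1e-2):
    """One ADMM iteration:
       1) primal step on W: gradient step on loss + rho/2 * ||W - Z + U||^2
       2) Z-update: project (W + U) onto the quantized set
       3) dual update: U += W - Z
    """
    W = W - lr * (loss_grad + rho * (W - Z + U))   # approximate W-minimization
    Z = project_to_quantized(W + U)                # exact Z-minimization (projection)
    U = U + W - Z                                  # dual ascent
    return W, Z, U
```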
Showing results 1 — 15 out of 759 results