10,327 Hits in 7.4 sec

Design Automation for Efficient Deep Learning Computing [article]

Song Han, Han Cai, Ligeng Zhu, Ji Lin, Kuan Wang, Zhijian Liu, Yujun Lin
2019 arXiv   pre-print
We propose design automation techniques for efficient neural networks. We investigate automatically designing specialized fast models, auto channel pruning, and auto mixed-precision quantization.  ...  Efficient deep learning computing requires algorithm and hardware co-design to enable specialization: we usually need to change the algorithm to reduce memory footprint and improve energy efficiency.  ...  CONCLUSION We present design automation techniques for efficient deep learning computing.  ... 
arXiv:1904.10616v1 fatcat:77ft4alwqvgszhevtcjssnkyzm
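
The entry above lists automated channel pruning among its techniques. As a rough, hypothetical illustration of the pruning primitive such pipelines automate (not the paper's actual method), the sketch below drops the output channels of a convolution with the smallest L1 norms; the helper name and the 50% ratio are made up for the example.

    # Hypothetical sketch of magnitude-based channel pruning, one ingredient of
    # automated pruning pipelines. Names and ratios are illustrative only.
    import torch
    import torch.nn as nn

    def prune_output_channels(conv: nn.Conv2d, ratio: float = 0.3) -> nn.Conv2d:
        """Keep the (1 - ratio) fraction of output channels with the largest L1 norm."""
        n_keep = max(1, int(conv.out_channels * (1.0 - ratio)))
        # L1 norm of each output-channel filter: shape [out_channels]
        scores = conv.weight.detach().abs().sum(dim=(1, 2, 3))
        keep = torch.topk(scores, n_keep).indices.sort().values
        pruned = nn.Conv2d(conv.in_channels, n_keep, conv.kernel_size,
                           stride=conv.stride, padding=conv.padding,
                           bias=conv.bias is not None)
        pruned.weight.data = conv.weight.data[keep].clone()
        if conv.bias is not None:
            pruned.bias.data = conv.bias.data[keep].clone()
        return pruned

    layer = nn.Conv2d(64, 128, 3, padding=1)
    print(prune_output_channels(layer, 0.5))   # Conv2d(64, 64, kernel_size=(3, 3), ...)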

EMC2-NIPS 2019 Abstracts of Invited Talks

2019 2019 Fifth Workshop on Energy Efficient Machine Learning and Cognitive Computing - NeurIPS Edition (EMC2-NIPS)  
In this talk, we will describe how joint algorithm and hardware design can be used to reduce energy consumption while delivering real-time and robust performance for applications including deep learning  ...  This talk will uncover the need for building accurate, platform-specific power and latency models for convolutional neural networks (CNNs) and efficient hardware-aware CNN design methodologies, thus allowing  ...  Hardware-aware Neural Architecture Design for Small and Fast Models: from 2D to 3D Song Han, MIT Efficient deep learning computing requires algorithm and hardware co-design to enable specialization.  ... 
doi:10.1109/emc2-nips53020.2019.00007 fatcat:bvtcsgwxsrh3bmwh6tba3ly3ra

APQ: Joint Search for Network Architecture, Pruning and Quantization Policy

Tianzhe Wang, Kuan Wang, Han Cai, Ji Lin, Zhijian Liu, Hanrui Wang, Yujun Lin, Song Han
2020 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)  
We present APQ, a novel design methodology for efficient deep learning deployment.  ...  we achieve 2×/1.3× latency/energy saving compared to [36] while obtaining the same level of accuracy; the marginal search cost of joint optimization for a new deployment scenario outperforms separate optimizations  ...  Acknowledgments We thank NSF Career Award #1943349, MIT-IBM Watson AI Lab, Samsung, SONY, AWS Machine Learning Research Award for supporting this research.  ... 
doi:10.1109/cvpr42600.2020.00215 dblp:conf/cvpr/WangWCLL0LH20 fatcat:hcgak6ct75fapng3k3jnudsfje
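
To make the "joint manner" concrete, here is a minimal, hypothetical sketch of searching architecture depth, channel (pruning) ratio, and bitwidth together under a latency budget. The functions predict_accuracy and estimate_latency are placeholders standing in for APQ's quantization-aware accuracy predictor and hardware latency model; APQ uses a learned predictor rather than the random scoring used here.

    # Minimal sketch of joint search: sample architecture, pruning ratio and
    # bitwidth together, keep the best configuration within a latency budget.
    import random

    def predict_accuracy(cfg):            # placeholder: a learned predictor in APQ
        return random.random()

    def estimate_latency(cfg):            # placeholder: lookup table / hardware model
        return cfg["depth"] * cfg["width"] * cfg["bits"] * 0.01

    def joint_search(n_trials=1000, latency_budget=1.0):
        best_cfg, best_acc = None, -1.0
        for _ in range(n_trials):
            cfg = {"depth": random.choice([12, 16, 20]),
                   "width": random.choice([0.5, 0.75, 1.0]),   # pruning / channel ratio
                   "bits":  random.choice([4, 6, 8])}          # quantization policy
            if estimate_latency(cfg) > latency_budget:
                continue                                       # violates the constraint
            acc = predict_accuracy(cfg)
            if acc > best_acc:
                best_cfg, best_acc = cfg, acc
        return best_cfg

    print(joint_search())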

HAQ: Hardware-Aware Automated Quantization With Mixed Precision

Kuan Wang, Zhijian Liu, Yujun Lin, Ji Lin, Song Han
2019 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)  
Emergent DNN hardware accelerators begin to support mixed precision (1-8 bits) to further improve the computation efficiency, which raises a great challenge to find the optimal bitwidth for each layer:  ...  There is plenty of specialized hardware for neural networks, but little research has been done on specialized neural network optimization for a particular hardware architecture.  ...  We thank Google Cloud and AWS Machine Learning Research Awards for providing the computation resource.  ... 
doi:10.1109/cvpr.2019.00881 dblp:conf/cvpr/WangLLLH19 fatcat:xlb3d7riejcm7c4udpzgqnk67y
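
HAQ searches the bitwidth of each layer with hardware feedback; the primitive being tuned is k-bit quantization of that layer's weights and activations. Below is a minimal sketch of symmetric uniform ("fake") quantization to k bits, included only to show what varying the bitwidth changes; it is not HAQ's exact quantizer, and the names are illustrative.

    # Minimal per-tensor k-bit symmetric quantization (a simplified stand-in).
    import torch

    def quantize_linear(w: torch.Tensor, bits: int) -> torch.Tensor:
        """Symmetric uniform quantization of a tensor to `bits` bits."""
        qmax = 2 ** (bits - 1) - 1                 # e.g. 127 for 8 bits
        scale = w.abs().max().clamp(min=1e-8) / qmax
        q = torch.clamp(torch.round(w / scale), -qmax, qmax)
        return q * scale                           # dequantized ("fake quantized") values

    w = torch.randn(64, 64)
    for bits in (8, 4, 2):
        err = (w - quantize_linear(w, bits)).abs().mean()
        print(f"{bits}-bit mean abs error: {err:.4f}")   # error grows as bits shrink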

APQ: Joint Search for Network Architecture, Pruning and Quantization Policy [article]

Tianzhe Wang, Kuan Wang, Han Cai, Ji Lin, Zhijian Liu, Song Han
2020 arXiv   pre-print
We present APQ for efficient deep learning inference on resource-constrained hardware.  ...  Unlike previous methods that separately search the neural architecture, pruning policy, and quantization policy, we optimize them in a joint manner.  ...  Acknowledgments We thank NSF Career Award #1943349, MIT-IBM Watson AI Lab, Samsung, SONY, SRC, AWS Machine Learning Research Award for supporting this research.  ... 
arXiv:2006.08509v1 fatcat:kc7qbcokpfap7lt6y6li2fnale

HAQ: Hardware-Aware Automated Quantization with Mixed Precision [article]

Kuan Wang, Zhijian Liu, Yujun Lin, Ji Lin, Song Han
2019 arXiv   pre-print
Emergent DNN hardware accelerators begin to support mixed precision (1-8 bits) to further improve the computation efficiency, which raises a great challenge to find the optimal bitwidth for each layer:  ...  Model quantization is a widely used technique to compress and accelerate deep neural network (DNN) inference.  ...  We thank Google Cloud and AWS Machine Learning Research Awards for providing the computation resource.  ... 
arXiv:1811.08886v3 fatcat:csobmurle5bxrjr73mgyydz6ee

Enable Deep Learning on Mobile Devices: Methods, Systems, and Applications

Han Cai, Ji Lin, Yujun Lin, Zhijian Liu, Haotian Tang, Hanrui Wang, Ligeng Zhu, Song Han
2022 ACM Transactions on Design Automation of Electronic Systems  
To reduce the large design cost of these manual solutions, we discuss the AutoML framework for each of them, such as neural architecture search (NAS) and automated pruning and quantization.  ...  This article provides an overview of efficient deep learning methods, systems, and applications.  ...  Hardware like NVIDIA Volta GPU architecture also paves the way for mixed-precision training.  ... 
doi:10.1145/3486618 fatcat:h6xwv2slo5eklift2fl24usine
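
The snippet above notes that Volta-class GPUs pave the way for mixed-precision training. As an illustration, the sketch below uses PyTorch's automatic mixed precision (autocast plus GradScaler), which is the standard recipe; the toy linear model and random data are placeholders, and a CUDA GPU is assumed.

    # Sketch of a mixed-precision training loop with PyTorch AMP (assumes CUDA).
    import torch
    import torch.nn as nn

    model = nn.Linear(512, 10).cuda()
    optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
    scaler = torch.cuda.amp.GradScaler()

    for step in range(100):
        x = torch.randn(32, 512, device="cuda")
        y = torch.randint(0, 10, (32,), device="cuda")
        optimizer.zero_grad()
        with torch.cuda.amp.autocast():            # run the forward pass in FP16 where safe
            loss = nn.functional.cross_entropy(model(x), y)
        scaler.scale(loss).backward()              # scale the loss to avoid FP16 underflow
        scaler.step(optimizer)
        scaler.update()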

Full-Cycle Energy Consumption Benchmark for Low-Carbon Computer Vision [article]

Bo Li, Xinyang Jiang, Donglin Bai, Yuge Zhang, Ningxin Zheng, Xuanyi Dong, Lu Liu, Yuqing Yang, Dongsheng Li
2021 arXiv   pre-print
The benchmark can provide insights for reducing carbon emissions when selecting efficient deep learning algorithms in different model usage scenarios.  ...  However, most of the existing efficient deep learning methods do not explicitly consider energy consumption as a key performance indicator.  ...  Neural Architecture Search. In this paper, we focus on the subset of Neural Architecture Search (NAS) methods that aim to obtain more efficient neural networks.  ... 
arXiv:2108.13465v2 fatcat:gqiu7mvyhrawdipyhm6x3j77tq
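
Treating energy as a first-class metric requires actually measuring it. A hypothetical sketch of one way to do that for a GPU workload follows: poll device power through NVML while the workload runs and integrate over time. The polling interval, threading scheme, and the workload callable are illustrative, not the benchmark's protocol.

    # Rough GPU energy measurement by integrating sampled power (uses nvidia-ml-py).
    import threading
    import time
    import pynvml

    def measure_energy_joules(workload, interval_s=0.05):
        """Integrate GPU power over the runtime of `workload` (a zero-arg callable)."""
        pynvml.nvmlInit()
        handle = pynvml.nvmlDeviceGetHandleByIndex(0)
        samples, done = [], False

        def poll():                                   # background power sampler
            while not done:
                samples.append(pynvml.nvmlDeviceGetPowerUsage(handle) / 1000.0)  # watts
                time.sleep(interval_s)

        sampler = threading.Thread(target=poll)
        start = time.time()
        sampler.start()
        workload()                                    # run the model under test
        done = True
        sampler.join()
        elapsed = time.time() - start
        pynvml.nvmlShutdown()
        avg_power_w = sum(samples) / max(1, len(samples))
        return avg_power_w * elapsed                  # energy (J) = mean power x time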

ENOS: Energy-Aware Network Operator Search for Hybrid Digital and Compute-in-Memory DNN Accelerators [article]

Shamma Nasrin, Ahish Shylendra, Yuti Kadakia, Nick Iliev, Wilfred Gomes, Theja Tulabandhula, Amit Ranjan Trivedi
2021 arXiv   pre-print
This work proposes a novel Energy-Aware Network Operator Search (ENOS) approach to address the energy-accuracy trade-offs of a deep neural network (DNN) accelerator.  ...  We also discuss a sequential operator assignment strategy in ENOS that only learns the assignment for one layer in one training step, enabling greater flexibility in converging towards the optimal operator  ...  This subfield of machine learning, called Neural Architecture Search (NAS), has seen rapid growth.  ... 
arXiv:2104.05217v1 fatcat:aw66hzjb6je3peanwot7babewy
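
The layer-wise assignment idea can be illustrated with a toy greedy loop: for each layer, choose between a digital and a compute-in-memory operator by a weighted energy/accuracy cost. The operator energy numbers and the accuracy_drop estimate below are invented for the example; ENOS learns the assignment during training rather than using fixed tables.

    # Toy sketch of sequential, layer-wise operator assignment (illustrative numbers).
    OPERATORS = {
        "digital": {"energy_per_mac_pj": 1.0},
        "cim":     {"energy_per_mac_pj": 0.2},
    }

    def accuracy_drop(layer, op):       # placeholder for a learned/measured estimate
        return 0.0 if op == "digital" else 0.05 * layer["sensitivity"]

    def assign_operators(layers, alpha=1.0):
        """Assign one operator per layer, trading energy against accuracy loss."""
        assignment = []
        for layer in layers:
            best_op, best_cost = None, float("inf")
            for op, spec in OPERATORS.items():
                energy_mj = layer["macs"] * spec["energy_per_mac_pj"] * 1e-9  # pJ -> mJ
                cost = energy_mj + alpha * accuracy_drop(layer, op)
                if cost < best_cost:
                    best_op, best_cost = op, cost
            assignment.append(best_op)
        return assignment

    layers = [{"macs": 1e8, "sensitivity": 5.0}, {"macs": 5e8, "sensitivity": 0.1}]
    print(assign_operators(layers))     # e.g. ['digital', 'cim']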

Is my Neural Network Neuromorphic? Taxonomy, Recent Trends and Future Directions in Neuromorphic Engineering [article]

Sumon Kumar Bose, Jyotibdha Acharya, Arindam Basu
2020 arXiv   pre-print
We compare recent machine learning accelerator chips to show that indeed analog processing and reduced bit precision architectures have the best throughput, energy, and area efficiencies.  ...  We see that there is no clear consensus, but each system has one or more of the following features: (1) analog computing, (2) non-von Neumann architecture and low-precision digital processing, (3) spiking neural  ...  INTRODUCTION The rapid progress of Machine Learning (ML) fuelled by Deep Neural Networks (DNN) in the last several years has created an impact in a wide variety of fields ranging from computer vision,  ... 
arXiv:2002.11945v1 fatcat:ntoar4wecrgffdzyvjhids3yva

Hardware-Centric AutoML for Mixed-Precision Quantization

Kuan Wang, Zhijian Liu, Yujun Lin, Ji Lin, Song Han
2020 International Journal of Computer Vision  
Emergent DNN hardware accelerators begin to support mixed precision (1-8 bits) to further improve the computation efficiency, which raises a great challenge to find the optimal bitwidth for each layer:  ...  Model quantization is a widely used technique to compress and accelerate deep neural network (DNN) inference.  ...  Acknowledgements We thank NSF Career Award #1943349, MIT-IBM Watson AI Lab, Samsung, SONY, Xilinx, TI and AWS for supporting this research.  ... 
doi:10.1007/s11263-020-01339-6 fatcat:c565raez3faihf7jhfiidb47mu

Guest Editorial: IEEE TC Special Issue on Domain-Specific Architectures for Emerging Applications

Lisa Wu Wills, Karthik Swaminathan
2020 IEEE transactions on computers  
In the article "Accelerating Deep Neural Network In-situ Training with Non-volatile and Volatile Memory Based Hybrid Precision Synapses" by Yandong Luo and Shimeng Yu, the authors demonstrate a processing-in-memory  ...  The explosion of data and the unprecedented demands for energy efficiency, from data centers to wearable technologies, further exacerbate the urgency to innovate computer systems that can not only keep  ...  In the article "PaRTAA: A Real-time Multiprocessor for Mixed-Criticality Airborne Systems" by Shibarchi Majumder, Jens Nielsen, and Thomas Bak, the authors design an architecture for mixed-criticality  ... 
doi:10.1109/tc.2020.3002674 fatcat:wkojnwhojjfh3ffh6pcnu563vi

NAX: Co-Designing Neural Network and Hardware Architecture for Memristive Xbar based Computing Systems [article]

Shubham Negi, Indranil Chakraborty, Aayush Ankit, Kaushik Roy
2021 arXiv   pre-print
To that effect, we propose NAX -- an efficient neural architecture search engine that co-designs the neural network and the IMC-based hardware architecture.  ...  NAX explores the aforementioned search space to determine kernel and corresponding crossbar sizes for each DNN layer to achieve optimal tradeoffs between hardware efficiency and application accuracy.  ...  Neural Architecture Search (NAS) has been widely used to explore a large search space in order to design efficient neural networks for various deep learning tasks [14, 16, 17].  ... 
arXiv:2106.12125v1 fatcat:fie7mv2giredzpz3zfm5ssm2hu
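
The kernel/crossbar trade-off NAX explores comes from how a convolution unrolls onto memristive crossbars: a k x k convolution with C_in input and C_out output channels needs roughly (k*k*C_in) rows by C_out columns, so an oversized crossbar leaves its tiles mostly empty. The back-of-the-envelope utilization sketch below (candidate sizes and scoring are illustrative, not NAX's cost model) shows the effect that NAX optimizes jointly with accuracy.

    # Crossbar utilization for a conv layer under different crossbar sizes (illustrative).
    import math

    def crossbar_utilization(k, c_in, c_out, xbar):
        rows, cols = k * k * c_in, c_out              # unrolled weight-matrix shape
        tiles = math.ceil(rows / xbar) * math.ceil(cols / xbar)
        return (rows * cols) / (tiles * xbar * xbar)  # fraction of crossbar cells used

    def pick_crossbar(k, c_in, c_out, candidates=(64, 128, 256)):
        """Pick the candidate crossbar size with the highest utilization for this layer."""
        return max(candidates, key=lambda x: crossbar_utilization(k, c_in, c_out, x))

    print(pick_crossbar(3, 64, 128))    # e.g. for a 3x3, 64->128 conv layer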

Rethinking Differentiable Search for Mixed-Precision Neural Networks [article]

Zhaowei Cai, Nuno Vasconcelos
2020 arXiv   pre-print
The resulting Efficient differentiable MIxed-Precision network Search (EdMIPS) method is effective at finding the optimal bit allocation for multiple popular networks, and can search a large model, e.g  ...  The learned mixed-precision networks significantly outperform their uniform counterparts.  ...  Neural architecture search: NAS is a popular approach to automated search of neural network architectures [38, 39, 3, 2, 20].  ... 
arXiv:2004.05795v1 fatcat:glpjspyrcfgk3gjzpz7so47mj4
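
The differentiable trick is to make the bit allocation itself a trainable parameter: keep a logit per candidate bitwidth and use the softmax-weighted mixture of the quantized operator in the forward pass, so the bitwidth choice receives a gradient. The module below is a simplified stand-in for EdMIPS' formulation (in practice a straight-through estimator is also needed so gradients reach the raw weights); names and bit choices are illustrative.

    # Simplified differentiable mixed-precision layer (not EdMIPS' exact formulation).
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    def fake_quant(w, bits):
        qmax = 2 ** (bits - 1) - 1
        scale = w.abs().max().clamp(min=1e-8) / qmax
        return torch.clamp(torch.round(w / scale), -qmax, qmax) * scale

    class MixedPrecisionLinear(nn.Module):
        def __init__(self, in_f, out_f, bit_choices=(2, 4, 8)):
            super().__init__()
            self.weight = nn.Parameter(torch.randn(out_f, in_f) * 0.01)
            self.bit_choices = bit_choices
            self.alpha = nn.Parameter(torch.zeros(len(bit_choices)))  # bitwidth logits

        def forward(self, x):
            probs = F.softmax(self.alpha, dim=0)
            # Softmax-weighted mixture of the weights quantized at each candidate bitwidth
            w_mix = sum(p * fake_quant(self.weight, b)
                        for p, b in zip(probs, self.bit_choices))
            return F.linear(x, w_mix)

    layer = MixedPrecisionLinear(16, 8)
    out = layer(torch.randn(4, 16))
    print(out.shape, F.softmax(layer.alpha, dim=0))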

Rethinking Differentiable Search for Mixed-Precision Neural Networks

Zhaowei Cai, Nuno Vasconcelos
2020 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)  
The resulting Efficient differentiable MIxed-Precision network Search (EdMIPS) method is effective at finding the optimal bit allocation for multiple popular networks, and can search a large model, e.g  ...  The learned mixed-precision networks significantly outperform their uniform counterparts.  ...  Neural architecture search: NAS is a popular approach to automated search of neural network architectures [38, 39, 3, 2, 20].  ... 
doi:10.1109/cvpr42600.2020.00242 dblp:conf/cvpr/CaiV20 fatcat:uqnqbjr6rzg4phexzkk7pvg5lm
Showing results 1 — 15 out of 10,327 results