A Survey of Model Compression and Acceleration for Deep Neural Networks [article]

Yu Cheng, Duo Wang, Pan Zhou, Tao Zhang
2020 arXiv   pre-print
Therefore, a natural thought is to perform model compression and acceleration in deep networks without significantly decreasing the model performance.  ...  After that, we survey the evaluation metrics, the main datasets used for evaluating the model performance, and recent benchmark efforts.  ...  in deploying deep learning systems to portable devices with limited resources (e.g. memory, CPU, energy, bandwidth).  ... 
arXiv:1710.09282v9 fatcat:frwedew2gfe3rjif5ds75jqay4

Pruning and Quantization for Deep Neural Network Acceleration: A Survey [article]

Tailin Liang, John Glossner, Lei Wang, Shaobo Shi, Xiaotong Zhang
2021 arXiv   pre-print
In some cases accuracy may even improve. This paper provides a survey on two types of network compression: pruning and quantization.  ...  However, complex network architectures challenge efficient real-time deployment and require significant computation resources and energy costs.  ...  This allowed neural networks to be deployed in constrained environments such as embedded systems.  ... 
arXiv:2101.09671v3 fatcat:a34q7ca24zbylmjrddlkt3ggai

A Survey on Efficient Convolutional Neural Networks and Hardware Acceleration

Deepak Ghimire, Dayoung Kil, Seong-heum Kim
2022 Electronics  
In this review, to improve the efficiency of deep learning research, we focus on three aspects: quantized/binarized models, optimized architectures, and resource-constrained systems.  ...  Over the past decade, deep-learning-based representations have demonstrated remarkable performance in academia and industry.  ...  In surveying efficient CNN architectures and hardware acceleration, we are deeply grateful again for all the researchers and their contributions to our science.  ... 
doi:10.3390/electronics11060945 fatcat:bxxgccwkujatzh4onkzh5lgspm

A Survey on Memory Subsystems for Deep Neural Network Accelerators

Arghavan Asad, Rupinder Kaur, Farah Mohammadi
2022 Future Internet  
While the existing surveys only address DNN accelerators in general, this paper investigates novel advancements in efficient memory organizations and design methodologies in the DNN accelerator.  ...  Thus, a review of the different memory architectures applied in DNN accelerators would prove beneficial.  ...  This study assessed that designing of energy-efficient memory architectures causes an improvement in power consumption and performance of DNN accelerators significantly.  ... 
doi:10.3390/fi14050146 fatcat:4mrod5zmibgxvp6ppevgnpwlqq

Hardware Acceleration of Sparse and Irregular Tensor Computations of ML Models: A Survey and Insights [article]

Shail Dave, Riyadh Baghdadi, Tony Nowatzki, Sasikanth Avancha, Aviral Shrivastava, Baoxin Li
2021 arXiv   pre-print
This paper provides a comprehensive survey on the efficient execution of sparse and irregular tensor computations of ML models on hardware accelerators.  ...  The takeaways from this paper include: understanding the key challenges in accelerating sparse, irregular-shaped, and quantized tensors; understanding enhancements in accelerator systems for supporting  ...  energy consumption.  ... 
arXiv:2007.00864v2 fatcat:k4o2xboh4vbudadfiriiwjp7uu

A Survey on the Optimization of Neural Network Accelerators for Micro-AI On-Device Inference

Arnab Neelim Mazumder, Jian Meng, Hasib-Al Rashid, Utteja Kallakuri, Xin Zhang, Jae-sun Seo, Tinoosh Mohsenin
2021 IEEE Journal on Emerging and Selected Topics in Circuits and Systems  
In this work, we aim to provide a comprehensive survey about the recent developments in the domain of energy-efficient deployment of DNNs on micro-AI platforms.  ...  To this extent, we look at different neural architecture search strategies as part of micro-AI model design, provide extensive details about model compression and quantization strategies in practice, and  ...  about the DNN accelerator designs for micro-AI, and the optimizations used to improve latency and reduce energy consumption with fast convolution, data sharing, zero skipping and low precision implementation  ... 
doi:10.1109/jetcas.2021.3129415 fatcat:nknpy4eernaeljz2hpqafe7sja

Hardware-Accelerated Platforms and Infrastructures for Network Functions: A Survey of Enabling Technologies and Research Studies

Prateek Shantharama, Akhilesh S. Thyagaturu, Martin Reisslein
2020 IEEE Access  
In-memory acceleration for bulk bitwise operations showed 32-fold performance improvements and 35-fold energy consumption savings.  ...  In comparison to Processing In-Memory (PIM) systems (see Sec. III-E4), the RRAM based accelerator shows 6.9-fold and 5.2-fold performance improvement and energy savings, respectively.  ... 
doi:10.1109/access.2020.3008250 fatcat:kv4znpypqbatfk2m3lpzvzb2nu

Fall Detection FPGA-Based Systems: A Survey

Abdelhedi Sahar
2016 International Journal of Automation and Smart Technology  
From 2013, she is a PhD student working on health monitoring thematic, embedded systems and FPGA technology in collaboration with TELNET innovation department.  ...  In this paper, we give a survey of the different fall detection systems based on FPGAs in the literature, definition of the main theoretical points of fall detection accelerometers-based systems, existing  ...  Acknowledgements This research and innovation work is carried out within a MOBIDOC scholarship funded by the EU under the PASRI project.  ... 
doi:10.5875/ausmt.v6i4.1105 fatcat:rykizqoyfjf3xni4qwuqhsz6qy

A Survey of FPGA-Based Robotic Computing [article]

Zishen Wan, Bo Yu, Thomas Yuang Li, Jie Tang, Yuhao Zhu, Yu Wang, Arijit Raychowdhury, Shaoshan Liu
2021 arXiv   pre-print
With specialized designed hardware logic and algorithm kernels, FPGA-based accelerators can surpass CPU and GPU in performance and energy efficiency.  ...  In this paper, we give an overview of previous work on FPGA-based robotic accelerators covering different stages of the robotic system pipeline.  ...  [120] design and implement a hardware ORB feature extractor and achieved a great balance between performance and energy consumption, which outperforms ARM Krait by 51% and Intel Core i5 by 41% in computation  ... 
arXiv:2009.06034v3 fatcat:fnp5q5wcyrd2hgpllso22mv2xm

Image Compression: A Survey

Mehwish Rehman, Muhammad Sharif, Mudassar Raza
2014 Research Journal of Applied Sciences Engineering and Technology  
Image Compression is a demanding field in this era of communication.  ...  There is a need to study and analyze the literature for image compression, as the demand for images, video sequences and computer animation has increased at very high rate so that the increment is drastically  ...  For consumption of energy and decorrelation first coding layer is on a DWT that is produced by wavelet filter kernel choice in correct way.  ... 
doi:10.19026/rjaset.7.303 fatcat:wjdqyahpija2npgnp4nbi6puvi

Sensor Systems Based on FPGAs and Their Applications: A Survey

Antonio de la Piedra, An Braeken, Abdellah Touhafi
2012 Sensors  
In this manuscript, we present a survey of designs and implementations of research sensor nodes that rely on FPGAs, either based upon standalone platforms or as a combination of microcontroller and FPGA  ...  As it turns out, low-power optimized FPGAs are able to enhance the computation of several types of algorithms in terms of speed and power consumption in comparison to microcontrollers of commercial sensor  ...  This survey explores the use and possibilities of FPGAs in sensor node architectures and their applications, focusing on the level of power consumption and the proper optimization of the current embedded  ... 
doi:10.3390/s120912235 fatcat:q7i7qblyffhvhkfpiedllhahs4

A Survey on FPGA Virtualization

Anuj Vaishnav, Khoa Dang Pham, Dirk Koch
2018 2018 28th International Conference on Field Programmable Logic and Applications (FPL)  
FPGA accelerators are being applied in various types of systems ranging from embedded systems to cloud computing for their high performance and energy efficiency.  ...  In this survey, we identify and classify the various techniques and approaches into three main categories: 1) Resource level, 2) Node level, and 3) Multi-node level.  ...  The trend to use a heterogeneous computing using substrates such as CPUs, GPGPUs and FPGAs will continue to grow in order to maximize performance and energy efficiency and FPGA virtualization will have  ... 
doi:10.1109/fpl.2018.00031 dblp:conf/fpl/VaishnavPK18 fatcat:6ydu2dvlsndwfp5xuq527jvb4y

Air-Ground Integrated Mobile Edge Networks: A Survey

Wen Zhang, Longzhuang Li, Ning Zhang, Tao Han, Shangguang Wang
2020 IEEE Access  
As a new platform, mobile edge computing (MEC) moves computation and storage resources to edge network in proximity to the data source.  ...  These massive data need to be stored, transmitted, and processed in time to exploit their value for decision making.  ...  In addition, it shows a better performance can be achieved if UAV has a longer duration time, suggesting energy limitation is always a key factor to improve UAV's performance.  ... 
doi:10.1109/access.2020.3008168 fatcat:544d6keravgozkdktd6tfhry7e

A Survey of Performance Optimization for Mobile Applications

Max Hort, Maria Kechagia, Federica Sarro, Mark Harman
2021 IEEE Transactions on Software Engineering  
We target our search at four performance characteristics: responsiveness, launch time, memory and energy consumption.  ...  This paper provides a comprehensive survey of non-functional performance optimization for Android applications.  ...  Summary Energy consumption is a crucial characteristic of embedded systems since these devices have a limited battery size.  ... 
doi:10.1109/tse.2021.3071193 fatcat:76q5in4vffa2vobanvf4hwer4q

Efficient Deep Learning in Network Compression and Acceleration [chapter]

Shiming Ge
2018 Digital Systems  
In this chapter, I will present a comprehensive survey of several advanced approaches for efficient deep learning in network compression and acceleration.  ...  It is important to design or develop efficient methods to support deep learning toward enabling its scalable deployment, particularly for embedded devices such as mobile, Internet of things (IOT), and  ...  Acknowledgements This work was partially supported by grants from National Key Research and Development Plan (2016YFC0801005), National Natural Science Foundation of China (61772513), and the International  ... 
doi:10.5772/intechopen.79562 fatcat:ya65wwhk5neppgxrut5phd42dy