69 results

NoC-based DNN accelerator

Kun-Chih (Jimmy) Chen, Masoumeh Ebrahimi, Ting-Yi Wang, Yuch-Chi Yang
2019 Proceedings of the 13th IEEE/ACM International Symposium on Networks-on-Chip - NOCS '19  
In this paper, we suggest the NoC-based DNN platform as a new accelerator design paradigm. ... We first comprehensively investigate conventional platforms and methodologies used in DNN computing. Then we study and analyze different design parameters to implement the NoC-based DNN accelerator. ... as a new design paradigm to implement future flexible DNN accelerators. • We analyze the number of memory accesses in the conventional and NoC-based designs under different DNN models. • We analyze the ...
doi:10.1145/3313231.3352376 dblp:conf/nocs/ChenEWY19 fatcat:fjfcybf7rbbixpqz77znkfewgm

An Overview of Efficient Interconnection Networks for Deep Neural Network Accelerators

Seyed Morteza Nabavinejad, Mohammad Baharloo, Kun-Chih Chen, Maurizio Palesi, Tim Kogel, Masoumeh Ebrahimi
2020 IEEE Journal on Emerging and Selected Topics in Circuits and Systems  
This paper provides a comprehensive investigation of the recent advances in efficient on-chip interconnection and design methodologies for DNN accelerators. ... As a result, efficient interconnection and data movement mechanisms for future on-chip artificial intelligence (AI) accelerators are worthy of study. ... Mesh-based interconnection can also help design area- and energy-optimized DNN accelerators using emerging computing paradigms such as in-memory processing [63], which we will discuss in future sections ...
doi:10.1109/jetcas.2020.3022920 fatcat:idqitgwnrnegbd4dhrly3xsxbi

Understanding the Impact of On-chip Communication on DNN Accelerator Performance [article]

Robert Guirado, Hyoukjun Kwon, Eduard Alarcón, Sergi Abadal and Tushar Krishna
2019 arXiv   pre-print
This paper studies the communication flows within CNN inference accelerators of edge devices, with the aim of justifying current and future decisions in the design of the on-chip networks that interconnect ... ASIC accelerators streamline the execution of certain dataflows amenable to CNN computation that imply the constant movement of large amounts of data, thereby turning on-chip communication into a critical ... Generally, a DNN accelerator is composed of a memory, an array of Processing Elements (PEs), and a NoC to interconnect the PEs and memory. ...
arXiv:1912.01664v1 fatcat:y454dtdmgre57lo7u3eu4333wq
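
The Guirado et al. snippet above describes the canonical decomposition of a DNN accelerator into a global memory, an array of Processing Elements (PEs), and a NoC that interconnects them. The following is a minimal, purely illustrative Python sketch of that decomposition; it is not the authors' model, and the class and method names, the 4x4 mesh size, and the hop-count latency proxy are assumptions made here for clarity.

from dataclasses import dataclass, field


@dataclass
class GlobalBuffer:
    """On-chip memory holding weights and activations."""
    contents: dict = field(default_factory=dict)

    def read(self, key):
        return self.contents[key]


@dataclass
class ProcessingElement:
    """A single multiply-accumulate (MAC) unit with a local accumulator."""
    pe_id: int
    acc: float = 0.0

    def mac(self, weight: float, activation: float) -> None:
        self.acc += weight * activation


class MeshNoC:
    """Toy NoC: delivers a value to a PE and tallies hop distance on a 2D mesh."""

    def __init__(self, width: int):
        self.width = width
        self.hops = 0

    def send(self, value, dest: ProcessingElement):
        # Assume the memory controller sits at mesh coordinate (0, 0).
        x, y = dest.pe_id % self.width, dest.pe_id // self.width
        self.hops += x + y  # Manhattan distance as a crude latency proxy
        return value


class Accelerator:
    """Memory + PE array + NoC, wired together as in the description above."""

    def __init__(self, num_pes: int = 16, mesh_width: int = 4):
        self.memory = GlobalBuffer()
        self.pes = [ProcessingElement(i) for i in range(num_pes)]
        self.noc = MeshNoC(mesh_width)

    def broadcast_mac(self, weight_key: str, activation: float) -> None:
        # Read a weight once from the global buffer and ship it to every PE
        # over the NoC; each PE performs one MAC with the same activation.
        weight = self.memory.read(weight_key)
        for pe in self.pes:
            pe.mac(self.noc.send(weight, pe), activation)


if __name__ == "__main__":
    acc = Accelerator()
    acc.memory.contents["w0"] = 0.5
    acc.broadcast_mac("w0", 2.0)
    print(f"PE0 accumulator: {acc.pes[0].acc}, total NoC hops: {acc.noc.hops}")

Swapping the toy MeshNoC for a bus or tree model is exactly the kind of interconnect design choice the papers listed here compare.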

Multi-DNN Accelerators for Next-Generation AI Systems [article]

Stylianos I. Venieris and Christos-Savvas Bouganis and Nicholas D. Lane
2022 arXiv   pre-print
increases while meeting the quality-of-service requirements, giving rise to the topic of multi-DNN accelerator design.  ...  When focusing either on cloud-based systems that serve multiple AI queries from different users each with their own DNN model, or on mobile robots and smartphones employing pipelines of various models  ...  As such, there is an emerging need for a paradigm shift towards multi-DNN accelerator design.  ... 
arXiv:2205.09376v1 fatcat:cvdhvxmmoza57eualmj27dh5uy

A Survey on Memory Subsystems for Deep Neural Network Accelerators

Arghavan Asad, Rupinder Kaur, Farah Mohammadi
2022 Future Internet  
Thus, a review of the different memory architectures applied in DNN accelerators would prove beneficial.  ...  First, an overview of the various memory architectures used in DNN accelerators will be provided, followed by a discussion of memory organizations on non-ASIC DNN accelerators.  ...  Atomic-layer architecture for ReRAM-based DNN accelerator design.  ... 
doi:10.3390/fi14050146 fatcat:4mrod5zmibgxvp6ppevgnpwlqq

An Updated Survey of Efficient Hardware Architectures for Accelerating Deep Convolutional Neural Networks

Maurizio Capra, Beatrice Bussolino, Alberto Marchisio, Muhammad Shafique, Guido Masera, Maurizio Martina
2020 Future Internet  
Deep Neural Networks (DNNs) are nowadays common practice in most Artificial Intelligence (AI) applications. ... In this paper, the reader will first understand what a hardware accelerator is and what its main components are, followed by the latest techniques in the field of dataflow, reconfigurability, variable ... The PEs are interconnected by a Network-on-Chip (NoC) designed to achieve the desired data movement scheme. ...
doi:10.3390/fi12070113 fatcat:heyq4l3rkrdc5p55xdbhsh4jxu

Hardware and Software Optimizations for Accelerating Deep Neural Networks: Survey of Current Trends, Challenges, and the Road Ahead

Maurizio Capra, Beatrice Bussolino, Alberto Marchisio, Guido Masera, Maurizio Martina, Muhammad Shafique
2020 IEEE Access  
designs. ... In addition to hardware solutions, this paper discusses some of the important security issues that these DNN and SNN models may have during their execution, and offers a comprehensive section on benchmarking ... In [153], the authors proposed a black-box profiling-based search in the first stage of the accelerator-aware NAS pipeline using an ISA-based DNN accelerator on FPGA, with a particular focus on the accurate ...
doi:10.1109/access.2020.3039858 fatcat:nticzqgrznftrcji4krhyjxudu

A Survey on Efficient Convolutional Neural Networks and Hardware Acceleration

Deepak Ghimire, Dayoung Kil, Seong-heum Kim
2022 Electronics  
Over the past decade, deep-learning-based representations have demonstrated remarkable performance in academia and industry.  ...  The learning capability of convolutional neural networks (CNNs) originates from a combination of various feature extraction layers that fully utilize a large amount of data.  ...  In surveying efficient CNN architectures and hardware acceleration, we are deeply grateful again for all the researchers and their contributions to our science.  ... 
doi:10.3390/electronics11060945 fatcat:bxxgccwkujatzh4onkzh5lgspm

XB-SIM∗: A simulation framework for modeling and exploration of ReRAM-based CNN acceleration design

Xiang Fei, Youhui Zhang, Weimin Zheng
2021 Tsinghua Science and Technology  
We present XB-SIM∗, a simulation framework for ReRAM-crossbar-based Convolutional Neural Network (CNN) accelerators. ... However, the design of these accelerators faces a number of challenges, including imperfections of the ReRAM device and the large amount of calculation required to accurately simulate those imperfections. ... BAAI2019ZD0403), Beijing Innovation Center for Future Chip, Tsinghua University, and the Science and Technology Innovation Special Zone Project, China. ...
doi:10.26599/tst.2019.9010070 fatcat:znzjmsbe5jgwjbywvwyq6q7dum

Computing Graph Neural Networks: A Survey from Algorithms to Accelerators [article]

Sergi Abadal, Akshay Jain, Robert Guirado, Jorge López-Alonso, Eduard Alarcón
2021 arXiv   pre-print
On the other hand, an in-depth analysis of current software and hardware acceleration schemes is provided, from which a hardware-software, graph-aware, and communication-centric vision for GNN accelerators ... This includes a brief tutorial on the GNN fundamentals, an overview of the evolution of the field in the last decade, and a summary of operations carried out in the multiple phases of different GNN algorithm ... The basic unit of the accelerator is a tile composed of an aggregator module (AGG), a DNN accelerator module (DNA), a DNN queue (DNQ), and a graph PE (GPE), all of them connected to an on-chip router. ...
arXiv:2010.00130v3 fatcat:u5bcmjodcfdh7pew4nssjemdba
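
The last snippet above names the building blocks of one surveyed GNN accelerator tile: AGG, DNA, DNQ, and GPE around an on-chip router. As an illustration only, a toy sketch of how those four modules might hand data to each other is given below; the class names, method bodies, and the dictionary graph format are hypothetical and are not taken from the surveyed design.

from collections import deque


class Router:
    """On-chip router shared by all modules of the tile; logs each transfer."""

    def __init__(self):
        self.log = []

    def route(self, src: str, dst: str, payload):
        self.log.append((src, dst))
        return payload


class Tile:
    """Toy tile: GPE gathers neighbours, AGG reduces them, DNQ buffers, DNA transforms."""

    def __init__(self):
        self.router = Router()
        self.dnq = deque()  # DNN queue buffering aggregated feature vectors

    def gather(self, graph: dict, node: int):
        # GPE: walk the graph structure and fetch neighbour feature vectors.
        return [self.router.route("GPE", "AGG", graph[n]["feat"])
                for n in graph[node]["nbrs"]]

    def aggregate(self, neighbor_feats):
        # AGG: element-wise sum of neighbour features, queued for the DNA.
        agg = [sum(col) for col in zip(*neighbor_feats)]
        self.dnq.append(self.router.route("AGG", "DNQ", agg))
        return agg

    def transform(self, weights):
        # DNA: apply a (toy) dense layer to the vector at the head of the DNQ.
        vec = self.router.route("DNQ", "DNA", self.dnq.popleft())
        return [sum(w * x for w, x in zip(row, vec)) for row in weights]


if __name__ == "__main__":
    g = {0: {"feat": [1.0, 2.0], "nbrs": [1]},
         1: {"feat": [3.0, 4.0], "nbrs": [0]}}
    tile = Tile()
    tile.aggregate(tile.gather(g, 0))
    print(tile.transform([[1.0, 0.0], [0.0, 1.0]]), tile.router.log)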

A Construction Kit for Efficient Low Power Neural Network Accelerator Designs [article]

Petar Jokic, Erfan Azarkhish, Andrea Bonetti, Marc Pons, Stephane Emery, Luca Benini
2021 arXiv   pre-print
To evaluate and compare hardware design choices, designers can refer to a myriad of accelerator implementations in the literature. ... This complicates the evaluation of optimizations for new accelerator designs, slowing down research progress. ... The compiled list of optimization approaches provides quantitative performance impact measures for each approach, allowing accelerator designers to estimate their impact in future designs. ...
arXiv:2106.12810v1 fatcat:gx7cspazc5fdfoi64t2zjth7am

Neural Cache: Bit-Serial In-Cache Acceleration of Deep Neural Networks [article]

Charles Eckert, Xiaowei Wang, Jingcheng Wang, Arun Subramaniyan, Ravi Iyer, Dennis Sylvester, David Blaauw, Reetuparna Das
2018 arXiv   pre-print
Sparsity in DNN models can be exploited by accelerators [41], [42]. Utilizing sparsity in DNN models for Neural Cache is a promising direction for future work. ... In contrast, our work is based on the cache, which improves performance of many other workloads when not functioning as a DNN accelerator. ...
arXiv:1805.03718v1 fatcat:d72fse5przg43h5ojhqydsl64i

Computing Graph Neural Networks: A Survey from Algorithms to Accelerators

Sergi Abadal, Akshay Jain, Robert Guirado, Jorge López-Alonso, Eduard Alarcón
2022 ACM Computing Surveys  
On the other hand, an in-depth analysis of current software and hardware acceleration schemes is provided, from which a hardware-software, graph-aware, and communication-centric vision for GNN accelerators  ...  This includes a brief tutorial on the GNN fundamentals, an overview of the evolution of the field in the last decade, and a summary of operations carried out in the multiple phases of different GNN algorithm  ...  Building on this observation, we envision that future accelerators shall adopt a hardware-software co-design approach to maximize performance, keep graph awareness as a profitable optimization opportunity  ... 
doi:10.1145/3477141 fatcat:6ef4jh3hrvefnoytckqyyous3m

GANAX: A Unified MIMD-SIMD Acceleration for Generative Adversarial Networks [article]

Amir Yazdanbakhsh, Hajar Falahati, Philip J. Wolfe, Kambiz Samadi, Nam Sung Kim, Hadi Esmaeilzadeh
2018 arXiv   pre-print
We propose the GANAX architecture to alleviate the sources of inefficiency associated with the acceleration of GANs using conventional convolution accelerators, making the first GAN accelerator design  ...  Therefore, we propose a unified MIMD-SIMD design for GANAX that leverages repeated patterns in the computation to create distinct microprograms that execute concurrently in SIMD mode.  ...  Amir Yazdanbakhsh is partly supported by a Microsoft Research PhD Fellowship.  ... 
arXiv:1806.01107v1 fatcat:6q743mpn65c63i3y7wkdqelzfq

Domino: A Tailored Network-on-Chip Architecture to Enable Highly Localized Inter- and Intra-Memory DNN Computing [article]

Kaining Zhou, Yangshuo He, Rui Xiao, Kejie Huang
2021 arXiv   pre-print
The emerging Computing-In-Memory (CIM) architecture has been a promising candidate to accelerate neural network computing. ... The ever-increasing computation complexity of fast-growing Deep Neural Networks (DNNs) has called for new computing paradigms to overcome the memory wall in conventional von Neumann computing architectures ... Current CIM-based DNN accelerators use a bus-based H-tree interconnect [33, 40], where most of the latency for each type of CNN is spent on communication [32]. ...
arXiv:2107.09500v1 fatcat:b6uicbi3ifhytn5mlukw7z2kjq
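
The last snippet above points out that the tree-style (H-tree) interconnects used in current CIM accelerators make communication the dominant cost. As a purely illustrative back-of-the-envelope model (not the paper's analysis), the sketch below counts worst-case hops between leaves of a complete binary tree, which grow with log2 of the number of leaf macros; the function name and hop-count convention are assumptions made here.

import math


def tree_hops(src_leaf: int, dst_leaf: int) -> int:
    """Hops between two leaves of a complete binary tree, climbing to their LCA."""
    src, dst, hops = src_leaf, dst_leaf, 0
    while src != dst:
        src //= 2   # one hop up on the source side
        dst //= 2   # one hop up on the destination side
        hops += 2
    return hops


if __name__ == "__main__":
    for leaves in (16, 64, 256):
        worst = tree_hops(0, leaves - 1)
        print(f"{leaves:4d} leaf macros -> worst-case {worst} hops "
              f"(= 2*log2(N) = {2 * int(math.log2(leaves))})")

The point is only that leaf-to-leaf traffic must climb to a common ancestor before descending, which is consistent with the paper's emphasis on keeping inter- and intra-memory computation highly localized.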
Showing results 1–15 of 69.