
An Overview of Efficient Interconnection Networks for Deep Neural Network Accelerators

Seyed Morteza Nabavinejad, Mohammad Baharloo, Kun-Chih Chen, Maurizio Palesi, Tim Kogel, Masoumeh Ebrahimi
2020 IEEE Journal on Emerging and Selected Topics in Circuits and Systems  
Currently, a large body of research aims to find an efficient on-chip interconnection to achieve low-power and high-bandwidth DNN computing.  ...  The edge computing demand in the Internet-of-Things (IoT) era has motivated many kinds of computing platforms to accelerate DNN operations.  ...  This has led to growing interest in developing domain-specific, resource-constrained platforms with dedicated processing, memory, and communication resources for DNN computation [6].  ...
doi:10.1109/jetcas.2020.3022920 fatcat:idqitgwnrnegbd4dhrly3xsxbi

A Survey of Machine Learning for Computer Architecture and Systems [article]

Nan Wu, Yuan Xie
2021 arXiv   pre-print
Computer architecture and systems have long been optimized to enable efficient execution of machine learning (ML) algorithms and models.  ...  For ML-based modelling, we discuss existing studies based on their target level of the system, ranging from the circuit level to the architecture/system level.  ...  Each ANN is responsible for one region of the NoC and dynamically computes a threshold for every time interval to turn links on or off given the link utilization of its region (a minimal sketch of this idea follows this entry).  ...
arXiv:2102.07952v1 fatcat:vzj776a6abesljetqobakoc3dq
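
The last excerpt of this entry describes a mechanism in which a small ANN per NoC region maps observed link utilization to a threshold used to switch links on or off each control interval. The survey snippet gives no implementation details, so the following is only a minimal sketch under assumed names (RegionLinkController, predict_threshold) with a toy, randomly initialized two-layer network; it is not the cited authors' design.

```python
import numpy as np

class RegionLinkController:
    """Toy per-region controller: a tiny two-layer ANN maps the region's
    link-utilization vector to a single on/off threshold. Shapes, weights,
    and names here are hypothetical placeholders, not the paper's design."""

    def __init__(self, num_links, hidden=8, rng=None):
        if rng is None:
            rng = np.random.default_rng(0)
        # Randomly initialized weights stand in for a trained model.
        self.w1 = rng.normal(scale=0.1, size=(num_links, hidden))
        self.b1 = np.zeros(hidden)
        self.w2 = rng.normal(scale=0.1, size=(hidden, 1))
        self.b2 = np.zeros(1)

    def predict_threshold(self, utilization):
        """utilization: per-link utilization in [0, 1] for the last interval."""
        h = np.tanh(utilization @ self.w1 + self.b1)
        z = float((h @ self.w2 + self.b2)[0])
        return 1.0 / (1.0 + np.exp(-z))   # sigmoid keeps the threshold in (0, 1)

    def link_enable_mask(self, utilization):
        """Keep links whose utilization met the predicted threshold; the rest
        would be power-gated for the next interval."""
        u = np.asarray(utilization, dtype=float)
        return u >= self.predict_threshold(u)

# Example: one 16-link NoC region with utilization observed last interval.
ctrl = RegionLinkController(num_links=16)
util = np.random.default_rng(1).uniform(0.0, 1.0, size=16)
print(ctrl.link_enable_mask(util))
```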

Designing Efficient NoC-Based Neural Network Architectures for Identification of Epileptic Seizure

Ayut Ghosh, Arka Prava Roy, Ramapati Patra, Hemanta Kumar Mondal
2021 SN Computer Science  
The trained neural network models are mapped onto the Network-on-Chip to increase the throughput, power efficiency, parallelism, and scalability of the architecture.  ...  To counter the bottlenecks of bus-based architectures, the Network-on-Chip has proven efficient for complex computations.  ...  A versatile and efficient computing framework is required to process such 'big data' on a real-time platform.  ...
doi:10.1007/s42979-021-00756-9 fatcat:w355wtatizfj5pyp7odxmsqbhe

TRIM: A Design Space Exploration Model for Deep Neural Networks Inference and Training Accelerators [article]

Yangjie Qi, Shuo Zhang, Tarek M. Taha
2022 arXiv   pre-print
TRIM is a powerful tool that helps architects evaluate different hardware choices to develop efficient inference and training architectures.  ...  The model evaluates designs at the whole-network level, considering both inter-layer and intra-layer activities.  ...  These two models give time and energy estimates based on the individual layers of a DNN (a simplified per-layer accounting sketch follows this entry).  ...
arXiv:2105.08239v3 fatcat:ullymz35ibgl5df6mjteudyhoy
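
The excerpt notes that TRIM builds whole-network estimates out of per-layer time and energy models. The sketch below is not TRIM itself; it only illustrates the general per-layer accounting idea with assumed, simplified cost parameters (peak MAC throughput, DRAM bandwidth, energy per MAC and per byte) and a roofline-style bound per layer.

```python
from dataclasses import dataclass

@dataclass
class Layer:
    name: str
    macs: int          # multiply-accumulate operations in the layer
    bytes_moved: int   # weight + activation traffic to/from off-chip memory

# Hypothetical hardware parameters (not taken from the paper).
PEAK_MACS_PER_S = 4e12      # 4 TMAC/s
DRAM_BYTES_PER_S = 25e9     # 25 GB/s
ENERGY_PER_MAC_J = 1e-12    # 1 pJ per MAC
ENERGY_PER_BYTE_J = 20e-12  # 20 pJ per DRAM byte

def layer_time_s(layer: Layer) -> float:
    # A layer is limited by either compute or memory traffic (roofline-style).
    return max(layer.macs / PEAK_MACS_PER_S, layer.bytes_moved / DRAM_BYTES_PER_S)

def layer_energy_j(layer: Layer) -> float:
    return layer.macs * ENERGY_PER_MAC_J + layer.bytes_moved * ENERGY_PER_BYTE_J

def network_estimate(layers):
    # Whole-network estimate as the sum of per-layer estimates; a real model
    # such as TRIM also accounts for inter-layer overlap and on-chip reuse.
    total_t = sum(layer_time_s(l) for l in layers)
    total_e = sum(layer_energy_j(l) for l in layers)
    return total_t, total_e

layers = [Layer("conv1", macs=118_013_952, bytes_moved=3_500_000),
          Layer("fc8", macs=4_096_000, bytes_moved=4_100_000)]
print(network_estimate(layers))
```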

Data Streaming and Traffic Gathering in Mesh-based NoC for Deep Neural Network Acceleration [article]

Binayak Tiwari, Mei Yang, Xiaohang Wang, Yingtao Jiang
2021 arXiv   pre-print
However, the widely used mesh-based NoC architectures inherently cannot efficiently support the one-to-many and many-to-one traffic that is prevalent in DNN workloads.  ...  As the communication backbone of a DNN accelerator, the network-on-chip (NoC) plays an important role in supporting various dataflow patterns and enabling processing with communication parallelism in a DNN  ...  Due to limitations of computing resources, the inference operation of a DNN workload is performed in multiple rounds (see the round-counting sketch after this entry).  ...
arXiv:2108.02569v1 fatcat:32hhv2aeqrgh3amn4j5fziwldi
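
The last excerpt points out that, with limited compute resources, inference of one DNN workload has to be split into multiple rounds. The paper's actual scheduling is more involved; as a rough illustration only, the sketch below tiles each layer's output neurons across a fixed number of processing elements (one output neuron per PE per round, an assumption made here for simplicity) and counts the rounds needed.

```python
import math

def rounds_for_layer(num_output_neurons: int, num_pes: int) -> int:
    """A layer larger than the PE array needs several rounds, assuming each
    PE computes one output neuron per round (a simplifying assumption)."""
    return math.ceil(num_output_neurons / num_pes)

def schedule(layers, num_pes):
    """Return (layer_name, rounds) pairs for a whole network."""
    return [(name, rounds_for_layer(n, num_pes)) for name, n in layers]

# Example: a 16x16 PE array (256 PEs) running three fully connected layers.
layers = [("fc1", 4096), ("fc2", 4096), ("fc3", 1000)]
print(schedule(layers, num_pes=256))
# -> [('fc1', 16), ('fc2', 16), ('fc3', 4)]
```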

Deep Learning for Mobile Multimedia

Kaoru Ota, Minh Son Dao, Vasileios Mezaris, Francesco G. B. De Natale
2017 ACM Transactions on Multimedia Computing, Communications, and Applications (TOMCCAP)  
As a consequence, there is increasing interest in the possibility of applying DNNs to mobile environments [61].  ...  Meanwhile, unlike traditional high-performance servers, mobile multimedia devices such as wireless sensors and smartphones usually have limited resources in terms of energy, computing power, memory, network  ...  propose computing platforms on their GPUs.  ...
doi:10.1145/3092831 fatcat:ez2fcgckhjawlfywyecest4jqy

Table of Content

2021 2021 International Symposium on VLSI Design, Automation and Test (VLSI-DAT)  
System and Architecture Design for Smart Data Analytics: D6-1 Reconfigurable Database Processor for Query Acceleration on FPGA; D6-2 Dynamic Mapping Mechanism to Compute DNN Models on a Resource-limited NoC Platform; D6-3 Embedded Bearing Fault Detection Platform Design for the Drivetrain System in the Future Industry 4.0 Era. Session D7: DAT Special Session - Machine Learning Applications on EDA  ...
doi:10.1109/vlsi-dat52063.2021.9427313 fatcat:dahcwqnflndbdb4o3hc2g6b7gu

A Survey on Memory Subsystems for Deep Neural Network Accelerators

Arghavan Asad, Rupinder Kaur, Farah Mohammadi
2022 Future Internet  
First, an overview of the various memory architectures used in DNN accelerators will be provided, followed by a discussion of memory organizations on non-ASIC DNN accelerators.  ...  From self-driving cars to detecting cancer, the applications of modern artificial intelligence (AI) rely primarily on deep neural networks (DNNs).  ...  In order to use a DNN to make accurate predictions, the DNN model must first be created and tuned using a set of training data. This is the first step in using a DNN and is called the training phase.  ... 
doi:10.3390/fi14050146 fatcat:4mrod5zmibgxvp6ppevgnpwlqq

Hardware Acceleration of Sparse and Irregular Tensor Computations of ML Models: A Survey and Insights [article]

Shail Dave, Riyadh Baghdadi, Tony Nowatzki, Sasikanth Avancha, Aviral Shrivastava, Baoxin Li
2021 arXiv   pre-print
...  storage efficiency and balance computations; understanding how to compile and map models with sparse tensors on the accelerators; understanding recent design trends for efficient accelerations and further  ...  This paper provides a comprehensive survey on the efficient execution of sparse and irregular tensor computations of ML models on hardware accelerators.  ...  A dataflow refers to the spatiotemporal execution of a model layer (a nested loop) on architectural resources [23], [56] (a generic nested-loop illustration follows this entry).  ...
arXiv:2007.00864v2 fatcat:k4o2xboh4vbudadfiriiwjp7uu
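
The last excerpt defines a dataflow as the spatiotemporal execution of a layer's nested loop on architectural resources. To make that concrete, here is a plain, unoptimized nested-loop view of a convolution layer with output-stationary-style accumulation; which loops are unrolled spatially onto PEs and which run in time is exactly what a dataflow choice fixes. This is a generic illustration, not code from the survey.

```python
import numpy as np

def conv2d_output_stationary(inp, weights):
    """inp: (C, H, W), weights: (K, C, R, S) -> out: (K, H-R+1, W-S+1).
    Each output element is held ("stationary") in the accumulator while
    the reduction loops over C, R, S run; a dataflow decides which of
    these loops are mapped to parallel PEs and which are run sequentially."""
    C, H, W = inp.shape
    K, _, R, S = weights.shape
    out = np.zeros((K, H - R + 1, W - S + 1))
    for k in range(K):                  # output channels
        for y in range(H - R + 1):      # output rows
            for x in range(W - S + 1):  # output columns
                acc = 0.0               # output-stationary accumulator
                for c in range(C):      # reduction over input channels
                    for r in range(R):
                        for s in range(S):
                            acc += inp[c, y + r, x + s] * weights[k, c, r, s]
                out[k, y, x] = acc
    return out

# Tiny smoke test: check the output shape.
x = np.random.default_rng(0).normal(size=(2, 6, 6))
w = np.random.default_rng(1).normal(size=(3, 2, 3, 3))
print(conv2d_output_stationary(x, w).shape)  # (3, 4, 4)
```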

GPU-Based Embedded Intelligence Architectures and Applications

Li Minn Ang, Kah Phooi Seng
2021 Electronics  
This paper gives a comprehensive review and representative studies of emerging and current paradigms for GPU-based EI, with a focus on architectures, technologies, and applications: (1) First, the  ...  overview and classifications of GPU-based EI research are presented to give the full spectrum of this area, which also serves as a concise summary of the scope of the paper; (2) Second, various architecture  ...  The main challenge of the design is the computational requirements of the CNN when implemented on a hardware platform with limited computational resources.  ...
doi:10.3390/electronics10080952 fatcat:paubm2sevbhixi2in63ayflmti

Moving Deep Learning to the Edge

Mário P. Véstias, Rui Policarpo Duarte, José T. de Sousa, Horácio C. Neto
2020 Algorithms  
Hence, new resource- and energy-oriented deep learning models are required, as well as new computing platforms.  ...  One solution is to process the data at the edge devices themselves, in order to alleviate cloud server workloads and improve latency.  ...  SENet includes a content-aware mechanism to remove or emphasize input channels dynamically (a minimal squeeze-and-excitation sketch follows this entry).  ...
doi:10.3390/a13050125 fatcat:xhaozkwjhzbgznbw3k6p5jr2eq
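
The SENet excerpt refers to a content-aware mechanism that re-weights channels dynamically, i.e. the squeeze-and-excitation block. The sketch below is a minimal NumPy version with randomly initialized weights standing in for trained parameters; shapes and the reduction ratio are illustrative assumptions.

```python
import numpy as np

def se_block(x, w1, b1, w2, b2):
    """Squeeze-and-excitation over a feature map x of shape (C, H, W).
    1) squeeze: global average pooling -> per-channel descriptor (C,)
    2) excitation: bottleneck MLP + sigmoid -> per-channel weights in (0, 1)
    3) scale: multiply each channel of x by its weight."""
    z = x.mean(axis=(1, 2))                       # squeeze: (C,)
    s = np.maximum(z @ w1 + b1, 0.0)              # ReLU bottleneck
    s = 1.0 / (1.0 + np.exp(-(s @ w2 + b2)))      # sigmoid gate: (C,)
    return x * s[:, None, None]                   # emphasize/suppress channels

# Example with C=8 channels and a reduction ratio of 4 (weights are random
# placeholders, not trained parameters).
rng = np.random.default_rng(0)
C, r = 8, 4
x = rng.normal(size=(C, 16, 16))
w1, b1 = rng.normal(scale=0.1, size=(C, C // r)), np.zeros(C // r)
w2, b2 = rng.normal(scale=0.1, size=(C // r, C)), np.zeros(C)
print(se_block(x, w1, b1, w2, b2).shape)  # (8, 16, 16)
```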

Hardware and Software Optimizations for Accelerating Deep Neural Networks: Survey of Current Trends, Challenges, and the Road Ahead

Maurizio Capra, Beatrice Bussolino, Alberto Marchisio, Guido Masera, Maurizio Martina, Muhammad Shafique
2020 IEEE Access  
In addition to hardware solutions, this paper discusses some of the important security issues that these DNN and SNN models may face during their execution, and offers a comprehensive section on benchmarking  ...  In a scenario where several sophisticated algorithms need to be executed with limited energy and low latency, the need for cost-effective hardware platforms capable of implementing energy-efficient DL  ...  SNNs, in contrast to traditional DNNs, base their computational model much more closely on that of biological neurons, with a spike-based communication mechanism [59] (a toy spiking-neuron sketch follows this entry).  ...
doi:10.1109/access.2020.3039858 fatcat:nticzqgrznftrcji4krhyjxudu
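
The last excerpt contrasts SNNs with DNNs by their spike-based communication. As a generic illustration (not taken from the survey), here is a minimal leaky integrate-and-fire neuron: it integrates weighted input spikes into a membrane potential, leaks over time, and emits a binary spike when the potential crosses a threshold.

```python
import numpy as np

def lif_neuron(input_spikes, weights, leak=0.9, threshold=1.0):
    """Simulate one leaky integrate-and-fire neuron over T time steps.
    input_spikes: (T, N) binary array; weights: (N,) synaptic weights.
    Returns the (T,) binary output spike train."""
    v = 0.0
    out = np.zeros(input_spikes.shape[0], dtype=int)
    for t, spikes in enumerate(input_spikes):
        v = leak * v + spikes @ weights   # leak, then integrate weighted spikes
        if v >= threshold:                # fire and reset
            out[t] = 1
            v = 0.0
    return out

rng = np.random.default_rng(0)
spikes_in = (rng.uniform(size=(20, 5)) < 0.3).astype(int)  # sparse input spikes
w = rng.uniform(0.1, 0.5, size=5)
print(lif_neuron(spikes_in, w))
```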

A Survey of System Architectures and Techniques for FPGA Virtualization [article]

Masudul Hassan Quraishi, Erfan Bank Tavakoli, Fengbo Ren
2021 arXiv   pre-print
Therefore, the virtualization of FPGAs becomes extremely important to create a useful abstraction of the hardware suitable for application developers.  ...  This survey helps researchers efficiently learn about FPGA virtualization research by providing a comprehensive review of the existing literature.  ...  For example, the solutions in [58], [83], [84] focus on enabling the acceleration of a large variety of DNN models using Instruction-Set-Architecture-based methods to avoid the overhead of traditional  ...
arXiv:2011.09073v3 fatcat:iretcbvxf5hxherin2hskndvpy

FangTianSim: High-Level Cycle-Accurate Resistive Random-Access Memory-Based Multi-Core Spiking Neural Network Processor Simulator

Jinsong Wei, Zhibin Wang, Ye Li, Jikai Lu, Hao Jiang, Junjie An, Yiqi Li, Lili Gao, Xumeng Zhang, Tuo Shi, Qi Liu
2022 Frontiers in Neuroscience  
In order to map different network topologies onto the chip, an SNN representation format, an interpreter, and an instruction generator are designed.  ...  In order to effectively bridge the gap between device, circuit, algorithm, and architecture, this paper proposes a simulation model, FangTianSim, which covers an analog neuron circuit, an RRAM model, and a multi-core  ...  Although this chip runs SNNs with low power consumption, its functionality is very limited, and it is mostly used to study small-scale brain models.  ...
doi:10.3389/fnins.2021.806325 pmid:35126046 pmcid:PMC8811373 fatcat:4aiwyx2chrg63oonh3fxilpib4

Dataflow-Architecture Co-Design for 2.5D DNN Accelerators using Wireless Network-on-Package

Robert Guirado, Hyoukjun Kwon, Sergi Abadal, Eduard Alarcón, Tushar Krishna
2021 Proceedings of the 26th Asia and South Pacific Design Automation Conference  
Deep neural network (DNN) models continue to grow in size and complexity, demanding higher computational power to enable real-time inference.  ...  To cope with this challenge, we propose WIENNA, a wireless NoP-based 2.5D DNN accelerator.  ...  However, as DNN models continue to scale, the compute capabilities of DNN accelerators need to scale as well.  ...
doi:10.1145/3394885.3431537 fatcat:j7xmzxurkzgh7nzityaqwk2aa4
Showing results 1 — 15 out of 71 results