23 Hits

An Overview of Efficient Interconnection Networks for Deep Neural Network Accelerators

Seyed Morteza Nabavinejad, Mohammad Baharloo, Kun-Chih Chen, Maurizio Palesi, Tim Kogel, Masoumeh Ebrahimi
2020 IEEE Journal on Emerging and Selected Topics in Circuits and Systems  
(e.g., in/near-memory processing) for the DNN accelerator design. This paper systematically investigates the interconnection networks in modern DNN accelerator designs.  ...  With this motivation, reconfigurable DNN computing with flexible on-chip interconnection will be investigated in this paper.  ...  DNNs usually have various convolutional layers with different input/output/kernel size features.  ... 
doi:10.1109/jetcas.2020.3022920 fatcat:idqitgwnrnegbd4dhrly3xsxbi

A Survey of Machine Learning for Computer Architecture and Systems [article]

Nan Wu, Yuan Xie
2021 arXiv   pre-print
For ML-based design methodology, we follow a bottom-up path to review current work, with a scope of (micro-)architecture design (memory, branch prediction, NoC), coordination between architecture/system  ...  and workload (resource allocation and management, data center management, and security), compiler, and design automation.  ...  compilation for approximate computing or DNN applications.  ... 
arXiv:2102.07952v1 fatcat:vzj776a6abesljetqobakoc3dq

Deep Learning for Mobile Multimedia

Kaoru Ota, Minh Son Dao, Vasileios Mezaris, Francesco G. B. De Natale
2017 ACM Transactions on Multimedia Computing, Communications, and Applications (TOMCCAP)  
As a consequence, there is increasing interest in the possibility of applying DNNs to mobile environments [61].  ...  All those data may be used to monitor the patient's condition, but require efficient on-the-fly processing to produce a compact stream of significant information.  ...  A host processor is used to manage the interaction with general data and applications.  ... 
doi:10.1145/3092831 fatcat:ez2fcgckhjawlfywyecest4jqy

A Survey of Near-Data Processing Architectures for Neural Networks

Mehdi Hassanpour, Marc Riera, Antonio González
2022 Machine Learning and Knowledge Extraction  
Data-intensive workloads and applications, such as machine learning (ML), are fundamentally limited by traditional computing systems based on the von-Neumann architecture.  ...  As data movement operations and energy consumption become key bottlenecks in the design of computing systems, the interest in unconventional approaches such as Near-Data Processing (NDP), machine learning  ...  Section 3.2 describes state-of-the-art NDP accelerators for DNNs based on DRAM HMC architectures. The Hybrid Memory Cube (HMC) is designed for high performance data-centric applications.  ... 
doi:10.3390/make4010004 fatcat:5frcwe57drgihbgygiecoqqnvy

From DNNs to GANs: Review of efficient hardware architectures for deep learning [article]

Gaurab Bhattacharya
2021 arXiv   pre-print
Similarly, different algorithms have been adapted to design a DSP processor compatible with fast performance in neural networks, activation functions, convolutional neural networks, and generative adversarial  ...  In this review, we illustrate the recent developments in hardware for accelerating the efficient implementation of deep learning networks with enhanced performance.  ...  This is useful for implementing an arbitrarily sized five-stage DNN with a resource allocation scheme, as in Fig. 4.  ... 
arXiv:2107.00092v1 fatcat:i6kijx7pavdajeskn4lip7gnhe

A Survey of Near-Data Processing Architectures for Neural Networks [article]

Mehdi Hassanpour, Marc Riera, Antonio González
2021 arXiv   pre-print
Data-intensive workloads and applications, such as machine learning (ML), are fundamentally limited by traditional computing systems based on the von-Neumann architecture.  ...  As data movement operations and energy consumption become key bottlenecks in the design of computing systems, the interest in unconventional approaches such as Near-Data Processing (NDP), machine learning  ...  Therefore, one kernel design.  ... 
arXiv:2112.12630v1 fatcat:drkwrztkazd3hlblxc7i4kgn2a

Hardware-assisted Machine Learning in Resource-constrained IoT Environments for Security: Review and Future Prospective

Georgios Kornaros
2022 IEEE Access  
To protect an IoT infrastructure, various solutions look into hardware-based methods for ML-based IoT authentication, access control, secure offloading, and malware detection schemes.  ...  Machine learning (ML) based intrusion and anomaly detection has lately gained traction due to its capacity to cope with encrypted and rapidly developing threat techniques.  ...  With an actually realized hardware architecture for a Support Vector Machine kernel, a proposed security framework gives a detection accuracy of up to 97% for three expected Trojan attacks on a NoC-based  ... 
doi:10.1109/access.2022.3179047 fatcat:damwrncpzzbxzamtghwlmrg6v4

FPGA HLS Today: Successes, Challenges, and Opportunities

Jason Cong, Jason Lau, Gai Liu, Stephen Neuendorffer, Peichen Pan, Kees Vissers, Zhiru Zhang
2022 ACM Transactions on Reconfigurable Technology and Systems  
In multiple ways, Year 2011 marked an important transition for FPGA high-level synthesis (HLS), as it went from prototyping to deployment.  ...  We also discuss the challenges faced by today's HLS technology and the opportunities for further research and development, especially in the areas of achieving high clock frequency, coping with complex  ...  reuse buffer size.  ... 
doi:10.1145/3530775 fatcat:hacv5vmlczbanpiurj73knmrzm

A Survey on FPGA Virtualization

Anuj Vaishnav, Khoa Dang Pham, Dirk Koch
2018 2018 28th International Conference on Field Programmable Logic and Applications (FPL)  
Given the scale of deployment, there is a need for efficient application development, resource management, and scalable systems, which make FPGA virtualization extremely important.  ...  is performed at an arbitrary point in time.  ...  Note that systems may combine these models to form a hybrid architecture based on the application requirements.  ... 
doi:10.1109/fpl.2018.00031 dblp:conf/fpl/VaishnavPK18 fatcat:6ydu2dvlsndwfp5xuq527jvb4y

Near-Memory Computing on FPGAs with 3D-stacked Memories: Applications, Architectures, and Optimizations

Veronia Iskandar, Mohamed A. Abd El Ghany, Diana Goehringer
2022 ACM Transactions on Reconfigurable Technology and Systems  
Various FPGA-based NMC designs have been proposed with software and hardware optimization methods to achieve high performance and energy efficiency.  ...  FPGA vendors have started introducing 3D memories to their products in an effort to remain competitive on bandwidth requirements of modern memory-intensive applications.  ...  Prior architectures were based on a processor-centric design, where data is transferred to the CPU for processing, whereas with near-memory processing the compute cores are brought to the place where the data resides.  ... 
doi:10.1145/3547658 fatcat:zbmovoc4kfb6vm6f2ytjuxugri

Computing Graph Neural Networks: A Survey from Algorithms to Accelerators [article]

Sergi Abadal, Akshay Jain, Robert Guirado, Jorge López-Alonso, Eduard Alarcón
2021 arXiv   pre-print
Such an ability has strong implications in a wide variety of fields whose data is inherently relational, for which conventional neural networks do not perform well.  ...  Graph Neural Networks (GNNs) have exploded onto the machine learning scene in recent years owing to their capability to model and learn from graph-structured data.  ...  data reuse and I/O cost.  ... 
arXiv:2010.00130v3 fatcat:u5bcmjodcfdh7pew4nssjemdba

Computing Graph Neural Networks: A Survey from Algorithms to Accelerators

Sergi Abadal, Akshay Jain, Robert Guirado, Jorge López-Alonso, Eduard Alarcón
2022 ACM Computing Surveys  
Such an ability has strong implications in a wide variety of fields whose data are inherently relational, for which conventional neural networks do not perform well.  ...  Graph Neural Networks (GNNs) have exploded onto the machine learning scene in recent years owing to their capability to model and learn from graph-structured data.  ...  data reuse and I/O cost.  ... 
doi:10.1145/3477141 fatcat:6ef4jh3hrvefnoytckqyyous3m

2021 Index IEEE Transactions on Circuits and Systems II: Express Briefs Vol. 68

2021 IEEE Transactions on Circuits and Systems - II - Express Briefs  
An Impedance Adapting Compensation Scheme for High Current NMOS LDO Design. Cao, H., +, TCSII July 2021, 2287-2291.  ...  Gray Code-Based 10-Bit Source Driver for Large-Size OLED Display.  ...  Design of 2 × 8 Filtering Butler Matrix With Arbitrary Power Distribution.  ... 
doi:10.1109/tcsii.2022.3144928 fatcat:bm53w7gva5bthholfhhiq4yg3a

Intelligent Edge-Embedded Technologies for Digitising Industry [chapter]

Ovidiu Vermesan, Mario Diaz Nava
2022 Intelligent Edge-Embedded Technologies for Digitising Industry  
The series includes research monographs, edited volumes, handbooks and textbooks, providing professionals, researchers, educators, and advanced students in the field with an invaluable insight into the  ...  Topics range from the theory and use of systems involving all terminals, computers, and information processors to wired and wireless networks and network layouts, protocols, architectures, and implementations  ...  The input size is 13x64, the kernel sizes of Conv1 and Conv2 are 4x4x1, with 16 filters, and 3x3x16, with 8 filters, respectively, and the output size is 3x29x8.  ... 
doi:10.13052/rp-9788770226103 fatcat:mgz277pmkbetvbpoaomoktzgzi
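
The snippet in the entry above spells out a concrete two-layer CNN configuration: a 13x64 single-channel input, Conv1 with 4x4 kernels and 16 filters, Conv2 with 3x3 kernels and 8 filters, and a 3x29x8 output. Below is a minimal PyTorch sketch of that configuration; the strides, padding, and any pooling layers are not given in the snippet, so the defaults assumed here are placeholders and will not reproduce the reported 3x29x8 output shape exactly.

import torch
import torch.nn as nn

# Minimal sketch of the two-layer CNN quoted in the chapter snippet above.
# Strides, padding, and pooling are not stated there, so unit stride and no
# padding are assumed; these are assumptions, not the chapter's exact design.
model = nn.Sequential(
    nn.Conv2d(in_channels=1, out_channels=16, kernel_size=4),  # Conv1: 4x4x1 kernels, 16 filters
    nn.ReLU(),
    nn.Conv2d(in_channels=16, out_channels=8, kernel_size=3),  # Conv2: 3x3x16 kernels, 8 filters
    nn.ReLU(),
)

x = torch.randn(1, 1, 13, 64)  # one 13x64 single-channel input
print(model(x).shape)          # torch.Size([1, 8, 8, 59]) under the assumed strides

With these assumptions the spatial size shrinks only by the kernel extents (13x64 -> 10x61 -> 8x59), so matching the chapter's reported 3x29x8 output would require the unstated stride or pooling choices.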

A Survey on CNN and RNN Implementations

Javier Hoffmann, Osvaldo Navarro, Florian Kästner, Benedikt Janßen, Michael Hübner
unpublished
With this, we provide insights regarding the specific benefits and drawbacks of recent FPGA implementations of DNNs.  ...  Deep Neural Networks (DNNs) are widely used for complex applications, such as image and voice processing.  ...  These advantages are especially interesting with respect to the application of DNNs on embedded devices.  ... 
fatcat:gwn6ygocgfavxp6qgs73c62crm
Showing results 1 — 15 out of 23 results