User Driven FPGA-Based Design Automated Framework of Deep Neural Networks For Low-Power Low-Cost Edge Computing

Tarek Belabed, Maria Gracielly F. Coutinho, Marcelo A. C. Fernandes, Carlos Valderrama, Chokri Souani
2021 IEEE Access  
In this context, we propose an automated framework for the implementation of hardware-accelerated DNN architectures.  ...  However, owing to topologies with many hidden layers, Deep Neural Networks (DNNs) have high computational complexity, which makes their deployment difficult in contexts highly constrained by requirements  ...  The high-level design framework FP-DNN [53] enables TensorFlow DNN specifications to be mapped to FPGAs using HLS-RTL hybrid templates (RTL components written in Verilog and HLS in OpenCL).  ... 
doi:10.1109/access.2021.3090196 fatcat:3bqqr45lmreb7ptws7gprxj23y

hls4ml: An Open-Source Codesign Workflow to Empower Scientific Low-Power Machine Learning Devices [article]

Farah Fahim, Benjamin Hawks, Christian Herwig, James Hirschauer, Sergo Jindariani, Nhan Tran, Luca P. Carloni, Giuseppe Di Guglielmo, Philip Harris, Jeffrey Krupa, Dylan Rankin, Manuel Blanco Valentin (+18 others)
2021 arXiv   pre-print
To support domain scientists, we have developed hls4ml, an open-source software-hardware codesign workflow to interpret and translate machine learning algorithms for implementation with both FPGA and ASIC  ...  , long pipeline kernels for low power, and new device backends include an ASIC workflow.  ...  FP-DNN [26] is a framework that takes TensorFlow [27]-described CNNs as input, and generates the hardware implementations on FPGA boards with RTL-HLS hybrid templates.  ... 
arXiv:2103.05579v3 fatcat:5zsggdpmfng6bnfxrnv72tw7q4

HybridDNN: A Framework for High-Performance Hybrid DNN Accelerator Design and Implementation [article]

Hanchen Ye, Xiaofan Zhang, Zhize Huang, Gengsheng Chen, Deming Chen
2020 arXiv   pre-print
To speed up Deep Neural Networks (DNN) accelerator design and enable effective implementation, we propose HybridDNN, a framework for building high-performance hybrid DNN accelerators and delivering FPGA-based  ...  Experimental results show that the accelerators generated by HybridDNN can deliver 3375.7 and 83.3 GOPS on a high-end FPGA (VU9P) and an embedded FPGA (PYNQ-Z1), respectively, which achieve a 1.8x higher  ...  ACKNOWLEDGMENTS This work is supported in part by the IBM-Illinois Center for Cognitive Computing Systems Research (C3SR) and  ... 
arXiv:2004.03804v1 fatcat:2r7ymftbordw5odrfndowsuxg4

Fast convolutional neural networks on FPGAs with hls4ml

Thea Klaeboe Aarrestad, Vladimir Loncar, Nicolo Ghielmetti, Maurizio Pierini, Sioni Paris Summers, Jennifer Ngadiuba, Christoffer Petersson, Hampus Linander, Yutaro Iiyama, Giuseppe Di Guglielmo, Javier Mauricio Duarte, Philip Harris (+8 others)
2021 Machine Learning: Science and Technology  
We introduce an automated tool for deploying ultra low-latency, low-power deep neural networks with convolutional layers on field-programmable gate arrays (FPGAs).  ...  We show that the FPGA critical resource consumption can be reduced by 97% with zero loss in model accuracy, and by 99% when tolerating a 6% accuracy degradation.  ...  Acknowledgment We acknowledge the Fast Machine Learning collective as an open community of multi-domain experts and collaborators.  ... 
doi:10.1088/2632-2153/ac0ea1 fatcat:es2g6kfu3za4djdb4qs3dyr6ba

FPGA-based Accelerators of Deep Learning Networks for Learning and Classification: A Review [article]

Ahmad Shawahna, Sadiq M. Sait, Aiman El-Maleh
2019 pre-print
In this paper, we review recent existing techniques for accelerating deep learning networks on FPGAs.  ...  More precisely, FPGAs have been recently adopted for accelerating the implementation of deep learning networks due to their ability to maximize parallelism as well as due to their energy efficiency.  ...  FP-DNN [171] is an end-to-end framework that automatically generates optimized FPGA-based implementations of deep neural networks (DNNs) using an RTL-HLS hybrid library.  ... 
doi:10.1109/access.2018.2890150 arXiv:1901.00121v1 fatcat:ifrv2rtazrffbl7nj6cbfkg7sq

Artificial neural networks acceleration on field-programmable gate arrays considering model redundancy

Jiang Su, Peter Y. K. Cheung, David B. Thomas
The main topics discussed in this thesis include neural network redundancy and its impact on hardware systems.  ...  Therefore, it is important to study new design methods for ANN hardware systems that produce high model accuracy with low resource usage.  ...  FP-DNN [ea17b] is another interesting work that provides a tool flow which maps from a high level NN description written in TensorFlow to generated RTL-HLS hybrid templates.  ... 
doi:10.25560/66261 fatcat:inxrbgihprh45epacix3olx46a