171 Hits in 2.0 sec

Accelerating Binarized Convolutional Neural Networks with Software-Programmable FPGAs

Ritchie Zhao, Weinan Song, Wentao Zhang, Tianwei Xing, Jeng-Hau Lin, Mani Srivastava, Rajesh Gupta, Zhiru Zhang
2017 Proceedings of the 2017 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays - FPGA '17  
As a result, existing CNN applications are typically run on clusters of CPUs or GPUs. Research on FPGA acceleration of CNN workloads has achieved reductions in power and energy consumption.  ...  A combination of low-precision networks and high-level design methodology may help address the performance and productivity gap between FPGAs and GPUs.  ...  The Tesla K40 GPU used for this research was donated by the NVIDIA Corporation.  ... 
doi:10.1145/3020078.3021741 fatcat:6yzshksxx5eanowroqnvez2kze

Accelerating Binarized Neural Networks via Bit-Tensor-Cores in Turing GPUs [article]

Ang Li, Simon Su
2020 arXiv   pre-print
Despite foreseeing tremendous speedups over conventional deep neural networks, the performance advantage of binarized neural networks (BNNs) has merely been showcased on general-purpose processors such as CPUs and GPUs.  ...  We built the full implementation for the inference of binarized neural networks.  ...
arXiv:2006.16578v2 fatcat:tb22mdssebfm3fycqivxoetopu

FCA-BNN: Flexible and Configurable Accelerator for Binarized Neural Networks on FPGA

Jiabao Gao, Yuchen Yao, Zhengjie Li, Jinmei Lai
2021 IEICE transactions on information and systems  
A series of Binarized Neural Networks (BNNs) show accepted accuracy in image classification tasks and achieve excellent performance on field-programmable gate arrays (FPGAs).  ...  For CIFAR-10 AlexNet, FCA-BNN achieves 188.2× and 60.6× better energy efficiency than CPU and GPU, respectively.  ...  Acknowledgements This work was supported in part by the National Natural Science Foundation of China under Grant No. U20A20202.  ...
doi:10.1587/transinf.2021edp7054 fatcat:aad7a2wdl5hjlbgrvoj7dsywzi

Binarized Convolutional Neural Networks with Separable Filters for Efficient Hardware Acceleration [article]

Jeng-Hau Lin, Tianwei Xing, Ritchie Zhao, Zhiru Zhang, Mani Srivastava, Zhuowen Tu, Rajesh K. Gupta
2017 arXiv   pre-print
State-of-the-art convolutional neural networks are enormously costly in both compute and memory, demanding massively parallel GPUs for execution.  ...  We verify BCNNw/SF on the MNIST, CIFAR-10, and SVHN datasets, and implement an accelerator for CIFAR-10 on FPGA hardware.  ...  We also implement an accelerator for the inference of a CIFAR-10 network on an FPGA platform.  ... 
arXiv:1707.04693v1 fatcat:i4byufawe5ctjewbd4zwxroi4u
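
The two entries above build on separable filters. As a hedged sketch of why separability is cheap (this is the generic rank-1 factorization idea, not the paper's exact BCNNw/SF scheme; all names are illustrative): a k×k kernel that factors into a column and a row vector turns one 2D convolution into two 1D passes, cutting multiplies per output from k² to 2k.

```python
# Minimal sketch (assumption): separable 2D convolution as two 1D passes.
# The kernel is the outer product of `col` and `row`; padding is 'valid'.
def conv2d_separable(img, col, row):
    """Convolve img (list of lists) with the rank-1 kernel outer(col, row)."""
    kh, kw = len(col), len(row)
    h, w = len(img), len(img[0])
    # Vertical pass: slide `col` down each column.
    tmp = [[sum(col[i] * img[y + i][x] for i in range(kh))
            for x in range(w)] for y in range(h - kh + 1)]
    # Horizontal pass: slide `row` across each row of the intermediate.
    return [[sum(row[j] * tmp[y][x + j] for j in range(kw))
             for x in range(w - kw + 1)] for y in range(len(tmp))]
```

For example, a 3×3 all-ones image convolved with the 2×2 all-ones separable kernel (col = row = [1, 1]) yields a 2×2 output of 4s, matching the full 2D convolution.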

Binarized Convolutional Neural Networks with Separable Filters for Efficient Hardware Acceleration

Jeng-Hau Lin, Tianwei Xing, Ritchie Zhao, Zhiru Zhang, Mani Srivastava, Zhuowen Tu, Rajesh K. Gupta
2017 2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)  
State-of-the-art convolutional neural networks are enormously costly in both compute and memory, demanding massively parallel GPUs for execution.  ...  We verify BCNNw/SF on the MNIST, CIFAR-10, and SVHN datasets, and implement an accelerator for CIFAR-10 on FPGA hardware.  ...  Introduction Although the neural network community has been prospering for decades, state-of-the-art CNNs still demand significant computing resources (i.e., high-performance GPUs), and are eminently  ...
doi:10.1109/cvprw.2017.48 dblp:conf/cvpr/LinXZZSTG17 fatcat:akx4u2xz2nc4jnq3kmi5jce7lu

ReBNN: in-situ acceleration of binarized neural networks in ReRAM using complementary resistive cell

Linghao Song, You Wu, Xuehai Qian, Hai Li, Yiran Chen
2019 CCF Transactions on High Performance Computing  
The binarized neural network (BNN) is a hardware-friendly model that can dramatically reduce the computation and storage overheads.  ...  than state-of-the-art BNN accelerators.  ...  During execution of the whole binarized neural network, a static and regular communication graph is formed between CRC arrays based on order of layers in a binarized neural network.  ... 
doi:10.1007/s42514-019-00014-8 fatcat:eh7yeyvatrczpamq5777utvbri

GUINNESS: A GUI Based Binarized Deep Neural Network Framework for Software Programmers

Hiroki Nakahara, Haruyoshi Yonekawa, Tomoya Fujii, Masayuki Shimoda, Shimpei Sato
2019 IEICE transactions on information and systems  
A tool flow for a binarized deep neural network toward FPGA implementation, based on a GUI covering both training on the GPU and inference on the FPGA.  ...  We compare the proposed FPGA design with the CPU and GPU designs.  ...  Acknowledgements This research is supported in part by the Grants-in-Aid for Scientific Research from JSPS, and the New Energy and Industrial Technology Development Organization (NEDO).  ...
doi:10.1587/transinf.2018rcp0002 fatcat:55dvdmcw4zf2zeqrmg2tzm6j4e

Learning on Hardware: A Tutorial on Neural Network Accelerators and Co-Processors [article]

Lukas Baischer, Matthias Wess, Nima TaheriNejad
2021 arXiv   pre-print
However, there are various neural network hardware accelerator platforms, such as graphics processing units (GPUs), application-specific integrated circuits (ASICs) and field-programmable gate arrays (FPGAs).  ...  In this article an overview of existing neural network hardware accelerators and acceleration methods is given.  ...  Section 7, Section 8 and Section 9 present the most commonly used neural network hardware accelerator platforms, namely GPUs, ASICs and FPGAs.  ...
arXiv:2104.09252v1 fatcat:625wtuskhff3lbswhwmj7decni

Recent Advances in Convolutional Neural Network Acceleration [article]

Qianru Zhang, Meng Zhang, Tinghuan Chen, Zhifei Sun, Yuzhe Ma, Bei Yu
2018 arXiv   pre-print
In recent years, convolutional neural networks (CNNs) have shown great performance in various fields such as image classification, pattern recognition, and multi-media compression.  ...  Finally, we give a discussion on different perspectives of these acceleration and optimization methods within each level.  ...  Table 7: Performance comparison among GPU, FPGA, and ASIC.  ...
arXiv:1807.08596v1 fatcat:jx66ekaofjhqzdbaueal476bvi

A Review of Binarized Neural Networks

Taylor Simons, Dah-Jye Lee
2019 Electronics  
In this work, we review Binarized Neural Networks (BNNs). BNNs are deep neural networks that use binary values for activations and weights, instead of full precision values.  ...  BNNs are also good candidates for deep learning implementations on FPGAs and ASICs due to their bitwise efficiency.  ...  They compare the execution performance of the ASIC implementations with implementations in an FPGA, CPU and GPU.  ... 
doi:10.3390/electronics8060661 fatcat:7cvd6fn2undjdhuunnthsrzfzu

Automated flow for compressing convolution neural networks for efficient edge-computation with FPGA [article]

Farhan Shafiq, Takato Yamada, Antonio T. Vilchez, Sakyasingha Dasgupta
2017 arXiv   pre-print
This flow involves quantization of model parameters and activations, generation of the network and model in embedded C, followed by automatic generation of the FPGA accelerator for binary convolutions.  ...  Due to the large size of these models, they are typically run on clusters of CPUs or GPUs.  ...  However, the computation and memory demands of recent CNN architectures require powerful GPUs, distributed CPU servers, or specialized ASIC or DSP processors.  ...
arXiv:1712.06272v1 fatcat:3dwv7runyndtdlvicdctb633bm
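
The flow above starts by quantizing model parameters and activations for binary convolution. A minimal sketch of one common binarization scheme (sign with a mean-absolute-value scale, in the style of XNOR-Net; the function name and shapes are illustrative, not this tool's actual API):

```python
# Sketch (assumption): per-tensor weight binarization, w ≈ alpha * sign(w).
def binarize_weights(w):
    """Map real-valued weights to {-1.0, +1.0} plus a scalar scale alpha.

    alpha is the mean absolute value of w, which minimizes the L2 error
    of the rank-1 approximation alpha * sign(w)."""
    alpha = sum(abs(v) for v in w) / len(w)
    signs = [1.0 if v >= 0 else -1.0 for v in w]
    return alpha, signs
```

With the signs stored as single bits and alpha folded into a later scaling stage, the binary convolutions the flow generates need no full-precision multipliers.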

Accelerating Neural Network Inference on FPGA-Based Platforms—A Survey

Ran Wu, Xinmin Guo, Jian Du, Junbao Li
2021 Electronics  
The architecture of networks and characteristics of FPGA are analyzed, compared and summarized, as well as their influence on acceleration tasks.  ...  In this paper, we research neural networks which are involved in the acceleration on FPGA-based platforms.  ...  GPU-Based Acceleration GPUs are still the most widely used processors in neural network development.  ... 
doi:10.3390/electronics10091025 doaj:92e7eb4228a44c6387f846a1203529d0 fatcat:2xa7dv5hsjbczpvc4w6acdehwu

FP-BNN: Binarized neural network on FPGA

Shuang Liang, Shouyi Yin, Leibo Liu, Wayne Luk, Shaojun Wei
2018 Neurocomputing  
This paper presents FP-BNN, a Binarized Neural Network (BNN) for FPGAs, which drastically cuts down the hardware consumption while maintaining acceptable accuracy.  ...  Conclusion This paper presents FP-BNN, our design for binarized neural networks targeting FPGA technology.  ...
doi:10.1016/j.neucom.2017.09.046 fatcat:afwmh2k2obbelbrmsaoaa5eq2e

Training Hardware for Binarized Convolutional Neural Network Based on CMOS Invertible Logic

Duckgyu Shin, Naoya Onizawa, Warren J. Gross, Takahiro Hanyu
2020 IEEE Access  
For performance evaluation, the proposed hardware is implemented on an FPGA and trains a binarized 2-layer convolutional neural network model using a modified MNIST dataset.  ...  The proposed hardware obtains parameters of neural networks such as weights directly from given data (an input feature map and a true label) without backpropagation.  ...  For example, inference of binarized neural networks (BNNs) can be implemented by XNOR gates and bit counters because of binarized activations and weights [6] .  ... 
doi:10.1109/access.2020.3029576 fatcat:3nn6glwg6ndibfqf36oy2d34cm
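
The snippet above notes that BNN inference reduces to XNOR gates and bit counters once activations and weights are binary. A minimal sketch of that dot-product trick, with plain Python integers standing in for hardware bit vectors (bit 1 encodes +1, bit 0 encodes -1; all names illustrative):

```python
# Sketch (assumption): binary dot product via XNOR + popcount.
def binary_dot(a_bits: int, b_bits: int, n: int) -> int:
    """Dot product of two length-n {-1,+1} vectors packed as n-bit ints."""
    mask = (1 << n) - 1
    xnor = ~(a_bits ^ b_bits) & mask   # 1 wherever the signs agree
    matches = bin(xnor).count("1")     # popcount = number of agreements
    return 2 * matches - n             # agreements minus disagreements
```

For example, (+1, +1, -1, +1) · (+1, -1, -1, +1) = 2, and `binary_dot(0b1101, 0b1001, 4)` reproduces that with one XNOR and one popcount instead of four multiplies.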

Energy Proportional Neural Network Inference with Adaptive Voltage and Frequency Scaling

Jose L. Nunez-Yanez
2018 IEEE transactions on computers  
This research presents the extension and application of a voltage and frequency scaling framework called Elongate to a high-performance and reconfigurable binarized neural network.  ...  The neural network is created in the FPGA reconfigurable fabric and coupled to a multiprocessor host that controls the operational point to obtain energy proportionality.  ...  ACKNOWLEDGMENTS This work was partially supported by Xilinx and UK EPSRC with the ENPOWER (EP/L00321X/1) and the ENEAC (EP/N002539/1) projects.  ... 
doi:10.1109/tc.2018.2879333 fatcat:mrorbc3ol5g5ldrvrf5gvtzznm
Showing results 1 — 15 out of 171 results