A Design Methodology for Efficient Implementation of Deconvolutional Neural Networks on an FPGA
[article]
2017
arXiv
pre-print
However, to date, there has been little research on the use of FPGA implementations of deconvolutional neural networks (DCNNs). ...
map these DCNNs to an FPGA DCNN-plus-accelerator implementation to perform generative inference on a Xilinx Zynq-7000 FPGA. ...
Section III presents our methodology for efficiently implementing an FPGA-based deconvolution accelerator. Section IV explains our three-step design methodology. ...
arXiv:1705.02583v1
fatcat:qpscnbb5lzfmxfisbmautgy6om
An Energy-Efficient FPGA-based Deconvolutional Neural Networks Accelerator for Single Image Super-Resolution
[article]
2018
arXiv
pre-print
First, we propose a new methodology for optimizing the deconvolutional neural networks (DCNNs) used for increasing feature maps. ...
Finally, we quantize and compress a DCNN-based SR algorithm into an optimal model for efficient inference using on-chip memory. ...
Deconvolutional Neural Networks
III. RELATED WORK
A. ...
arXiv:1801.05997v3
fatcat:mbhed5c2lncqxitgmz5uhnivwy
A Competitive Edge: Can FPGAs Beat GPUs at DCNN Inference Acceleration in Resource-Limited Edge Computing Applications?
[article]
2021
arXiv
pre-print
As such, we design a spatio-temporally parallelized hardware architecture capable of accelerating a deconvolution algorithm optimized for power-efficient inference on a resource-limited FPGA. ...
We propose this FPGA-based accelerator to be used for Deconvolutional Neural Network (DCNN) inference in low-power edge computing applications. ...
ACKNOWLEDGEMENTS This work was supported in part by NSF awards CNS-1730158, ACI-1540112, ACI-1541349, OAC-1826967, the University of California Office of the President, and the California Institute for ...
arXiv:2102.00294v2
fatcat:x6gzg7v2anhprauyrghgowlkcm
Binarized Encoder-Decoder Network and Binarized Deconvolution Engine for Semantic Segmentation
2020
IEEE Access
BEDN has a network size of 0.21 MB, and its maximum memory usage is 1.38 MB. BiDE was implemented on Xilinx ZU7EV field-programmable gate array (FPGA) to operate at 187.5 MHz. ...
For this reason, the segmentation network requires a lot of hardware resources and power consumption, and it is difficult to be applied to an environment where they are limited, such as an embedded system ...
ACKNOWLEDGMENT Hyunwoo Kim was in charge of the development of hardware accelerator (BiDE) and Jeong Hoon Kim was in charge of network design and training (BEDN). ...
doi:10.1109/access.2020.3048375
fatcat:kcy6zjhtzzgslmhth367gtephm
2020 Index IEEE Transactions on Very Large Scale Integration (VLSI) Systems Vol. 28
2020
IEEE Transactions on Very Large Scale Integration (vlsi) Systems
Reconfigurable Power-Efficient Ternary Content-Addressable Memory on FPGAs; TVLSI Aug. 2020 Aug. 1925Aug ...
., Conflux-An Asynchronous Two-to-One Multiplexor for Time-Division Multiplexing and Clockless, Tokenless Readout; TVLSI Feb. 2020 503-515 Holcomb, D., see 2685-2698 Holcomb, D.E., see 1807-1820 Homayoun ...
., +, TVLSI June 2020 1540-1544 An Efficient Hardware Accelerator for Structured Sparse Convolutional Neural Networks on FPGAs. ...
doi:10.1109/tvlsi.2020.3041879
fatcat:33vb2eia2jfjpog4wei4peq5ge
Efficient Object Detection Framework and Hardware Architecture for Remote Sensing Images
2019
Remote Sensing
Moreover, for evaluating the performance of proposed hardware architecture, we implement it on Xilinx XC7Z100 field programmable gate array (FPGA) and test on the proposed CBFFSSD and VGG16 models. ...
Based on the analysis and optimization of the calculation of each layer in the algorithm, we propose efficient hardware architecture of deep learning processor with multiple neural processing units (NPUs ...
Conflicts of Interest: The authors declare no conflict of interest. Remote Sens. 2019, 11, 2376 ...
doi:10.3390/rs11202376
fatcat:6ubha7ol5bcx7mgpn3mnt3zawy
Smart sensors using artificial intelligence for on-detector electronics and ASICs
[article]
2022
arXiv
pre-print
, and implementations for next generation experiments. ...
In this paper, we discuss the motivations and potential applications for on-detector AI. ...
Figure 2: A typical workflow to translate an ML model into an FPGA or ASIC implementation using hls4ml. ...
arXiv:2204.13223v1
fatcat:2lm7k2epcraelasr6n4d6nshzu
A Memristor based Unsupervised Neuromorphic System Towards Fast and Energy-Efficient GAN
[article]
2019
arXiv
pre-print
We also proposed an efficient data flow for optimal parallelism training and testing, depending on the computation correlations between different computing blocks. ...
In this work, we proposed a holistic solution for fast and energy-efficient GAN computation through a memristor-based neuromorphic system. ...
In [11, 12], convolutional neural networks (CNNs) were deployed on a memristor crossbar and an on-line training process with backpropagation was implemented via a hardware and software co-design methodology ...
arXiv:1806.01775v4
fatcat:pckbn7vgvbadbfcb2fbqju3sui
2020 Index IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems Vol. 39
2020
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems
., +, TCAD Oct. 2020 2668-2681 DSP-Efficient Hardware Acceleration of Convolutional Neural Network Inference on FPGAs. ...
Entropy-Directed Scheduling for FPGA High-Level Synthesis. Shen, M., +, TCAD Oct. 2020 2588-2601 FLASH: Fast, Parallel, and Accurate Simulator for HLS. ...
doi:10.1109/tcad.2021.3054536
fatcat:wsw3olpxzbeclenhex3f73qlw4
Muon–Electron Pulse Shape Discrimination for Water Cherenkov Detectors Based on FPGA/SoC
2021
Electronics
One uses an artificial neural network (ANN) algorithm; the other exploits a correlation approach based on finite impulse response (FIR) filters. ...
We describe two methods for pulse shape detection and discrimination of muons and electrons implemented on FPGA. ...
[30] proposed a framework for deconvolutional neural networks (DCNN) hardware accelerators, focusing on the efficient utilization of the on-chip memory to improve the performance of the layers bounded ...
doi:10.3390/electronics10030224
fatcat:uljnj6urtbaynflwcbrf43xcy4
An Overview of Efficient Interconnection Networks for Deep Neural Network Accelerators
2020
IEEE Journal on Emerging and Selected Topics in Circuits and Systems
This paper provides a comprehensive investigation of the recent advances in efficient on-chip interconnection and design methodology of the DNN accelerator design. ...
Currently, a large body of research aims to find an efficient on-chip interconnection to achieve low-power and high-bandwidth DNN computing. ...
Implementing deconvolution on current ReRAM-based NN Accelerators, which are optimized for convolution, can significantly degrade performance and energy efficiency. ...
doi:10.1109/jetcas.2020.3022920
fatcat:idqitgwnrnegbd4dhrly3xsxbi
Towards Design Methodology of Efficient Fast Algorithms for Accelerating Generative Adversarial Networks on FPGAs
[article]
2019
arXiv
pre-print
Finally, we propose an efficient architecture for implementing Winograd DeConv by designing the line buffer and exploring the design space. ...
In this paper, we propose an efficient Winograd DeConv accelerator that combines these two orthogonal approaches on FPGAs. ...
Finally, we designed an efficient architecture by designing the line buffer and exploring the design space for the efficient implementation. ...
arXiv:1911.06918v1
fatcat:g2gpdp2ikjgjtbyjr334fyrsla
FPGA-Embedded Anomaly Detection System for Milling Process
2021
IEEE Access
The main goal of this work is to design a supervising controller able to detect an anomaly in the milling process and implement the solution in a Field Programmable Gate Array (FPGA) chip. ...
The detection method relies on determining selected signal features in the frequency domain and applying an auto-associative neural network (AANN) for novelty detection. ...
A methodology for real-time stator condition monitoring of an induction motor using a fuzzy system implemented on FPGA was presented in [18]. ...
doi:10.1109/access.2021.3110479
fatcat:uvgc3gzxljdrhaw6wp42fqlzg4
ZynqNet: An FPGA-Accelerated Embedded Convolutional Neural Network
[article]
2020
arXiv
pre-print
The ZynqNet Embedded CNN is designed for image classification on ImageNet and consists of ZynqNet CNN, an optimized and customized CNN topology, and the ZynqNet FPGA Accelerator, an FPGA-based architecture ...
This master thesis explores the potential of FPGA-based CNN acceleration and demonstrates a fully functional proof-of-concept CNN implementation on a Zynq System-on-Chip. ...
Acknowledgement First and foremost, I would like to thank my supervisor Emanuel Schmid for the pleasant ...
arXiv:2005.06892v1
fatcat:tduahjb5w5cjromemahngmt3gy
FPGA Implementation of a Novel Gaussian Filter Using Power Optimized Approximate Adders
2018
Indonesian Journal of Electrical Engineering and Computer Science
This paper discusses the implementation of a novel Gaussian smoothing filter with low power approximate adders in Field Programmable Gate Array (FPGA). ...
Hence the hardware implementation of the Gaussian filter becomes a reliable solution for real time image processing applications. ...
Efficient modified guided filter architecture is implemented on Xilinx FPGA device which offers less cost, speed and low power. ...
doi:10.11591/ijeecs.v11.i3.pp1048-1059
fatcat:hcnrb7gearhync5dwegjxezdxi