Hardware design space exploration using HercuLeS HLS
2013
Proceedings of the 17th Panhellenic Conference on Informatics - PCI '13
... IP integration, d) backend C code generation for compiled simulation, and e) an exemplary case of DSE. ...
HercuLeS is an extensible high-level synthesis (HLS) environment. ...
... since it uses a graph-based back-end; d) open specifications such as Graphviz [9] and NAC are used throughout the HLS process, and e) the generated HDL code is completely vendor- and technology-independent ...
doi:10.1145/2491845.2491865
dblp:conf/pci/KavvadiasM13
fatcat:4o6s4hyztff3hmbc4p72b63rvy
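As a rough illustration of the graph-based, Graphviz-centred flow the abstract describes, the sketch below emits a toy dataflow graph as a Graphviz DOT file that the dot tool can render. This is not HercuLeS's actual intermediate format; the graph contents and file name are made up.

```cpp
// Minimal sketch (not HercuLeS's actual IR): write a toy dataflow graph for
// out = (a + b) * c in Graphviz DOT, the open text format mentioned in the abstract.
#include <fstream>
#include <string>
#include <vector>

struct Edge { std::string src, dst; };  // hypothetical value/operation edge

int main() {
    std::vector<Edge> edges = {{"a", "add0"}, {"b", "add0"},
                               {"add0", "mul0"}, {"c", "mul0"}, {"mul0", "out"}};
    std::ofstream dot("dataflow.dot");
    dot << "digraph dataflow {\n";
    for (const auto& e : edges)
        dot << "  " << e.src << " -> " << e.dst << ";\n";
    dot << "}\n";
    // Render with: dot -Tpng dataflow.dot -o dataflow.png
}
```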
A Toolchain for Dynamic Function Off-load on CPU-FPGA Platforms
2015
Journal of Information Processing
This new toolchain for accelerating application on CPU-FPGA platforms, called Courier-FPGA, extracts runtime information from a running target binary, and re-constructs the function call graph including ...
Then, it synthesizes hardware modules on the FPGA and builds the corresponding software functions on the CPU using its Pipeline Generator. ...
Acknowledgments The present study is supported in part by the JST/CREST program entitled "Research and Development on Unified Environment of Accelerated Computing and Interconnection for Post-Petascale ...
doi:10.2197/ipsjjip.23.153
fatcat:2pb5qh2ocjaiver3f2bpiykyba
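The abstract describes reconstructing a function call graph from runtime information gathered on a running binary. Below is a minimal, hedged sketch of that general idea, assuming the runtime information has already been reduced to a trace of (caller, callee) pairs; it is not Courier-FPGA's actual code, and the function names in the trace are illustrative.

```cpp
// Hedged sketch (not Courier-FPGA's implementation): rebuild a function call
// graph from a dynamic trace of (caller, callee) pairs.
#include <iostream>
#include <map>
#include <set>
#include <string>
#include <utility>
#include <vector>

using CallGraph = std::map<std::string, std::set<std::string>>;

CallGraph build_call_graph(const std::vector<std::pair<std::string, std::string>>& trace) {
    CallGraph g;
    for (const auto& [caller, callee] : trace)
        g[caller].insert(callee);  // deduplicate repeated dynamic calls
    return g;
}

int main() {
    // Hypothetical trace captured from a running binary
    auto g = build_call_graph({{"main", "filter"}, {"filter", "fir"},
                               {"main", "filter"}, {"filter", "fft"}});
    for (const auto& [caller, callees] : g)
        for (const auto& callee : callees)
            std::cout << caller << " -> " << callee << "\n";
}
```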
Pushing the Level of Abstraction of Digital System Design: a Survey on How to Program FPGAs
2022
ACM Computing Surveys
They are state-of-the-art for prototyping, telecommunications, and embedded systems, and an emerging alternative for cloud-scale acceleration. ...
We review these abstraction solutions, provide a timeline, and propose a taxonomy for each abstraction trend: programming models for HDLs; IP-based or System-based toolchains for HLS; application, architecture ...
ACKNOWLEDGEMENTS The authors are grateful for feedback from the reviewers and NECSTLab members, with a particular mention to A. Damiani, A. Parravicini, E. D'Arnese, F. Carloni, F. Peverelli, and R. ...
doi:10.1145/3532989
fatcat:nsk5lwvt3vba5fbxmaj7sgpwru
HipaccVX: Wedding of OpenVX and DSL-based Code Generation
[article]
2020
arXiv
pre-print
These optimizations can double the throughput on an Nvidia GTX GPU and decrease the resource usage of a Xilinx Zynq FPGA by 50% for our benchmarks. ...
Yet, OpenVX's algorithm space is constrained to a small set of vision functions. This hinders the acceleration of computations that are not included in the standard. ...
The filter graph function (Line 21) ...
Algorithm 1: Graph Analysis for Dead Computation Elimination. Input: G_app, the application graph, and D_nv, the set of non-virtual data nodes; output: G_filt, the optimized application ...
arXiv:2008.11476v1
fatcat:e4yyu4ei7nayjma5rpt6p3nmei
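Algorithm 1 takes the application graph and the set of non-virtual data nodes and returns a filtered graph. The following is a minimal sketch of the presumed idea, assuming dead computation means any node from which no non-virtual (user-visible) data node is reachable; the authors' exact formulation may differ, and all identifiers here are illustrative.

```cpp
// Hedged sketch of the idea behind Algorithm 1 (assumed semantics, not the
// authors' code): keep only nodes from which a non-virtual data node is
// reachable; everything else is dead computation.
#include <iostream>
#include <map>
#include <set>
#include <string>
#include <vector>

using Graph = std::map<std::string, std::vector<std::string>>;  // node -> successors

std::set<std::string> live_nodes(const Graph& g_app, const std::set<std::string>& d_nv) {
    // Reverse the edges, then flood-fill backwards from the non-virtual data nodes.
    Graph rev;
    for (const auto& [src, dsts] : g_app)
        for (const auto& dst : dsts) rev[dst].push_back(src);

    std::set<std::string> live(d_nv.begin(), d_nv.end());
    std::vector<std::string> work(d_nv.begin(), d_nv.end());
    while (!work.empty()) {
        std::string n = work.back(); work.pop_back();
        for (const auto& pred : rev[n])
            if (live.insert(pred).second) work.push_back(pred);
    }
    return live;  // G_filt = G_app restricted to these nodes
}

int main() {
    Graph g = {{"img", {"sobel", "unused_blur"}}, {"sobel", {"mag"}},
               {"mag", {"out"}}, {"unused_blur", {"tmp"}}};
    for (const auto& n : live_nodes(g, {"out"}))  // drops unused_blur and tmp
        std::cout << n << "\n";
}
```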
HipaccVX: wedding of OpenVX and DSL-based code generation
2020
Journal of Real-Time Image Processing
These optimizations can double the throughput on an Nvidia GTX GPU and decrease the resource usage of a Xilinx Zynq FPGA by 50% for our benchmarks. ...
Yet, OpenVX's algorithm space is constrained to a small set of vision functions. This hinders the acceleration of computations that are not included in the standard. ...
Listing 1 shows an example OpenVX code for a simple edge detection algorithm, for which the application graph is shown in Fig. 1. ...
doi:10.1007/s11554-020-01015-5
fatcat:iowzgiohnvc3beo4at6aamcb5y
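For reference, a typical OpenVX edge-detection graph of the kind the abstract refers to (Gaussian blur, Sobel, gradient magnitude) looks roughly like the sketch below. This is a generic OpenVX 1.x example, not necessarily the paper's Listing 1, and the image dimensions are arbitrary.

```cpp
// Hedged reconstruction of a typical OpenVX edge-detection graph
// (Gaussian -> Sobel -> Magnitude); the paper's Listing 1 may differ in detail.
#include <VX/vx.h>

int main() {
    vx_context ctx = vxCreateContext();
    vx_graph graph = vxCreateGraph(ctx);

    vx_image in   = vxCreateImage(ctx, 640, 480, VX_DF_IMAGE_U8);
    vx_image out  = vxCreateImage(ctx, 640, 480, VX_DF_IMAGE_S16);
    // Virtual images: intermediate data the runtime is free to optimize away.
    vx_image blur = vxCreateVirtualImage(graph, 640, 480, VX_DF_IMAGE_U8);
    vx_image gx   = vxCreateVirtualImage(graph, 640, 480, VX_DF_IMAGE_S16);
    vx_image gy   = vxCreateVirtualImage(graph, 640, 480, VX_DF_IMAGE_S16);

    vxGaussian3x3Node(graph, in, blur);
    vxSobel3x3Node(graph, blur, gx, gy);
    vxMagnitudeNode(graph, gx, gy, out);

    if (vxVerifyGraph(graph) == VX_SUCCESS)
        vxProcessGraph(graph);  // the runtime schedules the whole graph at once
    vxReleaseContext(&ctx);
}
```

Because the whole computation is expressed as a graph before execution, a framework layered on OpenVX (as HipaccVX is) can analyze and optimize it, for example by eliminating dead computation as in Algorithm 1 above.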
SNNAP: Approximate computing on programmable SoCs via neural acceleration
2015
2015 IEEE 21st International Symposium on High Performance Computer Architecture (HPCA)
We describe the design and implementation of SNNAP, a flexible FPGA-based neural accelerator for approximate programs. ...
No hardware expertise is required to accelerate software with SNNAP, so the effort required can be substantially lower than custom hardware design for an FPGA fabric and possibly even lower than current ...
The authors thank Eric Chung for his help on prototyping accelerators on the Zynq. ...
doi:10.1109/hpca.2015.7056066
dblp:conf/hpca/MoreauWNSECO15
fatcat:kcr5mngnrncbfmeanik46g3t4i
Hardware Compilation of Deep Neural Networks: An Overview
2018
2018 IEEE 29th International Conference on Application-specific Systems, Architectures and Processors (ASAP)
Deploying a deep neural network model on a reconfigurable platform, such as an FPGA, is challenging due to the enormous design spaces of both network models and hardware design. ...
Design templates for neural network accelerators are studied with a specific focus on their derivation methodologies. ...
In other implementations, the authors applied polyhedral analysis for DNN acceleration on non-FPGA platforms. ...
doi:10.1109/asap.2018.8445088
dblp:conf/asap/ZhaoLNWDNWSCCL18
fatcat:v5txrrsfifa6bah2oksjdlrsgi
Virtualized Execution Runtime for FPGA Accelerators in the Cloud
2017
IEEE Access
Parallel kernels are used to generate an HLS description, which is processed by the HLS tool (Vivado HLS 2015.4.2 in our case) to generate an RTL description of the hardware accelerators. ...
Lastly, a third approach supports direct RTL description of accelerators, bypassing the HLS generation stage. ...
doi:10.1109/access.2017.2661582
fatcat:donw4yrggjdftkgob3x5l5r664
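The flow described above starts from parallel kernels written for an HLS tool. As a rough, hedged example of what such an input looks like, below is a minimal vector-add kernel with standard Vivado HLS interface and pipeline pragmas; it is not one of the paper's actual kernels.

```cpp
// Hedged sketch (not the paper's kernels): a minimal C++ kernel of the kind
// Vivado HLS compiles into an RTL accelerator.
extern "C" void vadd(const int* a, const int* b, int* c, int n) {
#pragma HLS INTERFACE m_axi port=a bundle=gmem
#pragma HLS INTERFACE m_axi port=b bundle=gmem
#pragma HLS INTERFACE m_axi port=c bundle=gmem
#pragma HLS INTERFACE s_axilite port=n
#pragma HLS INTERFACE s_axilite port=return
    for (int i = 0; i < n; ++i) {
#pragma HLS PIPELINE II=1  // one element per clock cycle once the pipeline fills
        c[i] = a[i] + b[i];
    }
}
```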
Applications and Techniques for Fast Machine Learning in Science
[article]
2021
arXiv
pre-print
This community report is intended to give plenty of examples and inspiration for scientific discovery through integrated and accelerated ML solutions. ...
The material for the report builds on two workshops held by the Fast ML for Science community and covers three main areas: applications for fast ML across a number of scientific domains; techniques for ...
Most of the efforts are focused on digital CMOS technology, such as implementations based on general-purpose TPUs/GPUs, FPGAs, and more specialized ML hardware accelerators. ...
arXiv:2110.13041v1
fatcat:cvbo2hmfgfcuxi7abezypw2qrm
HIR: An MLIR-based Intermediate Representation for Hardware Accelerator Description
[article]
2021
arXiv
pre-print
Though FPGAs are an ideal target for energy-efficient custom accelerators, the difficulty of hardware design and the lack of vendor-agnostic, standardized hardware compilation infrastructure have hindered ...
Our implementation shows that the code generation time of the HIR code generator is on average 1112x lower than that of Xilinx Vivado HLS on a range of kernels without a compromise on the quality of the ...
For example, the generated accelerator may be deployed as a part of a larger design containing PCIe controllers, DRAM controllers and soft CPU cores (CPU implemented on an FPGA). ...
arXiv:2103.00194v1
fatcat:vwv7jfr2ofgxjih7uamqxvv4xe
FPGA-based accelerator development for non-engineers
2014
2014 International Conference on ReConFigurable Computing and FPGAs (ReConFig14)
However, domain experts, who are the brains behind this processing, typically lack the skills required to build FPGA-based hardware accelerators ideal for their applications, as traditional development ...
The efficacy of these flows in extending FPGA-based acceleration to non-engineers in the life sciences was informally tested at two separate instances of an NSF-funded summer workshop, organized and hosted ...
In general, however, the use of a non-standard language as the entry format of an HLS tool creates a barrier to widespread adoption [25]. ...
doi:10.1109/reconfig.2014.7032522
dblp:conf/reconfig/UlianaAK14
fatcat:s6krmx2zerbffnpagsanagvjwy
High-Level Synthesis in the Delft Workbench Hardware/Software Co-design Tool-Chain
2014
2014 12th IEEE International Conference on Embedded and Ubiquitous Computing
This advantage, coupled with the increasing number of available heterogeneous platforms that loosely couple general-purpose processors with Field-Programmable Gate Array (FPGA)-based co-processors, led to increasing attention to HLS tool development and optimization from both academia and industry. ...
Reconfigurable fabrics such as FPGAs can be used as stand-alone processing units or in combination with a General-Purpose Processor (GPP). ...
doi:10.1109/euc.2014.28
dblp:conf/euc/NaneSPGB14
fatcat:elr5fcxj6bec7btbpuywsvnfnq
Are Coarse-Grained Overlays Ready for General Purpose Application Acceleration on FPGAs?
2016
2016 IEEE 14th Intl Conf on Dependable, Autonomic and Secure Computing, 14th Intl Conf on Pervasive Intelligence and Computing, 2nd Intl Conf on Big Data Intelligence and Computing and Cyber Science and Technology Congress(DASC/PiCom/DataCom/CyberSciTech)
Section V examines the use of an FPGA overlay for general purpose application acceleration within a hybrid FPGA. Finally, we conclude in Section VI.
Are Coarse-Grained Overlays Ready for General Purpose Application Acceleration on FPGAs? Abhishek Kumar Jain*, Douglas L. Maskell*, and Suhaib A. ...
doi:10.1109/dasc-picom-datacom-cyberscitec.2016.110
dblp:conf/dasc/JainMF16
fatcat:gmiz7uunpbaatjryzjiozj24om
Best-Effort FPGA Programming: A Few Steps Can Go a Long Way
[article]
2018
arXiv
pre-print
FPGA-based heterogeneous architectures provide programmers with the ability to customize their hardware accelerators for flexible acceleration of many workloads. ...
We show that for a broad class of accelerator benchmarks from MachSuite, the proposed best-effort guideline improves the FPGA accelerator performance by 42-29,030x. ...
... accelerator design expert. ...
arXiv:1807.01340v1
fatcat:6ocpzvp2cvgkninbtyvvyk7yiu
DeCO: A DSP Block Based FPGA Accelerator Overlay with Low Overhead Interconnect
2016
2016 IEEE 24th Annual International Symposium on Field-Programmable Custom Computing Machines (FCCM)
However, achieving the desired performance often still requires detailed low-level design engineering effort that is difficult for non-experts. ...
..., long compilation times, and poor design productivity are major issues preventing the mainstream adoption of FPGA-based accelerators in general-purpose computing [2]. ...
doi:10.1109/fccm.2016.10
dblp:conf/fccm/JainLSMF16
fatcat:q5hlrqeoyrezll4o7qtsfnyp7e
Showing results 1 — 15 out of 167 results