Reconfigurable acceleration of 3D image registration
2009
2009 5th Southern Conference on Programmable Logic (SPL)
This paper proposes techniques for accelerating a software based image registration algorithm for 3D medical images targeting a reconfigurable hardware platform. ...
Based on the reconfigurability of FPGA devices, the system can be extended to swap modules optimized for different parameters, and to adopt more advanced registration algorithms. ...
Conclusion This paper presents a reconfigurable framework for accelerating registration algorithms for 3D medical images. ...
doi:10.1109/spl.2009.4914908
fatcat:fcrvvrhicndv5fctivlcivokay
An Automated Framework for Accelerating Numerical Algorithms on Reconfigurable Platforms Using Algorithmic/Architectural Optimization
2009
IEEE transactions on computers
Subsequently, TANOR automatically generates a configuration bitstream for a target FPGA along with associated drivers and control software necessary to direct the application from a host PC. ...
This paper describes TANOR, an automated framework for designing hardware accelerators for numerical computation on reconfigurable platforms. ...
ACKNOWLEDGMENTS This work is supported in part by grants from the US Defense Advanced Research Projects Agency W911NF-05-1-0248 and the US National Science Foundation CAREER 0093085. ...
doi:10.1109/tc.2009.78
fatcat:zklqp4ljhngc7jostzhlv2jgpq
Survey on Multigrained Reconfigurable Architecture using Parallel Mapping Method
2017
Indian Journal of Science and Technology
A new folding tree algorithm (MGRA) with CRGA is proposed to eliminate PEs. ...
Findings: For better execution characteristics of parallel mapping on MGRA, more PE utilisation rate and less memory access overhead are considered as resulting conditions. ...
Introduction Spatial computing is a method that typically uses a significant number of simple parallel processing elements, operating simultaneously, to execute a single application or application kernel ...
doi:10.17485/ijst/2017/v10i6/110837
fatcat:eseoaf2g2vbrfajebekoz5frl4
TEXTAROSSA: Towards EXtreme scale Technologies and Accelerators for euROhpc hw/Sw Supercomputing Applications for exascale
2021
2021 24th Euromicro Conference on Digital System Design (DSD)
Acknowledgements This work is supported by the TEXTAROSSA project G.A. n.956831, as part of the EuroHPC initiative. ...
The Reverse Time Migration application and mini-kernels are used within EPI to co-design the STX Accelerator and have been ported to FPGAs within the EuroEXA project. ...
(GPUs and FPGAs) by focusing on data/stream locality, efficient algorithms and programming models, tuned libraries and innovative IPs; 3) seamless integration of reconfigurable accelerators by extending ...
doi:10.1109/dsd53832.2021.00051
fatcat:tvsivkak5vgphc35ie5ow7kip4
OpenDwarfs: Characterization of Dwarf-Based Benchmarks on Fixed and Reconfigurable Architectures
2015
Journal of Signal Processing Systems
Using OpenDwarfs, we characterize a diverse set of modern fixed and reconfigurable parallel platforms: multicore CPUs, discrete and integrated GPUs, Intel Xeon Phi co-processor, as well as an FPGA. ...
Furthermore, we desire a common programming model for the benchmarks that facilitates code portability across a wide variety of different processors (e.g., CPU, APU, GPU, FPGA, DSP) and computing environments ...
An FPGA implementation with a single pair of accelerators (one accelerator for each OpenCL kernel) offers performance worse even than that of the single-threaded Opteron 6272 execution (FPGA C1). ...
doi:10.1007/s11265-015-1051-z
fatcat:ifnbayv26zdttgeovidgjqtoue
K-loops: Loop skewing for Reconfigurable Architectures
2009
2009 International Conference on Field-Programmable Technology
In this paper, we propose new techniques for improving the performance of applications running on a reconfigurable platform supporting the Molen programming paradigm. ...
The first technique presented in this paper improves the application performance by running in parallel on the reconfigurable hardware multiple instances of the kernel. ...
The contributions of this paper are: a) a technique for parallelizing K-loops with wavefront-like dependencies, running all kernel instances on the reconfigurable hardware; b) a technique for parallelizing ...
doi:10.1109/fpt.2009.5377656
fatcat:jgo6gmucmveafasktpgjlnxbry
OpenRCL: Low-Power High-Performance Computing with Reconfigurable Devices
2010
2010 International Conference on Field Programmable Logic and Applications
The key idea is to expose the FPGA platform as a compiler target for applications expressed in the OpenCL paradigm. ...
For the well-known Parallel Prefix Sum (Scan) problem, comparing the runtime of the same problem on a GeForce 9400m using the OpenCL SDK from Apple Inc., the OpenRCL machine demonstrates comparable performance ...
The key objective is to make a large body of existing and new parallel applications available to FPGA acceleration without significant recoding. ...
doi:10.1109/fpl.2010.93
dblp:conf/fpl/LinLW10
fatcat:2gqc62hvpbe5jczac44zrtgwum
Data-aware process networks
2021
Proceedings of the 30th ACM SIGPLAN International Conference on Compiler Construction
With the emergence of reconfigurable FPGA circuits as a credible alternative to GPUs for HPC acceleration, new compilation paradigms are required to map high-level algorithmic descriptions to a circuit ...
DPN combines the benefits of a low-level dataflow representation (close to the final circuit) and affine iteration space tiling to explore the parallelization trade-offs (local memory size, communication ...
Recently, reconfigurable FPGA circuits [8] have appeared to be a competitive alternative to GPU [46] in the race for energy efficiency. ...
doi:10.1145/3446804.3446847
fatcat:pyhil53nuzg2hk2dc7pbj7zh6q
Hardware Compilation of Deep Neural Networks: An Overview
2018
2018 IEEE 29th International Conference on Application-specific Systems, Architectures and Processors (ASAP)
Deploying a deep neural network model on a reconfigurable platform, such as an FPGA, is challenging due to the enormous design spaces of both network models and hardware design. ...
Design templates for neural network accelerators are studied with a specific focus on their derivation methodologies. ...
s design has a tunable folding parameter K for each two-dimensional FFT kernel, while parameters T_i and T_k enabled configurable parallelism. ...
doi:10.1109/asap.2018.8445088
dblp:conf/asap/ZhaoLNWDNWSCCL18
fatcat:v5txrrsfifa6bah2oksjdlrsgi
Automatic compilation to a coarse-grained reconfigurable system-on-chip
2003
ACM Transactions on Embedded Computing Systems
The Morphosys project proposes an SoC architecture consisting of reconfigurable hardware that supports a data-parallel, SIMD computational model. ...
The rapid growth of device densities on silicon has made it feasible to deploy reconfigurable hardware as a highly parallel computing platform. ...
FPGAs have the potential for a very large degree of parallelism as compared to traditional processors. ...
doi:10.1145/950162.950167
fatcat:atgwub4vmnfmtekpsaxiot77ju
Mapping a data-flow programming model onto heterogeneous platforms
2012
Proceedings of the 13th ACM SIGPLAN/SIGBED International Conference on Languages, Compilers, Tools and Theory for Embedded Systems - LCTES '12
usage of 0.52× of the power used by CPUs alone, when using accelerators (GPUs and FPGAs) and CPUs. ...
We demonstrate a working example that maps a pipeline of medical image-processing algorithms onto a prototype heterogeneous platform that includes CPUs, GPUs and FPGAs. ...
Additional thanks to the Habanero team for their comments and feedback on this work. ...
doi:10.1145/2248418.2248428
dblp:conf/lctrts/SbirleaZBCS12
fatcat:pt3s2jlcibehho65hstsw65ahm
ELASTIC CLOUD COMPUTING ARCHITECTURE AND SYSTEM FOR HETEROGENEOUS SPATIOTEMPORAL COMPUTING
2017
ISPRS Annals of the Photogrammetry, Remote Sensing and Spatial Information Sciences
Furthermore, considering the energy efficiency requirement in general computation, Field Programmable Gate Array (FPGA) may be a better solution for better energy efficiency when the performance of computation ...
Now that a variety of hardware accelerators and computing platforms are available to improve the performance of geocomputation, different algorithms may have different behavior on different computing infrastructure ...
ACKNOWLEDGEMENT This research was partially supported by the National Science Foundation (NSF) through NSF SMA-1416509. ...
doi:10.5194/isprs-annals-iv-4-w2-115-2017
fatcat:3iij6pxybjbxno2524jbhbbea4
Performance and energy footprint assessment of FPGAs and GPUs on HPC systems using Astrophysics application
[article]
2020
arXiv
pre-print
New challenges in Astronomy and Astrophysics (AA) are urging the need for a large number of exceptionally computationally intensive simulations. ...
Our experience reveals that considering FPGAs for computationally intensive applications seems very promising, as their performance is improving to meet the requirements of scientific applications. ...
We thank Piero Vicini and the INFN APE Roma Group for the support and for the use of INFN computational infrastructure. ...
arXiv:2003.03283v2
fatcat:cgsagyvimbhd3pu3cv37q2hfyu
A fully pipelined kernel normalised least mean squares processor for accelerated parameter optimisation
2015
2015 25th International Conference on Field Programmable Logic and Applications (FPL)
In this paper, we propose the first fully pipelined floating point implementation of the kernel normalised least mean squares algorithm for regression. ...
KAFs are members of a family of kernel methods which apply an implicit nonlinear mapping of input data to a high dimensional feature space, permitting learning algorithms to be expressed entirely as inner ...
ACKNOWLEDGMENT This research was supported under the Australian Research Council's Linkage Projects funding scheme (project number LP130101034). ...
doi:10.1109/fpl.2015.7293952
dblp:conf/fpl/FraserMLTJL15
fatcat:tr4g4mfgwzhydckeupksctxcae
Programming and Runtime Support to Blaze FPGA Accelerator Deployment at Datacenter Scale
2016
Proceedings of the Seventh ACM Symposium on Cloud Computing - SoCC '16
In particular, Blaze abstracts FPGA accelerators as a service (FaaS) and provides a set of clean programming APIs for big data processing applications to easily utilize those accelerators. ...
A straightforward JNI (Java Native Interface) integration of FPGA accelerators can diminish or even degrade the overall performance (up to 1000X slowdown) due to the overwhelming JVM-to-native-to-FPGA ...
Acknowledgments This work is partially supported by the Center for Domain-Specific Computing under the NSF InTrans Award CCF-1436827, funding from CDSC industrial partners including Baidu, Fujitsu Labs ...
doi:10.1145/2987550.2987569
pmid:28317049
pmcid:PMC5351886
dblp:conf/cloud/HuangWYFICC16
fatcat:5f6bnm6xxbfk3k5fv3sgqarftu