Python Non-Uniform Fast Fourier Transform (PyNUFFT): multi-dimensional non-Cartesian image reconstruction package for heterogeneous platforms and applications to MRI
[article]
2017
arXiv
pre-print
This paper reports the development of a Python Non-Uniform Fast Fourier Transform (PyNUFFT) package, which accelerates non-Cartesian image reconstruction on heterogeneous platforms. ...
The PyNUFFT package has been tested on multi-core CPU and GPU, with acceleration factors of 6.3 - 9.5× on a 32 thread CPU platform and 5.4 - 13× on the GPU. ...
The benchmarks were carried out on Amazon Web Services provided by AWS Educate credit. J.-M. Lin declares no conflict of interest.
References ...
arXiv:1710.03197v1
fatcat:6z4eizuiujbznnh6nh3g6z6xzm
Python Non-Uniform Fast Fourier Transform (PyNUFFT): An Accelerated Non-Cartesian MRI Package on a Heterogeneous Platform (CPU/GPU)
2018
Journal of Imaging
A Python non-uniform fast Fourier transform (PyNUFFT) package has been developed to accelerate multidimensional non-Cartesian image reconstruction on heterogeneous platforms. ...
The PyNUFFT package has been tested on multi-core central processing units (CPUs) and graphic processing units (GPUs), with acceleration factors of 6.3-9.5× on a 32-thread CPU platform and 5.4-13× on a ...
Introduction Fast Fourier transform (FFT) is an exact fast algorithm to compute the discrete Fourier transform (DFT) when data are acquired on an equispaced grid. ...
doi:10.3390/jimaging4030051
fatcat:a63kgcae7vgr7d5wkgecuykdwa
Energy Efficient Computation Method for CPU-GPU System on Chip
2018
International Journal for Research in Applied Science and Engineering Technology
The improvement in performance gained by the use of a multi-core processor depends very much on the software algorithms used and their implementation. ...
The increasing trend toward multi-core chips allows higher performance at lower energy; communication between the cores is a limiting factor, which can be improved by parallel computation such as ...
Since each core in a multi-core CPU is generally more energy-efficient, the chip becomes more efficient than having a single large monolithic core. ...
doi:10.22214/ijraset.2018.5199
fatcat:g3wqyatcdffa3n743oazsds7wi
Large Scale Parallelized 3D Mesoscopic Simulations Of The Mechanical Response To Shear In Disordered Media
2015
Zenodo
In this paper we describe the development of a code that implements a coarse-grained dynamics for the large-scale modeling of three-dimensional athermal yielding and flow of disordered systems under externally ...
The stochastic lattice model for the heterogeneous flow response involves long range elastic interactions, that are resolved using fast Fourier techniques, implemented in parallel in an efficient and well ...
The choice to perform the convolution in Fourier space allows us to profit from fast Fourier transform methods and to convert the long-range interactions, which would imply a large number of CPU communications ...
doi:10.5281/zenodo.825609
fatcat:mkzobksr35caxk7mh3aislglse
Energy-Saving Task Scheduling Based on Hard Reliability Requirements: A Novel Approach with Low Energy Consumption and High Reliability
2022
Sustainability
problem of DAG applications concerning energy-saving and hard reliability requirements in heterogeneous multi-core processor systems. ...
With the increasing complexity of application situations in multi-core processing systems, how to assure task execution reliability has become a focus of scheduling algorithm research in recent years. ...
The Fast Fourier Transform is an efficient algorithm for computing the discrete Fourier transform in a computer. ...
doi:10.3390/su14116591
fatcat:jsgojo2r6vgtfbnbot22m2skvu
A simple spectral algorithm for solving large-scale Poisson equation in 2D
2003
Computer Physics Communications
We have used a spectral Fourier technique and parallelized FFTs with OPEN_MP on SGI machines. This method can be easily extended to 3D. ...
We show that it is possible with easy-to-program algorithms to reach spatial resolutions of the order of 10^8 grid points for computing the electric potential on 2D periodic lattices, such as the Si(111 ...
Xavier Thibert-Plante is a fellow of CRSNG Canada. ...
doi:10.1016/s0010-4655(03)00283-2
fatcat:5vcw5rpycvh7pmrqct6gb2uyqi
D12.1: Heterogeneous and Auto-tuned Runtime System
2013
Zenodo
Task 12.1 contributes to improve the support of auto-tuning methods to face the complexity of existing and future large scale systems. ...
It impacts parallel languages, runtime, generic and kernel-specific auto-tuning algorithms, multi-core, many-core and multi-node systems, as well as batch systems and energy consumption measurement methods ...
This complexity is particularly high within a node, with the appearance of large multi-core and/or many-core systems, leading to deep memory hierarchies and heterogeneous nodes. ...
doi:10.5281/zenodo.6572371
fatcat:uttgomgovjeb5iopc2ccgyar7y
Improving HPC Application Performance in Cloud through Dynamic Load Balancing
2013
2013 13th IEEE/ACM International Symposium on Cluster, Cloud, and Grid Computing
It infers the static hardware heterogeneity in virtualized environments, and also adapts to the dynamic heterogeneity caused by the interference arising due to multi-tenancy. ...
Through experimental evaluation on a private cloud with 64 VMs using benchmarks and a real science application, we demonstrate performance benefits up to 45%. ...
VM (on a Fast core) which starts at iteration 50. ...
doi:10.1109/ccgrid.2013.65
dblp:conf/ccgrid/GuptaSKM13
fatcat:poatj7saofdvdpijockg4ehmky
High Performance Graph Data Imputation on Multiple GPUs
2021
Future Internet
Furthermore, we design a scheme to extend the GPU-optimized implementation to multiple GPUs for large-scale computing. Experimental results show that the GPU implementation is both fast and accurate. ...
In this paper, we propose a scheme to perform the convolutional imputation algorithm with higher time performance on GPUs (Graphics Processing Units) by exploiting multi-core GPUs of CUDA architecture. ...
In the field of data mining, a variety of graph processing systems have been developed, from GraphChi [6], which is a CPU-based system for computing large-scale graphs on a single machine, to the multi-GPU ...
doi:10.3390/fi13020036
fatcat:qv5vzegj3jcm3elsk5ssjsxedu
Impact of Kernel-Assisted MPI Communication over Scientific Applications: CPMD and FFTW
[chapter]
2011
Lecture Notes in Computer Science
FFTW, a Discrete Fourier Transform (DFT). ...
Our experiments indicate that the quality of the collective communication implementation on a specific machine plays a critical role on the overall application performance. ...
FFTW, "Fastest Fourier Transform in the West", is one of the most popular libraries to compute discrete Fourier transforms (DFTs). ...
doi:10.1007/978-3-642-24449-0_28
fatcat:bnkvbrne4jebram52epyspqhre
heFFTe: Highly Efficient FFT for Exascale
[chapter]
2020
Lecture Notes in Computer Science
Currently, several and diverse applications, such as those part of the Exascale Computing Project (ECP) in the United States, rely on efficient computation of the Fast Fourier Transform (FFT). ...
A communication model for parallel FFTs is also provided to analyze the bottleneck for large-scale problems. ...
Introduction Considered one of the top 10 algorithms of the 20th century, the Fast Fourier transform (FFT) is widely used by applications in science and engineering. ...
doi:10.1007/978-3-030-50371-0_19
fatcat:hblqdlwkvjchpckx7npiqgtlyi
Accelerating Fast Fourier Transforms Using Hadoop and CUDA
[article]
2014
arXiv
pre-print
There has been considerable research into improving Fast Fourier Transform (FFT) performance through parallelization and optimization for specialized hardware. ...
In this paper we present a unique approach that not only parallelizes the workload over multi-cores, but distributes the problem over a cluster of graphics processing unit (GPU)-equipped servers. ...
We plan on expanding our work to allow overlapping FFTs operation to be performed in our distributed environment. VII. ...
arXiv:1407.6915v1
fatcat:a4xzuas3s5dp7h2bvtlv7zk53y
Introducing Scalable Quantum Approaches in Language Representation
[chapter]
2011
Lecture Notes in Computer Science
The novel paradigm of general-purpose computing on graphics processors (GPGPU) offers a feasible and economical alternative: it has already become a common phenomenon in scientific computation, with many ...
High-performance computational resources and distributed systems are crucial for the success of real-world language technology applications. ...
Fast Fourier transformation on GPUs is a classical area for acceleration [57]. ...
doi:10.1007/978-3-642-24971-6_2
fatcat:emliiuolnzdtpnhfflc7wsmkde
2018 Index IEEE Transactions on Computers Vol. 67
2019
IEEE transactions on computers
., TC June 2018 771-783 Faz-Hernandez, A., Lopez, J., Ochoa-Jimenez, E., and Rodriguez-Henriquez, F., A Faster Software Implementation of the Supersingular Isogeny Diffie-Hellman Key Exchange Protocol ...
Feng, H., +, TC Feb. 2018 252-267
F
Fast Fourier transforms
A Scheme to Design Concurrent Error Detection Techniques for the Fast Fourier Transform Implemented in SRAM-Based FPGAs. ...
Choi, I., +, TC Dec. 2018 1835-1839
Decision diagrams
Performability Analysis of Large-Scale Multi-State Computing Systems. ...
doi:10.1109/tc.2018.2882120
fatcat:j2j7yw42hnghjoik2ghvqab6ti
Accurate, scalable and informative design space exploration for large and sophisticated multi-core oriented architectures
2009
2009 IEEE International Symposium on Modeling, Analysis & Simulation of Computer and Telecommunication Systems
We extensively evaluate the efficiency of our predictive models in forecasting the complex and heterogeneous characteristics of large and distributed shared cache interconnected by a network on chip in ...
In this paper, we propose novel, multi-scale 2D predictive models which can efficiently reason the characteristics of large and sophisticated multi-core oriented architectures during the design space exploration ...
ACKNOWLEDGMENT This work is supported in part by NSF CAREER Award CCF-0845721, and by Microsoft Research Safe and Scalable Multi-core Computing Award. ...
doi:10.1109/mascot.2009.5366283
dblp:conf/mascots/ChoPLY09
fatcat:npsknlayezc65h6w4plxnjtcqu