
Mixed-Precision Tomographic Reconstructor Computations on Hardware Accelerators

Nicolas Doucet, Hatem Ltaief, Damien Gratadour, David Keyes
2019 2019 IEEE/ACM 9th Workshop on Irregular Applications: Architectures and Algorithms (IA3)  
To mitigate this increasing dimensionality overhead, this paper presents the implementation of a novel mixed-precision Cholesky-based dense matrix solver on hardware accelerators.  ...  To our knowledge, this is the first computational astronomy application that exploits the V100's tensor cores outside of the traditional arena of artificial intelligence.  ...  LEVERAGING MIXED-PRECISION TECHNIQUES FOR THE TOR COMPUTATIONS: With the advent of hardware support for low-precision floating-point arithmetic (e.g., the Google TPU chip and NVIDIA GPUs with tensor cores),  ...
doi:10.1109/ia349570.2019.00011 dblp:conf/sc/DoucetLGK19 fatcat:mh3kpd4divh3tpzzhh5hhphrgm
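The pattern described in this abstract — factor in low precision, then recover accuracy by iterative refinement in the working precision — can be sketched in NumPy. This is an illustrative sketch, not the paper's code: float32 stands in for the FP16 tensor-core factorization, and all names are my own.

```python
import numpy as np

def mixed_precision_cholesky_solve(A, b, iters=3):
    """Solve A x = b for SPD A: factor in float32, refine in float64.

    float32 stands in here for the low-precision factorization
    (FP16 on tensor cores in the paper); the refinement loop in
    float64 recovers working-precision accuracy.
    """
    L = np.linalg.cholesky(A.astype(np.float32))     # low-precision factor
    def lp_solve(r):
        # Forward/backward substitution using the float32 factor.
        y = np.linalg.solve(L, r.astype(np.float32))
        return np.linalg.solve(L.T, y).astype(np.float64)
    x = lp_solve(b)
    for _ in range(iters):
        r = b - A @ x                                # residual in float64
        x = x + lp_solve(r)                          # low-precision correction
    return x

# Well-conditioned SPD test problem.
rng = np.random.default_rng(0)
M = rng.standard_normal((50, 50))
A = M @ M.T + 50 * np.eye(50)
x_true = rng.standard_normal(50)
b = A @ x_true
x = mixed_precision_cholesky_solve(A, b)
rel_err = np.linalg.norm(x - x_true) / np.linalg.norm(x_true)
```

A plain float32 solve would leave a relative error near 1e-6; the refinement loop drives it down toward float64 levels while the expensive factorization stays in low precision.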

White Paper from Workshop on Large-scale Parallel Numerical Computing Technology (LSPANC 2020): HPC and Computer Arithmetic toward Minimal-Precision Computing [article]

Roman Iakymchuk, Daichi Mukunoki, Artur Podobas, Fabienne Jézéquel, Toshiyuki Imamura, Norihisa Fujita, Jens Huthmann, Shuhei Kudo, Yiyu Tan, Jens Domke, Kai Torben Ohlhus, Takeshi Fukaya (+6 others)
2020 arXiv   pre-print
In numerical computations, the precision of floating-point arithmetic is a key factor determining performance (speed and energy efficiency) as well as reliability (accuracy and reproducibility).  ...  In 2019, we started the Minimal-Precision Computing project to propose a broader concept of a minimal-precision computing system with precision tuning, involving both the hardware and software stack  ...  Hence, we show examples of failures in the HPL-AI implementation to discuss the problems of using lower- and mixed-precision computation in scientific computing.  ...
arXiv:2004.04628v2 fatcat:7fo3kfaa7zfnhg4mlz62ljnvee

SPFP: Speed without compromise—A mixed precision model for GPU accelerated molecular dynamics simulations

Scott Le Grand, Andreas W. Götz, Ross C. Walker
2013 Computer Physics Communications  
This precision model replaces double-precision arithmetic with fixed-point integer arithmetic for the accumulation of force components, as compared to a previously introduced model that uses mixed single  ...  and double and mixed single/double precision GPU implementations.  ...  Acknowledgments This work was funded in part by the National Science Foundation through the Scientific Software Innovations Institutes program-NSF SI2-SSE (NSF1047875 and NSF1148276) grants to R.C.W and  ...
doi:10.1016/j.cpc.2012.09.022 fatcat:azokxlbkyzhgngaxnsb7d7fp4e
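The core trick this abstract describes — accumulating single-precision force contributions as scaled 64-bit integers — makes the sum exact and independent of summation order, which matters for bitwise reproducibility on GPUs. A minimal sketch (the scale factor 2**40 is illustrative, not the paper's value):

```python
# SPFP-style accumulation sketch: each single-precision contribution is
# scaled and rounded to a 64-bit integer.  Integer addition is exact and
# associative, so the accumulated total is deterministic regardless of
# the order in which threads add their contributions.
SCALE = 2 ** 40

def to_fixed(x):
    return int(round(x * SCALE))

def from_fixed(i):
    return i / SCALE

contributions = [1e-3, -2.5e-4, 3.75e-5, -1e-3, 2.5e-4]
acc = 0                          # integer accumulator
for c in contributions:
    acc += to_fixed(c)
total = from_fixed(acc)
```

Summing the same contributions in any other order yields a bit-identical accumulator, something floating-point accumulation cannot guarantee.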

Toward a modular precision ecosystem for high-performance computing

Hartwig Anzt, Goran Flegar, Thomas Grützmacher, Enrique S Quintana-Ortí
2019 The international journal of high performance computing applications  
With the memory bandwidth of current computer architectures being significantly slower than the (floating point) arithmetic performance, many scientific computations only leverage a fraction of the computational  ...  This paper tackles this mismatch between floating point arithmetic throughput and memory bandwidth by advocating a disruptive paradigm change with respect to how data is stored and processed in scientific  ...  Introduction The digital revolution is shaping the future through breathtaking advances in all scientific fields, fueled by new scientific computing algorithms and the increasing computing power available  ... 
doi:10.1177/1094342019846547 fatcat:63sikoiq5nfzzo2rb5wkidzdwe
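The paradigm this abstract advocates decouples the memory format from the arithmetic format: store data compressed in low precision to save bandwidth, but promote to high precision for every operation. A toy sketch of that split (class and method names are my own, not the paper's or any library's API):

```python
import numpy as np

class CompressedVector:
    """Vector stored in float32 (halving memory traffic) while all
    arithmetic is carried out in float64 - a miniature version of the
    modular-precision storage idea."""

    def __init__(self, values):
        self._data = np.asarray(values, dtype=np.float32)   # low-precision storage

    def axpy(self, alpha, other):
        """self <- alpha * other + self, computed in float64."""
        result = alpha * other._data.astype(np.float64) + self._data.astype(np.float64)
        self._data = result.astype(np.float32)              # compress on write-back

    def to_array(self):
        return self._data.astype(np.float64)

x = CompressedVector([1.0, 2.0, 3.0])
y = CompressedVector([0.5, 0.5, 0.5])
x.axpy(2.0, y)          # x = 2*y + x
```

For memory-bound kernels, the float32 reads and writes are what bound the runtime, so the float64 arithmetic comes essentially for free.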

A scalable and compact systolic architecture for linear solvers

Kevin S. H. Ong, Suhaib A. Fahmy, Keck-Voon Ling
2014 2014 IEEE 25th International Conference on Application-Specific Systems, Architectures and Processors  
In comparison with similar work, our design offers up to a 12-fold improvement in speed whilst requiring up to 50% fewer hardware resources.  ...  A novel systolic array architecture that can be used as a building block in scientific applications is described and prototyped on a Xilinx Virtex 6 FPGA.  ...  The individual blocks are designed to be easily composable in SysGen. A mixed number representation is used, and we adopt the precision used in [5].  ...
doi:10.1109/asap.2014.6868658 dblp:conf/asap/OngFL14 fatcat:3j23w7boyffb7issoqhobvblei

Accelerating iterative CT reconstruction algorithms using Tensor Cores

Mohsen Nourazar, Bart Goossens
2021 Journal of Real-Time Image Processing  
The relative reconstruction error due to the mixed-precision computations was almost equal to the error of single-precision (32-bit) floating-point computations.  ...  In this paper, we demonstrate the feasibility of using NVIDIA Tensor Cores for the acceleration of a non-machine learning application: iterative Computed Tomography (CT) reconstruction.  ...  Because Tensor Cores operate in a mixed 16-bit/32-bit floating-point precision, we expect a loss in accuracy compared to a purely 32-bit floating-point implementation.  ... 
doi:10.1007/s11554-020-01069-5 fatcat:7o3nt5ojtfcpdpenlbilqboqpm
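The precision model the abstract refers to — FP16 inputs with FP32 multiply-accumulate — can be emulated in NumPy to see the accuracy loss the authors expected. This is an emulation sketch of that arithmetic model, not actual tensor-core code:

```python
import numpy as np

def matmul_fp16_fp32(A, B):
    """Tensor-core-style matmul emulation: round inputs to FP16,
    multiply and accumulate in FP32 (the cast back to float32 before
    the matmul makes the products and sums run in float32)."""
    A16 = A.astype(np.float16).astype(np.float32)
    B16 = B.astype(np.float16).astype(np.float32)
    return A16 @ B16

rng = np.random.default_rng(1)
A = rng.random((64, 64))
B = rng.random((64, 64))
C_ref = A @ B                        # float64 reference
C_mp = matmul_fp16_fp32(A, B)
rel_err = np.abs(C_mp - C_ref).max() / np.abs(C_ref).max()
```

The error is dominated by the FP16 rounding of the inputs (about 2^-11 relative), which is the "almost equal to single precision" behaviour reported for the reconstruction.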

An Error Correction Solver for Linear Systems: Evaluation of Mixed Precision Implementations [chapter]

Hartwig Anzt, Vincent Heuveline, Björn Rocker
2011 Lecture Notes in Computer Science  
This paper proposes an error correction method for solving linear systems of equations and the evaluation of an implementation using mixed precision techniques.  ...  While different technologies are available, graphic processing units (GPUs) have been established as particularly powerful coprocessors in recent years.  ...  In many cases single precision floating point operations are not suitable for scientific computation.  ... 
doi:10.1007/978-3-642-19328-6_8 fatcat:pxrxxwd6yzhs5e45gfjegwiq64

Recycled Error Bits: Energy-Efficient Architectural Support for Floating Point Accuracy

Ralph Nathan, Bryan Anthonio, Shih-Lien Lu, Helia Naeimi, Daniel J. Sorin, Xiaobai Sun
2014 SC14: International Conference for High Performance Computing, Networking, Storage and Analysis  
In this work, we provide energy-efficient architectural support for floating point accuracy. For each floating point addition performed, we "recycle" that operation's rounding error.  ...  Experimental results on physical hardware show that software that exploits architecturally recycled error bits can (a) achieve accuracy comparable to a 64-bit FPU with performance and energy that are comparable  ...  We thank our shepherd, Mike O'Connor, for his advice in improving this work.  ... 
doi:10.1109/sc.2014.15 dblp:conf/sc/NathanALNSS14 fatcat:axrthxpeefg5jehl7ytaa3m66i
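The "recycled" rounding error this abstract describes is, in software terms, the error term of an error-free transformation: for every floating-point addition, the exact rounding error can be recovered and accumulated. A classic software analogue (Knuth's two-sum; the hardware in the paper would produce the error bits at no extra arithmetic cost):

```python
def two_sum(a, b):
    """Error-free transformation: returns (s, e) with s = fl(a + b)
    and s + e == a + b exactly in real arithmetic.  e is the rounding
    error that the paper's architectural support would 'recycle'."""
    s = a + b
    bv = s - a
    av = s - bv
    e = (a - av) + (b - bv)
    return s, e

def compensated_sum(xs):
    """Sum xs while accumulating the recycled rounding errors."""
    s = 0.0
    comp = 0.0
    for x in xs:
        s, e = two_sum(s, x)
        comp += e                 # accumulate the recovered error bits
    return s + comp

data = [1e16, 1.0, -1e16]
naive = sum(data)                 # the 1.0 is lost to rounding
accurate = compensated_sum(data)  # the recycled error restores it
```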

Accelerating Geometric Multigrid Preconditioning with Half-Precision Arithmetic on GPUs [article]

Kyaw L. Oo, Andreas Vogel
2020 arXiv   pre-print
With the hardware support for half-precision arithmetic on NVIDIA V100 GPUs, high-performance computing applications can benefit from lower precision at appropriate spots to speed up the overall execution  ...  In this paper, we investigate a mixed-precision geometric multigrid method to solve large sparse systems of equations stemming from discretization of elliptic PDEs.  ...  ... for funding this project by providing computing time on the GCS Supercomputer JUWELS at Jülich Supercomputing Centre (JSC).  ...
arXiv:2007.07539v1 fatcat:uspsoue4iresredpy3djlm66nu
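The precision split behind such mixed-precision multigrid methods — cheap smoothing in half precision, residual evaluation in full precision — can be shown in miniature on a 1D Poisson system. This sketch shows only the precision split with a plain Jacobi smoother; a real multigrid cycle adds coarse-grid correction, and the matrix and iteration counts here are illustrative:

```python
import numpy as np

# 1D Poisson matrix: the standard tridiagonal [-1, 2, -1] stencil.
n = 32
A = 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
b = np.ones(n)
x = np.zeros(n)
D_inv = np.float16(0.5)          # inverse of the diagonal, kept in fp16

for _ in range(300):
    r = b - A @ x                                    # residual in float64
    step = (D_inv * r.astype(np.float16)).astype(np.float64)
    x = x + step                                     # fp16 Jacobi smoothing step

res_norm = np.linalg.norm(b - A @ x)
init_norm = np.linalg.norm(b)
```

The float64 residual keeps the iteration anchored to the true problem, so the fp16 smoothing only limits how far the residual can ultimately be driven down, not the correctness of the converged part.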

European Exascale Software Initiative: Numerical Libraries, Solvers and Algorithms [chapter]

Iain S. Duff
2012 Lecture Notes in Computer Science  
Computers with sustained Petascale performance are now available and it is expected that hardware will be developed with a peak capability in the Exascale range by around 2018.  ...  The main goals of EESI are to build a European vision and roadmap to address the international outstanding challenge of performing scientific computing on the new generation of computers.  ...  Thus the involvement of all the people listed in Table 2 is gratefully acknowledged.  ... 
doi:10.1007/978-3-642-29737-3_34 fatcat:7nqpdtlvlbddtmf5epwq23mera


Cheng Tan, Thierry Tambe, Jeff (Jun) Zhang, Bo Fang, Tong Geng, Gu-Yeon Wei, David Brooks, Antonino Tumeo, Ganesh Gopalakrishnan, Ang Li
2022 Proceedings of the 36th ACM International Conference on Supercomputing  
Our evaluation shows that ASAP generates specialized designs 3.2×, 4.21×, and 5.8× more efficient (in terms of performance per unit of energy or area) than non-specialized homogeneous CGRAs, for the scientific  ...  To address this gap, we propose ASAP, a hardware/software co-design framework that automatically identifies and synthesizes optimal precision-aware CGRA for a set of applications of interest.  ...  Machine learning inference can typically afford further reduced precision (paying some trivial penalty in accuracy), exploiting half-precision floating-point formats (FP16) or specialized solutions like  ...
doi:10.1145/3524059.3532359 fatcat:re267n7aw5cxlirdteca7tmzey

Pipelined Mixed Precision Algorithms on FPGAs for Fast and Accurate PDE Solvers from Low Precision Components

Robert Strzodka, Dominik Goddeke
2006 2006 14th Annual IEEE Symposium on Field-Programmable Custom Computing Machines  
FPGAs are becoming more and more attractive for high precision scientific computations.  ...  in double precision to obtain the same accuracy as a full double precision solver.  ...  Acknowledgments We thank Pavle Belanovic and Miriam Leeser for the availability of their generic floating point library, Industrial Light & Magic for the half class and Xilinx for their ISE.  ... 
doi:10.1109/fccm.2006.57 dblp:conf/fccm/StrzodkaG06 fatcat:yokqwxy46bcvno6cd7dcuefgdi

Evaluating Mixed-Precision Arithmetic for 3D Generative Adversarial Networks to Simulate High Energy Physics Detectors

John Osorio Rios, Adria Armejach, Gulrukh Khattak, Eric Petit, Sofia Vallecorsa, Marc Casas
2020 2020 19th IEEE International Conference on Machine Learning and Applications (ICMLA)  
The usage of Mixed Precision (MP) arithmetic with floating-point 32-bit (FP32) and 16-bit half-precision aims at improving memory and floating-point operation throughput, allowing faster training of bigger  ...  This paper proposes a binary analysis tool enabling the emulation of lower-precision numerical formats in Neural Network implementations without the need for hardware support.  ...  In that case, computation proceeds in FP32. The tool intercepts all floating-point instructions of the workload, including FMAs.  ...
doi:10.1109/icmla51294.2020.00017 fatcat:wz7z4efnffchrnhqhtzyi7tcqy
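Software emulation of reduced-precision formats, as this abstract describes, typically works by rewriting each floating-point result to have fewer significand bits. A simple stdlib sketch of one such rewrite (truncating low mantissa bits of the binary32 encoding; real FP16 also narrows the exponent range, which this sketch ignores, and the emulated tool's actual mechanism may differ):

```python
import struct

def round_to_precision(x, mantissa_bits):
    """Emulate a reduced-precision float by zeroing the low mantissa
    bits of the IEEE 754 binary32 encoding of x (truncation toward
    zero for positive values)."""
    (bits,) = struct.unpack('<I', struct.pack('<f', x))
    drop = 23 - mantissa_bits            # binary32 has 23 stored mantissa bits
    bits &= ~((1 << drop) - 1)           # clear the low `drop` bits
    (y,) = struct.unpack('<f', struct.pack('<I', bits))
    return y

approx = round_to_precision(1.2345678, 10)   # ~FP16-sized significand
```

Applied after every intercepted instruction, such a rewrite lets the same binary run under any candidate significand width without special hardware.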

Make it real: Effective floating-point reasoning via exact arithmetic

Miriam Leeser, Saoni Mukherjee, Jaideep Ramachandran, Thomas Wahl
2014 Design, Automation & Test in Europe Conference & Exhibition (DATE), 2014  
Floating-point arithmetic is widely used in scientific computing.  ...  To address these problems, we present a decision procedure for floating-point arithmetic (FPA) that exploits the proximity to real arithmetic (RA), via a lossless reduction from FPA to RA.  ...  In this paper, we propose an approach that exploits the proximity of floating-point to exact real arithmetic.  ...
doi:10.7873/date.2014.130 dblp:conf/date/LeeserMRW14 fatcat:m35vz6i3krgvrlcw4f5wirnn5i
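The gap between floating-point and real arithmetic that this reduction exploits is easy to exhibit with exact rationals. In the same spirit (though far simpler than the paper's decision procedure), Python's `fractions.Fraction` separates the exact real value from the rounded binary64 value:

```python
from fractions import Fraction

# In binary64, 0.1 + 0.2 != 0.3 because none of the decimals is
# exactly representable; over exact rationals the identity holds.
fp_mismatch = (0.1 + 0.2 != 0.3)
ra_equal = (Fraction(1, 10) + Fraction(2, 10) == Fraction(3, 10))

# Fraction(float) recovers the exact rational the hardware actually
# stores for the literal 0.1: 3602879701896397 / 2**55.
exact_tenth = Fraction(0.1)
```

Reasoning over the exact values (RA) while tracking which rational each float denotes is the proximity the decision procedure leverages.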

Towards a fixed point QP solver for predictive control

Juan L. Jerez, George A. Constantinides, Eric C. Kerrigan
2012 2012 IEEE 51st IEEE Conference on Decision and Control (CDC)  
solver to allow for fast and efficient computation in parallel hardware.  ...  The proposed approach is evaluated through the implementation of a mixed-precision interior-point controller for a Boeing 747 aircraft.  ...  solver (MINRES in this case) is computed in fixed-point, whereas the rest of the algorithm is computed in double-precision floating-point.  ...
doi:10.1109/cdc.2012.6427015 dblp:conf/cdc/JerezCK12 fatcat:6jtalwqjjvgl7c3rpevl5i2rea
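The fixed-point datapath such an FPGA inner solver would use can be modelled with Q-format integers: values carry an implicit scale of 2**frac_bits, and every multiply is rescaled by a shift. A sketch of that arithmetic (16 fractional bits is illustrative, not the paper's chosen word length):

```python
# Q-format fixed-point sketch: an integer i represents the real value
# i / 2**FRAC_BITS.  Addition is plain integer addition; multiplication
# needs one rescaling shift - exactly the cheap datapath an FPGA
# implementation of the inner solver exploits.
FRAC_BITS = 16
ONE = 1 << FRAC_BITS

def fx(x):
    """Quantize a real value to fixed-point."""
    return int(round(x * ONE))

def fx_mul(a, b):
    """Fixed-point multiply: integer product, then rescale."""
    return (a * b) >> FRAC_BITS

def fx_dot(xs, ys):
    """Dot product entirely in fixed-point, converted back at the end."""
    acc = 0
    for a, b in zip(xs, ys):
        acc += fx_mul(fx(a), fx(b))
    return acc / ONE

val = fx_dot([0.5, 0.25, -1.5], [2.0, 4.0, 1.0])
```

Because the inputs above are exact in Q16.16, the dot product is exact; for general inputs the quantization error is bounded by the chosen number of fractional bits, which is the trade-off the mixed fixed/floating split manages.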