
Mixed-Precision Tomographic Reconstructor Computations on Hardware Accelerators

Nicolas Doucet, Hatem Ltaief, Damien Gratadour, David Keyes
2019 2019 IEEE/ACM 9th Workshop on Irregular Applications: Architectures and Algorithms (IA3)  
To mitigate this increasing dimensionality overhead, this paper presents the implementation of a novel mixed-precision Cholesky-based dense matrix solver on hardware accelerators.  ...  To our knowledge, this is the first computational astronomy application that exploits the V100's tensor cores outside of the traditional arena of artificial intelligence.  ...  LEVERAGING MIXED-PRECISION TECHNIQUES FOR THE TOR COMPUTATIONS: With the advent of hardware support for low-precision floating-point arithmetic (e.g., the Google TPU chip and NVIDIA GPUs with tensor cores),  ...
doi:10.1109/ia349570.2019.00011 dblp:conf/sc/DoucetLGK19 fatcat:mh3kpd4divh3tpzzhh5hhphrgm
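The pattern described in this abstract — factor in low precision, then recover accuracy by iterative refinement in the working precision — can be sketched in NumPy. This is an illustrative sketch, not the paper's code: float32 stands in for the FP16 tensor-core factorization, and all names are my own.

```python
import numpy as np

def mixed_precision_cholesky_solve(A, b, iters=3):
    """Solve A x = b for SPD A: factor in float32, refine in float64.

    float32 stands in here for the low-precision factorization
    (FP16 on tensor cores in the paper); the refinement loop in
    float64 recovers working-precision accuracy.
    """
    L = np.linalg.cholesky(A.astype(np.float32))     # low-precision factor
    def lp_solve(r):
        # Forward/backward substitution using the float32 factor.
        y = np.linalg.solve(L, r.astype(np.float32))
        return np.linalg.solve(L.T, y).astype(np.float64)
    x = lp_solve(b)
    for _ in range(iters):
        r = b - A @ x                                # residual in float64
        x = x + lp_solve(r)                          # low-precision correction
    return x

# Well-conditioned SPD test problem.
rng = np.random.default_rng(0)
M = rng.standard_normal((50, 50))
A = M @ M.T + 50 * np.eye(50)
x_true = rng.standard_normal(50)
b = A @ x_true
x = mixed_precision_cholesky_solve(A, b)
rel_err = np.linalg.norm(x - x_true) / np.linalg.norm(x_true)
```

A plain float32 solve would leave a relative error near 1e-6; the refinement loop drives it down toward float64 levels while the expensive factorization stays in low precision.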

White Paper from Workshop on Large-scale Parallel Numerical Computing Technology (LSPANC 2020): HPC and Computer Arithmetic toward Minimal-Precision Computing [article]

Roman Iakymchuk, Daichi Mukunoki, Artur Podobas, Fabienne Jézéquel, Toshiyuki Imamura, Norihisa Fujita, Jens Huthmann, Shuhei Kudo, Yiyu Tan, Jens Domke, Kai Torben Ohlhus, Takeshi Fukaya (+6 others)
2020 arXiv   pre-print
In numerical computations, the precision of floating-point arithmetic is a key factor determining performance (speed and energy efficiency) as well as reliability (accuracy and reproducibility).  ...  In 2019, we started the Minimal-Precision Computing project to propose a broader concept of a minimal-precision computing system with precision tuning, involving both the hardware and software stack  ...  Hence, we show examples of failures in the HPL-AI implementation to discuss the problems of using lower- and mixed-precision computation in scientific computing.  ...
arXiv:2004.04628v2 fatcat:7fo3kfaa7zfnhg4mlz62ljnvee

SPFP: Speed without compromise—A mixed precision model for GPU accelerated molecular dynamics simulations

Scott Le Grand, Andreas W. Götz, Ross C. Walker
2013 Computer Physics Communications  
This precision model replaces double-precision arithmetic with fixed-point integer arithmetic for the accumulation of force components, as compared to a previously introduced model that uses mixed single  ...  and double and mixed single/double precision GPU implementations.  ...  Acknowledgments This work was funded in part by the National Science Foundation through the Scientific Software Innovations Institutes program-NSF SI2-SSE (NSF1047875 and NSF1148276) grants to R.C.W and  ...
doi:10.1016/j.cpc.2012.09.022 fatcat:azokxlbkyzhgngaxnsb7d7fp4e
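The core trick this abstract describes — accumulating single-precision force contributions as scaled 64-bit integers — makes the sum exact and independent of summation order, which matters for bitwise reproducibility on GPUs. A minimal sketch (the scale factor 2**40 is illustrative, not the paper's value):

```python
# SPFP-style accumulation sketch: each single-precision contribution is
# scaled and rounded to a 64-bit integer.  Integer addition is exact and
# associative, so the accumulated total is deterministic regardless of
# the order in which threads add their contributions.
SCALE = 2 ** 40

def to_fixed(x):
    return int(round(x * SCALE))

def from_fixed(i):
    return i / SCALE

contributions = [1e-3, -2.5e-4, 3.75e-5, -1e-3, 2.5e-4]
acc = 0                          # integer accumulator
for c in contributions:
    acc += to_fixed(c)
total = from_fixed(acc)
```

Summing the same contributions in any other order yields a bit-identical accumulator, something floating-point accumulation cannot guarantee.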

Toward a modular precision ecosystem for high-performance computing

Hartwig Anzt, Goran Flegar, Thomas Grützmacher, Enrique S Quintana-Ortí
2019 The international journal of high performance computing applications  
With the memory bandwidth of current computer architectures being significantly slower than the (floating point) arithmetic performance, many scientific computations only leverage a fraction of the computational  ...  This paper tackles this mismatch between floating point arithmetic throughput and memory bandwidth by advocating a disruptive paradigm change with respect to how data is stored and processed in scientific  ...  Introduction The digital revolution is shaping the future through breathtaking advances in all scientific fields, fueled by new scientific computing algorithms and the increasing computing power available  ... 
doi:10.1177/1094342019846547 fatcat:63sikoiq5nfzzo2rb5wkidzdwe
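The paradigm this abstract advocates decouples the memory format from the arithmetic format: store data compressed in low precision to save bandwidth, but promote to high precision for every operation. A toy sketch of that split (class and method names are my own, not the paper's or any library's API):

```python
import numpy as np

class CompressedVector:
    """Vector stored in float32 (halving memory traffic) while all
    arithmetic is carried out in float64 - a miniature version of the
    modular-precision storage idea."""

    def __init__(self, values):
        self._data = np.asarray(values, dtype=np.float32)   # low-precision storage

    def axpy(self, alpha, other):
        """self <- alpha * other + self, computed in float64."""
        result = alpha * other._data.astype(np.float64) + self._data.astype(np.float64)
        self._data = result.astype(np.float32)              # compress on write-back

    def to_array(self):
        return self._data.astype(np.float64)

x = CompressedVector([1.0, 2.0, 3.0])
y = CompressedVector([0.5, 0.5, 0.5])
x.axpy(2.0, y)          # x = 2*y + x
```

For memory-bound kernels, the float32 reads and writes are what bound the runtime, so the float64 arithmetic comes essentially for free.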

A scalable and compact systolic architecture for linear solvers

Kevin S. H. Ong, Suhaib A. Fahmy, Keck-Voon Ling
2014 2014 IEEE 25th International Conference on Application-Specific Systems, Architectures and Processors  
In comparison with similar work, our design offers up to a 12-fold improvement in speed whilst requiring up to 50% fewer hardware resources.  ...  A novel systolic array architecture that can be used as a building block in scientific applications is described and prototyped on a Xilinx Virtex 6 FPGA.  ...  The individual blocks are designed to be easily composable in SysGen. A mixed number representation is used, and we adopt the precision used in [5].  ...
doi:10.1109/asap.2014.6868658 dblp:conf/asap/OngFL14 fatcat:3j23w7boyffb7issoqhobvblei

Accelerating iterative CT reconstruction algorithms using Tensor Cores

Mohsen Nourazar, Bart Goossens
2021 Journal of Real-Time Image Processing  
The relative reconstruction error due to the mixed-precision computations was almost equal to the error of single-precision (32-bit) floating-point computations.  ...  In this paper, we demonstrate the feasibility of using NVIDIA Tensor Cores for the acceleration of a non-machine learning application: iterative Computed Tomography (CT) reconstruction.  ...  Because Tensor Cores operate in a mixed 16-bit/32-bit floating-point precision, we expect a loss in accuracy compared to a purely 32-bit floating-point implementation.  ... 
doi:10.1007/s11554-020-01069-5 fatcat:7o3nt5ojtfcpdpenlbilqboqpm
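The precision model the abstract refers to — FP16 inputs with FP32 multiply-accumulate — can be emulated in NumPy to see the accuracy loss the authors expected. This is an emulation sketch of that arithmetic model, not actual tensor-core code:

```python
import numpy as np

def matmul_fp16_fp32(A, B):
    """Tensor-core-style matmul emulation: round inputs to FP16,
    multiply and accumulate in FP32 (the cast back to float32 before
    the matmul makes the products and sums run in float32)."""
    A16 = A.astype(np.float16).astype(np.float32)
    B16 = B.astype(np.float16).astype(np.float32)
    return A16 @ B16

rng = np.random.default_rng(1)
A = rng.random((64, 64))
B = rng.random((64, 64))
C_ref = A @ B                        # float64 reference
C_mp = matmul_fp16_fp32(A, B)
rel_err = np.abs(C_mp - C_ref).max() / np.abs(C_ref).max()
```

The error is dominated by the FP16 rounding of the inputs (about 2^-11 relative), which is the "almost equal to single precision" behaviour reported for the reconstruction.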

An Error Correction Solver for Linear Systems: Evaluation of Mixed Precision Implementations [chapter]

Hartwig Anzt, Vincent Heuveline, Björn Rocker
2011 Lecture Notes in Computer Science  
This paper proposes an error correction method for solving linear systems of equations and the evaluation of an implementation using mixed precision techniques.  ...  While different technologies are available, graphic processing units (GPUs) have been established as particularly powerful coprocessors in recent years.  ...  In many cases single precision floating point operations are not suitable for scientific computation.  ... 
doi:10.1007/978-3-642-19328-6_8 fatcat:pxrxxwd6yzhs5e45gfjegwiq64

Recycled Error Bits: Energy-Efficient Architectural Support for Floating Point Accuracy

Ralph Nathan, Bryan Anthonio, Shih-Lien Lu, Helia Naeimi, Daniel J. Sorin, Xiaobai Sun
2014 SC14: International Conference for High Performance Computing, Networking, Storage and Analysis  
In this work, we provide energy-efficient architectural support for floating point accuracy. For each floating point addition performed, we "recycle" that operation's rounding error.  ...  Experimental results on physical hardware show that software that exploits architecturally recycled error bits can (a) achieve accuracy comparable to a 64-bit FPU with performance and energy that are comparable  ...  We thank our shepherd, Mike O'Connor, for his advice in improving this work.  ... 
doi:10.1109/sc.2014.15 dblp:conf/sc/NathanALNSS14 fatcat:axrthxpeefg5jehl7ytaa3m66i
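The "recycled" rounding error this abstract describes is, in software terms, the error term of an error-free transformation: for every floating-point addition, the exact rounding error can be recovered and accumulated. A classic software analogue (Knuth's two-sum; the hardware in the paper would produce the error bits at no extra arithmetic cost):

```python
def two_sum(a, b):
    """Error-free transformation: returns (s, e) with s = fl(a + b)
    and s + e == a + b exactly in real arithmetic.  e is the rounding
    error that the paper's architectural support would 'recycle'."""
    s = a + b
    bv = s - a
    av = s - bv
    e = (a - av) + (b - bv)
    return s, e

def compensated_sum(xs):
    """Sum xs while accumulating the recycled rounding errors."""
    s = 0.0
    comp = 0.0
    for x in xs:
        s, e = two_sum(s, x)
        comp += e                 # accumulate the recovered error bits
    return s + comp

data = [1e16, 1.0, -1e16]
naive = sum(data)                 # the 1.0 is lost to rounding
accurate = compensated_sum(data)  # the recycled error restores it
```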

Accelerating Geometric Multigrid Preconditioning with Half-Precision Arithmetic on GPUs [article]

Kyaw L. Oo, Andreas Vogel
2020 arXiv   pre-print
With the hardware support for half-precision arithmetic on NVIDIA V100 GPUs, high-performance computing applications can benefit from lower precision at appropriate spots to speed up the overall execution  ...  In this paper, we investigate a mixed-precision geometric multigrid method to solve large sparse systems of equations stemming from discretization of elliptic PDEs.  ...  ... for funding this project by providing computing time on the GCS Supercomputer JUWELS at Jülich Supercomputing Centre (JSC).  ...
arXiv:2007.07539v1 fatcat:uspsoue4iresredpy3djlm66nu
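The precision split behind such mixed-precision multigrid methods — cheap smoothing in half precision, residual evaluation in full precision — can be shown in miniature on a 1D Poisson system. This sketch shows only the precision split with a plain Jacobi smoother; a real multigrid cycle adds coarse-grid correction, and the matrix and iteration counts here are illustrative:

```python
import numpy as np

# 1D Poisson matrix: the standard tridiagonal [-1, 2, -1] stencil.
n = 32
A = 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
b = np.ones(n)
x = np.zeros(n)
D_inv = np.float16(0.5)          # inverse of the diagonal, kept in fp16

for _ in range(300):
    r = b - A @ x                                    # residual in float64
    step = (D_inv * r.astype(np.float16)).astype(np.float64)
    x = x + step                                     # fp16 Jacobi smoothing step

res_norm = np.linalg.norm(b - A @ x)
init_norm = np.linalg.norm(b)
```

The float64 residual keeps the iteration anchored to the true problem, so the fp16 smoothing only limits how far the residual can ultimately be driven down, not the correctness of the converged part.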

European Exascale Software Initiative: Numerical Libraries, Solvers and Algorithms [chapter]

Iain S. Duff
2012 Lecture Notes in Computer Science  
Computers with sustained Petascale performance are now available and it is expected that hardware will be developed with a peak capability in the Exascale range by around 2018.  ...  The main goals of EESI are to build a European vision and roadmap to address the international outstanding challenge of performing scientific computing on the new generation of computers.  ...  Thus the involvement of all the people listed in Table 2 is gratefully acknowledged.  ... 
doi:10.1007/978-3-642-29737-3_34 fatcat:7nqpdtlvlbddtmf5epwq23mera


Cheng Tan, Thierry Tambe, Jeff (Jun) Zhang, Bo Fang, Tong Geng, Gu-Yeon Wei, David Brooks, Antonino Tumeo, Ganesh Gopalakrishnan, Ang Li
2022 Proceedings of the 36th ACM International Conference on Supercomputing  
Our evaluation shows that ASAP generates specialized designs 3.2×, 4.21×, and 5.8× more efficient (in terms of performance per unit of energy or area) than non-specialized homogeneous CGRAs, for the scientific  ...  To address this gap, we propose ASAP, a hardware/software co-design framework that automatically identifies and synthesizes optimal precision-aware CGRA for a set of applications of interest.  ...  Machine learning inference can typically afford further reduced precision (paying some trivial penalty in accuracy), exploiting half-precision floating-point formats (FP16) or specialized solutions like  ...
doi:10.1145/3524059.3532359 fatcat:re267n7aw5cxlirdteca7tmzey

Pipelined Mixed Precision Algorithms on FPGAs for Fast and Accurate PDE Solvers from Low Precision Components

Robert Strzodka, Dominik Goddeke
2006 2006 14th Annual IEEE Symposium on Field-Programmable Custom Computing Machines  
FPGAs are becoming more and more attractive for high precision scientific computations.  ...  in double precision to obtain the same accuracy as a full double precision solver.  ...  Acknowledgments We thank Pavle Belanovic and Miriam Leeser for the availability of their generic floating point library, Industrial Light & Magic for the half class and Xilinx for their ISE.  ... 
doi:10.1109/fccm.2006.57 dblp:conf/fccm/StrzodkaG06 fatcat:yokqwxy46bcvno6cd7dcuefgdi

Evaluating Mixed-Precision Arithmetic for 3D Generative Adversarial Networks to Simulate High Energy Physics Detectors

John Osorio Rios, Adria Armejach, Gulrukh Khattak, Eric Petit, Sofia Vallecorsa, Marc Casas
2020 2020 19th IEEE International Conference on Machine Learning and Applications (ICMLA)  
The usage of Mixed Precision (MP) arithmetic with floating-point 32-bit (FP32) and 16-bit half-precision aims at improving memory and floating-point operation throughput, allowing faster training of bigger  ...  This paper proposes a binary analysis tool enabling the emulation of lower-precision numerical formats in Neural Network implementations without the need for hardware support.  ...  In that case, computation proceeds in FP32. The tool intercepts all floating-point instructions of the workload, including FMAs.  ...
doi:10.1109/icmla51294.2020.00017 fatcat:wz7z4efnffchrnhqhtzyi7tcqy
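Software emulation of reduced-precision formats, as this abstract describes, typically works by rewriting each floating-point result to have fewer significand bits. A simple stdlib sketch of one such rewrite (truncating low mantissa bits of the binary32 encoding; real FP16 also narrows the exponent range, which this sketch ignores, and the emulated tool's actual mechanism may differ):

```python
import struct

def round_to_precision(x, mantissa_bits):
    """Emulate a reduced-precision float by zeroing the low mantissa
    bits of the IEEE 754 binary32 encoding of x (truncation toward
    zero for positive values)."""
    (bits,) = struct.unpack('<I', struct.pack('<f', x))
    drop = 23 - mantissa_bits            # binary32 has 23 stored mantissa bits
    bits &= ~((1 << drop) - 1)           # clear the low `drop` bits
    (y,) = struct.unpack('<f', struct.pack('<I', bits))
    return y

approx = round_to_precision(1.2345678, 10)   # ~FP16-sized significand
```

Applied after every intercepted instruction, such a rewrite lets the same binary run under any candidate significand width without special hardware.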

Make it real: Effective floating-point reasoning via exact arithmetic

Miriam Leeser, Saoni Mukherjee, Jaideep Ramachandran, Thomas Wahl
2014 Design, Automation & Test in Europe Conference & Exhibition (DATE), 2014  
Floating-point arithmetic is widely used in scientific computing.  ...  To address these problems, we present a decision procedure for floating-point arithmetic (FPA) that exploits the proximity to real arithmetic (RA), via a lossless reduction from FPA to RA.  ...  In this paper, we propose an approach that exploits the proximity of floating-point to exact real arithmetic.  ...
doi:10.7873/date.2014.130 dblp:conf/date/LeeserMRW14 fatcat:m35vz6i3krgvrlcw4f5wirnn5i
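The gap between floating-point and real arithmetic that this reduction exploits is easy to exhibit with exact rationals. In the same spirit (though far simpler than the paper's decision procedure), Python's `fractions.Fraction` separates the exact real value from the rounded binary64 value:

```python
from fractions import Fraction

# In binary64, 0.1 + 0.2 != 0.3 because none of the decimals is
# exactly representable; over exact rationals the identity holds.
fp_mismatch = (0.1 + 0.2 != 0.3)
ra_equal = (Fraction(1, 10) + Fraction(2, 10) == Fraction(3, 10))

# Fraction(float) recovers the exact rational the hardware actually
# stores for the literal 0.1: 3602879701896397 / 2**55.
exact_tenth = Fraction(0.1)
```

Reasoning over the exact values (RA) while tracking which rational each float denotes is the proximity the decision procedure leverages.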

Towards a fixed point QP solver for predictive control

Juan L. Jerez, George A. Constantinides, Eric C. Kerrigan
2012 2012 IEEE 51st IEEE Conference on Decision and Control (CDC)  
solver to allow for fast and efficient computation in parallel hardware.  ...  The proposed approach is evaluated through the implementation of a mixed-precision interior-point controller for a Boeing 747 aircraft.  ...  solver (MINRES in this case) is computed in fixed-point, whereas the rest of the algorithm is computed in double-precision floating-point.  ...
doi:10.1109/cdc.2012.6427015 dblp:conf/cdc/JerezCK12 fatcat:6jtalwqjjvgl7c3rpevl5i2rea
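The fixed-point datapath such an FPGA inner solver would use can be modelled with Q-format integers: values carry an implicit scale of 2**frac_bits, and every multiply is rescaled by a shift. A sketch of that arithmetic (16 fractional bits is illustrative, not the paper's chosen word length):

```python
# Q-format fixed-point sketch: an integer i represents the real value
# i / 2**FRAC_BITS.  Addition is plain integer addition; multiplication
# needs one rescaling shift - exactly the cheap datapath an FPGA
# implementation of the inner solver exploits.
FRAC_BITS = 16
ONE = 1 << FRAC_BITS

def fx(x):
    """Quantize a real value to fixed-point."""
    return int(round(x * ONE))

def fx_mul(a, b):
    """Fixed-point multiply: integer product, then rescale."""
    return (a * b) >> FRAC_BITS

def fx_dot(xs, ys):
    """Dot product entirely in fixed-point, converted back at the end."""
    acc = 0
    for a, b in zip(xs, ys):
        acc += fx_mul(fx(a), fx(b))
    return acc / ONE

val = fx_dot([0.5, 0.25, -1.5], [2.0, 4.0, 1.0])
```

Because the inputs above are exact in Q16.16, the dot product is exact; for general inputs the quantization error is bounded by the chosen number of fractional bits, which is the trade-off the mixed fixed/floating split manages.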