Abstract Machine Models and Proxy Architectures for Exascale Computing
2014
2014 Hardware-Software Co-Design for High Performance Computing
They allow for application performance analysis and hardware optimization opportunities. ...
knowledge and refinements for contemporary computer systems. ...
many billions of computing elements in an exascale system. ...
doi:10.1109/co-hpc.2014.4
dblp:conf/sc/AngBBBCCDHHKLLR14
fatcat:sot6sfvdhbcwfbspps77auhwum
Coordinated energy management in heterogeneous processors
2013
Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis on - SC '13
This paper examines energy management in a heterogeneous processor consisting of an integrated CPU-GPU for high-performance computing (HPC) applications. ...
The insights from this analysis form the basis of a coordinated energy management scheme, called DynaCo, for integrated CPU-GPU architectures. ...
The authors gratefully acknowledge the efforts and detailed comments of the reviewers, which substantially improved the final manuscript. ...
doi:10.1145/2503210.2503227
dblp:conf/sc/PaulRMAY13
fatcat:gqy377iu5bdmvm2e2zx64suao4
The Landscape of Exascale Research
2020
ACM Computing Surveys
In this work, we present an overview of these efforts and provide insight into the important trends, developments, and exciting research opportunities in exascale computing. ...
Although we will certainly reach exascale soon, without additional research, these issues could potentially limit the applicability of exascale computing. ...
[117] provide an overview of the challenges and opportunities of in-situ processing for exascale computing. Yu et al. ...
doi:10.1145/3372390
fatcat:jhtwt7pxd5c5darhz75hiqgsnq
Coordinated Energy Management in Heterogeneous Processors
2014
Scientific Programming
This paper examines energy management in a heterogeneous processor consisting of an integrated CPU–GPU for high-performance computing (HPC) applications. ...
The insights from this analysis form the basis of a coordinated energy management scheme, called DynaCo, for integrated CPU–GPU architectures. ...
Acknowledgements: The authors gratefully acknowledge the efforts and detailed comments of the reviewers, which substantially improved the final manuscript. ...
doi:10.1155/2014/210762
fatcat:t73zetmjrjfojmlg7dlovv7xl4
Navigating an Evolutionary Fast Path to Exascale
2012
2012 SC Companion: High Performance Computing, Networking Storage and Analysis
The advent of manycore and heterogeneous computing nodes forces us to reconsider every aspect of the system software and application stack. ...
Aiding an evolutionary approach is the recognition that the performance potential of the architectures is, in a meaningful sense, an extension of existing capabilities: vectorization, threading, and a ...
The test beds used for this research are funded by the Department of Energy's NNSA ASC program and the Office of Science Advanced Scientific Computing Research (ASCR) program. ...
doi:10.1109/sc.companion.2012.55
dblp:conf/sc/BarrettHVDHLR12
fatcat:3frq3n526vccbmfcpkoneb4edu
Efficient synthetic traffic models for large, complex SoCs
2016
2016 IEEE International Symposium on High Performance Computer Architecture (HPCA)
As systems scale up in size and functionality, the ability to efficiently model larger and more complex NoCs becomes increasingly important to the design and evaluation of such systems. ...
We identify and analyze the shortcomings of SynFull in the context of a SoC consisting of a heterogeneous architecture (CPU and GPU), a more complex cache hierarchy including support for full coherence ...
Other product names used in this publication are for identification purposes only and may be trademarks of their respective companies. ...
doi:10.1109/hpca.2016.7446073
dblp:conf/hpca/YinKPJL16
fatcat:fd4fgvheendr7gegehz7ejn72m
Simultac Fonton: A Fine-Grain Architecture for Extreme Performance beyond Moore's Law
2017
Supercomputing Frontiers and Innovations
"Continuum Computer Architecture" is introduced as a genus of ultra-fine-grained architectures where complexity of operation is an emergent behavior of simplicity of design combined with highly replicated ...
With nano-scale technology and Moore's Law end, architecture advance serves as the principal means of achieving enhanced efficiency and scalability into the exascale era. ...
The paper concludes with an analysis that suggests that an exascale computer employing such concepts could be devised and fabricated using contemporary CMOS semiconductor technology and managed through ...
doi:10.14529/jsfi170203
fatcat:2us4eiaoyvgdrnldlljjlrntae
Scientific Computing Using Consumer Video-Gaming Hardware Devices
[article]
2016
arXiv
pre-print
In this article, we evaluate a sample of current generation video-gaming hardware devices for scientific computing and compare their performance with specialized supercomputing general purpose graphics ...
, Einstein@Home in the field of gravitational physics for the purposes of this evaluation. ...
The authors would like to acknowledge support from the Center for Scientific Computing and Visualization Research at UMass Dartmouth. ...
arXiv:1607.05537v1
fatcat:xjqxkq7up5etfbr5hztqnqgoby
Accelerating Hydrocodes with OpenACC, OpenCL and CUDA
2012
2012 SC Companion: High Performance Computing, Networking Storage and Analysis
Hardware accelerators such as GPGPUs are becoming increasingly common in HPC platforms and their use is widely recognised as being one of the most promising approaches for reaching exascale levels of performance ...
We find that OpenACC is an extremely viable programming model for accelerator devices, improving programmer productivity and achieving better performance than OpenCL and CUDA. ...
of Predictive Models for Future Computing Requirements) and CDK0724 (AWE Technical Outreach Programme). ...
doi:10.1109/sc.companion.2012.66
dblp:conf/sc/HerdmanGMBBMJ12
fatcat:hu77tqzljrgjfbdcwsoonjz5yu
Highly optimized full GPU-acceleration of non-hydrostatic weather model SCALE-LES
2013
2013 IEEE International Conference on Cluster Computing (CLUSTER)
The proposed acceleration is important for identifying the expectations and requirements of scaling SCALE-LES, and similar real world applications, into the exascale era. ...
The GPU implementation includes the optimized GPU acceleration of SCALE-LES for a single GPU with both CUDA Fortran and OpenACC. It also includes scaling SCALE-LES for GPU-accelerated clusters. ...
Acknowledgment: The authors would like to thank Team SCALE for their support and discussions. The authors also thank Seiya Nishizawa for his consistent support and review of the results. ...
doi:10.1109/cluster.2013.6702667
dblp:conf/cluster/WahibM13
fatcat:cm7uzjleinaofbgjw2mgoaytxi
Enabling CP2K Application for Exascale Computing with Accelerators Using OpenACC and OpenCL
2014
Zenodo
CP2K is an application for atomistic and molecular simulation and, with its excellent scalability, is particularly important with regards to use on future exascale systems. ...
We focused on enabling the library on a potentially wider range of computing resources using OpenCL and OpenACC technologies, to bring the overall application closer to exascale. ...
Introduction: CP2K [1] is an open-source application designed for atomistic and molecular simulation of solid state, liquid, molecular and biological systems. ...
doi:10.5281/zenodo.822907
fatcat:qc4aalw3fvdavedfumz2wvpzdy
The tradeoffs of fused memory hierarchies in heterogeneous computing architectures
2012
Proceedings of the 9th conference on Computing Frontiers - CF '12
We examine the impact of this trend for high performance scientific computing by investigating AMD's new Fusion Accelerated Processing Unit (APU) as a testbed. ...
With the rise of general purpose computing on graphics processing units (GPGPU), the influence from consumer markets can now be seen across the spectrum of computer architectures. ...
This research is sponsored in part by the Office of Advanced Computing Research; U.S. Department of Energy. ...
doi:10.1145/2212908.2212924
dblp:conf/cf/SpaffordMLLRV12
fatcat:6dm46euwcvc4nmubvuvatb5wji
TOP-PIM
2014
Proceedings of the 23rd international symposium on High-performance parallel and distributed computing - HPDC '14
We also introduce a methodology for rapid design space exploration by analytically predicting performance and energy of in-memory processors based on metrics obtained from execution on today's GPU hardware ...
Moving computation closer to memory presents an opportunity to reduce both energy and data movement overheads. ...
Acknowledgements: We would like to thank Yasuko Eckert and Wei Huang for their input on modeling memory stack thermals. We appreciate the invaluable comments from the anonymous reviewers. ...
doi:10.1145/2600212.2600213
dblp:conf/hpdc/ZhangJLGXI14
fatcat:gfgw5o2kara6jnft3tcadzhclu
Performance evaluation and analysis of sparse matrix and graph kernels on heterogeneous processors
2019
CCF Transactions on High Performance Computing
We then analyze the best performance configurations, in terms of algorithm and compute resource, for matrices of various sparsity structures. ...
Heterogeneous processors integrate very distinct compute resources such as CPUs and GPUs into the same chip, thus can exploit the advantages and avoid disadvantages of those compute units. ...
Acknowledgements: This work has been partly supported by the National Natural Science Foundation of China (Grant nos. 61732014, 61802412, 61671151), Beijing Natural Science Foundation (no. 4172031), and ...
doi:10.1007/s42514-019-00008-6
fatcat:t3nfa446lbb6pb4azkfjowhijm
First steps towards more numerical reproducibility
2014
ESAIM Proceedings and Surveys
Results of floating-point computation depend on the computer arithmetic precision and on the order of arithmetic operations. ...
Massive parallel HPC which merges, for instance, many-core CPU and GPU, clearly modifies these two parameters even from run to run on a given computing platform. How to trust such computed results? ...
Algorithms are designed specifically for interval computations and most of them usually combine both kinds of techniques. ...
doi:10.1051/proc/201445023
fatcat:lcr3p5bkhncrncytegcsday7ti