Abstract Machine Models and Proxy Architectures for Exascale Computing

J.A. Ang, R.F. Barrett, R.E. Benner, D. Burke, C. Chan, J. Cook, D. Donofrio, S.D. Hammond, K.S. Hemmert, S.M. Kelly, H. Le, V.J. Leung (+6 others)
2014 2014 Hardware-Software Co-Design for High Performance Computing  
They allow for application performance analysis and hardware optimization opportunities.  ...  knowledge and refinements for contemporary computer systems.  ...  many billions of computing elements in an exascale system.  ... 
doi:10.1109/co-hpc.2014.4 dblp:conf/sc/AngBBBCCDHHKLLR14 fatcat:sot6sfvdhbcwfbspps77auhwum

Coordinated energy management in heterogeneous processors

Indrani Paul, Vignesh Ravi, Srilatha Manne, Manish Arora, Sudhakar Yalamanchili
2013 Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis on - SC '13  
This paper examines energy management in a heterogeneous processor consisting of an integrated CPU-GPU for high-performance computing (HPC) applications.  ...  The insights from this analysis form the basis of a coordinated energy management scheme, called DynaCo, for integrated CPU-GPU architectures.  ...  The authors gratefully acknowledge the efforts and detailed comments of the reviewers, which substantially improved the final manuscript.  ... 
doi:10.1145/2503210.2503227 dblp:conf/sc/PaulRMAY13 fatcat:gqy377iu5bdmvm2e2zx64suao4

The Landscape of Exascale Research

Stijn Heldens, Pieter Hijma, Ben Van Werkhoven, Jason Maassen, Adam S. Z. Belloum, Rob V. Van Nieuwpoort
2020 ACM Computing Surveys  
In this work, we present an overview of these efforts and provide insight into the important trends, developments, and exciting research opportunities in exascale computing.  ...  Although we will certainly reach exascale soon, without additional research, these issues could potentially limit the applicability of exascale computing.  ...  [117] provide an overview of the challenges and opportunities of in-situ processing for exascale computing. Yu et al.  ... 
doi:10.1145/3372390 fatcat:jhtwt7pxd5c5darhz75hiqgsnq

Coordinated Energy Management in Heterogeneous Processors

Indrani Paul, Vignesh Ravi, Srilatha Manne, Manish Arora, Sudhakar Yalamanchili
2014 Scientific Programming  
This paper examines energy management in a heterogeneous processor consisting of an integrated CPU–GPU for high-performance computing (HPC) applications.  ...  The insights from this analysis form the basis of a coordinated energy management scheme, called DynaCo, for integrated CPU–GPU architectures.  ...  Acknowledgements The authors gratefully acknowledge the efforts and detailed comments of the reviewers, which substantially improved the final manuscript.  ... 
doi:10.1155/2014/210762 fatcat:t73zetmjrjfojmlg7dlovv7xl4

Navigating an Evolutionary Fast Path to Exascale

R.F. Barrett, S.D. Hammond, C.T. Vaughan, D.W. Doerfler, M.A. Heroux, J.P. Luitjens, D. Roweth
2012 2012 SC Companion: High Performance Computing, Networking Storage and Analysis  
The advent of manycore and heterogeneous computing nodes forces us to reconsider every aspect of the system software and application stack.  ...  Aiding an evolutionary approach is the recognition that the performance potential of the architectures is, in a meaningful sense, an extension of existing capabilities: vectorization, threading, and a  ...  The test beds used for this research are funded by the Department of Energy's NNSA ASC program and the Office of Science Advanced Scientific Computing Research (ASCR) program.  ... 
doi:10.1109/sc.companion.2012.55 dblp:conf/sc/BarrettHVDHLR12 fatcat:3frq3n526vccbmfcpkoneb4edu

Efficient synthetic traffic models for large, complex SoCs

Jieming Yin, Onur Kayiran, Matthew Poremba, Natalie Enright Jerger, Gabriel H. Loh
2016 2016 IEEE International Symposium on High Performance Computer Architecture (HPCA)  
As systems scale up in size and functionality, the ability to efficiently model larger and more complex NoCs becomes increasingly important to the design and evaluation of such systems.  ...  We identify and analyze the shortcomings of SynFull in the context of a SoC consisting of a heterogeneous architecture (CPU and GPU), a more complex cache hierarchy including support for full coherence  ...  Other product names used in this publication are for identification purposes only and may be trademarks of their respective companies.  ... 
doi:10.1109/hpca.2016.7446073 dblp:conf/hpca/YinKPJL16 fatcat:fd4fgvheendr7gegehz7ejn72m

Simultac Fonton: A Fine-Grain Architecture for Extreme Performance beyond Moore's Law

2017 Supercomputing Frontiers and Innovations  
"Continuum Computer Architecture" is introduced as a genus of ultra-fine-grained architectures where complexity of operation is an emergent behavior of simplicity of design combined with highly replicated  ...  With nano-scale technology and Moore's Law end, architecture advance serves as the principal means of achieving enhanced efficiency and scalability into the exascale era.  ...  The paper concludes with an analysis that suggests that an exascale computer employing such concepts could be devised and fabricated using contemporary CMOS semiconductor technology and managed through  ... 
doi:10.14529/jsfi170203 fatcat:2us4eiaoyvgdrnldlljjlrntae

Scientific Computing Using Consumer Video-Gaming Hardware Devices [article]

Glenn Volkema, Gaurav Khanna
2016 arXiv   pre-print
In this article, we evaluate a sample of current generation video-gaming hardware devices for scientific computing and compare their performance with specialized supercomputing general purpose graphics  ...  , Einstein@Home in the field of gravitational physics for the purposes of this evaluation.  ...  The authors would like to acknowledge support from the Center for Scientific Computing and Visualization Research at UMass Dartmouth.  ... 
arXiv:1607.05537v1 fatcat:xjqxkq7up5etfbr5hztqnqgoby

Accelerating Hydrocodes with OpenACC, OpenCL and CUDA

J. A. Herdman, W. P. Gaudin, S. McIntosh-Smith, M. Boulton, D. A. Beckingsale, A. C. Mallinson, S. A. Jarvis
2012 2012 SC Companion: High Performance Computing, Networking Storage and Analysis  
Hardware accelerators such as GPGPUs are becoming increasingly common in HPC platforms and their use is widely recognised as being one of the most promising approaches for reaching exascale levels of performance  ...  We find that OpenACC is an extremely viable programming model for accelerator devices, improving programmer productivity and achieving better performance than OpenCL and CUDA.  ...  of Predictive Models for Future Computing Requirements) and CDK0724 (AWE Technical Outreach Programme).  ... 
doi:10.1109/sc.companion.2012.66 dblp:conf/sc/HerdmanGMBBMJ12 fatcat:hu77tqzljrgjfbdcwsoonjz5yu

Highly optimized full GPU-acceleration of non-hydrostatic weather model SCALE-LES

Mohamed Wahib, Naoya Maruyama
2013 2013 IEEE International Conference on Cluster Computing (CLUSTER)  
The proposed acceleration is important for identifying the expectations and requirements of scaling SCALE-LES, and similar real world applications, into the exascale era.  ...  The GPU implementation includes the optimized GPU acceleration of SCALE-LES for a single GPU with both CUDA Fortran and OpenACC. It also includes scaling SCALE-LES for GPU-accelerated clusters.  ...  ACKNOWLEDGMENT The authors would like to thank Team SCALE for their support and discussions. The authors also thank Seiya Nishizawa for his consistent support and review of the results.  ... 
doi:10.1109/cluster.2013.6702667 dblp:conf/cluster/WahibM13 fatcat:cm7uzjleinaofbgjw2mgoaytxi

Enabling CP2K Application for Exascale Computing with Accelerators Using OpenACC and OpenCL

Mariusz Uchroński
2014 Zenodo  
CP2K is an application for atomistic and molecular simulation and, with its excellent scalability, is particularly important with regards to use on future exascale systems.  ...  We focused on enabling the library on a potentially wider range of computing resources using OpenCL and OpenACC technologies, to bring the overall application closer to exascale.  ...  Introduction CP2K [1] is an open-source application designed for atomistic and molecular simulation of solid state, liquid, molecular and biological systems.  ... 
doi:10.5281/zenodo.822907 fatcat:qc4aalw3fvdavedfumz2wvpzdy

The tradeoffs of fused memory hierarchies in heterogeneous computing architectures

Kyle L. Spafford, Jeremy S. Meredith, Seyong Lee, Dong Li, Philip C. Roth, Jeffrey S. Vetter
2012 Proceedings of the 9th conference on Computing Frontiers - CF '12  
We examine the impact of this trend for high performance scientific computing by investigating AMD's new Fusion Accelerated Processing Unit (APU) as a testbed.  ...  With the rise of general purpose computing on graphics processing units (GPGPU), the influence from consumer markets can now be seen across the spectrum of computer architectures.  ...  This research is sponsored in part by the Office of Advanced Computing Research; U.S. Department of Energy.  ... 
doi:10.1145/2212908.2212924 dblp:conf/cf/SpaffordMLLRV12 fatcat:6dm46euwcvc4nmubvuvatb5wji

TOP-PIM

Dongping Zhang, Nuwan Jayasena, Alexander Lyashevsky, Joseph L. Greathouse, Lifan Xu, Michael Ignatowski
2014 Proceedings of the 23rd international symposium on High-performance parallel and distributed computing - HPDC '14  
We also introduce a methodology for rapid design space exploration by analytically predicting performance and energy of in-memory processors based on metrics obtained from execution on today's GPU hardware  ...  Moving computation closer to memory presents an opportunity to reduce both energy and data movement overheads.  ...  ACKNOWLEDGEMENTS We would like to thank Yasuko Eckert and Wei Huang for their input on modeling memory stack thermals. We appreciate the invaluable comments from the anonymous reviewers.  ... 
doi:10.1145/2600212.2600213 dblp:conf/hpdc/ZhangJLGXI14 fatcat:gfgw5o2kara6jnft3tcadzhclu

Performance evaluation and analysis of sparse matrix and graph kernels on heterogeneous processors

Feng Zhang, Weifeng Liu, Ningxuan Feng, Jidong Zhai, Xiaoyong Du
2019 CCF Transactions on High Performance Computing  
We then analyze the best performance configurations, in terms of algorithm and compute resource, for matrices of various sparsity structures.  ...  Heterogeneous processors integrate very distinct compute resources such as CPUs and GPUs into the same chip, thus can exploit the advantages and avoid disadvantages of those compute units.  ...  Acknowledgements This work has been partly supported by the National Natural Science Foundation of China (Grant nos. 61732014, 61802412, 61671151), Beijing Natural Science Foundation (no. 4172031), and  ... 
doi:10.1007/s42514-019-00008-6 fatcat:t3nfa446lbb6pb4azkfjowhijm

First steps towards more numerical reproducibility

Fabienne Jézéquel, Philippe Langlois, Nathalie Revol, Jean-Stéphane Dhersin
2014 ESAIM Proceedings and Surveys  
Results of floating-point computation depend on the computer arithmetic precision and on the order of arithmetic operations.  ...  Massive parallel HPC which merges, for instance, many-core CPU and GPU, clearly modifies these two parameters even from run to run on a given computing platform. How to trust such computed results?  ...  Algorithms are designed specifically for interval computations and most of them usually combine both kinds of techniques.  ... 
doi:10.1051/proc/201445023 fatcat:lcr3p5bkhncrncytegcsday7ti