A performance analysis framework for identifying potential benefits in GPGPU applications

Jaewoong Sim, Aniruddha Dasgupta, Hyesoon Kim, Richard Vuduc
2012 SIGPLAN notices  
In this paper, we present a performance analysis framework that can help shed light on such bottlenecks for GPGPU applications.  ...  Then, we apply static and dynamic profiling to instantiate our performance model for a particular input code and show how the model can predict the potential performance benefits.  ...  Acknowledgments Many thanks to Sunpyo Hong, Jiayuan Meng, HPArch members, and the anonymous reviewers for their suggestions and feedback on improving the paper.  ... 
doi:10.1145/2370036.2145819 fatcat:763qhlvghvfo7k6mxu73jbgpg4

A performance analysis framework for identifying potential benefits in GPGPU applications

Jaewoong Sim, Aniruddha Dasgupta, Hyesoon Kim, Richard Vuduc
2012 Proceedings of the 17th ACM SIGPLAN symposium on Principles and Practice of Parallel Programming - PPoPP '12  
In this paper, we present a performance analysis framework that can help shed light on such bottlenecks for GPGPU applications.  ...  Then, we apply static and dynamic profiling to instantiate our performance model for a particular input code and show how the model can predict the potential performance benefits.  ...  Acknowledgments Many thanks to Sunpyo Hong, Jiayuan Meng, HPArch members, and the anonymous reviewers for their suggestions and feedback on improving the paper.  ... 
doi:10.1145/2145816.2145819 dblp:conf/ppopp/SimDKV12 fatcat:fo4fdxdlo5ebzm2pr3lw7v77cy

Large-Scale Agent-Based Models in Marketing Research: The Quest for the Mythical Free Lunch

Alexandru Voicu
2013 International Journal of Business and Economics Research  
The present research presents a cross-section of the current state-of-the-art in high-performance ABM frameworks, and proposes a novel approach to levering the as of yet untapped potential of cheap, ubiquitous  ...  Unfortunately, the latter source of growth has grown stagnant, and the only avenue for the continued expansion of performance appears to be the move to parallel platforms and programming.  ...  Clearly, it would be folly to aim for an all encompassing analysis, as there is a breadth of applications of ABM in economics.  ... 
doi:10.11648/j.ijber.20130203.11 fatcat:6cjp6e555bbwletsgng6scalna

Shadowfax

Alexander M. Merritt, Vishakha Gupta, Abhishek Verma, Ada Gavrilovska, Karsten Schwan
2011 Proceedings of the 5th international workshop on Virtualization technologies in distributed computing - VTDC '11  
CPUs and CUDA-supported GPGPUs to form a 'virtual execution platform' for an application.  ...  To address this problem and to support increased flexibility in usage models for CUDA-based GPGPU applications, our research proposes GPGPU assemblies, where each assembly combines a desired number of  ...  This research was funded in part by the National Science Foundation SHF and Track II programs through award numbers OCI-0910735, SHF-0905459.  ... 
doi:10.1145/1996121.1996124 dblp:conf/hpdc/MerrittGVGS11 fatcat:qjnqtxg5wjd6lnwicoplvg4e4y

Performance Analysis and Tuning for General Purpose Graphics Processing Units (GPGPU)

Hyesoon Kim, Richard Vuduc, Sara Baghsorkhi, Jee Choi, Wen-mei Hwu
2012 Synthesis Lectures on Computer Architecture  
Acknowledgments First we would like to thank Mark Hill and Michael Morgan for having invited us to write a synthesis lecture and for their support. Many thanks to reviews from Tor M. Aamodt  ...  Given CPU code skeletons, the framework predicts the cost and benefit of GPGPU acceleration. There are also GPU simulators that can be used for performance analysis. Bakhoda et al.  ...  on GPGPUs (Chapter 3), and a survey of the current state-of-the-art in lower-level performance modeling and analysis for GPGPUs (Chapter 4).  ... 
doi:10.2200/s00451ed1v01y201209cac020 fatcat:ll4uas6lmjbcll5zqzomhcv5vq

High Performance Datacenter Networks: Architectures, Algorithms, and Opportunities

Dennis Abts, John Kim
2011 Synthesis Lectures on Computer Architecture  
doi:10.2200/s00341ed1v01y201103cac014 fatcat:rjpziqdnezdrnhfiygrg3jdz4m

Predicting Critical Warps in Near-Threshold GPGPU Applications using a Dynamic Choke Point Analysis

Sourav Sanyal, Prabal Basu, Aatreyi Bal, Sanghamitra Roy, Koushik Chakraborty
2019 2019 Design, Automation & Test in Europe Conference & Exhibition (DATE)  
General-purpose graphics processing units (GPGPU), owing to their enormous parallelism  ...  CPAWS identifies the choke point induced critical warps in GPGPU applications, and improves their execution latencies in their respective execution units.  ...  benefits of GPGPUs at NTC.  ... 
doi:10.23919/date.2019.8715059 dblp:conf/date/SanyalBBRC19 fatcat:dlcqotduibekpmm5r6lmwxsppe

A Map-Reduce Based Framework for Heterogeneous Processing Element Cluster Environments

Yu Shyang Tan, Bu-Sung Lee, Bingsheng He, Roy H. Campbell
2012 2012 12th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (ccgrid 2012)  
In this paper, we present our design of a Processing Element (PE) Aware MapReduce base framework, Pamar.  ...  Pamar allows users to easily parallelize applications across large datasets and at the same time utilizes different PEs for processing different classes of functions efficiently.  ...  This might potentially lead to a longer processing time as there are lesser nodes to share the workload. This can be seen vaguely in the performance of the DNS application in Figure 7.  ... 
doi:10.1109/ccgrid.2012.35 dblp:conf/ccgrid/TanLHC12 fatcat:5urllrimvzeobkqmesjvnexgri

GPGPU Benchmark Suites: How Well Do They Sample the Performance Spectrum?

Jee Ho Ryoo, Saddam J. Quirem, Michael Lebeane, Reena Panda, Shuang Song, Lizy K. John
2015 2015 44th International Conference on Parallel Processing  
Recently, GPGPUs have positioned themselves in the mainstream processor arena with their potential to perform a massive number of jobs in parallel.  ...  Our methodology can serve as a performance spectrum oriented guidebook for designing future GPGPU benchmarks.  ...  For completeness of the analysis, every application in all suites is included in the analysis regardless of whether one suite has a similar benchmark to another suite.  ... 
doi:10.1109/icpp.2015.41 dblp:conf/icpp/RyooQLPSJ15 fatcat:mdabrq4gh5fblgbb3koeyk6bvq

An automated framework for characterizing and subsetting GPGPU workloads

Vignesh Adhinarayanan, Wu-chun Feng
2016 2016 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS)  
Our analysis shows that a subset of eight applications provides most of the diversity in the 19-application benchmark suite.  ...  To overcome these problems, we propose an automated framework that characterizes and subsets GPGPU workloads, depending on a user-chosen set of performance metrics/counters.  ...  ACKNOWLEDGMENT This work was supported in part by NSF I/UCRC IIP-0804155 and IIP-1266245 via the NSF Center for High-Performance Reconfigurable Computing.  ... 
doi:10.1109/ispass.2016.7482105 dblp:conf/ispass/AdhinarayananF16 fatcat:uxwgfxtgkjcndkfr4hbwps757i

A complete and efficient CUDA-sharing solution for HPC clusters

Antonio J. Peña, Carlos Reaño, Federico Silla, Rafael Mayo, Enrique S. Quintana-Ortí, José Duato
2014 Parallel Computing  
This work was also supported in part by the U.S. Department of Energy, Office of Science, under contract DE-AC02-06CH11357.  ...  Authors are grateful for the generous support provided by Mellanox Technologies to the rCUDA project.  ...  Since a similar issue arises also when a few local GPUs share internal paths, applications caring about this potential limitation will seamlessly benefit from this feature of rCUDA.  ... 
doi:10.1016/j.parco.2014.09.011 fatcat:5s4knmxn3nbzpmem2fqhttv2x4

Emerging Computing Technologies in High Energy Physics [article]

Amir Farbin
2009 arXiv   pre-print
has resulted in a generally slow adoption of emerging computing technologies which rapidly become commonplace in business and other scientific fields.  ...  I will overview some of the fundamental computing problems in HEP computing and then present the current state and future potential of employing new computing technologies in addressing these problems.  ...  Simple ROOT Analysis A very simple study of read/write rates of ROOT [18] analyses illustrates the potential of SSDs and GPGPUs on the data analysis iteration rate.  ... 
arXiv:0910.3440v1 fatcat:avvkikyqznbrfhcz6abmshdoje

OpenMP to GPGPU

Seyong Lee, Seung-Jai Min, Rudolf Eigenmann
2009 SIGPLAN notices  
This paper presents a compiler framework for automatic source-to-source translation of standard OpenMP applications into CUDA-based GPGPU applications.  ...  In this paper, we have identified several key transformation techniques, which enable efficient GPU global memory access, to achieve high performance.  ...  Acknowledgments This work was supported, in part, by the National Science Foundation under grants No. 0429535-CCF, CNS-0751153, and 0833115-CCF.  ... 
doi:10.1145/1594835.1504194 fatcat:wbpl7ohbzffedndc6s6tafkfny

OpenMP to GPGPU

Seyong Lee, Seung-Jai Min, Rudolf Eigenmann
2008 Proceedings of the 14th ACM SIGPLAN symposium on Principles and practice of parallel programming - PPoPP '09  
This paper presents a compiler framework for automatic source-to-source translation of standard OpenMP applications into CUDA-based GPGPU applications.  ...  In this paper, we have identified several key transformation techniques, which enable efficient GPU global memory access, to achieve high performance.  ...  Acknowledgments This work was supported, in part, by the National Science Foundation under grants No. 0429535-CCF, CNS-0751153, and 0833115-CCF.  ... 
doi:10.1145/1504176.1504194 dblp:conf/ppopp/LeeME09 fatcat:7ru27sozu5h5hhlni4w4cdx6hi

Exploring the interoperability of remote GPGPU virtualization using rCUDA and directive-based programming models

Adrián Castelló, Antonio J. Peña, Rafael Mayo, Judit Planas, Enrique S. Quintana-Ortí, Pavan Balaji
2016 Journal of Supercomputing  
In particular, rCUDA provides transparent access to any graphic processor unit installed in a cluster, reducing the number of accelerators and increasing their utilization ratio.  ...  Remote accelerator virtualization frameworks address those problems.  ...  The performance results for rCUDA in the chart clearly demonstrate the benefits of the remote virtualization approach for this application.  ... 
doi:10.1007/s11227-016-1791-y fatcat:rxi6kn7rjbcv3kxebs52a2j5wm