
Another Trip to the Wall

Milan Radulovic, Darko Zivanovic, Daniel Ruiz, Bronis R. de Supinski, Sally A. McKee, Petar Radojković, Eduard Ayguadé
2015 Proceedings of the 2015 International Symposium on Memory Systems - MEMSYS '15  
Here we summarize our analysis and expectations of how such 3D-stacked DRAMs will affect the memory wall for a set of representative HPC applications.  ...  The first such products will soon hit the market, and some of the publicity claims that they will break through the memory wall.  ...  In Figure 6, we quantify the impact of a 25% memory bandwidth increase on application performance. Performance improves by 14.7%, 8.5%, and 3.7% for STREAM, QE, and ALYA, respectively.  ... 
doi:10.1145/2818950.2818955 dblp:conf/memsys/RadulovicZRSMRA15 fatcat:zkurubdr6naz5cdg7thwmqle7m

On the Effects of Memory Latency and Bandwidth on Supercomputer Application Performance

Richard Murphy
2007 2007 IEEE 10th International Symposium on Workload Characterization  
This paper compares the memory performance sensitivity of both traditional and emerging HPC applications, and shows that the new codes are significantly more sensitive to memory latency and bandwidth than  ...  The performance of both classes of applications is dominated by the performance of the memory system.  ...  It examines the impact of memory latency and bandwidth on the applications.  ... 
doi:10.1109/iiswc.2007.4362179 dblp:conf/iiswc/Murphy07 fatcat:yt6zhxueibcfloc2htnaj73wwq

Exploiting NVM in large-scale graph analytics

Jasmina Malicevic, Subramanya Dulloor, Narayanan Sundaram, Nadathur Satish, Jeff Jackson, Willy Zwaenepoel
2015 Proceedings of the 3rd Workshop on Interactions of NVM/FLASH with Operating Systems and Workloads - INFLOW '15  
While all of these applications are sensitive to higher latency or lower bandwidth of NVM, resulting in performance degradation of up to 4× with NVM-only (compared to DRAM-only), we show that the performance  ...  Further, we demonstrate that, in a hybrid memory system with NVM and DRAM, intelligent placement of application data based on their relative importance may help offset the overheads of the NVM-only solution  ...  We provide a short discussion on the impact of memory access patterns on the average memory access latency, and therefore performance of an application.  ... 
doi:10.1145/2819001.2819005 dblp:conf/sosp/MalicevicDSSJZ15 fatcat:iwocr4et7bdg7kyldniveesqnu

Understanding Cloud Workloads Performance in a Production like Environment [article]

Lucia Pons, Josué Feliu, José Puche, Chaoyi Huang, Salvador Petit, Julio Pons, María E. Gómez, Julio Sahuquillo
2020 arXiv   pre-print
After that, we present three main studies addressing three major concerns to improve the cloud performance: impact of the level of load on performance, impact of hyper-threading on performance, and impact  ...  With this aim, this paper devises a workload taxonomy that classifies applications according to how the major system resources affect their performance (e.g., tail latency) as a function of the level of  ...  Finally, we have used Intel RDT to study the impact of limiting the LLC space and the main memory bandwidth on the overall system performance for each application.  ... 
arXiv:2010.05031v1 fatcat:2qgptmdacjhwlnffv3j46qcevi

Understanding the Impact of Emerging Non-Volatile Memories on High-Performance, IO-Intensive Computing

Adrian M. Caulfield, Joel Coburn, Todor Mollov, Arup De, Ameen Akel, Jiahua He, Arun Jagatheesan, Rajesh K. Gupta, Allan Snavely, Steven Swanson
2010 2010 ACM/IEEE International Conference for High Performance Computing, Networking, Storage and Analysis  
We evaluate several approaches to integrating these memories into computer systems by measuring their impact on IO-intensive, database, and memory-intensive applications.  ...  The memories provide substantial application-level gains as well, but overheads in the OS, file system, and application can limit performance.  ...  The authors would also like to thank Nathan Goulding, Brett Kettering, and James Nunez.  ... 
doi:10.1109/sc.2010.56 dblp:conf/sc/CaulfieldCMDAHJGSS10 fatcat:kb66vm67gvatnpxfsff3jirc6i

Performance Modeling: Understanding the Past and Predicting the Future [chapter]

David H. Bailey, Allan Snavely
2005 Lecture Notes in Computer Science  
Such models can be used by vendors in system designs, by computing centers in system acquisitions, and by application scientists to improve the performance of their codes.  ...  We present an overview of current research in performance modeling, focusing on efforts underway in the Performance Evaluation Research Center (PERC).  ...  The last set of bars show the TCS values of performance, processor and memory subsystem speed, network bandwidth and latency, as a ratio of the BH values.  ... 
doi:10.1007/11549468_23 fatcat:kgo4sn2wcrbs5cgtjuvsbnazaa

MDM: The GPU Memory Divergence Model

Lu Wang, Magnus Jahre, Almutaz Adileh, Lieven Eeckhout
2020 2020 53rd Annual IEEE/ACM International Symposium on Microarchitecture (MICRO)  
In this work, we focus on analytically modeling the performance of emerging memory-divergent GPU-compute applications which are common in domains such as machine learning and data analytics.  ...  We propose the GPU Memory Divergence Model (MDM) which faithfully captures the key performance characteristics of memory-divergent applications, including memory request batching and excessive NoC/DRAM  ...  ACKNOWLEDGEMENTS We thank the anonymous reviewers for  ... 
doi:10.1109/micro50266.2020.00085 dblp:conf/micro/0019JAE20 fatcat:fppdhwq4dfgzlgs7pqdw27ragq

Interferences between Communications and Computations in Distributed HPC Systems

Alexandre Denis, Emmanuel Jeannot, Philippe Swartvagher
2021 50th International Conference on Parallel Processing  
We study the impact of communications on computations, and conversely the impact of computations on communication performance. We consider two aspects: CPU frequency, and memory contention.  ...  We show that CPU frequency variations caused by computation have a small impact on communication latency and bandwidth.  ...  This work was granted access to the HPC resources of CINES under the allocation 2019-A0060601567 attributed by GENCI (Grand Equipement National de Calcul Intensif).  ... 
doi:10.1145/3472456.3473516 fatcat:i7cixrumbzfpjmyxgeeosfrtuu

A study of application performance with non-volatile main memory

Yiying Zhang, Steven Swanson
2015 2015 31st Symposium on Mass Storage Systems and Technologies (MSST)  
We find that although NVMM is projected to have higher latency and lower bandwidth than DRAM, these differences have only a modest impact on application performance.  ...  We also compare the performance of applications running on realistic NVMM with the performance of the same applications running on idealized NVMM with the same performance as DRAM.  ...  First, NVMM improves storage application performance significantly over SSD and HDD. NVMM's higher latency and lower bandwidth than DRAM does not have big impact on storage applications.  ... 
doi:10.1109/msst.2015.7208275 dblp:conf/mss/ZhangS15 fatcat:fwf76hn2yrdq3ggxj3i4vfq6xe

A look at application performance sensitivity to the bandwidth and latency of InfiniBand networks

D.J. Kerbyson
2006 Proceedings 20th IEEE International Parallel & Distributed Processing Symposium  
The performance analysis is based on the use of detailed performance models of the three applications developed at Los Alamos.  ...  The relative importance of bandwidth, latency and node size differs between the applications.  ...  The performance of SAGE is considered using an input deck which assigns  ...  Acknowledgements This work was funded in part by the Accelerated Strategic Computing (ASC) program of the Department of Energy, and by the DARPA High Productivity Computing Systems program in collaboration  ... 
doi:10.1109/ipdps.2006.1639566 dblp:conf/ipps/Kerbyson06 fatcat:rfntyk52tndofegujavrasuawa


GemDroid: A Framework to Evaluate Mobile Platforms

Nachiappan Chidambaram Nachiappan, Praveen Yedlapalli, Niranjan Soundararajan, Mahmut Taylan Kandemir, Anand Sivasubramaniam, Chita R. Das
2014 The 2014 ACM international conference on Measurement and modeling of computer systems - SIGMETRICS '14  
models for several IPs to collectively study their impact on system-level performance and power.  ...  GemDroid has been designed by integrating the Android open-source emulator for facilitating execution of mobile applications, the GEM5 core simulator for analyzing the CPU and memory centric designs, and  ...  Acknowledgments We thank the anonymous reviewers, Ashutosh Pattnaik, Adwait Jog, Onur Kayiran, Prashanth Thinakaran and other HPCL members for their feedback on this paper.  ... 
doi:10.1145/2591971.2591973 dblp:conf/sigmetrics/NachiappanYSKSD14 fatcat:b4lxwk2r2vcefivgfhnalpjkaa

Exploring the Performance Benefit of Hybrid Memory System on HPC Environments

Ivy Bo Peng, Roberto Gioiosa, Gokcen Kestor, Pietro Cicotti, Erwin Laure, Stefano Markidis
2017 2017 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW)  
In this paper, we analyze the Intel KNL system and quantify the impact of the most important factors on the application performance by using a set of applications that are representative of scientific  ...  On the contrary, applications with random memory access pattern are latency-bound and may suffer from performance degradation when using only MCDRAM.  ...  This work was supported by the DOE Office of Science, Advanced Scientific Computing Research, under the ARGO project (award number 66150) and the CENATE project (award number 64386).  ... 
doi:10.1109/ipdpsw.2017.115 dblp:conf/ipps/PengGKCLM17 fatcat:awswvmbqurctpazotrxgii5ece

Run-time power-down strategies for real-time SDRAM memory controllers

Karthik Chandrasekar, Benny Akesson, Kees Goossens
2012 Proceedings of the 49th Annual Design Automation Conference on - DAC '12  
One provides significant energy savings without impacting the guaranteed bandwidth and latency bounds.  ...  If employed speculatively with real-time memory controllers, power-down mechanisms could impact both the guaranteed bandwidth and the memory latency bounds.  ...  Results and Analysis In our first experiment, we analyze the impact of the different power-down policies on total memory energy consumption when executing the four application traces concurrently.  ... 
doi:10.1145/2228360.2228538 dblp:conf/dac/0001AG12 fatcat:clyrz54g7rhvpabynnisxnp56u

STATS: A framework for microprocessor and system-level design space exploration

David H. Albonesi, Israel Koren
1999 Journal of systems architecture  
performance systems with minimal schedule impact.  ...  As microprocessor-based systems grow in complexity, and the processor-memory speed gap widens further, more emphasis needs to be placed on early design space exploration in order to produce the highest  ...  Acknowledgements The authors thank Tryggve Fossum, Michael Adler, Joel Emer, Geo Lowney, Bob Nix, and David Webb of Digital Equipment Corporation for developing and licensing the compilation and simulation  ... 
doi:10.1016/s1383-7621(98)00052-6 fatcat:7ikxbpyvrfaqph2aiq7god74ci

Understanding PCIe performance for end host networking

Rolf Neugebauer, Gianni Antichi, José Fernando Zazo, Yury Audzevich, Sergio López-Buedo, Andrew W. Moore
2018 Proceedings of the 2018 Conference of the ACM Special Interest Group on Data Communication - SIGCOMM '18  
This paper focuses on the performance implication of PCIe, the de-facto I/O interconnect in contemporary servers, when interacting with the host architecture and device drivers.  ...  However, implementing custom designs on programmable NICs is not easy: many potential bottlenecks can impact performance.  ...  Acknowledgments. This research is (in part) supported by the UK's Engineering and Physical Sciences Research Council (EPSRC) under the EARL project (EP/P025374/1) and the European H2020 projects dReDBox  ... 
doi:10.1145/3230543.3230560 dblp:conf/sigcomm/NeugebauerAZAL018 fatcat:buf6rbn4gbgoneddg2rja67tli