SPEComp: A New Benchmark Suite for Measuring Parallel Computer Performance [chapter]

Vishal Aslot, Max Domeika, Rudolf Eigenmann, Greg Gaertner, Wesley B. Jones, Bodo Parady
2001 Lecture Notes in Computer Science  
We present a new benchmark suite for parallel computers. SPEComp targets mid-size parallel servers. It includes a number of science/engineering and data processing applications.  ...  Our overview also describes the organization developing SPEComp, issues in creating OpenMP parallel benchmarks, the benchmarking methodology underlying SPEComp, and basic performance characteristics.  ...  Acknowledgement We'd like to acknowledge the important contributions of the many authors of the individual benchmark applications.  ... 
doi:10.1007/3-540-44587-0_1 fatcat:nluzkwgtynfifjf2l5ldgvsb5y

Performance Evaluation of Massively Parallel Systems Using SPECOMP Suite

Dheya Mustafa
2022 Computers  
Performance analysis plays an essential role in achieving a scalable performance of applications on massively parallel supercomputers equipped with thousands of processors.  ...  This paper is an empirical investigation to study, in depth, the performance of two of the most common High-Performance Computing architectures in the world.  ...  Acknowledgments: We would like to thank Argonne National Lab for allowing us to use their machines. Conflicts of Interest: The author declares no conflict of interest.  ... 
doi:10.3390/computers11050075 fatcat:4lcuefuno5fwbdibxht43taw74

Barcelona OpenMP Tasks Suite: A Set of Benchmarks Targeting the Exploitation of Task Parallelism in OpenMP

Alejandro Duran, Xavier Teruel, Roger Ferrer, Xavier Martorell, Eduard Ayguade
2009 2009 International Conference on Parallel Processing  
And so, the need to have some set of benchmarks to evaluate it. In this paper, we motivate the need of having such a benchmarks suite, for irregular and/or recursive task parallelism.  ...  Parallel tasks allow the exploitation of irregular parallelism, but there is a lack of benchmarks exploiting tasks in OpenMP.  ...  OpenMP applications in OmpSCR, PARSEC, NAS, and SPEComp suites are mostly regular, and parallelism is exploited based on loops, with only a pair of applications exploiting parallelism based on sections  ... 
doi:10.1109/icpp.2009.64 dblp:conf/icpp/DuranTFMA09 fatcat:osdrpdeebvbqxovhorszp33naq

Power-performance efficiency of asymmetric multiprocessors for multi-threaded scientific applications

R.E. Grant, A. Afsahi
2006 Proceedings 20th IEEE International Parallel & Distributed Processing Symposium  
This paper explores the power-performance efficiency of Hyper-Threaded (HT) AMP servers, and proposes a new scheduling algorithm that can be used to reduce the overall power consumption of a server while  ...  maintaining a high level of performance.  ...  Acknowledgments The authors would like to thank the anonymous referees for their insightful comments.  ... 
doi:10.1109/ipdps.2006.1639601 dblp:conf/ipps/GrantA06 fatcat:bbmlbtjf7rakdkfnh65ibrctyi

Is the Schedule Clause Really Necessary in OpenMP? [chapter]

Eduard Ayguadé, Bob Blainey, Alejandro Duran, Jesús Labarta, Francisco Martínez, Xavier Martorell, Raúl Silvera
2003 Lecture Notes in Computer Science  
This paper proposes a new scheduling strategy, that derives at run time the best scheduling policy for each parallel loop in the program, based on information gathered at runtime by the library itself.  ...  This is not an easy task, even for expert programmers, and it can potentially take a large amount of time.  ...  Acknowledgments Authors want to thank Julita Corbalan for her insightful comments.  ... 
doi:10.1007/3-540-45009-2_12 fatcat:k7cimng75ng7lathmfpabhkbni

STAMP: Stanford Transactional Applications for Multi-Processing

Chi Cao Minh, JaeWoong Chung, Christos Kozyrakis, Kunle Olukotun
2008 2008 IEEE International Symposium on Workload Characterization  
We introduce the Stanford Transactional Applications for Multi-Processing (STAMP), a comprehensive benchmark suite for evaluating TM systems.  ...  Transactional Memory (TM) is emerging as a promising technology to simplify parallel programming.  ...  Future work for STAMP includes the addition of more benchmarks to cover even more scenarios. VII.  ... 
doi:10.1109/iiswc.2008.4636089 dblp:conf/iiswc/MinhCKO08 fatcat:ygykkfjasvdotnbdlkcfcistj4

IPC Considered Harmful for Multiprocessor Workloads

A.R. Alameldeen, D.A. Wood
2006 IEEE Micro  
Performance evaluation using IPC Many computer architecture textbooks and introductory courses teach that the time to run an application (time per program) is the ultimate performance measure for an archi  ...  use simulation as a primary tool to evaluate computer system performance and to compare architectural alternatives.  ...  We thank Virtutech AB, the Wisconsin Condor group, and the Wisconsin Computer Systems Lab for their help and support.  ... 
doi:10.1109/mm.2006.73 fatcat:yqhe3aswzbgvfne5n3s6c4b624

Evaluation of OpenMP Dependent Tasks with the KASTORS Benchmark Suite [chapter]

Philippe Virouleau, Pierrick Brunet, François Broquedis, Nathalie Furmento, Samuel Thibault, Olivier Aumage, Thierry Gautier
2014 Lecture Notes in Computer Science  
This paper introduces the KASTORS benchmark suite designed to evaluate OpenMP tasks dependencies.  ...  We modified state-of-the-art OpenMP 3.0 benchmarks and data-flow parallel linear algebra kernels to make use of tasks dependencies.  ...  Older OpenMP benchmark suites such as PARSEC [3], SPECOMP [10] and Rodinia [4] could be extended to benefit from task parallelism, as well as the NAS Parallel Benchmark suite [2] (NPB).  ... 
doi:10.1007/978-3-319-11454-5_2 fatcat:oajqv2mrfremvmnqzby7czko34

Multiple Instruction Stream Processor

Richard A. Hankins, Gautham N. Chinya, Jamison D. Collins, Perry H. Wang, Ryan Rakvic, Hong Wang, John P. Shen
2006 SIGARCH Computer Architecture News  
Microprocessor design is undergoing a major paradigm shift towards multi-core designs, in anticipation that future performance gains will come from exploiting thread-level parallelism in the software.  ...  MISP introduces the sequencer as a new category of architectural resource, and defines a canonical set of instructions to support user-level inter-sequencer signaling and asynchronous control transfer.  ...  Workloads and Methodology To analyze the performance of the MISP architecture, we choose a number of compute-bound, multithreaded kernels and applications from the SPEComp benchmark suite [1] and the  ... 
doi:10.1145/1150019.1136495 fatcat:ibcv5a3h5fhnzjpvcw7jlk5xli

Celebrating diversity: a mixture of experts approach for runtime mapping in dynamic environments

Murali Krishna Emani, Michael O'Boyle
2015 Proceedings of the 36th ACM SIGPLAN Conference on Programming Language Design and Implementation - PLDI 2015  
This paper focuses on selecting the best number of threads for a parallel application in dynamic environments. It develops a new scheme based on a mixture of experts approach.  ...  It learns online which, of a number of existing policies, or experts, is best suited for a particular environment without having to try out each policy.  ...  SpecOMP programs target high performance computing (HPC) domain. NAS programs are derived from computational fluid dynamics applications.  ... 
doi:10.1145/2737924.2737999 dblp:conf/pldi/EmaniO15 fatcat:fjy27us6xfey5jjozwlofmiwc4

Lonestar: A suite of parallel irregular programs

Milind Kulkarni, Martin Burtscher, Calin Cascaval, Keshav Pingali
2009 2009 IEEE International Symposium on Performance Analysis of Systems and Software  
To study and understand the patterns of parallelism and locality in sparse graph computations better, we are in the process of building the Lonestar benchmark suite.  ...  Our speedup numbers demonstrate that this new type of parallelism can successfully be exploited on modern multi-core machines.  ...  Several other studies have characterized older parallel benchmark suites, including Perfect Club [4], SPLASH-2 [22], NAS Parallel Benchmarks [2], and SPEComp [1].  ... 
doi:10.1109/ispass.2009.4919639 dblp:conf/ispass/KulkarniBCP09 fatcat:y27s645zdzftxdqap4nmbu5efi

Automatically Tuning Parallel and Parallelized Programs [chapter]

Chirag Dave, Rudolf Eigenmann
2010 Lecture Notes in Computer Science  
We evaluated our algorithm on a suite of hand-parallelized C benchmarks from the SPEC OMP2001 and NAS Parallel benchmarks and provide two sets of results.  ...  sections of code for the best possible parallel performance, both of which are difficult and time-consuming.  ...  A subset of benchmarks from the NAS Parallel benchmark suite and the SPEC OMP2001 suite was considered for evaluation.  ... 
doi:10.1007/978-3-642-13374-9_9 fatcat:azmmpsehifaxbevo3fcxmq2t5u

Redefining the Role of the CPU in the Era of CPU-GPU Integration

Manish Arora, Siddhartha Nath, Subhra Mazumdar, Scott B. Baden, Dean M. Tullsen
2012 IEEE Micro  
GPU computing has emerged as a viable alternative to CPUs for throughput oriented applications or regions of code. Speedups of 10× to 100× over CPU implementations have been reported.  ...  much performance-critical.  ...  Acknowledgment This work was funded in part by NSF grant CCF-1018356 and a grant from AMD.  ... 
doi:10.1109/mm.2012.57 fatcat:uumc4bnsrrcsxeqometp75kf6y

Smart, adaptive mapping of parallelism in the presence of external workload

M. K. Emani, Zheng Wang, M. F. P. O'Boyle
2013 Proceedings of the 2013 IEEE/ACM International Symposium on Code Generation and Optimization (CGO)  
Algorithm computing pagerank of each node, i.e. its importance Benchmark suite Program Input size Notes SpecOMP 320.equake medium Finite element simulation; earthquake modelling SpecOMP 330  ...  Benchmarks In total, 11 benchmark programs were chosen to evaluate this approach. These include all OpenMP-based C programs from NAS, SpecOMP benchmark suites as listed in Appendix A.  ... 
doi:10.1109/cgo.2013.6495010 dblp:conf/cgo/EmaniWO13 fatcat:ebxa4mr5qzcp5ifbiix23eyc4u

Towards architecture independent metrics for multicore performance analysis

Milind Kulkarni, Vijay Pai, Derek Schuff
2011 Performance Evaluation Review  
An understanding of locality effects and communication behavior can provide programmers with valuable information about performance bottlenecks and opportunities for optimization.  ...  Such metrics will allow a program's performance to be analyzed across a range of architectures without incurring the overhead of repeated profiling and analysis.  ...  Table 1 shows the accuracy of the sampled analysis compared to the full analysis for a selection of benchmarks from the NAS, SpecOMP and Parsec suites.  ... 
doi:10.1145/1925019.1925022 fatcat:yey6efndwnbhjg4tbxjolmuiue