SPEComp: A New Benchmark Suite for Measuring Parallel Computer Performance
[chapter]
2001
Lecture Notes in Computer Science
We present a new benchmark suite for parallel computers. SPEComp targets mid-size parallel servers. It includes a number of science/engineering and data processing applications. ...
Our overview also describes the organization developing SPEComp, issues in creating OpenMP parallel benchmarks, the benchmarking methodology underlying SPEComp, and basic performance characteristics. ...
Acknowledgement We'd like to acknowledge the important contributions of the many authors of the individual benchmark applications. ...
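The SPEComp entry above concerns applications parallelized with OpenMP directives. As a minimal sketch of the kind of construct such benchmarks are built around (the loop and array names below are invented for illustration, not taken from any SPEComp code):

```c
/* A basic OpenMP work-sharing loop with a reduction; the computation
 * itself is made up and only stands in for a benchmark kernel. */
#include <omp.h>
#include <stdio.h>

#define N 1000000

int main(void) {
    static double a[N], b[N];
    double sum = 0.0;

    #pragma omp parallel for reduction(+:sum)
    for (int i = 0; i < N; i++) {
        a[i] = 0.5 * i;
        b[i] = a[i] * a[i];
        sum += b[i];
    }

    printf("sum = %e (max threads: %d)\n", sum, omp_get_max_threads());
    return 0;
}
```

Compiled with `gcc -fopenmp`, the loop iterations are shared among the available threads, which is the execution model these benchmarks exercise.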
doi:10.1007/3-540-44587-0_1
fatcat:nluzkwgtynfifjf2l5ldgvsb5y
Performance Evaluation of Massively Parallel Systems Using SPECOMP Suite
2022
Computers
Performance analysis plays an essential role in achieving a scalable performance of applications on massively parallel supercomputers equipped with thousands of processors. ...
This paper is an empirical investigation to study, in depth, the performance of two of the most common High-Performance Computing architectures in the world. ...
Acknowledgments: We would like to thank Argonne National Lab for allowing us to use their machines.
Conflicts of Interest: The author declares no conflict of interest. ...
doi:10.3390/computers11050075
fatcat:4lcuefuno5fwbdibxht43taw74
Barcelona OpenMP Tasks Suite: A Set of Benchmarks Targeting the Exploitation of Task Parallelism in OpenMP
2009
2009 International Conference on Parallel Processing
Hence the need for a set of benchmarks to evaluate it. In this paper, we motivate the need for such a benchmark suite for irregular and/or recursive task parallelism. ...
Parallel tasks allow the exploitation of irregular parallelism, but there is a lack of benchmarks exploiting tasks in OpenMP. ...
OpenMP applications in OmpSCR, PARSEC, NAS, and SPEComp suites are mostly regular, and parallelism is exploited based on loops, with only a pair of applications exploiting parallelism based on sections ...
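The BOTS entry contrasts loop-based work sharing with OpenMP tasks for irregular or recursive parallelism. A minimal sketch of that difference, using an invented recursive Fibonacci kernel rather than any BOTS benchmark:

```c
/* Loop parallelism vs. task parallelism in OpenMP (illustrative only). */
#include <stdio.h>

/* Regular, loop-shaped work: a parallel for is enough. */
void scale(double *x, int n) {
    #pragma omp parallel for
    for (int i = 0; i < n; i++)
        x[i] *= 2.0;
}

/* Irregular, recursive work: each call spawns tasks of unpredictable size,
 * which a static loop schedule cannot express. */
long fib(int n) {
    if (n < 2) return n;
    long a, b;
    #pragma omp task shared(a)
    a = fib(n - 1);
    #pragma omp task shared(b)
    b = fib(n - 2);
    #pragma omp taskwait
    return a + b;
}

int main(void) {
    double x[4] = {1, 2, 3, 4};
    scale(x, 4);

    long r = 0;
    #pragma omp parallel
    #pragma omp single
    r = fib(20);
    printf("fib(20) = %ld\n", r);
    return 0;
}
```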
doi:10.1109/icpp.2009.64
dblp:conf/icpp/DuranTFMA09
fatcat:osdrpdeebvbqxovhorszp33naq
Power-performance efficiency of asymmetric multiprocessors for multi-threaded scientific applications
2006
Proceedings 20th IEEE International Parallel & Distributed Processing Symposium
This paper explores the power-performance efficiency of Hyper-Threaded (HT) AMP servers, and proposes a new scheduling algorithm that can be used to reduce the overall power consumption of a server while maintaining a high level of performance. ...
Acknowledgments The authors would like to thank the anonymous referees for their insightful comments. ...
doi:10.1109/ipdps.2006.1639601
dblp:conf/ipps/GrantA06
fatcat:bbmlbtjf7rakdkfnh65ibrctyi
Is the Schedule Clause Really Necessary in OpenMP?
[chapter]
2003
Lecture Notes in Computer Science
This paper proposes a new scheduling strategy that derives at run time the best scheduling policy for each parallel loop in the program, based on information gathered at runtime by the library itself. ...
This is not an easy task, even for expert programmers, and it can potentially take a large amount of time. ...
Acknowledgments Authors want to thank Julita Corbalan for her insightful comments. ...
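The schedule clause discussed in this entry tells the OpenMP runtime how loop iterations are distributed over threads; the proposed strategy picks a policy automatically instead of asking the programmer to. A minimal sketch of the standard choices, with an invented cost() body standing in for real loop work:

```c
/* The common schedule kinds side by side; cost() is a made-up stand-in
 * for a loop body whose per-iteration work may vary. */
#include <stdio.h>

#define N 100000

static double cost(int i) { return (i % 97) * 1e-3; }   /* invented work */

int main(void) {
    static double out[N];

    /* static: iterations split into equal chunks up front;
       cheapest when every iteration costs about the same. */
    #pragma omp parallel for schedule(static)
    for (int i = 0; i < N; i++) out[i] = cost(i);

    /* dynamic,64: threads grab 64-iteration chunks on demand;
       helps when iteration costs are irregular. */
    #pragma omp parallel for schedule(dynamic, 64)
    for (int i = 0; i < N; i++) out[i] = cost(i);

    /* runtime: defer the decision to the OMP_SCHEDULE environment variable,
       i.e. choose the policy outside the source code. */
    #pragma omp parallel for schedule(runtime)
    for (int i = 0; i < N; i++) out[i] = cost(i);

    printf("out[N-1] = %f\n", out[N - 1]);
    return 0;
}
```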
doi:10.1007/3-540-45009-2_12
fatcat:k7cimng75ng7lathmfpabhkbni
STAMP: Stanford Transactional Applications for Multi-Processing
2008
2008 IEEE International Symposium on Workload Characterization
We introduce the Stanford Transactional Applications for Multi-Processing (STAMP), a comprehensive benchmark suite for evaluating TM systems. ...
Transactional Memory (TM) is emerging as a promising technology to simplify parallel programming. ...
Future work for STAMP includes the addition of more benchmarks to cover even more scenarios.
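The STAMP entry targets transactional memory. As a rough sketch of the programming model such benchmarks stress, here using GCC's `-fgnu-tm` language extension as a stand-in rather than STAMP's own API (the bank-account example is invented):

```c
/* A transactional critical section; compile with: gcc -fgnu-tm tm_demo.c
 * The block commits atomically or retries on conflict, replacing a lock.
 * This illustrates the TM programming model only; it is not STAMP code. */
#include <stdio.h>

static long balance = 0;

void deposit(long amount) {
    __transaction_atomic {
        balance += amount;
    }
}

int main(void) {
    deposit(42);
    printf("balance = %ld\n", balance);
    return 0;
}
```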
doi:10.1109/iiswc.2008.4636089
dblp:conf/iiswc/MinhCKO08
fatcat:ygykkfjasvdotnbdlkcfcistj4
IPC Considered Harmful for Multiprocessor Workloads
2006
IEEE Micro
Performance evaluation using IPC: Many computer architecture textbooks and introductory courses teach that the time to run an application (time per program) is the ultimate performance measure for an architecture ...
use simulation as a primary tool to evaluate computer system performance and to compare architectural alternatives. ...
We thank Virtutech AB, the Wisconsin Condor group, and the Wisconsin Computer Systems Lab for their help and support. ...
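The argument in this entry is that instructions per cycle can rise while useful work per unit time falls, for example when threads spin on locks and retire many cheap instructions. A small calculation with invented numbers to illustrate that effect:

```c
/* Invented numbers only: run B retires extra spin-loop instructions,
 * so its IPC looks better even though it needs more cycles overall. */
#include <stdio.h>

int main(void) {
    double insts_a = 1.0e9, cycles_a = 1.0e9;   /* run A: IPC = 1.00 */
    double insts_b = 1.5e9, cycles_b = 1.2e9;   /* run B: IPC = 1.25 */

    printf("A: IPC = %.2f, cycles = %.2e\n", insts_a / cycles_a, cycles_a);
    printf("B: IPC = %.2f, cycles = %.2e\n", insts_b / cycles_b, cycles_b);
    /* B wins on IPC but takes 20% more cycles: for multiprocessor
       workloads, time per program is the meaningful measure. */
    return 0;
}
```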
doi:10.1109/mm.2006.73
fatcat:yqhe3aswzbgvfne5n3s6c4b624
Evaluation of OpenMP Dependent Tasks with the KASTORS Benchmark Suite
[chapter]
2014
Lecture Notes in Computer Science
This paper introduces the KASTORS benchmark suite designed to evaluate OpenMP task dependencies. ...
We modified state-of-the-art OpenMP 3.0 benchmarks and data-flow parallel linear algebra kernels to make use of task dependencies. ...
Older OpenMP benchmark suites such as PARSEC [3] , SPECOMP [10] and Rodinia [4] could be extended to benefit from task parallelism, as well as the NAS Parallel Benchmark suite [2] (NPB). ...
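KASTORS targets the task dependences added in OpenMP 4.0 through the depend clause. A minimal sketch of the construct, using an invented three-task producer/consumer chain rather than any KASTORS kernel:

```c
/* OpenMP 4.0 task dependences: the three tasks are ordered through
 * their depend clauses rather than through explicit taskwait calls. */
#include <stdio.h>

int main(void) {
    double x = 0.0, y = 0.0;

    #pragma omp parallel
    #pragma omp single
    {
        #pragma omp task depend(out: x)
        x = 3.0;                       /* producer */

        #pragma omp task depend(in: x) depend(out: y)
        y = x * x;                     /* runs only after x is written */

        #pragma omp task depend(in: y)
        printf("y = %f\n", y);         /* runs only after y is written */
    }
    return 0;
}
```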
doi:10.1007/978-3-319-11454-5_2
fatcat:oajqv2mrfremvmnqzby7czko34
Multiple Instruction Stream Processor
2006
SIGARCH Computer Architecture News
Microprocessor design is undergoing a major paradigm shift towards multi-core designs, in anticipation that future performance gains will come from exploiting thread-level parallelism in the software. ...
MISP introduces the sequencer as a new category of architectural resource, and defines a canonical set of instructions to support user-level inter-sequencer signaling and asynchronous control transfer. ...
Workloads and Methodology To analyze the performance of the MISP architecture, we choose a number of compute-bound, multithreaded kernels and applications from the SPEComp benchmark suite [1] and the ...
doi:10.1145/1150019.1136495
fatcat:ibcv5a3h5fhnzjpvcw7jlk5xli
Celebrating diversity: a mixture of experts approach for runtime mapping in dynamic environments
2015
Proceedings of the 36th ACM SIGPLAN Conference on Programming Language Design and Implementation - PLDI 2015
This paper focuses on selecting the best number of threads for a parallel application in dynamic environments. It develops a new scheme based on a mixture of experts approach. ...
It learns online which, of a number of existing policies, or experts, is best suited for a particular environment without having to try out each policy. ...
SpecOMP programs target the high-performance computing (HPC) domain. NAS programs are derived from computational fluid dynamics applications. ...
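The entry above describes learning online which of several existing thread-count policies ("experts") suits the current environment. A toy sketch of that general idea, not the paper's algorithm: keep an average runtime per policy, try each once, then greedily pick the best (the policies, cost model, and measure_runtime() stub are all invented):

```c
/* Toy online selection among thread-count policies. Everything here is
 * invented for illustration; the paper's mixture-of-experts scheme is richer. */
#include <stdio.h>
#include <float.h>

#define NPOLICIES 3

/* Hypothetical experts: fixed small, all cores, half the cores. */
static int policy_threads(int p, int ncores) {
    switch (p) {
        case 0:  return 2;
        case 1:  return ncores;
        default: return ncores / 2 > 0 ? ncores / 2 : 1;
    }
}

/* Stub: pretend to run the parallel region and report seconds. */
static double measure_runtime(int threads) {
    return 1.0 / threads + 0.05 * threads;   /* made-up cost model */
}

int main(void) {
    int ncores = 8;
    double avg[NPOLICIES] = {0};
    int    runs[NPOLICIES] = {0};

    for (int epoch = 0; epoch < 20; epoch++) {
        /* Pick the expert with the best average so far (untried ones first). */
        int best = 0;
        double best_score = DBL_MAX;
        for (int p = 0; p < NPOLICIES; p++) {
            double s = runs[p] ? avg[p] : 0.0;   /* favour untried policies */
            if (s < best_score) { best_score = s; best = p; }
        }
        double t = measure_runtime(policy_threads(best, ncores));
        avg[best] = (avg[best] * runs[best] + t) / (runs[best] + 1);
        runs[best]++;
    }
    for (int p = 0; p < NPOLICIES; p++)
        printf("policy %d: runs=%d avg=%.3fs\n", p, runs[p], avg[p]);
    return 0;
}
```

A real scheme would keep exploring rather than committing greedily, which is part of what the mixture-of-experts formulation addresses.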
doi:10.1145/2737924.2737999
dblp:conf/pldi/EmaniO15
fatcat:fjy27us6xfey5jjozwlofmiwc4
Lonestar: A suite of parallel irregular programs
2009
2009 IEEE International Symposium on Performance Analysis of Systems and Software
To study and understand the patterns of parallelism and locality in sparse graph computations better, we are in the process of building the Lonestar benchmark suite. ...
Our speedup numbers demonstrate that this new type of parallelism can successfully be exploited on modern multi-core machines. ...
Several other studies have characterized older parallel benchmark suites, including Perfect Club [4] , SPLASH-2 [22] , NAS Parallel Benchmarks [2] , and SPEComp [1] . ...
doi:10.1109/ispass.2009.4919639
dblp:conf/ispass/KulkarniBCP09
fatcat:y27s645zdzftxdqap4nmbu5efi
Automatically Tuning Parallel and Parallelized Programs
[chapter]
2010
Lecture Notes in Computer Science
We evaluated our algorithm on a suite of hand-parallelized C benchmarks from the SPEC OMP2001 and NAS Parallel suites and provide two sets of results. ...
sections of code for the best possible parallel performance, both of which are difficult and time-consuming. ...
A subset of benchmarks from the NAS Parallel benchmark suite and the SPEC OMP2001 suite was considered for evaluation. ...
doi:10.1007/978-3-642-13374-9_9
fatcat:azmmpsehifaxbevo3fcxmq2t5u
Redefining the Role of the CPU in the Era of CPU-GPU Integration
2012
IEEE Micro
GPU computing has emerged as a viable alternative to CPUs for throughput-oriented applications or regions of code. Speedups of 10× to 100× over CPU implementations have been reported. ...
much performance-critical. ...
Acknowledgment This work was funded in part by NSF grant CCF-1018356 and a grant from AMD. ...
doi:10.1109/mm.2012.57
fatcat:uumc4bnsrrcsxeqometp75kf6y
Smart, adaptive mapping of parallelism in the presence of external workload
2013
Proceedings of the 2013 IEEE/ACM International Symposium on Code Generation and Optimization (CGO)
Algorithm computing the pagerank of each node, i.e. its importance.
Benchmark suite | Program | Input size | Notes
SpecOMP | 320.equake | medium | Finite element simulation; earthquake modelling
SpecOMP | 330 ...
Benchmarks: In total, 11 benchmark programs were chosen to evaluate this approach. These include all OpenMP-based C programs from the NAS and SpecOMP benchmark suites, as listed in Appendix A. ...
doi:10.1109/cgo.2013.6495010
dblp:conf/cgo/EmaniWO13
fatcat:ebxa4mr5qzcp5ifbiix23eyc4u
Towards architecture independent metrics for multicore performance analysis
2011
Performance Evaluation Review
An understanding of locality effects and communication behavior can provide programmers with valuable information about performance bottlenecks and opportunities for optimization. ...
Such metrics will allow a program's performance to be analyzed across a range of architectures without incurring the overhead of repeated profiling and analysis. ...
Table 1 shows the accuracy of the sampled analysis compared to the full analysis for a selection of benchmarks from the NAS, SpecOMP and Parsec suites. ...
doi:10.1145/1925019.1925022
fatcat:yey6efndwnbhjg4tbxjolmuiue
Showing results 1 — 15 out of 74 results