A framework to analyze processor architectures for next-generation on-board space computing
2014
2014 IEEE Aerospace Conference
We demonstrate the ability of the framework to generate data for various architectures in terms of performance and power, and analyze this data for initial insights into the effects of processor architectures ...
performance and mission capabilities. ...
ACKNOWLEDGMENTS This work was supported in part by the I/UCRC Program of the National Science Foundation under Grant Nos. EEC-0642422 and IIP-1161022. ...
doi:10.1109/aero.2014.6836387
fatcat:pdehrlytgra6rkdwm4hwdicdwa
Retrospective: the Cedar system
1998
25 years of the international symposia on Computer architecture (selected papers) - ISCA '98
A two-phase approach was advocated: the construction of a 32-processor prototype followed by a production system with thousands of processors. ...
We felt that major advances in the state of hardware technology, architecture, compilers, and parallel algorithms made such a demonstration possible. ...
Cedar performance was carefully evaluated using these and other benchmark applications/algorithms. ...
doi:10.1145/285930.285965
dblp:conf/isca/VeidenbaumYKPPDG98
fatcat:obv6ipwmffbh7e27pzrhkzaza4
Parallelized benchmark-driven performance evaluation of SMPs and tiled multi-core architectures for embedded systems
2012
2012 IEEE 31st International Performance Computing and Communications Conference (IPCCC)
We base our evaluation on a parallelized information fusion application and benchmarks that are used as building blocks in applications for SMPs and TMAs. ...
We compare and analyze the performance of an Intel-based SMP and Tilera's TILEPro64 TMA based on our parallelized benchmarks for the following performance metrics: runtime, speedup, efficiency, cost, scalability ...
CONCLUSIONS In this paper, we compared the performance of symmetric multiprocessors (SMPs) and tiled multi-core architectures (TMAs) (focusing on the TILEPro64) based on a parallelized information fusion ...
doi:10.1109/pccc.2012.6407785
dblp:conf/ipccc/MunirGR12
fatcat:ej24xs7gvrhdbffctjdy3ktmcq
Auto-generation of communication benchmark traces
2012
Performance Evaluation Review
Experimental results demonstrate that generated source code of benchmarks preserves both the communication patterns and the run-time behavior of the original application. ...
Benchmarks are essential for evaluating HPC hardware and software for petascale machines and beyond. But benchmark creation is a tedious manual process. ...
INTRODUCTION Benchmarks are widely used for evaluating and analyzing system performance and assessing migration costs of HPC * This work was supported in part by NSF grants 1058779, 0958311, 0937908. ...
doi:10.1145/2381056.2381078
fatcat:woeqxffb65bhpptdgl3r5jlufy
Advanced Virtualization Techniques for High Performance Cloud Cyberinfrastructure
2014
2014 14th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing
Upon evaluating these newfound features and leveraging the system within the OpenStack environment, we illustrate that cloud computing can perform at near-native speeds and support a broad range of scientific ...
However, there is still a notable gap that exists between the performance of IaaS when compared to typical high performance computing (HPC) resources, limiting the applicability of IaaS for many potential ...
The SHOC benchmarks were chosen because they provide a higher level of evaluation regarding GPU performance than the sample applications provided in the Nvidia SDK, and can also evaluate OpenCL performance ...
doi:10.1109/ccgrid.2014.93
dblp:conf/ccgrid/YoungeF14
fatcat:hntwcefmcve4fm4hkraduxmao4
Auto-generation of communication benchmark traces
2011
Proceedings of the second international workshop on Performance modeling, benchmarking and simulation of high performance computing systems - PMBS '11
Experimental results demonstrate that generated source code of benchmarks preserves both the communication patterns and the run-time behavior of the original application. ...
Benchmarks are essential for evaluating HPC hardware and software for petascale machines and beyond. But benchmark creation is a tedious manual process. ...
INTRODUCTION Benchmarks are widely used for evaluating and analyzing system performance and assessing migration costs of HPC * This work was supported in part by NSF grants 1058779, 0958311, 0937908. ...
doi:10.1145/2088457.2088468
fatcat:g2bd26wk4zc4fbli55kq7zm2w4
Tools for Simulation and Benchmark Generation at Exascale
[chapter]
2014
Tools for High Performance Computing 2013
Simulations using models of future HPC systems and communication traces from applications running on existing HPC systems can offer an insight into the performance of future architectures. ...
Investigating the performance of parallel applications at scale on future architectures and the performance impact of different architecture choices is an important component of HPC hardware/software co-design ...
Acknowledgements This work was supported in part by NSF grants 1217748, 0937908 and 0958311, as well as a subcontract from ORNL. ...
doi:10.1007/978-3-319-08144-1_2
dblp:conf/ptw/LagadapatiME13
fatcat:4la4dopjofex5dggcsadzfxzke
How Good Are Low-Power 64-Bit SoCs for Server-Class Workloads?
2015
2015 IEEE International Symposium on Workload Characterization
In this paper we thoroughly evaluate the performance and energy efficiency of two 64-bit eight-core ARM and x86 SoCs on a number of parallel scale-out benchmarks and high-performance computing benchmarks ...
We characterize the workloads on these servers and elaborate the impact of the SoC architecture, memory hierarchy, and system design on the performance and energy efficiency outcomes. ...
Acknowledgment: This project is supported by NSF grant 1305148. ...
doi:10.1109/iiswc.2015.21
dblp:conf/iiswc/AzimiZR15
fatcat:b2cgcvcjjjcbddhsd3vcbyqsai
Integrating multiple forms of multithreaded execution on multi-SMT systems: a study with scientific applications
2005
Second International Conference on the Quantitative Evaluation of Systems (QEST'05)
Most scientific applications have high degrees of parallelism and thread-level parallel execution appears to be a natural choice for executing these applications on systems composed of SMT processors. ...
We show, through a rigorous evaluation with hardware monitoring counters on a real multi-SMT system, that in traditionally scalable parallel applications conflicting resource requirements are - due to the ...
Acknowledgements This work is supported by an NSF ITR grant (ACI-0312980), an NSF CAREER award (CCF-0346867) and the College of William and Mary. ...
doi:10.1109/qest.2005.16
dblp:conf/qest/Curtis-MauryW05
fatcat:tgr5h2c7xrbd7f6txgmzygusui
An Evaluation of OpenMP on Current and Emerging Multithreaded/Multicore Processors
[chapter]
2008
Lecture Notes in Computer Science
In this paper we evaluate the performance of OpenMP applications on these two parallel architectures. We use detailed hardware metrics to identify architectural bottlenecks. ...
We evaluate an adaptive, run-time mechanism which provides limited performance improvements on SMTs, however the inherent bottlenecks remain difficult to overcome. ...
Acknowledgements This work is supported by an NSF ITR grant (ACI-0312980), an NSF CAREER award (CCF-0346867) and the College of William and Mary. ...
doi:10.1007/978-3-540-68555-5_11
fatcat:5sdth4krs5b3hhnsht24ymz5k4
POSTER
2017
Proceedings of the 22nd ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming - PPoPP '17
Many HPC applications require dynamic load balancing to achieve high performance and system utilization. ...
It employs randomized decision forests, a machine learning method, to learn a model for choosing the best load balancing strategy for an application represented by a set of features that capture the application ...
Acknowledgments This research was supported in part by NCSA PAID -LB (NSF OCI 07-25070) NSF ChaNGa (AST 13-12913) and NSF OpenAtom (ACI 13-39715). ...
doi:10.1145/3018743.3019033
fatcat:yddn45un4fenfldud7ko6lzf6q
Characterization of essential dynamic instructions
2003
Performance Evaluation Review
Using this approach, we characterize the streams of the SPEC2000 integer benchmarks compiled for the Alpha ISA on the OSF operating system. ...
eliminating their impact on performance entirely. ...
doi:10.1145/885651.781071
fatcat:rcdfjjznuzfivnf3wahfbvictm
SPEC HPC2002: The Next High-Performance Computer Benchmark
[chapter]
2002
Lecture Notes in Computer Science
SPEC High-Performance Group The High-Performance Group of the Standard Performance Evaluation Corporation (SPEC/HPG) [1] is developing a next release of its high-performance computer benchmark suite. ...
Like SPECseis, SPECchem is often used to exhibit performance of high-performance systems among the computer vendors. Portions of SPECchem codes date back to 1984. ...
doi:10.1007/3-540-47847-7_3
fatcat:4l6dw6h4bndkzgkfsmcfkqtx6y
Studying Effects of Meltdown and Spectre Patches on the Performance of HPC Application Using Application Kernel Module of XDMoD
2018
Zenodo
The application kernel module is designed for continuous performance monitoring of HPC systems. ...
To study this we use the application kernel module of XDMoD to test the performance before and after the application of the vulnerability patches. ...
RESULTS AND DISCUSSION IOR and MDTest benchmarks measure the performance of the file system. ...
doi:10.5281/zenodo.3552962
fatcat:b2xzqcf7lrgixd2yfm3w4gblbu
Reducing the Energy Cost of Irregular Code Bases in Soft Processor Systems
2011
2011 IEEE 19th Annual International Symposium on Field-Programmable Custom Computing Machines
Whereas accelerator approaches have traditionally achieved energy benefits as a side effect from increasing performance via parallel execution, ICERs aim to achieve energy gains even on code with little ...
In contrast, because the ICER approach targets energy rather than performance, it easily scales to large, irregular applications that are poor candidates for traditional acceleration. ...
We would also like to thank Adrian Caulfield for help with the b-tree benchmark. ...
doi:10.1109/fccm.2011.45
dblp:conf/fccm/AroraSGBVTS11
fatcat:xxbwbena4ze75icihoqxq6j4vm