11,629 Hits in 7.1 sec

A framework to analyze processor architectures for next-generation on-board space computing

Tyler M. Lovelly, Donavon Bryan, Kevin Cheng, Rachel Kreynin, Alan D. George, Ann Gordon-Ross, Gabriel Mounce
2014 2014 IEEE Aerospace Conference  
We demonstrate the ability of the framework to generate data for various architectures in terms of performance and power, and analyze this data for initial insights into the effects of processor architectures  ...  performance and mission capabilities.  ...  ACKNOWLEDGMENTS This work was supported in part by the I/UCRC Program of the National Science Foundation under Grant Nos. EEC-0642422 and IIP-1161022.  ... 
doi:10.1109/aero.2014.6836387 fatcat:pdehrlytgra6rkdwm4hwdicdwa

Retrospective: the Cedar system

A. Veidenbaum, P.-C. Yew, D. J. Kuck, C. D. Polychronopoulos, D. H. Padua, E. S. Davidson, K. Gallivan
1998 25 years of the international symposia on Computer architecture (selected papers) - ISCA '98  
A two-phase approach was advocated: the construction of a 32-processor prototype followed by a production system with thousands of processors.  ...  We felt that major advances in the state of hardware technology, architecture, compilers, and parallel algorithms made such a demonstration possible.  ...  Cedar performance was carefully evaluated using these and other benchmark applications/algorithms.  ... 
doi:10.1145/285930.285965 dblp:conf/isca/VeidenbaumYKPPDG98 fatcat:obv6ipwmffbh7e27pzrhkzaza4

Parallelized benchmark-driven performance evaluation of SMPs and tiled multi-core architectures for embedded systems

Arslan Munir, Ann Gordon-Ross, Sanjay Ranka
2012 2012 IEEE 31st International Performance Computing and Communications Conference (IPCCC)  
We base our evaluation on a parallelized information fusion application and benchmarks that are used as building blocks in applications for SMPs and TMAs.  ...  We compare and analyze the performance of an Intel-based SMP and Tilera's TILEPro64 TMA based on our parallelized benchmarks for the following performance metrics: runtime, speedup, efficiency, cost, scalability  ...  CONCLUSIONS In this paper, we compared the performance of symmetric multiprocessors (SMPs) and tiled multi-core architectures (TMAs) (focusing on the TILEPro64) based on a parallelized information fusion  ... 
doi:10.1109/pccc.2012.6407785 dblp:conf/ipccc/MunirGR12 fatcat:ej24xs7gvrhdbffctjdy3ktmcq

Auto-generation of communication benchmark traces

Vivek Deshpande, Xing Wu, Frank Mueller
2012 Performance Evaluation Review  
Experimental results demonstrate that generated source code of benchmarks preserves both the communication patterns and the run-time behavior of the original application.  ...  Benchmarks are essential for evaluating HPC hardware and software for petascale machines and beyond. But benchmark creation is a tedious manual process.  ...  INTRODUCTION Benchmarks are widely used for evaluating and analyzing system performance and assessing migration costs of HPC * This work was supported in part by NSF grants 1058779, 0958311, 0937908.  ... 
doi:10.1145/2381056.2381078 fatcat:woeqxffb65bhpptdgl3r5jlufy

Advanced Virtualization Techniques for High Performance Cloud Cyberinfrastructure

Andrew J. Younge, Geoffrey C. Fox
2014 2014 14th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing  
Upon evaluating these newfound features and leveraging the system within the OpenStack environment, we illustrate that cloud computing can perform at near-native speeds and support a broad range of scientific  ...  However, there is still a notable gap that exists between the performance of IaaS when compared to typical high performance computing (HPC) resources, limiting the applicability of IaaS for many potential  ...  The SHOC benchmarks were chosen because they provide a higher level of evaluation regarding GPU performance than the sample applications provided in the Nvidia SDK, and can also evaluate OpenCL performance  ... 
doi:10.1109/ccgrid.2014.93 dblp:conf/ccgrid/YoungeF14 fatcat:hntwcefmcve4fm4hkraduxmao4

Auto-generation of communication benchmark traces

Vivek Deshpande, Xing Wu, Frank Mueller
2011 Proceedings of the second international workshop on Performance modeling, benchmarking and simulation of high performance computing systems - PMBS '11  
Experimental results demonstrate that generated source code of benchmarks preserves both the communication patterns and the run-time behavior of the original application.  ...  Benchmarks are essential for evaluating HPC hardware and software for petascale machines and beyond. But benchmark creation is a tedious manual process.  ...  INTRODUCTION Benchmarks are widely used for evaluating and analyzing system performance and assessing migration costs of HPC * This work was supported in part by NSF grants 1058779, 0958311, 0937908.  ... 
doi:10.1145/2088457.2088468 fatcat:g2bd26wk4zc4fbli55kq7zm2w4

Tools for Simulation and Benchmark Generation at Exascale [chapter]

Mahesh Lagadapati, Frank Mueller, Christian Engelmann
2014 Tools for High Performance Computing 2013  
Simulations using models of future HPC systems and communication traces from applications running on existing HPC systems can offer an insight into the performance of future architectures.  ...  Investigating the performance of parallel applications at scale on future architectures and the performance impact of different architecture choices is an important component of HPC hardware/software co-design  ...  Acknowledgements This work was supported in part by NSF grants 1217748, 0937908 and 0958311, as well as a subcontract from ORNL.  ... 
doi:10.1007/978-3-319-08144-1_2 dblp:conf/ptw/LagadapatiME13 fatcat:4la4dopjofex5dggcsadzfxzke

How Good Are Low-Power 64-Bit SoCs for Server-Class Workloads?

Reza Azimi, Xin Zhan, Sherief Reda
2015 2015 IEEE International Symposium on Workload Characterization  
In this paper we thoroughly evaluate the performance and energy efficiency of two 64-bit eight-core ARM and x86 SoCs on a number of parallel scale-out benchmarks and high-performance computing benchmarks  ...  We characterize the workloads on these servers and elaborate the impact of the SoC architecture, memory hierarchy, and system design on the performance and energy efficiency outcomes.  ...  Acknowledgment: This project is supported by NSF grant 1305148.  ... 
doi:10.1109/iiswc.2015.21 dblp:conf/iiswc/AzimiZR15 fatcat:b2cgcvcjjjcbddhsd3vcbyqsai

Integrating multiple forms of multithreaded execution on multi-SMT systems: a study with scientific applications

M. Curtis-Maury, Tanping Wang, C. Antonopoulos, D. Nikolopoulos
2005 Second International Conference on the Quantitative Evaluation of Systems (QEST'05)  
Most scientific applications have high degrees of parallelism and thread-level parallel execution appears to be a natural choice for executing these applications on systems composed of SMT processors.  ...  We show, through a rigorous evaluation with hardware monitoring counters on a real multi-SMT system, that in traditionally scalable parallel applications conflicting resource requirements are - due to the  ...  Acknowledgements This work is supported by an NSF ITR grant (ACI-0312980), an NSF CAREER award (CCF-0346867) and the College of William and Mary.  ... 
doi:10.1109/qest.2005.16 dblp:conf/qest/Curtis-MauryW05 fatcat:tgr5h2c7xrbd7f6txgmzygusui

An Evaluation of OpenMP on Current and Emerging Multithreaded/Multicore Processors [chapter]

Matthew Curtis-Maury, Xiaoning Ding, Christos D. Antonopoulos, Dimitrios S. Nikolopoulos
2008 Lecture Notes in Computer Science  
In this paper we evaluate the performance of OpenMP applications on these two parallel architectures. We use detailed hardware metrics to identify architectural bottlenecks.  ...  We evaluate an adaptive, run-time mechanism which provides limited performance improvements on SMTs, however the inherent bottlenecks remain difficult to overcome.  ...  Acknowledgements This work is supported by an NSF ITR grant (ACI-0312980), an NSF CAREER award (CCF-0346867) and the College of William and Mary.  ... 
doi:10.1007/978-3-540-68555-5_11 fatcat:5sdth4krs5b3hhnsht24ymz5k4

POSTER

Harshitha Menon, Kavitha Chandrasekar, Laxmikant V. Kale
2017 Proceedings of the 22nd ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming - PPoPP '17  
Many HPC applications require dynamic load balancing to achieve high performance and system utilization.  ...  It employs randomized decision forests, a machine learning method, to learn a model for choosing the best load balancing strategy for an application represented by a set of features that capture the application  ...  Acknowledgments This research was supported in part by NCSA PAID -LB (NSF OCI 07-25070) NSF ChaNGa (AST 13-12913) and NSF OpenAtom (ACI 13-39715).  ... 
doi:10.1145/3018743.3019033 fatcat:yddn45un4fenfldud7ko6lzf6q

Characterization of essential dynamic instructions

Steven S. Lumetta, Sanjay J. Patel
2003 Performance Evaluation Review  
Using this approach, we characterize the streams of the SPEC2000 integer benchmarks compiled for the Alpha ISA on the OSF operating system.  ...  eliminating their impact on performance entirely.  ... 
doi:10.1145/885651.781071 fatcat:rcdfjjznuzfivnf3wahfbvictm

SPEC HPC2002: The Next High-Performance Computer Benchmark [chapter]

Rudolf Eigenmann, Greg Gaertner, Wesley Jones, Hideki Saito, Brian Whitney
2002 Lecture Notes in Computer Science  
SPEC High-Performance Group The High-Performance Group of the Standard Performance Evaluation Corporation (SPEC/HPG) [1] is developing a next release of its high-performance computer benchmark suite.  ...  Like SPECseis, SPECchem is often used to exhibit performance of high-performance systems among the computer vendors. Portions of SPECchem codes date back to 1984.  ... 
doi:10.1007/3-540-47847-7_3 fatcat:4l6dw6h4bndkzgkfsmcfkqtx6y

Studying Effects of Meltdown and Spectre Patches on the Performance of HPC Application Using Application Kernel Module of XDMoD

Nikolay A. Simakov, Martins D. Innus, Matthew D. Jones, Ohad Katz, Joseph P. White, Ryan Rathsam, Steven M. Gallo, Robert L. DeLeon, Thomas R. Furlani
2018 Zenodo  
The application kernel module is designed for continuous performance monitoring of HPC systems.  ...  To study this we use the application kernel module of XDMoD to test the performance before and after the application of the vulnerability patches.  ...  RESULTS AND DISCUSSION IOR and MDTest benchmarks measure the performance of the file system.  ... 
doi:10.5281/zenodo.3552962 fatcat:b2xzqcf7lrgixd2yfm3w4gblbu

Reducing the Energy Cost of Irregular Code Bases in Soft Processor Systems

Manish Arora, Jack Sampson, Nathan Goulding-Hotta, Jonathan Babb, Ganesh Venkatesh, Michael Bedford Taylor, Steven Swanson
2011 2011 IEEE 19th Annual International Symposium on Field-Programmable Custom Computing Machines  
Whereas accelerator approaches have traditionally achieved energy benefits as a side effect from increasing performance via parallel execution, ICERs aim to achieve energy gains even on code with little  ...  In contrast, because the ICER approach targets energy rather than performance, it easily scales to large, irregular applications that are poor candidates for traditional acceleration.  ...  We would also like to thank Adrian Caulfield for help with the b-tree benchmark.  ... 
doi:10.1109/fccm.2011.45 dblp:conf/fccm/AroraSGBVTS11 fatcat:xxbwbena4ze75icihoqxq6j4vm