An early prototype of an autonomic performance environment for exascale

Kevin Huck, Sameer Shende, Allen Malony, Hartmut Kaiser, Allan Porterfield, Rob Fowler, Ron Brightwell
2013 Proceedings of the 3rd International Workshop on Runtime and Operating Systems for Supercomputers - ROSS '13  
Extreme-scale computing requires a new perspective on the role of performance observation in the exascale system software stack. Because of the anticipated high concurrency and dynamic operation in these systems, it is no longer reasonable to expect that a post-mortem performance measurement and analysis methodology will suffice. Rather, there is a strong need for performance observation that merges first- and third-person observation, in situ analysis, and introspection across stack layers that serves online dynamic feedback and adaptation. In this paper we describe the DOE-funded XPRESS project and the role of autonomic performance support in exascale systems. XPRESS will build an integrated exascale software stack (called OpenX) that supports the ParalleX execution model and is targeted towards future exascale platforms. An initial version of an autonomic performance environment called APEX has been developed for OpenX using the current TAU performance technology, and results are presented that highlight the challenges of highly integrative observation and runtime analysis.
doi:10.1145/2491661.2481434 dblp:conf/ics/HuckSMKPFB13 fatcat:qum63pv3dfaw7ksinw4ujqm6j4