Detailed analysis of I/O traces for large scale applications

N. Nakka, A. Choudhary, W. K. Liao, L. Ward, R. Klundt, M. I. Weston
2009 International Conference on High Performance Computing (HiPC)
In this paper, we present a tool to extract I/O traces from very large applications running at full scale during their production runs. We analyze the traces of three applications to gain information about them. The analysis shows that I/O traces reveal much information about an application even without access to its source code. In particular, these I/O traces provide multiple indications of the algorithmic nature of the application through the changes in data amount and I/O request distribution observed at the checkpoints. Adaptive Mesh Refinement (AMR) is one kind of algorithm that can exhibit such I/O behavior. This is the first study of the I/O characteristics of unbalanced AMR-supported applications at scale. The key observations that we made in the traces were (1) Variation in aggregate data sizes across checkpoints for AMR and non-AMR applications, (2) Variation in the number of I/O calls by a client depending on the nature of the application, (3) Use of temporary files by applications and possibly erroneous calls to I/O functions, (4) Variation in average data transfer size depending on whether the application has AMR support, (5) Aggregation of I/O for processes executing on a single physical node through MPI-IO calls, and (6) Updates to specific data structures in the checkpoint file.
Keywords: Large scale I/O tracing, I/O trace analysis, adaptive mesh refinement
978-1-4244-4921-7/09/$25.00 ©2009 IEEE
doi:10.1109/hipc.2009.5433186 dblp:conf/hipc/NakkaCLWKW09 fatcat:ihzii6aexvar3ioxjpqffldf6e