What Supercomputers Say: A Study of Five System Logs
2007
37th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN'07)
If we hope to automatically detect and diagnose failures in large-scale computer systems, we must study real deployed systems and the data they generate. Progress has been hampered by the inaccessibility of empirical data. This paper addresses that dearth by examining system logs from five supercomputers, with the aim of providing useful insight and direction for future research into the use of such logs. We present details about the systems, methods of log collection, and how alerts were
doi:10.1109/dsn.2007.103
dblp:conf/dsn/OlinerS07
fatcat:oxkhsbvbt5bk3axa6pgs6acg3a