Paravirtualization for HPC Systems [report]

L Youseff, R Wolski, B Gorda, C Krintz
2006 unpublished
In this work, we investigate the efficacy of using paravirtualizing software for performance-critical HPC kernels and applications. We present a comprehensive performance evaluation of Xen, a low-overhead, Linux-based virtual machine monitor (VMM), for paravirtualization of HPC cluster systems at LLNL. We investigate subsystem and overall performance using a wide range of benchmarks and applications. We employ statistically sound methods to compare the performance of a paravirtualized kernel against three Linux operating systems: Red Hat Enterprise Linux 4 (RHEL4) for build versions 2.6.9 and 2.6.12, and the LLNL CHAOS kernel. Our results indicate that Xen is very efficient and practical for HPC systems. This work is sponsored in part by LLNL and NSF (CNS-0546737 and ST-HEC-0444412).
Despite the potential benefits, performance advances, and recent research indicating its potential [22, 44, 15, 19], virtualization is currently not used in high-performance computing (HPC) environments. One reason for this is the perception that the remaining overhead that VMMs introduce is unacceptable for performance-critical applications and systems. The goal of our work is to evaluate empirically, and to quantify, the degree to which this perception is true for Linux and Xen. Xen is an open-source VMM for the Linux OS that is reported to provide low-overhead, efficient execution of Linux [40]. Linux itself is the current operating system of choice when building and deploying computational clusters composed of commodity components.
In this work, we study the performance impact of Xen using current HPC commodity hardware at Lawrence Livermore National Laboratory (LLNL). Xen is an ideal candidate VMM for an HPC setting given its large-scale development efforts [28, 42] and its availability, performance focus, and evolution for a wide range of platforms. We objectively compare the performance of benchmarks and applications using a Xen-based Linux system against three Linux OS versions and configurations currently in use for HPC application execution at LLNL and other supercomputing sites. The Linux versions include RHEL4 for build versions 2.6.9 and 2.6.12, and the LLNL CHAOS kernel, a specialized version of RHEL4 version 2.6.9.
We collect performance data using micro- and macro-benchmarks from the HPC Challenge, LLNL ASCI Purple, and NAS parallel benchmark suites, among others, as well as a large-scale HPC application for simulation of oceanographic and climatologic phenomena. Using micro-benchmarks, we evaluate machine memory and disk I/O performance, while our experiments using the macro-benchmarks and HPC applications assess full system performance. We find that the Xen paravirtualization system, in general, does not introduce significant overhead over the other OS configurations that we study, including one specialized for the HPC cluster we investigate. There is one case for which Xen overhead is significant: random disk I/O. Curiously, in a small number of other cases, Xen improves subsystem or full system performance over various other kernels due to its implementation for efficient interaction between the guest and host OS. Overall, we find that Xen does not impose an onerous performance penalty for a wide range of HPC program behaviors and applications. As a result, we believe the flexibility and potential for enhanced security that Xen offers make it useful in a commodity HPC context.
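The abstract mentions statistically sound comparison of a paravirtualized kernel against native kernels but does not name the method used. As an illustration only, the following minimal Python sketch shows one common way such a comparison could be made: Welch's unequal-variance t statistic over repeated benchmark timings, plus a relative-overhead figure. The choice of Welch's test, the function names `welch_t` and `relative_overhead`, and the sample timings are all assumptions for this sketch; none of them come from the report.

```python
import math
import statistics


def welch_t(sample_a, sample_b):
    """Return (t statistic, approximate degrees of freedom) for
    Welch's unequal-variance t-test on two samples of timings."""
    mean_a = statistics.mean(sample_a)
    mean_b = statistics.mean(sample_b)
    # Squared standard error of each sample mean (sample variance / n).
    se2_a = statistics.variance(sample_a) / len(sample_a)
    se2_b = statistics.variance(sample_b) / len(sample_b)
    t = (mean_a - mean_b) / math.sqrt(se2_a + se2_b)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (len(sample_a) - 1) + se2_b ** 2 / (len(sample_b) - 1)
    )
    return t, df


def relative_overhead(baseline, candidate):
    """Fractional slowdown of candidate versus baseline mean run time."""
    base_mean = statistics.mean(baseline)
    return (statistics.mean(candidate) - base_mean) / base_mean


if __name__ == "__main__":
    # Placeholder timings in seconds, invented for illustration;
    # these are NOT measurements from the report.
    native_runs = [10.1, 10.3, 9.9, 10.2, 10.0]
    xen_runs = [10.2, 10.4, 10.1, 10.3, 10.2]
    t, df = welch_t(xen_runs, native_runs)
    print(f"t = {t:.3f}, df = {df:.1f}")
    print(f"relative overhead = {relative_overhead(native_runs, xen_runs):.2%}")
```

To decide whether a difference is significant, the resulting t statistic would be compared against the critical value of the t distribution at the computed degrees of freedom for a chosen confidence level.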
doi:10.2172/894791 fatcat:pqx2oacay5h2pjdov7aag7rtme