GScheduler: Optimizing resource provision by using GPU usage pattern extraction in cloud environments

Zhuqing Xu, Fang Dong, Jiahui Jin, Junzhou Luo, Jun Shen
2017 IEEE International Conference on Systems, Man, and Cybernetics (SMC)
GPU-based clusters are widely chosen for accelerating a variety of scientific applications in high-end cloud environments. With their growing popularity, it becomes necessary to improve system throughput and decrease turnaround time for applications co-executing on the same GPU device. However, resource contention among multiple applications on a multi-tasked GPU degrades application performance. Previous works either cannot learn the characteristics of a GPU application accurately enough before execution or cannot obtain such information in a timely manner, which may lead to misleading scheduling decisions. In this paper, we present GScheduler, a framework to detect and reduce interference for co-executing applications on the GPU-based cloud. The most important feature of GScheduler is its GPU usage pattern extractor, which is used to detect interference between applications. The extractor is composed of a key function-call graph extractor and a key GPU resource usage vector extractor: the former detects the similarity of GPU usage mode between applications, while the latter calculates the similarity of their GPU resource requirements. In addition, an interference-aware scheduler is proposed to minimize the interference. We evaluated our framework with 26 diverse, real-world CUDA applications. Compared with state-of-the-art interference-oblivious schedulers, our framework improves system throughput by 36% on average and reduces turnaround time by 30.5% on average.
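The idea of comparing GPU resource usage vectors and co-locating applications that interfere least can be illustrated with a minimal sketch. The abstract does not specify the similarity metric, the vector contents, or the scheduling policy, so everything here is an assumption: cosine similarity stands in for the similarity measure, the three vector components and their values are hypothetical, and the greedy pairing (co-run the least similar applications first) is an illustrative policy, not GScheduler's actual algorithm.

```python
import math
from itertools import combinations


def cosine_similarity(u, v):
    """Cosine similarity of two equal-length usage vectors (assumed metric)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # an all-zero vector is treated as dissimilar to everything
    return dot / (norm_u * norm_v)


def pair_least_similar(usage):
    """Greedily pair applications whose usage vectors are least similar.

    Assumption: applications with similar resource requirements contend for
    the same GPU resources, so dissimilar applications are co-located first.
    """
    candidates = sorted(
        combinations(sorted(usage), 2),
        key=lambda pair: cosine_similarity(usage[pair[0]], usage[pair[1]]),
    )
    placed, schedule = set(), []
    for a, b in candidates:
        if a not in placed and b not in placed:
            schedule.append((a, b))
            placed.update((a, b))
    return schedule


# Hypothetical vectors: (compute, memory bandwidth, PCIe transfer) utilization.
usage = {
    "app_a": [0.9, 0.1, 0.2],
    "app_b": [0.8, 0.2, 0.1],
    "app_c": [0.1, 0.9, 0.3],
    "app_d": [0.2, 0.8, 0.4],
}

schedule = pair_least_similar(usage)
print(schedule)
```

With these illustrative values, the two compute-heavy applications are kept apart and each is paired with a memory-heavy one. A real scheduler would also need to account for the function-call graph similarity that the abstract mentions, which this sketch leaves out.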
doi:10.1109/smc.2017.8123125 dblp:conf/smc/XuDJLS17 fatcat:kldrfjlt3zbohauqepoeracsna