A copy of this work was available on the public web and has been preserved in the Wayback Machine. The capture dates from 2021.
The file type is application/pdf.
Characterization and Prediction of Deep Learning Workloads in Large-Scale GPU Datacenters
[article]
2021
arXiv
pre-print
Modern GPU datacenters are critical for delivering Deep Learning (DL) models and services in both the research community and industry. When operating a datacenter, optimizing resource scheduling and management can bring significant financial benefits. Achieving this goal requires a deep understanding of job features and user behaviors. We present a comprehensive study of the characteristics of DL jobs and resource management. First, we perform a large-scale analysis of real-world
arXiv:2109.01313v1
fatcat:izw77evef5fpzb2ent3u6adyca