Model-driven Cluster Resource Management for AI Workloads in Edge Clouds
[article] arXiv pre-print, 2022
Since emerging edge applications such as Internet of Things (IoT) analytics and augmented reality have tight latency constraints, hardware AI accelerators have recently been proposed to speed up the deep neural network (DNN) inference these applications run. Resource-constrained edge servers and accelerators tend to be multiplexed across multiple IoT applications, introducing the potential for performance interference between latency-sensitive workloads. In this paper, we design analytic models
arXiv:2201.07312v1