BARISTA: Efficient and Scalable Serverless Serving System for Deep Learning Prediction Services
[article]
2019
arXiv
pre-print
Pre-trained deep learning models are increasingly used to offer a variety of compute-intensive predictive analytics services, such as fitness tracking, speech recognition, and image recognition. The stateless and highly parallelizable nature of deep learning models makes them well suited to the serverless computing paradigm. However, making effective resource management decisions for these services is a hard problem due to the dynamic workloads and the diverse set of available resource configurations that
arXiv:1904.01576v2
fatcat:jzhpvqzdsbanzeqhdnwizj6f6q