A copy of this work was available on the public web and has been preserved in the Wayback Machine. The capture dates from 2020; you can also visit the original URL.
Moreover, Willump complements these statistical optimizations with compiler optimizations to automatically generate fast inference code for ML applications. ... We show that Willump improves the end-to-end performance of real-world ML inference pipelines curated from major data science competitions by up to 16x without statistically significant loss of accuracy ... Toyota Research Institute ("TRI") provided funds to assist the authors with their research but this article solely reflects the opinions and conclusions of its authors and not TRI or any other Toyota entity ... arXiv:1906.01974v3 fatcat:yv2j4corjzeurj4hpifo5zex4a
In this paper, we present a cost-based approach for the automatic selection and allocation of a disjoint ensemble of black-box predictors to answer predictive spatio-temporal queries. ... To the best of our knowledge, this is the first work to solve the problem of optimizing the allocation of black-box models to answer predictive spatio-temporal queries. ... Assuming ML models are used in higher-level end-to-end user queries in an ML application (compute the top-K predictions for a recommendation model) a query-aware adaptive parallelization. ... arXiv:2005.11093v3 fatcat:ddnsbgpth5fm5inh5uophvo6j4