Hurry-up: Scaling Web Search on Big/Little Multi-core Architectures [article]

Rajiv Nishtala and Xavier Martorell Norwegian University of Science and Technology, Barcelona Supercomputing center)
2019 arXiv   pre-print
Heterogeneous multi-core systems such as big/little architectures have been introduced as an attractive server design option with the potential to improve performance under power constraints in data centres. Since both big high-performing and little power-efficient cores can run on the same system sharing the workload processing, thread mapping/scheduling turns out to be much more challenging. This is particularly hard when considering the different trade-offs shaped by the heterogeneous cores
more » ... n the quality-of-service (expressed as tail latency) experienced by user-facing applications, such as Web Search. In this work, we present Hurry-up, a runtime thread mapping solution designed to select individual requests to run on the most appropriate heterogeneous cores to improve tail latency. Hurry-up accelerates compute-intensive requests on big cores, while letting less intensive threads to execute on little cores. We implement and deploy Hurry-up on a real 64-bit big/little architecture (ARM Juno), and show that, compared to a conservative policy on Linux, Hurry-up reduces the server tail latency by 39.5% (mean).
arXiv:1912.09844v1 fatcat:way7n4tdkfe6hlmrdn2mxecaym