Shuffling, Fast and Slow: Scalable Analytics on Serverless Infrastructure

Qifan Pu, Shivaram Venkataraman, Ion Stoica
2019 Symposium on Networked Systems Design and Implementation  
Serverless computing is poised to fulfill the long-held promise of transparent elasticity and millisecond-level pricing. To achieve this goal, service providers impose a fine-grained computational model where every function has a maximum duration, a fixed amount of memory and no persistent local storage. We observe that the fine-grained elasticity of serverless is key to achieve high utilization for general computations such as analytics workloads, but that resource limits make it challenging to implement such applications as they need to move large amounts of data between functions that don't overlap in time. In this paper, we present Locus, a serverless analytics system that judiciously combines (1) cheap but slow storage with (2) fast but expensive storage, to achieve good performance while remaining cost-efficient. Locus applies a performance model to guide users in selecting the type and the amount of storage to achieve the desired cost-performance trade-off. We evaluate Locus on a number of analytics applications including TPC-DS, CloudSort, Big Data Benchmark and show that Locus can navigate the cost-performance trade-off, leading to 4×-500× performance improvements over slow storage-only baseline and reducing resource usage by up to 59% while achieving comparable performance with running Apache Spark on a cluster of virtual machines, and within 2× slower compared to Redshift.
dblp:conf/nsdi/PuVS19 fatcat:aqz5uue6bfg7fnuubx43xfphui