Towards autoscaling of Apache Flink jobs

Balázs Varga, Márton Balassi, Attila Kiss
2021 Acta Universitatis Sapientiae: Informatica  
Data stream processing has been gaining attention in the past decade. Apache Flink is an open-source distributed stream processing engine that is able to process a large amount of data in real time with low latency. Computations are distributed among a cluster of nodes. Currently, provisioning the appropriate amount of cloud resources must be done manually ahead of time. A dynamically varying workload may exceed the capacity of the cluster, or leave resources underutilized. In our paper, we describe an architecture that enables the automatic scaling of Flink jobs on Kubernetes based on custom metrics, and describe a simple scaling policy. We also measure the effects of state size and target parallelism on the duration of the scaling operation, which must be considered when designing an autoscaling policy, so that the Flink job respects a Service Level Agreement.
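
The abstract does not spell out the scaling policy itself. As a rough illustration only, the sketch below applies the proportional rule documented for the Kubernetes Horizontal Pod Autoscaler (desired = ceil(current * metric / target)) to a job's parallelism. The function name, the choice of metric (records per second per subtask), the tolerance band, and the parallelism bounds are all assumptions made for this example, not details taken from the paper.

```python
import math


def desired_parallelism(current, metric_value, target_value,
                        min_parallelism=1, max_parallelism=32,
                        tolerance=0.1):
    """Hypothetical proportional scaling rule (HPA-style), for illustration.

    current       -- current job parallelism (assumed >= 1)
    metric_value  -- observed custom metric, e.g. records/s per subtask
    target_value  -- value of that metric we would like each subtask to see
    tolerance     -- relative dead band that suppresses small rescalings
    """
    if current < 1 or target_value <= 0:
        raise ValueError("current must be >= 1 and target_value must be > 0")

    ratio = metric_value / target_value

    # Inside the dead band: keep the current parallelism, since every
    # rescale restarts the job from saved state and costs time.
    if abs(ratio - 1.0) <= tolerance:
        return current

    proposed = math.ceil(current * ratio)

    # Clamp to the assumed bounds of the cluster.
    return max(min_parallelism, min(max_parallelism, proposed))


if __name__ == "__main__":
    # Each of 4 subtasks sees 3000 records/s against a 2000 records/s target.
    print(desired_parallelism(4, 3000, 2000))
```

The dead band reflects the point made in the abstract: a scaling operation has a duration that depends on state size and target parallelism, so a policy that rescales on every small fluctuation may spend too much time restarting to respect a Service Level Agreement.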
doi:10.2478/ausi-2021-0003 fatcat:ek4pmqbypve3nca2jphmxn6g44