The file type is application/pdf.
Dynamic speculative optimizations for SQL compilation in Apache Spark
2020
Proceedings of the VLDB Endowment
Big-data systems have gained significant momentum, and Apache Spark is becoming a de facto standard for modern data analytics. Spark relies on SQL query compilation to optimize the execution performance of analytical workloads on a variety of data sources. Despite its scalable architecture, Spark's SQL code generation suffers from significant runtime overheads related to data access and de-serialization. This performance penalty can be substantial, especially when applications operate on
doi:10.14778/3377369.3377382
fatcat:5jm4mgvxpjhapi46kjcjjshsfq