A copy of this work was available on the public web and has been preserved in the Wayback Machine. The capture dates from 2019; you can also visit the original URL.
The file type is
Sampling is often used to reduce query latency for interactive big data analytics. The established parallel data processing paradigm relies on function shipping, where a coordinator dispatches queries to worker nodes and then collects the results. The commoditization of high-performance networking makes data shipping possible, where the coordinator directly reads data in the workers' memory using RDMA while workers process other queries. In this work, we explore when to use function shipping orarXiv:1807.11149v1 fatcat:3hhqugoqtfamtawwo5uujnbnd4