Accelerating Relational Databases by Leveraging Remote Memory and RDMA

Feng Li, Sudipto Das, Manoj Syamala, Vivek R. Narasayya
Proceedings of the 2016 International Conference on Management of Data (SIGMOD '16)
Memory is a crucial resource in relational database management systems (RDBMSs). When there is insufficient memory, RDBMSs are forced to use slower media such as SSDs or HDDs, which can significantly degrade workload performance. Cloud database services are deployed in data centers where network adapters supporting remote direct memory access (RDMA) at low latency and high bandwidth are becoming prevalent. We study the novel problem of how a Symmetric Multi-Processing (SMP) RDBMS, whose memory demands exceed
locally-available memory, can leverage available remote memory in the cluster accessed via RDMA to improve query performance. We expose available memory on remote servers using a lightweight file API that allows an SMP RDBMS to leverage the benefits of remote memory with modest changes. We identify and implement several novel scenarios to demonstrate these benefits, and address design challenges that are crucial for efficient implementation. We implemented the scenarios in the Microsoft SQL Server engine and present the first end-to-end study to demonstrate the benefits of remote memory for a variety of micro-benchmarks and industry-standard benchmarks. Compared to using disks when memory is insufficient, we improve the throughput and latency of queries with short reads and writes by 3× to 10×, while improving the latency of multiple TPC-H and TPC-DS queries by 2× to 100×.

Keywords: Relational databases; RDMA; remote memory; opportunistic caching; buffer pool extension; semantic caching.

Scenarios for remote memory usage in RDBMS: We describe four novel scenarios where an SMP RDBMS can leverage remote memory to significantly improve the performance of memory-intensive workloads. These scenarios are: (i) extending RDBMS caches such as the buffer pool; (ii) spilling temporary data for memory-intensive operators such as Sort and Hash; (iii) supporting a semantic cache [10, 45], which has traditionally been limited to application or middleware tiers, integrated into the RDBMS and pinned in available remote memory; and (iv) leveraging fast memory-to-memory transfer to prime and warm up the buffer pool of a newly-elected primary in the event of a planned primary-secondary swap in an RDBMS cluster or a cloud database service. The first three scenarios leverage remote memory as a new level in the memory hierarchy of the RDBMS whose performance lies between local memory and local SSDs or HDDs.
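The paper exposes remote memory through a lightweight file API but does not spell out its surface here. As a rough sketch, under the assumption of an offset-based, file-like interface (all names below are hypothetical), a leased remote region could be read and written much like an ordinary file, letting buffer-pool extension and operator spill paths reuse existing file I/O code:

```python
# Hypothetical sketch: a leased remote memory region exposed through a
# file-style API with offset-based reads and writes. A local bytearray
# stands in for the RDMA-accessible region on the remote server.

class RemoteMemoryFile:
    def __init__(self, lease_size):
        # In a real deployment this buffer would live on the remote
        # server and be registered with the NIC for RDMA access.
        self._region = bytearray(lease_size)
        self.size = lease_size

    def write(self, offset, data):
        # Corresponds to an RDMA write into the leased region.
        if offset + len(data) > self.size:
            raise ValueError("write past end of leased region")
        self._region[offset:offset + len(data)] = data

    def read(self, offset, length):
        # Corresponds to an RDMA read from the leased region.
        if offset + length > self.size:
            raise ValueError("read past end of leased region")
        return bytes(self._region[offset:offset + length])

# Example: spill an 8 KB page at a page-aligned offset, then read it back.
f = RemoteMemoryFile(lease_size=64 * 1024)
page = b"\x7f" * 8192
f.write(8192, page)
assert f.read(8192, 8192) == page
```

Because the interface mirrors file I/O, a failed remote server looks to the RDBMS like a failed (but non-authoritative) cache device rather than lost durable state.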
While scenarios (i) and (ii) are handled by the storage engine and lower layers, scenario (iii) introduces interesting challenges of integrating this new form of cache within the RDBMS: query optimizer costing, plan selection, and identifying appropriate structures to cache. All four scenarios leverage the lightweight file API to access remote memory and can dramatically improve the performance of an SMP RDBMS without affecting the correctness or availability of the database server, even if the remote server fails and renders the memory unavailable.

Brokering of unutilized memory in the cluster: Unutilized memory on a subset of servers in a cluster of a cloud database service needs to be brokered to allow sharing among the different servers requesting additional memory. We use a design similar to many standard resource negotiators, such as YARN [49]. Each server reports its unused memory to a memory broker. A server with unmet memory demand can request a lease on a remote memory region from the broker. This lease provides the database server exclusive access to the region. The database server opportunistically leverages this remote memory to improve the workload's performance without stealing memory committed to processes executing on the remote server.

Efficient implementation: Several design decisions must be considered to exploit remote memory efficiently. These include: (i) the suitable protocol to access remote memory via RDMA; (ii) whether to treat remote memory accesses as synchronous or asynchronous operations; and (iii) efficiently managing the registration of memory regions with NICs, which has non-trivial overheads [13]. We present a detailed implementation in Microsoft SQL Server (Section 4). While our implementation is specific to SQL Server, we expect our design to generalize to other RDBMSs, particularly due to the choice of a lightweight file API to expose available remote memory in the cluster.
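The brokering scheme above can be sketched as follows. This is a minimal illustration of the lease accounting only, with hypothetical names and a first-fit donor choice; it is not the paper's actual broker protocol:

```python
# Minimal sketch of a memory broker: servers report unused memory,
# and a server with unmet demand requests an exclusive lease on a
# remote region. Names and granularity are illustrative assumptions.

class MemoryBroker:
    def __init__(self):
        self.free = {}      # server -> unused bytes last reported
        self.leases = {}    # lease id -> (donor server, size)
        self._next_id = 0

    def report(self, server, unused_bytes):
        # Periodic report of unutilized memory from each server.
        self.free[server] = unused_bytes

    def request_lease(self, size):
        # Grant an exclusive lease from the first server with enough
        # unused memory; return None if no donor can satisfy it.
        for server, avail in self.free.items():
            if avail >= size:
                self.free[server] = avail - size
                self._next_id += 1
                self.leases[self._next_id] = (server, size)
                return self._next_id, server
        return None

    def release(self, lease_id):
        # Return the leased memory to the donor's free pool.
        server, size = self.leases.pop(lease_id)
        self.free[server] += size

broker = MemoryBroker()
broker.report("serverA", 4 << 30)      # serverA reports 4 GB unused
lease = broker.request_lease(1 << 30)  # a needy server asks for 1 GB
assert lease is not None and lease[1] == "serverA"
broker.release(lease[0])               # donor gets the memory back
```

Because leases are granted only against memory a donor has explicitly reported as unused, the requesting server never steals memory committed to the donor's own processes; revocation on donor demand would be a natural extension of this sketch.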
We conduct extensive experiments using a commodity RDMA-enabled cluster of ten servers, and a variety of configurations, targeted micro-benchmarks, and industry-standard TPC benchmarks (Section 5). We compare our implementation against several alternatives: (a) a baseline where the RDBMS uses locally-attached HDDs and SSDs when demand exceeds the memory available on the server; (b) two alternatives that leverage remote memory using off-the-shelf technologies but use different protocols to access remote memory; and (c) an RDBMS server with sufficient local memory to serve the workload. In all our experiments, we use a high-performance enterprise-grade disk subsystem with a hardware RAID-0 controller and up to 20 disks. For queries with short reads and writes (similar to OLTP workloads), the throughput and latency improvements are 3× to 10×. The latency of multiple TPC-H queries can be improved by 2× to 10×, and the latency of many TPC-DS queries by 10× to 100× (Section 6 and Appendix B). For a variety of workloads, the throughput and la-
doi:10.1145/2882903.2882949