Incast Mitigation in a Data Center Storage Cluster Through a Dynamic Fair-Share Buffer Policy

Yawar Abbas Bangash, Tauseef Rana, Haider Abbas, Muhammad Ali Imran, Adnan Ahmed Khan
<span title="">2019</span> <i title="Institute of Electrical and Electronics Engineers (IEEE)"> <a target="_blank" rel="noopener" href="https://fatcat.wiki/container/q7qi7j4ckfac7ehf3mjbso4hne" style="color: black;">IEEE Access</a> </i> &nbsp;
Incast is a phenomenon in which many devices interact with a single device at the same time: multiple storage senders overflow either the switch buffer or the single receiver's memory. This pattern forces all concurrent senders to stop and wait for buffer/memory availability, leading to packet loss and retransmission and, consequently, high latency. We present a software-defined technique that tackles the many-to-one communication pattern, Incast, in a data center storage cluster. Our proposed method decouples the default TCP windowing mechanism from all storage servers and delegates it to the software-defined storage controller. The proposed method removes the TCP saw-tooth behavior, provides global flow awareness, and implements a dynamic fair-share buffer policy for the end-to-end I/O path. It considers all I/O stages (applications, device drivers, NICs, switches/routers, file systems, I/O schedulers, main memory, and physical disks) while achieving the maximum I/O throughput. The policy, which is part of the proposed method, allocates fair-share bandwidth utilization to all storage servers, and priority queues are incorporated to handle the most important data flows. In addition, the proposed method provides better manageability and maintainability than traditional storage networks, where the data plane and the control plane reside in the same device.

INDEX TERMS Incast, software-defined storage, I/O throughput, dynamic fair-share buffer, end-to-end I/O path, fair-share BW utilization.

FIGURE 1. A general Incast model for a single switch: for every file access, the client/server requests the MDS to provide metadata (stripe ID and location).

FIGURE 2. A single switch with multiple concurrent I/O requests from multiple clients/storage senders.

Switch buffers are exhausted when more traffic is forwarded to a port than it can send or receive.
Reasons include an ingress/egress port speed mismatch, multiple inputs to a single output port, half-duplex collisions on an output port, the complex interconnection of data center equipment (switches, servers, links, etc.), and the deployed communication protocol. The main problem occurs in a storage cluster, where multiple storage servers send data and cannot send more until all parallel threads have completed [2]. Application-level throughput decreases far below the available bandwidth (BW) as the number of synchronized concurrent senders increases. In addition, in a parallel file system, when a single I/O request is issued for data striped over multiple file servers, the request has to wait for all aggregator storage nodes to complete [3]. This intra-request synchronization leads to Incast in storage. A brief overview of the Incast problem can be found in [4].

FIGURE 3. Throughput collapse in a data center storage cluster [5]: as the number of storage servers, and hence their I/O requests, increases, I/O throughput collapses.

A common throughput-collapse behavior is depicted in Figure 3, where Phanishayee et al. [5] observed that once the number of concurrent sending storage servers exceeds seven, the overall throughput begins to collapse. The default TCP fails to handle multiple concurrent storage servers. If the switch's buffer capacity were big enough to accommodate the traffic burst, throughput collapse would not occur; in practice, however, big switch buffers are expensive and add delay to the network. For multiple concurrent senders, the throughput collapse is the same irrespective of the data transfer size (small or big). The throughput collapse results in increased latency and harms any network-based business activity. Latency is an important metric in a computer network, especially in a delay-sensitive distributed storage application [6]. The main aim is to improve the end-user experience.
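The collapse described above can be illustrated with a toy model (our illustration, not the paper's): when the combined burst of N synchronized senders exceeds the shared switch buffer, the overflow is lost and must be retransmitted after a timeout, which stalls the whole parallel request. All constants below (buffer size, retransmission timeout, link speed) are assumed values for the sketch.

```python
# Toy model of Incast throughput collapse (illustrative only).
# Assumes: each sender bursts a fixed block; if the combined burst
# exceeds the shared switch buffer, one retransmission timeout (RTO)
# is added before the round can complete.

def effective_goodput(n_senders, burst_bytes, buffer_bytes,
                      link_bps, rto_s=0.2):
    """Crude per-round goodput estimate in bits per second."""
    total = n_senders * burst_bytes
    transfer_s = total * 8 / link_bps     # ideal serialization time
    if total > buffer_bytes:              # Incast: buffer overflow
        transfer_s += rto_s               # stall for the RTO
    return total * 8 / transfer_s

link = 1_000_000_000                      # 1 Gb/s link (assumed)
buf = 256 * 1024                          # 256 KiB shared buffer (assumed)
for n in (4, 8):
    mbps = effective_goodput(n, 64 * 1024, buf, link) / 1e6
    print(n, round(mbps, 1))              # goodput falls sharply at n = 8
```

With four senders the burst fits the buffer and the link runs at full rate; with eight it overflows, and the fixed timeout dominates the round, echoing the cliff past seven senders that Phanishayee et al. observed.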
In emergency and disaster scenarios such as earthquakes, tsunamis, and terrorism, a container-based data center requires as little delay as possible to meet its service-level agreement. Mayer [7] reported at the Web 2.0 conference that Google's traffic dropped by 20% due to a 500 ms increase in latency. Stefanov [8], in YSlow (a web page analyzer based on Yahoo rules), stated that at Yahoo an extra 400 ms reduced traffic by 9%, and that every 100 ms of latency costs Amazon 1% in business revenue. These results motivate minimizing latency as much as possible. McKeown et al. [9] presented SDN, in which the control plane is decoupled from the data plane and a centralized policy manages and controls all networking operations (packet drop, modify, forward, update). In SDN, the control plane is moved out of the commodity switch and housed in the main controller. To the best of our knowledge, a comprehensive survey of SDN can be found in [10].
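As a rough sketch of how a centralized controller could enforce a fair-share buffer policy with priority queues, consider the following. This is our illustration under stated assumptions, not the authors' implementation: the class and method names (`FairShareController`, `register_flow`, `window_for`, `next_flow`) are hypothetical, and the policy shown is simply "divide the shared buffer equally among active senders, serve the highest-priority flow first".

```python
# Hypothetical controller-side fair-share buffer policy (illustrative).
import heapq

class FairShareController:
    """Sketch of an SDN storage controller that assigns each active
    sender an equal slice of the shared switch buffer and drains
    queued flows in priority order (lower number = more important)."""

    def __init__(self, buffer_capacity_bytes):
        self.capacity = buffer_capacity_bytes
        self.flows = {}            # flow_id -> priority
        self.queue = []            # min-heap of (priority, flow_id)

    def register_flow(self, flow_id, priority=10):
        self.flows[flow_id] = priority
        heapq.heappush(self.queue, (priority, flow_id))

    def remove_flow(self, flow_id):
        self.flows.pop(flow_id, None)

    def window_for(self, flow_id):
        """Fair share: buffer capacity / number of active senders.
        Stands in for each sender's local TCP window decision."""
        if flow_id not in self.flows:
            return 0
        return self.capacity // len(self.flows)

    def next_flow(self):
        """Return the most important registered flow still queued."""
        while self.queue:
            _prio, fid = heapq.heappop(self.queue)
            if fid in self.flows:
                return fid
        return None

ctrl = FairShareController(buffer_capacity_bytes=256 * 1024)
for i in range(8):
    ctrl.register_flow(f"server-{i}", priority=1 if i == 0 else 10)

print(ctrl.window_for("server-3"))  # 256 KiB / 8 senders = 32768 bytes
print(ctrl.next_flow())             # server-0 (highest priority)
```

Because every sender's window is capped at its fair slice of the buffer, the combined burst can never exceed capacity, which is the intuition behind avoiding the overflow that triggers Incast.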
doi:10.1109/access.2019.2891264 (https://doi.org/10.1109/access.2019.2891264) · fatcat:lvd76vcaxjhf3ehoyxdhg5zhxi
Fulltext PDF (Web Archive): https://web.archive.org/web/20190427142000/http://eprints.gla.ac.uk/177062/1/177062.pdf