A Survey and Classification of Software-Defined Storage Systems

Ricardo Macedo, João Paulo, José Pereira, Alysson Bessani
2020 ACM Computing Surveys  
The exponential growth of digital information is imposing increasing scale and efficiency demands on modern storage infrastructures. As infrastructure complexity increases, so does the difficulty of ensuring quality of service, maintainability, and resource fairness, raising unprecedented performance, scalability, and programmability challenges. Software-Defined Storage (SDS) addresses these challenges by cleanly disentangling control and data flows, easing management and improving the control functionality of conventional storage systems. Despite its momentum in the research community, many aspects of the paradigm are still unclear, undefined, and unexplored, leading to misunderstandings that hamper the research and development of novel SDS technologies. In this article, we present an in-depth study of SDS systems, providing a thorough description and categorization of each plane of functionality. Further, we propose a taxonomy and classification of existing SDS solutions according to several different criteria. Finally, we provide key insights about the paradigm and discuss potential future research directions for the field.

storage functionalities, including operating systems (OSes), hypervisors, distributed storage, caches, I/O schedulers, file systems, and device drivers [105]. Each of these layers includes a predetermined set of services (e.g., caching, queueing, data management) with strict interfaces and isolated procedures to apply to I/O requests, resulting in a complex, limited, and coarse-grained treatment of the I/O flow. Moreover, data-centric operations such as routing, processing, and management are, in most cases, blended into a single monolithic block, making it infeasible to enforce end-to-end policies (e.g., bandwidth aggregation, I/O prioritization) and imposing major scalability, flexibility, and modularity limitations on storage infrastructures [102]. Second, efficiently managing system resources in multi-tenant environments becomes progressively harder as service demand increases, since not only are fine-grained resources within a process shared, but also resources across multiple processes and nodes along the I/O path (e.g., shared memory, storage devices, schedulers, caches, network) [63].
Further, since tenants have different service requirements and workload profiles, traditional resource management mechanisms fall short of ensuring performance isolation and resource fairness, due to their rigid and coarse-grained I/O management [63]. As a result, achieving QoS under multi-tenancy is infeasible unless differentiated treatment of I/O flows, global knowledge of system resources, and end-to-end control and coordination of the infrastructure are ensured [103, 125]. Third, storage infrastructures have become highly heterogeneous and are used simultaneously by a myriad of applications with significantly different behaviors and evolving requirements that fluctuate over time [21]. However, these infrastructures are frequently tuned with monolithic configuration setups that prevent online tuning of the storage system [3]. As a result, these homogeneous setups have led to applications running on a general-purpose I/O stack, competing for system resources in a non-optimal fashion, and incapable of performing I/O differentiation and end-to-end system optimization [30, 102]. These pitfalls are inherent to the design of traditional large-scale storage infrastructures (e.g., cloud computing, high-performance computing (HPC)), and reflect the absence of a truly programmable I/O stack and the uncoordinated control of the distributed infrastructure [30]. As the length of the I/O path increases, it becomes harder to efficiently control and maintain the infrastructure stack. Moreover, individually fine-tuning and optimizing each layer of the I/O stack of a large-scale infrastructure (i.e., in a non-holistic fashion) makes it harder to scale to new levels of performance, concurrency, fairness, and resource capacity. Such outcomes result in a lack of coordination and performance isolation, weak programmability and customization, rigid configuration and limited adaptability, and waste of shared system resources.
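To make the notion of differentiated I/O treatment under multi-tenancy concrete, a per-tenant token bucket is one common mechanism for enforcing distinct bandwidth budgets along the I/O path. The sketch below is purely illustrative: the tenant names, rates, and the `admit` helper are hypothetical, not drawn from any system surveyed in this article.

```python
import time

class TokenBucket:
    """Simple token bucket: refills at `rate` tokens/sec up to `capacity`."""
    def __init__(self, rate, capacity):
        self.rate = rate
        self.capacity = capacity
        self.tokens = capacity
        self.last = time.monotonic()

    def allow(self, cost):
        # Refill proportionally to elapsed time, capped at capacity.
        now = time.monotonic()
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False

# One bucket per tenant: "gold" receives 4x the bandwidth budget of "bronze".
buckets = {
    "gold": TokenBucket(rate=400, capacity=400),    # e.g., 400 MB/s budget
    "bronze": TokenBucket(rate=100, capacity=100),  # e.g., 100 MB/s budget
}

def admit(tenant, request_mb):
    """Admit or throttle an I/O request charged against its tenant's budget."""
    return buckets[tenant].allow(request_mb)
```

A real SDS data plane would apply such a policy at every stage of the I/O path (cache, scheduler, device queue), with the budgets set and updated by the control plane rather than hard-coded.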
To overcome the shortcomings of traditional storage infrastructures, the Software-Defined Storage (SDS) paradigm has emerged as a compelling solution for easing data and configuration management while improving the end-to-end control functionality of conventional storage systems [105]. By decoupling the control and data flows into two major components, the control plane and the data plane, it improves the modularity of the storage stack, enables dynamic end-to-end policy enforcement, and introduces differentiated I/O treatment in multi-tenant environments. SDS inherits concepts from Software-Defined Networking (SDN) [54] and applies them to storage-oriented environments, bringing new benefits to storage infrastructures, such as improved system programmability and extensibility [80, 95]; fine-grained resource orchestration (under single- and multi-tenancy) [63, 103]; end-to-end QoS, maintainability, and flexibility [46, 105]; and resource efficiency [68, 70]. Furthermore, by breaking the vertical alignment of conventional storage designs, SDS systems provide holistic orchestration of heterogeneous infrastructures, ensure system-wide visibility of storage components, and enable straightforward enforcement of diverse storage objectives. In recent years, the SDS paradigm has gained significant traction in the research community, leading to a wide spectrum of both academic and commercial proposals to address the drawbacks
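The control/data-plane split described above can be sketched as a control plane that holds global policies and propagates them to every stage on the I/O path, while each data-plane stage merely applies the policy it was given. This is a minimal, hypothetical illustration; the class names (`ControlPlane`, `DataPlaneStage`) and the priority-based policy are invented for exposition and do not correspond to any specific SDS system.

```python
class ControlPlane:
    """Toy control plane: stores global policies and pushes them to stages."""
    def __init__(self):
        self.stages = []
        self.policies = {}

    def register(self, stage):
        self.stages.append(stage)
        stage.policies = dict(self.policies)  # push current policies on registration

    def set_policy(self, tenant, policy):
        self.policies[tenant] = policy
        for stage in self.stages:             # end-to-end: every stage on the I/O path
            stage.policies[tenant] = policy

class DataPlaneStage:
    """Toy data-plane stage: applies the control plane's policy to each request."""
    def __init__(self):
        self.policies = {}

    def handle(self, tenant, request):
        policy = self.policies.get(tenant, {"priority": "normal"})
        return {"request": request, "tenant": tenant, "priority": policy["priority"]}
```

The key property this sketch captures is that policy decisions (who gets what priority) live in one logically centralized place, while enforcement is distributed across otherwise simple stages; updating a policy at the control plane takes effect at every stage without reconfiguring each layer individually.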
doi:10.1145/3385896