Data Placement for Multi-Tenant Data Federation on the Cloud [article]

Ji Liu, Lei Mo, Sijia Yang, Jingbo Zhou, Shilei Ji, Haoyi Xiong, Dejing Dou
2021 arXiv pre-print
Because of users' privacy concerns and the enforcement of laws on data security and privacy, sharing data among organizations is becoming increasingly difficult. Data federation brings new opportunities for data-related cooperation among organizations by providing abstract data interfaces. With the development of cloud computing, organizations store data on the cloud to achieve elasticity and scalability for data processing. Existing data placement approaches generally consider only one aspect, either execution time or monetary cost, and do not consider data partitioning for hard constraints. In this paper, we propose an approach that enables data processing on the cloud with data from different organizations. The approach consists of a data federation platform named FedCube and a Lyapunov-based data placement algorithm. FedCube enables data processing on the cloud. We use the data placement algorithm to create a plan for partitioning and storing data on the cloud so as to achieve multiple objectives while satisfying the constraints, based on a multi-objective cost model. The cost model is composed of two objectives: reducing monetary cost and reducing execution time. We present an experimental evaluation showing that our proposed algorithm significantly reduces the total cost (by up to 69.8%) compared with existing approaches.
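The idea of a two-objective cost model can be illustrated with a minimal sketch. This is not the paper's Lyapunov-based algorithm or its actual cost model: the weighted-sum form, the normalization by the worst candidate, and every name and number below (`StorageTier`, `price_per_gb`, `seconds_per_gb`, `w_money`) are assumptions made only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StorageTier:
    """Hypothetical cloud storage option (not taken from the paper)."""
    name: str
    price_per_gb: float    # assumed monetary cost of storing one GB
    seconds_per_gb: float  # assumed time to read one GB during processing


def total_cost(size_gb, tier, w_money, ref_money, ref_time):
    """Weighted sum of normalized monetary cost and normalized execution time."""
    money = size_gb * tier.price_per_gb
    time = size_gb * tier.seconds_per_gb
    return w_money * money / ref_money + (1.0 - w_money) * time / ref_time


def cheapest_tier(size_gb, tiers, w_money=0.5):
    """Pick the tier with the lowest combined cost for a data set of size_gb > 0."""
    # Normalize each objective by its worst value among the candidates
    # so that money and time are comparable before weighting.
    ref_money = max(size_gb * t.price_per_gb for t in tiers)
    ref_time = max(size_gb * t.seconds_per_gb for t in tiers)
    return min(
        tiers,
        key=lambda t: total_cost(size_gb, t, w_money, ref_money, ref_time),
    )


tiers = [
    StorageTier("fast", price_per_gb=0.10, seconds_per_gb=1.0),
    StorageTier("slow", price_per_gb=0.01, seconds_per_gb=10.0),
]
money_first = cheapest_tier(100.0, tiers, w_money=0.9)
time_first = cheapest_tier(100.0, tiers, w_money=0.1)
```

With these assumed numbers, a weight that favors monetary cost selects the cheaper, slower tier, while a weight that favors execution time selects the faster, more expensive one. A placement algorithm such as the one described in the abstract would additionally partition the data and enforce hard constraints, which this sketch leaves out.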
arXiv:2112.07980v1 fatcat:thyuoz2ulvbirddf3gmznczbti