Awan: Locality-Aware Resource Manager for Geo-Distributed Data-Intensive Applications

Albert Jonathan, Abhishek Chandra, Jon Weissman
2016 IEEE International Conference on Cloud Engineering (IC2E)
Today, many organizations need to operate on data that is distributed around the globe. This is inevitable because data is generated in different locations: video feeds from distributed cameras, log files from distributed servers, and many others. Although centralized cloud platforms have been widely used for data-intensive applications, such systems are not suitable for processing geo-distributed data due to high data transfer overheads. An alternative approach is to use an Edge Cloud, which reduces the network cost of transferring data by distributing its computations globally. While the Edge Cloud is attractive for geo-distributed data-intensive applications, extending existing cluster computing frameworks to a wide-area environment must account for locality. We propose Awan: a new locality-aware resource manager for geo-distributed data-intensive applications. Awan allows resource sharing between multiple computing frameworks while enabling high locality scheduling within each framework. Our experiments with the Nebula Edge Cloud on PlanetLab show that Awan achieves up to a 28% increase in locality scheduling, which reduces the average job turnaround time by approximately 18% compared to existing cluster management mechanisms.
Awan is an Indonesian word meaning "Cloud". Throughout this paper, we will use the terms scheduler and Framework Scheduler interchangeably. Closeness is measured in terms of the network bandwidth between two nodes, unless explicitly specified as a geographic distance.
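The bandwidth-based notion of closeness stated above can be illustrated with a minimal sketch. It is not Awan's scheduling algorithm, which the abstract does not describe; it only shows what "closest node by network bandwidth" could mean in code. The node names, the bandwidth figures, and the `closest_node` helper are all hypothetical.

```python
from typing import Dict, Iterable, Optional, Tuple

# Hypothetical bandwidth measurements in Mbps, keyed by (data_node, compute_node).
# These values are illustrative assumptions, not results from the paper.
BANDWIDTH_MBPS: Dict[Tuple[str, str], float] = {
    ("data-eu", "compute-eu"): 90.0,
    ("data-eu", "compute-us"): 12.0,
    ("data-eu", "compute-asia"): 5.0,
}


def closest_node(
    data_node: str,
    candidates: Iterable[str],
    bandwidth: Dict[Tuple[str, str], float],
) -> Optional[str]:
    """Return the candidate with the highest bandwidth to data_node.

    Candidates without a bandwidth measurement are skipped.
    Returns None if no candidate has a measurement.
    """
    best: Optional[str] = None
    best_bw = float("-inf")
    for node in candidates:
        bw = bandwidth.get((data_node, node))
        if bw is None:
            continue  # no measurement for this pair, so it cannot be ranked
        if bw > best_bw:
            best, best_bw = node, bw
    return best


if __name__ == "__main__":
    nodes = ["compute-us", "compute-eu", "compute-asia"]
    print(closest_node("data-eu", nodes, BANDWIDTH_MBPS))
```

Under this definition, a node on another continent with a fast link counts as closer than a geographically nearby node with a slow one, which is why the text separates bandwidth closeness from geographic distance.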
doi:10.1109/ic2e.2016.15 dblp:conf/ic2e/JonathanCW16 fatcat:deou7xvbdbgkfncwqgpblvohki