A general scalable and accurate decentralized level monitoring method for large-scale dynamic service provision in hybrid clouds

Yongquan Fu, Yijie Wang, Ernst Biersack
2013 Future Generation Computer Systems
Hybrid cloud computing combines private clouds with geographically distributed resources from public clouds, desktop grids, or in-house gateways to exploit the flexibility of each kind of cloud platform. Service provisioning for wide-area applications such as cloud backup or cloud network games is sensitive to wide-area network metrics such as round-trip time, bandwidth, and loss rate. To optimize the quality of service provisioning in hybrid clouds, it is highly valuable for the hybrid clouds to collect detailed network metrics between participating nodes. However, because the node population can be large and dynamic, and the relevant network metrics differ across cloud services, it is challenging to make the measurement process general, scalable, accurate, and robust. We propose a novel distributed level monitoring method, HPM (Hierarchical Performance Measurement), that satisfies these requirements. For each kind of network metric, HPM represents the degree of pairwise closeness with discrete level values, inspired by the hierarchical clustering tree. HPM maps probed metrics to discrete levels using an existing distributed K-means clustering method, which helps maximize the similarity of the metric values within the same level and therefore optimizes the match between pairwise levels and real-world pairwise proximity. Furthermore, HPM computes the pairwise levels from decentralized coordinates for scalability. Each node independently maintains its low-dimensional coordinate using a novel decentralized implementation of the Maximum Margin Matrix Factorization method, which optimizes the mapping between the network metric and the level values. Simulation results for the RTT, bandwidth, loss, and hop metrics confirm that HPM converges fast, is robust to parameter settings, scales well with an increasing number of levels or system size, and adapts well to diverse metrics. A prototype deployment on the PlanetLab platform shows that HPM not only converges fast but also incurs modest maintenance bandwidth costs. Finally, applying HPM to optimize service provisioning in hybrid clouds shows that HPM can achieve close-to-optimal solutions.
doi:10.1016/j.future.2012.11.001 fatcat:bxp6wvoqu5attmwcnf32lbghfa
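
The abstract's step of mapping probed metrics to discrete levels can be illustrated with a small sketch. This is not the paper's implementation: it uses a plain, centralized one-dimensional K-means (Lloyd's algorithm) in place of the distributed K-means method the abstract mentions. The function names `kmeans_1d` and `to_level`, the RTT samples, and the choice of three levels are all assumptions made for illustration.

```python
# Illustrative only: a centralized 1-D k-means (Lloyd's algorithm), standing in
# for the distributed K-means method that the abstract refers to.

def kmeans_1d(values, k, iters=50):
    """Cluster scalar measurements into k groups; return centroids in ascending order."""
    pts = sorted(values)
    # Deterministic initialization: spread the starting centroids across the sorted data.
    centroids = [pts[(2 * i + 1) * len(pts) // (2 * k)] for i in range(k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for v in pts:
            nearest = min(range(k), key=lambda j: abs(v - centroids[j]))
            clusters[nearest].append(v)
        # Keep the previous centroid if a cluster ends up empty.
        updated = [sum(c) / len(c) if c else centroids[j]
                   for j, c in enumerate(clusters)]
        if updated == centroids:
            break  # assignments are stable, so the centroids have converged
        centroids = updated
    return sorted(centroids)


def to_level(value, centroids):
    """Map a measurement to the 1-based index of its nearest centroid."""
    nearest = min(range(len(centroids)), key=lambda j: abs(value - centroids[j]))
    return nearest + 1


# Hypothetical RTT samples in milliseconds (not taken from the paper).
rtts_ms = [4, 6, 5, 48, 52, 55, 190, 210, 205]
centroids = kmeans_1d(rtts_ms, k=3)
levels = [to_level(r, centroids) for r in rtts_ms]
```

In this sketch a lower level corresponds to a smaller RTT, so level 1 groups the closest node pairs. That ordering is a convention of the example, not something the abstract specifies.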