REDEMON: Resilient Decentralized Monitoring System for Edge Infrastructures
2020 20th IEEE/ACM International Symposium on Cluster, Cloud and Internet Computing (CCGRID)
The Guifi.net community network has evolved over the past 15 years into a telecommunications infrastructure that offers Internet access to more than 80,000 people. The monitoring system currently in place has not kept pace with the growth of the infrastructure: it requires manual intervention and has several single points of failure. In this paper we present REDEMON, a resilient decentralized monitoring system, hosted on distributed and interconnected edge devices, for a
... e, eventually-consistent monitoring of the Guifi.net network, leveraging CRDT-based data structures implemented on AntidoteDB. We developed REDEMON as a prototype featuring resilience, decentralization and automation, intended to replace the legacy monitoring system. To assess the system, the prototype was deployed on resource-constrained edge nodes in the Guifi.net production network and evaluated under realistic conditions. The decentralized assignment mechanism succeeds in assigning to each network device the minimum number of monitoring servers that satisfies the established system requirements. Moreover, by concentrating the workload on the minimum required number of servers running at their maximum capacity, the remaining devices can stay idle, reducing the consumption footprint of the system. With regard to computing resources, we measure moderate CPU and RAM usage by the monitoring system on low-capacity devices, while we observe that considerable network traffic is required to achieve a resilient and consistent data storage layer. This resilient and decentralized architecture could lay the groundwork for other edge applications in the cloud computing domain that need to coordinate over distributed and consistent shared data.
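
The assignment goal stated in the abstract (each network device watched by a minimum number of monitoring servers, with the load packed onto as few servers as possible so the rest can idle) can be illustrated with a small sketch. This is not the REDEMON algorithm: it is a centralized, greedy simplification, and the names `assign_monitors`, `capacity` and `min_monitors` are hypothetical. In the actual system the decision is made in a decentralized way over shared CRDT data, which this sketch does not model.

```python
from typing import Dict, List


def assign_monitors(
    devices: List[str],
    servers: List[str],
    capacity: int,
    min_monitors: int,
) -> Dict[str, List[str]]:
    """Greedy packing (illustrative assumption, not the paper's method).

    Each device receives exactly `min_monitors` distinct servers. Servers
    are tried in a fixed order, so earlier servers fill up to `capacity`
    before later ones are used, leaving later servers idle when possible.
    """
    load = {server: 0 for server in servers}
    assignment: Dict[str, List[str]] = {device: [] for device in devices}

    for device in devices:
        for server in servers:
            if len(assignment[device]) == min_monitors:
                break
            if load[server] < capacity:
                assignment[device].append(server)
                load[server] += 1
        if len(assignment[device]) < min_monitors:
            raise ValueError(f"not enough server capacity for {device}")

    return assignment


if __name__ == "__main__":
    result = assign_monitors(
        devices=["d1", "d2", "d3", "d4"],
        servers=["s1", "s2", "s3", "s4"],
        capacity=4,
        min_monitors=2,
    )
    used = {s for monitors in result.values() for s in monitors}
    print("assignment:", result)
    print("idle servers:", sorted({"s1", "s2", "s3", "s4"} - used))
```

With four devices, four servers of capacity 4, and two monitors required per device, the first two servers absorb the whole workload and the other two remain unused, which is the consolidation effect the abstract describes.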