Replica placement for high availability in distributed stream processing systems

Thomas Repantis, Vana Kalogeraki
2008 Proceedings of the second international conference on Distributed event-based systems - DEBS '08  
A significant number of emerging on-line data analysis applications require the processing of data streams, that is, large amounts of data that are updated continuously, to generate outputs of interest or to identify meaningful events. Example domains include network traffic management, stock price monitoring, customized e-commerce websites, and analysis of sensor data. In this paper we look at the problem of high availability in distributed stream processing systems. By taking into account the particular characteristics of stream processing applications, we first identify design principles for a replica placement algorithm for high availability. We incorporate these principles in a decentralized replica placement protocol that aims to maximize availability while respecting resource constraints and making performance-aware placement decisions. We have integrated our replica placement protocol in Synergy, our distributed stream processing middleware. Our experimental comparison over PlanetLab with the current state of the art corroborates our claims that our techniques maximize availability while sustaining good performance.
doi:10.1145/1385989.1386012 dblp:conf/debs/RepantisK08 fatcat:gk5izwcsnvg57biwcuugcj4n2a