Thrifty Label Propagation: Fast Connected Components for Skewed-Degree Graphs

Mohsen Koohi Esfahani, Peter Kilpatrick, Hans Vandierendonck
2021 IEEE International Conference on Cluster Computing (CLUSTER)
Various concurrent algorithms have been proposed in the literature in recent years that mostly focus on the disjoint set approach to the Connected Components (CC) algorithm. However, these CC algorithms do not take the skewed structure of real-world graphs into account and as a result they do not benefit from common features of graph datasets to accelerate processing. We investigate the implications of the skewed degree distribution of real-world graphs on their connectivity and we use these features to introduce Thrifty Label Propagation as a structure-aware CC algorithm obtained by incorporating 4 fundamental optimization techniques in the Label Propagation CC algorithm. Our evaluation on 15 real-world graphs and 2 different processor architectures shows that Thrifty accelerates the flow of labels and processes only 1.4% of the edges of the graph. In this way, Thrifty is up to 16× faster than state-of-the-art CC algorithms such as Afforest, Jayanti-Tarjan, and Breadth-First Search CC. In particular, Thrifty delivers 1.5× to 19.9× speedup for graph datasets larger than one billion edges.
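For readers unfamiliar with the baseline that Thrifty builds on, the following is a minimal sequential sketch of plain Label Propagation CC. It is not the Thrifty algorithm: the abstract does not name the four optimization techniques, so none of them appear here. The function name `label_propagation_cc` and the edge-list input format are assumptions made for illustration only.

```python
def label_propagation_cc(num_vertices, edges):
    """Baseline label propagation for connected components (illustrative sketch).

    Every vertex starts with its own ID as its label. Each pass pushes the
    smaller label across every edge, and passes repeat until no label changes.
    This is the unoptimized baseline, not the Thrifty variant.
    """
    labels = list(range(num_vertices))
    changed = True
    while changed:
        changed = False
        for u, v in edges:
            low = min(labels[u], labels[v])
            if labels[u] != low:
                labels[u] = low
                changed = True
            if labels[v] != low:
                labels[v] = low
                changed = True
    return labels


# Hypothetical example: components {0, 1, 2}, {3, 4}, and the isolated vertex 5.
example_edges = [(0, 1), (1, 2), (3, 4)]
example_labels = label_propagation_cc(6, example_edges)
print(example_labels)  # [0, 0, 0, 3, 3, 5]
```

On convergence, two vertices share a label exactly when they lie in the same component. Note that this baseline visits every edge on every pass, which is the cost the abstract reports Thrifty reducing to 1.4% of the edges.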
doi:10.1109/cluster48925.2021.00042 fatcat:tuaugzgh5fgh5bm2iwg5hzuyz4