Parallel Community Detection Based on Distance Dynamics for Large-Scale Network
Discovering a high-quality community structure in large-scale networks is a challenging data mining task. The distance dynamics model has proved effective for community detection on regular-size networks, but it struggles to discover community structure in large-scale networks (0.1-1 billion edges) because of machine hardware limits and high time complexity. In this paper, we propose a parallel community detection algorithm based on the distance dynamics model, called P-Attractor, which is capable of handling community detection in large networks. Our algorithm first uses a graph partitioning method to divide a large network into many sub-networks while maintaining the complete neighbor structure of the original network. Then, the traditional distance dynamics model is improved with a dynamic interaction process to simulate the distance evolution of each sub-network. Finally, the real community structure is discovered by removing all external edges after the evolution process. Extensive experiments on multiple synthetic and real-world networks show the effectiveness and efficiency of P-Attractor; the execution times on 4 threads and 32 threads are around 10 h and 2 h, respectively. The proposed algorithm has the potential to discover communities in a billion-scale network such as Uk-2007.
INDEX TERMS Community detection, complex network, graph clustering, web mining.
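As an illustration of the first and last steps of the pipeline summarized above, the following is a minimal Python sketch, not the P-Attractor implementation. It rests on several assumptions that the abstract does not state: node ids are integers, nodes are assigned to sub-networks by a hypothetical modulo rule, "maintaining the complete neighbor structure" is read as keeping every edge incident to a sub-network's own nodes, and an edge counts as external when its evolved distance has reached 1. The distance evolution itself is not reproduced; the evolved distances are taken as given input.

```python
from collections import defaultdict


def partition_with_neighbors(edges, num_parts):
    """Split nodes into num_parts sub-networks.

    Each sub-network keeps every edge incident to one of its core nodes,
    so a core node sees all of its original neighbors.
    Assumption: node ids are integers; the modulo rule is hypothetical.
    """
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)

    parts = [{"core": set(), "edges": set()} for _ in range(num_parts)]
    for node, neighbors in adj.items():
        part = parts[node % num_parts]  # hypothetical assignment rule
        part["core"].add(node)
        for nb in neighbors:
            # Store each undirected edge once, in sorted order.
            part["edges"].add((min(node, nb), max(node, nb)))
    return parts


def communities_from_distances(nodes, distances, threshold=1.0):
    """Remove edges whose evolved distance reached the threshold and
    return the connected components of what remains.

    Assumption: an edge with distance >= threshold is an external edge.
    """
    adj = {n: set() for n in nodes}
    for (u, v), d in distances.items():
        if d < threshold:
            adj[u].add(v)
            adj[v].add(u)

    seen, result = set(), []
    for start in sorted(nodes):
        if start in seen:
            continue
        seen.add(start)
        stack, component = [start], set()
        while stack:  # iterative depth-first search
            n = stack.pop()
            component.add(n)
            for nb in adj[n]:
                if nb not in seen:
                    seen.add(nb)
                    stack.append(nb)
        result.append(component)
    return result
```

In this sketch the sub-networks overlap on boundary edges, which costs extra memory but lets each one be processed without consulting the others for neighbor lists; whether P-Attractor makes the same trade-off is not stated in the abstract.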