Massive Social Network Analysis: Mining Twitter for Social Good

David Ediger, Karl Jiang, Jason Riedy, David A. Bader, Courtney Corley
2010 39th International Conference on Parallel Processing
Social networks produce an enormous quantity of data. Facebook consists of over 400 million active users sharing over 5 billion pieces of information each month. Analyzing this vast quantity of unstructured data presents challenges for software and hardware. We present GraphCT, a Graph Characterization Toolkit for massive graphs representing social network data. On a 128-processor Cray XMT, GraphCT estimates the betweenness centrality of an artificially generated (R-MAT) graph with 537 million vertices and 8.6 billion edges in 55 minutes, and of a real-world graph (Kwak et al.) with 61.6 million vertices and 1.47 billion edges in 105 minutes. We use GraphCT to analyze public data from Twitter, a microblogging network. Twitter's message connections appear primarily tree-structured, as in a news dissemination system. Within the public data, however, are clusters of conversations. Using GraphCT, we can rank actors within these conversations and help analysts focus attention on a much smaller data subset.
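The abstract does not say how GraphCT estimates betweenness centrality. One common way to estimate it is to run Brandes' shortest-path accumulation from a random sample of source vertices and scale the result. The Python sketch below illustrates that general idea only; it is not GraphCT's code and does not reflect the Cray XMT implementation. The function name `approx_betweenness`, its parameters, and the dict-of-lists graph format are all assumptions made for illustration.

```python
import random
from collections import deque


def approx_betweenness(adj, num_sources, seed=0):
    """Estimate betweenness centrality of an unweighted graph.

    adj maps every vertex to a list of its neighbors (hypothetical format).
    Scores count ordered (source, target) pairs, so an undirected graph
    gets twice the value of the unordered-pair convention.
    """
    vertices = list(adj)
    if not vertices or num_sources < 1:
        return {v: 0.0 for v in vertices}
    k = min(num_sources, len(vertices))
    sources = random.Random(seed).sample(vertices, k)
    score = {v: 0.0 for v in vertices}

    for s in sources:
        sigma = {v: 0 for v in vertices}   # number of shortest paths from s
        dist = {v: -1 for v in vertices}   # BFS distance from s, -1 = unseen
        preds = {v: [] for v in vertices}  # predecessors on shortest paths
        sigma[s], dist[s] = 1, 0
        order = []                         # vertices in BFS visiting order
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)

        # Accumulate dependencies from the farthest vertices back to s.
        delta = {v: 0.0 for v in vertices}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1.0 + delta[w])
            if w != s:
                score[w] += delta[w]

    # Scale the sampled sum up to an estimate over all sources.
    scale = len(vertices) / k
    return {v: score[v] * scale for v in vertices}


def rank_actors(adj, num_sources, seed=0):
    """Return vertices sorted from highest to lowest estimated score."""
    scores = approx_betweenness(adj, num_sources, seed)
    return sorted(scores, key=scores.get, reverse=True)
```

When `num_sources` equals the number of vertices, the estimate reduces to the exact computation. Smaller samples trade accuracy for running time, which is what makes this kind of estimate attractive for graphs with hundreds of millions of vertices.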
doi:10.1109/icpp.2010.66 dblp:conf/icpp/EdigerJRBCFR10 fatcat:i6clwbmjuzetxgntubekzl3qxq