COSI: Cloud Oriented Subgraph Identification in Massive Social Networks

Matthias Bröcheler, Andrea Pugliese, V.S. Subrahmanian
2010 International Conference on Advances in Social Networks Analysis and Mining
Subgraph matching is a key operation on graph data. Social network (SN) providers may want to find all subgraphs within their social network that "match" certain query graph patterns. Unfortunately, subgraph matching is NP-complete, making its application to massive SNs a major challenge. Past work has shown how to implement subgraph matching on a single processor when the graph has 10-25M edges. In this paper, we show how to use cloud computing in conjunction with such existing single processor methods to efficiently match complex subgraphs on graphs as large as 778M edges. A cloud consists of one "master" compute node and k "slave" compute nodes. We first develop a probabilistic method to estimate probabilities that a vertex will be retrieved by a random query and that a pair of vertices will be successively retrieved by a random query. We use these probability estimates to define edge weights in an SN and to compute minimal edge cuts to partition the graph amongst k slave nodes. We develop algorithms for both master and slave nodes that try to minimize communication overhead. The resulting COSI system can answer complex queries over real-world SN data containing over 778M edges very efficiently.
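The partitioning idea in the abstract (weight each edge by how likely its endpoints are to be retrieved together, then place vertices so that little weight crosses between compute nodes) can be illustrated with a small sketch. This is not the COSI algorithm: the edge weights below are made-up stand-ins for the paper's probability estimates, the names `cut_weight` and `refine_by_swaps` are hypothetical, and a simple pairwise-swap local search is used in place of the paper's edge-cut computation.

```python
from itertools import combinations


def cut_weight(weights, block_of):
    """Total weight of edges whose endpoints sit on different compute nodes."""
    return sum(w for (u, v), w in weights.items() if block_of[u] != block_of[v])


def refine_by_swaps(weights, block_of, eps=1e-12):
    """Swap pairs of vertices across blocks while that lowers the cut weight.

    Swapping (rather than moving) keeps every block at its original size.
    This is a plain local search, so it may stop at a local minimum.
    """
    block_of = dict(block_of)  # work on a copy
    best = cut_weight(weights, block_of)
    improved = True
    while improved:
        improved = False
        for u, v in combinations(sorted(block_of), 2):
            if block_of[u] == block_of[v]:
                continue
            block_of[u], block_of[v] = block_of[v], block_of[u]
            cost = cut_weight(weights, block_of)
            if cost < best - eps:
                best = cost
                improved = True
            else:
                # Undo a swap that did not help.
                block_of[u], block_of[v] = block_of[v], block_of[u]
    return block_of, best


# Assumed toy data: two tightly co-retrieved triangles joined by one weak edge.
weights = {
    ("a", "b"): 0.9, ("b", "c"): 0.9, ("a", "c"): 0.9,
    ("d", "e"): 0.9, ("e", "f"): 0.9, ("d", "f"): 0.9,
    ("c", "d"): 0.1,
}
# Round-robin starting assignment over k = 2 blocks.
start = {"a": 0, "b": 1, "c": 0, "d": 1, "e": 0, "f": 1}

refined, cost = refine_by_swaps(weights, start)
print(refined, cost)
```

On this toy input the search ends with the two triangles on separate blocks, so only the weak edge between `c` and `d` is cut. Under the stated assumption that weights approximate co-retrieval likelihood, a lower cut weight corresponds to fewer queries that need data from more than one node.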
doi:10.1109/asonam.2010.80 dblp:conf/asunam/BrochelerPS10 fatcat:hh2w6zpgnfbz7ffd3bitkymwsi