Metric convergence in social network sampling

Christian Doerr, Norbert Blenn
2013 Proceedings of the 5th ACM workshop on HotPlanet - HotPlanet '13  
While enabling new research questions and methodologies, the massive size of social media platforms also poses a significant challenge for the analysis of these networks. To deal with this data volume, researchers typically turn to samples of these graph structures to conduct their analysis. This, however, raises the question of how representative such limited crawls are, and how much data is needed to reach stable predictions about the underlying systems. This paper analyzes ... convergence of six commonly used topological metrics as a function of the crawling method and sample size used. We find that graph crawling methods drastically over- and underestimate network metrics, and that a non-trivial amount of data is needed to arrive at a stable estimate of the underlying network.
doi:10.1145/2491159.2491168 dblp:conf/sigcomm/DoerrB13 fatcat:qkzkhtesvrhublbnnr2hzzk3va