Taxonomy and clustering in collaborative systems: The case of the on-line encyclopedia Wikipedia

A. Capocci, F. Rao, G. Caldarelli
2007 Europhysics Letters
In this paper we investigate the nature and structure of the relation between imposed classifications and real clustering in a particular case of a scale-free network given by the on-line encyclopedia Wikipedia. We find a statistical similarity in the distributions of community sizes obtained both by the top-down approach of the category division present in the archive and by the bottom-up procedure of community detection given by an algorithm based on the spectral properties of the graph.
Despite the statistically similar behaviour, the two methods provide a rather different division of the articles, thereby signaling that the nature and presence of power laws is a general feature of these systems and cannot be used as a benchmark to evaluate the suitability of a clustering method.
doi:10.1209/0295-5075/81/28006 fatcat:55eo5gmirnfnrcucdhe7zlpzaq