Exploiting the wisdom of the crowds for characterizing and connecting heterogeneous resources

Ricardo Kawase, Patrick Siehndel, Bernardo Pereira Nunes, Eelco Herder, Wolfgang Nejdl
2014 Proceedings of the 25th ACM conference on Hypertext and social media - HT '14  
Heterogeneous content is an inherent problem for cross-system search, recommendation and personalization. In this paper we investigate differences in topic coverage and the impact of topics in different kinds of Web services. We use entity extraction and categorization to create 'fingerprints' that allow for meaningful comparison. As a base taxonomy, we use the 23 main categories of the Wikipedia Category Graph, which has been assembled over the years by the wisdom of the crowds. Following a proof of concept of our approach, we analyze differences in topic coverage and topic impact. The results show many differences between Web services like Twitter, Flickr and Delicious, which reflect users' behavior and the usage of each system. The paper concludes with a user study that demonstrates the benefits of fingerprints over traditional textual methods for recommendations of heterogeneous resources.
doi:10.1145/2631775.2631797 dblp:conf/ht/KawaseSNHN14 fatcat:dnmcu4lkbna7fjckbk4rjuf4zy
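
The fingerprint idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state how entities are weighted or how fingerprints are compared, so the normalized category counts and the cosine similarity used here are assumptions. The four category labels and the entity-to-category lookup are hypothetical placeholders for the 23 main Wikipedia categories and for a real entity extraction step.

```python
import math
from collections import Counter

# Hypothetical stand-in for the taxonomy. The paper uses the 23 main
# categories of the Wikipedia Category Graph, which are not listed here.
CATEGORIES = ["Arts", "Science", "Sports", "Technology"]

# Hypothetical lookup from an extracted entity to top-level categories.
# A real system would derive this from entity extraction and the
# Wikipedia Category Graph.
ENTITY_CATEGORIES = {
    "Python": ["Technology", "Science"],
    "Guitar": ["Arts"],
    "Marathon": ["Sports"],
    "Telescope": ["Science", "Technology"],
}


def fingerprint(entities):
    """Return a normalized category distribution for a list of entities."""
    counts = Counter()
    for entity in entities:
        for category in ENTITY_CATEGORIES.get(entity, []):
            counts[category] += 1
    total = sum(counts.values())
    if total == 0:
        # No known entities: return an all-zero fingerprint.
        return [0.0] * len(CATEGORIES)
    return [counts[c] / total for c in CATEGORIES]


def cosine_similarity(a, b):
    """Cosine similarity of two fingerprints (assumed comparison measure)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Resources from different services become comparable once they are
# reduced to the same category space.
tweet = fingerprint(["Python", "Telescope"])
photo = fingerprint(["Guitar", "Marathon"])
bookmark = fingerprint(["Python"])

print(cosine_similarity(tweet, bookmark))
print(cosine_similarity(tweet, photo))
```

Because every resource is mapped onto the same fixed set of categories, a tweet, a photo description and a bookmark can be compared even when they share no words, which is the kind of comparison that purely textual methods struggle with.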