Ranking related entities

Marc Bron, Krisztian Balog, Maarten de Rijke
2010 Proceedings of the 19th ACM international conference on Information and knowledge management - CIKM '10  
Related entity finding is the task of returning a ranked list of homepages of relevant entities of a specified type that need to engage in a given relationship with a given source entity. We propose a framework for addressing this task and perform a detailed analysis of four core components: co-occurrence models, type filtering, context modeling, and homepage finding. Our initial focus is on recall. We analyze the performance of a model that only uses co-occurrence statistics. While it identifies a set of related entities, it fails to rank them effectively. Two types of error emerge: (1) entities of the wrong type pollute the ranking and (2) while somehow associated with the source entity, some retrieved entities do not engage in the right relation with it. To address (1), we add type filtering based on category information available in Wikipedia. To correct for (2), we add contextual information, represented as language models derived from documents in which source and target entities co-occur. To complete the pipeline, we find homepages of top-ranked entities by combining a language modeling approach with heuristics based on Wikipedia's external links. Our method achieves very high recall scores on the end-to-end task, providing a solid starting point for expanding our focus to improve precision; additional heuristics lead to state-of-the-art performance.
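The first two stages described in the abstract (co-occurrence scoring followed by type filtering) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the estimator, so the score here is assumed to be the fraction of source-mentioning documents that also mention the candidate. The function names, the toy documents, and the `entity_types` lookup (a stand-in for Wikipedia category information) are all hypothetical.

```python
from collections import Counter

def cooccurrence_scores(docs, source):
    """Score candidates by the share of source-mentioning documents they appear in.

    Assumption: a simple relative-frequency estimate, not the paper's model.
    `docs` is a list of sets of entity names, one set per document.
    """
    with_source = [d for d in docs if source in d]
    n = len(with_source)
    if n == 0:
        return {}
    counts = Counter(e for d in with_source for e in d if e != source)
    return {e: c / n for e, c in counts.items()}

def rank_related(docs, source, entity_types, target_type):
    """Keep only candidates of the requested type, then sort by score."""
    scores = cooccurrence_scores(docs, source)
    kept = {e: s for e, s in scores.items()
            if entity_types.get(e) == target_type}
    # Sort by descending score; break ties alphabetically for determinism.
    return sorted(kept.items(), key=lambda kv: (-kv[1], kv[0]))

# Hypothetical toy data: entity mentions per document and a type lookup.
docs = [
    {"Band A", "Member X", "City Q"},
    {"Band A", "Member X"},
    {"Band A", "Member Y", "City Q"},
    {"Band B", "Member Z"},
]
entity_types = {
    "Member X": "person",
    "Member Y": "person",
    "Member Z": "person",
    "City Q": "location",
}

ranked = rank_related(docs, "Band A", entity_types, "person")
```

In this toy data, "City Q" co-occurs with the source as often as "Member X" does, which mirrors error type (1): without the type filter it would sit at the top of the ranking. The sketch does not address error type (2), which the abstract handles with context language models.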
doi:10.1145/1871437.1871574 dblp:conf/cikm/BronBR10 fatcat:3a3kgjmwezhxpksi7v5s2p4xl4