Searching for expertise using the Terrier platform

Craig Macdonald, Iadh Ounis
In Proceedings of the 29th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR '06), 2006
EXTENDED ABSTRACT

In large enterprise organisations, users often seek other members of their organisation to collaborate with. An expert search system can help users find people with expertise relevant to their need. In the TREC 2005 Enterprise track, we developed an expert search system based on the Terrier IR platform [1]. In this work, we demonstrate an improved version of our technology, applied in a different enterprise setting, namely the Department of Computing Science at the University of Glasgow.

From the intranet of the organisation, we crawl all documents, such as mailing lists, staff Web homepages, CVs, research publications, and administrative and teaching material. Using the staff list, our expert search system builds a profile of expertise evidence for each candidate expert by associating documents with each candidate. The system uses a retrieval methodology based on the Divergence From Randomness (DFR) framework to rank candidate profiles in response to a user query. Each document is represented by a set of fields (e.g. content, title, and anchor text of incoming hyperlinks), and the matching function takes these fields into account when ranking candidate profiles. Based on our experience from the TREC 2005 Enterprise track, we assign different importance weights to the various sources of evidence; for example, the homepage or CV of a candidate is a good source of expertise evidence.

Figure 1 presents the results for a typical query to the expert search system. A ranking of candidate experts for the query is displayed, along with a concise description of their job interests. In contrast to [2], the top-ranked documents for each candidate are also displayed, as found by the retrieval methodology. These related documents are important, as they allow the user to quickly assess the relevance of a candidate by scanning them, much as a user scans summaries to assess the relevance of documents in classical retrieval.

REFERENCES

[1] C. Macdonald, V. Plachouras, B. He, and I. Ounis. University of Glasgow at TREC 2005: Experiments in Terabyte and Enterprise tracks with Terrier. In TREC-2005 Proc.

[2] M. Maybury, R.
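As a rough illustration of the profile-based approach described in the abstract, the sketch below builds a profile per candidate by associating documents with that candidate, scores each document with per-field importance weights, and aggregates scores over the profile. All data, field weights, and the TF-IDF-style scoring function are hypothetical stand-ins; the actual system uses Terrier's DFR weighting models, which differ from this toy formula.

```python
import math
from collections import defaultdict

# Toy corpus: each document has per-field text and one associated candidate.
# Documents, candidates, and field weights are illustrative, not from the paper.
DOCS = [
    {"id": "d1", "candidate": "alice",
     "fields": {"content": "expert search retrieval", "title": "expert search", "anchor": ""}},
    {"id": "d2", "candidate": "bob",
     "fields": {"content": "teaching material databases", "title": "databases", "anchor": "databases"}},
    {"id": "d3", "candidate": "alice",
     "fields": {"content": "divergence from randomness retrieval", "title": "retrieval", "anchor": "expert"}},
]

# Per-field importance weights (hypothetical values): title and anchor text
# are boosted relative to body content, in the spirit of field-based retrieval.
FIELD_WEIGHTS = {"content": 1.0, "title": 2.0, "anchor": 1.5}

def doc_score(doc, query_terms, idf):
    """Weighted sum of per-field term frequencies, scaled by a simple IDF."""
    score = 0.0
    for field, text in doc["fields"].items():
        tokens = text.split()
        for t in query_terms:
            tf = tokens.count(t)
            if tf:
                score += FIELD_WEIGHTS[field] * tf * idf[t]
    return score

def rank_candidates(docs, query):
    """Aggregate document scores into candidate-profile scores and rank them."""
    query_terms = query.lower().split()
    n = len(docs)
    # Document frequency of each query term, counted over all fields.
    df = {t: sum(1 for d in docs
                 if any(t in f.split() for f in d["fields"].values()))
          for t in query_terms}
    idf = {t: math.log((n + 1) / (df[t] + 0.5)) for t in query_terms}
    profiles = defaultdict(float)
    for d in docs:
        profiles[d["candidate"]] += doc_score(d, query_terms, idf)
    return sorted(profiles.items(), key=lambda kv: kv[1], reverse=True)

print(rank_candidates(DOCS, "expert retrieval"))
```

A real implementation would also keep, per candidate, the top-scoring documents in the profile, so the interface can display them alongside the ranking as the abstract describes.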
doi:10.1145/1148170.1148345 dblp:conf/sigir/MacdonaldO06a