The Use of Implicit Evidence for Relevance Feedback in Web Retrieval [chapter]

Ryen W. White, Ian Ruthven, Joemon M. Jose
2002 Lecture Notes in Computer Science  
In this paper we report on the application of two contrasting types of relevance feedback for web retrieval. We compare two systems: one using explicit relevance feedback (where searchers must explicitly mark documents as relevant) and one using implicit relevance feedback (where the system endeavours to estimate relevance by mining the searcher's interaction). The feedback is used to update the display according to the user's interaction. Our research focuses on the degree to which implicit evidence of document relevance can be substituted for explicit evidence. We examine the two variations in terms of both user opinion and search effectiveness.

[...] based solely on the initial query, and the resultant need for query modification have already been identified [22]. Relevance feedback systems automatically resubmit the initial query, expanding it using terms taken from the documents marked relevant by the user. In practice, relevance feedback can be very effective, but it relies on users assessing the relevance of documents and indicating to the system which documents contain relevant information. In real-life Internet searches, users may be unwilling to browse to web pages to gauge their relevance. Such a task imposes an increased burden and increased cognitive load [20]. Documents may be lengthy or complex, users may have time restrictions, or the initial query may have retrieved a poor set of documents. An alternative strategy is to present a query-biased summary of each of the first n web pages returned in response to a user's query [23]. The summaries allow users to assess documents for relevance, and give feedback, more quickly. However, the problem of getting users to indicate to the system which documents contain relevant information remains. In this paper, we examine the extent to which implicit feedback (where the system attempts to estimate what the user may be interested in) can act as a substitute for explicit feedback (where searchers explicitly mark documents relevant). We thereby attempt to side-step the problem of getting users to explicitly mark documents relevant by predicting relevance from an analysis of the user's interaction with the system. Many previous studies have sought to monitor user behaviour unobtrusively through various 'surrogate' measures (links clicked, mouseovers, scrollbar activity, etc.) [11], [7].
Through such means, other studies have sought to determine document relevance implicitly [4], [12], [9], [14]. These studies infer relevance from the time spent viewing a document. If a user 'examines' [10] a document for a long time, or if a document suffers a lot of 'read wear' [4], it is assumed to be relevant. These studies focus only on newsgroup documents and rely on users' interaction with the actual document. In this paper we extend these concepts to web result lists, using document summaries instead of the actual document. Much can be gleaned from a user's ephemeral interactions during a single search session [15]. Our system seeks to capture these interactions and predict relevance from them. Specifically, we hypothesised that implicit and explicit feedback were interchangeable as sources of relevance information for relevance feedback. By developing a system that utilised each type, we were able to compare the two approaches from the user's perspective and in terms of search effectiveness. This paper describes the system and experiments used to test the viability of interchanging implicit and explicit relevance feedback. The experiments were carried out as part of the TREC-10 interactive track. In this paper we expand on our original analysis of our experiments and provide a deeper insight into our experimental results. Section 2 describes the two systems used, section 3 the relevance feedback approaches, and section 4 the experimental methodology employed. We present and analyse the initial results in section 5, and conclude in section 6.
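
The general idea described above (estimating relevance from viewing time, then expanding the query with terms from the documents judged relevant) can be illustrated with a minimal sketch. It is not the system evaluated in the paper: the function names, the 5-second threshold, the stopword list, and the raw term-frequency ranking are all assumptions chosen only to make the example concrete.

```python
from collections import Counter

# Hypothetical stopword list; a real system would use a fuller one.
STOPWORDS = {"a", "an", "and", "for", "in", "of", "on", "the", "to"}


def infer_relevant(view_times, threshold_seconds=5.0):
    """Treat a summary as relevant if it was viewed for at least the threshold.

    The threshold is an illustrative assumption, not a value from the paper.
    """
    return [doc_id for doc_id, seconds in view_times.items()
            if seconds >= threshold_seconds]


def expand_query(query, relevant_texts, n_terms=3):
    """Append the n most frequent new terms from the relevant texts.

    Terms are split on whitespace only (no stemming or punctuation handling).
    Ties are broken alphabetically so the output is deterministic.
    """
    query_terms = query.lower().split()
    counts = Counter()
    for text in relevant_texts:
        for term in text.lower().split():
            if term not in STOPWORDS and term not in query_terms:
                counts[term] += 1
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return query_terms + [term for term, _ in ranked[:n_terms]]


# Made-up summaries and viewing times (seconds) for illustration.
summaries = {
    "d1": "implicit feedback web retrieval",
    "d2": "cooking pasta recipes",
    "d3": "relevance feedback web search",
}
view_times = {"d1": 8.0, "d2": 1.0, "d3": 6.5}

relevant_ids = infer_relevant(view_times)
expanded = expand_query("web retrieval",
                        [summaries[d] for d in relevant_ids],
                        n_terms=2)
print(relevant_ids)
print(expanded)
```

An explicit-feedback variant would differ only in how `relevant_ids` is obtained: from documents the searcher marks, rather than from observed viewing time.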
doi:10.1007/3-540-45886-7_7 fatcat:nzgecxzamfe37bcxbrkivoyt4m