Relevance criteria identified by health information users during Web searches

Abe Crystal, Jane Greenberg
2006 Journal of the American Society for Information Science and Technology  
This study focused on the relevance judgments made by health information users using the Web. Health information users were conceptualized as motivated information users concerned about how an environmental issue affects their health. Users identified their own environmental health interests, and conducted a Web search of a particular environmental health Web site. Users were asked to identify (by highlighting with a mouse) the criteria they use to assess relevance in both Web search engine surrogates and full-text Web documents. Content analysis of document criteria highlighted by users identified the criteria these users relied on most often. Key criteria identified included (in order of frequency of appearance): research, topic, scope, data, influence, affiliation, Web characteristics, and authority/person. A power-law distribution of criteria was observed (a few criteria represented most of the highlighted regions, with a long tail of occasionally-used criteria). Implications of this work are that IR systems should be tailored based on users' tendencies to rely on certain document criteria, and that relevance research should combine methods to gather richer, contextualized data. Metadata for IR systems, such as that used in search engine surrogates, could be improved by taking into account actual usage of relevance criteria. Such metadata should be user-centered (based on data from users, as in this study) and context-appropriate (fit to users' situations and tasks).

CRYSTAL/GREENBERG JASIS&T pre-print, for personal research use only

Our study extends this line of research to Web IR. Previous studies have focused on users' relevance judgments when interacting with OPACs, bibliographic databases, or simply sets of documents, abstracted from any system (e.g., Janes, 1991; Barry, 1994). The Web provides a notably different context for IR interaction: it contains diverse types of documents incorporating hypertext and multimedia, information is provided by many different authors and publishers, and documents are organized in numerous heterogeneous systems, from simple lists to complex hyperlinked networks. Research is needed to address how relevance judgments on the Web may differ from those in earlier IR environments, and then to incorporate these more nuanced conceptions into IR systems (Kekäläinen & Järvelin, 2002). Our study addresses this need by exploring the criteria employed by Web users to make relevance judgments.
Our user study complements recent work based on transaction logs (Spink, 2003), as well as other user studies with different foci (Rieh, 2002; Tombros, Ruthven & Jose, 2005). These empirical findings can, in turn, be used to support improved Web IR systems through the more effective use of metadata, surrogates, retrieval algorithms, and filtering mechanisms. Our study builds upon recent advances in relevance research by incorporating four key approaches:
1. A two-stage model of relevance judgment, in which users evaluate surrogates and then full-text documents (
doi:10.1002/asi.20436 fatcat:ft7bmjrx3vbureoswnru4iomqu