A Day in the Life of PubMed: Analysis of a Typical Day's Query Log

J. R. Herskovic, L. Y. Tanaka, W. Hersh, E. V. Bernstam
2007 JAMIA Journal of the American Medical Informatics Association  
Abstract
Objective: To characterize PubMed usage over a typical day and compare it to previous studies of user behavior on Web search engines.
Design: We performed a lexical and semantic analysis of 2,689,166 queries issued on PubMed over 24 consecutive hours on a typical day.
Measurements: We measured the number of queries, number of distinct users, queries per user, terms per query, common terms, Boolean operator use, common phrases, result set size, MeSH categories, used semantic ... rements to group queries into sessions, and studied the addition and removal of terms from consecutive queries to gauge search strategies.
Results: The size of the result sets from a sample of queries showed a bimodal distribution, with peaks at approximately 3 and 100 results, suggesting that a large group of queries was tightly focused and another was broad. Like Web search engine sessions, most PubMed sessions consisted of a single query. However, PubMed queries contained more terms.

Figure 7. Number of sessions per user for 2,689,166 queries issued on a single day.
Figure 8. Number of queries per session for 2,689,166 queries issued in a single day, as a proportion of sessions with the specified number of queries. Figure truncated at 20 queries.
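
Several of the lexical measurements named in the abstract (number of queries, distinct users, queries per user, terms per query, common terms, Boolean operator use) can be illustrated with a minimal sketch. This is not the authors' code: the log format of (user_id, query) pairs, the function name summarize_queries, whitespace tokenization, and the choice to count only uppercase AND, OR, and NOT as Boolean operators are all assumptions made for illustration.

```python
from collections import Counter

# Assumption: only these uppercase tokens count as Boolean operators.
BOOLEAN_OPERATORS = {"AND", "OR", "NOT"}


def summarize_queries(log):
    """Summarize a query log given as (user_id, query_string) pairs.

    The log format is hypothetical; a real query log would need its own parser.
    """
    queries_per_user = Counter()  # user_id -> number of queries issued
    term_counts = Counter()       # lowercased term -> occurrences
    n_queries = 0
    total_terms = 0
    boolean_queries = 0

    for user_id, query in log:
        tokens = query.split()  # naive whitespace tokenization (assumption)
        operators = [t for t in tokens if t in BOOLEAN_OPERATORS]
        terms = [t.lower() for t in tokens if t not in BOOLEAN_OPERATORS]

        n_queries += 1
        queries_per_user[user_id] += 1
        total_terms += len(terms)
        term_counts.update(terms)
        if operators:
            boolean_queries += 1

    return {
        "n_queries": n_queries,
        "n_users": len(queries_per_user),
        "queries_per_user": dict(queries_per_user),
        # Guard against an empty log to avoid division by zero.
        "mean_terms_per_query": total_terms / n_queries if n_queries else 0.0,
        "boolean_fraction": boolean_queries / n_queries if n_queries else 0.0,
        "common_terms": term_counts.most_common(10),
    }


if __name__ == "__main__":
    # Made-up sample data, not taken from the study.
    sample_log = [
        ("u1", "asthma AND children"),
        ("u1", "asthma"),
        ("u2", "heart failure NOT diabetes"),
    ]
    print(summarize_queries(sample_log))
```

Grouping queries into sessions and the semantic (MeSH-based) analysis are deliberately left out, since the abstract does not give enough detail to sketch them faithfully.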
doi:10.1197/jamia.m2191 pmid:17213501 pmcid:PMC2213463 fatcat:bwa2d63d75f27f3laqjpiidobq