The Impact of Future Term Statistics in Real-Time Tweet Search [chapter]

Yulu Wang, Jimmy Lin
2014 Lecture Notes in Computer Science  
In the real-time tweet search task operationalized in the TREC Microblog evaluations, a topic consists of a query Q and a time t, modeling the task where the user wishes to see the most recent but relevant tweets that address the information need. To simulate the real-time aspect of the task in an evaluation setting, many systems search over the entire collection and then discard results that occur after the query time. This approach, while computationally efficient, "cheats" in that it takes advantage of term statistics from documents not available at query time (i.e., future information). We show, however, that such results are nearly identical to a "gold standard" method that builds a separate index for each topic containing only those documents that occur before the query time. The implications of this finding on evaluation, system design, and user task models are discussed.
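The two strategies contrasted in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the toy collection `TWEETS`, the smoothed IDF formula, the sum-of-IDF scoring, and the function names are hypothetical and are not the retrieval model or data used in the chapter.

```python
import math
from collections import Counter

# Hypothetical toy collection: (timestamp, text) pairs standing in for tweets.
TWEETS = [
    (1, "earthquake hits city"),
    (2, "city marathon today"),
    (3, "earthquake relief effort"),
    (4, "earthquake aftershock reported"),
    (5, "marathon results posted"),
]


def idf_table(docs):
    """Compute a smoothed IDF, log(1 + N / df), over the given documents."""
    n = len(docs)
    df = Counter(term for _, text in docs for term in set(text.split()))
    return {term: math.log(1 + n / count) for term, count in df.items()}


def score(query, text, idf):
    """Sum the IDF of each query term that appears in the document."""
    terms = set(text.split())
    return sum(idf.get(q, 0.0) for q in query.split() if q in terms)


def rank(query, candidates, idf):
    """Return timestamps of matching candidates, best score first."""
    scored = [(score(query, text, idf), ts) for ts, text in candidates]
    ordered = sorted(scored, key=lambda pair: (-pair[0], pair[1]))
    return [ts for s, ts in ordered if s > 0]


def search_with_future_stats(query, t):
    """Use statistics from the whole collection, then discard results after t."""
    idf = idf_table(TWEETS)
    return [ts for ts in rank(query, TWEETS, idf) if ts <= t]


def search_per_topic_index(query, t):
    """Use statistics only from documents available at query time t."""
    past = [(ts, text) for ts, text in TWEETS if ts <= t]
    return rank(query, past, idf_table(past))
```

Calling both functions with the query `"earthquake city"` and `t = 3` returns the same ranking on this toy data, even though the IDF values differ between the full collection and the time-restricted subset. That agreement is a property of this small example only; it does not reproduce the chapter's experiments.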
doi:10.1007/978-3-319-06028-6_58 fatcat:lcjpxmevjzamxaqv4iyykjhcja