NOVASearch at TREC 2017 Real-Time Summarization Track

Gustavo Gonçalves, Flávio Martins, João Magalhães
2017 Text Retrieval Conference  
The rise of large data streams introduces new challenges for delivering content that is relevant to an information need, which can be seen as a broad topic. One strategy for delivering the most relevant documents on such a topic is summarization. The TREC 2017 Real-Time Summarization (RTS) track provides a testbed for developing stream-based real-time summarization systems. Using the social media network Twitter, participants are challenged to deliver the most relevant and diverse information in two main scenarios. The real-time push notifications scenario, or Scenario A, focuses on identifying and delivering relevant information in near real time, whereas the daily-digest scenario, or Scenario B, aims at the daily delivery of the most relevant and diverse documents. This paper presents the participation of the NOVASearch group at TREC 2017 RTS. Our work targets Scenario B and follows an analysis of the systems proposed for TREC RTS 2016. In our approach we explore document filtering methods, vocabulary expansion, and the identification of subtopics through the aggregation of documents based on their textual similarity.
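To make the subtopic idea concrete, the following is a minimal sketch of grouping documents by textual similarity and keeping one representative per group for a daily digest. It is not the NOVASearch system: the Jaccard measure, the greedy single-pass grouping, the 0.5 threshold, and all function names (`tokenize`, `jaccard`, `group_by_similarity`, `daily_digest`) are assumptions chosen for illustration.

```python
import re


def tokenize(text):
    """Lowercase the text and return its set of word-like tokens."""
    return set(re.findall(r"[a-z0-9#@']+", text.lower()))


def jaccard(a, b):
    """Jaccard similarity between two token sets (0.0 if both are empty)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def group_by_similarity(tweets, threshold=0.5):
    """Greedy single-pass grouping (an illustrative choice, not the paper's).

    Each tweet joins the first group whose first member is at least
    `threshold` similar to it; otherwise it starts a new group.
    """
    clusters = []  # each cluster is a list of (text, tokens) pairs
    for text in tweets:
        tokens = tokenize(text)
        for cluster in clusters:
            if jaccard(tokens, cluster[0][1]) >= threshold:
                cluster.append((text, tokens))
                break
        else:
            clusters.append([(text, tokens)])
    return [[t for t, _ in cluster] for cluster in clusters]


def daily_digest(tweets, threshold=0.5, max_items=10):
    """Return one representative (the first seen) per group, up to max_items."""
    groups = group_by_similarity(tweets, threshold)
    return [group[0] for group in groups[:max_items]]
```

Called on a list of tweets already filtered for relevance, `daily_digest` returns at most `max_items` texts, one per group, so near-duplicates of an already selected tweet are left out of the digest.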
dblp:conf/trec/GoncalvesMM17 fatcat:dywedw2pf5fgvjguhmkbyypzmq