Microblog Processing: A Study

Sandip Modha
2017 Forum for Information Retrieval Evaluation  
Sensing microblogs for retrieval and summarization has become a challenging area for the information retrieval community. Twitter is one of the most popular microblogging platforms. In this paper, Twitter posts, called tweets, are studied from the retrieval and extractive summarization perspectives. Given a set of topics, interest profiles, or information requirements, a microblog summarization system is designed that processes the Twitter sample status stream and generates day-wise, topic-wise tweet [...]. Since the volume of the Twitter public status stream is very large, tweet filtering, or relevant tweet retrieval, is the primary task of the summarization system. To measure the relevance between tweets and interest profiles, a language model with Jelinek-Mercer smoothing, a language model with Dirichlet smoothing, and the Okapi BM25 model are used. The behaviour of the language model smoothing parameters, λ for Jelinek-Mercer smoothing and µ for Dirichlet smoothing, is also studied. Summarization is treated as a clustering problem. The TREC MB 2015 and TREC RTS 2016 datasets are used for the experiments. The official TREC RTS metrics nDCG@10-1 and nDCG@10-0 are used to evaluate the results. A detailed post hoc analysis of the experimental results is also performed.
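The three relevance models named in the abstract can be illustrated with a minimal Python sketch. This is not the paper's implementation: the function names, the toy token lists, and the default values of λ, µ, k1, and b are all hypothetical choices made for illustration. The sketch assumes the common convention in which λ weights the collection model in Jelinek-Mercer smoothing, and it uses one widely used BM25 variant with a non-negative IDF.

```python
import math
from collections import Counter


def jm_score(query, doc, coll_tf, coll_len, lam=0.5):
    """Query log-likelihood with Jelinek-Mercer smoothing.

    Assumed convention: p(t|d) = (1 - lam) * tf/|d| + lam * p(t|C).
    """
    tf = Counter(doc)
    dl = len(doc)
    score = 0.0
    for term in query:
        p_doc = tf[term] / dl if dl else 0.0
        p_coll = coll_tf.get(term, 0) / coll_len
        p = (1.0 - lam) * p_doc + lam * p_coll
        if p == 0.0:  # term unseen in the whole collection
            return float("-inf")
        score += math.log(p)
    return score


def dirichlet_score(query, doc, coll_tf, coll_len, mu=100.0):
    """Query log-likelihood with Dirichlet smoothing.

    p(t|d) = (tf + mu * p(t|C)) / (|d| + mu)
    """
    tf = Counter(doc)
    dl = len(doc)
    score = 0.0
    for term in query:
        p_coll = coll_tf.get(term, 0) / coll_len
        p = (tf[term] + mu * p_coll) / (dl + mu)
        if p == 0.0:  # term unseen in the whole collection
            return float("-inf")
        score += math.log(p)
    return score


def bm25_score(query, doc, df, n_docs, avgdl, k1=1.2, b=0.75):
    """Okapi BM25 score (variant with a non-negative IDF; an assumption)."""
    tf = Counter(doc)
    dl = len(doc)
    score = 0.0
    for term in query:
        n_t = df.get(term, 0)
        idf = math.log((n_docs - n_t + 0.5) / (n_t + 0.5) + 1.0)
        norm = tf[term] + k1 * (1.0 - b + b * dl / avgdl)
        score += idf * tf[term] * (k1 + 1.0) / norm
    return score


def collection_stats(docs):
    """Collection term counts, total length, and document frequencies."""
    coll_tf = Counter(t for d in docs for t in d)
    coll_len = sum(len(d) for d in docs)
    df = Counter(t for d in docs for t in set(d))
    return coll_tf, coll_len, df


if __name__ == "__main__":
    # Toy, already-tokenized "tweets" and an interest profile (illustrative).
    tweets = [
        ["earthquake", "nepal", "relief"],
        ["football", "match", "tonight"],
        ["nepal", "earthquake", "rescue", "teams"],
    ]
    profile = ["nepal", "earthquake"]
    coll_tf, coll_len, df = collection_stats(tweets)
    avgdl = coll_len / len(tweets)
    for tweet in tweets:
        print(
            tweet,
            jm_score(profile, tweet, coll_tf, coll_len),
            dirichlet_score(profile, tweet, coll_tf, coll_len),
            bm25_score(profile, tweet, df, len(tweets), avgdl),
        )
```

Under this convention, a larger λ or µ moves each tweet's language model closer to the collection model. Because tweets are very short, these parameters can change the ranking noticeably, which is presumably why the abstract studies their behaviour.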
dblp:conf/fire/Modha17 fatcat:xolemzgx2fdc3oggvkzqrvcdqe