Hypergeometric language models for republished article finding
2011
Proceedings of the 34th international ACM SIGIR conference on Research and development in Information Retrieval - SIGIR '11
Republished article finding is the task of identifying instances of articles that have been published in one source and republished more or less verbatim in another source, which is often a social media source. We address this task as an ad hoc retrieval problem, using the source article as a query. Our approach is based on language modeling. We revisit the assumptions underlying the unigram language model, taking into account the fact that in our setup queries are as long as complete news
doi:10.1145/2009916.2009983
dblp:conf/sigir/TsagkiasRW11
fatcat:6xiam7p3wbehbcqrfqob4wroyi