The Alyssa System at TREC QA 2007: Do We Need Blog06?

Dan Shen, Michael Wiegand, Andreas Merkel, Stefan Kazalski, Sabine Hunsicker, Jochen L. Leidner, Dietrich Klakow
2007 Text Retrieval Conference  
We describe the participation of the Saarland University LSV group in the DARPA/NIST TREC 2007 Q&A track with the Alyssa system, which combines cascaded language-model-based information retrieval (LMIR) with data-driven learning methods for answer extraction and ranking. The approach was previously proven on news data. To test its robustness across document collections of varying levels of subjectivity, we test the hypothesis that answer accuracy on factoid questions does not decrease significantly if blog data is added. Our results show that the method remains competitive on larger datasets with mixed content, such as the union of the AQUAINT 2 (news) and BLOG 06 (blog) corpora (Macdonald and Ounis, 2006). We also present evaluation results on an unofficial set of questions that were manually generated at LSV from BLOG 06 documents.
dblp:conf/trec/ShenWMKHLK07 fatcat:a26qzivj2jd33gtxz22m577dry