Bridging Languages for Question Answering: DIOGENE at CLEF 2003 [chapter]

Matteo Negri, Hristo Tanev, Bernardo Magnini
2004 Lecture Notes in Computer Science  
This paper presents the extension of the ITC-irst DIOGENE Question Answering system towards multilinguality. DIOGENE relies on a well-tested three-component architecture built in the framework of our participation in the QA track at the Text Retrieval Conference (TREC 2002). The novelties are the enhancement of the system with language-specific tools for Italian (e.g. a module in charge of answer-type extraction, and a named-entity recognizer)
... the introduction of a module for the translation of Italian queries into English queries. The overall architecture of the extended system, as well as the results obtained in the CLEF-2003 Monolingual Italian and Bilingual Italian/English QA tracks, are presented and discussed throughout the paper.
... Stampa newspaper and the 85Mb corpus of the 1994 SDA press agency), the target collection for the B-I/E task was composed of English texts (the 425Mb corpus of the whole year 1994 of the Los Angeles Times). Focusing on the system's architecture, Section 2 describes the question processing component, while Sections 3 and 4 describe in detail the search and answer extraction components, respectively. Finally, Sections 5 and 6 conclude the paper, presenting the results of the different runs submitted for evaluation and drawing some conclusions.
doi:10.1007/978-3-540-30222-3_48 fatcat:mhyrt4nxzvaptcicicyfsemnsq