The Multi-model DBMS Architecture and XML Information Retrieval [chapter]

Arjen P. de Vries, Johan A. List, Henk Ernst Blok
2003 Lecture Notes in Computer Science  
Computer science has long distinguished between information retrieval and data retrieval. Information retrieval is the problem of ranking textual documents by their content, with the goal of identifying documents relevant to a user's information need. Data retrieval involves exact match, that is, checking a data collection for the presence or absence of precisely specified items. Now that XML has become a standard document model that allows structure and text content to be represented in a combined way, new generations of information retrieval systems are expected to handle semi-structured documents instead of plain text. Their usage scenarios require combining 'conventional' ranking with other query constraints: constraints based on the structure of text documents, on information extracted from various media (or various media representations), or on additional information induced during the query process. Consider, for example, an XML collection representing a newspaper archive and the information need 'recent English newspaper articles about Willem-Alexander dating Maxima'. This can be expressed as the following query (syntax in the spirit of the XQuery-Fulltext working draft [1]):
doi:10.1007/978-3-540-45194-5_12 fatcat:qtjesgap3zh3dbivuzrzbgn4ee