Cross-Language Information Retrieval on the Web [chapter]

María-Dolores Olvera-Lobo
Handbook of Research on Social Dimensions of Semantic Technologies and Web Services  
The Web stands today as the world's largest source of public information. Its magnitude can also be a drawback, however: a widespread problem today is retrieving documents that may be written in any language through queries expressed in a single source language. And although Information Retrieval (IR) depends on the availability of digital collections, this key aspect is no longer the only concern. It is time for the multicultural society of [...] Internet to make use of new technologies such as Cross-Language Information Retrieval (CLIR). Whereas classical IR is a field that embraces retrieval models, evaluation, query languages and document indexing involving "small" collections of documents, modern IR tends to focus on Internet search engines, mark-up languages, multimedia contents, the distribution of collections, user interaction and multilingual systems. Thus, CLIR may border on work in the following fields: information retrieval, natural language processing, machine translation and abstracting, speech processing, the interpretation of document images, and human-computer interaction. "Given a query in any medium and any language, select relevant items from a multilingual multimedia collection which can be in any medium and any language, and present them in the style or order most likely to be useful to the querier, with identical or near identical objects in different media or languages appropriately identified" (Hull & Oard, 1997). This sentence sums up the main objective of CLIR, which was acknowledged as an independent research subfield roughly a decade ago; a number of international CLIR conferences are now held around the world.
doi:10.4018/978-1-60566-650-1.ch034 fatcat:xombpvh52vbyjc4hr3n7poh3oi