Word normalization and decompounding in mono- and bilingual IR

Eija Airio
2006 Information retrieval (Boston)  
The present research studies the impact of decompounding and two different word normalization methods, stemming and lemmatization, on monolingual and bilingual retrieval.  ...  The reason for the poorer performance of indexes without decompounding in bilingual retrieval is the difference between the source language and target languages: phrases are used in English, while compounds  ...  GlobalDix Dictionary Software was used for automatic word-by-word translations. Copyright (c) 1998 Kielikone plc, Finland. MOT Dictionary Software was used for automatic word-by-word translations.  ... 
doi:10.1007/s10791-006-0884-2 fatcat:vsjslmor5vbrrixfrbu2obphaq

ITC-irst at CLEF 2003: Monolingual, Bilingual, and Multilingual Information Retrieval [chapter]

Nicola Bertoldi, Marcello Federico
2004 Lecture Notes in Computer Science  
This paper reports on the participation of ITC-irst in the Cross Language Evaluation Forum 2003; in particular, in the monolingual, bilingual, small multilingual, and spoken document retrieval tracks.  ...  As in the last CLEF, bilingual models integrate retrieval and translation scores over the set of N-best translations of the source query.  ...  German word decompounding seems to be slightly effective, as shown by comparing the run without decompounding (de-en-1bst-brf-bfr) and the run with it (de-en-dec-1bst-brf-bfr).  ... 
doi:10.1007/978-3-540-30222-3_13 fatcat:o3s2nfbufvebdj3nvifnpz3r4m

The impact of evaluation on multilingual text retrieval

Julio Gonzalo, Carol Peters
2005 Proceedings of the 28th annual international ACM SIGIR conference on Research and development in information retrieval - SIGIR '05  
contributed to advances in the state-of-the-art.  ...  We summarize the impact of the first five years of activity of the Cross-Language Evaluation Forum (CLEF) on multilingual text retrieval system performance and show how the CLEF evaluation campaigns have  ...  In the first years, the core track, also known as the ad-hoc track, has consisted of a number of monolingual, bilingual and multilingual document retrieval tasks and has produced the largest available  ... 
doi:10.1145/1076034.1076149 dblp:conf/sigir/GonzaloP05 fatcat:2mjbi2xhpbc6njpofwuvjfahru

Cheshire at GeoCLEF 2008: Text and Fusion Approaches for GIR [chapter]

Ray R. Larson
2009 Lecture Notes in Computer Science  
In this paper we will briefly describe the approaches taken by Berkeley for the main GeoCLEF 2008 tasks (Mono and Bilingual retrieval).  ...  Our results were good overall with Cheshire systems runs appearing in the top 5 participants for each task (German, English and Portuguese both Monolingual and Bilingual) with the highest ranked runs for  ...  The German language runs did not use decompounding in the indexing and querying processes to generate simple word forms from compounds.  ... 
doi:10.1007/978-3-642-04447-2_108 fatcat:3h3hbqtz3fad5cjq5qemceygwa

Evaluation of MIRACLE Approach Results for CLEF 2003

José Luis Martínez-Fernández, Julio Villena-Román, Jorge Fombella, Ana García-Serrano, Alberto Ruiz-Cristina, Paloma Martínez, José M. Goñi, José Carlos González
2003 Conference and Labs of the Evaluation Forum  
This paper describes the MIRACLE (Multilingual Information RetrievAl for the CLEf campaign) approach and results for the mono, bi and multilingual Cross Language Evaluation Forum tasks.  ...  The approach is based on the combination of linguistic and statistical techniques to perform indexing and retrieval tasks.  ...  Traditional IR works directly with the words used in the initial user query, and most of the effectiveness of IR systems comes from the matches among these query words (including morphological changes,  ... 
dblp:conf/clef/Martinez-FernandezVFGRMGG03 fatcat:h4kk2fzjwvdczh4trobwodh34a

Exploring New Languages with HAIRCUT at CLEF 2005 [chapter]

Paul McNamee
2006 Lecture Notes in Computer Science  
In our monolingual tests n-grams were more effective than unnormalized words for retrieval in Bulgarian (+30%) and Hungarian (+63%).  ...  This year we participated in the ad hoc cross-language track and submitted both monolingual and bilingual runs. We undertook our first investigations in the Bulgarian and Hungarian languages.  ...  N-grams also obviate the need to perform decompounding (e.g., in German) or word segmentation (e.g., in Chinese).  ... 
doi:10.1007/11878773_17 fatcat:ugsqm4uofzbo3kixvk2pswsszy

CLEF 2005: Ad Hoc Track Overview [chapter]

Giorgio M. Di Nunzio, Nicola Ferro, Gareth J. F. Jones, Carol Peters
2006 Lecture Notes in Computer Science  
The main stream offered mono- and bilingual tasks on target collections for central European languages (Bulgarian, Czech and Hungarian).  ...  The second stream, designed for more experienced participants, offered mono- and bilingual "robust" tasks with the objective of privileging experiments which achieve good stable performance over all queries  ...  New pools were formed in CLEF 2007 for the runs submitted for the main stream mono- and bilingual tasks.  ... 
doi:10.1007/11878773_2 fatcat:dtgh6jjcejbsrf6hqrlcf67vea

CLEF 2007: Ad Hoc Track Overview [chapter]

Giorgio M. Di Nunzio, Nicola Ferro, Thomas Mandl, Carol Peters
2008 Lecture Notes in Computer Science  
The main stream offered mono- and bilingual tasks on target collections for central European languages (Bulgarian, Czech and Hungarian).  ...  The second stream, designed for more experienced participants, offered mono- and bilingual "robust" tasks with the objective of privileging experiments which achieve good stable performance over all queries  ...  New pools were formed in CLEF 2007 for the runs submitted for the main stream mono- and bilingual tasks.  ... 
doi:10.1007/978-3-540-85760-0_2 fatcat:ythemxmvdrcu3cffjer2rgrewa

Current research issues and trends in non-English Web searching

Fotis Lazarinis, Jesús Vilares, John Tait, Efthimis N. Efthimiadis
2009 Information retrieval (Boston)  
A significant number of papers are reviewed and the research issues investigated in these studies are categorized in order to identify the research questions and solutions proposed in these papers.  ...  With increasingly higher numbers of non-English language web searchers the problems of efficient handling of non-English Web documents and user queries are becoming major issues for search engines.  ...  Vilares' research has been partially funded by the Spanish Government and FEDER (through project HUM200766607C0403) and the Galician Autonomous Government (through the "Galician Network for NLP and IR"  ... 
doi:10.1007/s10791-009-9093-0 fatcat:ueniqckvaffe5gftjwkkrlki7y

From CLEF to TrebleCLEF: the Evolution of the Cross-Language Evaluation Forum

Nicola Ferro, Carol Peters
2008 NTCIR Conference on Evaluation of Information Access Technologies  
In the first part of the paper, we provide a brief overview of the entire activity and summarise the main achievements; in the second part, we focus our attention on the Ad Hoc track with the aim of showing  ...  In the final part, we outline our main ideas for the future of CLEF.  ...  The work reported has been partially supported by the TrebleCLEF Coordination Action, FP7 ICT programme for Digital Libraries and Technology-enhanced Learning. Grant agreement: 215231.  ... 
dblp:conf/ntcir/FerroP08 fatcat:h3p27ct6grgdnoe4sr52ce77va

Semantic annotation for concept-based cross-language medical information retrieval

Martin Volk, Bärbel Ripplinger, Špela Vintar, Paul Buitelaar, Diana Raileanu, Bogdan Sacaleanu
2002 International Journal of Medical Informatics  
The paper describes experiments in monolingual and cross-language document retrieval, performed on a corpus of medical abstracts.  ...  On the other hand they show that semantic information, specifically the combined use of concepts and relations, increases the performance in monolingual and cross-language retrieval.  ...  The similarity thesaurus is thus a bilingual lexicon with a broad translation set (in our case 10 similar English words per German word).  ... 
doi:10.1016/s1386-5056(02)00058-8 pmid:12460635 fatcat:matcl4ufunaczooia2wzuh2ieu

Cross-Language Evaluation Forum: Objectives, Results, Achievements

Martin Braschler, Carol Peters
2004 Information retrieval (Boston)  
and development in the multilingual information access domain.  ...  We summarize the main lessons learned during this period, outline the state-of-the-art of the research reported in the CLEF experiments and discuss the contribution that this initiative has made to research  ...  Acknowledgments The authors would like to acknowledge the help, support and advice of numerous individuals and organizations which has been invaluable in the organization of the CLEF campaigns.  ... 
doi:10.1023/b:inrt.0000009438.69013.fa fatcat:pu7xyyfwkzhjdh7wptsn3qsuce

Translation techniques in cross-language information retrieval

Dong Zhou, Mark Truran, Tim Brailsford, Vincent Wade, Helen Ashman
2012 ACM Computing Surveys  
Unlike IR, CLIR must reconcile queries and documents which are written in different languages.  ...  Like IR, CLIR is centred upon the search for documents, and for information contained within those documents.  ...  ACKNOWLEDGMENTS This research was partially supported by a PhD scholarship from the University of Nottingham and funding from the Science Foundation Ireland (Grant 07/CE/I1142) as part of the Centre for  ... 
doi:10.1145/2379776.2379777 fatcat:mu5p5djufjghvn3xjppekmwnwu

CLEF 15th Birthday

Nicola Ferro
2014 SIGIR Forum  
2014 marks the 15th birthday for CLEF, an evaluation campaign activity which has applied the Cranfield evaluation paradigm to the testing of multilingual and multimodal information access systems in Europe  ...  This paper provides a summary of the motivations which led to the establishment of CLEF, and a description of how it has evolved over the years, the major achievements, and what we see as the next challenges  ...  We would like to sincerely and warmly thank Maristella Agosti, Donna Harman, and Carol Peters (Coordinator of CLEF 2000-2009) for their precious and continuous advice and suggestions during this journey  ... 
doi:10.1145/2701583.2701587 fatcat:exd4r3qlznhqhm77efsyof5gee

Assessing Translation Quality for Cross Language Image Retrieval [chapter]

Paul Clough, Mark Sanderson
2004 Lecture Notes in Computer Science  
Quality is also measured using an automatic score derived from the mteval MT evaluation tool, and compared to the manual assessment score.  ...  The quality of translation is assessed manually by comparing the original ImageCLEF topics with the output from Systran and rated by assessors based on their semantic content.  ...  Acknowledgments We would like to thank members of the NLP group and Department of Information Studies for their time and effort in producing manual assessments.  ... 
doi:10.1007/978-3-540-30222-3_57 fatcat:ogfhg2u22bavfadaxghqttlc7q