Exploiting the Web as the multilingual corpus for unknown query translation
2006
Journal of the American Society for Information Science and Technology
In this article, the authors investigate the feasibility of exploiting the Web as the multilingual corpus source to translate unknown query terms for cross-language information retrieval in digital libraries ...
They propose a Web-based term translation approach to determine effective translations for unknown query terms by mining bilingual search-result pages obtained from a real Web search engine. ...
We intend to exploit the Web as the corpus to find effective translations automatically for query terms not included in a dictionary (unknown terms). ...
doi:10.1002/asi.20328
fatcat:xmr2ekbzorfjffegzfvbxy6zoq
Introduction to the special topic section on multilingual information systems
2006
Journal of the American Society for Information Science and Technology
Parallel and comparable corpora are important for generating a statistical translation model to overcome the limitations of a manually generated dictionary. ...
All of this reveals the importance of research in multilingual information systems. There are several essential components in multilingual information systems, as depicted in Figure 1. ...
the Web as the multilingual corpus source for translating unknown query terms. ...
doi:10.1002/asi.20325
fatcat:rjg7qleo7fh3zmdmhclksbg53i
Anchor text mining for translation of Web queries
2004
ACM Transactions on Information Systems
To discover translation knowledge in diverse data resources on the Web, this article proposes an effective approach to finding translation equivalents of query terms and constructing multilingual lexicons ...
Although Web anchor texts are wide-scoped hypertext resources, not every particular pair of languages contains sufficient anchor texts for effective extraction of translations for Web queries. ...
ACKNOWLEDGMENTS The authors would like to thank Prof. Mark Sanderson and the anonymous reviewers for their valuable comments and suggestions. Many thanks are given to Mr. ...
doi:10.1145/984321.984324
fatcat:75mnaq3qmza6vdduluhn3yhm5m
Towards Web Mining of Query Translations for Cross-Language Information Retrieval in Digital Libraries
[chapter]
2003
Lecture Notes in Computer Science
Web mining methods that can exploit huge amounts of multilingual and wide-scoped Web resources as live bilingual corpora have received great attention to alleviate the translation difficulties of query ...
methods, which exploit huge amounts of multilingual and wide-scoped Web resources as live bilingual corpora to alleviate translation difficulties, and have been proven particularly effective for extracting ...
Therefore, we present search-result-based approaches to fully exploiting Web resources where search result pages of queries submitted to real search engines are used as the corpus for extracting translations ...
doi:10.1007/978-3-540-24594-0_8
fatcat:zdak4nwubndkzksnk2mt5xmmxi
CROSS LANGUAGE INFORMATION RETRIEVAL: IN INDIAN LANGUAGE PERSPECTIVE
2014
International Journal of Research in Engineering and Technology
This makes cross-language information retrieval (CLIR) and multilingual information retrieval (MLIR) for Web applications a valuable need of the day. ...
It will also discuss the issues related to English-to-Hindi translation. We tested 30 queries manually using the suggested prototype and found that the precision level is quite good. ...
Wang et al. (2004) exploit the bilingual search result pages obtained from a real search engine as a corpus for automatic translation of unknown query terms not included in the dictionary. ...
doi:10.15623/ijret.2014.0322010
fatcat:zcjebdivyfcivnzjfvxpk2ec5u
TNO at CLEF-2001: Comparing Translation Resources
[chapter]
2002
Lecture Notes in Computer Science
The main contribution of this paper is a systematic comparison of three types of translation resources for bilingual retrieval based on query translation. ...
This paper describes the official runs of TNO TPD for CLEF-2001. We participated in the monolingual, bilingual and multilingual tasks. ...
We also thank George Foster and Jian-Yun Nie (also RALI) for general discussions about the application of statistical translation models for CLIR. ...
doi:10.1007/3-540-45691-0_6
fatcat:iacbryudt5fbbj2477uiyrtkdq
Precision at K in Multilingual Information Retrieval
2011
International Journal of Computer Applications
A Multilingual Information Retrieval (MLIR) system helps users pose a query in one language and retrieve documents in more than one language. ...
Information Retrieval (IR) is used to store and represent the knowledge and the retrieval of information relevant for a special user query. ...
The MLIR techniques are: An approach for exploiting the Web as the multilingual corpus source for translating unknown query terms has been proposed by [2]. ...
doi:10.5120/2990-3929
fatcat:dwhgticdujaffjyeuif53fqq5u
Translation Resources, Merging Strategies, and Relevance Feedback for Cross-Language Information Retrieval
[chapter]
2001
Lecture Notes in Computer Science
Finally, we performed preliminary experiments to exploit the web to generate translation probabilities and bilingual dictionaries, notably for English-Italian and English-Dutch. ...
This paper describes the official runs of the Twenty-One group for the first CLEF workshop. The Twenty-One group participated in the monolingual, bilingual and multilingual tasks. ...
Acknowledgements We would like to thank the Druid project for sponsoring the translation of the topic set into Dutch. We thank Xerox XRCE for making the Xelda morphological toolkit available to us. ...
doi:10.1007/3-540-44645-1_10
fatcat:corp7lp6uvae5bb4pms3dlbv4m
Translating unknown cross-lingual queries in digital libraries using a web-based approach
2004
Proceedings of the 2004 joint ACM/IEEE conference on Digital libraries - JCDL '04
In this paper, we investigate the feasibility of exploiting the Web as the corpus source to translate unknown query terms for cross-language information retrieval (CLIR) in digital libraries. ...
We propose a Web-based term translation approach to determine effective translations for unknown query terms by mining bilingual search-result pages obtained from a real Web search engine. ...
We intend to exploit the Web as the corpus to find effective translations automatically for query terms not included in a dictionary (unknown terms). ...
doi:10.1145/996350.996378
dblp:conf/jcdl/WangTCLC04
fatcat:4z7xxoqpxzgzncncyl2ubuvbee
Translation of web queries using anchor text mining
2002
ACM Transactions on Asian Language Information Processing
The proposed approach successfully exploits the anchor-text resources and reduces the existing difficulties of query term translation. ...
This article presents an approach to automatically extracting translations of Web query terms through mining of Web anchor texts ...
The authors would like to thank Kam-Fai Wong and Noriko Kando, and also the anonymous reviewers for their valuable comments and suggestions. ...
doi:10.1145/568954.568958
fatcat:2mrbolllebgclayqicdmkxjrn4
Creating multilingual translation lexicons with regional variations using web corpora
2004
Proceedings of the 42nd Annual Meeting on Association for Computational Linguistics - ACL '04
The purpose of this paper is to automatically create multilingual translation lexicons with regional variations. ...
We propose a transitive translation approach to determine translation variations across languages that have insufficient corpora for translation via the mining of bilingual search-result pages and clues ...
In addition, Simard (2000) exploited the transitive properties of translations to improve the quality of multilingual text alignment. ...
doi:10.3115/1218955.1219023
dblp:conf/acl/ChengLTC04
fatcat:fhyi4bmy3jhgfd7yq2ausw7ysm
ParCourE: A Parallel Corpus Explorer for a Massively Multilingual Corpus
[article]
2021
arXiv
pre-print
ParCourE can be set up for any parallel corpus and can thus be used for typological research on other corpora as well as for exploring their quality and properties. ...
Researching typological properties of languages is fundamental for progress in multilingual NLP. ...
We exploit the generated word alignments to induce lexicons for all 889,111 language pairs. To this end, we consider aligned words as translations of each other. ...
arXiv:2107.06632v2
fatcat:sq37ij4dfvgptca5bk4omizf44
Compilation and Exploitation of Parallel Corpora
2003
Journal of Computing and Information Technology
Parallel corpora can be used as a translation aid for second-language learners, for translators and lexicographers, or as a data-source for various language technology tools. ...
Two exploitation results over our annotated corpora are also presented, namely a Web concordancer and the extraction of bi-lingual lexica. ...
Acknowledgements The author would like to thank the company Amebis, d.o.o., for lexically annotating the Slovene part of the IJS-ELAN corpus and Jin-Dong Kim for useful comments on a previous version of ...
doi:10.2498/cit.2003.02.02
fatcat:ddwdai2mhnfy3eh72wwesx33dm
Cross-Language Information Retrieval
2010
Synthesis Lectures on Human Language Technologies
A method that exploits parallel texts for query translation is proposed. This method is shown to allow for retrieval effectiveness comparable to the state-of-the-art effectiveness. ...
In order to increase the translation accuracy, compound terms are extracted and incorporated into the translation models, so that compounds can be translated as a unit, rather than as separate words. ...
This problem is more and more acute for IR on the Web due to the fact that the Web is a truly multilingual environment. ...
doi:10.2200/s00266ed1v01y201005hlt008
fatcat:a7ncb6fhkfcu5njlwsdllx45nu
Integration Of Machine Translation In On-Line Multilingual Applications: Domain Adaptation
[chapter]
2018
Zenodo
Large amounts of bilingual corpora are used in the training process of statistical machine translation systems. Usually a general domain is used as the training corpus. ...
In this paper, we used language model interpolation as a domain adaptation method and proved that it is a fast state of the art method that can be used in building adapted translation systems even when ...
We want to thank the anonymous reviewers for their comments and constructive suggestions. ...
doi:10.5281/zenodo.1291936
fatcat:aw5afygi5vh7dm3u6odmp4vbr4