Translation of Unknown Terms Via Web Mining for Information Retrieval [chapter]

Qing Li, Sung Hyon Myaeng, Yun Jin, Bo-Yeong Kang
2006 Lecture Notes in Computer Science  
For CLIR, one of the major hindrances to achieving retrieval performance at the level of monolingual information retrieval is the translation of terms in queries, which are not found in a bilingual dictionary  ...  This paper describes the degree to which these problems arise in Korean Information Retrieval and suggests a novel approach to solve it.  ...  Introduction For cross-language information retrieval (CLIR), one of the major hindrances to achieving retrieval performance at the level of monolingual information retrieval is the translation of terms  ... 
doi:10.1007/11880592_20 fatcat:c3f7iurctzgo7j3r3y76idl57i

Anchor text mining for translation of Web queries

Wen-Hsiang Lu, Lee-Feng Chien, Hsi-Jian Lee
2004 ACM Transactions on Information Systems  
A series of experiments has been conducted, including performance tests on term translation extraction, cross-language information retrieval, and translation suggestions for practical Web search services  ...  The translation equivalents of a query term can be extracted via its translation in an intermediate language.  ...  Mark Sanderson and the anonymous reviewers for their valuable comments and suggestions. Many thanks are given to Mr.  ... 
doi:10.1145/984321.984324 fatcat:75mnaq3qmza6vdduluhn3yhm5m

Study on Unknown Term Translation Mining from Google Snippets

Bin Li, Jianmin Yao
2019 Information  
Bilingual web pages are widely used to mine translations of unknown terms.  ...  The experimental results revealed that the proposed method performed remarkably well for mining translations of unknown terms.  ...  Acknowledgments: The authors express their gratitude to the anonymous reviewers for their carefulness and patience. Conflicts of Interest: The authors declare no conflict of interest.  ... 
doi:10.3390/info10090267 fatcat:fpxv7nsnazbe7exxcsqyij6yiy

Translating unknown queries with web corpora for cross-language information retrieval

Pu-Jen Cheng, Jei-Wen Teng, Ruei-Cheng Chen, Jenq-Haur Wang, Wen-Hsiang Lu, Lee-Feng Chien
2004 Proceedings of the 27th annual international conference on Research and development in information retrieval - SIGIR '04  
We propose an online translation approach to determine effective translations for unknown query terms via mining of bilingual search-result pages obtained from Web search engines.  ...  It is crucial for cross-language information retrieval (CLIR) systems to deal with the translation of unknown queries due to that real queries might be short.  ...  We thank Sukil Kim M.D. and Shih-Jui Lin for their support of this work in examining Japanese and Korean translations.  ... 
doi:10.1145/1008992.1009020 dblp:conf/sigir/ChengTCWLC04 fatcat:4orhlrrxpjatzogacrtroubibi

Introduction to the special topic section on mining Web resources for enhancing information retrieval

Wai Lam, Christopher C. Yang, Filippo Menczer
2007 Journal of the American Society for Information Science and Technology  
It is also common for Web sites to contain information in different languages since many countries adopt more than one language.  ...  Web has been expanding at an enormous pace. There are a variety of Web documents in different genres, such as news, reports, reviews.  ...  For example, one can employ a mining method to harvest terms from bilingual Web sites to discover candidates for the task of term translation in cross-lingual information retrieval.  ... 
doi:10.1002/asi.20626 fatcat:hnnx7uowarampmyo7kjllnpysy

Towards Web Mining of Query Translations for Cross-Language Information Retrieval in Digital Libraries [chapter]

Wen-Hsiang Lu, Jenq-Haur Wang, Lee-Feng Chien
2003 Lecture Notes in Computer Science  
For out-of-dictionary terms, the Web is searched for the most promising translations which are suggested to users, whose click information on selecting suggested translations is then collected and employed  ...  To utilize rich Web resources for query translation, several Web mining methods have been developed to effectively exploit two kinds of Web resources: anchor texts and search results, where several term  ...  Term Translation Anchor-Text Mining Search-Result Mining Web Mining Search Log Anchor-Text-Based Method An anchor text is the brief description of an out-link in a Web page.  ... 
doi:10.1007/978-3-540-24594-0_8 fatcat:zdak4nwubndkzksnk2mt5xmmxi

Improving translation accuracy in web-based translation extraction

Chengye Lu, Yue Xu, Shlomo Geva
2007 NTCIR Conference on Evaluation of Information Access Technologies  
In this paper, we present some approaches to improve translation accuracy in web-based translation extraction.  ...  We proposed some approaches that can improve the translation accuracy in web-based translation extraction which relies on a small dynamic corpus.  ...  Our experiments show that this approach is effective in web mining for translation extraction of unknown query terms.  ... 
dblp:conf/ntcir/LuXG07 fatcat:tdt54bd3mrcfnh6iavlrxqrjnu

Translating unknown cross-lingual queries in digital libraries using a web-based approach

Jenq-Haur Wang, Jei-Wen Teng, Pu-Jen Cheng, Wen-Hsiang Lu, Lee-Feng Chien
2004 Proceedings of the 2004 joint ACM/IEEE conference on Digital libraries - JCDL '04  
In this paper, we investigate the feasibility of exploiting the Web as the corpus source to translate unknown query terms for cross-language information retrieval (CLIR) in digital libraries.  ...  We propose a Web-based term translation approach to determine effective translations for unknown query terms by mining bilingual search-result pages obtained from a real Web search engine.  ...  CONCLUSION In this paper, we have introduced a Web-based approach for dealing with the translation of unknown query terms for cross-language information retrieval in digital libraries.  ... 
doi:10.1145/996350.996378 dblp:conf/jcdl/WangTCLC04 fatcat:4z7xxoqpxzgzncncyl2ubuvbee

Development and Application of a Chinese Webpage Suicide Information Mining System (Sims)

Penglai Chen, Jing Chai, Lu Zhang, Debin Wang
2014 Journal of Medical Systems  
Objectives: This study aims at designing and piloting a convenient Chinese webpage suicide information mining system (SIMS) to help search and filter required data from the internet and discover potential  ...  Data collection provides a user-friendly interface for retrieving suicide-related news and blogs from Chinese webpages and downloading them into SIMS database.  ...  Acknowledgment This paper was co-funded by the Natural Science Foundation of China (grant number 81172201) and Anhui Provincial Fund for Elite Youth (grant number 2011SQRL060).  ... 
doi:10.1007/s10916-014-0088-z pmid:25265902 fatcat:ns3m32iypbdunjit6aaxsxedwi

Embedding Web-based Statistical Translation Models in Cross-Language Information Retrieval [article]

Wessel Kraaij, Jian-Yun Nie, Michel Simard
2003 arXiv   pre-print
Cross-language information retrieval (CLIR) is an application which needs translation functionality of a relatively low level of sophistication since current models for information retrieval (IR) are still  ...  In this paper, we will investigate the problem of automatically mining parallel texts from the Web and different ways of integrating the translation models within the retrieval process.  ...  Part of this work was carried out while the first author was visiting the RALI laboratory at Université de Montréal.  ... 
arXiv:cs/0312008v1 fatcat:hztoxce3frcgpbsmegftpg4rdu

Synonymous Chinese Transliterations Retrieval from World Wide Web by Using Association Words [chapter]

Chung-Chian Hsu, Chien-Hsing Chen
2008 Lecture Notes in Computer Science  
We present a framework for mining synonymous transliterations from a set of Web pages collected via a search engine.  ...  An integrated statistical measure is proposed to form search keywords for a search engine in order to retrieve relevant Web snippets.  ...  In practice, we first download a fixed number of Web snippets D for a transliteration c i via a search engine.  ... 
doi:10.1007/978-3-540-69384-0_96 fatcat:sx2tyowcqfe7zkkd5zh67fkdwe

Embedding Web-Based Statistical Translation Models in Cross-Language Information Retrieval

Wessel Kraaij, Jian-Yun Nie, Michel Simard
2003 Computational Linguistics  
Cross-language information retrieval (CLIR) is an application that needs translation functionality of a relatively low level of sophistication, since current models for information retrieval (IR) are still  ...  In this article, we will investigate the problem of automatically mining parallel texts from the Web and different ways of integrating the translation models within the retrieval process.  ...  Special thanks are due to Jiang Chen, who contributed to the building of PTMiner.  ... 
doi:10.1162/089120103322711587 fatcat:dkxidh7b3vdszodokvwhjd4nre

Improving Translation of Queries with Infrequent Unknown Abbreviations and Proper Names

Wen-Hsiang Lu, Jiun-Hung Lin, Yao-Sheng Chang
2008 International Journal of Computational Linguistics and Chinese Language Processing  
Recently, a few researchers have proposed several effective search-result-based term translation extraction methods which explore search results to discover translations of frequent unknown terms from  ...  Unknown term translation is important to CLIR and MT systems, but it is still an unsolved problem.  ...  On the other hand, Lu et al. [2002] made the first attempt of mining unknown term translations from Web anchor texts.  ... 
dblp:journals/ijclclp/LuLC08 fatcat:6fkwbqlr5jedhidkms2tc3bnva

Knowledge-Based Query Expansion over a Medical Terminology Oriented Ontology on the Web [chapter]

Linda Fatima Soualmia, Catherine Barry, Stefan J. Darmoni
2003 Lecture Notes in Computer Science  
This paper deals with the problem of information retrieval on the Web and presents the CISMeF project (acronym of Catalogue and Index of French-speaking Medical Sites).  ...  Information retrieval in the CISMeF catalogue is done with a terminology that is similar to ontology of medical domain and a set of metadata.  ...  Metadata describe Web information resources enhancing information retrieval.  ... 
doi:10.1007/978-3-540-39907-0_29 fatcat:tjl7b27lpbhknjyripld3hpcle

Chinese OOV translation and post-translation query expansion in Chinese-English cross-lingual information retrieval

Ying Zhang, Phil Vines, Justin Zobel
2005 ACM Transactions on Asian Language Information Processing  
A major difficulty for cross-lingual information retrieval is the detection and translation of out-of-vocabulary (OOV) terms; for OOV terms in Chinese, another difficulty is segmentation.  ...  We have developed a new segmentation-free technique for automatic translation of Chinese OOV terms using the web.  ...  In previous work we found that by mining the web to collect OOV terms and then using the web to search for translations, we were able to translate 61% of terms correctly and 31% of terms approximately  ... 
doi:10.1145/1105696.1105697 fatcat:doae4glz75eyzawt56qnu6yit4