Chinese text retrieval without using a dictionary
1997
Proceedings of the 20th annual international ACM SIGIR conference on Research and development in information retrieval - SIGIR '97
It is generally believed that words, rather than characters, should be the smallest indexing unit for Chinese text retrieval systems, and that it is essential to have a comprehensive Chinese dictionary ...
Chinese text has no delimiters to mark word boundaries. As a result, any text retrieval systems that build word-based indexes need to segment text into words. ...
Acknowledgments A portion of this work was supported by grant NSF IRI-9630765 from the Database and Expert Systems program of the Computer and Information Science and Engineering Directorate of the National ...
doi:10.1145/258525.258532
dblp:conf/sigir/ChenHXGM97
fatcat:6qyd6kwixvcuhfoqvvkxcacdpe
Using the web for automated translation extraction in cross-language information retrieval
2004
Proceedings of the 27th annual international conference on Research and development in information retrieval - SIGIR '04
The method can be applied to both Chinese-English and English-Chinese CLIR, correctly extracting translations of OOV terms from the Web automatically, and thus is a significant improvement on earlier work ...
We use a method that extends earlier work in this area by augmenting this with statistical analysis, and corpus-based translation disambiguation to dynamically discover translations of OOV terms. ...
In our first run we used the given Chinese queries without any of the Chinese equivalents of the English OOV terms (C-C), and used this to compare the performance of the translated English queries without ...
doi:10.1145/1008992.1009022
dblp:conf/sigir/ZhangV04
fatcat:zf7ngcjuibgkvalaxy72pifpcm
On Chinese text retrieval
1996
Proceedings of the 19th annual international ACM SIGIR conference on Research and development in information retrieval - SIGIR '96
Acknowledgment: We would like to thank Chris Buckley who gave us useful hints for the adaptation of SMART to Chinese. ...
This approach has been used in several experimental systems for both Chinese [6] and Japanese text retrieval [8, 17, 18]. ...
Finally, we suggest that Chinese text retrieval should move further to include a thesaurus in order to cope with the rich vocabulary of Chinese. ...
doi:10.1145/243199.243270
dblp:conf/sigir/NieBR96
fatcat:ufhbmb33nvfihcptqestd2bxxq
Research on Lucene-based English-Chinese Cross-Language Information Retrieval
2005
International Journal of Asian Language Processing
On Chinese monolingual retrieval, we investigated the use of different entities as indexes and implement our retrieval system based on the Lucene toolkit. ...
On English-Chinese CLIR, we adopt query translation as the dominant strategy, and utilize English-Chinese bilingual dictionary as the important knowledge resource to acquire correct translations. ...
Dictionary-based approach is a popular word-based approach for text segmentation. In this approach, segmented texts are matched against a dictionary prior to being indexed. ...
dblp:journals/jclc/ZhangZC05
fatcat:362ieayuj5aqhdj5ivxepzrxy4
Automatic thesaurus for enhanced Chinese text retrieval
2000
Library Review
This paper proposes and describes a process for generating an automatic Chinese thesaurus that can be used to provide related terms to a user's queries to enhance retrieval effectiveness. ...
In the absence of existing automatic Chinese thesauri, techniques used in English thesaurus generation have been evaluated and adapted to generate a Chinese equivalent. ...
This segmentation process will be used for all Chinese text processing related fields such as machine translation, natural language processing and information retrieval. ...
doi:10.1108/00242530010331754
fatcat:6ssoqsuyibe5rmfoc6bmaddzbq
Detection and translation of OOV terms prior to query time
2004
Proceedings of the 27th annual international conference on Research and development in information retrieval - SIGIR '04
We have successfully developed new techniques to extract and translate out of vocabulary terms using the Web and add them into a translation dictionary prior to query time. ...
Several new techniques to improve the translation of out of vocabulary terms in English-Chinese cross-language information retrieval have been developed. ...
Our first approach was to collect English text from the Web and exclude all terms that can be found in a translation dictionary. ...
doi:10.1145/1008992.1009102
dblp:conf/sigir/ZhangV04a
fatcat:zxxsptznffhz3mwpgelmjmd77a
Discovering Chinese words from unsegmented text (poster abstract)
1999
Proceedings of the 22nd annual international ACM SIGIR conference on Research and development in information retrieval - SIGIR '99
In this paper, we investigate an efficient algorithm to discover the words and their occurrence probabilities from a corpus of unsegmented text without using a dictionary. ...
Thus, effective information retrieval of Chinese text first requires good word segmentation. ...
In the following sections, we investigate how to discover the words and their probabilities from a corpus of unsegmented text without using a dictionary. ...
doi:10.1145/312624.313472
dblp:conf/sigir/GePS99
fatcat:bffzfr3lezgfpczakpdshmkwba
Chinese Information Retrieval Using Lemur: NTCIR-5 CIR Experiments at UNT
2005
NTCIR Conference on Evaluation of Information Access Technologies
This paper describes our participation in NTCIR-5 Chinese Information Retrieval (IR) evaluation. The main purpose is to evaluate Lemur, a freely available information retrieval toolkit. ...
We also compared manual queries vs. automatic queries for Chinese IR. The results show that manually generated queries did not have much effect on IR performance. ...
We applied dictionary based approach to segment the text using forward maximum matching between a Chinese sentence and the dictionary because it was fast and easy to implement. ...
dblp:conf/ntcir/ChenLL05
fatcat:e5vqkrrjb5d7bhzzfjrn7paxb4
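The forward maximum matching mentioned in the snippet above can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `forward_maximum_match`, the `max_word_len` parameter, and the toy dictionary are all assumptions made for illustration. The idea is to scan left to right and, at each position, take the longest dictionary word, falling back to a single character when nothing matches.

```python
def forward_maximum_match(text, dictionary, max_word_len=4):
    """Greedy left-to-right segmentation (hypothetical helper, not from the paper)."""
    words = []
    i = 0
    while i < len(text):
        # Try the longest candidate first and shrink until the dictionary matches.
        for length in range(min(max_word_len, len(text) - i), 0, -1):
            candidate = text[i:i + length]
            # A single character is always accepted as a fallback.
            if length == 1 or candidate in dictionary:
                words.append(candidate)
                i += length
                break
    return words


# Toy dictionary, assumed for illustration only.
toy_dictionary = {"信息", "检索", "信息检索", "系统"}
print(forward_maximum_match("信息检索系统", toy_dictionary))
```

Because the match is greedy, the longest entry wins even when a different split would be more appropriate, which is the usual trade-off for the speed and simplicity the snippet mentions.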
Combining multiple sources for short query translation in Chinese-English cross-language information retrieval
2000
Proceedings of the fifth international workshop on Information retrieval with Asian languages - IRAL '00
We used two transfer dictionaries and a Chinese search engine to translate short Chinese queries into English. ...
In this paper, we examine various factors that affect the retrieval performance of Chinese-English cross-language retrieval. ...
They used the parallel text to construct a Chinese-English bilingual dictionary that was used to translate queries. The parallel text complements existing bilingual dictionaries. ...
doi:10.1145/355214.355217
dblp:conf/iral/ChenJG00
fatcat:bppqa6jwhvafdeevfcnsaq4bka
Error correction in a Chinese OCR test collection
2002
Proceedings of the 25th annual international ACM SIGIR conference on Research and development in information retrieval - SIGIR '02
This article proposes a technique for correcting Chinese OCR errors to support retrieval of scanned documents. ...
Improved retrieval effectiveness on a single term query experiment is demonstrated. ...
CONCLUSIONS A fully automatic error correction method is proposed for use in Chinese OCR text retrieval. ...
doi:10.1145/564376.564478
dblp:conf/sigir/Tseng02
fatcat:hip6qumb4fgzzewddwffv4yavm
Multilingual Information Retrieval Using English and Chinese Queries
[chapter]
2002
Lecture Notes in Computer Science
The coefficients were determined by fitting training data to the logistic regression model using a statistical software package. We refer readers to reference [3] for more details. ...
This paper describes our retrieval experiments. ... The documents are ranked in decreasing order by their relevance probability with respect to a query. ...
First, we examine all the possible ways to segment a Chinese text into words found in a Chinese dictionary. ...
doi:10.1007/3-540-45691-0_4
fatcat:e5bkqvm3knec3eiceoex45bw5e
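The first step named in the snippet above, examining every way to segment a text into dictionary words, can be sketched as a short recursion. This is only an illustration under assumptions: the function name `all_segmentations` and the toy dictionary are hypothetical, and the snippet does not say how the paper enumerates or later chooses among the candidates.

```python
def all_segmentations(text, dictionary):
    """Return every split of `text` whose pieces all appear in `dictionary`."""
    if not text:
        return [[]]  # one way to segment the empty string: no words
    results = []
    for end in range(1, len(text) + 1):
        prefix = text[:end]
        if prefix in dictionary:
            # Extend each segmentation of the remainder with this prefix.
            for rest in all_segmentations(text[end:], dictionary):
                results.append([prefix] + rest)
    return results


# Toy dictionary, assumed for illustration only.
toy_dictionary = {"信", "息", "信息", "检索", "信息检索"}
for segmentation in all_segmentations("信息检索", toy_dictionary):
    print(segmentation)
```

The number of candidates can grow exponentially with text length, so a practical system would presumably memoize or prune; the sketch leaves that out.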
CMU in Cross-Language Information Retrieval at NTCIR-3
2002
NTCIR Conference on Evaluation of Information Access Technologies
online dictionary. ...
We participated in the Cross-Language Information Retrieval evaluation at NTCIR-3 for the English-Chinese and English-Japanese tasks. ...
of parallel text; by MRD-based we mean to use an online-readable dictionary. ...
dblp:conf/ntcir/YangM02
fatcat:3pgl4cszxndwpkehyme65kqr6i
From Text to Image: Generating Visual Query for Image Retrieval
[chapter]
2005
Lecture Notes in Computer Science
The retrieval results using textual and visual queries are combined to generate the final ranked list. We conducted English monolingual and Chinese-English cross-language retrieval experiments. ...
The relationships between text and images are modeled. Visual queries are constructed from textual queries using the relationships. ...
The bilingual dictionary is integrated from four resources, including the LDC Chinese-English dictionary, Denisowski's CEDICT, BDC Chinese-English dictionary v2.2 and a dictionary used in query translation ...
doi:10.1007/11519645_65
fatcat:vigdrkpeufdubbegcufud4kzcq
Trans-EZ at NTCIR-2 : Synset Co-occurrence Method for English-Chinese Cross-Lingual Information Retrieval
2001
NTCIR Conference on Evaluation of Information Access Technologies
In this paper, a new method for English-Chinese cross-lingual information retrieval is proposed and evaluated in NTCIR-II project. ...
An English-Chinese WordNet and a synset co-occurrence model are adopted to solve the problem of word sense ambiguity. ...
The resources that we use are a bilingual dictionary, an English-Chinese WordNet, and a target language corpus. ...
dblp:conf/ntcir/BianL01
fatcat:itzz3axyjvgzzlp624z2eldigu
Search Between Chinese and Japanese Text Collections
2007
NTCIR Conference on Evaluation of Information Access Technologies
We also utilized Machine Translation (MT) software between Japanese and Chinese, with English as a pivot language. ...
While Chinese search without translation against Japanese documents performed credibly well for title only runs, the reverse (Japanese topic search of Chinese documents without translation) was poor. ...
We have again found that when a Japanese version of an NTCIR topic consists of primarily Kanji text, then use of the Chinese topic directly (after character code conversion) against Japanese documents ...
dblp:conf/ntcir/Gey07
fatcat:gh4bixhhznggtjqp7wfixyfkqy