Effective foreign word extraction for Korean information retrieval

Byung-Ju Kang, Key-Sun Choi
2002 Information Processing & Management  
So accurate foreign word extraction is crucial for high performance of information retrieval.  ...  Foreign words are usually very important index terms in Korean information retrieval since most of them are technical terms or names.  ...  Acknowledgements This work was supported by the Korea Science and Engineering Foundation (KOSEF) through the Advanced Information Technology Research Center (AITrc).  ... 
doi:10.1016/s0306-4573(00)00065-0 fatcat:3jvkcbxizna5bod6n6utmxnwtm

Two approaches for the resolution of word mismatch problem caused by English words and foreign words in Korean information retrieval

Byung-Ju Kang, Key-Sun Choi
2000 Proceedings of the fifth international workshop on Information retrieval with Asian languages - IRAL '00
The mixed use of English words and their various transliterations may cause severe word mismatch problem in Korean information retrieval.  ...  Our information retrieval experiment results support this argument.  ...  Acknowledgement This work was supported by the Korea Science and Engineering Foundation (KOSEF) through the Advanced Information Technology Research Center (AITrc).  ... 
doi:10.1145/355214.355234 dblp:conf/iral/KangC00 fatcat:tbixddfeona6ndjypkuvboskai

Korean Compound Noun Term Analysis Based on a Chart Parsing Technique [chapter]

Kyongho Min, William H. Wilson, Yoo-Jin Moon
2003 Lecture Notes in Computer Science  
Systems based on probabilistic and statistical information extracted from a corpus have shown good performance on Korean compound noun analysis.  ...  Unlike compound noun terms in English and French, where words are separated by white space, Korean compound noun terms are not separated by white space.  ...  Kang, S. from Kookmin University, to use the Linux version of his Korean Morphology Analysis, HAM. His application contributes to the implementation of document classification system in this paper.  ... 
doi:10.1007/978-3-540-24581-0_16 fatcat:j2k3bgntebcw3o5n2gcfw5yilu

Cross-Language IR at University of Tsukuba: Automatic Transliteration for Japanese, English, and Korean

Atsushi Fujii, Tetsuya Ishikawa
2004 NTCIR Conference on Evaluation of Information Access Technologies  
This paper describes our cross-language information retrieval system for the NTCIR-4 CLIR task.  ...  Transliteration is effective if a query includes foreign words, such as technical terms and proper nouns, spelled out by phonetic alphabets.  ...  We applied this method to extracting foreign words from Korean text [8].  ...
dblp:conf/ntcir/FujiiI04 fatcat:6dv2nr7lnjh2pdasqdi7l5fnbq

IASL System for NTCIR-6 Korean-Chinese Cross-Language Information Retrieval

Yu-Chun Wang, Cheng-Wei Lee, Richard Tzong-Han Tsai, Wen-Lian Hsu
2007 NTCIR Conference on Evaluation of Information Access Technologies  
This paper describes our Korean-Chinese cross-language information retrieval system for NTCIR-6. Our system uses a bilingual dictionary to perform query translation.  ...  We expand our bilingual dictionary by extracting words and their translations from the Wikipedia site, an online encyclopedia.  ...  For the descriptive part of a Korean query, we use the KLT Term Extractor [1], developed by Kookmin University in Korea, to extract vital key words and remove stop words.  ...
dblp:conf/ntcir/Wang0TH07 fatcat:r3sktglgzbcqbhpapwb4ahypxy

POSTECH at NTCIR-5: Combining Evidences of Multiple Term Extractions for Mono-lingual and Cross-lingual Retrieval in Korean and Japanese

Seung-Hoon Na, In-Su Kang, Jong-Hyeok Lee
2005 NTCIR Conference on Evaluation of Information Access Technologies  
This paper describes methodologies for NTCIR-5 CLIR involving Korean and Japanese, and reports the official result as well as retrieval results using NTCIR-3 and NTCIR-4 data.  ...  Unlike English, in Asian languages such as Korean and Japanese term extraction is nontrivial because of segmentation ambiguities.  ...  The size of dictionary is about 230,000 nouns, and its entries contain most Korean words and modern foreign words.  ...
dblp:conf/ntcir/NaKL05 fatcat:v2d372hgpnetzjdjdd2dc5kyo4

Concept unification of terms in different languages for IR

Qing Li, Sung-Hyon Myaeng, Yun Jin, Bo-yeong Kang
2006 Proceedings of the 21st International Conference on Computational Linguistics and the 44th annual meeting of the ACL - ACL '06  
Due to historical and cultural reasons, English phrases, especially proper nouns and new words, frequently appear in Web pages written primarily in Asian languages such as Chinese and Korean.  ...  Although these English terms and their equivalents in the Asian languages refer to the same concept, they are erroneously treated as independent index units in traditional Information Retrieval (IR).  ...  In (Jeong et al., 1999), it extracts the Korean foreign words for concept unification based on statistical information.  ...
doi:10.3115/1220175.1220256 dblp:conf/acl/LiMJK06 fatcat:2wrh7rny6zdnhgllsafifw5t4y

Simple Query Translation Methods for Korean-English and Korean-Chinese CLIR in NTCIR Experiments

Myung-Gil Jang, Pyung Kim, Yun Jin, Sukhyun Cho, Sung-Hyon Myaeng
2002 NTCIR Conference on Evaluation of Information Access Technologies  
The main goal of our participation in the NTCIR Workshop is to evaluate relatively simple yet practical methods for CLIR using Korean queries for English and Chinese documents.  ...  The Korean-English CLIR was quite successful, but the Korean-Chinese CLIR resulted in unexpectedly low performance.  ...  However, the method also allows for incorrect words to be used for retrieval, degrading retrieval effectiveness.  ... 
dblp:conf/ntcir/JangKJCM02 fatcat:j7rzc44xv5bmde5ghiy37a7pti

Translation of Unknown Terms Via Web Mining for Information Retrieval [chapter]

Qing Li, Sung Hyon Myaeng, Yun Jin, Bo-Yeong Kang
2006 Lecture Notes in Computer Science  
This paper describes the degree to which these problems arise in Korean Information Retrieval and suggests a novel approach to solve it.  ...  For CLIR, one of the major hindrances to achieving retrieval performance at the level of monolingual information retrieval is the translation of terms in queries, which are not found in a bilingual dictionary  ...  As witnessed by previous research as well as in our experiments, translating the OOV words in the query is necessary for an effective cross-lingual information retrieval.  ... 
doi:10.1007/11880592_20 fatcat:c3f7iurctzgo7j3r3y76idl57i

POSTECH at NTCIR-6: Combining Evidences of Multiple Term Extractions for Mono-lingual and Cross-lingual Retrieval in Korean and Japanese

Seung-Hoon Na, Jungi Kim, Yeha Lee, Jong-Hyeok Lee
2007 NTCIR Conference on Evaluation of Information Access Technologies  
This paper describes our methodologies for NTCIR-6 CLIR involving Korean and Japanese, and reports the official result for Stage 1 and Stage 2.  ...  From official results, our methodology in Korean won the top in 6 subtasks of total 9 subtasks for Stage 2, and won the top in 2 subtasks of total 3 subtasks for Stage 1.  ...  The size of dictionary is about 230,000 nouns, and its entries contain most of the Korean words and modern foreign words.  ...
dblp:conf/ntcir/NaKLL07 fatcat:cgjyhp7na5h65engfpifpndhg4

Using mutual information to resolve query translation ambiguities and query term weighting

Myung-Gil Jang, Sung Hyon Myaeng, Se Young Park
1999 Proceedings of the 37th annual meeting of the Association for Computational Linguistics on Computational Linguistics -  
This paper describes the degree to which this problem arises in Korean-English cross-language IR and suggests a relatively simple yet effective method for disambiguation using mutual information statistics  ...  An easy way of translating queries in one language to the other for cross-language information retrieval (IR) is to use a simple bilingual dictionary.  ...  Since the bilingual dictionary lacks some words that are essential for a correct interpretation of the Korean query, it is important to identify unknown words such as foreign words and transliterate them  ... 
doi:10.3115/1034678.1034718 dblp:conf/acl/JangMP99 fatcat:y74k3365pjas3dl73r2stoupyy

A Hybrid Model for Extracting Transliteration Equivalents from Parallel Corpora [chapter]

Jong-Hoon Oh, Key-Sun Choi, Hitoshi Isahara
2006 Lecture Notes in Computer Science  
Experiments showed that our hybrid model was more effective than each individual transliteration pair acquisition model alone.  ...  In this paper, we concentrate on a framework for combining several models for transliteration pair acquisition.  ...  The mixed use of various transliterations and their origin English word causes severe word mismatch problems in IR (information retrieval) [1].  ...
doi:10.1007/11846406_15 fatcat:zprcxxb45vfh5bvxmfap5os3wa

Applying Multiple Characteristics and Techniques in the NICT Information Retrieval System at NTCIR-6

Masaki Murata, Jong-Hoon Oh, Qing Ma, Hitoshi Isahara
2007 NTCIR Conference on Evaluation of Information Access Technologies  
It can be very useful for foreign languages for which we cannot determine stop words. We also use web-based unknown word translation for bilingual information retrieval.  ...  We participated in two monolingual information retrieval tasks (Korean and Japanese) and five bilingual information retrieval tasks (Chinese-Japanese, English-Japanese, Japanese-Korean, Korean-Japanese  ...  Dosam Hwang for information on the Korean morphological analyzer.  ... 
dblp:conf/ntcir/MurataOMI07 fatcat:i3mx247wrnfwhkbbqa3vcfkkvy

Applying Multiple Characteristics and Techniques in the NICT Information Retrieval System in NTCIR-5

Masaki Murata, Qing Ma, Hitoshi Isahara
2005 NTCIR Conference on Evaluation of Information Access Technologies  
It can be very useful for foreign languages for which we cannot determine stop words.  ...  In particular, we obtained the best precision in the Korean title-based monolingual information retrieval and the Japanese-English bilingual information retrieval.  ...  Satoshi Sekine for developing the OAK system that we used to obtain the stems of words in English sentences. We also thank Prof. Dosam Hwang for information on the Korean morphological analyzer.  ... 
dblp:conf/ntcir/MurataMI05 fatcat:6qyxbuxlpbfkrmlnkrdilvoo44

Applying Multiple Characteristics and Techniques to Obtain High Levels of Performance in Information Retrieval at NTCIR-4

Masaki Murata, Qing Ma, Hitoshi Isahara
2004 NTCIR Conference on Evaluation of Information Access Technologies  
It can be very useful for foreign languages for which we cannot examine stop words. We participated in three tasks (Korean, Japanese, and English) of monolingual information retrieval at NTCIR 4.  ...  In particular, we obtained the best precision in Korean description-based monolingual information retrieval.  ...  Satoshi Sekine for developing the OAK system which we used to obtain the stems of words in English sentences. We thank Prof. Dosam Hwang for the information on the Korean morphological analyzer.  ... 
dblp:conf/ntcir/MurataMI04 fatcat:h5nrsla2w5arrh3bn2f2j2klje