Effective foreign word extraction for Korean information retrieval

Byung-Ju Kang, Key-Sun Choi
2002 Information Processing & Management  
So accurate foreign word extraction is crucial for high performance of information retrieval.  ...  Foreign words are usually very important index terms in Korean information retrieval since most of them are technical terms or names.  ...  Acknowledgements This work was supported by the Korea Science and Engineering Foundation (KOSEF) through the Advanced Information Technology Research Center (AITrc).  ... 
doi:10.1016/s0306-4573(00)00065-0 fatcat:3jvkcbxizna5bod6n6utmxnwtm

Korean Compound Noun Term Analysis Based on a Chart Parsing Technique [chapter]

Kyongho Min, William H. Wilson, Yoo-Jin Moon
2003 Lecture Notes in Computer Science  
Systems based on probabilistic and statistical information extracted from a corpus have shown good performance on Korean compound noun analysis.  ...  In this paper, we will describe the analysis of Korean compound noun terms based on a longest substring algorithm and an agenda-based chart parsing technique, with a simple heuristic method to resolve  ...  Kang, S. from Kookmin University, to use the Linux version of his Korean Morphology Analysis, HAM. His application contributes to the implementation of document classification system in this paper.  ... 
doi:10.1007/978-3-540-24581-0_16 fatcat:j2k3bgntebcw3o5n2gcfw5yilu

Compound noun segmentation based on lexical data extracted from corpus

JUNTAE YOON
2001 Natural Language Engineering  
This paper presents an effective method of Korean compound noun segmentation based on lexical data extracted from corpus.  ...  Compound noun analysis is one of the crucial problems in Korean language processing because a series of nouns in Korean may appear without white space in real texts, which makes it difficult to identify  ...  Compound noun segmentation is necessarily required for improving recall and precision in Korean information retrieval, and obtaining  ...  * This work was supported by a KOSEF's postdoctoral fellowship grant.
doi:10.1017/s1351324901002637 fatcat:53xaspm6fnai5b5ss2qyswv25y

Compound noun segmentation based on lexical data extracted from corpus

Juntae Yoon
2000 Proceedings of the sixth conference on Applied natural language processing
This paper presents an effective method of Korean compound noun segmentation based on lexical data extracted from corpus.  ...  Compound noun analysis is one of the crucial problems in Korean language processing because a series of nouns in Korean may appear without white space in real texts, which makes it difficult to identify  ...  Compound noun segmentation is necessarily required for improving recall and precision in Korean information retrieval, and obtaining  ...  * This work was supported by a KOSEF's postdoctoral fellowship grant.
doi:10.3115/974147.974174 dblp:conf/anlp/Yoon00 fatcat:347gfn7fv5gc3gqagmiwlitimq

Two approaches for the resolution of word mismatch problem caused by English words and foreign words in Korean information retrieval

Byung-Ju Kang, Key-Sun Choi
2000 Proceedings of the fifth international workshop on Information retrieval with Asian languages - IRAL '00
The mixed use of English words and their various transliterations may cause severe word mismatch problem in Korean information retrieval.  ...  Our information retrieval experiment results support this argument.  ...  Acknowledgement This work was supported by the Korea Science and Engineering Foundation (KOSEF) through the Advanced Information Technology Research Center (AITrc).  ... 
doi:10.1145/355214.355234 dblp:conf/iral/KangC00 fatcat:tbixddfeona6ndjypkuvboskai

Answer Extraction by Flexible Matching, Filtering, and Interpretation [chapter]

Kyung-Soon Lee, Jae-Ho Kim, Key-Sun Choi
2002 Lecture Notes in Computer Science  
We also describe construction of a Korean question answering test collection.  ...  For flexible text retrieval, we represent terms in a question as five data types, and retrieve passages by matching terms according to these data types.  ...  Acknowledgments This research was conducted while I was a PhD student at KAIST (Korea Advanced Institute of Science and Technology) and supported by the Korean Ministry of Science and Technology.  ... 
doi:10.1007/3-540-45683-x_80 fatcat:pevzz5kn5jhdhbcmecugd7lkey

Using Wikipedia to Translate OOV Term on MLIR

Chen-Yu Su, Tien-Chien Lin, Shih-Hung Wu
2007 NTCIR Conference on Evaluation of Information Access Technologies  
We deal with Chinese, Japanese and Korean multilingual information retrieval (MLIR) in NTCIR-6, and submit our results on the C-CJK-T and C-CJK-D subtask.  ...  In these runs, we adopt Dictionary-Based Approach to translate query terms. In addition to a traditional dictionary, we incorporate the Wikipedia as a live dictionary.  ...  Many information workers collect information from the global resources, which might be in different languages.  ... 
dblp:conf/ntcir/SuLW07 fatcat:b27op6djnngrjmse2hmyrfdory

POSTECH at NTCIR-6: Combining Evidences of Multiple Term Extractions for Mono-lingual and Cross-lingual Retrieval in Korean and Japanese

Seung-Hoon Na, Jungi Kim, Yeha Lee, Jong-Hyeok Lee
2007 NTCIR Conference on Evaluation of Information Access Technologies  
This paper describes our methodologies for NTCIR-6 CLIR involving Korean and Japanese, and reports the official result for Stage 1 and Stage 2.  ...  From official results, our methodology in Korean won the top in 6 subtasks of total 9 subtasks for Stage 2, and won the top in 2 subtasks of total 3 subtasks for Stage 1.  ...  Our morphological analyzer selects content nouns and numerical words by using compound-noun segmentation based on the longest-matching rule [3].  ... 
dblp:conf/ntcir/NaKLL07 fatcat:cgjyhp7na5h65engfpifpndhg4

Cross-Language IR at University of Tsukuba: Automatic Transliteration for Japanese, English, and Korean

Atsushi Fujii, Tetsuya Ishikawa
2004 NTCIR Conference on Evaluation of Information Access Technologies  
This paper describes our cross-language information retrieval system for the NTCIR-4 CLIR task.  ...  We apply our method, which was originally proposed for Japanese Katakana words, to Korean Hangul words and realize JEK transliteration in a single framework.  ...  For <DESC> and <NARR>, we perform morphological analysis and regard a sequence of content words (e.g., nouns) as a compound word.  ... 
dblp:conf/ntcir/FujiiI04 fatcat:6dv2nr7lnjh2pdasqdi7l5fnbq

POSTECH at NTCIR-5: Combining Evidences of Multiple Term Extractions for Mono-lingual and Cross-lingual Retrieval in Korean and Japanese

Seung-Hoon Na, In-Su Kang, Jong-Hyeok Lee
2005 NTCIR Conference on Evaluation of Information Access Technologies  
This paper describes methodologies for NTCIR-5 CLIR involving Korean and Japanese, and reports the official result as well as retrieval results using NTCIR-3 and NTCIR-4 data.  ...  Unlike English, in Asian languages such as Korean and Japanese term extraction is nontrivial because of segmentation ambiguities.  ...  Our morphological analyzer selects content nouns and numerical words by using compound-noun segmentation based on longest-matching rule [3].  ... 
dblp:conf/ntcir/NaKL05 fatcat:v2d372hgpnetzjdjdd2dc5kyo4

Automatic acquisition of named entity tagged corpus from world wide web

Joohui An, Seungwoo Lee, Gary Geunbae Lee
2003 Proceedings of the 41st Annual Meeting on Association for Computational Linguistics - ACL '03  
We use an NE list and a web search engine to collect web documents which contain the NE instances.  ...  In this paper, we present a method that automatically constructs a Named Entity (NE) tagged corpus from the web to be used for learning of Named Entity Recognition systems.  ...  The compound noun, '˚ ¶Áě', can be divided into '˚ ¶(surfing)' and 'Áě(club)' by a compound-noun segmenting method (Yun et al., 1997).  ... 
doi:10.3115/1075178.1075207 dblp:conf/acl/AnLL03 fatcat:g5en5wcw6nd25kk3znm6ewzvum

Cross-lingual Name and Subject Access

Jung-ran Park
2007 Library resources & technical services  
., cataloging, metadata) for cross-lingual information access.  ...  The author examines current mechanisms for cross-lingual name and subject access and identifies major factors that hinder cross-lingual information access.  ...  Drawbacks in word segmentation and transliteration schemes dealing with nonroman languages also call for reexamination of transliteration schemes and for the development of a morpho-syntactic parser for  ... 
doi:10.5860/lrts.51n3.180 fatcat:ecb5aapgx5blvf6s32roasr6vu

Natural Language Processing in Information Retrieval

Thorsten Brants
2003 Computational Linguistics in the Netherlands  
Many Natural Language Processing (NLP) techniques have been used in Information Retrieval. The results are not encouraging.  ...  We review NLP techniques and come to the conclusion that (a) NLP needs to be optimized for IR in order to be effective and (b) document retrieval is not an ideal application for NLP, at least given the  ...  Acknowledgments I would like to thank Hiyan Alshawi, Francine Chen, Ayman Farahat, Alex Franz, Marius Pasca, Jay Ponte, and Amit Singhal for valuable discussions on the topics covered and for help in preparing  ... 
dblp:conf/clin/Brants03 fatcat:vf754qwikrbovdgluw73sydn4m

Applications of multilingual text retrieval

W.B. Croft, J. Broglio, H. Fujii
1996 Proceedings of HICSS-29: 29th Hawaii International Conference on System Sciences  
The Center for Intelligent Information Retrieval (CIIR) at the University of Massachusetts is involved in a variety of industrial, government, and digital library applications which have a need for multilingual  ...  The issues covered by these projects include document representation techniques such as morphology and segmentation, query formulation and expansion techniques, relevance feedback, and comparisons of retrieval  ...  This work was supported in part by the NSF Center for Intelligent Information Retrieval at the University of Massachusetts.  ... 
doi:10.1109/hicss.1996.495303 dblp:conf/hicss/CroftBF96 fatcat:knqz6vs645g7teq6l4jl4kwime

Experiments in Japanese text retrieval and routing using the NEAT system

Gareth J. F. Jones, Tetsuya Sakai, Masahiro Kajiura, Kazuo Sumita
1998 Proceedings of the 21st annual international ACM SIGIR conference on Research and development in information retrieval - SIGIR '98  
Results on the standard BMIR-J1 and BMIR-J2 Japanese retrieval collections indicate that term weighting transfers well to Japanese text.  ...  This paper describes a structured investigation into the retrieval of Japanese text.  ...  A significant source of this segmentation ambiguity is the free generation of new compound nouns in Japanese.  ... 
doi:10.1145/290941.290992 dblp:conf/sigir/JonesSKS98 fatcat:rgvygdng6bfidlirmagy6sllxe