A Statistical Model for Automatic Extraction of Korean Transliterated Foreign Words
2003
International Journal of Computer Processing Of Languages
In this paper, we will describe a Korean transliterated foreign word extraction algorithm. ...
Syllable sequences of Korean strings are modelled by Hidden Markov Model whose state represents a character with binary marking to indicate whether the syllable is part of a transliterated foreign word ...
And this work was partially supported by the Ministry of Science and Technology through the "Knowledge base prototype construction and its application for human knowledge processing modelling" (M1-0107 ...
doi:10.1142/s021942790300084x
fatcat:5gnyrkzrrjcd3hfjxh3ufji7oe
Two approaches for the resolution of word mismatch problem caused by English words and foreign words in Korean information retrieval
2000
Proceedings of the fifth international workshop on Information retrieval with Asian languages - IRAL '00
To make matters worse the Korean transliterations of an English word may be very various. ...
The mixed use of English words and their various transliterations may cause severe word mismatch problem in Korean information retrieval. ...
We developed a new effective method of foreign word extraction through word segmentation [4]. ...
doi:10.1145/355214.355234
dblp:conf/iral/KangC00
fatcat:tbixddfeona6ndjypkuvboskai
Japanese term extraction using dictionary hierarchy and machine translation system
2001
Terminology
There have been many studies of automatic term recognition (ATR) and they have achieved good results. However, they focus on a mono-lingual term extraction method. ...
This article describes an automatic term extraction method from documents in foreign languages using a machine translation system. ...
The main assumption for extracting a foreign word is that the composition of foreign words is different from that of pure Korean words, since the Korean phonetic system is different from that of the foreign ...
doi:10.1075/term.6.2.09oh
fatcat:xkr3fnu3g5ai7pmog2cthsemxy
Survey on Machine Transliteration and Machine Learning Models
2015
International Journal on Natural Language Computing
This paper provides the thorough survey on machine transliteration models and machine learning approaches used for machine transliteration over the period of more than two decades for internationally used ...
Survey shows that linguistic approach provides better results for the closely related languages and probability based statistical approaches are good when one of the languages is phonetic and other is ...
In the transliteration approach foreign words and English words were extracted and then English words were transliterated into Korean phonetic equivalents. ...
doi:10.5121/ijnlc.2015.4202
fatcat:kegqa5k4abahvbno2setnkxwtq
Cross-Language IR at University of Tsukuba: Automatic Transliteration for Japanese, English, and Korean
2004
NTCIR Conference on Evaluation of Information Access Technologies
We apply our method, which was originally proposed for Japanese Katakana words, to Korean Hangul words and realize JEK transliteration in a single framework. ...
We produced a transliteration dictionary for Japanese and English letters via the Roman representation. To produce a new dictionary, we use the Unicode system to romanize Korean words. ...
[10] proposed a statistical method to detect foreign words in Korean. However, their method requires a training corpus in which conventional and foreign words are annotated. ...
dblp:conf/ntcir/FujiiI04
fatcat:6dv2nr7lnjh2pdasqdi7l5fnbq
Term recognition using technical dictionary hierarchy
2000
Proceedings of the 38th Annual Meeting on Association for Computational Linguistics - ACL '00
For example, domain dictionaries can improve the performance in ATR. This paper focuses on a method for extracting terms using a dictionary hierarchy. ...
In recent years, statistical approaches on ATR (Automatic Term Recognition) have achieved good results. However, there are scopes to improve the performance in extracting terms still further. ...
Many fundamental researches are supported by the fund of Ministry of Science and Technology under a project of plan STEP2000. ...
doi:10.3115/1075218.1075281
dblp:conf/acl/OhLC00
fatcat:shq4yatqsvganayoazgzditl6i
Effective foreign word extraction for Korean information retrieval
2002
Information Processing & Management
In Korean text, foreign words, which are mostly transliterations of English words, are frequently used. ...
So accurate foreign word extraction is crucial for high performance of information retrieval. ...
We also showed that the impact of accurate foreign word extraction on Korean information retrieval performance is great. Fig. 1. The HMM model for Korean eojeol. ...
doi:10.1016/s0306-4573(00)00065-0
fatcat:3jvkcbxizna5bod6n6utmxnwtm
Machine transliteration and transliterated text retrieval: a survey
2018
Sadhana (Bangalore)
With the advent of Web 2.0, user-generated content is increasing on the Web at a very rapid rate. A substantial proportion of this content is transliterated data. ...
To leverage this huge information repository, there is a matching effort to process transliterated text. In this article, we survey the recent body of work in the field of transliteration. ...
Overall, the recall rate is low for the foreign words of Korean and Japanese origins. ...
doi:10.1007/s12046-018-0828-8
fatcat:dg3gwugmqrfevnzu3deuk5w67i
Selection of Korean Proper Translation Words Using Bi-Gram-Based Histograms
2007
Data Science Journal
This paper describes a proper translation-selecting and translation-clustering algorithm for Korean translation of words automatically extracted from newspapers. ...
As about 80% of the English words in Korean newspapers appear in abbreviated form, it is necessary to make clusters of translation words to construct easily bilingual knowledge bases such as dictionaries ...
Unfortunately, there is no study of the issues for the Korean newspaper corpus. No one has previously tried to extract a set of Korean translations for an English word in a real newspaper. ...
doi:10.2481/dsj.6.s125
fatcat:3sznxs4ccva4xn3wzayylymw64
Multilingual Named Entity Recognition using Parallel Data and Metadata from Wikipedia
2012
Annual Meeting of the Association for Computational Linguistics
The combination is achieved using a novel semi-CRF model for foreign sentence tagging in the context of a parallel English sentence. ...
In this paper we propose a method to automatically label multi-lingual data with named entity tags. ...
The transliteration model can return an n-best list of transliterations of a foreign string, together with scores. ...
dblp:conf/acl/KimTY12
fatcat:wkzndohprzbqxksgfynboyu74u
A Weighted Finite-State Transducer Implementation of Phoneme Rewrite Rules for English to Korean Pronunciation Conversion
2011
Procedia - Social and Behavioral Sciences
This paper describes a method for developing a finite-state model that predicts how English words and named entities are pronounced in Korean. ...
A formal model that properly captures this change has theoretical implications in phonology and practical applications in speech processing and machine transliteration. ...
Foreign words form a major class of out-of-vocabulary words and pose problems for text-to-speech synthesis and automatic speech recognition. ...
doi:10.1016/j.sbspro.2011.10.599
fatcat:7e3lnfo5ive6zoeje7qfc6n6jq
A phonetic similarity model for automatic extraction of transliteration pairs
2007
ACM Transactions on Asian Language Information Processing
transliteration in the k-neighborhood of a recognized English word. ...
This article proposes an approach for the automatic extraction of transliteration pairs from Chinese Web corpora. ...
We also thank Yu Chen at the Institute for Infocomm Research, Singapore, for her efforts in improving the manuscript; Wen-Hsiang Lu at the National Cheng-Kung University for providing hyperlink and Web ...
doi:10.1145/1282080.1282081
fatcat:cabttqaf6vd6la4xfh46pxtbcu
Transliteration Generation and Mining with Limited Training Resources
2010
Named Entity Workshop
We also explore a number of diverse resource-free and language-independent approaches to transliteration mining, which range from simple to sophisticated. ...
We present DIRECTL+: an online discriminative sequence prediction model based on many-to-many alignments, which is further augmented by the incorporation of joint n-gram features. ...
Acknowledgments This research was supported by the Alberta Ingenuity Fund, Informatics Circle of Research Excellence (iCORE), and the Natural Sciences and Engineering Research Council of Canada (NSERC) ...
dblp:conf/aclnews/JiampojamarnDBB10
fatcat:g4mnuqlukve2vklcpo4wigbtym
How to Translate Dialects: A Segmentation-Centric Pivot Translation Approach
2013
Journal of Natural Language Processing
This paper proposes a new method to translate a dialect language into a foreign language by integrating transliteration approaches based on Bayesian alignment (BA) models with pivot-based SMT approaches ...
and a standard language automatically, (2) it avoids segmentation mismatches between the input and the translation model by mapping the character sequences of the dialect language to the word segmentation ...
using transliteration pairs, i.e., the most likely sequence of source characters and target words according to a joint language model built from the alignment of Bayesian model. ...
doi:10.5715/jnlp.20.563
fatcat:vmy5rgifxnajxnxtytqbezrugq
Transliteration of proper names in cross-lingual information retrieval
2003
Proceedings of the ACL 2003 workshop on Multilingual and mixed-language named entity recognition
We demonstrate the application of statistical machine translation techniques to "translate" the phonemic representation of an English name, obtained by using an automatic text-to-speech system, to a sequence ...
... is estimated from a paired corpus of foreign-language sentences and their English translations, and the language model is trained from English text. ...
Since we seek Chinese names which are transliteration of a given English name, the notion of words in a sentence in the IBM model above is replaced with phonemes in a word. ...
doi:10.3115/1119384.1119392
dblp:conf/acl/VirgaK03
fatcat:jar523futrauvl6jlgg7ulkfte