Machine transliteration using target-language grapheme and phoneme

Jong-Hoon Oh, Kiyotaka Uchimoto, Kentaro Torisawa
2009 Proceedings of the 2009 Named Entities Workshop: Shared Task on Transliteration - NEWS '09   unpublished
Then, the outputs from these transliteration engines were combined using re-ranking functions. Our method was applied to all language pairs in "NEWS 2009 Machine Transliteration Shared Task."  ...  We built multiple transliteration engines based on different combinations of two transliteration models and three machine learning algorithms.  ...  . graphemes and phonemes, and our multi-engine transliteration approach are effective, regardless of the nature of the language pairs.  ... 
doi:10.3115/1699705.1699714 fatcat:qaxtewae65affghsz3wd5rytai

Machine transliteration and transliterated text retrieval: a survey

Dinesh Kumar Prabhakar, Sukomal Pal
2018 Sadhana (Bangalore)  
We start with a definition and discussion of the different types of transliteration followed by various deterministic and non-deterministic approaches used to tackle transliteration-related issues in machine  ...  A large proportion of these non-English speakers access the Internet in their native languages but use the Roman script to express themselves through various communication channels like messages and posts  ...  The model first maps source graphemes to source phonemes using a standard pronunciation dictionary and then using both source graphemes and phonemes, target graphemes are generated.  ... 
doi:10.1007/s12046-018-0828-8 fatcat:dg3gwugmqrfevnzu3deuk5w67i

Joint Generation of Transliterations from Multiple Representations

Lei Yao, Grzegorz Kondrak
2015 Proceedings of the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies  
We further generalize this model to include transliterations from other languages, and enhance it with reranking and lexicon features.  ...  Machine transliteration is often referred to as phonetic translation.  ...  This research was supported by the Natural Sciences and Engineering Research Council of Canada (NSERC).  ... 
doi:10.3115/v1/n15-1095 dblp:conf/naacl/YaoK15 fatcat:kifnf6tgqfdgvmlnmfnrjirbba

GRT: Gurmukhi to Roman Transliteration System using Character Mapping and Handcrafted Rules

Shailendra Kumar Singh, Manoj Kumar Sachan
This system uses the handcrafted-rules and character mapping (CM) approach for transliteration between languages involved. The CM is done for Gurmukhi script with its equivalent to Roman script.  ...  It transliterates text written in Punjabi language into English language. It is tested on 65,130 Punjabi words and achieved accuracy of 99.27%, which is better than other state-of-art system results.  ...  target language phoneme(s), target language phoneme(s) to target language graphemes is done.  ... 
doi:10.35940/ijitee.i8636.078919 fatcat:vvfwcotalvephflja45q3nv324

Identification of language in a cross linguistic environment

Merin Thomas, Dr Latha C A, Antony Puthussery
2020 Indonesian Journal of Electrical Engineering and Computer Science  
An algorithmic approach of Stop words based model is depicted in this paper. Model can be also enhanced to address all the Indian Languages that are in use.  ...  In Indian scenario current generations are familiar to talk in native language but not to read and write in the native language, hence they started using English representation of native language in textual  ...  They are based on grapheme, Phoneme and Hybrid Approaches. In grapheme approach, it directly transforms grapheme from source to target. In Phoneme model the key is pronunciation of source language.  ... 
doi:10.11591/ijeecs.v18.i1.pp544-548 fatcat:qb3zveb24jfxznib4loc7ilipi

Hindi And Marathi to English Machine Transliteration using SVM

P H Rathod, M L Dhore, R M Dhore
2013 International Journal on Natural Language Computing  
Proposed approach uses phonetic of the source language and n-gram as two features for transliteration.  ...  In this paper, we have proposed the named entity transliteration for Hindi to English and Marathi to English language pairs using Support Vector Machine (SVM).  ...  RELATED WORK Existing approaches for machine transliterations are Grapheme-based and Phoneme-based.  ... 
doi:10.5121/ijnlc.2013.2404 fatcat:btb73kiilfbftnfvsxze2zxxzm

Hindi to English Machine Transliteration of Named Entities using Conditional Random Fields

Manikrao L Dhore, Shantanu K Dixit, Tushar D Sonwalkar
2012 International Journal of Computer Applications  
Machine transliteration has received significant research attention in recent years. In most cases, the source language has been English and the target language is an Asian language.  ...  This system takes Indian place name as an input in Hindi language using Devanagari script and transliterates it into English.  ...  divergences conflation, The grapheme-based and phoneme-based models are used for the machine transliteration.  ... 
doi:10.5120/7522-0624 fatcat:zgovvjg3m5ah7kqmz5zwwo6u74

Combining probability models and web mining models: a framework for proper name transliteration

Yilu Zhou, Feng Huang, Hsinchun Chen
2007 Journal of Special Topics in Information Technology and Management  
Experiments showed that when using HMM alone, a combination of the bigram and trigram HMM approach performed the best for English-Arabic transliteration.  ...  However, language boundaries prevent information sharing and discovery across countries. Proper names play an important role in search queries and knowledge discovery.  ...  Grapheme-based approach: The grapheme-based approach uses probability to directly maps letter sequences in a source language into letter sequences in the target language.  ... 
doi:10.1007/s10799-007-0031-9 fatcat:diqx2xyhwzealptfce6shwh3zq

Compositional Machine Transliteration

A. Kumaran, Mitesh M. Khapra, Pushpak Bhattacharyya
2010 ACM Transactions on Asian Language Information Processing  
We demonstrate the functionality and performance benefits of the compositional methodology using a state of the art machine transliteration framework in English and a set of Indian languages, namely, Hindi  ...  Finally, we underscore the utility and practicality of our compositional approach by showing that a CLIR system integrated with compositional transliteration systems performs consistently on par with and  ...  Further, such measure might help identifying appropriate languages between which parallel corpora needs to be developed, there by paving way for a less resource intensive approaches for providing transliteration  ... 
doi:10.1145/1838751.1838752 fatcat:6a7diwzlrbes5hlpicbt7yriai

Urdu-English Machine Transliteration using Neural Networks [article]

Usman Mohy ud Din
2020 arXiv   pre-print
This approach is tested on three models of statistical machine translation (SMT) which include phrase-based, hierarchical phrase-based and factor based models and two models of neural machine translation  ...  In this work, we presented transliteration technique based on Expectation Maximization (EM) which is un-supervised and language independent.  ...  In phoneme based approach, grapheme of source script is first converted into phoneme of source script and then that phoneme is transliterated into grapheme of target script.  ... 
arXiv:2001.05296v1 fatcat:5zgacud2dfg5jiuzock7x5l4tq

Forward-backward transliteration of Punjabi Gurmukhi script using n-gram language model

Kapil Dev Goyal, Muhammad Raihan Abbas, Vishal Goyal, Yasir Saleem
2022 ACM Transactions on Asian and Low-Resource Language Information Processing  
We transliterate Punjabi person names from Gurmukhi script to English Roman script and from English Roman script back to Gurmukhi script using n-gram language model.  ...  Transliterating the text of a language to a foreign script is called forward transliteration and transliterating the text back to the original script is called backward transliteration.  ...  In the second step, the phonemes of the source language are converted to graphemes of target language [27] .  ... 
doi:10.1145/3542924 fatcat:7yswitrtxrdhhhkezd7tdh3qqe

Tamil Character Recognition, Translation and Transliteration System

2020 International Journal of Engineering and Advanced Technology  
English, is the informal link to all the regional languages in India and is used to publish reports, papers, magazines and records.  ...  We have created a character recognition system that converts the user's input in the Tamil language to English. Additionally, we can also perform transliteration of Tamil to English and vice versa.  ...  Grapheme and Phoneme Based Indexing In other languages, unlike Tamil for the transliteration process, only the alphabets are involved. In Tamil, both graphemes and phonemes are involved.  ... 
doi:10.35940/ijeat.d7633.049420 fatcat:sce4wgan3zgelbhcxk6quu5gau

A comprehensive survey on Indian regional language processing

B. S. Harish, R. Kasturi Rangan
2020 SN Applied Sciences  
The tasks like machine translation, Named Entity Recognition, Sentiment Analysis and Parts-Of-Speech tagging are reviewed with respect to Rule, Statistical and Neural based approaches.  ...  In this survey, the various approaches and techniques contributed by the researchers for Indian regional language processing are reviewed.  ...  The hybrid transliteration model uses both the source language grapheme and phoneme to produce the target language grapheme.  ... 
doi:10.1007/s42452-020-2983-x fatcat:e3u5r5qo7ngapj5mbiwit7qlwi

Transliteration Generation and Mining with Limited Training Resources

Sittichai Jiampojamarn, Kenneth Dwyer, Shane Bergsma, Aditya Bhargava, Qing Dou, Mi-Young Kim, Grzegorz Kondrak
2010 Named Entity Workshop  
We also explore a number of diverse resource-free and language-independent approaches to transliteration mining, which range from simple to sophisticated.  ...  Acknowledgments This research was supported by the Alberta Ingenuity Fund, Informatics Circle of Research Excellence (iCORE), and the Natural Sciences and Engineering Research Council of Canada (NSERC)  ...  The key idea of this approach is to represent each grapheme by a phoneme or a sequence of phonemes that is likely to be represented by the grapheme.  ... 
dblp:conf/aclnews/JiampojamarnDBB10 fatcat:g4mnuqlukve2vklcpo4wigbtym

Comparison of Phonemic and Graphemic Word to Sub-Word Unit Mappings for Lithuanian Phone-Level Speech Transcription

2019 Informatica  
It also compares various phoneme and grapheme based mappings across a broad range of acoustic modelling techniques including monophone and triphone based Hidden Markov models (HMM), speaker adaptively  ...  It also shows that the lowest phone error rate of an ASR system is achieved by the phoneme-based lexicon that explicitly models syllable stress and represents diphthongs as single phonetic units.  ...  to Lithuanian prosody: the intonation, rhythm, and stress" (reg. no.  ... 
doi:10.15388/informatica.2019.219 fatcat:7udhvmo2pjdgzjjc35fgnlxvoa