Machine transliteration and transliterated text retrieval: a survey

Dinesh Kumar Prabhakar, Sukomal Pal
2018 Sadhana (Bangalore)  
We start with a definition and discussion of the different types of transliteration followed by various deterministic and non-deterministic approaches used to tackle transliteration-related issues in machine  ...  A large proportion of these non-English speakers access the Internet in their native languages but use the Roman script to express themselves through various communication channels like messages and posts  ...  [68]; English-Japanese (E-J) and English-Korean (E-K); Phonemes; Used multiple transliteration engines and hypothesis re-ranking; Recall E-J / Recall E-K: 90.50% / 89.70%; Chinnakotla et al [30  ... 
doi:10.1007/s12046-018-0828-8 fatcat:dg3gwugmqrfevnzu3deuk5w67i

Compositional Machine Transliteration

A. Kumaran, Mitesh M. Khapra, Pushpak Bhattacharyya
2010 ACM Transactions on Asian Language Information Processing  
We demonstrate the functionality and performance benefits of the compositional methodology using a state-of-the-art machine transliteration framework in English and a set of Indian languages, namely, Hindi  ...  In this paper, we propose compositional machine transliteration systems, where multiple transliteration components may be composed either to improve existing transliteration quality, or to enable transliteration  ...  We conducted an extensive set of experiments to quantify  ...  Acknowledgements: We thank the NEWS 2009 organizers for the transliteration datasets and the FIRE 2008 organizers for the CLIR datasets.  ... 
doi:10.1145/1838751.1838752 fatcat:6a7diwzlrbes5hlpicbt7yriai

Learning transliteration lexicons from the web

Jin-Shea Kuo, Haizhou Li, Ying-Kuei Yang
2006 Proceedings of the 21st International Conference on Computational Linguistics and the 44th annual meeting of the ACL - ACL '06  
The learning algorithm starts with minimum prior knowledge about machine transliteration, and acquires knowledge iteratively from the Web.  ...  The learning process refines the PSM and constructs a transliteration lexicon at the same time.  ...  multiple valid Chinese transliterations and vice versa.  ... 
doi:10.3115/1220175.1220317 dblp:conf/acl/KuoLY06 fatcat:y2jqka27afb5hezfdapdqtbbze

Report of NEWS 2018 Named Entity Transliteration Shared Task

Nancy Chen, Rafael E. Banchs, Min Zhang, Xiangyu Duan, Haizhou Li
2018 Proceedings of the Seventh Named Entities Workshop  
Four performance metrics were used to report the evaluation results.  ...  Similar to previous editions of NEWS, the Shared Task featured 19 tasks on proper name transliteration, including 13 different languages and two different Japanese scripts.  ...  The performance of a system is quantified using multiple metrics (defined in Section 3).  ... 
doi:10.18653/v1/w18-2409 dblp:conf/aclnews/ChenBZDL18 fatcat:ghgbhfebsbcatpwhiya7t7kxfu

Urdu-English Machine Transliteration using Neural Networks [article]

Usman Mohy ud Din
2020 arXiv   pre-print
This approach is tested on three models of statistical machine translation (SMT), which include phrase-based, hierarchical phrase-based and factor-based models, and two models of neural machine translation  ...  The system learns the pattern and out-of-vocabulary (OOV) words from a parallel corpus, and there is no need to train it on a transliteration corpus explicitly.  ...  The challenge is the lack of a transliteration corpus for most languages, and where data is available, integration or use of transliterated words in training of machine translation engine is not available  ... 
arXiv:2001.05296v1 fatcat:5zgacud2dfg5jiuzock7x5l4tq

"Can you give me another word for hyperbaric?": Improving speech translation using targeted clarification questions

Necip Fazil Ayan, Arindam Mandal, Michael Frandsen, Jing Zheng, Peter Blasco, Andreas Kathol, Frederic Bechet, Benoit Favre, Alex Marin, Tom Kwiatkowski, Mari Ostendorf, Luke Zettlemoyer (+3 others)
2013 2013 IEEE International Conference on Acoustics, Speech and Signal Processing  
statistical machine translation (SMT) systems.  ...  Our approach initiates system-driven targeted clarification about errorful regions in user input and repairs them given user responses.  ...  The last step re-ranks the n-best lists of error segments using the estimated POS labels and dependency information from the parser and a classifier that is trained on a large set of confidence scores  ... 
doi:10.1109/icassp.2013.6639302 dblp:conf/icassp/AyanMFZBKBFMKOZSHS13 fatcat:3rafhvtmfnczdbekpmqupigi6u

Cross-lingual Unified Medical Language System entity linking in online health communities

Yonatan Bitton, Raphael Cohen, Tamar Schifter, Eitan Bachmat, Michael Elhadad, Noémie Elhadad
2020 JAMIA Journal of the American Medical Informatics Association  
Results: We carry out experiments on 3 disease-specific communities: diabetes, multiple sclerosis, and depression.  ...  We present a method to identify both transliterated and translated Hebrew medical terms and link them with UMLS entities.  ...  transliterations in multiple languages in the news domain [19]. In our work, we build on a neural machine translation model for named-entity transliteration developed for this task [20]. The model uses  ... 
doi:10.1093/jamia/ocaa150 pmid:32910823 fatcat:pjnexvgv5rcydfpgdljfrzbequ

Automatic Induction of Romanization Systems from Bilingual Corpora

Keiko TAGUCHI, Andrew FINCH, Seiichi YAMAMOTO, Eiichiro SUMITA
2015 IEICE transactions on information and systems  
We applied our approach to the task of transliteration mining, and used Levenshtein distance as the romanization selection criterion.  ...  We provide an analysis of the mechanism our approach uses to improve mining performance, and also analyse the differences in characteristics between the induced system for Japanese and the official Japanese  ...  0.71 and p(RE) = 0.05.  ... 
doi:10.1587/transinf.2014edp7236 fatcat:olrkh3qxcrdxvmtuvdbfiynjla

Restoration of Fragmentary Babylonian Texts Using Recurrent Neural Networks [article]

Ethan Fetaya, Yonatan Lifshitz, Elad Aaron, Shai Gordin
2020 arXiv   pre-print
In this work we investigate the possibility of assisting scholars and even automatically completing the breaks in ancient Akkadian texts from Achaemenid period Babylonia by modelling the language using  ...  The main source of information regarding ancient Mesopotamian history and culture are clay cuneiform tablets.  ...  The research reported here received funding from the Ministry of Science and Technology Grant 89540 and the Israel Science Foundation Grant 457/19.  ... 
arXiv:2003.01912v1 fatcat:kdyxl7nbw5auhndlvtko2ldoxe

Translation techniques in cross-language information retrieval

Dong Zhou, Mark Truran, Tim Brailsford, Vincent Wade, Helen Ashman
2012 ACM Computing Surveys  
The usual solution to this mismatch involves translating the query and/or the documents before performing the search. Translation is therefore a pivotal activity for CLIR engines.  ...  Unlike IR, CLIR must reconcile queries and documents which are written in different languages.  ...  Acknowledgments: This research was partially supported by a PhD scholarship from the University of Nottingham and funding from the Science Foundation Ireland (Grant 07/CE/I1142) as part of the Centre for  ... 
doi:10.1145/2379776.2379777 fatcat:mu5p5djufjghvn3xjppekmwnwu

English to Hindi Multi Modal Image Caption Translation

Jagroop Kaur, Gurpreet Singh Josan
2020 Journal of scientific research  
We also tried a re-ranking method. The systems are evaluated on BLEU score, RIBES score and AM/FM score. The re-ranking method proves to be the best of all our methods.  ...  Various multi-modal architectures were explored using local visual features, global visual features, attention mechanisms, and pre-trained embeddings.  ...  Acknowledgment: We acknowledge the organizers of WAT 2019 for providing data and supporting the evaluation process to obtain results for this paper.  ... 
doi:10.37398/jsr.2020.640238 fatcat:fj7fvj42lvffdcrqoh3sadgtae

Improving name tagging by reference resolution and relation detection

Heng Ji, Ralph Grishman
2005 Proceedings of the 43rd Annual Meeting on Association for Computational Linguistics - ACL '05  
We use an N-best approach to generate multiple hypotheses and have them re-ranked by subsequent stages of processing.  ...  We demonstrate this by using the results of coreference analysis and relation extraction to reduce the errors produced by a Chinese name tagger.  ...  Acknowledgements: This research was supported by the Defense Advanced Research Projects Agency under Grant N66001-04-1-8920 from SPAWAR San Diego, and by the National Science Foundation under Grant 03-25657  ... 
doi:10.3115/1219840.1219891 dblp:conf/acl/JiG05 fatcat:nknyahqvpbgzdadfncovrcbbtm

Translation of Untranslatable Words — Integration of Lexical Approximation and Phrase-Table Extension Techniques into Statistical Machine Translation

Michael PAUL, Karunesh ARORA, Eiichiro SUMITA
2009 IEICE transactions on information and systems  
This paper proposes a method for handling out-of-vocabulary (OOV) words that cannot be translated using conventional phrase-based statistical machine translation (SMT) systems.  ...  The effectiveness of the proposed methods is investigated for the translation of Hindi to English, Chinese, and Japanese.  ...  Shukla, and S.S. Agrawal of CDAC Noida and S. Nakamura of NICT for constant support and conducive environment for this work.  ... 
doi:10.1587/transinf.e92.d.2378 fatcat:tbiu3s2xwbeu7fc3nl4gkwlqda

Statistical-based system combination approach to gain advantages over different machine translation systems

Debajyoty Banik, Asif Ekbal, Pushpak Bhattacharyya, Siddhartha Bhattacharyya, Jan Platos
2019 Heliyon  
We find that the system combination model using WordNet and word2vec injection improves the machine translation accuracy.  ...  ., Hierarchical machine translation system, Bing Microsoft Translate, and Google Translate.  ...  Then, the hypothesis is created with the help of a beam search algorithm. The hypotheses are ranked by using language model, word2vec and other features.  ... 
doi:10.1016/j.heliyon.2019.e02504 pmid:31687594 pmcid:PMC6819763 fatcat:uavn7nsgqjaa3comh2icb3yadm

RankEval: Open Tool for Evaluation of Machine-Learned Ranking

Eleftherios Avramidis
2013 Prague Bulletin of Mathematical Linguistics  
Recent research and applications for evaluation and quality estimation of Machine Translation require statistical measures for comparing machine-predicted ranking against gold sets annotated by humans.  ...  , Normalized Discounted Cumulative Gain and Expected Reciprocal Rank.  ...  Maja Popović and Dr. David Vilar for their useful feedback.  ... 
doi:10.2478/pralin-2013-0012 fatcat:bxcuu3yn3ndhvehk2z4sxouqma