Machine transliteration and transliterated text retrieval: a survey
2018
Sadhana (Bangalore)
We start with a definition and discussion of the different types of transliteration followed by various deterministic and non-deterministic approaches used to tackle transliteration-related issues in machine ...
A large proportion of these non-English speakers access the Internet in their native languages but use the Roman script to express themselves through various communication channels like messages and posts ...
[68], English-Japanese (E-J) and English-Korean (E-K); phonemes; used multiple transliteration engines and hypothesis re-ranking; recall E-J 90.50%, recall E-K 89.70%. Chinnakotla et al [30 ...
doi:10.1007/s12046-018-0828-8
fatcat:dg3gwugmqrfevnzu3deuk5w67i
Compositional Machine Transliteration
2010
ACM Transactions on Asian Language Information Processing
We demonstrate the functionality and performance benefits of the compositional methodology using a state of the art machine transliteration framework in English and a set of Indian languages, namely, Hindi ...
In this paper, we propose compositional machine transliteration systems, where multiple transliteration components may be composed either to improve existing transliteration quality, or to enable transliteration ...
We conducted an extensive set of experiments to quantify
ACKNOWLEDGEMENTS We thank the NEWS 2009 organizers for the transliteration datasets and the FIRE 2008 organizers for the CLIR datasets. ...
doi:10.1145/1838751.1838752
fatcat:6a7diwzlrbes5hlpicbt7yriai
Learning transliteration lexicons from the web
2006
Proceedings of the 21st International Conference on Computational Linguistics and the 44th annual meeting of the ACL - ACL '06
The learning algorithm starts with minimum prior knowledge about machine transliteration, and acquires knowledge iteratively from the Web. ...
The learning process refines the PSM and constructs a transliteration lexicon at the same time. ...
multiple valid Chinese transliterations and vice versa. ...
doi:10.3115/1220175.1220317
dblp:conf/acl/KuoLY06
fatcat:y2jqka27afb5hezfdapdqtbbze
Report of NEWS 2018 Named Entity Transliteration Shared Task
2018
Proceedings of the Seventh Named Entities Workshop
Four performance metrics were used to report the evaluation results. ...
Similar to previous editions of NEWS, the Shared Task featured 19 tasks on proper name transliteration, including 13 different languages and two different Japanese scripts. ...
The performance of a system is quantified using multiple metrics (defined in Section 3). ...
doi:10.18653/v1/w18-2409
dblp:conf/aclnews/ChenBZDL18
fatcat:ghgbhfebsbcatpwhiya7t7kxfu
Urdu-English Machine Transliteration using Neural Networks
[article]
2020
arXiv
pre-print
This approach is tested on three models of statistical machine translation (SMT), which include phrase-based, hierarchical phrase-based and factor-based models, and two models of neural machine translation ...
The system learns patterns and out-of-vocabulary (OOV) words from a parallel corpus, so there is no need to train it explicitly on a transliteration corpus. ...
The challenge is the lack of a transliteration corpus for most languages and, where data is available, the integration or use of transliterated words in training a machine translation engine is not available ...
arXiv:2001.05296v1
fatcat:5zgacud2dfg5jiuzock7x5l4tq
"Can you give me another word for hyperbaric?": Improving speech translation using targeted clarification questions
2013
2013 IEEE International Conference on Acoustics, Speech and Signal Processing
statistical machine translation (SMT) systems. ...
Our approach initiates system-driven targeted clarification about errorful regions in user input and repairs them given user responses. ...
The last step re-ranks the n-best lists of error segments using the estimated POS labels and dependency information from the parser and a classifier that is trained on a large set of confidence scores ...
doi:10.1109/icassp.2013.6639302
dblp:conf/icassp/AyanMFZBKBFMKOZSHS13
fatcat:3rafhvtmfnczdbekpmqupigi6u
Cross-lingual Unified Medical Language System entity linking in online health communities
2020
JAMIA Journal of the American Medical Informatics Association
Results We carry out experiments on 3 disease-specific communities: diabetes, multiple sclerosis, and depression. ...
We present a method to identify both transliterated and translated Hebrew medical terms and link them with UMLS entities. ...
transliterations in multiple languages in the news domain [19]. In our work, we build on a neural machine translation model for named-entity transliteration developed for this task [20]. The model uses ...
doi:10.1093/jamia/ocaa150
pmid:32910823
fatcat:pjnexvgv5rcydfpgdljfrzbequ
Automatic Induction of Romanization Systems from Bilingual Corpora
2015
IEICE transactions on information and systems
We applied our approach to the task of transliteration mining, and used Levenshtein distance as the romanization selection criterion. ...
We provide an analysis of the mechanism our approach uses to improve mining performance, and also analyse the differences in characteristics between the induced system for Japanese and the official Japanese ...
0.71 and p(RE) = 0.05. ...
doi:10.1587/transinf.2014edp7236
fatcat:olrkh3qxcrdxvmtuvdbfiynjla
Restoration of Fragmentary Babylonian Texts Using Recurrent Neural Networks
[article]
2020
arXiv
pre-print
In this work we investigate the possibility of assisting scholars and even automatically completing the breaks in ancient Akkadian texts from Achaemenid period Babylonia by modelling the language using ...
The main sources of information regarding ancient Mesopotamian history and culture are clay cuneiform tablets. ...
The research reported here received funding from the Ministry of Science and Technology Grant 89540 and the Israel Science Foundation Grant 457/19. ...
arXiv:2003.01912v1
fatcat:kdyxl7nbw5auhndlvtko2ldoxe
Translation techniques in cross-language information retrieval
2012
ACM Computing Surveys
The usual solution to this mismatch involves translating the query and/or the documents before performing the search. Translation is therefore a pivotal activity for CLIR engines. ...
Unlike IR, CLIR must reconcile queries and documents which are written in different languages. ...
ACKNOWLEDGMENTS This research was partially supported by a PhD scholarship from the University of Nottingham and funding from the Science Foundation Ireland (Grant 07/CE/I1142) as part of the Centre for ...
doi:10.1145/2379776.2379777
fatcat:mu5p5djufjghvn3xjppekmwnwu
English to Hindi Multi Modal Image Caption Translation
2020
Journal of scientific research
We also tried a re-ranking method. The systems are evaluated on BLEU score, RIBES score and AM/FM score. The re-ranking method proves to be the best of all our methods. ...
Various multi-modal architectures were explored using local visual features, global visual features, attention mechanisms, and pre-trained embeddings. ...
Acknowledgment We acknowledge the organizers of WAT 2019 for providing data and supporting the evaluation process to obtain results for this paper. ...
doi:10.37398/jsr.2020.640238
fatcat:fj7fvj42lvffdcrqoh3sadgtae
Improving name tagging by reference resolution and relation detection
2005
Proceedings of the 43rd Annual Meeting on Association for Computational Linguistics - ACL '05
We use an N-best approach to generate multiple hypotheses and have them re-ranked by subsequent stages of processing. ...
We demonstrate this by using the results of coreference analysis and relation extraction to reduce the errors produced by a Chinese name tagger. ...
Acknowledgements This research was supported by the Defense Advanced Research Projects Agency under Grant N66001-04-1-8920 from SPAWAR San Diego, and by the National Science Foundation under Grant 03-25657 ...
doi:10.3115/1219840.1219891
dblp:conf/acl/JiG05
fatcat:nknyahqvpbgzdadfncovrcbbtm
Translation of Untranslatable Words — Integration of Lexical Approximation and Phrase-Table Extension Techniques into Statistical Machine Translation
2009
IEICE transactions on information and systems
This paper proposes a method for handling out-of-vocabulary (OOV) words that cannot be translated using conventional phrase-based statistical machine translation (SMT) systems. ...
The effectiveness of the proposed methods is investigated for the translation of Hindi to English, Chinese, and Japanese. ...
Shukla, and S.S. Agrawal of CDAC Noida and S. Nakamura of NICT for their constant support and a conducive environment for this work. ...
doi:10.1587/transinf.e92.d.2378
fatcat:tbiu3s2xwbeu7fc3nl4gkwlqda
Statistical-based system combination approach to gain advantages over different machine translation systems
2019
Heliyon
We find that the system combination model using WordNet and word2vec injection improves the machine translation accuracy. ...
., Hierarchical machine translation system, Bing Microsoft Translate, and Google Translate. ...
Then, the hypothesis is created with the help of a beam search algorithm. The hypotheses are ranked using a language model, word2vec and other features. ...
doi:10.1016/j.heliyon.2019.e02504
pmid:31687594
pmcid:PMC6819763
fatcat:uavn7nsgqjaa3comh2icb3yadm
RankEval: Open Tool for Evaluation of Machine-Learned Ranking
2013
Prague Bulletin of Mathematical Linguistics
Recent research and applications for evaluation and quality estimation of Machine Translation require statistical measures for comparing machine-predicted ranking against gold sets annotated by humans. ...
, Normalized Discounted Cumulative Gain and Expected Reciprocal Rank. ...
Maja Popović and Dr. David Vilar for their useful feedback. ...
doi:10.2478/pralin-2013-0012
fatcat:bxcuu3yn3ndhvehk2z4sxouqma