Pronunciation of proper names with a joint n-gram model for bi-directional grapheme-to-phoneme conversion

Lucian Galescu, James F. Allen
2002 7th International Conference on Spoken Language Processing (ICSLP 2002)   unpublished
In this paper we apply the joint n-gram model for bi-directional grapheme-to-phoneme conversion, which has already been shown to achieve excellent results on general tasks, to the more specific task of  ...  Pronunciation of proper names is known to be a difficult problem, but one of great practical importance for both speech synthesis and speech recognition.  ...  ACKNOWLEDGEMENTS This work has been supported by ONR grant N00014-01-1-1015, DARPA grant F30602-98-2-0133, and a grant from the W.M. Keck Foundation.  ... 
doi:10.21437/icslp.2002-79 fatcat:hmkf7vqitvbnnnyzb4mkq3hmvu

Applying log linear model based context dependent machine translation techniques to grapheme-to-phoneme conversion

Rong Zhang, Bowen Zhou
2010 2010 IEEE International Conference on Acoustics, Speech and Signal Processing  
Grapheme-to-Phoneme conversion is a challenging task for speech recognition and text-to-speech systems for which the functionality of automatically predicting pronunciations for OOV words is highly desirable  ...  In this paper, Grapheme-to-Phoneme conversion is viewed as a special case of sequence translation problem and we propose to tackle it with phrase based log-linear translation model.  ...  of MT Methods on CMUDict Compared with Joint N-gram Model  ... 
doi:10.1109/icassp.2010.5495551 dblp:conf/icassp/ZhangZ10 fatcat:7fxa6vxis5cadcoyaawpijprmu

Survey on Machine Transliteration and Machine Learning Models

Dhore M L, Dhore R M, Rathod P H
2015 International Journal on Natural Language Computing  
Globalization and growth of Internet users truly demands for almost all internet based applications to support local languages.  ...  This paper provides the thorough survey on machine transliteration models and machine learning approaches used for machine transliteration over the period of more than two decades for internationally used  ...  Martin Jansche and Richard Sproat (2009) performed the named entity transcription with a pair of n-gram models at Google Inc. They used different size n-grams for different pairs.  ... 
doi:10.5121/ijnlc.2015.4202 fatcat:kegqa5k4abahvbno2setnkxwtq

Low-Resource Machine Transliteration Using Recurrent Neural Networks of Asian Languages

Ngoc Tan Le, Fatiha Sadat
2018 Proceedings of the Seventh Named Entities Workshop  
With low-resource language pairs that do not have available and well-developed pronunciation lexicons, grapheme-to-phoneme models are particularly useful.  ...  Grapheme-to-phoneme models are key components in automatic speech recognition and text-to-speech systems.  ...  Acknowledgements We thank the anonymous reviewers for their insightful comments.  ...
doi:10.18653/v1/w18-2414 dblp:conf/aclnews/LeS18 fatcat:fdb6c4gfmbfvfg4dqoqlltehne

Improving recognition of proper nouns in ASR through generating and filtering phonetic transcriptions

Antoine Laurent, Sylvain Meignier, Paul Deléglise
2014 Computer Speech and Language  
A rule-based grapheme-to-phoneme generator (LIA PHON), a knowledge-based approach (JSM), and a Statistical Machine Translation based system were evaluated for this alignment.  ...  of mere pronunciation rules.  ...  Acknowledgement Special thanks to Dr. Teva Merlin for his help with this work.  ... 
doi:10.1016/j.csl.2014.02.006 fatcat:nf6o5if3dbbdvm6mwhnj5wzah4

Large vocabulary Russian speech recognition using syntactico-statistical language modeling

Alexey Karpov, Konstantin Markov, Irina Kipyatkova, Daria Vazhenina, Andrey Ronzhin
2014 Speech Communication  
For the language model (LM), we introduced a new method that combines syntactical and statistical analysis of the training text data in order to build better n-gram models.  ...  For the acoustic model, we investigated a combination of knowledge-based and statistical approaches to create several different phoneme sets, the best of which was determined experimentally.  ...  Acknowledgements This research is supported by the Ministry of Education and Science of Russia (contract No. 07.514.11.4139), by the grant of the President of Russia (project No.  ... 
doi:10.1016/j.specom.2013.07.004 fatcat:hq2vkvwdlzgqlhyi44duyh44hq

An encoder-decoder based grapheme-to-phoneme converter for Bangla speech synthesis

Arif Ahmad, Mohammad Reza Selim, Muhammed Zafar Iqbal, Mohammad Shahidur Rahman
2019 Acoustical Science and Technology  
This paper proposes an encoder-decoder based sequence-to-sequence model for Grapheme-to-Phoneme (G2P) conversion in Bangla (Exonym: Bengali).  ...  In contrast to joint-sequence based G2P models, our encoder-decoder based model has the flexibility of not requiring explicit grapheme-to-phoneme alignment which are not straightforward to perform.  ...  [10] used deep bi-directional LSTM (DBLSTM) with a connectionist temporal classification (CTC) layer. They combined this model with a joint n-gram model and obtained a WER of 21.3% on CMUDict.  ...
doi:10.1250/ast.40.374 fatcat:ff6ktagmszcwhgcdsvozwdrziu

Hindi to English Machine Transliteration of Named Entities using Conditional Random Fields

Manikrao L Dhore, Shantanu K Dixit, Tushar D Sonwalkar
2012 International Journal of Computer Applications  
The input to the system is provided in the form of syllabification in order to apply the n-gram techniques.  ...  As more than 50% named entities are formed as a combination of two and three syllabic units, the n-gram approach with unigrams, bigrams and trigrams of Hindi are used to train the corpus.  ...  Conceptually, it is a direct orthographical mapping from source graphemes to target graphemes [4]. Phoneme-based model considers transliteration as a phonetic process.  ...
doi:10.5120/7522-0624 fatcat:zgovvjg3m5ah7kqmz5zwwo6u74

Direct Combination of Spelling and Pronunciation Information for Robust Back-Transliteration [chapter]

Slaven Bilac, Hozumi Tanaka
2005 Lecture Notes in Computer Science  
Rather than producing back-transliterations based on grapheme and phoneme model independently and then interpolating the results, we propose a method of first combining the sets of allowed rewrites (i.e  ...  Transliterating words and names from one language to another is a frequent and highly productive phenomenon. For example, English word cache is transliterated in Japanese as "kyasshu".  ...  We would like to thank Zhang Min for help with Chinese evaluation data and an anonymous reviewer for valuable comments.  ... 
doi:10.1007/978-3-540-30586-6_44 fatcat:wqdqpcwxffbebmu2gmfs3ligmu

Machine transliteration and transliterated text retrieval: a survey

Dinesh Kumar Prabhakar, Sukomal Pal
2018 Sadhana (Bangalore)  
With the advent of Web 2.0, user-generated content is increasing on the Web at a very rapid rate. A substantial proportion of this content is transliterated data.  ...  We start with a definition and discussion of the different types of transliteration followed by various deterministic and non-deterministic approaches used to tackle transliteration-related issues in machine  ...  The model allows DOM between two languages through a joint source-channel model (n-gram transliteration model).  ... 
doi:10.1007/s12046-018-0828-8 fatcat:dg3gwugmqrfevnzu3deuk5w67i

Morphological decomposition in Arabic ASR systems

F. Diehl, M.J.F. Gales, M. Tomalin, P.C. Woodland
2012 Computer Speech and Language  
In particular, a novel solution for morpheme-to-word conversion is presented which makes use of an N-gram Statistical Machine Translation (SMT) approach.  ...  System integration issues concerning language modelling and dictionary construction, as well as the estimation of pronunciation probabilities, are discussed.  ...  To investigate the implications of context length, three LMs were built - a uni-gram, a bi-gram, and a tri-gram - and for the actual morpheme-to-word conversion the MARIE N-gram SMT decoder was used. 13 Decoding  ...
doi:10.1016/j.csl.2011.12.001 fatcat:eg7tdlhwyjaubdmqazwrfcjvky

String Transduction with Target Language Models and Insertion Handling [article]

Garrett Nicolai, Saeed Najafi, Grzegorz Kondrak
2018 arXiv   pre-print
generation, and phoneme-to-grapheme conversion.  ...  We show that leveraging target language models derived from unannotated target corpora, combined with a precise alignment of the training data, yields state-of-the-art results on cognate projection, inflection  ...  We thank the members of the University of Alberta teams who collaborated with us in the context of the 2018 shared tasks on transliteration and morphological reinflection: Bradley Hauer, Rashed Rubby Riyadh  ...
arXiv:1809.07182v1 fatcat:2rngjmeuhjhcra5rkyilhn42vm

A comprehensive survey on Indian regional language processing

B. S. Harish, R. Kasturi Rangan
2020 SN Applied Sciences  
The tasks like machine translation, Named Entity Recognition, Sentiment Analysis and Parts-Of-Speech tagging are reviewed with respect to Rule, Statistical and Neural based approaches.  ...  Processing of these natural languages for various language processing tasks is challenging. The Indian regional languages are considered to be low resourced when compared to other languages.  ...  The n-gram tokenization is a token of n-words where 'n' indicates the number of words taken together for a lexical unit.  ... 
doi:10.1007/s42452-020-2983-x fatcat:e3u5r5qo7ngapj5mbiwit7qlwi

D2.1 Libraries and tools for multimodal content analysis

David Doukhan, Danny Francis, Benoit Huet, Sami Keronen, Mikko Kurimo, Jorma Laaksonen, Tiina Lindh-Knuutila, Bernard Merialdo, Mats Sjöberg, Umut Sulubacak, Jörg Tiedemann, Kim Viljanen
2018 Zenodo  
This deliverable describes a joint collection of libraries and tools for multimodal content analysis created by the MeMAD project partners.  ...  As part of this deliverable, the open source components have been gathered into a joint software collection of tools and libraries publicly available on GitHub.  ...  1 Acknowledgements Computational resources were provided by the Aalto Science-IT project and the CSC -IT Center for Science, Finland.  ... 
doi:10.5281/zenodo.3697989 fatcat:bde5x3yggzb2jk2fh2mu6t5wxy

Chester: Towards a personal medication advisor

James Allen, George Ferguson, Nate Blaylock, Donna Byron, Nathanael Chambers, Myroslava Dzikovska, Lucian Galescu, Mary Swift
2006 Journal of Biomedical Informatics  
The emphasis of this paper is on the portability of our generic spoken dialogue technology, and presents a case study of the application of these techniques to the development of a dialogue system for  ...  In this paper, we describe Chester, a prototype intelligent assistant that interacts with its user via conversational natural spoken language to provide them with information and advice regarding their  ...  Although our joint n-gram model for bi-directional grapheme to phoneme conversion has demonstrated excellent performance both on random words [33] and on more difficult, specialized cases like names  ... 
doi:10.1016/j.jbi.2006.02.004 pmid:16545620 fatcat:tyfbd42vafbs5cqmbh3v6bbzr4