A New Word Clustering Method for Building N-Gram Language Models in Continuous Speech Recognition Systems [chapter]

Mohammad Bahrani, Hossein Sameti, Nazila Hafezi, Saeedeh Momtazi
Lecture Notes in Computer Science  
In this paper a new method for automatic word clustering is presented. We used this method for building n-gram language models for Persian continuous speech recognition (CSR) systems.  ...  Also reduction in word error rate of CSR system is about 16% compared with a manual clustering method.  ...  The class n-gram language models have been extracted from the clustering results and evaluated in our continuous speech recognition system.  ... 
doi:10.1007/978-3-540-69052-8_30 fatcat:5orn6pdlqfhk3egfleihww4ati

Recent advances in speech recognition for spontaneous speech translation

Y. Sagisaka
1998 Journal of the Acoustical Society of America  
Component technologies form a set of flexible software tools ATRSPREC for building a recognition system and they are used in our new speech translation system MATRIX.  ...  makes it possible to recognize spontaneous speech whose linguistic structures are not definitely specified by a conventional grammar for written language.  ...  LANGUAGE MODELING AND THE EXTRACTION OF LINGUISTIC CONSTRAINTS FROM LANGUAGE CORPORA Word N-grams have been widely used as effective linguistic constraints to reduce search efforts in continuous speech recognition  ... 
doi:10.1121/1.421589 fatcat:fddswetoefh2tjhk4jeojjvole

Incorporating Grammatical Features in the Modeling of the Slovak Language for Continuous Speech Recognition [chapter]

Jan Stas, Daniel Hladek, Jozef Juhar
2012 Modern Speech Recognition Approaches with Case Studies  
Acknowledgement The research presented in this paper was supported by the Ministry of Education under the research project MŠ SR 3928/2010-11 (50%) and Research and Development Operational Program funded  ...  new words into the speech recognition system without the need of re-training the language model; • better estimate probabilities of those n-grams that did not occur in the training corpus.  ...  For word-based n-gram language model, there is a probability value for each n-gram, as well as back-off weight for lower order n-grams.  ... 
doi:10.5772/48506 fatcat:g7r6yjyueja2hliid2n6kykqx4

A large vocabulary continuous speech recognition system for Persian language

Hossein Sameti, Hadi Veisi, Mohammad Bahrani, Bagher Babaali, Khosro Hosseinzadeh
2011 EURASIP Journal on Audio, Speech, and Music Processing  
This continuous speech recognition system uses most standard and state-of-the-art speech and language modeling techniques.  ...  A new robustness method called PC-PMC was also proposed and incorporated in the system.  ...  Since the size of this edition of the corpus was not enough for making a reliable word-based n-gram language model, we built POS-based and class-based n-gram language models, in addition to the word-based  ... 
doi:10.1186/1687-4722-2011-426795 fatcat:wm2tzdvtpfealbl2vvxtecueqe

Linearly Interpolated Hierarchical N-gram Language Models for Speech Recognition Engines [chapter]

Imed Zitouni, Qiru Zhou
2007 Robust Speech Recognition and Understanding  
Introduction Language modeling is a crucial component in natural language continuous speech recognition, due to the difficulty involved by continuous speech [1], [2].  ...  n-gram language models and backoff n-gram language models in terms of perplexity and also in terms of word error rate when integrated into a speech recognizer engine.  ...  Two chapters on the automatic recognition of a speaker's emotional state highlight the importance of natural speech understanding and interpretation in voice-driven systems.  ... 
doi:10.5772/4756 fatcat:o3qsnubfqzdtpcskuuupyiv6sy

Factored language model adaptation using Dirichlet class language model for speech recognition

Ali Hatami, Ahmad Akbari, Babak Nasersharif
2013 The 5th Conference on Information and Knowledge Technology  
In this paper, we present an idea for using syntactic information such as part-of-speech (POS) in DCLM for combining with one of the language models of n-gram family.  ...  In our work, word clustering is based on POS of previous words and history words in DCLM.  ...  Factored language model (FLM) [3] is another kind of n-gram models. The FLM was proposed using factors for each word. In a FLM, a word is considered as a vector of K factors.  ... 
doi:10.1109/ikt.2013.6620107 fatcat:bgytghpurvhu3p4wuvmtvrfdvi

Vietnamese large vocabulary continuous speech recognition

Ngoc Thang Vu, Tanja Schultz
2009 2009 IEEE Workshop on Automatic Speech Recognition & Understanding  
Our currently best recognition system achieves a word error rate of 11.7% on read newspaper speech.  ...  To bootstrap the Vietnamese speech recognition system we used our Rapid Language Adaptation scheme applying a multilingual phone inventory.  ...  Our Rapid Language Adaptation Tools (RLAT) [9] aim to significantly reduce the amount of time and effort involved in building speech processing systems for new languages.  ... 
doi:10.1109/asru.2009.5373424 dblp:conf/asru/VuS09 fatcat:2gdmaqi445elrbmnrtwuqady24

Large vocabulary Russian speech recognition using syntactico-statistical language modeling

Alexey Karpov, Konstantin Markov, Irina Kipyatkova, Daria Vazhenina, Andrey Ronzhin
2014 Speech Communication  
For the language model (LM), we introduced a new method that combines syntactical and statistical analysis of the training text data in order to build better n-gram models.  ...  In this paper, we describe our efforts to build an automatic speech recognition (ASR) system for the Russian language with a large vocabulary.  ...  MK-1880.2012.8), by the Russian Foundation for Basic Research (project No. 12-08-01265) and by the Russian Humanitarian Scientific Foundation (project No. 12-04-12062).  ... 
doi:10.1016/j.specom.2013.07.004 fatcat:hq2vkvwdlzgqlhyi44duyh44hq

Recurrent neural network-based language modeling for an automatic Russian speech recognition system

Irina Kipyatkova, Alexey Karpov
2015 2015 Artificial Intelligence and Natural Language and Information Extraction, Social Media and Web Search FRUCT Conference (AINL-ISMW FRUCT)  
In the paper, we describe research on recurrent neural network language models for N-best list rescoring in automatic continuous Russian speech recognition.  ...  We tried recurrent neural networks with different numbers of units in the hidden layer. We achieved a relative word error rate reduction of 14% with respect to the baseline 3-gram model.  ...  The block of n-gram model creation performs statistical analysis of the text corpus and builds a stochastic n-gram language model.  ... 
doi:10.1109/ainl-ismw-fruct.2015.7382966 fatcat:uhtstyiuprfjrfvpp37yj3sr5m

The ATR Multilingual Speech-to-Speech Translation System

S. Nakamura, K. Markov, H. Nakaiwa, G. Kikui, H. Kawai, T. Jitsuhiro, J.-S. Zhang, H. Yamamoto, E. Sumita, S. Yamamoto
2006 IEEE Transactions on Audio, Speech, and Language Processing  
There are three main modules of our S2ST system: large-vocabulary continuous speech recognition, machine text-to-text (T2T) translation, and text-to-speech synthesis.  ...  In this paper, we describe the ATR multilingual speech-to-speech translation (S2ST) system, which is mainly focused on translation between English and Asian languages (Japanese and Chinese).  ...  In this model, higher-order word N-grams are partially introduced by regarding frequent variable-length word sequences as new word succession entries.  ... 
doi:10.1109/tsa.2005.860774 fatcat:uuilgbilm5bzraladttb6m7wtm

Utilization of Huge Written Text Corpora for Conversational Speech Recognition

Xinhui Hu, Hirofumi Yamamoto, Jinsong Zhang, Keiji Yasuda, Youzheng Wu, Hideki Kashioka
2008 2008 6th International Symposium on Chinese Spoken Language Processing  
In this paper, we propose a new sentence selection method using large written text corpora to augment the language model of conversational speech recognition in order to resolve the insufficiency of in-domain  ...  Next, utterances are selected and mixed with the original conversational training corpus, and language models for conversational speech recognition are built.  ...  To confirm the validity of the addition of clustered sentences in speech recognition, we build a bigram language model and a trigram language model for decoding and rescoring in speech recognition, respectively  ... 
doi:10.1109/chinsl.2008.ecp.36 dblp:conf/iscslp/HuYZYWK08 fatcat:blv5hb35rnfzzml5erksfcqm4u

The SPHINX-II speech recognition system: an overview

Xuedong Huang, Fileno Alleva, Hsiao-Wuen Hon, Mei-Yuh Hwang, Kai-Fu Lee, Ronald Rosenfeld
1993 Computer Speech and Language  
In this paper, we review the SPHINX-II speech recognition system and summarize our recent efforts on improved speech recognition.  ...  In order for speech recognizers to deal with increased task perplexity, speaker variation, and environment variation, improved speech recognition is critical.  ...  Long Distance Bigrams In a traditional stochastic language model, the current word is predicted based on the preceding word (bigram) or the preceding n-1 words (n-gram).  ... 
doi:10.1006/csla.1993.1007 fatcat:4ah33oefirebvjb64f7ezmb4tq

Croatian Large Vocabulary Automatic Speech Recognition

Sanda Martinčić-Ipšić, Miran Pobar, Ivo Ipšić
2011 Automatika  
In addition, Croatian language modeling procedures are evaluated and adopted for speaker independent spontaneous speech recognition.  ...  Original scientific paper This paper presents procedures used for development of a Croatian large vocabulary automatic speech recognition system (LVASR).  ...  Many large vocabulary automatic speech recognition (LVASR) systems use mel-cepstral speech analysis, hidden Markov modeling of acoustic subword units, n-gram language models (LM) and n-best search of word  ... 
doi:10.1080/00051144.2011.11828413 fatcat:hytrvxn72bdmbo5r5jadcga5la

N-Gram Language Model based Continuous Voiced Odia Digit Recognition

2019 International journal of recent technology and engineering  
The performance of the model is explored for different levels of HMM like word-level and phoneme-level. Further the model output is evaluated using different N-Gram approaches of the language model.  ...  A continuous speech recognition system is essential for voice identification hands free system used as a voice dialer, voice originated security systems and voice based automatic electronic machines.  ...  Comparison between Isolated and N-gram Language Model Speech recognition systems are generally categorized as either isolated system or continuous system.  ... 
doi:10.35940/ijrte.b3273.078219 fatcat:tpgt2xoujrg7tgdepdbubi7mpe

Selected Topics from LVCSR Research for Asian Languages at Tokyo Tech

Sadaoki FURUI
2012 IEICE transactions on information and systems  
This paper presents our recent work in regard to building Large Vocabulary Continuous Speech Recognition (LVCSR) systems for the Thai, Indonesian, and Chinese languages.  ...  For Thai, since there is no word boundary in the written form, we have proposed a new method for automatically creating word-like units from a text corpus, and applied topic and speaking style adaptation  ...  Since these words are usually spoken in a query sentence or a phrase, our future work includes evaluation in the context of continuous speech recognition.  ... 
doi:10.1587/transinf.e95.d.1182 fatcat:xrbyx236qjdtrh6pqqhlf6py64