Building DNN Acoustic Models for Large Vocabulary Speech Recognition [article]

Andrew L. Maas, Peng Qi, Ziang Xie, Awni Y. Hannun, Christopher T. Lengerich, Daniel Jurafsky, Andrew Y. Ng
2015 arXiv   pre-print
This paper offers an empirical investigation on which aspects of DNN acoustic model design are most important for speech recognition system performance.  ...  This larger corpus allows us to more thoroughly examine performance of large DNN models -- with up to ten times more parameters than those typically used in speech recognition systems.  ...  INTRODUCTION Deep neural network (DNN) acoustic models have driven tremendous improvements in large vocabulary continuous speech recognition (LVCSR) in recent years.  ... 
arXiv:1406.7806v2 fatcat:4oc3szk3sbewlgstnnaq2s25ca

Building DNN acoustic models for large vocabulary speech recognition

Andrew L. Maas, Peng Qi, Ziang Xie, Awni Y. Hannun, Christopher T. Lengerich, Daniel Jurafsky, Andrew Y. Ng
2017 Computer Speech and Language  
We investigate which aspects of DNN acoustic model design are most important for speech recognition system performance, focusing on feed-forward networks.  ...  Our findings extend previous works to help establish a set of best practices for building DNN hybrid speech recognition systems and constitute an important first step toward analyzing more complex recurrent  ...  Introduction Deep neural network (DNN) acoustic models have driven tremendous improvements in large vocabulary continuous speech recognition (LVCSR) in recent years.  ... 
doi:10.1016/j.csl.2016.06.007 fatcat:6di43jn56jcqhbugprefh4ykby

Mongolian Speech Recognition Based on Deep Neural Networks [chapter]

Hui Zhang, Feilong Bao, Guanglai Gao
2015 Lecture Notes in Computer Science  
Experimental results show that the DNN-based models outperform the conventional models which based on Gaussian Mixture Models (GMMs) for the Mongolian speech recognition, by a large margin.  ...  And better Mongolian Large Vocabulary Continuous Speech Recognition (LVCSR) systems are required.  ...  In this study, we bring the success of DNN-HMM into the Mongolian ASR research, and build a Mongolian Large Vocabulary Continuous Speech Recognition (LVCSR) system.  ... 
doi:10.1007/978-3-319-25816-4_15 fatcat:oexmhqosxrcfdhc6aefc5ezg6i

Unsupervised acoustic model training for the Korean language

Antoine Laurent, William Hartmann, Lori Lamel
2014 The 9th International Symposium on Chinese Spoken Language Processing  
We compare both GMM and DNN acoustic models for both the unsupervised transcription and the final recognition system.  ...  As with previous studies, we begin with only a small amount of manually transcribed data to build preliminary acoustic models.  ...  Introduction For languages with limited resources, building Large Vocabulary Continuous Speech Recognition (LVCSR) systems is a challenge.  ... 
doi:10.1109/iscslp.2014.6936675 dblp:conf/iscslp/LaurentHL14 fatcat:d2khaedqwrhu3d6ktwrxvf26pa

Speech recognition using deep neural network - recent trends

Mousmita Sarma
2017 International Journal of Intelligent Systems Design and Computing  
The later part explains the DNN-based acoustic modelling for speech recognition and recent technology developments reported and the ones available for actual use.  ...  ANNs with deep learning which uses a generative, layer-by-layer pre-training method for initialising the weights has provided best solution for acoustic modelling for speech recognition.  ...  Thus the ability of DNN to build up a complex hierarchy of concepts with layer-by-layer pre-training has made it suitable for acoustic modelling of speech recognition.  ... 
doi:10.1504/ijisdc.2017.082853 fatcat:74c7x6rognayjpbuverpumox54

Improving Large Vocabulary Urdu Speech Recognition System Using Deep Neural Networks

Muhammad Umar Farooq, Farah Adeeba, Sahar Rauf, Sarmad Hussain
2019 Interspeech 2019  
Development of Large Vocabulary Continuous Speech Recognition (LVCSR) system is a cumbersome task, especially for low resource languages.  ...  In addition, Recurrent Neural Network Language Model (RNNLM) is also being used for re-scoring.  ...  Hidden Markov Models (HMMs) [8] is the widely used technique to build acoustic models for speech recognition systems [9].  ... 
doi:10.21437/interspeech.2019-2629 dblp:conf/interspeech/FarooqARH19 fatcat:2dj23qewlnd57buj3w3j2dcake

Deep neural networks for syllable based acoustic modeling in Chinese speech recognition

Xiangang Li, Caifu Hong, Yuning Yang, Xihong Wu
2013 2013 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference  
This paper reports the work about applying DNNs for syllable based acoustic modeling in Chinese automatic speech recognition (ASR).  ...  Recently, the deep neural networks (DNNs) based acoustic modeling methods have been successfully applied to many speech recognition tasks.  ...  Afterwards, context-dependent pre-trained DNN/HMMs for large vocabulary speech recognition [3] and real-world data have been reported [4] .  ... 
doi:10.1109/apsipa.2013.6694176 dblp:conf/apsipa/LiHYW13 fatcat:yqseqqiuxrdsjlhkkpj5ig3mbq

Deep Learning Based Automatic Speech Recognition for Turkish

Burak TOMBALOĞLU, Hamit ERDEM
2020 Sakarya University Journal of Science  
Although DNN has been applied for solving Automatic Speech Recognition (ASR) problem in some languages, DNN-based Turkish Speech Recognition has not been studied extensively.  ...  Each phoneme of Turkish language is also modelled as a sub-word in the model. Sub-word (morpheme) based language model is widely used for agglutinative languages to prevent excessive vocabulary size.  ...  Many studies show that DNN outperforms GMM at acoustic modelling which is used for ASRs having large datasets and vocabularies.  ... 
doi:10.16984/saufenbilder.711888 fatcat:xvdani7y4nfelnnknpknrcn5oq

Multilingual exemplar-based acoustic model for the NIST Open KWS 2015 evaluation

Van Hai Do, Xiong Xiao, Haihua Xu, Eng Siong Chng, Haizhou Li
2015 2015 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA)  
With this architecture, our exemplar-based model outperforms the 9-layer-DNN acoustic model significantly for both the speech recognition and keyword search tasks.  ...  Specifically, kernel-density model is used to replace GMM in HMM/GMM (Hidden Markov Model / Gaussian Mixture Model) or DNN in HMM/DNN (Hidden Markov Model / Deep Neural Network) acoustic model to predict  ...  build the speech recognition system.  ... 
doi:10.1109/apsipa.2015.7415338 dblp:conf/apsipa/DoXXCL15 fatcat:6sthbmtvvffftgs456xt7zm2ce

Computational intelligence in processing of speech acoustics: a survey

Amitoj Singh, Navkiran Kaur, Vinay Kukreja, Virender Kadyan, Munish Kumar
2022 Complex & Intelligent Systems  
This paper presents a comprehensive survey on the speech recognition techniques for non-Indian and Indian languages, and compiled some of the computational models used for processing speech acoustics.  ...  This paper examined major challenges for speech recognition for different languages.  ...  models used for building speech recognition systems.  ... 
doi:10.1007/s40747-022-00665-1 fatcat:6pu2xccbq5as7bn2y2tav2fdwa

Multi-task deep neural network acoustic models with model adaptation using discriminative speaker identity for whisper recognition

Jingjie Li, Ian McLoughlin, Cong Liu, Shaofei Xue, Si Wei
2015 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)  
2016) Multi-task deep neural network acoustic models with model adaptation using discriminative speaker identity for whisper recognition.  ...  ABSTRACT This paper presents a study on large vocabulary continuous whisper automatic recognition (wLVCSR). wLVCSR provides the ability to use ASR equipment in public places without concern for disturbing  ...  CONCLUSION In this paper, we have proposed and evaluated SI multi-task DNN acoustic models for wLVCSR by taking advantage of the availability of extremely large normal speech resources.  ... 
doi:10.1109/icassp.2015.7178916 dblp:conf/icassp/LiMLXW15 fatcat:o7li4strsjcchpalgx6y5xm3bq

Improving DNN-Based Automatic Recognition of Non-native Children Speech with Adult Speech

Yao Qian, Xinhao Wang, Keelan Evanini, David Suendermann-Oeft
2016 Workshop on Child Computer Interaction  
Acoustic models for state-of-the-art DNN-based speech recognition systems are typically trained using at least several hundred hours of task-specific training data.  ...  In this paper, we investigate how to use an adult speech corpus to improve DNN-based automatic speech recognition for non-native children's speech.  ...  Speech Recognizer for Children As discussed in the Section 2, DNN models in combination with large amounts of training data can significantly improve the performance of a speech recognition system.  ... 
doi:10.21437/wocci.2016-7 dblp:conf/wocci/QianWES16 fatcat:rw46atb4ubhsxmb3cbdzqxld3i

Turkish Speech Recognition Techniques and Applications of Recurrent Units (LSTM and GRU)

Burak TOMBALOĞLU, Hamit ERDEM
2021 GAZI UNIVERSITY JOURNAL OF SCIENCE  
Subword based model is chosen in order not to decrease recognition performance and prevent large vocabulary.  ...  Current speech recognition systems include feature extraction, acoustic model, Language Modeling (LM), vocabulary dictionary and classification sections.  ...  Due to the large number of non-vocabulary words, speech recognition methods used for English give low recognition success results for Turkish language.  ... 
doi:10.35378/gujs.816499 fatcat:cwbp4d5hyzd7rifrbpw2rjfwka

Ses Tanıma için Derin Öğrenme Mimarileri Üzerine Derleme (A Review of Deep Learning Architectures for Speech Recognition)

Yeşim DOKUZ, Zekeriya TÜFEKCİ
2020 European Journal of Science and Technology  
The acoustic model takes features as input and phonetic knowledge and generates an acoustic model score for the variable-length feature sequence.  ...  A typical speech recognition system consists of four modules, namely, signal processing and feature extraction, acoustic model, language model, and hypothesis search (Yu and Deng, 2016).  ...  Chan et al. (2015) utilized RNNs and DNNs for increasing speech recognition performance on embedded devices by building a large RNNs acoustic model and pass this model to DNNs for speech recognition.  ... 
doi:10.31590/ejosat.araconf22 fatcat:ezethulhwfejhfun5iaiovkqhi

Cloud-based Automatic Speech Recognition systems for Southeast Asian Languages

Lei Wang, Rong Tong, Cheung-Chi Leung, Sunil Sivadas, Chongjia Ni, Bin Ma
2017 2017 International Conference on Orange Technologies (ICOT)  
This paper provides an overall introduction of our Automatic Speech Recognition (ASR) systems for Southeast Asian languages.  ...  This work takes Bahasa Indonesia and Thai as examples to illustrate the strategies of collecting various resources required for building ASR systems.  ...  For popular languages, large amounts of transcribed speech data and text data are available for building the Acoustic Models (AMs) and Language Models (LMs).  ... 
doi:10.1109/icot.2017.8336109 fatcat:fdr4tt7lhzhhhfxfcn6ip532hy