Towards End-to-end Automatic Code-Switching Speech Recognition [article]

Genta Indra Winata, Andrea Madotto, Chien-Sheng Wu, Pascale Fung
2018 arXiv   pre-print
We propose a CTC-based end-to-end automatic speech recognition model for intra-sentential English-Mandarin code-switching.  ...  Speech recognition in mixed language has difficulties to adapt end-to-end framework due to the lack of data and overlapping phone sets, for example in words such as "one" in English and "wán" in Chinese  ...  CONCLUSION We propose a new direction on automatic code-switching speech recognition by applying end-to-end approach. Our training method can be adapted to any languages pair.  ... 
arXiv:1810.12620v1 fatcat:cp44rmgxf5frddc7vp4e6qgmjy

Towards Language-Universal Mandarin-English Speech Recognition

Shiliang Zhang, Yuan Liu, Ming Lei, Bin Ma, Lei Xie
2019 Interspeech 2019  
Multilingual and code-switching speech recognition are two challenging tasks that are studied separately in many previous works.  ...  More importantly, the proposed bilingual model can automatically learn the language switching.  ...  Recently, researchers have proposed to improve the performance of code-switching speech recognition system by using the popular end-to-end approach [22, 23, 24].  ... 
doi:10.21437/interspeech.2019-1365 dblp:conf/interspeech/ZhangLLMX19 fatcat:a2jq5aa2arakrcsyh2gqhgjsji

Towards End-to-End Code-Switching Speech Recognition [article]

Ne Luo, Dongwei Jiang, Shuaijiang Zhao, Caixia Gong, Wei Zou, Xiangang Li
2018 arXiv   pre-print
End-to-end automatic speech recognition (ASR) simplifies the building of ASR systems considerably by predicting graphemes or characters directly from acoustic input.  ...  This paper presents a hybrid CTC-Attention based end-to-end Mandarin-English code-switching (CS) speech recognition system and studies the effect of hybrid CTC-Attention based models, different modeling  ...  The end-to-end based code-switching speech recognition, including modeling units, language identification and decoding strategies are studied in Section 3.  ... 
arXiv:1810.13091v2 fatcat:3r7evbvkg5d6hgxttaeymavqfa

Learning not to Discriminate: Task Agnostic Learning for Improving Monolingual and Code-switched Speech Recognition [article]

Gurunath Reddy Madhumani, Sanket Shah, Basil Abraham, Vikas Joshi, Sunayana Sitaram
2020 arXiv   pre-print
Recognizing code-switched speech is challenging for Automatic Speech Recognition (ASR) for a variety of reasons, including the lack of code-switched training data.  ...  We train end-to-end ASR systems starting with a pooled model that uses monolingual and code-switched data along with the adversarial discriminator.  ...  Introduction Recognizing code-switched speech is challenging for Automatic Speech Recognition (ASR) systems due to the lack of large amounts of labeled code-switched speech and text data for training Acoustic  ... 
arXiv:2006.05257v1 fatcat:6q2okd6swnd7plet2cjcdjqaj4

Transformer-Transducers for Code-Switched Speech Recognition [article]

Siddharth Dalmia, Yuzong Liu, Srikanth Ronanki, Katrin Kirchhoff
2021 arXiv   pre-print
In this paper, we present an end-to-end ASR system using a transformer-transducer model architecture for code-switched speech recognition.  ...  Finally, we propose a multi-label/multi-audio encoder structure to leverage the vast monolingual speech corpora towards code-switching.  ...  This allows us to train on all the data (monolingual and code-switched) jointly end-to-end, without the need for pre-training the individual encoders, as needed in [15].  ... 
arXiv:2011.15023v2 fatcat:ugrxkmweffejhahdmqtmihe3iu

Page 1302 of Psychological Abstracts Vol. 91, Issue 4 [page]

2004 Psychological Abstracts  
(France Télécom - FTR&D/DIH/IPS, Lannion, France) Towards improving speech detection robustness for speech recognition in adverse conditions. Speech Communication, 2003(May), Vol 40(3), 261-276.  ...  for their widespread involvement in insertional code switching.  ... 

End-to-End Multilingual Multi-Speaker Speech Recognition

Hiroshi Seki, Takaaki Hori, Shinji Watanabe, Jonathan Le Roux, John R. Hershey
2019 Interspeech 2019  
The expressive power of end-to-end automatic speech recognition (ASR) systems enables direct estimation of the character or word label sequence from a sequence of acoustic features.  ...  There has also been growing interest in multispeaker speech recognition, which enables generation of multiple label sequences from single-channel mixed speech.  ...  INTRODUCTION The expressive power of an end-to-end automatic speech recognition (ASR) system enables direct conversion from input speech feature sequences to output label sequences without any explicit  ... 
doi:10.21437/interspeech.2019-3038 dblp:conf/interspeech/SekiHWRH19 fatcat:kjer62xwfngyxjvtpg6s24vlge

TALCS: An Open-Source Mandarin-English Code-Switching Corpus and a Speech Recognition Baseline [article]

Chengfei Li, Shuhao Deng, Yaoping Wang, Guangjing Wang, Yaguang Gong, Changbin Chen, Jinfeng Bai
2022 arXiv   pre-print
To our best knowledge, TALCS corpus is the largest well labeled Mandarin-English code-switching open source automatic speech recognition (ASR) dataset in the world.  ...  This paper introduces a new corpus of Mandarin-English code-switching speech recognition--TALCS corpus, suitable for training and evaluating code-switching speech recognition systems.  ...  There is increasing research interest in developing code-switching automatic speech recognition (CS-ASR) [1] systems as most of the off-the-shelf systems are monolingual and cannot handle code-switched  ... 
arXiv:2206.13135v1 fatcat:rhldqph355a5bagwtfe55bzqw4

QASR: QCRI Aljazeera Speech Resource – A Large Scale Annotated Arabic Speech Corpus [article]

Hamdy Mubarak, Amir Hussein, Shammur Absar Chowdhury, Ahmed Ali
2021 arXiv   pre-print
We show that end-to-end automatic speech recognition trained on QASR reports a competitive word error rate compared to the previous MGB-2 corpus.  ...  We report baseline results for downstream natural language processing tasks such as named entity recognition using speech transcript.  ...  This data is hosted on ArabicSpeech portal 18 , which is a community based effort that runs for the benefit of Arabic speech science and technologies.  ... 
arXiv:2106.13000v1 fatcat:w4pvisrj2rgafhgz5bhnyycwji

A Survey of Code-switched Speech and Language Processing [article]

Sunayana Sitaram, Khyathi Raghavi Chandu, Sai Krishna Rallabandi, Alan W Black
2020 arXiv   pre-print
We review code-switching research in various Speech and NLP applications, including language processing tools and end-to-end systems. We conclude with future directions and open problems in the field.  ...  This survey reviews computational approaches for code-switched Speech and Natural Language Processing.  ...  Automatic Speech Recognition Since code-switching is a spoken language phenomenon, it is important that Automatic Speech Recognizers (ASRs) that are deployed in multilingual communities are able to handle  ... 
arXiv:1904.00784v3 fatcat:r5tsg4kdnfbtnndae523c32pta

Multilingual and code-switching ASR challenges for low resource Indian languages [article]

Anuj Diwan, Rakesh Vaideeswaran, Sanket Shah, Ankita Singh, Srinivasa Raghavan, Shreya Khare, Vinit Unni, Saurabh Vyas, Akash Rajpuria, Chiranjeevi Yarra, Ashish Mittal, Prasanta Kumar Ghosh (+10 others)
2021 arXiv   pre-print
Recently, there is increasing interest in multilingual automatic speech recognition (ASR) where a speech recognition system caters to multiple low resource languages by taking advantage of low amounts  ...  In this challenge, we would like to focus on building multilingual and code-switching ASR systems through two different subtasks related to a total of seven Indian languages, namely Hindi, Marathi, Odia  ...  This subtask is a first step towards addressing this gap for research on code-switched speech. The code-switched speech is drawn from spoken tutorials on various topics in computer science.  ... 
arXiv:2104.00235v1 fatcat:eevwpnji2fdtdk7ltatn7hkkua

Dual Language Models for Code Switched Speech Recognition [article]

Saurabh Garg, Tanmay Parekh, Preethi Jyothi
2018 arXiv   pre-print
Similar consistent improvements are also reflected in automatic speech recognition error rates.  ...  In this work, we present a simple and elegant approach to language modeling for bilingual code-switched text.  ...  Codeswitched speech presents many challenges for automatic speech recognition (ASR) systems, in the context of both acoustic models and language models.  ... 
arXiv:1711.01048v2 fatcat:gghcalzjq5bu5abjiisv3tgtme

Acoustic and Textual Data Augmentation for Improved ASR of Code-Switching Speech

Emre Yılmaz, Henk van den Heuvel, David van Leeuwen
2018 Interspeech 2018  
In this paper, we describe several techniques for improving the acoustic and language model of an automatic speech recognition (ASR) system operating on code-switching (CS) speech.  ...  In previous work, we have proposed several automatic transcription strategies for CS speech to increase the amount of available training speech data.  ...  Motivated by the success of automatic training data generation for acoustic modeling, we generate code-switching text from transcriptions of the spoken data which has been found to have lower perplexity  ... 
doi:10.21437/interspeech.2018-52 dblp:conf/interspeech/YilmazHL18 fatcat:cawnnsaerje5jgiceyln5myjee

Multilingual spoken language processing

P. Fung, T. Schultz
2008 IEEE Signal Processing Magazine  
, and code switching, in particular.  ...  Results indicate that it is feasible to build various end-to-end speech translation systems including speech recognition, speech synthesis, and a statistical translation system in new languages for small  ... 
doi:10.1109/msp.2008.918417 fatcat:ezye4rngebdpphtis3szqdhvce

Improving Low Resource Code-switched ASR using Augmented Code-switched TTS [article]

Yash Sharma, Basil Abraham, Karan Taneja, Preethi Jyothi
2020 arXiv   pre-print
Building Automatic Speech Recognition (ASR) systems for code-switched speech has recently gained renewed attention due to the widespread use of speech technologies in multilingual communities worldwide  ...  In this work, we investigate improving code-switched ASR in low resource settings via data augmentation using code-switched text-to-speech (TTS) synthesis.  ...  Li, "A first speech recognition system for Mandarin-English code-switch conversational speech," in Proceedings of ICASSP, 2012, pp. 4889-4892.  ... 
arXiv:2010.05549v1 fatcat:njrets4i3zhjnn6ixinuwuyxie