
Multilingual and crosslingual speech recognition using phonological-vector based phone embeddings [article]

Chengrui Zhu, Keyu An, Huahuan Zheng, Zhijian Ou
2021 arXiv   pre-print
The use of phonological features (PFs) potentially allows language-specific phones to remain linked in training, which is highly desirable for information sharing for multilingual and crosslingual speech  ...  A series of multilingual and crosslingual (both zero-shot and few-shot) speech recognition experiments are conducted on the CommonVoice dataset (German, French, Spanish and Italian) and the AISHELL-1 dataset  ...  In this paper, we propose a new approach to using phonological features for multilingual and crosslingual speech recognition.  ... 
arXiv:2107.05038v2 fatcat:reeobk6txvebncybbdqqzgjjjm

Zero-Shot Cross-Lingual Phonetic Recognition with External Language Embedding

Heting Gao, Junrui Ni, Yang Zhang, Kaizhi Qian, Shiyu Chang, Mark Hasegawa-Johnson
2021 Conference of the International Speech Communication Association  
Multilingual phonetic recognition systems mitigate data sparsity issues by training models on data from multiple languages and learning a speech-to-phone or speech-to-text model universal to all languages  ...  Even with no transcribed speech, it is possible to train a language embedding using only data from language typologies (phylogenetic node and phoneme inventory) that reduces ASR error rates.  ...  Multilingual and Cross-lingual phonetic recognition attempt to partially solve the low-resource problem by building a universal phone recognizer that transcribes speech from different languages into corresponding  ... 
doi:10.21437/interspeech.2021-1843 dblp:conf/interspeech/GaoNZQCH21 fatcat:h36lrbx54bbjtkplt67cgrkyeq

Domain-Adversarial Based Model with Phonological Knowledge for Cross-Lingual Speech Recognition

Qingran Zhan, Xiang Xie, Chenguang Hu, Juan Zuluaga-Gomez, Jing Wang, Haobo Cheng
2021 Electronics  
When doing the cross-lingual speech recognition, the AFs detectors are used to transfer the phonological knowledge from source languages (English, German and French) to the target language (Mandarin).  ...  This paper investigates a domain-adversarial neural network (DANN) to extract reliable AFs, and different multi-stream techniques are used for cross-lingual speech recognition.  ...  Articulatory information has been proved useful in many related areas, such as pathological speech recognition [2], pronunciation prediction [3] and multilingual speech recognition [4].  ... 
doi:10.3390/electronics10243172 fatcat:uefaizzlifdazg56cfrfxiiztu

Autosegmental Neural Nets 2.0: An Extensive Study of Training Synchronous and Asynchronous Phones and Tones for Under-Resourced Tonal Languages

Jialu Li, Mark Hasegawa-Johnson
2022 IEEE/ACM Transactions on Audio Speech and Language Processing  
In this study, we perform an extensive study by multilingual training on four tonal languages and cross-lingual testing on the fifth, in a five-fold cross-validation framework, using four CTC-based systems  ...  Many past studies have investigated cross-lingual adaptation in an automatic speech recognition (ASR) tone-marked phone model, yet very few studied the interaction between cross-lingual adaptation and  ...  ACKNOWLEDGMENT The authors would like to thank their colleague, Shuju Shi, for providing insightful discussions of differences of voice quality symbols among Lao, Thai, and Vietnamese.  ... 
doi:10.1109/taslp.2022.3178238 fatcat:gnjbworpfrgwnlwzbwtbbs466e

Cross-Lingual Neural Network Speech Synthesis Based on Multiple Embeddings

Tijana V. Nosek, Siniša B. Suzić, Darko J. Pekar, Radovan J. Obradović, Milan S. Sečujski, Vlado D. Delić
2021 International Journal of Interactive Multimedia and Artificial Intelligence  
The method is based on the application of neural network embedding to combinations of speaker and style IDs, but also to phones in particular phonetic contexts, without any prior linguistic knowledge on  ...  The paper presents a novel architecture and method for speech synthesis in multiple languages, in voices of multiple speakers and in multiple speaking styles, even in cases when speech from a particular  ...  Speech corpora used in the research were provided by Speech Morphing Systems Inc. for research purposes.  ... 
doi:10.9781/ijimai.2021.11.005 fatcat:vllarnyjgfea5mvjkpbfpwr3rm

How Familiar Does That Sound? Cross-Lingual Representational Similarity Analysis of Acoustic Word Embeddings [article]

Badr M. Abdullah, Iuliia Zaitova, Tania Avgustinova, Bernd Möbius, Dietrich Klakow
2021 arXiv   pre-print
To answer these questions, we present a novel experimental design based on representational similarity analysis (RSA) to analyze acoustic word embeddings (AWEs) -- vector representations of variable-duration  ...  We then employ RSA to quantify the cross-lingual similarity by simulating native and non-native spoken-word processing using AWEs.  ...  We further extend our gratitude to Miriam Schulz and Marius Mosbach for proofreading the paper.  ... 
arXiv:2109.10179v1 fatcat:2ytalejf5bbujeybhcij2e4smu

Multi-Spectral Widefield Microscopy of the Beating Heart Through Post-Acquisition Synchronization and Unmixing

Christian Jaques, Linda Bapst-Wicht, Daniel F. Schorderet, Michael Liebling
2019 2019 IEEE 16th International Symposium on Biomedical Imaging (ISBI 2019)  
State-of-the-art acoustic models for Automatic Speech Recognition (ASR) are based on Hidden Markov Models (HMM) and Deep Neural Networks (DNN) and often require thousands of hours of transcribed speech  ...  Through theoretical and experimental comparisons, the proposed approaches are shown to yield significant improvement over the conventional hybrid systems on multilingual speech recognition.  ...  1 Introduction Multilingual Speech Recognition State-of-the-art Automatic speech recognition (ASR) systems usually consist of two major components: acoustic model and language model.  ... 
doi:10.1109/isbi.2019.8759472 dblp:conf/isbi/JaquesBSL19 fatcat:flypznnglbfrzm3ayf6tsfof34

Survey on the Use of Typological Information in Natural Language Processing [article]

Helen O'Horan, Yevgeni Berzak, Ivan Vulić, Roi Reichart, Anna Korhonen
2016 arXiv   pre-print
In recent years linguistic typology, which classifies the world's languages according to their functional and structural properties, has been widely used to support multilingual NLP.  ...  While the growing importance of typological information in supporting multilingual tasks has been recognised, no systematic survey of existing typological resources and their use in NLP has been published  ...  Acknowledgments This work is supported by ERC Consolidator Grant LEXICAL (no 648909) and by the Center for Brains, Minds and Machines (CBMM) funded by the NSF STC award CCF-1231216.  ... 
arXiv:1610.03349v1 fatcat:qkqyoe5x2jchlidj23zl6xcf2a

Cross-Lingual Bridges with Models of Lexical Borrowing

Yulia Tsvetkov, Chris Dyer
2016 The Journal of Artificial Intelligence Research  
Its features are based on universal constraints from Optimality Theory (OT), and we show that compared to several standard—but linguistically more naïve—baselines, our OT-inspired model obtains good performance  ...  Linguistic borrowing is the phenomenon of transferring linguistic constructions (lexical, phonological, morphological, and syntactic) from a "donor" language to a "recipient" language as a result of contacts  ...  Acknowledgments We thank Nathan Schneider, Shuly Wintner, Lluís Màrquez, and the anonymous reviewers for their help and constructive feedback. We also thank Waleed Ammar for his help with Arabic.  ... 
doi:10.1613/jair.4786 fatcat:yxlrfqmbdrdrrgsc3uom6mvv24

Automatic Speech Recognition for Low-resource Languages and Accents Using Multilingual and Crosslingual Information

Ngoc Thang Vu
This thesis explores methods to rapidly bootstrap automatic speech recognition systems for languages, which lack resources for speech and language processing.  ...  Under application aspects, this thesis also includes research work on non-native and Code-Switching speech.  ...  The following paragraphs provide an overview of the beginnings of using multilingual and crosslingual information in speech recognition systems.  ... 
doi:10.5445/ir/1000041124 fatcat:7kmt7i7tlnglfpdou6346spcky

The EVALITA Dependency Parsing Task: From 2007 to 2011 [chapter]

Cristina Bosco, Alessandro Mazzei
2013 Lecture Notes in Computer Science  
4 on speech technologies.  ...  Established in 2007, EVALITA is the evaluation campaign of Natural Language Processing and Speech Technologies for the Italian language, organized around shared tasks focusing on  ...  Acknowledgments Luca Atzori and Daniele Sartiano helped performing the experiments using embeddings and clusters.  ... 
doi:10.1007/978-3-642-35828-9_1 fatcat:p6dyjaxm4zbitfajtciwclwipu

Bilingualism: Theoretical perspectives of language diversity

Carlin L. Stobbart
1992 South African Journal of Communication Disorders  
second language acquisition, particularly within the multicultural and multilingual South African context, is highlighted.  ...  Theoretical perspectives according to Dodson (1985), Skinner (1985) and Krashen (1982) are explored.  ...  Erna Alant (University of Pretoria) and Mrs Glen Jager (University of Durban-Westville) are noted with appreciation.  ... 
doi:10.4102/sajcd.v39i1.272 fatcat:wqv7dbly4bewvkp7ccy4xtfa5e

Multilingual Training and Adaptation in Speech Recognition

Sibo Tong
State-of-the-art acoustic models for Automatic Speech Recognition (ASR) are based on Hidden Markov Models (HMM) and Deep Neural Networks (DNN) and often require thousands of hours of transcribed speech  ...  Through theoretical and experimental comparisons, the proposed approaches are shown to yield significant improvement over the conventional hybrid systems on multilingual speech recognition.  ...  Multilingual Speech Recognition State-of-the-art Automatic speech recognition (ASR) systems usually consist of two major components: acoustic model and language model.  ... 
doi:10.5075/epfl-thesis-7896 fatcat:xjknfsb63fho5drspzdxxpcaqq

Message from the general chair

Benjamin C. Lee
2015 2015 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS)  
To maximize the utility of the injected knowledge, we deploy a learning-based multi-sieve approach and develop novel entity-based features.  ...  Learning-based Multi-Sieve Co-reference Resolution with Knowledge Lev Ratinov and Dan Roth Saturday 11:00am-11:30am -202 A (ICC) We explore the interplay of knowledge and structure in co-reference resolution  ...  In addition to the acoustic and language models used in automatic speech recognition systems, HVR uses the haptic and partial lexical models as additional knowledge sources to reduce the recognition search  ... 
doi:10.1109/ispass.2015.7095776 dblp:conf/ispass/Lee15 fatcat:ehbed6nl6barfgs6pzwcvwxria

Multilingual Modulation by Neural Language Codes

Markus Müller
Multilingual speech recognition remains one of the major challenges in speech processing.  ...  Multilingual Speech Recognition: Multilingual speech recognition poses several challenges [WGT+00].  ...  As speaker and channel characteristics are strongly signal related, directly shifting the acoustic features based on i-vectors using a neural Speech Recognition Using Recurrent Neural Networks Bottleneck  ... 
doi:10.5445/ir/1000088486 fatcat:osb2krr6yvgwjlx4vw727uzrpa