Language Informed Modeling of Code-Switched Text

Khyathi Chandu, Thomas Manzini, Sumeet Singh, Alan W. Black
2018 Proceedings of the Third Workshop on Computational Approaches to Linguistic Code-Switching  
We hypothesize that encoding language information strengthens a language model by helping to learn code-switching points.  ...  Code-switching (CS), the practice of alternating between two or more languages in conversations, is pervasive in most multilingual communities.  ...  We would also like to thank Graham Neubig at our institute who gave valuable feedback throughout the course of this work.  ... 
doi:10.18653/v1/w18-3211 dblp:conf/acl-codeswitch/ChanduMSB18 fatcat:xoplq7hxhvborndpkmjanj4gha

Exploration of the Impact of Maximum Entropy in Recurrent Neural Network Language Models for Code-Switching Speech

Ngoc Thang Vu, Tanja Schultz
2014 Proceedings of the First Workshop on Computational Approaches to Code Switching  
First, we explore extensively the integration of part-of-speech tags and language identifier information in recurrent neural network language models for Code-Switching.  ...  Finally, we propose to adapt the recurrent neural network language model to different Code-Switching behaviors and use them to generate artificial Code-Switching text data.  ...  Code-Switching text data.  ... 
doi:10.3115/v1/w14-3904 dblp:conf/acl-codeswitch/VuS14 fatcat:bohxkvtp55b6bdfgsbmirg6drq

Emotion Detection in Code-switching Texts via Bilingual and Sentimental Information

Zhongqing Wang, Sophia Lee, Shoushan Li, Guodong Zhou
2015 Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 2: Short Papers)  
Empirical studies demonstrate the effectiveness of our proposed approach in detecting emotion in code-switching texts.  ...  In this paper, we first utilize two kinds of knowledge, i.e. bilingual and sentimental information to bridge the gap between different languages.  ...  PolyU 5593/13H), and supported by the National Natural Science Foundation of China (No. 61273320, and No. 61375073) and the Key Project of the National Natural Science Foundation of China (No. 61331011  ... 
doi:10.3115/v1/p15-2125 dblp:conf/acl/WangLLZ15 fatcat:wgwdakcclrfuhkb5d7agojnthi

Dual Language Models for Code Switched Speech Recognition

Saurabh Garg, Tanmay Parekh, Preethi Jyothi
2018 Interspeech 2018  
In this work, we present a simple and elegant approach to language modeling for bilingual code-switched text.  ...  We prove the robustness of our model by showing significant improvements in perplexity measures over the standard bilingual language model without the use of any external information.  ...  language models estimated on code-switched text (Section 4.3).  ... 
doi:10.21437/interspeech.2018-1343 dblp:conf/interspeech/GargPJ18 fatcat:usvuehen5zgslbvmkclsgol4gm

Language Modeling for Code-Switched Data: Challenges and Approaches [article]

Ganji Sreeram, Rohit Sinha
2017 arXiv   pre-print
of the parts-of-speech features towards more effective modeling of Hindi-English code-switched data by the monolingual language model (LM) trained on native (Hindi) language data, and (iii) the proposal  ...  In this work, we have studied the intra-sentential problem in the context of code-switching language modeling task.  ...  Thus, they distinguish these words and build a language model of the merged training texts.  ... 
arXiv:1711.03541v1 fatcat:3dfdok44jbdprct746vb5h3cxa

A Survey of Code-switched Speech and Language Processing [article]

Sunayana Sitaram, Khyathi Raghavi Chandu, Sai Krishna Rallabandi, Alan W Black
2020 arXiv   pre-print
Code-switching, the alternation of languages within a conversation or utterance, is a common communicative phenomenon that occurs in multilingual communities across the world.  ...  As code-switching data and resources are scarce, we list what is available in various code-switched language pairs with the language processing tasks they can be used for.  ...  Although there is significantly more code-switched text data compared to speech data in the form of informal conversational data such as on Twitter, Facebook and Internet forums, robust language models  ... 
arXiv:1904.00784v3 fatcat:r5tsg4kdnfbtnndae523c32pta

A first speech recognition system for Mandarin-English code-switch conversational speech

Ngoc Thang Vu, Dau-Cheng Lyu, Jochen Weiner, Dominic Telaar, Tim Schlippe, Fabian Blaicher, Eng-Siong Chng, Tanja Schultz, Haizhou Li
2012 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)  
On language model level, we investigated statistical machine translation (SMT)-based text generation approaches for building code-switching language models.  ...  Furthermore, we integrated the provided information from a language identification system (LID) into the decoding process by using a multi-stream approach.  ...  On language model level, we apply different SMT-based methods to generate artificial code-switch texts.  ... 
doi:10.1109/icassp.2012.6289015 dblp:conf/icassp/VuLWTSBCSL12 fatcat:ht2gbea74rbqrhfaie6kpocxzi

Dynamic Soft Windowing and Language Dependent Style Token for Code-Switching End-to-End Speech Synthesis

Ruibo Fu, Jianhua Tao, Zhengqi Wen, Jiangyan Yi, Chunyu Qiang, Tao Wang
2020 Interspeech 2020  
is proposed to ensure the smooth transition of code-switching.  ...  Most of current end-to-end speech synthesis assumes the input text is in a single language situation.  ...  And the acoustic seq-to-seq model consists of three parts: Method • Encoder mainly processes text information.  ... 
doi:10.21437/interspeech.2020-1754 dblp:conf/interspeech/FuTWYQW20 fatcat:axfpfvlqe5e6fmfmscxzaso274

Towards Code-switched Classification Exploiting Constituent Language Resources [article]

Tanvi Dadu, Kartikey Pant
2020 arXiv   pre-print
The analysis of code-switched data often becomes an assiduous task, owing to the limited availability of data.  ...  Code-switching is a commonly observed communicative phenomenon denoting a shift from one language to another within the same speech exchange.  ...  Access to code-switched data is challenging and limited. This phenomenon makes the analysis and information extraction from code-switched languages a less explored and challenging task.  ... 
arXiv:2011.01913v1 fatcat:jqjk2xyrvnbrhbcrcz6l7lnpf4

Part-of-Speech Tagging for Code-Switched, Transliterated Texts without Explicit Language Identification

Kelsey Ball, Dan Garrette
2018 Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing  
Experiments on Hindi-English part-of-speech tagging demonstrate that our approach outperforms standard models when training on monolingual text without transliteration, and testing on code-switched text  ...  Code-switching, the use of more than one language within a single utterance, is ubiquitous in much of the world, but remains a challenge for NLP largely due to the lack of representative data for training  ...  While we address some of the most prominent issues with code-switching, our model does not deal with style, formality, or domain mismatches between the formal training data and informal evaluation data  ... 
doi:10.18653/v1/d18-1347 dblp:conf/emnlp/BallG18 fatcat:3qfkrhis6zdkzl2d2abkyyfbr4

Dual Language Models for Code Switched Speech Recognition [article]

Saurabh Garg, Tanmay Parekh, Preethi Jyothi
2018 arXiv   pre-print
In this work, we present a simple and elegant approach to language modeling for bilingual code-switched text.  ...  Since code-switching is a blend of two or more different languages, a standard bilingual language model can be improved upon by using structures of the monolingual language models.  ...  primary language can be trained further with large amounts of monolingual text data (which is easier to obtain compared to code-switched text).  ... 
arXiv:1711.01048v2 fatcat:gghcalzjq5bu5abjiisv3tgtme

End-to-End Code Switching Language Models for Automatic Speech Recognition [article]

Ahan M. R., Shreyas Sunil Kulkarni
2020 arXiv   pre-print
approach for extracting monolingual text using Deep Bi-directional Language Models (LM) such as BERT and other Machine Translation models, and also explore different ways of extracting code-switched text  ...  Due to the discrepancies in the extraction of code-switched text from an Automated Speech Recognition (ASR) module, and thereby extracting the monolingual text from the code-switched text, we propose an  ...  Code Switched text to Recovered text Recovery using BERT Most of the general conditional Language Models (LM) are generally uni-directional models, left-right or right-left models, which in most of the  ... 
arXiv:2006.08870v1 fatcat:lz6q5ke3rjehzexj4aj7izmr5y

Code-switched Language Models Using Dual RNNs and Same-Source Pretraining

Saurabh Garg, Tanmay Parekh, Preethi Jyothi
2018 Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing  
This work focuses on building language models (LMs) for code-switched text.  ...  We propose two techniques that significantly improve these LMs: 1) A novel recurrent neural network unit with dual components that focus on each language in the code-switched text separately 2) Pretraining  ...  Language models for code-switched text is an important problem with implications to downstream applications such as speech recognition and machine translation of code-switched data.  ... 
doi:10.18653/v1/d18-1346 dblp:conf/emnlp/GargPJ18 fatcat:ws422mtqdzhv5fovplrvpvlfsm

Code-switched Language Models Using Dual RNNs and Same-Source Pretraining [article]

Saurabh Garg, Tanmay Parekh, Preethi Jyothi
2018 arXiv   pre-print
This work focuses on building language models (LMs) for code-switched text.  ...  We propose two techniques that significantly improve these LMs: 1) A novel recurrent neural network unit with dual components that focus on each language in the code-switched text separately 2) Pretraining  ...  Language models for code-switched text is an important problem with implications to downstream applications such as speech recognition and machine translation of code-switched data.  ... 
arXiv:1809.01962v1 fatcat:qnd3rvvwf5gmnk3oovjxiqaude

Decoupling Pronunciation and Language for End-to-end Code-switching Automatic Speech Recognition [article]

Shuai Zhang, Jiangyan Yi, Zhengkun Tian, Ye Bai, Jianhua Tao, Zhengqi Wen
2020 arXiv   pre-print
By using monolingual data and unpaired text data, the decoupled transformer model reduces the high dependency on code-switching paired training data of E2E model to a certain extent.  ...  In this paper, we propose a decoupled transformer model to use monolingual paired data and unpaired text data to alleviate the problem of code-switching data shortage.  ...  In this way, the model combines the pronunciation and language information from monolingual data and unpaired text data.  ... 
arXiv:2010.14798v1 fatcat:7qsud65tsfhbrjrnxp37hovqfa