3,563 Hits in 2.2 sec

Arja Nurmi, Tanja Rütten, and Päivi Pahta (eds): CHALLENGING THE MYTH OF MONOLINGUAL CORPORA

Rachelle Vessey
2018 Applied Linguistics  
Reviewed by Rachelle Vessey, Birkbeck, University of London. E-mail: r.vessey@bbk.ac.uk. doi:10.1093/applin/amy019  ...  through which aspects of multilingualism, such as code-switching, are normally studied.  ...  but rather on supposedly monolingual corpora and how multilingualism nevertheless figures therein.  ... 
doi:10.1093/applin/amy019 fatcat:zbp2zvl3tjce5oedjcjgdb2sji

Thomas Schmidt and Kai Wörner (eds.) 2012. Multilingual Corpora and Multilingual Corpus Analysis

Cristina Toledo
2015 FITISPos international journal. Public service interpreting and translation  
Multilingual Corpora and Multilingual Corpus Analysis focuses on linguistic aspects of multilingualism, specifically on the design of corpora in studies on multilingualism as well as presentations of linguistic  ...  analyses conducted using multilingual corpora.  ...  To summarize, Multilingual Corpora and Multilingual Corpus Analysis reflects the diversity in multilingual corpus studies.  ... 
doi:10.37536/fitispos-ij.2015.2.0.84 fatcat:miyom5b6bbgjfekp2gf624ec6q

Chapter 2. Parallel and Comparable Corpora: What is Happening? [chapter]

Tony McEnery, Richard Xiao, Gunilla Anderman, Margaret Rogers
2007 Incorporating Corpora  
As part of this new wave of research on translation and contrastive studies, corpora, and multilingual corpora in particular, have a prominent role.  ...  In this chapter, we will illustrate the value of parallel and comparable corpora to translation and contrastive studies.  ...  For example, we can say a corpus is monolingual, bilingual or multilingual if we take the number of languages involved as the criterion for definition.  ... 
doi:10.21832/9781853599873-005 fatcat:utgpksfuazbmrepoh6luvjgluu

Role of Language Relatedness in Multilingual Fine-tuning of Language Models: A Case Study in Indo-Aryan Languages [article]

Tejas Indulal Dhamecha, Rudra Murthy V, Samarth Bharadwaj, Karthik Sankaranarayanan, Pushpak Bhattacharyya
2021 arXiv   pre-print
Compared to monolingual fine tuning we get relative performance improvement of up to 150% in the downstream tasks.  ...  We explore the impact of leveraging the relatedness of languages that belong to the same family in NLP models using multilingual fine-tuning.  ...  monolingual corpora used in pre-training Indo-Aryan LMs from scratch.  ... 
arXiv:2109.10534v1 fatcat:24442urvdnhmxk6efg5q4gbsyi

Improving Cross-Lingual Knowledge Transferability Using Multilingual TDNN-BLSTM with Language-Dependent Pre-Final Layer

Siyuan Feng, Tan Lee
2018 Interspeech 2018  
In this work, two research aspects are investigated, with the goal of improving multilingual acoustic modeling.  ...  It's widely acknowledged that the shared-hidden-layer multilingual deep neural network (SHL-MDNN) acoustic model (AM) could outperform the conventional monolingual AM, due to its effectiveness in cross-lingual  ...  From Table 3 and Figure 3, the following observations are made: (1) Multilingual models of DNN, TDNN, BLSTM and TDNN-BLSTM using merged CA and EN corpora outperform their monolingually  ... 
doi:10.21437/interspeech.2018-1182 dblp:conf/interspeech/FengL18 fatcat:t7zoejt7drb2hhxazbbn2gfbti

A Multi-task Approach to Learning Multilingual Representations

Karan Singla, Dogan Can, Shrikanth Narayanan
2018 Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)  
Our architecture can transparently use both monolingual and sentence aligned bilingual corpora to learn multilingual embeddings, thus covering a vocabulary significantly larger than the vocabulary of the  ...  We present a novel multi-task modeling approach to learning multilingual distributed representations of text.  ...  monolingual corpora.  ... 
doi:10.18653/v1/p18-2035 dblp:conf/acl/SinglaCN18 fatcat:dnfgrpqlbfhfbbgk7tjocccdla

Word Sense Disambiguation Using Wikipedia [chapter]

Bharath Dandala, Rada Mihalcea, Razvan Bunescu
2013 The People's Web Meets NLP  
Starting from a basic monolingual approach, we develop two multilingual systems: one that uses a machine translation system to create multilingual features, and one where multilingual features are extracted  ...  We present three approaches to word sense disambiguation that use Wikipedia as a source of sense annotations.  ...  Through the WIKITRANSSENSE system, we showed how to effectively use a machine translation system to leverage two relevant multilingual aspects of the semantics of text.  ... 
doi:10.1007/978-3-642-35085-6_9 dblp:series/tanlp/DandalaMB13 fatcat:pchga2qz3vgg5jskyhkjplnlvi

A Survey on Low-Resource Neural Machine Translation [article]

Rui Wang and Xu Tan and Renqian Luo and Tao Qin and Tie-Yan Liu
2021 arXiv   pre-print
In this paper, we provide a survey for low-resource NMT and classify related works into three categories according to the auxiliary data they used: (1) exploiting monolingual data of source and/or target  ...  Neural approaches have achieved state-of-the-art accuracy on machine translation but suffer from the high cost of collecting large scale parallel data.  ...  Plenty of works have exploited monolingual data in NMT systems, which we categorize into several aspects: (1) back translation, which is a simple and promising approach to take advantage of the target-side  ... 
arXiv:2107.04239v1 fatcat:4la4zqfafzhk3l4chhmkaqrmwm

FiSSA at SemEval-2020 Task 9: Fine-tuned For Feelings [article]

Bertelt Braaksma, Richard Scholtens, Stan van Suijlekom, Remy Wang, Ahmet Üstün
2020 arXiv   pre-print
We explore both monolingual and multilingual models with the standard fine-tuning method.  ...  We investigate performance of various pre-trained Transformer models by using different fine-tuning strategies.  ...  of each model to better understand the overall results.  ... 
arXiv:2007.12544v3 fatcat:7xvazoxtarfppeuh4t3xrdpfaq

Bilingual embeddings with random walks over multilingual wordnets

Josu Goikoetxea, Aitor Soroa, Eneko Agirre
2018 Knowledge-Based Systems  
The main approach is to train monolingual embeddings first and then map them using bilingual dictionaries.  ...  Italian); 4) the combination of wordnets and text yields the best results, above mapping-based approaches.  ...  The sizes of the monolingual corpora are shown in Table 1.  ... 
doi:10.1016/j.knosys.2018.03.017 fatcat:yyconb4yujhcjhfahrcncrsnnq

Extraction of Code-mixed Aspect Topics in Semantic Representation

Kavita Sanjay Asnani, Jyoti D Pawar
2018 Journal of Computacion y Sistemas  
the state-of-the-art techniques used for aspect extraction of code-mixed data.  ...  This has led to the generation of large volumes of unstructured code-mixed social media text having useful aspects of information highly dispersed.  ...  The core aspect behind the proposed lcms-LDA algorithm is that multilingual synset Also, since multilingual synsets provide synonyms across languages, monolingual representation of aspects aids in improving  ... 
doi:10.13053/cys-22-1-2771 fatcat:myj6nppbrzfhdj36qyokyo4o64

End-to-End Multilingual Speech Recognition System with Language Supervision Training

Danyang LIU, Ji XU, Pengyuan ZHANG
2020 IEICE transactions on information and systems  
On four Babel corpora, the proposed E2E multilingual ASR system achieved an average absolute word error rate (WER) reduction of 2.6% compared with the multilingual baseline system. Key words: multilingual  ...  In the current E2E multilingual ASR framework, the output prediction for a specific language lacks constraints on the output scope of modeling units.  ...  Acknowledgments: This work is partially supported by the National Key Research and Development Program (No. 2019QY1805) and the National Natural Science Foundation of China (Nos. 11590774, 11590770).  ... 
doi:10.1587/transinf.2019edl8214 fatcat:itsc4hdm6rf2tnadzw7vmkb2s4

Using corpora in scientific and technical translation training: resources to identify conventionality and promote creativity

Clara Inés López-Rodríguez
2016 Cadernos de Tradução  
pre-existing corpora, or by means of bilingual or multilingual concordancers displaying aligned texts from international institutions' parallel corpora.  ...  In this second approach, the Web is perceived as a huge corpus which is accessed by means of online tools which produce monolingual wordlists and concordances from texts available from the Internet or  ...  creative aspects of translation.  ... 
doi:10.5007/2175-7968.2016v36nesp1p88 fatcat:7c7llnkqlvhtpo5hfy33rwitne

Irony Detection in a Multilingual Context [article]

Bilal Ghanem, Jihen Karoui, Farah Benamara, Paolo Rosso, Véronique Moriceau
2020 arXiv   pre-print
We show that these monolingual models trained separately on different languages using multilingual word representation or text-based features can open the door to irony detection in languages that lack  ...  of annotated data for irony.  ...  The result can show which kind of features works better in the monolingual settings and can be employed to detect irony in a multilingual setting.  ... 
arXiv:2002.02427v1 fatcat:k35g2wtmzrcv7jiiteblyyss6i

One model, two languages: training bilingual parsers with harmonized treebanks [article]

David Vilares and Carlos Gómez-Rodríguez and Miguel A. Alonso
2016 arXiv   pre-print
We introduce an approach to train lexicalized parsers using bilingual corpora obtained by merging harmonized treebanks of different languages, producing parsers that can analyze sentences in either of  ...  The results show that these bilingual parsers are more than competitive, as most combinations not only preserve accuracy, but some even achieve significant improvements over the corresponding monolingual  ...  Naseem et al. (2012) ) rely on crosslinguistic syntactic regularities to learn aspects of the source language that help parse an unseen language, without parallel corpora.  ... 
arXiv:1507.08449v2 fatcat:pty6qcr3k5b5jfinggxe6rq7vy