A Multilingual Evaluation Dataset for Monolingual Word Sense Alignment

Sina Ahmadi, John P. McCrae, Sanni Nimb, Thomas Troelsgård, Sussi Olsen, Bolette S. Pedersen, Thierry Declerck, Tanja Wissik, Monica Monachini, Andrea Bellandi, Fahad Khan, Irene Pisani (+32 others)
2020 Zenodo  
We believe that our data will pave the way for further advances in alignment and evaluation of word senses by creating new solutions, particularly those notoriously requiring data such as neural networks  ...  In this paper, we describe our efforts in manually aligning monolingual dictionaries. The alignment is carried out at sense-level for various resources in 15 languages.  ...  Conclusion In this paper, we presented a set of 17 datasets for the task of monolingual word sense alignment covering 15 languages.  ... 
doi:10.5281/zenodo.3842647 fatcat:wymgthiclzen5azbblback7pdy

Word Sense Disambiguation Using Wikipedia [chapter]

Bharath Dandala, Rada Mihalcea, Razvan Bunescu
2013 The People's Web Meets NLP  
We present three approaches to word sense disambiguation that use Wikipedia as a source of sense annotations.  ...  Starting from a basic monolingual approach, we develop two multilingual systems: one that uses a machine translation system to create multilingual features, and one where multilingual features are extracted  ...  A Multilingual Dataset through Machine Translation In order to generate a multilingual representation for the monolingual dataset, we used Google Translate to translate the data from English into several  ... 
doi:10.1007/978-3-642-35085-6_9 dblp:series/tanlp/DandalaMB13 fatcat:pchga2qz3vgg5jskyhkjplnlvi

Beyond Bilingual: Multi-sense Word Embeddings using Multilingual Context

Shyam Upadhyay, Kai-Wei Chang, Matt Taddy, Adam Kalai, James Zou
2017 Proceedings of the 2nd Workshop on Representation Learning for NLP  
A recent line of work uses bilingual (two languages) corpora to learn a different vector for each sense of a word, by exploiting crosslingual signals to aid sense identification.  ...  Ours is the first approach with the ability to leverage multilingual corpora efficiently for multi-sense representation learning.  ...  Word Sense Induction (WSI). We evaluate our approach on word sense induction task.  ... 
doi:10.18653/v1/w17-2613 dblp:conf/rep4nlp/UpadhyayCTKZ17 fatcat:y2epfxfozfafpmhdzhuy3n3axq

Beyond Bilingual: Multi-sense Word Embeddings using Multilingual Context [article]

Shyam Upadhyay and Kai-Wei Chang and Matt Taddy and Adam Kalai and James Zou
2017 arXiv   pre-print
A recent line of work uses bilingual (two languages) corpora to learn a different vector for each sense of a word, by exploiting crosslingual signals to aid sense identification.  ...  Ours is the first approach with the ability to leverage multilingual corpora efficiently for multi-sense representation learning.  ...  Word Sense Induction (WSI). We evaluate our approach on word sense induction task.  ... 
arXiv:1706.08160v1 fatcat:khcuczsdwfcw5inwemscw7x3uq

Evaluation of contextual embeddings on less-resourced languages [article]

Matej Ulčar and Aleš Žagar and Carlos S. Armendariz and Andraž Repar and Senja Pollak and Matthew Purver and Marko Robnik-Šikonja
2021 arXiv   pre-print
In monolingual settings, our analysis shows that monolingual BERT models generally dominate, with a few exceptions such as the dependency parsing task, where they are not competitive with ELMo models trained  ...  In cross-lingual settings, BERT models trained on only a few languages mostly do best, closely followed by massively multilingual BERT models.  ...  The results of this publication reflect only the authors' view and the EU Commission is not responsible for any use that may be made of the information it contains.  ... 
arXiv:2107.10614v1 fatcat:vhbn3v2xqzgx5gzysnmb3jtq3e

FCICU at SemEval-2017 Task 1: Sense-Based Language Independent Semantic Textual Similarity Approach

Basma Hassan, Samir AbdelRahman, Reem Bahgat, Ibrahim Farag
2017 Proceedings of the 11th International Workshop on Semantic Evaluation (SemEval-2017)  
For all the tracks in Task1, Run1 is a string kernel with alignments metric and Run2 is a sense-based alignment similarity method.  ...  A sense-based language independent textual similarity approach is presented, in which a proposed alignment similarity method coupled with new usage of a semantic network (BabelNet) is used.  ...  From this point, we present a sense-based STS approach that produces a similarity score between texts by means of a multilingual word-sense aligner.  ... 
doi:10.18653/v1/s17-2015 dblp:conf/semeval/HassanABF17 fatcat:m2goob4gevhaxn3frvf2w5iqbi

Semi Supervised Preposition-Sense Disambiguation using Multilingual Data [article]

Hila Gonen, Yoav Goldberg
2016 arXiv   pre-print
The multilingual signals consistently improve results on two preposition-sense datasets.  ...  Supervised corpora for the preposition-sense disambiguation task are small, suggesting a semi-supervised approach to the task.  ...  Learning from multilingual data The use of multilingual data for improving monolingual tasks has a long tradition in NLP, and has been used for target word selection (Dagan et al., 1991); word sense  ... 
arXiv:1611.08813v1 fatcat:wkjpkoyfbvfrtjhlkitn33xwge

Sense-level subjectivity in a multilingual setting

Carmen Banea, Rada Mihalcea, Janyce Wiebe
2014 Computer Speech and Language  
We start out with a manual annotation study, and then seek to create an automatic framework to determine subjectivity labeling for unseen senses.  ...  This paper explores the ability of senses aligned across languages to carry coherent subjectivity information.  ...  ., 2009), as well as a list of 48 additional words, for a total of 134 words encompassing 630 senses manually annotated for subjectivity.  ... 
doi:10.1016/j.csl.2013.03.002 fatcat:rjtfnbhsn5g6beb2ot3hdcwaea

Joining Forces Pays Off: Multilingual Joint Word Sense Disambiguation

Roberto Navigli, Simone Paolo Ponzetto
2012 Conference on Empirical Methods in Natural Language Processing  
We present a multilingual joint approach to Word Sense Disambiguation (WSD).  ...  monolingual and multilingual WSD settings.  ...  BabelNet and its API are available for download at  ... 
dblp:conf/emnlp/NavigliP12 fatcat:gfpbhib7dzdntcnjamycsf6zu4

Massively Multilingual Word Embeddings [article]

Waleed Ammar, George Mulcaire, Yulia Tsvetkov, Guillaume Lample, Chris Dyer, Noah A. Smith
2016 arXiv   pre-print
We introduce new methods for estimating and evaluating embeddings of words in more than fifty languages in a single shared embedding space.  ...  We also describe a web portal for evaluation that will facilitate further research in this area, along with open-source releases of all our methods.  ...  We are also grateful to Héctor Martínez Alonso for his help with Danish resources.  ... 
arXiv:1602.01925v2 fatcat:bbi4zm63fncqdbd7do7ean2q6y

Meemi: A Simple Method for Post-processing and Integrating Cross-lingual Word Embeddings [article]

Yerai Doval, Jose Camacho-Collados, Luis Espinosa-Anke, Steven Schockaert
2020 arXiv   pre-print
While monolingual word embeddings encode information about words in the context of a particular language, cross-lingual embeddings define a multilingual space where word embeddings from two or more languages  ...  In this work, we propose to apply an additional transformation after this initial alignment step, which aims to bring the vector representations of a given word and its translations closer to their average  ...  Acknowledgments Yerai Doval has been supported by the Spanish Ministry of Economy, Industry and Competitiveness (MINECO) through the ANSWER-ASAP project (TIN2017-85160-C2-2-R); by the Spanish State Secretariat for  ... 
arXiv:1910.07221v4 fatcat:ezeepywtkfdqbalfdpa6vpqqdy

A Survey Of Cross-lingual Word Embedding Models [article]

Sebastian Ruder, Ivan Vulić, Anders Søgaard
2019 arXiv   pre-print
Cross-lingual representations of words enable us to reason about word meaning in multilingual contexts and are a key facilitator of cross-lingual transfer when developing natural language processing models  ...  We also discuss the different ways cross-lingual word embeddings are evaluated, as well as future challenges and research horizons.  ...  Acknowledgements We thank the anonymous reviewers for their valuable and comprehensive feedback.  ... 
arXiv:1706.04902v3 fatcat:lts6uop77zaazhzlbygqmdsama

BabelBERT: Massively Multilingual Transformers Meet a Massively Multilingual Lexical Resource [article]

Tommaso Green and Simone Paolo Ponzetto and Goran Glavaš
2022 arXiv   pre-print
In a series of subsequent controlled experiments, we demonstrate that the pretraining quality of word representations in the MMT for languages involved in specialization has a much larger effect on performance  ...  While existing work primarily focused on lexical specialization of PLMs in monolingual and bilingual settings, in this work we expose massively multilingual transformers (MMTs, e.g., mBERT or XLM-R) to  ...  et al., 2018) for aligning monolingual static word embedding spaces.  ... 
arXiv:2208.01018v1 fatcat:ef7dnwlb6veqxd5nnzj7x2xp44

Expanding the Text Classification Toolbox with Cross-Lingual Embeddings [article]

Meryem M'hamdi, Robert West, Andreea Hossmann, Michael Baeriswyl, and Claudiu Musat
2019 arXiv   pre-print
For all architectures, types of word embeddings and datasets, we notice a consistent gain trend in favor of multilingual joint training, especially for low-resourced languages.  ...  for CLTC; and we move from bi- to multi-lingual word embeddings.  ...  The idea is to make use of the inherent parallelism between the two spaces in the sense that English vectors for words in space EN-FR should be aligned to vectors of the same words in space EN-DE.  ... 
arXiv:1903.09878v2 fatcat:h3ho57z64bea5e3f36lfh2lymy


Pezhman Sheinidashtego, Aibek Musaev
2019 Zenodo  
State-of-the-art methods for learning cross-lingual word embeddings rely on the alignment of monolingual word embedding spaces.  ...  Recent advances in generating monolingual word embeddings based on word co-occurrence for universal languages inspired new efforts to extend the model to support diversified languages.  ...  ACKNOWLEDGEMENTS I would like to thank the Computer Science Department at The University of Alabama, because through their funding and support, the faculty and staff had made it possible for me to work  ... 
doi:10.5281/zenodo.3889326 fatcat:cfmiwnmcazh5flfdbpsr6jwh5q