NEUROSENT-PDI at SemEval-2018 Task 3: Understanding Irony in Social Networks Through a Multi-Domain Sentiment Model

Mauro Dragoni
2018 Proceedings of The 12th International Workshop on Semantic Evaluation  
The output layer has been adapted based on the characteristics of each subtask.  ...  Then, tweets are converted in the corresponding vector representation and given as input to the neural network with the aim of learning the different semantics contained in each emotion taken into account  ...  In Proceedings of the Tenth International Conference on Language Resources and Evaluation LREC 2016, Portorož, Slovenia, May 23-28, 2016. European Language Resources Association (ELRA).  ... 
doi:10.18653/v1/s18-1083 dblp:conf/semeval/Dragoni18a fatcat:gb2vndasmfa6bapmjbdkixukxa

A Panoramic Survey of Natural Language Processing in the Arab World [article]

Kareem Darwish and Nizar Habash and Mourad Abbas and Hend Al-Khalifa and Huseein T. Al-Natsheh and Samhaa R. El-Beltagy and Houda Bouamor and Karim Bouzoubaa and Violetta Cavalli-Sforza and Wassim El-Hajj and Mustafa Jarrar and Hamdy Mubarak
2021 arXiv   pre-print
awareness, use, and expectations of what may have seemed like science fiction in the past.  ...  Arabic, the primary language of the Arab world and the religious language of millions of non-Arab Muslims is somewhere in the middle of this continuum.  ...  In Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16) (Portorož, Slovenia, May 2016), European Language Resources Association (ELRA), pp. 1808-1812.[22] Al  ... 
arXiv:2011.12631v3 fatcat:cfycp2j6r5gu3a66zu27fo3vya

The NLP4NLP Corpus (II): 50 Years of Research in Speech and Language Processing

Joseph Mariani, Gil Francopoulo, Patrick Paroubek, Frédéric Vernier
2019 Frontiers in Research Metrics and Analytics  
In addition, it allowed us to study the use of language resources, in the framework of the paradigm shift between knowledge-based approaches and content-based approaches, and the reuse of articles and  ...  The NLP4NLP corpus contains articles published in 34 major conferences and journals in the field of speech and natural language processing over a period of 50 years, comprising 65,000 documents, gathering  ...  LREC 2016, Tenth International Conference on Language Resources and Evaluation Proceedings, Portorož, Slovenia, May 23-28, 2016 This paper analyzes the possibility to predict the future research topics  ... 
doi:10.3389/frma.2018.00037 fatcat:fcrnyjo6nvfthhtejoygmnda3q

Towards Improved Model Design for Authorship Identification: A Survey on Writing Style Understanding [article]

Weicheng Ma, Ruibo Liu, Lili Wang, Soroush Vosoughi
2020 arXiv   pre-print
We first describe our survey results on the current state of research in both sets of tasks and summarize existing achievements and problems in authorship-related tasks.  ...  Authorship identification tasks, which rely heavily on linguistic styles, have always been an important part of Natural Language Understanding (NLU) research.  ...  In Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16), pages 3945-3952, Portorož, Slovenia. European Language Resources Association (ELRA).  ... 
arXiv:2009.14445v1 fatcat:bnrh5e6rc5debo4apsoqn257nu

scb-mt-en-th-2020: A Large English-Thai Parallel Corpus [article]

Lalita Lowphansirikul, Charin Polpanumas, Attapol T. Rutherford, Sarana Nutanong
2020 arXiv   pre-print
Our models' performance are comparable to that of Google Translation API (as of May 2020) for Thai-English and outperform Google when the Open Parallel Corpus (OPUS) is included in the training data for  ...  The primary objective of our work is to build a large-scale English-Thai dataset for machine translation.  ...  We thank our data annotation partners Hope Data Annotations and Wang: Data Market; Office of the National Economic and Social Development Council (NESDC) through Phannisa Nirattiwongsakorn for providing  ... 
arXiv:2007.03541v1 fatcat:xv4ymyydtrewxnwe2xrbhah62y

The Low-Resource Double Bind: An Empirical Study of Pruning for Low-Resource Machine Translation [article]

Orevaoghene Ahia, Julia Kreutzer, Sara Hooker
2021 arXiv   pre-print
However, evaluation of the trade-offs incurred by popular compression techniques has been centered on high-resource datasets.  ...  We introduce the term low-resource double bind to refer to the co-occurrence of data limitations and compute resource constraints.  ...  In Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16), pages 4543-4549, Portorož, Slovenia. European Language Resources Association (ELRA).  ... 
arXiv:2110.03036v1 fatcat:z4o4boau25cwrbtxqyhjlevvgy

Cross-document Event Identity via Dense Annotation [article]

Adithya Pratapa, Zhengzhong Liu, Kimihiro Hasegawa, Linwei Li, Yukari Yamakawa, Shikun Zhang, Teruko Mitamura
2021 arXiv   pre-print
Such annotation setup reduces the pool of event mentions and prevents one from considering the possibility of quasi-identity relations.  ...  In addition to the links, we further collect overlapping event contexts, including time, location, and participants, to shed some light on the relation between identity decisions and context.  ...  The views and conclusions contained herein are those of the authors and should not be interpreted as necessarily representing the official policies or endorsements, either expressed or implied, of the  ... 
arXiv:2109.06417v1 fatcat:6cwedtrjxrg7dkgzbozeteunhy

UNKs Everywhere: Adapting Multilingual Language Models to New Scripts [article]

Jonas Pfeiffer, Ivan Vulić, Iryna Gurevych, Sebastian Ruder
2021 arXiv   pre-print
Massively multilingual language models such as multilingual BERT offer state-of-the-art cross-lingual transfer performance on a range of NLP tasks.  ...  In this work, we propose a series of novel data-efficient methods that enable quick and effective adaptation of pretrained multilingual models to such low-resource languages and unseen scripts.  ...  We thank Laura Rimell, Nils Reimers, Michael Bugert and the anonymous reviewers for insightful feedback and suggestions on a draft of this paper.  ... 
arXiv:2012.15562v3 fatcat:4zrgue5xyfbu5ft4fun5jaspx4

A Linguistic Analysis of Visually Grounded Dialogues Based on Spatial Expressions [article]

Takuma Udagawa, Takato Yamazaki, Akiko Aizawa
2020 arXiv   pre-print
Overall, we propose a novel framework and resource for investigating fine-grained language understanding in visually grounded dialogues.  ...  We demonstrate that our annotation can reveal both the strengths and weaknesses of baseline models in essential levels of detail.  ...  We also thank the anonymous reviewers for their valuable suggestions and comments.  ... 
arXiv:2010.03127v1 fatcat:s3bsa7qb6fdalopkntel2szdpa

Universal Lemmatizer: A sequence-to-sequence model for lemmatizing Universal Dependencies treebanks

Jenna Kanerva, Filip Ginter, Tapio Salakoski
2020 Natural Language Engineering  
We evaluate our lemmatizer on 52 different languages and 76 different treebanks, showing that our system outperforms all latest baseline systems.  ...  Additionally, we study two different data augmentation methods utilizing autoencoder training and morphological transducers especially beneficial for low-resource languages.  ...  We gratefully acknowledge the support of Academy of Finland, CSC - IT Center for Science, and the NVIDIA Corporation GPU Grant Program.  ... 
doi:10.1017/s1351324920000224 fatcat:67yezlpyk5bg7mkstjc266yl6e

Continual Learning in Multilingual NMT via Language-Specific Embeddings [article]

Alexandre Berard
2021 arXiv   pre-print
performance: training on English-centric data is enough to translate between the new language and any of the initial languages.  ...  Because the parameters of the original model are not modified, its performance on the initial languages does not degrade.  ...  In Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16), pages 3530-3534, Portorož, Slovenia.  ... 
arXiv:2110.10478v1 fatcat:76j3vrdn5bhx5btldgl2jiipf4

A Survey of Deep Active Learning [article]

Pengzhen Ren, Yun Xiao, Xiaojun Chang, Po-Yao Huang, Zhihui Li, Brij B. Gupta, Xiaojiang Chen, Xin Wang
2021 arXiv   pre-print
In addition, we also analyzed and summarized the development of DAL from the perspective of application.  ...  In recent years, due to the rapid development of internet technology, we are in an era of information torrents and we have massive amounts of data.  ...  In Proceedings of the Tenth International Conference on Language Resources and Evaluation LREC 2016, Portorož, Slovenia, May 23-28, 2016.  ... 
arXiv:2009.00236v2 fatcat:zuk2doushzhlfaufcyhoktxj7e

Approximating Stacked and Bidirectional Recurrent Architectures with the Delayed Recurrent Neural Network [article]

Javier S. Turek, Shailee Jain, Vy Vo, Mihai Capota, Alexander G. Huth, Theodore L. Willke
2020 arXiv   pre-print
We prove that a weight-constrained version of the delayed-RNN is equivalent to a stacked-RNN. We also show that the delay gives rise to partial acausality, much like bidirectional networks.  ...  In this work, we explore the delayed-RNN, which is a single-layer RNN that has a delay between the input and output.  ...  In Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16), pp. 1659-1666, Portorož, Slovenia, May 2016. European Language Resources Association (ELRA).  ... 
arXiv:1909.00021v2 fatcat:yjvtgie3yvbupmacqo4qkjjmdq

Gamified crowdsourcing for idiom corpora construction

Gülşen Eryiğit, Ali Şentaş, Johanna Monti
2022 Natural Language Engineering  
The approach is language-independent and evaluated on two languages in comparison to traditional data preparation techniques in the field.  ...  Learning idiomatic expressions is seen as one of the most challenging stages in second-language learning because of their unpredictable meaning.  ...  The authors would like to offer their special thanks to Cihat Eryigit for the discussions during the  ... 
doi:10.1017/s1351324921000401 fatcat:i36rqivxbrgnro6mmgf7yexvvq

Identifying Morality Frames in Political Tweets using Relational Learning [article]

Shamik Roy, Maria Leonor Pacheco, Dan Goldwasser
2021 arXiv   pre-print
The Moral Foundation Theory identifies five moral foundations, each associated with a positive and negative polarity.  ...  We do qualitative and quantitative evaluations, showing that moral sentiment towards entities differs highly across political ideologies.  ...  Acknowledgements We thank Nikhil Mehta, Rajkumar Pujari, and the anonymous reviewers for their insightful comments. This work was partially supported by an NSF CAREER award IIS-2048001.  ... 
arXiv:2109.04535v1 fatcat:ufjt4byevfer3ipwux73z7dnyi