EVALITA4ELG: Italian Benchmark Linguistic Resources, NLP Services and Tools for the ELG Platform
2020
Italian Journal of Computational Linguistics
This paper describes the EVALITA4ELG project, whose main aim is to systematically collect the resources released as benchmarks for this evaluation campaign and make them easily accessible through ...
The collection is moreover integrated with systems and baselines as a pool of web services with a common interface, deployed on a dedicated hardware infrastructure. ...
The European Language Grid project has received funding from the European Union's Horizon 2020 Research and Innovation programme under Grant Agreement no. 825627 (ELG). ...
doi:10.4000/ijcol.754
fatcat:qiwz4yjaf5ecfg5k4tdkfxlcii
Comparing Deep-Learning Architectures and Traditional Machine-Learning Approaches for Satire Identification in Spanish Tweets
2020
Mathematics
Our results suggest that term-counting features and traditional machine learning models provide competitive results regarding automatic satire identification, slightly outperforming state-of-the-art models ...
For this, the state-of-the-art relies on neural networks fed with word embeddings that are capable of learning interesting characteristics regarding the way humans communicate. ...
Conflicts of Interest: The authors declare no conflict of interest. ...
doi:10.3390/math8112075
doaj:b51bf8fa62114428b4085d8d27ef4f3b
fatcat:zdvkhyavbjgqxppwke4ufx3m7m
A novel COVID-19 data set and an effective deep learning approach for the de-identification of Italian medical records
2021
IEEE Access
of their content in accordance with the restrictions imposed by both national and supranational privacy authorities. ...
However, the lack of data sets in other languages has strongly limited their applicability and performance evaluation. ...
and classification. ...
doi:10.1109/access.2021.3054479
pmid:34786303
pmcid:PMC8545240
fatcat:ogndkjscqzfm5eqa4tymglqpwq
SWSR: A Chinese Dataset and Lexicon for Online Sexism Detection
[article]
2021
arXiv
pre-print
We conduct experiments for the three sexism classification tasks making use of state-of-the-art machine learning models. ...
While research in the sexism detection domain is growing, most of this research focuses on English as the language and on Twitter as the platform. ...
French | Sexism | direct, descriptive, reporting, non-sexist, no decision | 12k | 2020 | [28]
MeTwo | Spanish | Sexism | sexist, not-sexist, doubtful | 3600 | 2020 | [15]
EXIST@IberLEF | English, Spanish ...
arXiv:2108.03070v1
fatcat:7pnrrr54xrd6jc6hhacglht43e
RigoBERTa: A State-of-the-Art Language Model For Spanish
[article]
2022
arXiv
pre-print
RigoBERTa performance is assessed over 13 NLU tasks in comparison with other available Spanish language models, namely, MarIA, BERTIN and BETO. ...
RigoBERTa outperformed the three models in 10 out of the 13 tasks, achieving new "State-of-the-Art" results. ...
Acknowledgments We would like to acknowledge the support of this project by Instituto de Ingeniería del Conocimiento, as without their faith in us and their investment, this project would not have been ...
arXiv:2205.10233v1
fatcat:ickgk2osgrgvllo3dlucbnrezm
Multi-task Learning of Negation and Speculation for Targeted Sentiment Classification
[article]
2021
arXiv
pre-print
We release both the datasets and the source code at https://github.com/jerbarnes/multitask_negation_for_targeted_sentiment. ...
In this paper, we propose a multi-task learning method to incorporate information from syntactic and semantic auxiliary tasks, including negation and speculation scope detection, to create English-language ...
Acknowledgements This work has been carried out as part of the SANT project (Sentiment Analysis for Norwegian Text), funded by the Research Council of Norway (grant ...
arXiv:2010.08318v2
fatcat:z7b3fmynobarzmj344hadthupu
Gender Bias in Text: Labeled Datasets and Lexicons
[article]
2022
arXiv
pre-print
However, there is a lack of gender bias datasets and lexicons for automating the detection of gender bias using supervised and unsupervised machine learning (ML) and natural language processing (NLP) techniques ...
Therefore, the main contribution of this work is to publicly provide labeled datasets and exhaustive lexicons by collecting, annotating, and augmenting relevant sentences to facilitate the detection of ...
in Ibereval 2018 [14] and Evalita 2020 [13] provided datasets in English, Spanish, and Italian to detect misogynistic content, to classify misogynous behaviour as well as to identify the target of ...
arXiv:2201.08675v1
fatcat:iv66itdd2bhyhevb5vdh5v46wy
On the Use of Parsing for Named Entity Recognition
2021
Applied Sciences
It is intuitive that the structure of a text can be helpful to determine whether or not a certain portion of it is an entity and if so, to establish its concrete limits. ...
we review the different approaches to NER that make use of syntactic information; and we propose a new way of using parsing in NER based on casting parsing itself as a sequence labeling task. ...
Examples from 2020 are the Spanish CANTEMIST-NER shared task, with 23 participating teams, featured in IberLEF 2020 [97]; or the English W-NUT-2020 Task 1, with 13 participants, featured in EMNLP 2020 ...
doi:10.3390/app11031090
fatcat:6suu3nnvonfzbmmi3uywcts5xq
A Language Model for Misogyny Detection in Latin American Spanish Driven by Multisource Feature Extraction and Transformers
2021
Applied Sciences
The complexity of recognizing misogyny through computer models lies in the fact that it is a subtle type of violence, it is not always explicitly aggressive, and it can even hide behind seemingly flattering ...
This research contributes to the development of models for the automatic detection of misogynistic texts in Latin American Spanish and contributes to the design of data augmentation methodologies since ...
Conflicts of Interest: The authors declare no conflict of interest. Appl. Sci. 2021, 11, 10467 ...
doi:10.3390/app112110467
fatcat:hrsqz2oe5rbhhnonbjxgtz5eia
Challenging Social Media Threats using Collective Well-being Aware Recommendation Algorithms and an Educational Virtual Companion
[article]
2021
arXiv
pre-print
The full impact of current SM platform design -- both at an individual and societal level -- asks for a comprehensive evaluation and conceptual improvement. ...
On the other hand however, some serious negative implications of SM have repeatedly been highlighted in recent years, pointing at various SM threats for society, and its teenagers in particular: from common ...
Kruschwitz, 2020), detections at the span level at Task 5 in SemEval-2021 (Pavlopoulos et al., 2021), and generalisation to social media platforms other than those used in training at EXIST in IberLEF ...
arXiv:2102.04211v3
fatcat:ps2pztjcevbardpnkzrbuiia7a
Proceedings of the GermEval 2021 Workshop on the Identification of Toxic, Engaging, and Fact-Claiming Comments
[article]
2021
We created ensembles of these models and investigated whether and how classification performance depends on the number of ensemble members and their composition. ...
On out-of-sample data, our best ensemble achieved a macro-F1 score of 0.73 (for all subtasks), and F1 scores of 0.72, 0.70, and 0.76 for subtasks 1, 2, and 3, respectively. ...
GPUs for training our deep learning models and also to our colleagues at Precog for constant support and feedback. ...
doi:10.48415/2021/fhw5-x128
fatcat:u3fcq4x23jba7ic2a5ldcsdbna
Multi-task Learning of Negation and Speculation for Targeted Sentiment Classification
2021
Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies
unpublished
We release both the datasets and the source code at https://github.com/jerbarnes/multitask_negation_for_targeted_sentiment. ...
In this paper, we propose a multi-task learning method to incorporate information from syntactic and semantic auxiliary tasks, including negation and speculation scope detection, to create English-language ...
Acknowledgements This work has been carried out as part of the SANT project (Sentiment Analysis for Norwegian Text), funded by the Research Council of Norway (grant number 270908). ...
doi:10.18653/v1/2021.naacl-main.227
fatcat:zvfxjm2f6jcytcgwcly33od6da