Detection of Abusive Language: the Problem of Biased Datasets
2019
North American Chapter of the Association for Computational Linguistics
We discuss the impact of data bias on abusive language detection. ...
Datasets with a higher proportion of implicit abuse are more affected than datasets with a lower proportion. ...
Acknowledgements The authors were partially supported by the German Research Foundation (DFG) under grants RU 1873/2-1 and WI 4204/2-1.
References ...
doi:10.18653/v1/n19-1060
dblp:conf/naacl/WiegandRK19
fatcat:bbmpyatp7rfipoo6piipwnnwpy
Cross-Domain Detection of Abusive Language Online
2018
Proceedings of the 2nd Workshop on Abusive Language Online (ALW2)
We investigate to what extent the models trained to detect general abusive language generalize between different datasets labeled with different abusive language types. ...
To this end, we compare the cross-domain performance of simple classification models on nine different datasets, finding that the models fail to generalize to out-domain datasets and that having at least ...
Detecting abusive language online is a subject of much ongoing research in the NLP community. ...
doi:10.18653/v1/w18-5117
dblp:conf/acl-alw/KaranS18
fatcat:xd2evaqe6rembc24ajg3hwx6ki
Joint Modelling of Emotion and Abusive Language Detection
[article]
2020
arXiv
pre-print
Aiming to tackle this problem, the natural language processing (NLP) community has experimented with a range of techniques for abuse detection. ...
In this paper, we present the first joint model of emotion and abusive language detection, experimenting in a multi-task learning framework that allows one task to inform the other. ...
This stresses the need for automated techniques for abusive language detection, a problem that has recently gained a great deal of interest in the natural language processing community. ...
arXiv:2005.14028v1
fatcat:nnltfnth4fb57npktge3gwu5xe
Abusive Language Detection and Characterization of Twitter Behavior
[article]
2020
arXiv
pre-print
Here the main objective is to focus on various forms of abusive behaviors on Twitter and to detect whether a speech is abusive or not. ...
In this work, abusive language detection in online content is performed using Bidirectional Recurrent Neural Network (BiRNN) method. ...
A large number of studies have been done in recent years to develop automatic methods for detecting abusive language on social media platforms. ...
arXiv:2009.14261v1
fatcat:tigf4y7tcfgi5evboqp55cwqh4
Reducing Gender Bias in Abusive Language Detection
2018
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing
Abusive language detection models tend to have a problem of being biased toward identity words of a certain group of people because of imbalanced training datasets. ...
In this work, we measure gender biases on models trained with different abusive language datasets, while analyzing the effect of different pre-trained word embeddings and model architectures. ...
Acknowledgments This work is partially funded by ITS/319/16FP of Innovation Technology Commission, HKUST, and 16248016 of Hong Kong Research Grants Council. ...
doi:10.18653/v1/d18-1302
dblp:conf/emnlp/ParkSF18
fatcat:ybtesdtm2fejxb2fevt7du4td4
Studying Generalisability across Abusive Language Detection Datasets
2019
Proceedings of the 23rd Conference on Computational Natural Language Learning (CoNLL)
Work on Abusive Language Detection has tackled a wide range of subtasks and domains. As a result of this, there exists a great deal of redundancy and non-generalisability between datasets. ...
Through experiments on cross-dataset training and testing, the paper reveals that the preconceived notion of including more non-abusive samples in a dataset (to emulate reality) may have a detrimental ...
Acknowledgments Thanks to all the researchers who have made their datasets available, specially Waseem and Hovy, Davidson et al., Founta et al., and Zampieri et al., the organisers of SemEval-2019 Task ...
doi:10.18653/v1/k19-1088
dblp:conf/conll/SwamyJG19
fatcat:mn6slro7dfhybh2gl4j7zbmrpy
Neural Character-based Composition Models for Abuse Detection
2018
Proceedings of the 2nd Workshop on Abusive Language Online (ALW2)
Acknowledgements Special thanks to the anonymous reviewers for their valuable comments and suggestions. ...
Conclusions In this paper, we considered the problem of obfuscated words in the field of automated abuse detection. ...
Datasets Following the proceedings of the 1st Workshop on Abusive Language Online (Waseem et al., 2017), we use three datasets from two different domains. ...
doi:10.18653/v1/w18-5101
dblp:conf/acl-alw/MishraYS18
fatcat:y7mbghaq6japjkuojozdhwhtjm
Multi-Class Detection of Abusive Language Using Automated Machine Learning
[chapter]
2020
WI2020 Zentrale Tracks
We propose Auto-ML as a promising approach to the field of abusive language detection, especially for small companies who may have little machine learning knowledge and computing resources. ...
Abusive language detection online is a daunting task for moderators. We propose Automated Machine Learning (Auto-ML) to semi-automate abusive language detection and to assist moderators. ...
Acknowledgements The research leading to these results received funding from the federal state of North Rhine-Westphalia and the European Regional Development Fund (EFRE.NRW 2014-2020), Project: (No. ...
doi:10.30844/wi_2020_r7-jorgensen
dblp:conf/wirtschaftsinformatik/JorgensenCNB020
fatcat:unwwznyihfazvdxgfign2ktjua
Abusive Language Detection in Heterogeneous Contexts: Dataset Collection and the Role of Supervised Attention
[article]
2021
arXiv
pre-print
This is due in part to the lack of datasets that explicitly annotate heterogeneity in abusive language. ...
Abusive language is a massive problem in online social platforms. ...
Acknowledgements This work was supported in part by the National Science Foundation under grant no. 1720268. We would like to thank ...
arXiv:2105.11119v1
fatcat:2u2ltkykzvh37gyj5tlk7nthkm
On Cross-Dataset Generalization in Automatic Detection of Online Abuse
[article]
2021
arXiv
pre-print
NLP research has attained high performances in abusive language detection as a supervised classification task. ...
We explore the topic bias and the task formulation bias in cross-dataset generalization. We show that the benign examples in the Wikipedia Detox dataset are biased towards platform-specific topics. ...
Detection of Abusive Language: the Problem of Biased Datasets. ...
arXiv:2010.07414v3
fatcat:tr6njwf2nzbvvl4a35an7eaeji
Aggression Detection on Social Media Text Using Deep Neural Networks
2018
Proceedings of the 2nd Workshop on Abusive Language Online (ALW2)
In this paper, we introduce a deep learning based classification system for Facebook posts and comments of Hindi-English Code-Mixed text to detect the aggressive behaviour of/towards users. ...
Our work focuses on text from users mainly in the Indian Subcontinent. The dataset that we used for our models is provided by TRAC-1 in their shared task. ...
These networks have been used in the past for tasks similar to ours, such as hate speech detection (Badjatiya et al., 2017), bullying detection (Agrawal and Awekar, 2018), abusive language detection ...
doi:10.18653/v1/w18-5106
dblp:conf/acl-alw/SinghVAVS18
fatcat:7ymguyz25bfzzdb5spld5p45xm
WAC: A Corpus of Wikipedia Conversations for Online Abuse Detection
[article]
2020
arXiv
pre-print
We also propose, in addition to this corpus, a complete benchmarking platform to stimulate and fairly compare scientific works around the problem of content abuse detection, trying to avoid the recurring problem of result replication. ...
By comparison, in the abuse detection literature, datasets are often annotated by considering comments flagged by moderators as abusive, whereas the rest of the comments are deemed non-abusive by default ...
arXiv:2003.06190v1
fatcat:2fr4mzluerbythqbpnp4plxncq
Reducing Gender Bias in Abusive Language Detection
[article]
2018
arXiv
pre-print
Abusive language detection models tend to have a problem of being biased toward identity words of a certain group of people because of imbalanced training datasets. ...
In this work, we measure gender biases on models trained with different abusive language datasets, while analyzing the effect of different pre-trained word embeddings and model architectures. ...
Acknowledgments This work is partially funded by ITS/319/16FP of Innovation Technology Commission, HKUST, and 16248016 of Hong Kong Research Grants Council. ...
arXiv:1808.07231v1
fatcat:irtywc5hkneyvacueh4ovnvfqa
Detecting Recovery Problems Just in Time: Application of Automated Linguistic Analysis and Supervised Machine Learning to an Online Substance Abuse Forum
2018
Journal of Medical Internet Research
Results: To distinguish recovery problem disclosures, the Bag-of-Words approach relied on domain-specific language, including words explicitly linked to substance use and mental health ("drink," "relapse ...
Conclusions: Differences in language use can distinguish messages disclosing recovery problems from other message types. ...
Acknowledgments This research was funded by the National Institute of Alcohol Abuse and Alcoholism (R01 AA017192) and the National Institute on Drug Abuse (R01DA034279, R01DA040449, and DP2DA042424). ...
doi:10.2196/10136
pmid:29895517
pmcid:PMC6019846
fatcat:fsu37llrdfbwxphzycmh4sj7xu
Directions in Abusive Language Training Data: Garbage In, Garbage Out
[article]
2020
arXiv
pre-print
This paper systematically reviews abusive language dataset creation and content in conjunction with an open website for cataloguing abusive language data. ...
Data-driven analysis and detection of abusive online content covers many different tasks, phenomena, contexts, and methodologies. ...
Creating a training dataset for online abuse detection is typically motivated by the desire to address a particular social problem. ...
arXiv:2004.01670v2
fatcat:vj5mxajmsbbtfk4e7u2iyiacmy