Do Characters Abuse More Than Words?

Yashar Mehdad, Joel Tetreault
2016 Proceedings of the 17th Annual Meeting of the Special Interest Group on Discourse and Dialogue  
Although word and character n-grams have been used as features in different NLP applications, no systematic comparison or analysis has shown the power of character-based features for detecting abusive language. In this study, we investigate the effectiveness of such features for abusive language detection in user-generated online comments, and show that such methods outperform previous state-of-the-art approaches and other strong baselines.
doi:10.18653/v1/w16-3638 dblp:conf/sigdial/MehdadT16 fatcat:ayck5uos6rclth7kdlpoc74tje