A copy of this work was available on the public web and has been preserved in the Wayback Machine. The capture dates from 2022; you can also visit the original URL.
The file type is application/pdf.
A New Corpus and Lexicon for Offensive Tamazight Language Detection
2022
7th International Workshop on Social Media World Sensors
In this paper, we address offensive language detection in the Tamazight language, one of the under-resourced languages that are still in their infancy and lack a standard orthography. We are particularly interested in the Kabyle dialect, mainly spoken in some cities of northern Algeria (i.e. Tizi-ouzou and Bejaïa). We propose a new corpus of offensive Tamazight language (the OTAM corpus) comprising 6.2k texts, as well as a new lexicon of offensive and abusive Tamazight words with
doi:10.1145/3544795.3544852
fatcat:gilkdb2mmnb3xfntc3ptkx5jj4