Online Extremism Detection: A Systematic Literature Review with Emphasis on Datasets, Classification Techniques, Validation Methods and Tools

Mayur Gaikwad, Swati Ahirrao, Shraddha Phansalkar, Ketan Kotecha
2021 IEEE Access  
Social media platforms are popular for expressing personal views, emotions and beliefs. Social media platforms are influential for propagating extremist ideologies for group-building, fund-raising, and recruitment. To monitor and control the outreach of extremists on social media, detection of extremism in social media is necessary. The existing extremism detection literature on social media is limited by specific ideology, subjective validation methods, and binary or tertiary classification. A
more » ... comprehensive and comparative survey of datasets, classification techniques, validation methods with online extremism detection tool is essential. The systematic literature review methodology (PRISMA) was used. Sixty-four studies on extremism research were collected, including 31 from SCOPUS, Web of Science (WoS), ACM, IEEE, and 33 thesis, technical and analytical reports using Snowballing technique. The survey highlights the role of social media in propagating online radicalization and the need for extremism detection on social media platforms. The review concludes lack of publicly available, class-balanced, and unbiased datasets for better detection and classification of social-media extremism. Lack of validation techniques to evaluate correctness and quality of custom data sets without human interventions, was found. The information retrieval unveiled that contemporary research work is prejudiced towards ISIS ideology. We investigated that deep learning based automated extremism detection techniques outperform other techniques. The review opens the research opportunities for developing an online, publicly available automated tool for extremism data collection and detection. The survey results in conceptualization of architecture for construction of multi-ideology extremism text dataset with robust data validation techniques for multiclass classification of extremism text.
doi:10.1109/access.2021.3068313 fatcat:56xuyhtuxvdsxf7s7s5i3jvvbe