Research Synthesis 2019 incl. Pre-Conference Symposium Big Data in Psychology, May 27-31, 2019 - Abstract Collection

Michael Bosnjak, Leibniz Institut Für Psychologische Information Und Dokumentation (ZPID)
2019
Virtual communities are broadly established in research as an arena that facilitates broadcasting radical beliefs and connecting with and recruiting sympathizers, yet they are also seen as a panacea for predicting radicalization around specific ideological content. A plethora of studies on terrorism and radicalization has focused on social media, particularly Twitter, to characterize accounts and interactions from a network analysis perspective or to detect individuals who are at risk or already radicalized. Although the significance of virtual platforms such as 4chan or Reddit as potential radicalization ecologies remains a research desideratum (see Schmid & Forest, 2018), a considerable number of publications nonetheless focus on collecting data from Twitter. Importantly, irrespective of the platform, the approaches used for data collection are prone to sampling bias (e.g. platform-specific biases alongside proxy population biases), which jeopardizes the reliability of results.

Objectives
The present work investigates social media data sampling practices and their associated pitfalls, building on a data collection framework developed by Parekh et al. (2018). Their general model comprises four phases: initialization, expansion, filtering and validation. In an attempt to replicate their framework, their procedure is followed to identify prevalent sampling methods (for user data, group data and interaction data) in existing research on online radicalization that makes use of social network analysis. In a similar vein, the objective is to investigate data collection limitations and their implications in accordance with the outlined phases. As an extension, studies dealing with radical manifestations other than ISIS sympathizers, such as the extreme right wing, are also considered.

Research Question(s)/Hypothesis/es
Building on the results of Parekh et al. (2018), the question is whether the publications considered sufficiently validate and filter their sampled data and consider possible sources of bias, and whether the results are comparable to those of Parekh et al.

Method
Taking the approach of Parekh et al. (2018) as a starting point, the impact of expansion and filtering on data quality is tested by constructing a separate dataset from Twitter over one month via the Twitter API. In contrast to their work, however, it is not jihadist accounts that are crawled but the accounts and metadata of white supremacist/right-wing sympathizers. In the expansion phase (i.e. adding more accounts to the seed accounts), the friend and follower relationships of Twitter accounts are exploited to create two data sets, each of which is in turn retained in both a filtered and an unfiltered version (filtering comprises the exclusion of accounts based on number of followers and activity status). A random sample from each of the data sets is manually annotated as either neutral, radical right-wing, ambiguous, irrelevant or containing insufficient information, in order to establish the composition and validity of the data set.

Results/Findings
Preliminary findings are to be discussed.

Conclusions and Implications (expected)
By replicating the work of Parekh et al. (2018) and by appraising and comparing the data collection procedures of existing empirical studies in radicalization research, this work offers lenses for improving the methodological foundation of psychological research, which faces new possibilities for gaining non-obtrusive insights into human behavior.
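
As a rough illustration of the expansion, filtering and annotation steps described under Method, the following Python sketch works on in-memory toy structures rather than on the Twitter API. It is not the authors' implementation. The `Account` record, its field names, the function names and the `min_followers` threshold are all hypothetical, and the abstract does not say whether seed accounts are exempt from filtering, so here the filter is simply applied to every account.

```python
import random
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Account:
    """Hypothetical account record; field names are illustrative only."""
    user_id: str
    followers_count: int
    is_active: bool


def expand(seeds, edges):
    """Expansion phase: add every account linked to a seed account.

    `edges` maps a user_id to the set of user_ids it is linked to.
    Passing the friend graph or the follower graph yields the two
    separate data sets mentioned in the abstract.
    """
    expanded = set(seeds)
    for seed in seeds:
        expanded |= edges.get(seed, set())
    return expanded


def filter_accounts(user_ids, accounts, min_followers, require_active=True):
    """Filtering phase: exclude accounts by follower count and activity.

    Assumption: accounts without metadata are dropped as well.
    """
    kept = set()
    for uid in user_ids:
        account = accounts.get(uid)
        if account is None:
            continue
        if account.followers_count < min_followers:
            continue
        if require_active and not account.is_active:
            continue
        kept.add(uid)
    return kept


def draw_sample(user_ids, k, seed=0):
    """Draw a reproducible random sample for manual annotation."""
    rng = random.Random(seed)
    pool = sorted(user_ids)  # sort so the result does not depend on set order
    return rng.sample(pool, min(k, len(pool)))


def composition(labels):
    """Share of each annotation label among the annotated accounts."""
    counts = Counter(labels)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {label: n / total for label, n in counts.items()}
```

Running `expand` once with the friend graph and once with the follower graph, and keeping both the output of `filter_accounts` and the unfiltered set, gives the four variants whose annotated samples can then be compared with `composition`.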
doi:10.23668/psycharchives.2449 fatcat:gr2qmtaehfau7hmbzik23rbmye