Thematic Analysis of 18 Years of PERC Proceedings using Natural Language Processing

Tor Ole B. Odden, Alessandro Marin, and Marcos D. Caballero
2020 arXiv pre-print
We have used an unsupervised machine learning method called Latent Dirichlet Allocation (LDA) to thematically analyze all papers published in the Physics Education Research Conference Proceedings between 2001 and 2018. By looking at co-occurrences of words across the data corpus, this technique has allowed us to identify ten distinct themes or "topics" that have seen varying levels of prevalence in Physics Education Research (PER) over time and to rate the distribution of these topics within each paper. Our analysis suggests that although all identified topics have seen sustained interest over time, PER has also seen several waves of increased interest in certain topics, beginning with initial interest in qualitative, theory-building studies of student understanding, which has given way to a focus on problem solving in the late 2000s. Since 2010 the field has seen a shift towards more sociocultural views of teaching and learning with a particular focus on communities of practice, student identities, and institutional change. Based on these results, we suggest that unsupervised text analysis techniques like LDA may hold promise for providing quantitative, independent, and replicable analyses of educational research literature.
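To make the two outputs mentioned in the abstract concrete (a set of topics, and a distribution of those topics within each paper), here is a minimal sketch of LDA fitted by collapsed Gibbs sampling, using only the Python standard library. It is not the authors' pipeline: the function name `fit_lda`, the hyperparameters `alpha` and `beta`, the iteration count, and the four-sentence toy corpus are all assumptions chosen for illustration, and a real analysis would more plausibly rely on an established library and a proper preprocessing step.

```python
import random


def fit_lda(docs, n_topics, n_iter=200, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA to tokenized documents with collapsed Gibbs sampling.

    Hypothetical helper for illustration. Returns (doc_topic, topic_word, vocab):
    per-document topic proportions, per-topic word probabilities, and the
    sorted vocabulary.
    """
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    n_words = len(vocab)

    doc_topic_counts = [[0] * n_topics for _ in docs]
    topic_word_counts = [[0] * n_words for _ in range(n_topics)]
    topic_totals = [0] * n_topics

    # Start from a random topic assignment for every token.
    assignments = []
    for d, doc in enumerate(docs):
        doc_assignments = []
        for w in doc:
            k = rng.randrange(n_topics)
            doc_assignments.append(k)
            doc_topic_counts[d][k] += 1
            topic_word_counts[k][word_id[w]] += 1
            topic_totals[k] += 1
        assignments.append(doc_assignments)

    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                v = word_id[w]
                k = assignments[d][i]
                # Remove the token's current assignment from the counts.
                doc_topic_counts[d][k] -= 1
                topic_word_counts[k][v] -= 1
                topic_totals[k] -= 1
                # Resample: topics that are common in this document and that
                # often generate this word get more weight.
                weights = [
                    (doc_topic_counts[d][t] + alpha)
                    * (topic_word_counts[t][v] + beta)
                    / (topic_totals[t] + n_words * beta)
                    for t in range(n_topics)
                ]
                k = rng.choices(range(n_topics), weights=weights)[0]
                assignments[d][i] = k
                doc_topic_counts[d][k] += 1
                topic_word_counts[k][v] += 1
                topic_totals[k] += 1

    # Smoothed estimates of the two distributions.
    doc_topic = [
        [(doc_topic_counts[d][t] + alpha) / (len(doc) + n_topics * alpha)
         for t in range(n_topics)]
        for d, doc in enumerate(docs)
    ]
    topic_word = [
        [(topic_word_counts[t][v] + beta) / (topic_totals[t] + n_words * beta)
         for v in range(n_words)]
        for t in range(n_topics)
    ]
    return doc_topic, topic_word, vocab


# Toy corpus (invented for this sketch, not taken from the proceedings).
toy_docs = [
    "students solve physics problems using expert strategies".split(),
    "problem solving strategies differ between novice and expert students".split(),
    "student identity develops within a community of practice".split(),
    "community and identity shape how students join physics practice".split(),
]

doc_topic, topic_word, vocab = fit_lda(toy_docs, n_topics=2)

for d, row in enumerate(doc_topic):
    print(f"doc {d}: topic proportions = {[round(p, 2) for p in row]}")

for t, row in enumerate(topic_word):
    top = sorted(range(len(vocab)), key=lambda v: row[v], reverse=True)[:4]
    print(f"topic {t}: top words = {[vocab[v] for v in top]}")
```

Each row of `doc_topic` is a probability distribution over topics for one document, which corresponds to the per-paper topic rating described in the abstract. Averaging those rows by publication year would give a rough picture of topic prevalence over time, although a corpus this small cannot produce meaningful topics.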
arXiv:2001.10753v1 fatcat:txsymkvdh5eh3h3kpttm7xnrlu