An R Toolbox for Score-Based Measurement Invariance Tests in IRT Models [post]

Lennart Schneider, Carolin Strobl, Achim Zeileis, Rudolf Debelak
2020 unpublished
The detection of differential item functioning (DIF) is a central topic in psychometrics and educational measurement. In the past few years, a new family of score-based tests of measurement invariance has been proposed that allows the detection of DIF along arbitrary person covariates in a variety of item response theory (IRT) models. This paper illustrates the application of these tests within the R system for statistical computing, making them accessible to a broad range of users. This ... ation also includes IRT models for which these tests have not previously been investigated, such as the generalized partial credit model. The paper has three goals: First, we review the ideas behind score-based tests of measurement invariance. Second, we describe the implementation of these tests within the R system for statistical computing, which is based on the interaction of the R packages mirt, psychotools and strucchange. Third, we illustrate the application of this software and the interpretation of its output in two empirical datasets, and show how to conduct simulation studies, such as IRT-based power analyses, in the context of DIF investigations. The complete R code for reproducing our results is reported in the paper and its appendix.
doi:10.31234/osf.io/r9w34 fatcat:4lxfigrzinbenixt3xd3e5an7e