Cross-cultural Measurement Invariance of the Items in the Science Literacy Test in the Programme for International Student Assessment (PISA-2015)
International Journal of Education and Literacy Studies
This study aimed to investigate cross-cultural measurement invariance of the PISA (Programme for International Student Assessment, 2015) science literacy test and items and to carry out a bias study on the items which violate measurement invariance. The study used a descriptive review model. The sample of the study consisted of 2224 students taking the S12 test booklet from Australia, France, Singapore, and Turkey. Measurement invariance analyses for the test were done using Multi-Group
... Multi-Group Confirmatory Factor Analysis (MGCFA). Differential Item Functioning (DIF), in other words, measurement invariance of the test items, was analyzed using the item response theory log-likelihood ratio (IRTLR), Hierarchical Generalized Linear Model (HGLM), and the Simultaneous Item Bias Test (SIBTEST) methods.According to the findings, the test was determined to exhibit structural invariance across cultures. The highest number of items showing DIF was observed in the comparisons of Australia-Singapore and Australia-France with 35%. The number of items showing DIF, with 24%, determined in bilateral comparisons which included Turkey, the only country taking the translated form among other countries, did not show a significant difference compared to the other comparisons. While the lowest number of items showing DIF was obtained from Singapore-France samples with 12%, the rate of items indicating DIF in the France-Turkey samples was 18%. On the other hand, 35% of the items showed cross cultural measurement invariance. An item bias study was carried out based on expert opinions on items identified and released as showing DIF in the comparisons of Turkey with Australia and Singapore.According to the findings, translation-bound differentiation of the items, familiarity of a culture group with the contents of the items, polysemy in the expressions or words used in the items, the format, or the stylistic characteristics of the items were determined to be the cause of the bias in the skills measured with the items.