On the Reliability of the Findings of PISA Tests

Shlomo Yitzhaki
2016 Research & Reviews: Journal of Social Sciences   unpublished
Knowledge is a hidden variable, and we therefore require a test in order to rank subjects according to their level of knowledge. A test is a battery of questions of varying levels of difficulty. The test results constitute an ordinal variable, since one cannot measure knowledge quantitatively, as one would height or weight. A test can merely rank subjects according to their level of knowledge. It is common practice to rank the success of education systems in various countries according to the average score achieved by students who take a certain international test. One example is the PISA test, on which Israel is ranked 29th out of the 33 OECD countries. Averaging is a valid procedure for a quantitative variable, but not for an ordinal variable, the items of which can only be ranked. Since an ordinal variable can be ranked but not averaged, some of the rankings based on averages are unreliable, because one could have devised an alternative test with questions of a different degree of difficulty that would have altered the ranking of the mean scores. This article formulates the theoretical conditions under which an alternative test exists that would alter the ranking of the mean scores, and proceeds to an empirical examination of these conditions for all possible comparisons between Israel and other OECD countries. The findings show that in exactly half of the comparisons between Israel and the other OECD countries, an alternative test exists that would alter the ranking of the mean scores. A further finding indicates that the greater the gap between the mean scores, the less likely one is to find an alternative test that would alter the ranking. The conclusion to be drawn is that one should attach less importance to ranking according to mean scores.
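The central claim, that a different difficulty scale can reverse a ranking of means without changing any student's rank, can be illustrated with a small sketch. Everything here is an assumption made for illustration: the two score lists are invented (they are not PISA data), and the power rescaling is just one convenient rank-preserving transformation, not the construction used in the article. The lists are chosen so that neither country scores higher at every rank position.

```python
from statistics import mean

# Hypothetical scores on a 0-100 ordinal scale (invented, not PISA data).
country_a = [40, 50, 60, 70, 80]   # compact distribution, mean 60
country_b = [20, 30, 65, 85, 95]   # more spread out, mean 59

def rescale(score, power):
    """Strictly increasing map of [0, 100] onto itself.

    A larger power mimics a harder test: it compresses low scores and
    stretches high ones, but never changes the order of two students.
    """
    return 100.0 * (score / 100.0) ** power

def mean_gap(power):
    """Mean of country A minus mean of country B after rescaling."""
    mean_a = mean(rescale(s, power) for s in country_a)
    mean_b = mean(rescale(s, power) for s in country_b)
    return mean_a - mean_b

gap_original = mean_gap(1.0)   # original scale: A is ahead
gap_harder = mean_gap(3.0)     # rescaled "harder" test: B is ahead

print("A ahead on original scale:", gap_original > 0)
print("A ahead on rescaled scale:", gap_harder > 0)
```

On the original scale country A has the higher mean, while on the rescaled scale country B does, even though the pooled ordering of all ten students is identical under both scales.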
fatcat:qddbmzhbz5bbze62oprssxxriu