Comparison of Correlation Measures for Nominal Data [post]

Tanweer Islam, Mahvish Rizwan
2020 unpublished
In the social sciences, many studies use nominal data to establish relationships between variables. This, in turn, requires the correct choice of correlation technique, which depends on the underlying assumptions and on the power of the test of significance. The objective of this research is to identify the best measure of association for nominal data in terms of size, power, and bias in estimation. Monte Carlo simulations reveal that the Phi and Pearson correlation statistics perform equally well in terms of size, power, and bias for naturally dichotomous variables. When both variables are artificially dichotomized, the Tetrachoric statistic has an edge over the Pearson correlation statistic in terms of bias. If one variable is continuous and the other is artificially dichotomized, the Biserial correlation measure turns out to be less biased than the Pearson statistic, although both statistics exhibit similar power and size properties. If one variable is continuous and the other is naturally dichotomous, it is hard to choose between the Point Biserial and Pearson correlation measures. Finally, if one variable is naturally dichotomous and the other is artificially dichotomized, the correlation coefficient V is compared with the Pearson, Phi, and Tetrachoric correlation techniques in terms of bias in the estimate. The results indicate that the Tetrachoric statistic considerably overestimates the correlation value under non-normal distributions, while the Pearson and Phi correlations slightly underestimate it. In contrast, the correlation statistic V performs well.
doi:10.20944/preprints202004.0276.v1 fatcat:qrg6wzr6gbd2zkylohtfvv6twi
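As an illustration of two points the abstract relies on, the following minimal sketch (not the authors' simulation design) checks that Phi coincides with Pearson's r when both variables are coded 0/1, and shows how dichotomizing two correlated normal variables at zero pulls the Pearson estimate below the underlying correlation. The sample size, seed, true correlation of 0.5, and the helper names `pearson`, `phi`, and `simulate` are all assumptions chosen for the example. The reference value (2/π)·arcsin(ρ) holds only for a bivariate normal pair split at its medians.

```python
import math
import random


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def phi(x, y):
    """Phi coefficient computed from the 2x2 table of two 0/1 sequences."""
    n11 = sum(1 for a, b in zip(x, y) if a == 1 and b == 1)
    n10 = sum(1 for a, b in zip(x, y) if a == 1 and b == 0)
    n01 = sum(1 for a, b in zip(x, y) if a == 0 and b == 1)
    n00 = sum(1 for a, b in zip(x, y) if a == 0 and b == 0)
    num = n11 * n00 - n10 * n01
    den = math.sqrt((n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00))
    return num / den


def simulate(rho, n, seed):
    """Draw n pairs from a standard bivariate normal with correlation rho."""
    rng = random.Random(seed)
    xs, ys = [], []
    for _ in range(n):
        z1 = rng.gauss(0, 1)
        z2 = rng.gauss(0, 1)
        xs.append(z1)
        ys.append(rho * z1 + math.sqrt(1 - rho ** 2) * z2)
    return xs, ys


RHO = 0.5  # assumed true correlation for this illustration
xs, ys = simulate(RHO, n=20000, seed=1)

# Artificial dichotomization: split each continuous variable at zero.
bx = [1 if v > 0 else 0 for v in xs]
by = [1 if v > 0 else 0 for v in ys]

r_cont = pearson(xs, ys)   # Pearson on the continuous data
r_bin = pearson(bx, by)    # Pearson on the dichotomized data
phi_bin = phi(bx, by)      # Phi on the same 2x2 table
phi_theory = (2 / math.pi) * math.asin(RHO)  # median-split value under normality

print(f"Pearson (continuous):   {r_cont:.3f}")
print(f"Pearson (dichotomized): {r_bin:.3f}")
print(f"Phi (dichotomized):     {phi_bin:.3f}")
print(f"Theoretical phi:        {phi_theory:.3f}")
```

The gap between the continuous and dichotomized estimates is the kind of bias that motivates the Tetrachoric and Biserial corrections discussed in the abstract; a full comparison of size and power would repeat such draws many times per scenario.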