The handling of missing binary data in language research
Studies in Second Language Learning and Teaching
François Pichette, Sébastien Béland, Shahab Jolani, Justyna Leśniewska
Researchers are frequently confronted with unanswered questions or items on their questionnaires and tests, due to factors such as item difficulty, lack of testing time, or participant distraction. This paper first presents results from a poll confirming previous claims (Rietveld & van Hout, 2006; Schafer & Graham, 2002) that data replacement and deletion methods are common in research. Language researchers declared that when faced with missing answers of the yes/no type (that translate into zero or one in data tables), the three most common solutions they adopt are to exclude the participant's data from the analyses, to leave the cell empty, or to fill it in with zero, as for an incorrect answer. This study then examines the impact on Cronbach's α of five types of data insertion, using simulated and actual data with various numbers of participants and percentages of missing data. Our analyses indicate that the three most common methods we identified among language researchers are the ones with the greatest impact on Cronbach's α coefficients; in other words, they are the least desirable solutions to the missing data problem. On the basis of our results, we make recommendations for language researchers concerning the best way to deal with missing data. Given that none of the most common simple methods works properly, we suggest that the missing data be replaced either by the item's mean or by the participant's overall mean to provide a better, more accurate image of the instrument's internal consistency.
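To make the comparison concrete, the following is a minimal Python sketch of three of the replacement strategies mentioned above (zero fill, item mean, participant mean), together with the standard formula for Cronbach's α, α = k/(k − 1) · (1 − Σ item variances / variance of total scores), where k is the number of items. The function names and the toy data are hypothetical illustrations, not the authors' code or data, and the sketch assumes every item and every participant has at least one observed answer.

```python
from statistics import mean, variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a complete participants-by-items table."""
    k = len(rows[0])                      # number of items
    columns = list(zip(*rows))            # one tuple of scores per item
    item_var_sum = sum(variance(col) for col in columns)
    total_var = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - item_var_sum / total_var)


def impute_zero(rows):
    """Replace each missing answer (None) with 0, as for an incorrect answer."""
    return [[0 if x is None else x for x in row] for row in rows]


def impute_item_mean(rows):
    """Replace each missing answer with the mean of the observed answers to that item."""
    columns = list(zip(*rows))
    item_means = [mean(x for x in col if x is not None) for col in columns]
    return [
        [item_means[j] if x is None else x for j, x in enumerate(row)]
        for row in rows
    ]


def impute_person_mean(rows):
    """Replace each missing answer with the mean of that participant's observed answers."""
    filled = []
    for row in rows:
        person_mean = mean(x for x in row if x is not None)
        filled.append([person_mean if x is None else x for x in row])
    return filled


if __name__ == "__main__":
    # Hypothetical toy data: 5 participants, 4 yes/no items, None = missing.
    data = [
        [1, 1, 1, None],
        [1, 0, 1, 1],
        [0, 0, None, 0],
        [1, 1, 1, 1],
        [0, None, 0, 0],
    ]
    for name, method in [
        ("zero fill", impute_zero),
        ("item mean", impute_item_mean),
        ("participant mean", impute_person_mean),
    ]:
        print(name, cronbach_alpha(method(data)))
```

Running the script prints one α value per strategy for the toy table, which shows how the choice of replacement method alone can shift the coefficient. It does not reproduce the simulations reported in the study.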