Type I: families, planning and errors

Gordon B. Drummond, Sarah L. Vowler
Advances in Physiology Education, 2012 · doi:10.1152/advan.00118.2012 · pmid:23209005
• Comparisons propose no difference, and then ask "How probable?"
• Misclassification is inevitable from time to time: false conclusions result
• Families of observations are best tested only once
• The more comparisons, the more likely is misclassification
• For several comparisons in one family, test criteria should be more stringent

Scientists frequently want to answer the question "has this treatment had an effect?" Most are unaware that the tests they usually use do not directly address this question.
These tests usually pose a different question, based on the possibility that nothing has happened. The question becomes: "how probable are these data, if there were NO difference between the original populations from which the data have been randomly drawn?" (In fact, for most laboratory experiments this supposition is patently false: the experiment has been conducted on a pre-ordained sample, possibly randomly divided into treatment and control groups, but certainly not randomly sampled.)

However, if we continue with the usual analysis, we have to assume that we have random samples from the same population. Such samples will always differ to some extent. Occasionally, the difference might be substantial, large enough to suspect that they might not have come from the same source population. The usual context in which we use this test is that the data are already "under suspicion": we usually don't want to believe the null hypothesis at all, and we are testing to see if the data are unlikely to be consistent with this hypothesis.

To assess how "suspicious" our results can be, we estimate how frequently we might obtain results like ours:
• if the "null hypothesis" were true,
• if we were to repeatedly sample the population, and
• if the results were simply the workings of chance.

Generally, we reject the null hypothesis if chance alone could yield data like ours less than 1 time in 20 (i.e., P < 0.05), an arbitrary and probably unnecessarily inflexible value (6). We then believe our suspicions are justified, and we accept the alternative hypothesis: the samples are not from the same population. We rarely employ the same cautious vocabulary as the statistician, who might qualify this interpretation. The researcher wrongly takes a probability of 0.05 (i.e., 5%, or 1 in 20) to indicate that the null hypothesis is false.
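A minimal simulation sketch of the two ideas above, assuming NumPy and SciPy are available (the loop structure, sample sizes, and variable names here are illustrative choices, not taken from the article): first, even when two samples really do come from the same population, about 1 test in 20 will be "significant" at P < 0.05; second, the chance of at least one false positive grows quickly with the number of comparisons in a family, which is why more stringent criteria are needed.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)  # seed chosen arbitrarily, for reproducibility

n_experiments = 10_000
alpha = 0.05
n_per_group = 10

# (1) Repeatedly draw two samples from the SAME population and test them.
# Every "significant" result here is, by construction, a false positive.
false_positives = 0
for _ in range(n_experiments):
    a = rng.normal(loc=0.0, scale=1.0, size=n_per_group)
    b = rng.normal(loc=0.0, scale=1.0, size=n_per_group)
    _, p = stats.ttest_ind(a, b)
    if p < alpha:
        false_positives += 1
print(f"Type I error rate for a single test: {false_positives / n_experiments:.3f}")
# Expect a value close to 0.05, i.e. roughly 1 false conclusion in 20 tests.

# (2) Family-wise error: for k independent comparisons, the probability of
# at least one false positive is 1 - (1 - alpha)**k.
for k in (1, 3, 5, 10, 20):
    fwer = 1 - (1 - alpha) ** k
    print(f"{k:2d} comparisons: P(at least one false positive) = {fwer:.2f}")
# With 20 comparisons the family-wise rate is about 0.64, so misclassification
# is more likely than not. One common remedy (a Bonferroni-style adjustment)
# tests each comparison at alpha / k to hold the family-wise rate near alpha.
```

Running the script shows the single-test rate settling near 0.05, while the family-wise figures illustrate the key point above: the more comparisons in a family, the more likely a false conclusion, unless each individual test criterion is made more stringent.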