Another post aimed mostly at myself for future reference, but feel free to read along! This is a review of The Cult of Statistical Significance: How the Standard Error Costs Us Jobs, Justice, and Lives by Stephen T. Ziliak and Deidre N. McCloskey. To begin, I admit that I could not even finish the introductory chapter, and do not foresee myself doing so in the future. If you read my previous review post of Andrew Hartley’s Christian and Humanist Foundations for Statistical Inference, you may have caught my irritation with the accusation that statisticians confuse the probability of the data given an assumed hypothesis (the P-value) with the probability that the hypothesis is true given the data. Since it comes up again in this text, let me elaborate a little further on the accusation, and the correct thought process, from both frequentist and Bayesian perspectives.
First, in statistical inference, the goal is generally to decide between two competing hypotheses. These are different possible explanations for the underlying population from which the data come. We assume that one of these is true (sort of like the assumption of innocence in a courtroom), and look to find how likely our data are under this assumption. If the data are sufficiently unlikely given the assumed hypothesis, we generally conclude that this hypothesis is likely false, and that the alternative hypothesis is a more reasonable explanation.
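For future reference, here is a minimal sketch of that logic with a made-up coin example: assume the coin is fair, then ask how likely a result at least as lopsided as ours would be. The numbers (20 flips, 15 heads) and the function name `binomial_tail_p_value` are my own illustration, not anything from the book.

```python
from math import comb

def binomial_tail_p_value(n_trials, n_successes, null_prob):
    """One-sided P-value: P(X >= n_successes) when X ~ Binomial(n_trials, null_prob)."""
    return sum(
        comb(n_trials, k) * null_prob**k * (1 - null_prob) ** (n_trials - k)
        for k in range(n_successes, n_trials + 1)
    )

# Hypothetical data: 15 heads in 20 flips, assumed hypothesis "the coin is fair".
p_value = binomial_tail_p_value(n_trials=20, n_successes=15, null_prob=0.5)
print(f"{p_value:.4f}")  # 0.0207
```

A result this small says that data like ours would be rare if the coin were fair. It says nothing directly about the chance that the coin is fair.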
The authors here (Note: both are economists, not statisticians, by training) claim that instead of correctly interpreting a P-value of 0.03 as a 3% chance of seeing data at least this extreme if the assumed hypothesis is true, users read it as a 3% chance that the assumed hypothesis is correct. While I agree that this is an incorrect conclusion, I disagree with their assertion that we need a major revisiting of statistical thought based on this confusion among some statistical users. I have never heard a trained statistician make this mistake. I have heard of, and seen firsthand, users with little statistical training make this mistake frequently. How is this statistics’ fault?
Evaluating the correct interpretation depends on the view of statistics that you take. In a frequentist view, the assumed hypothesis is either correct, or it is not. Thus, while we don’t know the probability that it is correct, the only options are 1 (if it is true) and 0 (if it is not). There is no such thing as a 3% chance the hypothesis is correct, so the backwards interpretation is nonsense.
In a Bayesian view, the probability that the assumed hypothesis is correct depends on how likely it was to be correct before gathering data, how likely the data are assuming the hypothesis (the 0.03 above), and how likely the data are given the alternative hypothesis! Clearly, this is unlikely to be exactly the 0.03 found as part of the calculation.
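A rough sketch of that calculation using Bayes' rule is below. As in the paragraph above, it loosely treats the 0.03 as the likelihood of the data under the assumed hypothesis, which is a simplification. The prior values and the 0.30 likelihood under the alternative are invented purely for illustration, and `posterior_of_assumed` is a hypothetical helper name.

```python
def posterior_of_assumed(prior, lik_assumed, lik_alternative):
    """Posterior probability of the assumed hypothesis via Bayes' rule (two hypotheses)."""
    numerator = prior * lik_assumed
    return numerator / (numerator + (1 - prior) * lik_alternative)

# Illustrative inputs only: 0.03 from the text, 0.30 is an assumed value.
for prior in (0.5, 0.9):
    post = posterior_of_assumed(prior, lik_assumed=0.03, lik_alternative=0.30)
    print(f"prior={prior:.1f} -> posterior={post:.4f}")
# prior=0.5 -> posterior=0.0909
# prior=0.9 -> posterior=0.4737
```

Under these made-up inputs the posterior moves with the prior and with the alternative's likelihood, and in neither case does it equal 0.03.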
So neither of the major perspectives within statistics is likely to fall into this error. Perhaps the users that Ziliak and McCloskey come into contact with are confused, but this is not a reason to claim statistics is wrong; it is simply evidence of a greater need to rely on those of us who are trained to interpret things correctly.
The other major problem that Ziliak and McCloskey cite is the concern with significance rather than estimation. They claim that it is silly to ask whether a difference exists between two groups on average, rather than how big the difference might be. Once again, I wonder where these authors learned their statistics. All the introductory level courses I have taken or taught have stressed that there are two parts of an analysis, and both are important. We should evaluate whether there is a difference, and then, if the difference is statistically significant, we should estimate its size and evaluate whether it is practically significant.
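To make the two parts concrete, here is a small sketch that reports both a P-value and an interval estimate for a difference in means. It uses a large-sample normal approximation rather than a t-test, and the data are simulated with arbitrary settings of my own choosing (two groups of 5000, true means 100 and 101, standard deviation 15). The helper name `compare_means` is hypothetical.

```python
import random
from statistics import NormalDist, mean, variance

def compare_means(group_a, group_b, confidence=0.95):
    """Return (difference, confidence interval, two-sided P-value), normal approximation."""
    diff = mean(group_b) - mean(group_a)
    # Standard error of the difference between two independent sample means.
    se = (variance(group_a) / len(group_a) + variance(group_b) / len(group_b)) ** 0.5
    std_normal = NormalDist()
    p_value = 2 * (1 - std_normal.cdf(abs(diff / se)))
    margin = std_normal.inv_cdf(0.5 + confidence / 2) * se
    return diff, (diff - margin, diff + margin), p_value

# Simulated, illustrative data only.
random.seed(0)
group_a = [random.gauss(100.0, 15.0) for _ in range(5000)]
group_b = [random.gauss(101.0, 15.0) for _ in range(5000)]

diff, ci, p_value = compare_means(group_a, group_b)
print(f"P-value: {p_value:.4f}")
print(f"estimated difference: {diff:.2f}, 95% CI: ({ci[0]:.2f}, {ci[1]:.2f})")
```

The P-value addresses whether a difference exists; the interval addresses how big it plausibly is. Whether a difference of about one point matters in practice is a judgment the numbers alone cannot make.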
When a book starts off griping about problems that statisticians have long been warning users of our methods about, it is hard for me to feel motivated to keep reading. Would an economist read through a book about poor economic theory if it were written by a biologist? Sure, the biologist may have some knowledge or experience, but, as an outsider, the biologist is unlikely to be taken seriously by those who are really experts in the field. This is part of why I abandoned the book, and don’t plan to go back.
Here’s my proposition for the authors: ask schools to require that students take more/better statistics courses to be better consumers of the statistical help that they need! When doing a study/publishing a paper involving the analysis of data, a statistician should always be involved to ensure that the methods are appropriately applied, and the conclusions are stated in legitimate ways. Bemoaning the poor application of statistics by making false and overstated accusations against the field itself doesn’t help anyone, in my opinion.
By the way, while I’m ranting, when I mention I teach statistics, don’t tell me you had “that course”, as if there is only one course. You wouldn’t say to someone who teaches Spanish, “I had that course,” so don’t do it to statisticians (or philosophers, historians, accountants, etc. for that matter).