I argued in this post the other day that what we need for predictive success is not so much big data as self-awareness. Hypothesis testing ought to help us revise our point of view.
Silver rightly emphasizes that prediction is as much a means as an end:
I keep talking about the importance of purpose, for example here about the difference between maps and models.
The philosophy of this book is that prediction is as much a means as an end. Prediction serves a very central role in hypothesis testing, for instance, and therefore in all of science. As the statistician George E. P. Box wrote, “All models are wrong, but some models are useful.” What he meant by that is that all models are simplifications of the universe, as they must necessarily be. As another mathematician said, “The best model of a cat is a cat.” Everything else is leaving out some sort of detail. How pertinent that detail might be will depend on exactly what problem we’re trying to solve and on how precise an answer we require.
The potential pitfalls mean we have to know ourselves, says Silver:
This is why it is so crucial to develop a better understanding of ourselves, and the way we distort and interpret the signals we receive, if we want to make better predictions.
Frequentism

However, much recent statistics has run well and truly off the rails by assuming that error arises from our measurements rather than our perception or judgement. Silver criticizes simple-minded statistical "frequentism", which he says stems largely from the twentieth-century English statistician Ronald Fisher.
The idea is that you act as if you could repeat an experiment innumerable times: the more trials you run, the more accurate the measured frequencies become.
The idea behind frequentism is that uncertainty in a statistical problem results exclusively from collecting data among just a sample of the population rather than the whole population.
Essentially, the frequentist approach toward statistics seeks to wash its hands of the reason that predictions most often go wrong: human error. It views uncertainty as something intrinsic to the experiment rather than something intrinsic to our ability to understand the real world. The frequentist method also implies that, as you collect more data, your error will eventually approach zero: this will be both necessary and sufficient to solve any problems. Many of the more problematic areas of prediction in this book come from fields in which useful data is sparse, and it is indeed usually valuable to collect more of it. However, it is hardly a golden road to statistical perfection if you are not using it in a sensible way. As Ioannidis noted, the era of Big Data only seems to be worsening the problems of false positive findings in the research literature.
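Silver's caveat that more data is "hardly a golden road to statistical perfection" is easy to illustrate with a small simulation (my own sketch, not from the book; the numbers are purely hypothetical). Sampling noise does shrink roughly as 1/√n, but a systematic error in how we observe — the human element — survives any amount of data:

```python
import random

random.seed(42)

TRUE_VALUE = 10.0
BIAS = 0.5  # hypothetical systematic error in how we observe the world

def sample_mean(n):
    """Mean of n noisy observations, all carrying the same systematic bias."""
    return sum(TRUE_VALUE + BIAS + random.gauss(0, 2) for _ in range(n)) / n

for n in (10, 1000, 100000):
    est = sample_mean(n)
    print(f"n = {n:>6}: estimate {est:.3f}, error vs truth {abs(est - TRUE_VALUE):.3f}")
```

The estimate settles down ever more tightly as n grows — but around 10.5, not the true 10.0. Bigger samples buy precision, not correctness, which is exactly the error that frequentism "washes its hands of."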
Frequentism dominated statistics in the twentieth century. Fisher criticized Bayesian statistics (which we will come to in a moment) for being insufficiently objective. But, says Silver,
Plenty of investors have lost their shirts because their risk models assumed that market events follow a neat normal (or similar) distribution.
Nor is the frequentist method particularly objective, either in theory or in practice. Instead, it relies on a whole host of assumptions. It usually presumes that the underlying uncertainty in a measurement follows a bell-curve or normal distribution. This is often a good assumption, but not in the case of something like the variation in the stock market. The frequentist approach requires defining a sample population, something that is straightforward in the case of a political poll but which is largely arbitrary in many other practical applications. What “sample population” was the September 11 attack drawn from? The bigger problem, however, is that the frequentist methods—in striving for immaculate statistical procedures that can’t be contaminated by the researcher’s bias—keep him hermetically sealed off from the real world. These methods discourage the researcher from considering the underlying context or plausibility of his hypothesis, something that the Bayesian method demands in the form of a prior probability. Thus, you will see apparently serious papers published on how toads can predict earthquakes, or how big-box stores like Target beget racial hate groups, which apply frequentist tests to produce “statistically significant” (but manifestly ridiculous) findings.
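Silver's point about the bell-curve assumption is easy to check numerically. This sketch (my own illustration, not from the book) uses Python's standard library to ask how often a move of a given size should occur if daily market returns really were normally distributed:

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1
for sigma in (3, 4, 5, 6):
    # two-tailed probability of a move at least sigma standard deviations out
    p = 2 * (1 - std_normal.cdf(sigma))
    print(f"{sigma}-sigma move: about 1 day in {1 / p:,.0f}")
```

Under normality a 5-sigma day should turn up roughly once in a couple of million trading days, yet real markets produce moves of that size far more often — the October 1987 crash is commonly estimated at twenty-plus standard deviations. A model built on the wrong distribution quietly assigns near-zero probability to events that actually happen.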
Bayesian Probability

Instead, Silver strongly advocates the older Bayesian statistics. In essence, one must specify a prior probability of an outcome, based on one's current beliefs. Bayes' formula then specifies how you should alter that probability in response to incoming data and events, producing a posterior probability. It is about recognizing your current expectations and beliefs, and learning from new evidence.
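In symbols, Bayes' rule says P(H|D) = P(D|H)·P(H) / P(D). A minimal sketch of the prior-to-posterior update (the coin-bias numbers are mine, purely illustrative, not Silver's):

```python
def posterior(prior, p_data_given_h, p_data_given_not_h):
    """Bayes' rule: P(H|D) = P(D|H) * P(H) / P(D)."""
    numerator = p_data_given_h * prior
    evidence = numerator + p_data_given_not_h * (1 - prior)
    return numerator / evidence

# Illustrative numbers: prior belief that a coin is biased is 5%,
# a biased coin shows heads 75% of the time, a fair one 50%.
belief = 0.05
for _ in range(5):  # observe five heads in a row
    belief = posterior(belief, 0.75, 0.5)
print(round(belief, 3))  # 0.286 -- still modest, but nearly six times the prior
```

Each observation's posterior becomes the next prior, so the loop is exactly the test-and-revise discipline described above: state what you believe, confront it with evidence, update.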
We took a step backwards when frequentism arose.
The Bayesian viewpoint, instead, regards rationality as a probabilistic matter. In essence, Bayes and Price are telling Hume, don’t blame nature because you are too daft to understand it: if you step out of your skeptical shell and make some predictions about its behavior, perhaps you will get a little closer to the truth.
For me, the point about Bayesian probability (which I haven't ever used professionally) is not so much the math as a procedure that requires you to test and revise your beliefs in response to evidence. I think Silver overdoes Bayesian probability as THE answer, but his main target within his own intellectual world is clearly the frequentists. He is a statistician, so his emphasis on an alternative statistical tradition is understandable.
As we will see, science may have stumbled later when a different statistical paradigm, which deemphasized the role of prediction and tried to recast uncertainty as resulting from the errors of our measurements rather than the imperfections in our judgments, came to dominate in the twentieth century.
Incidentally, I am no statistician, but I have been intrigued in the past by Keynes' arguments in his A Treatise on Probability, though I'll leave that for another time.