Lies, they say, come in three varieties: lies, damn lies and statistics. Cynical though it may sound, that rings true for anyone whose profession revolves around numbers. Medicine, of course, is rife with statistics: the incidence and prevalence of diseases, the efficacy of therapies, optimal dosing regimens, adverse events, risk factors, survival rates. All are expressed in numeric terms, sometimes to two decimal places. Even quality of life, by its very name a qualitative measure, gets quantified in health care.
Do doctors rely too much on the statistics in medical literature? The people in charge of medicine and the people in charge of statistics think so. An editorial in the Journal of the American Medical Association, echoing the American Statistical Association, argues that probabilities (expressed by the P value) “are misinterpreted, overtrusted and misused.”1
Actually, it’s fairer to say that the problem isn’t so much reliance on statistics as the way in which we ascribe validity to medical ‘facts’ that are expressed numerically. The two groups seek to tighten that up by advocating for “lowering the routine P value threshold for claiming statistical significance from 0.05 to 0.005 for new discoveries.” That means the chance of a study flagging an association as significant when no real effect exists would drop from 5% to 0.5%, raising the bar for what gets called significant. This would “shift about one-third of the statistically significant results of past biomedical literature to the category of just ‘suggestive,’” the editorial states.
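To make the arithmetic concrete, here is a minimal simulation sketch. It is not taken from the editorial: the two-arm study design, the sample size of 30 per arm, the known standard deviation of 1 and the function names are all assumptions chosen for illustration. Every simulated study compares two groups drawn from the same distribution, so any ‘significant’ result is a false alarm.

```python
import math
import random


def two_sided_p_value(z):
    """Two-sided P value for a standard normal test statistic z."""
    return math.erfc(abs(z) / math.sqrt(2))


def simulate_null_p_values(n_studies=20_000, n_per_arm=30, seed=0):
    """P values from hypothetical two-arm studies in which the treatment does nothing."""
    rng = random.Random(seed)
    # Assumption: both arms have a known standard deviation of 1,
    # so the difference in means has standard error sqrt(2 / n).
    standard_error = math.sqrt(2 / n_per_arm)
    p_values = []
    for _ in range(n_studies):
        treated = sum(rng.gauss(0, 1) for _ in range(n_per_arm)) / n_per_arm
        control = sum(rng.gauss(0, 1) for _ in range(n_per_arm)) / n_per_arm
        z = (treated - control) / standard_error
        p_values.append(two_sided_p_value(z))
    return p_values


def fraction_significant(p_values, threshold):
    """Share of studies whose P value falls below the threshold."""
    return sum(p < threshold for p in p_values) / len(p_values)


p_values = simulate_null_p_values()
for threshold in (0.05, 0.005):
    share = fraction_significant(p_values, threshold)
    print(f"threshold {threshold}: {share:.4f} of null studies called significant")
```

Under these assumptions, roughly one null study in 20 should clear the 0.05 bar and roughly one in 200 should clear 0.005. Note what the sketch does not show: the share of published significant findings that are false, which also depends on how many tested hypotheses are true and on how well powered the studies are.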
Poof, one-third of your ‘facts’ just went up in smoke.
It’s a scary proposition, but may be for the best. Such a move would cut a lot of noise out of the conversation. Too many studies using P values in the 0.05 range are quoted and promoted as gospel, in eye care and other disciplines. “Most claims supported with P values slightly below 0.05 are probably false (i.e., the claimed associations and treatment effects do not exist),” the editorial states. “Even among those claims that are true, few are worth acting on in medicine and health care.”
And yet, many of those claims are the lifeblood of medical practice today. Health care would be better off, says the editorial, “with fewer, larger, and more carefully conceived and designed studies with sufficient power to pass these more demanding thresholds.”
Of course, this would be no panacea. Those with a vested interest, most notably for-profit entities more interested in finding a marketing angle than in the purity of truth, could just move the goalposts. “Selected study end points may become even less clinically relevant because it is easier to reach lower P values with weak surrogate end points than with hard clinical outcomes,” JAMA warns.
It’s not easy to think of scientific validity as a tunable instrument: loosen the threshold and certainty goes down, tighten it and certainty goes up. But this proposal reminds us that our facts are only as ‘real’ as our methods, and our motives, demand.
1. Ioannidis JP. The proposal to lower P value thresholds to .005. JAMA. March 22, 2018. [Epub ahead of print].