Bias in the evidence base

From The British Psychological Society’s Research Digest:

In the last few years the social sciences, including psychology, have been taking a good look at themselves. While incidences of fraud hit the headlines, pervasive issues are just as important to address, such as publication bias, the phenomenon where non-significant results never see the light of day thanks to editors rejecting them or savvy researchers recasting their experiments around unexpected results and not reporting the disappointments. Statistical research has shown the extent of this misrepresentation in pockets of social science, such as specific journals, but a new meta-analysis suggests that the problem may infect the entire discipline of psychology.

A team of psychologists based in Salzburg looked at “effect sizes”, which provide a measure of how much experimental variables actually change an outcome. The researchers randomly sampled the PsycINFO database to collect 1000 psychology articles across the discipline published in 2007, and then winnowed the list down to 395 by focusing only on those that used quantitative data to test hypotheses. For each main finding, the researchers extracted or calculated the effect size.

. . .

The authors, led by Anton Kühberger, argue that the literature is thin on modest effect sizes thanks to the non-publication of non-significant findings (rejection by journals would be especially plausible for non-significant smaller studies), and the over-representation of spurious large effects, due to researchers retrospectively constructing their papers around surprising effects that were only stumbled across thanks to inventive statistical methods.

Read the rest here.
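
To make the mechanics of the argument above concrete: an effect size is just a standardized measure of how large a difference or relationship is, separate from whether it is statistically significant. The toy simulation below is my own sketch, not the Salzburg team’s analysis; the true effect, the sample size, and the “publish only if p < .05” rule are assumptions chosen purely for illustration. It shows how publishing only the significant results from small studies of a weak true effect inflates the effect sizes that end up in the literature.

```python
# Toy illustration of publication bias inflating published effect sizes.
# A hedged sketch, not the Kühberger et al. method: the true effect, sample
# size, and "publish only if significant" rule are assumptions.
import math
import random
import statistics

random.seed(1)
TRUE_EFFECT = 0.2    # small true standardized effect
N_PER_GROUP = 20     # small studies
N_STUDIES = 5000

def one_study():
    """Run one two-group study; return its observed effect size (Cohen's d)
    and whether it would count as statistically significant."""
    treatment = [random.gauss(TRUE_EFFECT, 1) for _ in range(N_PER_GROUP)]
    control = [random.gauss(0, 1) for _ in range(N_PER_GROUP)]
    mean_diff = statistics.mean(treatment) - statistics.mean(control)
    pooled_sd = math.sqrt((statistics.stdev(treatment) ** 2 +
                           statistics.stdev(control) ** 2) / 2)
    d = mean_diff / pooled_sd
    # Crude z test assuming known unit variance -- good enough for illustration.
    z = mean_diff / math.sqrt(2 / N_PER_GROUP)
    return d, abs(z) > 1.96

results = [one_study() for _ in range(N_STUDIES)]
all_effects = [d for d, _ in results]
published = [d for d, significant in results if significant]

print("mean effect size, all studies:        ", round(statistics.mean(all_effects), 2))
print("mean effect size, 'published' subset: ", round(statistics.mean(published), 2))
```

With a weak true effect and small samples, only the studies that happen to overshoot clear the significance bar, so the “published” subset reports a noticeably larger average effect than the full set of studies, which is the thinning-out of modest effects and over-representation of large ones described above.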

Hard to kill


Nature has a new article on the troubling shelf life of bad psychology research:

Positive results in psychology can behave like rumours: easy to release but hard to dispel. They dominate most journals, which strive to present new, exciting research. Meanwhile, attempts to replicate those studies, especially when the findings are negative, go unpublished, languishing in personal file drawers or circulating in conversations around the water cooler.

Psychology is not alone in facing these problems. In a now-famous paper, John Ioannidis, an epidemiologist currently at Stanford School of Medicine in California, argued that “most published research findings are false”, according to statistical logic. In a survey of 4,600 studies from across the sciences, Daniele Fanelli, a social scientist at the University of Edinburgh, UK, found that the proportion of positive results rose by more than 22% between 1990 and 2007. Psychology and psychiatry, according to other work by Fanelli, are the worst offenders: they are five times more likely to report a positive result than are the space sciences, which are at the other end of the spectrum.

The limits of empiricism

While listening to On Point last week, I was struck by an exchange on a show that focused on Charles Murray’s new book. I have no interest in arguing the merits of his thesis here, but he believes that, for a variety of reasons, America has been dividing by class, and he is profoundly concerned about the implications. In one segment he expresses concern that one result is a growing concentration of the smartest people in the elite class and, by extension, a growing concentration of the least smart people in the lower classes. The host and the other guest push back against what they hear as genetic determinism. Exasperated, Murray says, “There’s a statistical relationship between parental IQ and child IQ… on average, parents with high IQs will produce offspring with higher IQs than parents with lower IQs… It’s a fact!… I’m talking about an empirical relationship that is not contestable!”

I have no interest in entering this debate on this blog, but I think the exchange offers a chance to step outside of the debates in our field.

Murray’s insistence that he was simply reporting a data point shows how blind we can be to our own narratives. He seems only vaguely aware that he has already attributed meaning to the data point: its source, its implications, its importance, and its characteristics (that it is fixed rather than malleable, that genetic determinants are powerful and important in comparison to other determinants, etc.).

The host and the other guest were so troubled by the meaning Murray ascribed to it that all of their responses focused on that meaning, and they never really responded to the data point itself.

It seems like a lot of drug policy debates follow a very similar pattern. I find myself frustrated with people who argue that their position is empirically based, as though the meaning they derive from their facts is self-evident, as though they hold the only rational understanding, and as though their conclusions are value-free.

In turn, I could do a better job of responding to their data and concerns.