Just One-Third of Published Psychology is Reliable

A team of 270 researchers has now published in Science the findings from its “Reproducibility Project” (an attempt to replicate the findings in published psychology papers), and the results are dismal. Nina Strohminger (Yale) and Elizabeth Gilbert (Virginia) discuss the findings in an essay at The Conversation:

Almost all of the original published studies (97%) had statistically significant results. This is as you’d expect – while many experiments fail to uncover meaningful results, scientists tend only to publish the ones that do.

What we found is that when these 100 studies were run by other researchers, however, only 36% reached statistical significance. This number is alarmingly low. Put another way, only around one-third of the rerun studies came out with the same results that were found the first time around. That rate is especially low when you consider that, once published, findings tend to be held as gospel.

The bad news doesn’t end there. Even when the new study found evidence for the existence of the original finding, the magnitude of the effect was much smaller — half the size of the original, on average.

If you are curious about how particular original studies fared, you can look them up at the Open Science Framework website, where the full paper is also available.

Strohminger and Gilbert warn against jumping to conclusions:

One caveat: just because something fails to replicate doesn’t mean it isn’t true. Some of these failures could be due to luck, or poor execution, or an incomplete understanding of the circumstances needed to show the effect (scientists call these “moderators” or “boundary conditions”). For example, having someone practice a task repeatedly might improve their memory, but only if they didn’t know the task well to begin with. In a way, what these replications (and failed replications) serve to do is highlight the inherent uncertainty of any single study – original or new.

In conversation, Strohminger drew attention to the difference in replication rates between cognitive psychology and social psychology. Around 53% of the cognitive psychology studies were successfully replicated, but only around 28% of the social psychology studies were, which Strohminger suspects “has a lot to do with greater human variance in social than cognitive psychology.”

I did not go through the individual studies, but if you notice any that have been used by philosophers, or are of relevance to particular philosophical questions, please draw attention to them in the comments.

(And don’t get smug; imagine how low the rates would be in philosophy!)

[Image: Reproducibility Project, Figure 3]

