There’s a Problem with Many ‘Replication’ Studies in Economics. Here’s a Simple Fix.

April 09, 2015

In Milan Kundera’s The Unbearable Lightness of Being, Sabina and Franz are doomed lovers. Kundera traces the demise of their relationship to a disagreement about what words mean. Sabina and Franz never realize that they mean different things when they say simple words—like “woman” and “truth.”

In development economics, a series of small wars have recently erupted between research colleagues. They involve disagreements over “replication” studies—studies that revisit earlier empirical research but arrive at different results. Similar battles have played out across the economics discipline.

Part of these conflicts arise from the fact that different people mean different things when they say the word replication. That can be fixed, and I propose a way to do it in a new research paper.

Replication lies in the bedrock of science. The pathological science underlying “cold fusion” was uncovered after one lab claimed to generate boundless energy in a cheap, low-temperature chemical reaction but other labs could not replicate it. Thus in economics, when others learn that a famous result could not be “replicated,” they wonder about the original researchers—thinking at best of carelessness or freak chance, and at worst of deliberate fraud.

But often, studies that are perceived as attempting to replicate others’ work, and failing, are doing nothing of the kind. Rather, they get different results because they are asking fundamentally different questions, with different methods or different data. Again and again, the original authors have protested that the critique of their work got different results by construction, not because anything was objectively incorrect about the original work. (See Berkeley’s Ted Miguel et al. here, Oxford’s Stefan Dercon et al. here, and Princeton’s Angus Deaton here, among many others. Chris Blattman at Columbia and Berk Özler at the World Bank have weighed in on some of these controversies.)

Some of these rancorous, years-long battles would simply evaporate if social science had a single, clear definition of what is and what is not a replication. Then it would be clear when a follow-up study can claim that it fails to replicate an earlier result. But neither economics nor social science in general has such a definition.

In my paper I put forward a modest fix: an unambiguous definition of what a replication test is. The current literature is hopelessly confused—I document 41 different and conflicting definitions!—and the definition I propose is one way to solve the problem. I also classify numerous recent critique papers, arguing that about two thirds of them are not replications by my definition.

It’s a modest step. But research colleagues, like Kundera’s lovers, make more progress when they know what each other are saying.

Disclaimer

CGD blog posts reflect the views of the authors, drawing on prior research and experience in their areas of expertise. CGD is a nonpartisan, independent organization and does not take institutional positions.