Closing the Evaluation Gap: Q&A with Ruth Levine

January 14, 2010

Each year, donors spend more than $30 billion and developing countries spend hundreds of billions more on programs to improve health, education and other social outcomes. But few programs are evaluated to learn whether they make a difference in people's lives. This shortfall in evaluation wastes money and means that many decisions about social sector spending are made on political grounds.

CGD's Evaluation Gap Initiative aims to address this problem by highlighting the need for more and better impact evaluations, and proposing ways to increase the supply of knowledge about “what works.” Ruth Levine, CGD director of programs and a co-author of CGD’s Evaluation Gap Working Group draft report, recently traveled to Mexico to hear the views of senior Latin American officials on closing the evaluation gap.


(Gonzalo Hernandez, head of Mexico's National Council for the Evaluation of Social Programs, confers with CGD’s Ruth Levine.)

Q: What is the single most striking aspect of your discussions in Mexico?

A: Two points stand out. First, I learned about the very impressive impact evaluation work being undertaken in Mexico, Chile and Argentina. Mexico is a real pioneer in this area. The legislature has mandated that impact evaluations be conducted, and the country is developing a track record of looking carefully at very important questions: how well is public spending reaching the poor; how are services being utilized; and--remarkably--what are the real-world effects on measures such as child nutrition, school completion and household income? They are taking a strategic view: what do we need to know so that big anti-poverty, food supplement, housing and other programs work better in the future? Second, I was struck by how the champions for good impact evaluation must fight daily battles--the same battles being fought by those who work on evaluation within development agencies and NGOs. Budgets are inadequate; it's hard to connect with and learn from technical colleagues outside the country; program managers feel threatened because they think of evaluation as a sort of policing function; the media focuses on the "bad news"; and evaluation results--whether positive or negative--are sometimes discredited by being labeled as part of a political agenda.

Q: Who did you meet?

A: CGD co-hosted the meeting with SEDESOL, Mexico's federal agency that manages major social programs like Oportunidades, to get feedback on the ideas generated by the Evaluation Gap Working Group. We met with the officials who lead and run the evaluation office of SEDESOL, as well as individuals who have been active in the design and evaluation of social programs in Argentina and Chile, and several people from research institutes, NGOs, USAID and philanthropic foundations.

Q: Do you see demand for better impact evaluation coming from other quarters?

A: I think the right question is not whether there's demand for evaluation, but whether there's demand for knowledge. From what we've seen and heard, there's lots of demand for genuine, credible knowledge about what works. Politicians want to hold ministries accountable; program designers want to learn from others; even widely dispersed “beneficiaries” have a stake in knowing what governments and donor agencies have actually accomplished, and how programs can be improved. While it's naïve to think that there could or should ever be a mechanistic application of evaluation results in the highly political domain of social sector spending, we are in an era of greater access to information, movement toward more evidence-based policy making, and more demands for accountability and transparency. This is the right moment for a big push.

But knowledge is a public good--a global public good, in fact--so individual countries, programs, and agencies don't have the incentives to put in the resources for adequate evaluation. The “knowledge agenda” is also hampered by the political, bureaucratic and technical difficulties of conducting the evaluations that generate that knowledge. We need to figure out how to get those who want the knowledge to recognize that the way to get it is by working with others to support impact evaluations.

Q: Is impact evaluation the only kind of evaluation that matters?

A: Absolutely not. I think about evaluation questions in terms of "are we doing things right?" and "are we doing the right things?" It's clearly essential to look at that first question: to understand the complex social processes affecting the implementation of a program, and to look closely at how well or haltingly a given program is rolled out. It really does matter whether funds are being disbursed smoothly, people are being hired and retained, schools are being built and equipped. All of that is absolutely vital to feed back to program managers and designers, so that adjustments can be made. And I think for that sort of evaluation, it’s important to have a close link between evaluation and implementation to promote real-time learning.

But at the same time, we have to be thinking about whether we are doing the right things. When we build the schools, train the teachers, and introduce an innovation like computers in the classroom, are children going to school, staying in school and learning more than they would otherwise? And the way to figure that out is with impact evaluation.

Q: What is the biggest obstacle you see to improving social sector impact evaluations?

A: I think the biggest obstacle is that the "doing it" and the "learning whether it works" functions of organizations like social sector ministries, NGOs and development agencies have a hard time co-existing. To "do"--that is, to design a grant, to convince a government to take a loan, to implement a program--often requires being convinced that your approach is the best one. "I'm testing an idea" is not nearly as compelling as "This is going to work." In contrast, learning about whether the program is achieving the anticipated impact requires distance from the doing; you need to be able to look in an impartial way at what actually happened. So far, each organization has tried to solve this problem by setting up separate and sometimes explicitly independent evaluation units. But it is inevitable that, except under extraordinarily visionary leadership, the organization's priority will go toward the doing, leaving the learning function undervalued, under-resourced and, sometimes, undermined.

These are problems that can be solved if we get out of the mode of thinking about each organization as an isolated unit. If national governments, NGOs and donor agencies share a demand for knowledge, then it seems quite possible for them to participate in and benefit from a collective approach to generate that knowledge.

Q: What are the next steps for this initiative?

A: We are continuing to consult with a broad set of individuals about why there is a relatively weak base of evidence in the social sectors, and what can be done about it. We are particularly seeking feedback on the idea of establishing an independent international facility to provide flexible funding to support evaluation opportunities, to collaborate with countries and agencies in building an agenda of learning around some of the enduring questions in international development, and to widely share impact evaluation methods and findings. The next developing country consultation is in India in early April, at a meeting co-hosted by Rajat Gupta, Senior Partner Worldwide of McKinsey and Company, and Suman Bery, Director-General of the National Council for Applied Economic Research.

We are getting very thoughtful and constructive input from these discussions, as well as meeting champions in the field, which is really inspiring. We will be finalizing the report of the working group in the next couple of months, with specific, practical recommendations for the international community. I am optimistic that we can achieve a genuine breakthrough.
