Frequently Asked Questions

What is the "Evaluation Gap"?
Aren't many programs funded by bilateral or multilateral development agencies evaluated?
Why do impact evaluations have to compare what happened with the program to what would have happened without the program? Isn’t "before-after" enough?
What methods can be used to make these comparisons?
Isn't this experimentation? Is that ethical?
Are impact evaluations expensive?
How are evaluation results used?
Why does the evaluation gap persist?
What does it mean for an evaluation to be "independent"?
What should the international community do?
How would an initiative focusing on impact evaluation differ from other initiatives?
How would this initiative assist developing country policy makers?
Why would governments, agencies and foundations want to join?
How would a new initiative on impact evaluation make a difference?
Why is CGD working on this?
What do we envision the role will be for NGOs?
Who will undertake impact evaluations?

1. What is the "Evaluation Gap"?

Looking at health, education and other social development concerns in low- and middle-income countries, we have found important achievements in identifying and describing the nature of social problems and the populations affected, documenting the resources going into social programs, and linking those resources to services that are produced. But when we ask whether programs have made a difference – whether they have really met their aims – we find relatively little reliable information or evidence. To understand the effectiveness in addressing social problems of both government and privately operated programs, impact evaluations are required. Impact evaluations measure the effects that can be attributed to a particular social development program after controlling for other factors that might otherwise account for changes observed in the population. The "Evaluation Gap" refers to the missing body of impact evaluation work that is required to guide future social development policy.

2. Aren't many programs funded by bilateral or multilateral development agencies evaluated?

Many evaluations are done, but most of them focus on the outputs of the programs, and the process of program implementation. For example, an education program evaluation might document the number of schools built and textbooks purchased, or the number of teachers provided with in-service training. It might also highlight bottlenecks in procurement of goods and consulting services, issues related to the sustainability of the investments, and other important features of the program’s operation. But those evaluations, which can be very useful for some purposes, typically do not answer the question of whether the original “theory of change” underlying the program was borne out: For example, did the greater number of schools, textbooks and trained teachers translate into higher enrollment and better learning outcomes?

3. Why do impact evaluations have to compare what happened with the program to what would have happened without the program? Isn’t "before-after" enough?

Health, education and other conditions can change for many reasons; some are well known to us and can be controlled, but many are not. If we look only at what happened to people who participated in a particular program, then we cannot be sure that the observed changes in their health, education or economic status were the result of the program; they might have been the result of something else altogether. Evaluations that do not make proper comparisons risk showing that a particular program was successful when it wasn’t – or that it failed when it actually succeeded. A teacher training program might look successful when student attendance increases, but the improved attendance may be the result of migration, rising family income, or changes in labor markets. Comparing attendance in schools with the trained teachers to comparable schools without the program would give a better indication of the program’s impact. Similarly, an HIV prevention program might look like a failure if the disease’s incidence continues to rise, yet the program cannot be expected to immediately reverse the epidemic. Comparing the rate at which the disease is spreading in groups that were reached by a new prevention program to the rate in groups that weren’t reached might demonstrate that the intervention was successful because it slowed down the rate of infection. In most cases, it is not possible to draw valid inferences about population impact from before-and-after data alone.
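The teacher training example can be made concrete with a little arithmetic. The sketch below is illustrative only: the attendance figures and the function names are invented for this example and do not come from any actual evaluation. It shows how a before-after change can overstate a program's impact when attendance was also rising in comparable schools without the program.

```python
# Hypothetical attendance rates (percent). All numbers are invented for illustration.
program_before, program_after = 70.0, 80.0        # schools with trained teachers
comparison_before, comparison_after = 70.0, 78.0  # comparable schools without the program


def before_after_change(before, after):
    """Naive estimate: the change observed among participating schools only."""
    return after - before


def with_without_estimate(prog_before, prog_after, comp_before, comp_after):
    """Change in program schools minus the change in comparison schools."""
    return (prog_after - prog_before) - (comp_after - comp_before)


naive = before_after_change(program_before, program_after)
adjusted = with_without_estimate(
    program_before, program_after, comparison_before, comparison_after
)

print(f"Before-after change in program schools: {naive:.1f} points")
print(f"Change net of comparison schools: {adjusted:.1f} points")
```

With these made-up numbers, most of the apparent gain also shows up in the comparison schools, so attributing the whole before-after change to the program would be misleading. The adjusted figure is only as good as the assumption that the comparison schools really are comparable.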

4. What methods can be used to make these comparisons?

In general terms, there are two ways of making these comparisons. One way is to start by randomly determining which individuals, households or communities will be offered a particular program. If data is collected before and some period after the program has been implemented, this strategy permits comparisons of those who participated and those who did not. And if the random assignment is done properly, the resulting comparisons are quite robust. The other way is to use statistical methods to create, in effect, an artificial comparison group. Some examples of the statistical methods in use are “differences-in-differences,” “propensity-score matching,” and “instrumental variables.” It is still necessary to have data that was collected “before and after,” but sometimes existing household surveys or other sources of information that are already available can be used. When random assignment methods have been compared to other methods, they are consistently shown to provide the most reliable measurements of program impacts at the individual, household or community level, but they aren't always feasible.
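A small simulation may help show why random assignment supports reliable comparisons. This is a minimal sketch under stated assumptions, not a description of any real study: the number of households, the outcome scale, the size of the program effect and the function name simulated_outcome are all hypothetical. Because a random draw decides who is offered the program, the two groups should be similar on average in everything except the offer, so the difference in their average outcomes estimates the program's effect.

```python
import random
from statistics import mean

TRUE_EFFECT = 5.0   # assumed program effect, invented for this simulation
N_UNITS = 10_000    # hypothetical number of households

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Randomly choose half of the households to be offered the program.
unit_ids = list(range(N_UNITS))
rng.shuffle(unit_ids)
offered = set(unit_ids[: N_UNITS // 2])


def simulated_outcome(is_offered):
    """Outcome = background level + random variation + program effect if offered."""
    background = 50.0 + rng.gauss(0.0, 10.0)
    return background + (TRUE_EFFECT if is_offered else 0.0)


outcomes = {unit: simulated_outcome(unit in offered) for unit in range(N_UNITS)}

# Compare average outcomes between the two randomly formed groups.
offered_mean = mean(outcomes[u] for u in offered)
not_offered_mean = mean(outcomes[u] for u in range(N_UNITS) if u not in offered)
estimated_effect = offered_mean - not_offered_mean

print(f"Estimated effect: {estimated_effect:.2f} (assumed true effect: {TRUE_EFFECT})")
```

In this simulated setting the estimate should land close to the assumed effect, with the remaining gap due to chance variation. Real evaluations face complications that the sketch ignores, such as people who are offered a program but do not take it up.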

5. Isn't this experimentation? Is that ethical?

Governments and NGOs often create programs that seek to change people’s behavior, yet these programs tend to be designed with little evidence that they are effective and free of harmful effects. That itself is "experimentation," but we only think of it as such – and we only learn from these programs – when an evaluation is built in. Randomly assigning people to participate or not in a program or, for other methods, collecting data from people who are not participating in a program, can be ethical when there is genuine uncertainty about the efficacy of a program, when resources are insufficient to allow everyone to participate, or when concerns about potentially negative effects have been raised. These evaluation methods, including random assignment, have become a normal part of medical research, with guidelines that specify when, how and under what conditions such “trials” can be done. As with treatments studied by medical researchers, there is an ethical imperative to evaluate social programs in order to know whether they are effective, whether they have unintended negative effects, and whether their costs are justified by their impact.

6. Are impact evaluations expensive?

Impact evaluations are usually more costly than other types of evaluation, but they can provide information that can be used to greatly increase the effectiveness of both public and private spending. The cost of a well-done impact evaluation needs to be weighed against the benefits of the knowledge that it will generate. Not doing impact evaluations is, in fact, very costly because social programs that are not designed on the basis of good evidence of “what works” are likely to be less than optimally effective.

7. How are evaluation results used?

The point of undertaking impact evaluations is to inform public policy and/or decision making so that future health, education and other programs are more effective than those in the past. Evaluation results can help decision makers focus resources on programs that are relatively effective and, importantly, can provide evidence to improve the performance of key programs that are not performing as well as might be hoped. Getting real-world benefits out of evaluation requires that those who are making decisions receive evaluation results in a timely manner and in a form that is both credible and understandable; it also requires that those decision makers – who are often politicians or political appointees – have a genuine interest in paying attention to the effectiveness of their policies, and are not simply using social programs for the purposes of getting votes or providing patronage employment. In other words, evaluations are most meaningful when there is good leadership and governance.

8. Why does the evaluation gap persist?

When new programs are put in place, decision-makers and project managers are focused on getting started and see few benefits to putting effort into designing and conducting studies. It is only once a program is being implemented or nearing completion that questions often arise as to what successes have been achieved, and by then it is often too late to collect the data necessary for a proper evaluation.

Other factors also inhibit the production of good impact evaluations. Some people argue that:

politicians and managers prefer not to seek information that might produce bad news;
the benefits of social sector programs are too difficult to measure;
unreliable studies are easier to produce than reliable ones and often can be made to sound just as credible;
people who make decisions regarding social development policies are only weakly accountable to the taxpayers and philanthropists who provide funds, so they are not under pressure to produce good information about the impact of their programs;
knowledge of impact evaluation and its value has not been widespread among cohorts of managers and decision-makers, but is becoming more so every day.

9. What does it mean for an evaluation to be "independent"?

It is a paradox that the individuals who know most about how a program operates are not in the best position to evaluate it. The program designers, implementers and funders often have a preconception about whether the program is achieving its aims, and may have a vested interest in the evaluation results; therefore, they may not be able to look at the evidence with an objective eye. Moreover, to the audience for evaluation results, the credibility of the findings is compromised when those who have "something to gain" are closely associated with the evaluation itself. The evaluations that have the biggest impact on future decisions, therefore, are likely to be those that are conducted by a third party. That said, the cooperation and involvement of the program designers, implementers and funders is essential in establishing the evaluation questions, because it is these parties who know the program’s original intent and the relevant policy questions. A well-designed evaluation also requires the cooperation of the program designers and implementers if it is to be completed successfully.

10. What should the international community do?

The CGD Evaluation Gap Working Group proposes that a pioneering group of governments, international agencies, and private foundations create a new entity that would focus on impact evaluations. The entity would lead the definition of shared questions that are of interest across countries and agencies; provide funds for the design of evaluations at the all-important moment when programs are being designed; develop quality standards for evaluation; disseminate the results of good evaluations; and undertake other core functions.

11. How would an initiative focusing on impact evaluation differ from other initiatives?

Current initiatives are achieving many things and are very worthwhile – whether making data and studies more accessible, training and increasing capacity to do impact evaluations, promoting high quality standards, synthesizing the existing literature, or beginning new studies. But these initiatives do not change the incentives faced by policymakers, program designers or project managers with regard to starting and sustaining impact evaluations. Creating a new entity would make it easier for high-level decision-makers to commit their governments and organizations to learning from impact evaluations. By providing timely funding, it would make it easier for project managers and designers to begin impact evaluations; and by providing external quality review, it would make it easier for researchers to conduct and publish reliable studies.

12. How would this initiative assist developing country policy makers?

Developing country policy makers would benefit in several ways. First, as more knowledge is generated about social development programs that work, policy makers will be able to rely on a more solid evidence base when making decisions. Second, they will be able to shape the debate over which questions are important enough to justify studying. Third, they can get external validation of impact evaluations that are conducted domestically, if they adhere to the initiative's external review process. Finally, they can strengthen their domestic capacity for conducting impact evaluations and designing programs on the basis of evidence.

13. Why would governments, agencies and foundations want to join?

High-level decision-makers would want their government or organization to be an active member of the initiative to:

leverage funds – because the institution could potentially access more funds than it contributed;
participate in a committee that would select “enduring questions” to guide requests for proposals;
participate in a committee that would identify potential subject programs for studies based on expected learning; and
comply with mandates from their stakeholders requiring implementation of results-oriented management, and demonstrate a genuine commitment toward evidence-based policymaking.

Those public program managers, agency staff and policymakers who have an interest in learning from impact evaluations would find that the existence of the initiative would lower the costs and barriers they face because it would:

provide short-term grants for exploring the feasibility of evaluating a program or collecting baseline data;
be a well-known source for longer-term funds dedicated to impact evaluation;
provide models for good impact evaluation design and implementation;
give external credibility, legitimacy and continuity to impact evaluation studies;
act as a link to experts and technical review processes.

14. How would a new initiative on impact evaluation make a difference?

A major new initiative to stimulate more and better impact evaluations will accelerate the process of learning what works in social development programs, and contribute to creating knowledge that can be tapped by developing country governments when deciding how to allocate public resources and design public policies. By showing the value of learning from reliable impact evaluations, such an initiative will also contribute to establishing an evaluation “culture” in government and non-governmental organizations concerned with social development programs, so that studying and learning from such experiences becomes an integral and automatic part of how programs are run.

15. Why is CGD working on this?

A major part of CGD's work is on the effectiveness of development assistance. One of the greatest obstacles to improving effectiveness is the lack of knowledge about the best ways to support health system development, education sector reform, microfinance, and other programs into which donors put large amounts of money. Over the long haul, the value of development assistance – and the willingness of taxpayers in wealthy countries to fund it – depends on documenting whether and under what circumstances programs achieve their aims. That knowledge can then be used to better focus resources in the future, and to report back to taxpayers about the benefits of their generosity. CGD has no institutional interest in implementing the recommendations developed by the Working Group, or in undertaking impact evaluations.

16. What do we envision the role will be for NGOs?

NGOs can participate in many different ways. They could join as members on a par with other institutions (the level of contributions is still to be worked out). They can submit their own proposals for program evaluations. They can collaborate with other organizations to make proposals to evaluate their own programs.

17. Who will undertake impact evaluations?

Research institutions, universities, NGOs, governments, development agencies and others will all continue to produce impact evaluations. If a new initiative is created, it would not conduct impact evaluations itself. Rather, it could offer grants to assist in the design of impact evaluations or, depending on decisions by stakeholders, it could have resources to finance impact evaluations with grants that would be awarded on an open, competitive basis.