
The use of foreign aid to support poor countries with inadequate implementation capacity and weak regulatory institutions has at times been described as “pouring money into a leaky bucket.” Given that there is seldom a quick fix for inadequate state capacity, aid programs can employ internal controls and monitoring mechanisms that increase effectiveness and value for money. This is part of the reason why aid organizations have in recent decades paid significant attention to monitoring and evaluation (M&E).

Against a backdrop of COVID-19 recovery, increasing demand for scarce resources and the upcoming replenishment of the World Bank’s International Development Association (IDA)—the bank’s main instrument for assisting the world’s poorest countries—the role of multilateral development banks is in the spotlight, and M&E is receiving more attention than ever.

As the leading multilateral development bank, the World Bank routinely embeds M&E in its lending operations around the globe. In this blog, I explore what makes a World Bank-funded project successful. I expand on previous analysis that looked at the difference good M&E makes, and examine whether project-specific M&E systems can compensate for ineffective state institutions in making a project successful. The analysis covers World Bank projects completed from 2009 to 2020.

M&E quality trumps government effectiveness across the board

The Independent Evaluation Group scores the design, implementation, and use of the M&E system of World Bank-funded projects on a four-point scale: high, substantial, modest, or negligible. On average, a project with a “substantial” rating for M&E quality is 38 percent more likely to attain a better final outcome than a project with a “modest” rating (controlling for per capita income, government effectiveness, political accountability, project duration, project size, and year fixed effects). A project with a “negligible” rating is 15 percent more likely to attain a worse outcome than a project with a “modest” rating (figure 1).

Government effectiveness (as measured by the quality of policy implementation and public service delivery), on the other hand, has a much weaker correlation with project outcomes than M&E quality does. A one-unit increase in the government effectiveness index (standardized to the same scale as the measure of M&E quality, and averaged over the years between the start and the end of the project) is associated with just a 6.5 percent increase in the probability of project success. In other words, good M&E in World Bank projects is a better predictor of success than government effectiveness.

In general, there is little complementarity between M&E quality and government effectiveness in predicting the success of World Bank projects. This suggests that project-specific M&E systems function fairly independently of government institutions. It also implies that a change in state capacity would do little to augment the effect of World Bank M&E, at least in the short run.

Figure 1. The relative importance of country-level and project-specific factors in predicting project success


Note: Estimates are based on ordered probit regression (N=2194). Estimation accounts for time fixed effects.
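For readers unfamiliar with the ordered probit model mentioned in the note, the sketch below illustrates the general mechanics: a latent index built from the predictors is compared against a set of cutpoints, and the standard normal distribution turns that comparison into a probability for each ordered outcome category. The coefficients, cutpoints, number of categories, and numeric coding of M&E ratings are hypothetical values chosen for illustration. They are not the estimates behind figure 1.

```python
import math


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def ordered_probit_probs(latent_index, cutpoints):
    """Return the probability of each ordered category, lowest first.

    cutpoints must be sorted in increasing order; k cutpoints
    define k + 1 ordered categories.
    """
    cdf_at_cuts = [normal_cdf(c - latent_index) for c in cutpoints]
    bounds = [0.0] + cdf_at_cuts + [1.0]
    return [upper - lower for lower, upper in zip(bounds, bounds[1:])]


# Hypothetical parameters, for illustration only (not the blog's estimates).
CUTPOINTS = [-1.0, 0.0, 1.0]   # three cutpoints -> four outcome categories
BETA_ME = 0.8                  # assumed weight on the M&E quality score
BETA_GOV = 0.1                 # assumed weight on government effectiveness


def latent(me_score, gov_effectiveness):
    """Linear latent index combining the two predictors."""
    return BETA_ME * me_score + BETA_GOV * gov_effectiveness


# Assumed coding: modest = 2, substantial = 3 on the four-point M&E scale.
probs_modest = ordered_probit_probs(latent(2, 0.0), CUTPOINTS)
probs_substantial = ordered_probit_probs(latent(3, 0.0), CUTPOINTS)

print("P(best outcome | modest M&E):     ", round(probs_modest[-1], 3))
print("P(best outcome | substantial M&E):", round(probs_substantial[-1], 3))
```

With a positive weight on M&E quality, moving from a “modest” to a “substantial” rating shifts probability mass toward the better outcome categories, which is the kind of comparison the figure summarizes.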

The importance of M&E quality in predicting project outcomes holds true across all major program areas, while government effectiveness has little predictive power for project success once M&E quality is accounted for, particularly in human development programs (education, health, and social protection).

As M&E quality has improved, so has project performance

Over the last decade, there has been a significant improvement in the M&E quality of World Bank lending programs. Raimondo (2016) defined a good M&E system as having “clear institutional setup and division of labor around monitoring and evaluation activities; simple monitoring and evaluation framework that is well aligned with clients’ existing monitoring and evaluation systems; good integration with operational tasks; and a system that can generate regular and timely reporting, and that is used during and after lending.” Between 2009 and 2020, the share of projects with a “high” or “substantial” rating for M&E increased by 28 percentage points (figure 2). Disaggregated across program areas, by far the largest improvement in M&E quality occurred in human development programs, where the share of projects with at least a “substantial” rating roughly tripled from 26 percent to 77 percent.
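Because the paragraph above mixes changes in percentage points with a relative (“tripled”) change, a small worked example may help keep the two apart. The sketch below uses only the human development figures quoted in the text (26 percent and 77 percent). The function names are hypothetical helpers introduced for illustration.

```python
def percentage_point_change(start_share, end_share):
    """Absolute change between two shares, in percentage points."""
    return end_share - start_share


def relative_change(start_share, end_share):
    """Ratio of the final share to the initial share."""
    return end_share / start_share


# Shares of human development projects rated at least "substantial" for M&E,
# as quoted in the text.
start, end = 26.0, 77.0

points = percentage_point_change(start, end)
ratio = relative_change(start, end)

print(f"Change: {points:.0f} percentage points, or {ratio:.2f} times the starting share")
```

The same improvement is a gain of 51 percentage points and a little under a threefold increase. The two framings describe one change on different scales.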

For most of the last decade, the rise in M&E quality has been matched by a corresponding improvement in project outcomes. From 2009 to 2020, the share of projects with a “highly satisfactory” or “satisfactory” rating increased by nearly 22 percentage points. The improvement in project outcomes seems to have more to do with a sheer increase in M&E capabilities than with better use of existing M&E systems to achieve superior outcomes. We can see this because the marginal contribution of M&E to project performance has not increased as much or as consistently as M&E quality.

Figure 2. Trends in project success, M&E quality, and contribution of M&E to outcomes


But having a good M&E system is no guarantee of project success. Over 16 percent of projects with high or substantial ratings for M&E ended up with a below-satisfactory outcome. The incidence of well-monitored projects that performed poorly is significantly correlated with the total amount of funding allocated to the project. This suggests that larger projects might require a lot more than a robust M&E system to succeed. The sectors most likely to have projects that performed poorly despite good monitoring are energy/extractives, education, and social protection/labor.

IDA projects are more likely to succeed, but their performance is less correlated with M&E quality

There are multiple types of lending agreements available to World Bank clients, including the two major ones: IDA, for the poorest countries, which qualify for interest-free loans and grants, and IBRD, for middle-income and credit-worthy low-income countries. In general, projects funded under IDA agreements are more likely to succeed (controlling for M&E quality, per capita income, government effectiveness, political accountability, project size, duration, and year fixed effects). This is good news, in light of the more pressing need to ensure aid effectiveness in IDA countries. However, despite the claim made in the current replenishment document that IDA has been a pioneer in results monitoring, M&E quality appears to have less power in predicting the success of IDA projects (figure 3).

Figure 3. Correlation between IDA status and project outcomes under different M&E scenarios


Note: Estimates are based on ordered probit regression (N=2194). Estimation also accounts for GDP per capita, government effectiveness, voice and accountability, project value, project duration, M&E quality, and time fixed effects.

The link between M&E quality and project success is encouraging enough to justify looking for more causal evidence on how much M&E contributes to improved performance. There may also be potential to leverage spillovers from World Bank project M&E systems to improve the capabilities of state institutions in client countries to perform their own monitoring and evaluation. This could take the form of systematic knowledge transfer or building institutional memory that can be tapped by public officials for future projects.

In the meantime, investing further in project-specific M&E in general, while strengthening the links between M&E and project performance in larger projects and IDA-funded operations, is likely to help the World Bank improve its effectiveness.

This blog benefited from helpful feedback from Susannah Hares and Justin Sandefur.

Disclaimer

CGD blog posts reflect the views of the authors, drawing on prior research and experience in their areas of expertise. CGD is a nonpartisan, independent organization and does not take institutional positions.