BLOG POST

Can the Overestimation of Vaccination Rates Be Better Explained?

December 19, 2008

My colleague Ruth Levine has recently posted on a new study by Stephen Lim and colleagues at the Institute for Health Metrics and Evaluation (IHME) of the discrepancies between vaccination rates from two sources: those obtained officially from governments and those estimated directly and unofficially from household surveys in the same countries.

I'm glad Ruth is emphasizing that the household surveys which the IHME authors use as a gold standard may in fact often underestimate coverage. The IHME authors themselves admit that the mothers whose reports of vaccination coverage are collected by the household surveys are fallible sources of information. Their analysis of inter-survey reliability in Webappendix 3 only shows that mothers are about 70% consistent with themselves, not that they are accurate.

I have one complaint about this paper and one suggestion for further investigation.

My complaint is that the methodology is complex and poorly described. While it seems straightforward to simply correlate two measures of national vaccination rates, the IHME authors are unsatisfied with the small number of observations available on both of these variables. So they interpolate, extrapolate, cross-pollinate, engage in unholy practices called "multiple imputation" and "bi-directional regression," and otherwise practice their statistical legerdemain in order to greatly expand the number of "observations." Perhaps because of the Lancet's requirements, the authors verbally describe all of these mathematical statistics without offering the reader a single equation on which to hang his hat. One is left wondering whether the same conclusions about overestimation of vaccination coverage could have been reached by simply comparing the actual observed numbers. And if not, why not?
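The "simple" baseline I have in mind would look something like the sketch below. The paired coverage figures here are invented placeholders, not numbers from the Lim et al. paper; the point is only that the direct comparison requires nothing more exotic than a correlation and a mean gap.

```python
import numpy as np

# Invented paired observations for a handful of countries: the
# officially reported vaccination coverage and the household-survey
# estimate (both in percent). Real values would come from the
# country-years where both measures actually exist.
official = np.array([85, 90, 78, 95, 88, 70, 92, 80], dtype=float)
survey = np.array([70, 82, 75, 80, 85, 65, 78, 72], dtype=float)

# The straightforward check: no imputation, no extrapolation.
r = np.corrcoef(official, survey)[0, 1]
mean_gap = float(np.mean(official - survey))

print(f"correlation: {r:.2f}")
print(f"mean gap (official - survey): {mean_gap:.1f} percentage points")
```

A positive mean gap on the raw paired observations would already point in the same direction as IHME's headline result, which is why one wonders what the heavier machinery adds.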
Perhaps the authors will publish other papers in statistical journals where the details of their methods will be exposed more transparently.

My suggestion is to offer an alternative approach to comparing the quality of the two separate estimates of vaccination rates. Since IHME's results are based on almost 200 separate surveys, there must be heterogeneity across the surveys in a variety of dimensions in addition to whether the country had a GAVI agreement, the only independent variable the authors investigate. One should be able to use many attributes of the countries and of the surveys themselves to explain each of the two different measures of vaccination coverage separately, and then to try to explain the gap between the two measures.

For example, there is a recent paper by Varun Gauri and co-authors that explains variations in the officially measured vaccination coverage rate using country data. Also see: http://go.worldbank.org/67J3RGNDL0

Would running the exact same regression using IHME's survey-based measure as the dependent variable produce more or less plausible results than Varun was able to get with the official data? Depending on the outcome of that horserace, one might have more confidence in the official data or in the survey data.

It's also likely that there is heterogeneity in the quality of the questions that were asked and in the accuracy of the data as judged by other questions. It strikes me that it should be possible to assess the likely accuracy of each of the 193 surveys, at least in relation to one another, by closely reading the survey questions and the interviewer's manual and by assessing the accuracy of certain "sentinel" questions - such as questions on infant mortality - for which there might be other corroborating evidence.
Then the question would be: is there a correlation between survey quality/accuracy and the size (and direction) of the difference between the survey estimate and the GAVI-accepted official estimate of vaccine coverage? If cross-country analysis shows that the gap gets larger for more accurate surveys, that would support IHME's dramatic headlines. But if it gets smaller for more accurate surveys, that would call IHME's result into question.

One last point on heterogeneity of the countries: I find it particularly interesting that 8 countries gave officially estimated coverage rates that were lower than the rate implied by the household survey data. This finding is not consistent with the clever, greedy bureaucrat model. It would be consistent with a dumb bureaucrat model. Since I don't believe that bureaucrats are that dumb, two other possibilities occur to me. One is that these countries have low levels of corruption combined with inadequate statistical capacity. This hypothesis - that the overestimations are the product of well-meaning, smart bureaucrats with limited resources - could be tested by the analysis I suggest above. The other possibility is that BOTH the official and the household survey measures of coverage contain an enormous amount of error of three sorts: traditional sampling error, traditional non-sampling error (due to bad survey design or administration), and bi-directional regression and multiple imputation error, the scariest sounding of the three.

None of this is to dispute the authors' conclusion that GAVI and all results-based aid programs must be a lot more careful about measurement if they are to motivate development advances rather than clever cheating.
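The gap-versus-quality test suggested above could be set up roughly as follows. Everything here is a hypothetical illustration: the quality scores, coverage figures, and the relationship baked into the synthetic data are all invented, since the real per-survey quality assessment would have to come from reading the 193 questionnaires and checking the sentinel items.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data for 193 surveys: a quality score (0 = weak survey,
# 1 = strong survey) and a gap in percentage points between the official
# coverage figure and the survey-based figure. The synthetic gap is
# built to widen with quality, purely to show the mechanics.
n = 193
quality = rng.uniform(0, 1, n)
gap = 5 + 10 * quality + rng.normal(0, 2, n)  # official minus survey

# Ordinary least squares of the gap on survey quality.
X = np.column_stack([np.ones(n), quality])
beta, *_ = np.linalg.lstsq(X, gap, rcond=None)
intercept, slope = beta

# A positive slope would mean the overestimation gap widens as survey
# quality improves, supporting IHME's reading; a negative slope would
# suggest the gap is partly an artifact of weak surveys.
print(f"slope of gap on quality: {slope:.2f}")
```

In practice one would of course add the country attributes (GAVI agreement, statistical capacity, corruption measures) as further regressors, which is exactly where the clever-bureaucrat and limited-capacity hypotheses could be separated.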

Disclaimer

CGD blog posts reflect the views of the authors, drawing on prior research and experience in their areas of expertise. CGD is a nonpartisan, independent organization and does not take institutional positions.
