Views from the Center


In "The Future of Statistical Computing," Leland Wilkinson argues that technological advances will shape the future of statistical analysis more than most other factors. Quite apart from its predictions, the article is a helpful overview of where statistical analysis stands today, at least for someone who remembers building his first statistical models in Gauss (does anyone else even remember that package?).

My big question, though, is about the "Data Quality" issue that the author addresses in just a few paragraphs. Wilkinson writes:

Data quality may emerge as one of the most critical factors affecting analysis in the coming decade. As Karr pointed out, we often hear that we will drown in a flood of data, but a flood of bad data may be more of a threat. The electronic collection and assembly of data threatens to swamp the close examination of data before analysis. But this threat can be used to advantage if we develop automated assistants that can work with data experts to identify problem areas.

This presumes that the bad-data problem is one of combing through already collected data to "clean" it of anomalies, duplications, and the like. However, for most of my work on issues in developing countries, the data often isn't even there. While Google is amassing huge amounts of data on people in the United States and Europe, people in Mali and Guyana rarely leave a trace on the Web. So, will the old statistical approaches continue to be relevant in data-poor contexts? Or are the new approaches better at analyzing the typical survey data that is probably all we will have for some time to come (e.g., Demographic and Health Surveys and Living Standards Measurement Study surveys)? Or am I being short-sighted and unimaginative, because there are other ways of collecting useful data in developing countries that will replace these rather cumbersome surveys?


CGD blog posts reflect the views of the authors, drawing on prior research and experience in their areas of expertise. CGD is a nonpartisan, independent organization and does not take institutional positions.