
Using Mobile Phone Records to Improve Public Health: Evidence from Malawi

This blog is part of CGD’s Governing Data for Development project, which explores how governments can use data to support innovation, development, and inclusive growth while protecting citizens and communities against harm. Rachel Sibande is a member of the working group that guides the project.

The tremendous growth of mobile phone penetration, internet-connected devices, network infrastructure, and high-speed internet has fundamentally shifted relationships between individuals and the public and private sectors. In many low- and middle-income countries (LMICs), rapidly expanding mobile services improve access to information, digital financial products, and civil society engagement. In high-income countries, expanding digital infrastructure provides an endless variation of commerce, community, and media interactions. In every context, convenience and access come with an implicit bargain: the capture and use of personal data to derive better-targeted products.

The private sector quickly adopted routine tracking of digital behaviors for market research, while the public sector and non-governmental organizations have been slow to catch up, with the notable exception, perhaps, of national security applications. Opportunities abound to responsibly and ethically leverage large datasets, such as anonymized call detail records (CDRs) provided by Mobile Network Operators (MNOs), to better allocate and enhance the quality of public services for health, education, agriculture, and disaster response in LMICs. The global COVID-19 crisis brought into sharp focus the need for better partnerships, frameworks, governance, and methods for using aggregated individual records to generate nuanced insights for policy and response. Transparency and accountability are key to these efforts.

Since 2017, the government of Malawi, through a collaborative partnership with Cooper/Smith, the Digital Impact Alliance (DIAL), and the Bill & Melinda Gates Foundation, has worked with government entities, regulatory authorities, telecommunications providers, technical experts, and users to better leverage CDR data for public good. The government of Malawi has proven that CDR data can be an excellent proxy for geographically granular population density estimates and offer the only routinely available view into population movement and migration. Such insights have tremendous potential for better meeting public needs with limited resources. This piece discusses two real-world applications that highlight the value of leveraging CDR data, the ethical ramifications of using these data, the importance of collaborative and sustainable partnerships, and what we see as the way forward.

Persistent challenge: Equitable access to essential healthcare services

In Malawi, an estimated 7.73 million people, or 45 percent of the population, live more than five kilometers from a health facility, severely limiting access to essential healthcare services.

To address this dilemma, the Ministry of Health enacted a Capital Investment Plan that includes key targets for improving access to health services, including a goal to ensure that 95 percent of Malawians live within five kilometers of a health facility by 2023. Identifying locations for new health facilities that will cover the most underserved is no easy task, especially in a setting with highly mobile populations and low-resolution (infrequent and highly aggregated) population estimates. The Ministry of Health, MNOs, and regulatory authorities worked to address this problem using CDR data, triangulated with available population and program data.

The first step was to assess whether CDR data can be a reasonable proxy for population. The government established a secure data pipeline to receive anonymized records from MNOs, mapped cell tower clusters to Malawi's administrative boundaries, and compared subscriber CDR data with existing population estimates from the National Census and WorldPop datasets. This analysis confirmed a strong correlation between CDRs and population density.
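As an illustration of the kind of check described above, here is a minimal sketch that computes a Pearson correlation between subscriber counts and census population per administrative unit. The unit names and all figures are made up for illustration, and the project's actual statistical method is not specified in this post; Pearson's r is simply one common choice.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical, made-up figures per administrative unit (not project data).
subscribers_by_unit = {"unit_a": 1200, "unit_b": 4300, "unit_c": 800, "unit_d": 2600}
census_population_by_unit = {"unit_a": 15000, "unit_b": 52000, "unit_c": 11000, "unit_d": 30000}

# Align the two datasets on the same unit order before comparing them.
units = sorted(subscribers_by_unit)
r = pearson_r(
    [subscribers_by_unit[u] for u in units],
    [census_population_by_unit[u] for u in units],
)
print(f"Pearson r across {len(units)} units: {r:.3f}")
```

A value of r close to 1 would support using subscriber counts as a population proxy. In practice one would also want to check for units where mobile penetration is unusually low, since those are the places where the proxy is most likely to mislead.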

The research team then analyzed the CDR data to observe population movement patterns in the short and medium term. These trends were used to refine population projections down to the traditional-authority level (one administrative unit below district). The team entered these new population estimates into an optimization model aimed at maximizing coverage of those individuals more than five kilometers from a health facility. The model was further weighted with disease-burden statistics drawn from the local health management information system.
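The post does not say which optimization technique the model uses. One simple way to approach a coverage problem like this is a greedy heuristic for the maximum coverage problem: repeatedly pick the candidate site that reaches the most people who are still uncovered. The sketch below assumes planar coordinates in kilometers, hypothetical site and cell names, and a made-up weighted population per cell (for example, population multiplied by a disease-burden weight).

```python
import math


def greedy_site_selection(cells, candidates, n_sites, radius_km=5.0):
    """Pick up to n_sites candidates, each time choosing the one that covers
    the most not-yet-covered weighted population within radius_km."""
    uncovered = dict(cells)  # cell id -> (x_km, y_km, weighted_population)
    chosen = []
    for _ in range(n_sites):
        best_site, best_gain, best_cells = None, 0, []
        for name, (sx, sy) in candidates.items():
            if name in chosen:
                continue
            # Cells within the service radius of this candidate site.
            reached = [
                cid for cid, (cx, cy, _) in uncovered.items()
                if math.hypot(cx - sx, cy - sy) <= radius_km
            ]
            gain = sum(uncovered[cid][2] for cid in reached)
            if gain > best_gain:
                best_site, best_gain, best_cells = name, gain, reached
        if best_site is None:  # no remaining candidate adds coverage
            break
        chosen.append(best_site)
        for cid in best_cells:
            del uncovered[cid]
    still_uncovered = sum(w for _, _, w in uncovered.values())
    return chosen, still_uncovered


# Hypothetical population cells currently beyond 5 km of any facility.
cells = {
    "c1": (0, 0, 1000),
    "c2": (3, 0, 800),
    "c3": (20, 0, 1500),
    "c4": (22, 1, 400),
    "c5": (40, 40, 300),
}
# Hypothetical candidate locations for new health posts.
candidates = {"s1": (1, 0), "s2": (21, 0), "s3": (40, 38)}

chosen, still_uncovered = greedy_site_selection(cells, candidates, n_sites=2)
print(chosen, still_uncovered)
```

A greedy heuristic is easy to explain to decision-makers and to rerun as population estimates are refreshed, though it is not guaranteed to find the optimal set of sites. An exact formulation (for example, an integer program) is a reasonable alternative when the number of candidates is manageable.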

The model outputs increased the efficiency of new clinic placement over other methods, predicting an additional 226,000 Malawians would have improved health access by 2023 if the model recommendations were implemented. The model, which updates routinely, has since been turned into a dashboard to support the Ministry of Health’s decisions on facility placement as resources become available.

Figure 1. Proposed allocation of new health posts

In 2017, 55.3 percent of Malawians lived within five kilometers of a health facility. Using the results of the model to optimize resource allocation and decide where to build the next 900 facilities could bring this share to 95 percent by 2023. Siting new health facilities with the model results in 226,000 more people living within five kilometers of a health facility than would be the case if the facilities were sited based on the prior criteria.

Population insights in times of crisis

With the onset of the COVID-19 epidemic in Malawi, the Ministry of Health needed to quickly leverage existing data and refine its models to support a data-informed pandemic response. Having embedded technical partners like Cooper/Smith and DIAL helped the ministry pivot existing resources and activities toward responding to the crisis. The Public Health Institute of Malawi collaborated with its technical partners to develop a dynamic, epidemiological model for estimating COVID-19 transmission and comparing mitigation scenarios. Leveraging CDR data and partnerships improved the accuracy of this model and supported the development of rapid response tools for local authorities.

Simulating disease transmission across the cellular network can improve epidemiological estimations

Augmenting the epidemiology model with population mobility insights derived from CDR statistics allowed the government to estimate the epidemic start date down to the lowest administrative level. While mobility plays a huge part in disease transmission dynamics, little data are available that can help a country understand movement within and between communities and how this will likely affect disease burden. The CDR data allowed the government to improve temporal accuracy of the epidemiological model outputs for all 350+ traditional authorities. A dashboard enabled public health authorities to adjust policy levers—like social distancing and mask usage—and, as new data became available, allowed them to visualize the change in COVID-19 outcome predictions.
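To show why mobility data matters for timing, here is a toy two-area SIR simulation in which a mobility matrix (the kind of statistic CDRs can inform) controls how quickly an outbreak seeded in one area reaches another. This is not the Public Health Institute's model: the populations, the mobility shares, the parameters beta and gamma, and the onset threshold are all made-up assumptions.

```python
def simulate(pop, seed_infected, mobility, beta=0.3, gamma=0.1, days=120):
    """Discrete-time SIR per area.

    mobility[i][j] is the assumed share of area i's contacts made in area j.
    Returns the number of infectious people per area for each day.
    """
    n = len(pop)
    S = [pop[i] - seed_infected[i] for i in range(n)]
    I = list(seed_infected)
    R = [0.0] * n
    history = [list(I)]
    for _ in range(days):
        # Prevalence in each area j: infectious people present / all people present.
        present_I = [sum(mobility[i][j] * I[i] for i in range(n)) for j in range(n)]
        present_N = [sum(mobility[i][j] * pop[i] for i in range(n)) for j in range(n)]
        prevalence = [present_I[j] / present_N[j] if present_N[j] else 0.0 for j in range(n)]
        for i in range(n):
            # Residents of area i are exposed wherever they spend their time.
            force = beta * sum(mobility[i][j] * prevalence[j] for j in range(n))
            new_infections = min(S[i], force * S[i])
            new_recoveries = gamma * I[i]
            S[i] -= new_infections
            I[i] += new_infections - new_recoveries
            R[i] += new_recoveries
        history.append(list(I))
    return history


def first_day_at_or_above(history, area, threshold):
    """First simulated day on which an area has at least `threshold` infectious people."""
    for day, infectious in enumerate(history):
        if infectious[area] >= threshold:
            return day
    return None


# Hypothetical inputs: the outbreak is seeded in area 0 only.
pop = [100_000, 50_000]
seed_infected = [10, 0]
mobility = [[0.95, 0.05], [0.10, 0.90]]

history = simulate(pop, seed_infected, mobility)
onset = [first_day_at_or_above(history, area, threshold=100) for area in range(len(pop))]
print("estimated onset day per area:", onset)
```

With these made-up inputs, the unseeded area crosses the threshold later than the seeded one, and changing the off-diagonal mobility shares shifts that lag. That is the basic mechanism by which movement estimates can sharpen area-level start dates.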

Detecting anomalous gatherings with mobile network data can help identify high-transmission events and tailor public health responses locally

Infectious disease transmission requires physical proximity, and CDR data provide the best lens into population density at any given point in time.

Researchers developed a mass-event detection tool using daily CDR records. Estimates for typical subscriber volume were developed for each cell tower cluster. Actual subscriber volume was then compared against these estimates to assess variance. When significant differences between estimated and actual volume were observed, a dashboard flagged these locations on a map for the Emergency Operations Centre. Access to these outputs was highly restricted, but they were capable of directing limited response resources where they might best be used to interrupt transmission.
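The comparison described above can be sketched as a simple z-score test per tower cluster. The post does not state which statistical test the tool uses, so the baseline (mean and standard deviation of recent daily counts), the threshold of 3, and the choice to flag only increases are assumptions, and the counts are made up.

```python
import statistics


def flag_anomalies(history, today, z_threshold=3.0):
    """Return clusters whose subscriber count today is unusually high.

    history: cluster id -> list of past daily subscriber counts (the baseline)
    today:   cluster id -> subscriber count observed today
    """
    flagged = {}
    for cluster, counts in history.items():
        if cluster not in today or len(counts) < 2:
            continue  # not enough data to form a baseline
        expected = statistics.mean(counts)
        spread = statistics.stdev(counts)
        if spread == 0:
            continue  # a flat baseline gives no scale for "unusual"
        z = (today[cluster] - expected) / spread
        if z >= z_threshold:  # only increases, since the goal is to spot gatherings
            flagged[cluster] = round(z, 1)
    return flagged


# Hypothetical daily subscriber counts per tower cluster.
history = {
    "cluster_1": [500, 520, 480, 510, 490, 505, 495],
    "cluster_2": [1200, 1180, 1220, 1210, 1190, 1205, 1195],
}
today = {"cluster_1": 900, "cluster_2": 1215}

flagged = flag_anomalies(history, today)
print(sorted(flagged))
```

A production baseline would likely need to account for weekly patterns such as market days, which a plain mean over recent days does not capture.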

Figure 2. Example of mass event report, August 2020

Note: Data in this image has been scrambled and anonymized to protect user privacy.

Ethical and responsible data use

Responsible data use, protection of privacy, and strict avoidance of non-intended uses of data are essential to build trust and protect individual and group privacy. Before the MNO in Malawi shared any data, the records were pre-processed to ensure privacy, protection, and quality assurance.

The MNO anonymized data by: 

  1. Implementing an MD5 hashing algorithm which stripped all identifying data—such as name, phone number, gender, and age—and assigned each user an individual code unrelated to the individual’s phone number.
  2. Removing the timestamp and replacing it with a four-hour block, which prevented potential triangulation of a user’s location and movements.
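The two steps above can be sketched as follows. This is an illustration, not the MNO's actual pipeline: the record layout, the field names, and the secret salt are assumptions. The salt is added here because an unsalted hash of a phone number can be reversed by trying every possible number.

```python
import hashlib
from datetime import datetime


def pseudonymize(phone_number, secret_salt):
    """Replace a phone number with an MD5-based code (salt is an assumption)."""
    return hashlib.md5((secret_salt + phone_number).encode("utf-8")).hexdigest()


def four_hour_block(ts):
    """Coarsen a timestamp to the start of its four-hour block (00, 04, 08, ...)."""
    start_hour = (ts.hour // 4) * 4
    return ts.replace(hour=start_hour, minute=0, second=0, microsecond=0)


def anonymize_record(record, secret_salt):
    """Keep only a subscriber code, the tower cluster, and a coarse time block."""
    return {
        "subscriber_code": pseudonymize(record["phone"], secret_salt),
        "tower_cluster": record["tower_cluster"],
        "time_block": four_hour_block(record["timestamp"]),
    }


# Hypothetical raw record with an obviously fake phone number.
raw = {
    "phone": "+265000000001",
    "name": "Example Subscriber",
    "tower_cluster": "cluster_7",
    "timestamp": datetime(2020, 8, 14, 17, 42, 9),
}
anonymized = anonymize_record(raw, secret_salt="replace-with-a-secret-value")
print(anonymized["time_block"])
```

Because the output record is built from an explicit list of fields, anything not named (such as the subscriber's name) is dropped by default rather than needing to be removed one field at a time.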

The Institutional Review Board (IRB) of the National Commission for Science & Technology (NCST) of Malawi granted authorization to the government and its partners to analyze mobile data using these anonymized CDR data sets provided by the MNO and authorized by the regulatory authority. Additionally, the MNO and the technical partner, Cooper/Smith, signed a data sharing agreement, which established a framework and protective mechanisms governing what data could be shared between project partners. Contracts for Collaboration provides examples of such agreements between MNOs and third-party organizations.

To ensure compliance with best practices regarding the use of sensitive data and the design of algorithms and models for public health purposes, Cooper/Smith completed a risk/benefit self-assessment that was reviewed by the Digital Impact Alliance’s compliance team, as well as an external assessment led by the United Nations Office for the Coordination of Humanitarian Affairs (UNOCHA). Both assessments validated the methodologies used by the project against global standards for data privacy and security.

Unlocking MNO data for future applications

The use cases above underscore the value of enhancing public-private partnerships with MNOs and leveraging a rich, nuanced, and routine dataset to improve the provision of public services. Additional use cases are readily apparent, but sufficient ethical and legal frameworks need to be established to responsibly access these data going forward.

Long-term success of initiatives using MNO data will be achieved when data models have become:

  • Institutionalized: Used, maintained, and updated by a designated organization.
  • Repeatable: The analytical process can be done repeatedly and produce consistent results.
  • Replicable: The analytical process can be used across sectors.
  • Scalable: The data processing infrastructure can accommodate ever-growing data sets.
  • Sustainable: Stewards can ensure the continuity of data access, data processing infrastructure, and data analysis.

In Malawi, the government is deepening its partnerships with MNOs and the public sector to achieve all these goals. A technical working group with the National Statistics Office works to (1) build local capabilities to manage extremely large, sensitive data sets, (2) improve frequency and accuracy of population density estimates, and (3) explore and define use cases across sectors that will benefit from MNO data insights.

Tackling challenges associated with the use of big data is not optional. We must find ways to protect privacy, avoid harm, and leverage these data for public benefit. 

Rachel Sibande is senior director, country outreach at the Digital Impact Alliance (DIAL). Tyler Smith is chief technical officer at Cooper/Smith.

Disclaimer

CGD blog posts reflect the views of the authors, drawing on prior research and experience in their areas of expertise. CGD is a nonpartisan, independent organization and does not take institutional positions.