2013-03-19

How The UN's New Data Lab In Indonesia Uses Twitter To Preempt Disaster

Predictive disaster relief is the goal, says Robert Kirkpatrick, Director of the UN's Global Pulse initiative, and Twitter data may be the key. The program uses social network analysis to study living conditions throughout the world and preempt crises. “We found that a combination of food words and mood state was able to predict the consumer price index several weeks ahead,” says Kirkpatrick.



“About three weeks ago in Indonesia our lab was monitoring Twitter and we caught a discussion on whether vaccines were halal,” says Robert Kirkpatrick, Director of the UN's Global Pulse initiative. The program tracks global well-being via digital data. “People were saying that vaccines contain pork products. You end up with kids paralyzed from the neck down because they haven't been given a vaccine,” he says. The goal of the new UN lab is to try to preempt misfortune and misinformation by finding it first in conversations--before a chain reaction occurs. “If you can identify the province where these signals are coming from, you can send a team in to get the word out.”

The Jakarta lab, the first Pulse Lab in Asia, is the most recent evolution in the work of UN Global Pulse, a program introduced in 2010 to monitor global well-being via digital data. It is a local innovation center that pilots new ways for governments and UN agencies to use real-time data monitoring.

Big Data In The UN's New Labs

Indonesia has 200 million mobile phone subscribers, many of whom are also enthusiastic Twitter users. The residents of Jakarta alone generate 9 million tweets a day. The country is also vulnerable to every kind of shock: economic fluctuations, security crises, and natural disasters. Indonesians go online to air grievances, fears, and anxieties, making the country a fertile data source for the new Global Pulse Labs.

Global Pulse is not a standalone global data-gathering or monitoring body. The organization partners with companies and universities to get access to data, tools, and expertise that supplement the data scientists and engineers in its local Pulse Labs. Pulse Labs, which are currently located in Jakarta and New York (Pulse Lab Kampala, in Uganda, opens soon), are experimental data labs testing ways to use real-time data at a country level.

A local lab comprises a multi-disciplinary team of 7-10 data scientists and engineers, development practitioners, and policy experts. The lab staff work with UN partners who have domain expertise in areas like food security, nutrition, water, and sanitation and who are often the "customers" of the technology. Private sector partners contribute data, tools, and expertise. Techniques tested, and correlations discovered, in a local Pulse Lab may later be adopted by labs globally, and hopefully eventually by UN agencies and governments.

Predicting A Food Crisis With Twitter Data

Global Pulse has found that certain technological indicators have strong correlations with changes in living conditions. A decline in mobile airtime purchases can indicate that a household's income is dropping. An increase in social media discussions about holiday cancellations or selling a car often happens several months after people are laid off. The Jakarta Pulse Lab was even able to predict a food crisis in Indonesia using Twitter.

“In June 2012 we started a project looking at sentiment around both price and supply of foodstuffs,” says Kirkpatrick. “Rice, chilies, cooking oil, fish--then did mood state analysis to identify anxiety, depression, uncertainty, happiness, sadness,” Kirkpatrick explains. “We found that a combination of food words and mood state was able to predict the consumer price index several weeks ahead.”

The food crisis research started as a joint project between Pulse Labs researchers and private sector partner Crimson Hexagon, whose tools were used for the initial data exploration and to categorize keywords related to staple food commodities. Crimson Hexagon's ForSight, a social-media analytics platform, captured trends and anomalies in Twitter conversations about food. Back in 2011, researchers found that there was a strong correlation between the volume of food tweets and the official Indonesian food basket inflation index.
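The relationship described here, where food-related tweet volume tracks an official price index and, by Kirkpatrick's account, can lead it by several weeks, can be illustrated with a lagged correlation. What follows is a minimal sketch, not Global Pulse's or Crimson Hexagon's actual method. The function names (`pearson`, `lagged_correlation`, `best_lag`) and the weekly figures are invented for illustration, and the synthetic index is deliberately built to trail tweet volume by two weeks.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


def lagged_correlation(signal, target, lag):
    """Correlate signal[t] with target[t + lag], for lag >= 0."""
    if lag < 0:
        raise ValueError("lag must be non-negative")
    if lag == 0:
        return pearson(signal, target)
    return pearson(signal[:-lag], target[lag:])


def best_lag(signal, target, max_lag):
    """Return (lag with the highest correlation, all scores by lag)."""
    scores = {
        lag: lagged_correlation(signal, target, lag)
        for lag in range(max_lag + 1)
    }
    return max(scores, key=scores.get), scores


# Hypothetical weekly food-tweet volumes (thousands), invented for this sketch.
tweet_volume = [10, 12, 11, 15, 20, 18, 14, 13, 17, 22, 19, 16]

# Synthetic price index constructed to follow tweet volume two weeks later.
price_index = [105.0, 105.5] + [100 + 0.5 * v for v in tweet_volume[:-2]]

best, scores = best_lag(tweet_volume, price_index, max_lag=4)
print("lag with strongest correlation (weeks):", best)
```

On real data, a strong correlation at a positive lag would only suggest that the online signal moves first. It would not by itself show that the signal can forecast the index.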

In June 2012, investigations started again with the addition of a new partner -- SAS. SAS's Social Media Analytics and Text Miner tools, normally used by brands to track customer conversations and sentiment, analyzed two years of data (more than 200,000 documents per day) from Twitter, Facebook, blogs, forums, and news sites. Time series data was extracted, showing changes in the volume of mentions of price and supply of commodities like rice, milk, cooking oil and fuel. By performing mood state classification on this data set, the researchers could track trends like the level of anxiety about the supply of fuel or anger about the price of rice.
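The idea of turning posts into time series of commodity mentions broken down by mood can be sketched in a few lines. This is a toy stand-in, not how the SAS tools work: it uses a hand-written keyword lexicon instead of a trained classifier, English terms instead of Indonesian ones, and the term lists, sample posts, and function names (`tag`, `daily_counts`, `series`) are all assumptions made for illustration.

```python
import re
from collections import Counter
from datetime import date

# Hypothetical lexicons; a real system would use far richer, localized models.
COMMODITY_TERMS = {
    "rice": ["rice"],
    "cooking oil": ["cooking oil"],
    "fuel": ["fuel", "petrol"],
}
MOOD_TERMS = {
    "anxiety": ["worried", "afraid", "scared"],
    "anger": ["angry", "furious", "outraged"],
}


def _compile(terms_by_label):
    # Whole-word matching, so that "price" is not counted as a mention of "rice".
    return {
        label: re.compile(r"\b(?:" + "|".join(re.escape(t) for t in terms) + r")\b")
        for label, terms in terms_by_label.items()
    }


COMMODITY_PATTERNS = _compile(COMMODITY_TERMS)
MOOD_PATTERNS = _compile(MOOD_TERMS)


def tag(text):
    """Return the sets of commodities and moods mentioned in one post."""
    lowered = text.lower()
    commodities = {c for c, p in COMMODITY_PATTERNS.items() if p.search(lowered)}
    moods = {m for m, p in MOOD_PATTERNS.items() if p.search(lowered)}
    return commodities, moods


def daily_counts(posts):
    """Count posts per (day, commodity, mood) from (day, text) pairs."""
    counts = Counter()
    for day, text in posts:
        commodities, moods = tag(text)
        for commodity in commodities:
            for mood in moods:
                counts[(day, commodity, mood)] += 1
    return counts


def series(counts, commodity, mood, days):
    """Extract one time series, e.g. anger about rice, over the given days."""
    return [counts[(day, commodity, mood)] for day in days]


# Invented sample posts.
day1, day2 = date(2012, 6, 1), date(2012, 6, 2)
sample_posts = [
    (day1, "So angry about the price of rice"),
    (day1, "Worried the fuel will run out"),
    (day1, "Furious: rice costs more again"),
    (day2, "Lovely weather today"),
]
sample_counts = daily_counts(sample_posts)
print(series(sample_counts, "rice", "anger", [day1, day2]))
```

A series such as anger about rice by day could then be compared against official price figures, which is the kind of trend tracking the researchers describe.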

Kirkpatrick says that each data source can provide a different view of the same issue. “Twitter is very ephemeral in nature: When we were trying to predict unemployment spikes from mood analysis we found that people don't use Twitter to reflect on the past or discuss causality or to predict the future,” he says. “It's really about what's happening right now.” Blogs, on the other hand, tend to give a longer-term view.

Global Pulse is currently in discussions with all 10 of Indonesia's mobile phone operators to get access to three years of anonymized mobile call records and airtime purchase data for their 200 million subscribers, a veritable data goldmine. Researchers at the Jakarta lab can use this data to find correlations between mobile phone subscriber demographics and activity and everything from food and fuel prices to earthquakes and disease outbreaks.

Where The Data Comes From

Global Pulse relies heavily on private companies like mobile phone operators to contribute data. “We realized pretty early on that the most valuable data is behind the walls of private companies,” says Kirkpatrick. The companies who hold that data are often cagey about sharing it. They need to protect the privacy of their customers and require reassurance that the data will not fall into the hands of their competitors or be used by governments to put them under regulatory scrutiny. But there are upsides too. “They see opportunities for data-driven CSR (Corporate Social Responsibility). It's a way to show their customers and community that they care and to show their employees, many of whom are millennials who want to do well by doing good, that the company is actually doing something positive,” he says.

The motivation for companies in emerging markets to contribute data is less about CSR and more about survival. “If your customers fall back into poverty, there goes your market,” says Kirkpatrick. “If it turns out that there are reliable, real-time signals this is happening in your own database, you can share that data with governments to allow them to take action faster. The data used by governments is normally collected through surveys. There's a lag intrinsic to that process, so in a sense, real-time is [like] prediction.”


Disaster's Coming. Now What?

Kirkpatrick has led Global Pulse since its founding three years ago. His previous gigs include Groove Networks' Humanitarian Systems team, which involved a stint living and working in Saddam Hussein's former palace in Iraq, supporting NGOs after the 2005 Kashmir earthquake, and developing a collaboration tool for relief workers after Hurricane Katrina.

He describes Global Pulse's task as “searching for digital smoke signals which indicate changes in people's well-being in different sectors--health, education, finance. If something changes at a household level, it changes how people use services. If people's incomes are declining very slowly over time that's not going to create a recognizable signature in the short term but there are certain events that are going to cause fairly immediate changes in behavior: the loss of a job, getting sick, running out of money or having food become very quickly unaffordable.”

Using data from digital services is also cheaper than sending people into the field. “It may be less accurate, but it can at least tell you where to go first, in order to collect hard evidence,” says Kirkpatrick. “We want to accelerate the adoption of analytics in the public sector for international development work. In Vegas, all the casinos share pictures of known card counters. All weather stations now share their predictions and everybody benefits. We want the private sector to find a way to join a real-time data commons to help make the world more resilient.”

[Image: Flickr user Scott Cooper]