2014-07-28

Co.Labs

Inside The Data Science That's Shining New Light On Syria's Civil War

The Carter Center and Palantir Technologies are opening their data platform to understand the Syrian conflict.



In a conflict as grisly as Syria's civil war, getting humanitarian aid to those who need it can be a life-threatening affair. Fortunately for those hoping to help, data from sources like Twitter, YouTube, and a range of others lets researchers turn war into a giant data science project, helping understand the tension between groups, how armed they are, and where they're headed next.

One year ago, Palantir Technologies donated its data organization software to the nonprofit Carter Center. “We wanted to see who the biggest fish amongst the opposition are, [how] everyone relates to one another, and who's funding who,” says Christopher McNaboe, who works on the Syria Conflict Mapping project. Now that the U.N. has authorized relief crossings into Syria without the Syrian government's consent, that data can finally be put into action.

Finding Relationships From Formation Videos

In the past year, McNaboe found that a big part of understanding the conflict was understanding the overall structure of opposition groups--and the place to go was YouTube. Formation videos now account for 90% of his overall data; at one point, 600 YouTube videos were appearing per day, which proved to be an enormous resource, McNaboe says. “Each video is two and a bit minutes long. That's more minutes of video than of conflict,” he says. “When I started watching these videos it was just individuals. Then it became armed groups and they started making very flashy formation videos for propaganda purposes.” Since then, he says, the groups have formalized. “Now they release statements, videos of government abuse, and much more.”

By aggregating data from armed groups and mining conflict events from social media, activist blogs, news sites, and humanitarian organizations, the Carter Center has documented over 11,000 events and tracked almost 6,000 armed groups and 100,000 fighters. They used Palantir’s Gotham platform for visual analysis of events, including a hierarchical view of opposition groups’ relationships to each other, which McNaboe uses to see who the main actors are.

So what's the impact? Tracking 14,000 defectors by the summer of 2012, McNaboe was able to reverse-engineer the defection videos, where he found a lot of information related to the structure of the Syrian military itself. He points to a set of color-coded groups in what resembles a complicated spider web.

"These are the armed troops that have formed in one governorate of Syria,” McNaboe says. “We can very easily see who is central to this network or we can find out who has tanks. The blue groups have tanks, or who was formed with religious reasons in their statement of purpose, this clusters."

Such visual data allows McNaboe to call bluffs on leaders who give out misinformation--an impossible feat with previous data methods.

“I met with somebody from the Supreme Military Council of the armed opposition in Syria and he said that he was in control of 70% of the armed groups in Syria, which is clearly an exaggeration,” he says. “You can look at these networks and see that not even 70% of the groups are connected to one another in a logical way.”
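One way to sanity-check a claim like this is to measure what share of groups sit in the largest connected cluster of the network. The sketch below is only an illustration of that idea, not the Carter Center's or Palantir's method: the group names and links are invented, and the function `largest_component_share` is a hypothetical helper.

```python
from collections import defaultdict, deque

def largest_component_share(groups, links):
    """Return the fraction of groups that sit in the largest connected cluster."""
    neighbors = defaultdict(set)
    for a, b in links:
        neighbors[a].add(b)
        neighbors[b].add(a)

    seen = set()
    largest = 0
    for start in groups:
        if start in seen:
            continue
        # Breadth-first search collects every group reachable from `start`.
        queue = deque([start])
        seen.add(start)
        size = 0
        while queue:
            node = queue.popleft()
            size += 1
            for nxt in neighbors[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
        largest = max(largest, size)
    return largest / len(groups) if groups else 0.0

# Hypothetical toy network: ten groups, only a handful of declared affiliations.
groups = [f"group_{i}" for i in range(10)]
links = [("group_0", "group_1"), ("group_1", "group_2"),
         ("group_2", "group_3"), ("group_5", "group_6")]
share = largest_component_share(groups, links)
print(f"{share:.0%} of groups are in the largest connected cluster")
```

If even the largest cluster covers well under 70% of groups, no single leader can plausibly command 70% of them through known affiliations.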

Making Sense Of A Complex Web Of Data

It can get pretty hairy trying to connect the pieces of a 6,000-piece jigsaw puzzle and differentiate formation groups like the “Military Council of Damascus and its Countryside” and the “Revolutionary Military Council of Damascus and its Countryside,” McNaboe says.

“There are local names, there are Arabic names, there are Kurdish names, and then various English transliterations of all of those and oftentimes we'll come across a place that we've never heard of before, that doesn’t show up in Google Maps or on Google," he says.
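To see why near-identical names are hard to handle automatically, consider a simple string-similarity check. This is a minimal sketch using Python's standard `difflib`, not a description of the team's actual tooling; the helper `flag_for_review` and the 0.85 threshold are assumptions chosen for illustration.

```python
from difflib import SequenceMatcher
from itertools import combinations

def similarity(a, b):
    """Similarity score between 0 and 1, ignoring letter case."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def flag_for_review(names, threshold=0.85):
    """Return pairs of names similar enough that an analyst should compare them."""
    return [(a, b) for a, b in combinations(names, 2)
            if similarity(a, b) >= threshold]

names = [
    "Military Council of Damascus and its Countryside",
    "Revolutionary Military Council of Damascus and its Countryside",
]
pairs = flag_for_review(names)
print(pairs)
```

The two council names from the article score high enough to be flagged as a possible duplicate even though they are distinct groups, which is why a check like this can only queue names for human review rather than merge them.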

Sorting through tons of video material, McNaboe and other analysts vigilantly coded 70 different attributes for each formation video they came across. “We look at how many people are in it, if there is a leader who gets his name, the name of the leader or leaders, the name of affiliated organizations, and others.”

Not only does each node have 70 different attributes to import, but it has these attributes for fixed periods of time, McNaboe says. "It's 70 different attributes for 6,000 different groups for 1,000 different days and the file size and ability for a computer to handle this amount of computation becomes really difficult."

Prior to Palantir’s software, McNaboe and data scientist Russell Shepherd had to limit the number of attributes to 25 in order to preserve the temporal aspect of the visualization they needed.
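The scale McNaboe describes is easy to work out. The sketch below multiplies the figures quoted above and compares the full 70-attribute case with the 25-attribute limit; the 8 bytes per value and the dense, one-value-per-group-per-day layout are assumptions made purely for illustration.

```python
GROUPS = 6_000
DAYS = 1_000
BYTES_PER_VALUE = 8  # assumption: one 64-bit value per attribute per group per day

def dense_cells(attributes, groups=GROUPS, days=DAYS):
    """Number of values if every attribute is stored for every group on every day."""
    return attributes * groups * days

for attributes in (70, 25):
    cells = dense_cells(attributes)
    gigabytes = cells * BYTES_PER_VALUE / 1e9
    print(f"{attributes} attributes: {cells:,} values, about {gigabytes:.2f} GB")
```

Under these assumptions, 70 attributes come to 420 million values, while 25 attributes come to 150 million.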

Geo Locating For Passage Planning

Using tagged geo location data, McNaboe can see the conflict history of any given location, oftentimes down to the exact neighborhood.

“If organizations get satellite imagery of aerial bombs being loaded into a plane, that's great, but we've got the historic information on where all these aerial bombs have been used and the changing areas of control.”

They can see whether each attack consisted of shelling, clashes, aerial bombardments, IEDs, suicide bombs, or sniper fire--and they can even tell which groups use what. A heat map shows where ISIS--one of the most powerful rebel groups--has been operating in more than 900 events since last November.

“December, a lot of activity in the south,” McNaboe says, pointing to the map. “In January, when trading took off between the opposition and ISIS in western Aleppo, they were very, very active."

Green-coded patches show where the most aerial bombardments took place. "That's likely where the fighting has been very intense and how these bombs have progressed over time, where the first ones were, how they coincide with fighting."
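A heat map of this kind boils down to counting events per map cell. The following is a minimal sketch of that counting step, assuming each event is a (latitude, longitude, type) record; the sample events, the half-degree cell size, and the helper names are invented for illustration and do not reflect the Center's data or Gotham's internals.

```python
import math
from collections import Counter

def grid_cell(lat, lon, cell_size=0.5):
    """Snap a coordinate to the south-west corner of its grid cell."""
    return (math.floor(lat / cell_size) * cell_size,
            math.floor(lon / cell_size) * cell_size)

def heat_counts(events, event_type=None, cell_size=0.5):
    """Count events per grid cell, optionally keeping only one event type."""
    counts = Counter()
    for lat, lon, kind in events:
        if event_type is None or kind == event_type:
            counts[grid_cell(lat, lon, cell_size)] += 1
    return counts

# Made-up sample events near Aleppo and Damascus.
events = [
    (36.20, 37.16, "aerial bombardment"),
    (36.25, 37.10, "aerial bombardment"),
    (36.22, 37.30, "shelling"),
    (33.51, 36.29, "sniper fire"),
]
all_counts = heat_counts(events)
bomb_counts = heat_counts(events, "aerial bombardment")
print(all_counts.most_common(1))
```

Filtering by event type before counting gives the per-type layers described above, such as a map of aerial bombardments alone.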

With history data, humanitarian aid organizations can better plan how they’ll engage with the opposition groups to ensure safe passage for humanitarian aid in a given location.

“You have a history of cease-fires they may have agreed to in the past and therefore information on what they're willing to compromise on or what they're not willing to compromise on.”

Geo tags from tweets help a lot too, McNaboe says. “A map of geo-tagged tweets come only from areas of government control. They very rarely overlap with where there has been conflict.”

But censorship by both the government and citizens makes it a constant game of cat and mouse, especially in the case of Damascus and Aleppo, he says.

“It could be destruction of utilities in opposition-controlled areas, simply the government switches the area off, making it difficult for people to tweet,” McNaboe says. “The uniformity of it suggests something like that is going on.”

Conflict events in Syria, showing 10,000 events that the Center has recorded since July, 2013.

An Open Platform

When video content--like sophisticated weaponry--is blurry or unidentifiable, McNaboe outsources the findings to experts like Brown Moses, who return and input the data themselves. "We've structured the data by providing subsets of information to those experts for feedback, which is really easy. We've been doing that a lot lately," he says.

McNaboe says the success of this project has largely been due to partnerships and collaboration such as the ones with Palantir, Brown Moses, and the citizens themselves.

Both Palantir and the Carter Center are working to make the data more accessible and user-friendly so humanitarian organizations can import their own data collections directly into the platform. “We can overlay areas of internally displaced person camps with information on the conflict events, who controls what, and so on," McNaboe says.

And while McNaboe uses YouTube as a resource, the opposition groups use it as their own platform, with more than 100 armed groups forming in the first two months of 2012.

So why are these groups so active on social media, knowing they can be seen by the whole world? To build their organization and connect with international funders, McNaboe says.

"They've done a cost-benefit analysis and have determined that the public relations that is now so readily available to them through Facebook, YouTube, and Twitter is far more beneficial for their sustainability as an organization than a lack thereof. Some use YouTube as a way to tell their funders 'thank you' and some have even named their organization, the battalion of whatever, after the funders themselves."

Based on the YouTube data, McNaboe's personal theory is that the openness of the Internet has helped facilitate the proliferation of opposition groups.

“It's a slow-moving but very serious movement because of their ability to connect with funders and supporters where now they don’t have to go through this long process of building an organizational philosophy and a sustainable model,” he says.

Ninety-nine Percent Perspiration

Up to this point, the northern border cities have controlled the border, making it near impossible for the U.N. to bring in relief to citizens while McNaboe watched in real time.

“There were about 40,000 people in need right across the border,” he says. “The convoy drove almost within sight of them without being able to deliver it.”

It’s the hope for safe passage that fuels McNaboe and his team's sometimes painstaking data input as they try to fill in as many blanks as they can. For the first nationwide report, McNaboe stayed up all but three hours every day for about a month going through the source file database that converted visual files, the network maps, and then the geographic maps of conflict distribution.

"It's one of those 99% perspiration, 1% inspiration type of things where the success of it is not based upon some flashy click of a button tech innovation--though it helps--but rather a lot of sweat, hard work, data structuring and data collection,” he says.

Gotham’s collaboration tools have enabled more contribution from outside experts, other organizations, and most importantly, the citizens themselves. No other conflict in history has ever been this carefully documented just by its citizens, McNaboe says.

With a new resolution in place, the U.N. will rely on McNaboe, his team of data scientists, and Palantir’s engineers to know where to make their next move.

Despite knowing how closely their online activity is being monitored, citizens are still determined to get information out to the rest of the world, while McNaboe tries to make sense of it all.

“There are a couple of compilation videos which show people literally looking down the barrels of tanks that are firing at buildings around them and staying there, narrating what they're seeing and what's happening,” McNaboe says. “A tank is shooting or pointing straight at the camera, and they're still there.”

McNaboe says they are just now looping in more analysts from outside organizations--including the U.N.--who will be able to view the Syria data directly in the next few weeks. By combining normally siloed databases within the Palantir platform, analysts can better evaluate the conflict and people’s needs by picturing it, literally.

“What they [citizens] are doing is beneficial," McNaboe says. "They really want the world to see what is happening and we're listening--or trying to.”

[Image: Shutterstock]



