Storyful is a news agency for the social media age. Founded by veteran Irish journalist Mark Little, the startup spots breaking stories and verifies social media content, in particular video, so that it can be used by mainstream news organizations. I talked to Little and Claire Wardle, Storyful's Director of News Services, about how they collect and verify material.
What are the main kinds of social media content that you verify?
Claire Wardle: We work mainly with video, most of which is on YouTube, and the social profiles on which it is posted. We also use social profiles like Twitter accounts as sources. Photos are very easy to manipulate but sometimes on big stories we will also look for photos. After the Brazilian nightclub fire a few months ago, the DJ posted a photo on his Facebook page just before the fire started.
How do you discover new stories and social media content relevant to them?
Claire Wardle: We use a lot of technology to sift through the ridiculous amounts of content. 72 hours of content are uploaded onto YouTube every minute. Often Twitter is the earliest signal that something is happening. We have over 600 Twitter lists, which have all been hand-curated. When we have a list of 35 interesting people on the ground somewhere, who are they talking to, retweeting, mentioning? Sometimes they will be talking to someone who is also a journalist but is not using his real name. This list-building technology suggests new list members to ensure that our lists are up to date.
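The list-building idea Wardle describes (look at who the people already on a list are talking to, retweeting and mentioning) can be sketched in a few lines of Python. This is only an illustration, not Storyful's actual system: the function name `suggest_members`, the `(source, target)` interaction pairs and the `min_members` cutoff are all assumptions made for the example.

```python
from collections import defaultdict


def suggest_members(list_members, interactions, min_members=2):
    """Rank accounts outside a list by how many distinct list members engage with them.

    interactions is an iterable of (source_account, target_account) pairs,
    one per reply, retweet, or mention. This data shape is hypothetical.
    """
    members = set(list_members)
    supporters = defaultdict(set)  # candidate account -> list members who engaged with it
    for source, target in interactions:
        # Only count engagement that starts on the list and points outside it.
        if source in members and target not in members:
            supporters[target].add(source)
    # Most widely engaged candidates first; ties broken alphabetically.
    ranked = sorted(supporters.items(), key=lambda kv: (-len(kv[1]), kv[0]))
    return [(account, len(who)) for account, who in ranked if len(who) >= min_members]
```

Counting distinct list members rather than raw interactions keeps one chatty account from pushing a candidate to the top. A human would still review the suggestions, since the lists are hand-curated.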
Have you custom built anything to monitor Twitter?
Claire Wardle: We have also built velocity technology with Clique at University College Dublin, so if any of those lists starts moving more quickly than the average speed (average tweets per second) and contains relevant keywords like “bomb,” “killing,” “explosion,” we get an alert. About two-thirds of those alerts are related to a news story. One third is a soap-opera storyline or just isn't anything relevant. In a bus bombing the alerts will pump out keywords like “bus,” “bomb,” “Damascus,” “19 confirmed” and then the software will pull out the related tweets. Twitter will give you the keywords in the local language and we use them to search YouTube to find what people are uploading from the scene. A couple of months ago an alert went off on our Florida list and it was actually a Canadian tennis player who had been arrested for some kind of sexual misdemeanor. The local journalists in Florida (where he was arrested) were talking about it. One of our clients is the Canadian Broadcasting Corporation and they didn't know about it.
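The alert rule Wardle outlines, a list moving faster than its average tweets per second while also mentioning relevant keywords, might look roughly like the Python sketch below. It is not the software built with Clique: the class name `VelocityMonitor`, the sliding window length and the `factor` multiplier are assumptions for illustration, and only the three keywords come from the interview.

```python
from collections import deque

# Keywords quoted in the interview; a real deployment would use a longer list.
ALERT_KEYWORDS = {"bomb", "killing", "explosion"}


class VelocityMonitor:
    """Tracks the tweet rate of one list and flags fast bursts that mention alert keywords."""

    def __init__(self, baseline_rate, window_seconds=60.0, factor=2.0):
        self.baseline_rate = baseline_rate    # average tweets per second, assumed to be known
        self.window_seconds = window_seconds  # hypothetical sliding-window length
        self.factor = factor                  # hypothetical multiple of the baseline that counts as fast
        self.recent = deque()                 # (timestamp, text) pairs still inside the window

    def add(self, timestamp, text):
        """Record a tweet; return the sorted matched keywords if an alert fires, else None."""
        self.recent.append((timestamp, text))
        # Drop tweets that have fallen out of the sliding window.
        while self.recent and self.recent[0][0] <= timestamp - self.window_seconds:
            self.recent.popleft()
        rate = len(self.recent) / self.window_seconds
        if rate <= self.baseline_rate * self.factor:
            return None  # the list is not moving unusually fast
        words = set()
        for _, tweet in self.recent:
            words.update(w.strip(".,!?:;\"'").lower() for w in tweet.split())
        hits = words & ALERT_KEYWORDS
        return sorted(hits) if hits else None
```

Requiring both conditions is what filters out ordinary chatter. A fast-moving list with no alert keywords stays quiet, and so does a single stray mention of "bomb" on a slow day. As Wardle notes, about a third of alerts still turn out to be irrelevant, so the output is a prompt for a journalist rather than a verdict.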
What about other social networks? Not everyone uses Twitter.
Claire Wardle: Twitter is much more open. We are building technology for Facebook but it's much harder to do. Every social network has value for different stories. The Congo rebels or Syrian activists will be sharing information on Facebook so we have interest lists that we follow. We have built little hacks for YouTube so we can get in and find things before they are indexed. Sometimes it can take 3 hours for a video to come up in search results.
How are your partnerships with these networks?
Mark Little: A lot of it is about working with YouTube to get them to change their product. They come to us and say “How can we do news?” It's the one area they can't do properly. We have these platforms--primarily YouTube and Twitter--which are revolutionary for journalism, and yet I think between them there are two journalists working in those organizations and neither of those people is involved in product development. We want to protect the rights of uploaders and give them credit but the incentive at the moment is not to reward the original uploader but to rip them off. If you put a reference in that video, you can track it and also refer back to the original. But we don't just need technological changes; we need a movement of people like us and other news organizations to persuade companies like YouTube to make these changes.
Claire Wardle: We also work quite closely with YouTube's policy unit. If we see something quite gruesome we will give them a call to say “You really need to take down that execution video from Syria, but at the same time that needs to go into your archive since the International Criminal Court may need to use it for evidence.”
With what level of certainty can you say that a video is authentic?
Claire Wardle: Verification is time-consuming and manual but our staff who have been looking at, say, Syrian content for the last two years can now look at the corner of a building and say “Yes, that's that minaret in Aleppo” or see a wingtip and say “That's a MiG.” There are a number of different checks which have to be done. We often use the example of bodies being dumped over a bridge in Syria. We could geolocate the bridge and talk to activists on the ground. What we didn't have was a date. The water levels in the river when the activists said this had happened were too high. We can verify up to a point but can't say when it was. We will explain to news organizations, “Here's the map, here's the upload history, here's any other information we have,” but we leave the final editorial decision to the newsroom.
Mark Little: After the Aurora movie theatre shootings, ABC came to us with a video 90 minutes before they were going on air and said “Can you absolutely verify this video?” and we said no. Four out of five things match but that one thing means we wouldn't use it.
Do you also look at what people are saying in a video?
Claire Wardle: We have networks of people on the ground who are very good on dialects, but they will also say things like “That group never swears and those people are swearing” or “Those boots that they are wearing are not ex-army.” With ongoing stories like Syria, we are much stronger since we are doing it all the time. Some of this is technology but some of it is just old-fashioned journalism like knowing your network.
Mark Little: Is that an Aleppo accent or Damascus accent? No algorithm ever is going to solve that. You have to talk to somebody who knows. While I would like to get automation to 99 percent, that one percent is the thing that makes this all work. Technology is needed, but it's not sufficient.
Geofeedia: Social media search and monitoring by location.
Google Maps and Street View: Use to compare a video or photo with the location where it was purportedly taken. Flickr also has a function that allows a search for photos on a map.
Wikimapia: Wikipedia for maps, which allows anyone to "tag" a location and describe it. This is often useful for identifying districts or buildings that don’t appear on commercial map services like Google Maps. Maplandia has similar features.
Google Image search: Use "Search by image" rather than text to find where an image appears online. It also returns visually similar images. Often there will be dates attached, giving you an idea of the history of an image.
TinEye: An alternative reverse image search. Sorts results by image size and by how much the image has been modified.
FourMatch: Image forensics to detect whether or not an image has been tampered with. It's a paid service but you can get a demo key for $20 which lets you analyze 30 images in 30 days.
DomainTools: Check domain ownership.
Spokeo People Search: Find information about people in the U.S.
[Image: Flickr user Laogooli]