A bored policeman scrolls through hours of video and suddenly spots a crucial piece of evidence. It's a typical scene from a TV cop show, and one that Icelandic startup Videntifier aims to consign to history. Videntifier's blazingly fast video-matching technology can, in a matter of seconds, scan an entire hard drive to find suspicious material such as child pornography or terrorism propaganda. For most police forces today, that process is still largely manual.
“The Icelandic police had a huge case in 2007-2008 where they seized 15 terabytes of images and video data,” says Videntifier CEO Herwig Lejsek. “The whole department worked on it for four to five months.” Now, Videntifier is working with international police organization Interpol to add video identification to the next version of the latter's International Child Sexual Exploitation Image Database (ICSE), which is already used by investigators all over the world to identify victims and offenders and link investigations together.
“The initial idea with going online with ICSE in 2009 was to make this international collection available to trained investigators doing child abuse cases in the field,” explains Uri Sadeh, coordinator of Interpol's Crimes against Children Unit. “Am I investigating a case with a kid who has already been identified somewhere else in the world? Am I investigating five images when 20 more exist of the same victim? A little over 3,000 identified victims from close to 50 countries are documented in the database. When we launched it in 2009, we had 800 declared victims.”
Interpol works with police forces in 119 countries, 40 of which already have direct access to the database, to try to improve the tools available to them. “The first and foremost problem specialized teams are facing is the volume,” says Sadeh. “There are so many people with a sexual interest in children and so much material being produced and shared that it's overwhelming for any police force.” ICSE already contains millions of images, not all of which are illegal. Related material, such as images taken in the same room, at the same location, or in the lead-up to the abuse, is also collected. ICSE already offers a visual comparison tool, a mathematical comparison based on color and shape, which helps investigators link together the images they are working on and group duplicates. Investigators can also check whether a case has already been resolved and, if it hasn't, collaborate with colleagues from around the world.
The next step is video. “We see more and more video on seized computers and online,” Sadeh tells me. “Video also has additional value since you get sound and a language can be recognized. There are multiple scenes, which can make the offender less aware of what he is sharing. An image is pretty controlled and it's easier for an intelligent offender to cover his tracks.” That's where Videntifier comes in.
Offenders are just as likely as the rest of us to have a hard drive full of Hollywood blockbusters and TV shows, making finding suspicious material a hunt for the proverbial needle in a haystack. At the core of Videntifier Forensic is a database of 70,000 hours of video, covering most Hollywood movies from the past 50 years, U.S. and European TV shows, music videos, and legal adult content.
Videos are split into frames and converted into a set of visual fingerprints, mathematical representations of the characteristics of a particular object within an image frame. A new video is matched against the 6 billion fingerprints in the database in order to identify it. “On average we get 80,000 to 100,000 fingerprints per hour of video, but it really depends on the content itself,” says Lejsek. “If there's a sitcom scene where there is not much camera movement, you might only have 20,000, but an action movie might contain several hundred thousand fingerprints.” The system can also find malicious content which a suspect has tried to hide within another video. “Our clients have seen popular German TV shows, for example, which have these tiny clips of child abuse inserted every once in a while.”
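To make the matching idea concrete, here is a minimal sketch in Python. It is not Videntifier's implementation, which the article does not describe in detail. It assumes each fingerprint can be reduced to an exactly comparable integer, whereas a real system would need approximate matching over visual descriptors. The names `build_index`, `identify`, and `min_share` are hypothetical.

```python
from collections import Counter

def build_index(reference_videos):
    """Map each fingerprint to the set of reference titles that contain it."""
    index = {}
    for title, fingerprints in reference_videos.items():
        for fp in fingerprints:
            index.setdefault(fp, set()).add(title)
    return index

def identify(query_fingerprints, index, min_share=0.5):
    """Vote for the reference title sharing the most fingerprints.

    Returns (title, share). The title is None when the best candidate
    covers less than min_share of the query, i.e. the video stays
    unidentified and is left for an investigator to review.
    """
    if not query_fingerprints:
        return None, 0.0
    votes = Counter()
    for fp in query_fingerprints:
        for title in index.get(fp, ()):
            votes[title] += 1
    if not votes:
        return None, 0.0
    title, hits = votes.most_common(1)[0]
    share = hits / len(query_fingerprints)
    return (title if share >= min_share else None), share

# Toy reference database: integers stand in for fingerprints (assumption).
reference = {
    "sitcom episode": [1, 2, 3, 4],
    "action movie": [10, 11, 12, 13, 14],
}
index = build_index(reference)
```

Here `identify([1, 2, 3, 99], index)` attributes the query to the sitcom, because three of its four fingerprints are known, while a query with no known fingerprints comes back unidentified. As a rough consistency check on the figures quoted above, 6 billion fingerprints spread over 70,000 hours is about 86,000 per hour, inside the range Lejsek gives.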
The software can identify 70% to 80% of the video content on a typical hard drive. The rest stays unidentified and is therefore potentially malicious. That's still a lot of content, and Lejsek says that in most cases less than 1% of the material on a seized hard drive is malicious, but it gives investigators a head start. “Typically a person doesn't just download one piece of child abuse material; they collect a lot. So it doesn't matter if you find 101 clips as opposed to 103.”
Videntifier has been working with Interpol since 2012 and will develop the third version of the ICSE, adding its own video analysis technology and Forensic Image Analyser from UK company Forensic Pathways, which can identify the device on which an image was captured (Sadeh asked me to note that Forensic Image Analyser has not yet been tested by Interpol but it will be available as part of Videntifier's implementation). The aim of Interpol's ICSE is not to identify child sex exploitation images or video but to link them together, so Videntifier's technology will be used in a new way. “When a particular police force finds some child abuse content, they upload this material to the ICSE database and research whether this material has been seen before in any of the Interpol member states. Maybe there are different versions of the same video, the background is similar or the location is the same. Videntifier can help by finding visual similarities. Forensic Pathways can help when two children have been abused at a different location so there are no visual similarities but the images have been taken on the same device.”
Richard Leary is a former senior police detective and currently the managing director of Forensic Pathways. The company's Forensic Image Analyser can identify images which have been taken with the same device and also match an image to a particular device. “All silicon chips have slight differences which can be thought of as 'scratches' or minute imperfections on the chip,” says Leary. “It's very similar to ballistics fingerprinting and the scratches which appear on a bullet when it is forced down the barrel of a gun. When the light comes through on the lens and interacts with the sensor, we can identify a noisy signature created by these 'scratches'. We use that signature to distinguish one camera from another.” If you can identify content and the device which captured that content, you are halfway to finding an offender. “Imagine that there are 250 images of child exploitation on a laptop and they were taken on the same camera as a set of wedding photographs on a different laptop with lots of faces and information about the location where they were taken. We can to a certain degree establish the identity of the owner of that particular device. A suspect will often admit that the images are illegal but will say, 'But I didn't take them.'”
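The idea Leary describes, a fixed noise pattern that each sensor stamps on every image, can be illustrated with a small simulation. The sketch below is a generic sensor-noise comparison and should not be read as Forensic Image Analyser's actual algorithm, which the article does not specify. Every name in it is hypothetical, the "photos" are synthetic grids of numbers, and the local-mean filter is an assumed stand-in for whatever denoising a real tool uses.

```python
import random

def mean(values):
    values = list(values)
    return sum(values) / len(values)

def noise_residual(image):
    """Flatten an image to residuals: each pixel minus its 3x3 local mean."""
    h, w = len(image), len(image[0])
    out = []
    for y in range(h):
        for x in range(w):
            neighbours = [image[j][i]
                          for j in range(max(0, y - 1), min(h, y + 2))
                          for i in range(max(0, x - 1), min(w, x + 2))]
            out.append(image[y][x] - mean(neighbours))
    return out

def device_signature(images):
    """Average the residuals of several images assumed to share a device."""
    residuals = [noise_residual(img) for img in images]
    return [mean(column) for column in zip(*residuals)]

def correlation(a, b):
    """Pearson correlation between two equal-length residual vectors."""
    ma, mb = mean(a), mean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    var_a = sum((x - ma) ** 2 for x in a)
    var_b = sum((y - mb) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / (var_a * var_b) ** 0.5

# Hypothetical simulation helpers (not real camera data).
def make_sensor(seed, size=24):
    """A fixed per-device imperfection pattern, the 'scratches' in the quote."""
    rng = random.Random(seed)
    return [[rng.gauss(0, 2.0) for _ in range(size)] for _ in range(size)]

def synthetic_photo(sensor, rng):
    """A flat scene of random brightness plus sensor pattern plus shot noise."""
    brightness = rng.uniform(50, 200)
    return [[brightness + p + rng.gauss(0, 0.5) for p in row] for row in sensor]
```

In this simulation, a signature built from a handful of photos from one simulated sensor should correlate strongly with a further photo from that sensor and only weakly with a photo from a different one. Real photographs are far harder, since scene detail, compression, and resizing all disturb the pattern.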
Interpol is working on another project related to ICSE, internally dubbed Baseline. “We are aiming to create a category of images that is flagged as illegal in any country that has legislation on this topic and provide a list of signatures of that confirmed child abuse content to law enforcement,” says Sadeh. “We are also trying to encourage Internet companies or even large companies, companies who have 40,000 people working for them around the world, to detect and act on these illegal images because you will find child abuse material on their internal network.”
One implementation will be a list of signatures which can be implemented locally in any solution. A tool to query the signature list on the fly as it is updated will also be available. “It saves industry from exposing their employees to these images. We are hoping this will happen in 2014,” says Sadeh.
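A locally held signature list of the kind Sadeh describes could work roughly as follows. This is a sketch under assumptions: the article does not say what kind of signature Baseline uses, so a SHA-256 digest of the file contents stands in for it here, and the `SignatureList` class and its methods are hypothetical names rather than any Interpol interface.

```python
import hashlib

def file_signature(data: bytes) -> str:
    """Signature of a file's contents. SHA-256 is an assumed stand-in."""
    return hashlib.sha256(data).hexdigest()

class SignatureList:
    """A local copy of known signatures that can be refreshed over time."""

    def __init__(self, signatures=()):
        self._known = set(signatures)

    def update(self, new_signatures):
        """Merge newly published signatures into the local list."""
        self._known.update(new_signatures)

    def is_flagged(self, data: bytes) -> bool:
        """Check a file against the list without anyone viewing it."""
        return file_signature(data) in self._known

# Placeholder byte strings stand in for file contents.
baseline = SignatureList([file_signature(b"known-sample-1")])
baseline.update([file_signature(b"known-sample-2")])
```

With this list, `baseline.is_flagged(b"known-sample-2")` is true and an unrelated file is not flagged. Only digests are compared, which is how a company could scan its network without exposing employees to the images. An exact hash of this kind only matches byte-identical copies, so a re-encoded or resized file would slip past it.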
Videntifier also wants to sell its technology to Internet-hosting companies and social networks to help them filter child sex-abuse content. When perpetrators are constantly finding new ways of sharing content, the latest being Bitcoin, that's a critical issue. “We are talking to several U.S. authorities who are involved in the fight against child abuse,” says Lejsek. “Their need is even higher than that of Interpol. Large organizations in the U.S. have gathered more than 10 million child-abuse images and many thousands of hours of video. The amount of material out there is tremendous.”