You know CAPTCHA, that string of distorted characters that forces you to prove you’re human? We hate it too. Which is why we were fascinated to find that a six-person startup called Vicarious was claiming it had cracked CAPTCHA altogether, fulfilling a 10-year ambition to create an algorithm able to think and identify characters as well as a living person. In the process, the company is upending much of what we know about machine intelligence.
“We started off by using machine learning tools to create a model of what comprises individual letters—thereby training our system to recognize them,” says Dr. Dileep George, who with D. Scott Phoenix heads the team of AI researchers at Vicarious. That’s not hard. But the next step was to make the system good at learning, even when there wasn’t much data available to suggest a pattern—something much, much harder for a computer to do.
Unlike normal machine learning algorithms, which use large datasets to learn patterns, Vicarious's algorithm was designed to operate on a tiny sample size, without long learning sessions beforehand. This effectively mimics the conditions people face in real-world CAPTCHA tests. It’s not that CAPTCHA is bulletproof (people have cracked it before in limited ways), but its small scope makes it perfect for demonstrating the Vicarious algorithm’s power to learn on the spot.
Normally, machine intelligence starts with long practice sessions in which the computer is shown thousands of different versions of an item (say, a chair) and is corrected by a human being when wrong. But that’s a far cry from the abilities of real human beings, who can identify something after seeing only a couple of examples.
“It is easy to create an illusion of intelligence by using large datasets,” Phoenix says. “It takes a child only a few dozen examples to learn the shapes of letters like ‘a’ and ‘b’. This is because human brains are very good at generalizing from a few examples, the hallmark of intelligence,” he says. “Limiting the number of training examples is important because it shows that the algorithm is able to generalize like the human brain.” To accurately recognize reCAPTCHAs, the Vicarious algorithm requires only between one and five training examples per letter.
Standing for “Completely Automated Public Turing test to tell Computers and Humans Apart,” CAPTCHAs have been on the front line of anti-spam defense for years now. They help Gmail block automated spammers, let eBay keep bots from overloading its marketplace with scams, and prevent hackers from creating fraudulent Facebook profiles.
As with many technologies, the first few generations of CAPTCHAs were poorly designed—with each letter separated out in a way that was easy to solve using standard machine learning techniques or Optical Character Recognition (OCR). Other successful attempts to break individual CAPTCHAs similarly centered on learning to exploit specific bugs or idiosyncrasies within the image generation process.
“If you look at the history of CAPTCHA, there have been a bunch of researchers who have solved one particular CAPTCHA, at one particular point in time, using one particular hack, that only works with that CAPTCHA,” says Phoenix. “For example, they might have taken Yahoo’s CAPTCHA at a time when all of the letters were slanted at exactly 45 degrees, and when the noise that was added to the image had a certain specific set of properties. In other words, what they were creating wasn’t an all-purpose solution.”
Today, George points out, CAPTCHAs have become far more advanced, with letters crowded closely together in a way that can be difficult even for a person to read. Old solutions have a zero percent chance of solving these new CAPTCHAs. “These are far more complex to solve because in order to separate out the letters, you need to actually understand what the letters are,” he says. “This is where state-of-the-art machine learning tools come into play, because these are what is needed to understand segmented letters.”
So what does Vicarious's work mean for anti-spam security in a post-CAPTCHA world? For now, website owners who rely on CAPTCHA can breathe a sigh of relief, since Phoenix and George are not releasing the software publicly. But long-term security will have to improve. “How we distinguish between a human and a computer is going to have to change,” says George. “What people need to understand is that CAPTCHA is a temporary solution. People have got to start thinking beyond it, and to their credit many people are already doing that. Google, for example, has recently announced that it plans to track click patterns and other heuristics to try and filter out bots.”
The real forward momentum of Vicarious's work, however, has nothing to do with anti-spam software and everything to do with artificial intelligence. “Understanding how the brain creates intelligence is the ultimate scientific challenge,” George continues. “Vicarious has a long-term strategy for developing human level artificial intelligence, and it starts with building a brain-like vision system. Modern CAPTCHAs provide a snapshot of the challenges of visual perception, and solving those in a general way required us to understand how the brain does it.”
It is this breakthrough in image recognition that turns what would otherwise be simply a neat algorithm into a far more broadly applicable solution. “In the real world when you’re trying to recognize all of the objects in [a] particular scene that you’re perceiving at any given moment, the objects aren’t cleanly presented to you against a white background with none overlapping with the others,” Phoenix says. “Disambiguating which contours belong to which objects is an example of something that is very easy for our brains, but has historically been next to impossible for computers.”
Images contain a whole world of textual data that computers are unable to understand, and this work could help unlock it. This technology could, for example, allow for the intelligent automated reading of X-rays, with a computer picking up on information that doctors might otherwise miss. “In the long run we’re trying to create systems that can think and learn like the human brain,” Phoenix continues. “Anything that a brain can do, our system should be able to do as well.”
Breaking CAPTCHA might initially sound like a minor computer science puzzle, but as Vicarious's founders point out, its implications are anything but small. Having received funding worth $16.1 million since May 2010, Vicarious could well be at the forefront of cutting-edge artificial intelligence work over the next several years.
“We should be careful not to underestimate the significance of Vicarious crossing this milestone,” says Facebook cofounder Dustin Moskovitz, who serves as a board member at the startup. “This is an exciting time for artificial intelligence research, and [D. Scott Phoenix and Dr. Dileep George] are at the forefront of building the first truly intelligent machines.”
[Image: Flickr user Wonderlane]