Google's acquisition of DeepMind Technologies last month was a huge deal. By snatching up the artificial intelligence company, Google signified a growing interest in deep learning. But what does this buzzword actually mean?
DeepMind was founded in 2010 by neuroscientist and former teenage chess prodigy Demis Hassabis and two colleagues. As its website puts it, "We combine the best techniques from machine learning and systems neuroscience to build powerful general-purpose learning algorithms" with applications in a broad range of industries.
Deep learning is an emerging topic in artificial intelligence. A subcategory of machine learning, deep learning deals with the use of neural networks to improve things like speech recognition, computer vision, and natural language processing. It's quickly becoming one of the most sought-after fields in computer science. But how did it turn from an obscure academic topic into one of tech's most exciting fields—in under a decade?
"Deep learning is being very highly prized at the moment," says Yoshua Bengio, full professor at the Department of Computer Science and Operations Research at the University of Montreal—home to one of the world’s biggest concentrations of deep learning researchers. "The reason for that is that there is currently a lack of experts. It takes around five years to train a PhD student, and five years ago there weren’t that many PhD students starting a career in deep learning. What this means now is that those few that there are are being prized very highly."
In the last few years, deep learning has helped forge advances in areas as diverse as object perception, machine translation, and voice recognition—all research topics that have long been difficult for AI researchers to crack.
To understand what deep learning is, it’s important first to distinguish it from other disciplines within the field of AI. Early work in artificial intelligence dealt with explicit forms of knowledge: essentially, telling computers how to interact with their surroundings based on programmed facts and rules.
One outgrowth of AI was machine learning, in which the computer extracts knowledge through supervised experience. This typically involved a human operator helping the machine learn by giving it hundreds or thousands of training examples, and manually correcting its mistakes.
While machine learning has become dominant within the field of AI, it does have its problems. For one thing, it’s massively time-consuming. For another, it’s still not a true measure of machine intelligence, since it relies on human ingenuity to come up with the abstractions that allow the computer to learn.
"A lot of successful machine learning applications depend on hand-engineering features where the researcher manually encodes relevant information about the task at hand and then there is learning on top of that," says George E. Dahl, a PhD candidate working in the Machine Learning Group at the University of Toronto. "The difference between that and deep learning is that the deep learning researcher will try and get the system to engineer its own features as much as is feasible."
Unlike conventional machine learning, deep learning is mostly unsupervised. It involves, for example, creating large-scale neural nets that allow the computer to learn and "think" by itself without the need for direct human intervention.
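To make the idea of a layered neural net a little more concrete, here is a minimal Python sketch. It is not how any production system works: the weights are picked by hand purely for illustration (a real deep learning system would learn them from data), and the names `step`, `layer`, and `tiny_network` are made up for this example. The lower layer computes two simple features of the raw input, and the upper layer combines them into a more abstract one.

```python
# Illustrative only: the weights below are hand-picked, not learned.

def step(x):
    """Threshold activation: the unit fires (1) when its input is positive."""
    return 1 if x > 0 else 0


def layer(inputs, weights, biases):
    """Compute one layer: each unit takes a weighted sum of the layer below."""
    return [
        step(sum(w * x for w, x in zip(unit_weights, inputs)) + b)
        for unit_weights, b in zip(weights, biases)
    ]


def tiny_network(a, b):
    # Lower level: two simple features of the raw input.
    #   unit 0 fires if either input is on (OR)
    #   unit 1 fires only if both inputs are on (AND)
    low = layer([a, b], weights=[[1, 1], [1, 1]], biases=[-0.5, -1.5])
    # Higher level: combine the simple features into a more abstract one,
    # "exactly one input is on" (XOR).
    high = layer(low, weights=[[1, -1]], biases=[-0.5])
    return high[0]


for a in (0, 1):
    for b in (0, 1):
        print(a, b, tiny_network(a, b))
```

The top-level feature here cannot be computed by a single threshold unit looking at the raw inputs, which is one small illustration of why stacking layers matters.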
"What the computer is learning using deep learning algorithms are more abstract representations of concepts," says Bengio. "Deep learning comes from the notion that as humans we have multiple types of representation—with simpler features at the lower levels and high-level abstractions built on top of that. By representing information in this more abstract way, the machine can generalize more easily."
In 2011, Stanford computer science professor Andrew Ng founded the Google Brain project. It built a neural network, trained with deep learning algorithms, that famously learned to recognize high-level concepts such as cats simply by watching YouTube videos, without ever having been told what a "cat" is.
Last year, Facebook named computer scientist Yann LeCun as its new director of AI Research, aiming to use his deep learning expertise to create tools that better identify faces and objects in the 350 million photos and videos uploaded to Facebook each day.
Another example of deep learning in action is voice recognition in products like Google Now and Apple’s Siri. Much of this work owes a debt to Dahl, whose 2012 paper "Context-dependent pre-trained deep neural networks for large-vocabulary speech recognition" represented a breakthrough in deep learning speech recognition.
"All recent speech recognition products by major companies either use deep neural networks of the sort that I work on—or else will soon," Dahl notes.
What is impressive is how dramatically deep learning can enhance these areas compared to the shallow networks and Gaussian mixture models (GMMs) used before. According to Google researchers, the speech recognition error rate in the new version of Android, which incorporates insights from deep learning, is 25% lower than in previous versions of the software. "In terms of speech recognition, we’re going to see both wider adoption, and increasing gains in accuracy. That’s where I think acoustic modeling is going," Dahl continues.
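A figure like "25% lower" is most naturally read as a relative reduction, not a drop of 25 percentage points. The short Python sketch below shows the arithmetic under that reading. The 20% starting error rate is a made-up number for illustration only, since the article does not give the actual baseline, and `apply_relative_reduction` is a hypothetical helper name.

```python
def apply_relative_reduction(old_error_rate, relative_reduction):
    """Return the new error rate after a relative (not absolute) reduction."""
    return old_error_rate * (1 - relative_reduction)


# Hypothetical baseline: the real previous error rate is not given.
old_rate = 0.20
new_rate = apply_relative_reduction(old_rate, 0.25)

print(f"{old_rate:.1%} -> {new_rate:.1%}")
```

Under these assumed numbers, a system that used to get one word in five wrong would now get roughly one in seven wrong.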
Yoshua Bengio says another area likely to change in the next couple of years, thanks to deep learning, is natural language processing. "This is something that companies like Facebook and Google are very interested in because the possibility of understanding the meaning of the text that people type or say is very important for providing better user interfaces, advertisements, and posts for your [news feed]," he says. "If deep learning can make the kind of impact in this area that it has in speech and object recognition, that could be a very, very important development in terms of value."
One unusual aspect of Google’s DeepMind acquisition was the mandatory establishment of an ethics board. According to people close to the situation, Google’s willingness to establish the board was a deciding factor in DeepMind selling to Google rather than to Facebook. While almost any sci-fi movie of the past 50 years has dealt with ethical questions in some form or other, in the real world there are still relatively few concrete laws dealing with this part of AI, aside from the usual rules concerning things like privacy and product liability.
Bengio says this is with good reason: currently, the models that can be built using even the most sophisticated deep learning tools are comparable only to the brain of an insect in terms of overall number of neurons. "Unsupervised learning is something that still presents big challenges, both computationally and mathematically," he says, explaining why fears concerning AI run amok may be a little premature.
George Dahl agrees. "We still have a very limited understanding of how the human brain works and some of that understanding might be 'platform specific' and not be relevant for artificial learning," he says. "Computers are a lot more powerful than they were even 10 years ago, but there’s so much more scientific progress that needs to be made before we can realize the ambitions of researchers working in this field."
The call for an AI ethics board—and the resulting conversation—speaks less to where artificial intelligence currently is, and more to the level of public awareness surrounding it.
"We’re nowhere near the kind of AI you see in science fiction, but that’s not to say that deep learning isn’t working in a whole lot of areas that are commercially viable—and which could be very useful to people," Dahl says.
A big part of what makes deep learning fascinating, says Dahl, is how fresh the field is.
"Computer science is a young discipline, and deep learning is a very young discipline within that," he says. "This isn’t a subject like mathematics, where to make progress you have to be so specialized that only a few people can understand what you’re doing. It’s a young field—there’s still a lot of low-hanging fruit, or else problems that may not end up being too hard, but which no one has yet had time to attack."
"It’s very exciting for me to be working in a subject where there’s so much potential to have an impact."
[Image: Flickr user Christopher Neugebauer]