Intelligent Machines

A Master Algorithm Lets Robots Teach Themselves to Perform Complex Tasks

One researcher has developed a simple way to let robots generate remarkably sophisticated behaviors.

Dec 21, 2015

For all the talk of machines becoming intelligent, getting a sophisticated robot to do anything complex, like grabbing a heavy object and moving it from one place to another, still requires many hours of careful, patient programming.

Igor Mordatch, a postdoctoral fellow at the University of California, Berkeley, is working on a different approach–one that could help hasten the arrival of robot helpers, if not overlords. He gives a robot an end goal and an algorithm that lets it figure out how to achieve the goal for itself. That’s the kind of independence that will be necessary for, say, a home assistance bot to reliably fetch you a cup of coffee from the counter.

Mordatch works in the lab of Pieter Abbeel, an associate professor of robotics at Berkeley. When I visited the lab this year, I saw all sorts of robots learning to perform different tasks. A large white research robot called PR2, which has an elongated head and two arms with pincer-like hands, was slowly figuring out how to pick up bright building blocks, through a painstaking and often clumsy process of trial and error.

As he works on a better teaching process, Mordatch is mainly using software that simulates robots. This virtual model, first developed with his PhD advisor at the University of Washington, Emo Todorov, and another professor at the school, Zoran Popović, has some understanding of how to make contact with the ground or with objects. The learning algorithm then uses these guidelines to search for the most efficient way to achieve a goal. “The only thing we say is ‘This is the goal, and the way to achieve the goal is to try to minimize effort,’” Mordatch says. “[The motion] then comes out of these two principles.”
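To make the "goal plus minimal effort" idea concrete, here is a minimal sketch in Python. It is not Mordatch's algorithm or simulator: it assumes a toy one-dimensional point mass of unit mass, a made-up cost with one term for missing the goal and one for the force applied, and plain gradient descent with finite-difference gradients. Every name and constant in it is hypothetical.

```python
# Toy 1-D point mass (unit mass); all names and constants are illustrative assumptions.
HORIZON = 20          # number of control steps
DT = 0.1              # seconds per step
GOAL = 1.0            # target position
GOAL_WEIGHT = 100.0   # penalty for ending away from the goal
EFFORT_WEIGHT = 0.01  # penalty for applied force ("effort")


def rollout(controls):
    """Simulate the point mass from rest and return its final position."""
    position, velocity = 0.0, 0.0
    for force in controls:
        position += velocity * DT
        velocity += force * DT  # unit mass, so force equals acceleration
    return position


def total_cost(controls):
    """Goal term plus effort term: the only two things the optimizer is told."""
    goal_cost = GOAL_WEIGHT * (rollout(controls) - GOAL) ** 2
    effort_cost = EFFORT_WEIGHT * sum(force ** 2 for force in controls)
    return goal_cost + effort_cost


def optimize_controls(iterations=200, learning_rate=0.01, eps=1e-4):
    """Gradient descent on the force sequence, using central differences."""
    controls = [0.0] * HORIZON
    for _ in range(iterations):
        gradient = []
        for i in range(HORIZON):
            up = controls[:]
            up[i] += eps
            down = controls[:]
            down[i] -= eps
            gradient.append((total_cost(up) - total_cost(down)) / (2 * eps))
        controls = [c - learning_rate * g for c, g in zip(controls, gradient)]
    return controls
```

Passing the result of optimize_controls() to rollout() should give a final position close to GOAL. Raising EFFORT_WEIGHT trades some accuracy for gentler forces, which is the kind of trade-off the quote describes.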

Mordatch’s simulated robots come in all sorts of shapes and sizes, rendered in blocky graphics that look like something from an unfinished video game. He has tested his algorithm on humanoid shapes; headless, four-legged creatures with absurdly fat bodies; and even winged creations. In each case, after a period of learning, some remarkably complex behavior emerges.

As this video shows, a humanoid robot can learn to get up from any position on the ground and stand on two legs in a very natural-looking way, clamber onto a ledge, or even perform a headstand. The same process works no matter what form the robot takes, and it can even enable two robots to collaborate on a task, such as moving a heavy object.

Building upon this earlier work, this year Mordatch devised a way for robots to perform repetitive behaviors such as walking, running, swimming, or flying. A simulated neural network is trained to control the robot using information about its body, the physical environment, and the objective of moving in a particular direction. This produces natural-seeming locomotion in virtual robots with a humanoid body shape, and flapping motions in ones that have wings. When an operator tells the robot where to go, its neural network adapts the means of locomotion accordingly.
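The idea of a controller that takes in the robot's state and a commanded direction, and outputs a movement command, can be sketched in a few lines of Python. This is an illustration under heavy assumptions, not the network described above: a single linear layer stands in for the neural network, the "robot" is just a two-dimensional velocity, and the training signal comes from a hypothetical hand-written teacher controller rather than from a physics simulation.

```python
import random

GAIN = 2.0          # hypothetical teacher gain
TARGET_SPEED = 1.0  # hypothetical desired speed


def teacher_action(direction, velocity):
    """Stand-in 'expert': push toward the commanded velocity."""
    return [GAIN * (TARGET_SPEED * d - v) for d, v in zip(direction, velocity)]


def policy_action(weights, observation):
    """Linear policy: one weighted sum of the observation per output."""
    return [sum(w * o for w, o in zip(row, observation)) for row in weights]


def train_policy(steps=5000, learning_rate=0.05, seed=0):
    """Fit the policy to the teacher with stochastic gradient descent."""
    rng = random.Random(seed)
    weights = [[0.0] * 4 for _ in range(2)]  # 2 outputs, 4 inputs
    for _ in range(steps):
        direction = [rng.uniform(-1, 1), rng.uniform(-1, 1)]
        velocity = [rng.uniform(-1, 1), rng.uniform(-1, 1)]
        observation = direction + velocity  # commanded direction, then body state
        target = teacher_action(direction, velocity)
        predicted = policy_action(weights, observation)
        for i in range(2):
            error = predicted[i] - target[i]
            for j in range(4):
                weights[i][j] -= learning_rate * error * observation[j]
    return weights
```

After training, calling policy_action(weights, [1.0, 0.0, 0.0, 0.0]) asks the policy to head in the positive x direction from rest; changing the first two inputs changes the commanded direction without retraining, which mirrors how the operator steers the simulated robots.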

Something similar may happen in humans and other animals as they learn to move around. An infant spends a lot of time working out how to move his or her body, and later uses that knowledge to quickly and instinctively plan new motions.

“This stuff is beautiful,” says Josh Tenenbaum, a professor in the Department of Brain and Cognitive Sciences at MIT who studies how humans learn and is working on ways to apply those principles to machines. “They’re really trying to solve a problem that I think very few people have tried to solve until recently.”

Mordatch recently began using some of his techniques in a small humanoid robot called Darwin (see “Robot Toddler Learns to Stand by Imagining How to Do It”). Using the same optimization and learning techniques in the real world is more challenging, however, because the physical world is more complex and unpredictable, and because an algorithm will have imperfect, or noisy, information about it.