Alphabet's Latest AI Show Pony Has More Than One Trick

AlphaZero can teach itself to be the world's best at chess, Go, or Shogi in eight hours or less.
Robot arm plays Japanese chess, also known as Shogi. Yuya Shino/Reuters

The history of artificial intelligence is a procession of one-trick ponies. Over decades, researchers have crafted a series of super-specialized programs to beat humans at tougher and tougher games. They conquered tic-tac-toe, checkers, and chess. Most recently, Alphabet’s DeepMind research group shocked the world with a program called AlphaGo that mastered the Chinese board game Go. But each of these artificial champions could play only the game it was painstakingly designed to play.

DeepMind has now revealed the first multi-skilled AI board-game champ. A paper posted late Tuesday describes software called AlphaZero that can teach itself to be super-human in any of three challenging games: chess, Go, or Shogi---a game sometimes dubbed Japanese chess.

AlphaZero couldn’t learn to play all three games at once. But the ability of one program to learn three different, complex games to such a high level is striking because AI systems---including those that can “learn”---typically are extremely specialized, honed to tackle a particular problem. Even the best AI systems can’t generalize between problems---one reason why many experts say we still have a long way to go before machines rival human abilities.

AlphaZero could be a small step towards making AI systems less specialized. In a tweet Tuesday, NYU professor Julian Togelius noted that truly generalized AI remains a way off, but called DeepMind’s paper “excellent work.”

AlphaZero can learn to play each of the three games in its repertoire from scratch, although it needs to be programmed with the rules of each game. The program becomes expert by playing against itself to improve its skills, experimenting with different moves to discover what leads to a win.
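
To make the self-play idea concrete, here is a minimal Python sketch. It is not DeepMind's method: AlphaZero pairs deep neural networks with tree search, while this toy keeps a simple lookup table and learns a much smaller game chosen purely for illustration (players alternately remove one to three stones from a pile, and whoever takes the last stone wins). The function names, the game, and every parameter are assumptions made for this example. What it shares with the description above is the shape of the loop: the rules are programmed in, the program plays both sides, it sometimes experiments with a random move, and it scores each move by whether the side that made it went on to win.

```python
import random
from collections import defaultdict

MAX_TAKE = 3  # illustrative rule: a player removes 1, 2, or 3 stones per turn


def legal_moves(stones):
    """The rules of the game, supplied by the programmer."""
    return list(range(1, min(MAX_TAKE, stones) + 1))


def train(episodes=20000, max_pile=10, epsilon=0.2, seed=0):
    """Learn move values purely from games the program plays against itself."""
    rng = random.Random(seed)
    value = defaultdict(float)  # (stones, move) -> average result for the mover
    visits = defaultdict(int)   # (stones, move) -> number of times it was played

    for _ in range(episodes):
        stones = rng.randint(1, max_pile)
        history = []  # one (stones, move) entry per turn, sides alternating
        while stones > 0:
            moves = legal_moves(stones)
            if rng.random() < epsilon:
                move = rng.choice(moves)  # experiment with a random move
            else:
                # otherwise play the move that has scored best so far
                move = max(moves, key=lambda m: value[(stones, m)])
            history.append((stones, move))
            stones -= move

        # Whoever moved last took the final stone and wins (+1).
        # Walking backwards, the sign flips because the sides alternate.
        result = 1.0
        for key in reversed(history):
            visits[key] += 1
            value[key] += (result - value[key]) / visits[key]  # running average
            result = -result
    return value


def best_move(value, stones):
    """Pick the move with the highest learned value for this pile size."""
    return max(legal_moves(stones), key=lambda m: value[(stones, m)])
```

Calling `train()` and then `best_move(value, 5)` asks the trained table what to do with five stones on the pile. The known winning strategy in this toy game is to leave the opponent a multiple of four stones, and the table should typically rediscover it without ever being told.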

DeepMind’s new program is modeled on AlphaGo Zero, a Go-playing program revealed by DeepMind in October that learns through that same self-play mechanism. The algorithm at the heart of AlphaZero is an upgraded version of the one that powered that previous program, capable of searching a broader range of possible moves to accommodate different games.

DeepMind’s new paper describes taking three blank-slate versions of AlphaZero and directing each to learn a different game. Humans are no longer the best players at chess, Go, and Shogi, so AlphaZero was tested against the best specialized artificial players available. The new software beat all three---quickly. AlphaZero required four hours to become world-beating at chess, two hours to reach that level in Shogi, and eight hours to get good enough to beat DeepMind’s previous best Go player, AlphaGo Zero.

More flexible learning software could help Google accelerate its expansion of artificial-intelligence technology inside its business.

Techniques at work in DeepMind’s newest creation might also help the group take on the video game StarCraft, on which it has set its sights. A popular commercial video game may seem less daunting than a formal, abstract board game. But StarCraft is considered more complex, because there are far more possible arrangements of pieces and features, and players must anticipate unseen actions by their opponents.

AlphaZero remains a relatively limited slice of intelligence. The human brain can learn more than three board games, and tackle all kinds of spatial, common-sense, logic, artistic, and social conundrums to boot. It also requires a lot less energy than AlphaZero. DeepMind reports that training the program used 5,000 of Google’s powerful custom machine-learning processors, dubbed TPUs.