A.I. for Texas Hold'em has 10 times the win rate of professional poker players

A powerful algorithm known as DeepStack has refined its own strategy at heads-up, no-limit Texas hold’em to the point where it beats professional poker players at roughly 10 times the win rate a human professional would be happy with. The core innovations demonstrated by the algorithm could find real-world applications where ‘intuition’ is prized.

In the mid-1990s, there was great excitement when Deep Blue, an IBM machine capable of calculating 100 million positions per second, won at chess against Garry Kasparov. Last year, an even bigger breakthrough was reported after a Google artificial intelligence beat a world champion Go player in a series of games. Chess has an estimated 10¹²⁰ possible games, while estimates for Go run as high as 10⁷⁶¹.

Compared to chess or Go, you might think poker is a piece of cake. However, this deceptively simple-looking game is anything but. First of all, they’re two different kinds of games. Chess and Go are open games, meaning each player sees the whole board and knows exactly what the opponent is doing, whereas poker is closed: each player can see only their own cards, not those of their opponents. To add to the complexity, poker typically involves more than two players.

Since poker is a game of imperfect information, it has always been challenging for engineers to develop AIs that can reliably win against professional human players.

“Poker has been a longstanding challenge problem in artificial intelligence,” says Michael Bowling, professor in the University of Alberta’s Faculty of Science and principal investigator on the study. “It is the quintessential game of imperfect information in the sense that the players don’t have the same information or share the same perspective while they’re playing.”

Creating a machine that performs well at imperfect-information games could help us tackle numerous real-life problems, which are inherently information-limited.

“Think of any real world problem. We all have a slightly different perspective of what’s going on, much like each player only knowing their own cards in a game of poker,” Bowling said.

A machine with intuition

For their study, a mixed team of Czech and Canadian researchers recruited 33 professional poker players and asked each of them to play 3,000 hands of heads-up, no-limit Texas hold’em against DeepStack. Astonishingly, DeepStack obtained a win rate of 49 big blinds per 100 hands (49 BB/100). To give you an idea, a professional poker player is typically satisfied with a win rate of 5 BB/100, so DeepStack effectively scored about 10 times that rate, and against professional players to boot.
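To make the comparison concrete, here is a minimal sketch of the win-rate arithmetic. The function name and the session total of 1,470 big blinds are hypothetical, chosen only so the numbers reproduce the 49 BB/100 figure quoted above; they are not data from the study.

```python
def bb_per_100(big_blinds_won: float, hands_played: int) -> float:
    """Win rate expressed as big blinds won per 100 hands (BB/100)."""
    if hands_played <= 0:
        raise ValueError("hands_played must be positive")
    return 100.0 * big_blinds_won / hands_played


# Hypothetical session: 1,470 big blinds won over 3,000 hands.
deepstack_rate = bb_per_100(1470, 3000)

# The article's benchmark for a satisfied professional.
pro_rate = 5.0

# How many times larger the first rate is than the second.
ratio = deepstack_rate / pro_rate
```

With these illustrative inputs, `deepstack_rate` is 49.0 and `ratio` is 9.8, which is where the "10 times" figure comes from.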

Central to this level of performance was DeepStack’s ability to treat each situation as a mini game of poker. What I mean by that is that the machine tried to find the best strategy to win the hand in front of it without taking previous games into account; it sort of had its own intuition.

By not leaning on abstractions such as the size of the opponent’s bet or how many times the opponent won in the past, DeepStack avoids most of the pitfalls previous poker AIs fell into. Using a technique called continual re-solving, the researchers found that letting DeepStack reason only a few actions deep (DeepStack’s action, the opponent’s response, DeepStack’s reaction, and so on) before stepping back to assess the current situation produced the best results. This mirrors human intuition, which acts on fairly limited information but reasons as if we were always close to the end of the game.

“Instead of solving one big poker game, it solves millions of these little poker games, each one helping the system to refine its intuition of how the game of poker works. And this intuition is the fuel behind how DeepStack plays the full game,” Bowling said.

“Putting these pieces together, DeepStack reasons uniquely about each situation that arises during play. It reasons only a limited amount ahead into the game before using its trained intuition to evaluate how good it is to reach possible poker situations. This results in probabilities for each action it should take. When DeepStack must act again, it repeats this whole process,” Bowling told ResearchGate, adding that DeepStack was trained on millions of situations whose results were fed into a neural network.
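For readers who want a feel for how a short look-ahead plus a table of "how good is it to get here" values can turn into action probabilities, here is a toy sketch. Everything in it is an assumption made for illustration: the article does not say which solver DeepStack uses, so regret matching is swapped in as one well-known way to find a mixed strategy in a two-player zero-sum game, and the `LEAF_VALUES` table is a made-up stand-in for the trained intuition, not output from any real model.

```python
from typing import List


def regret_matching_strategy(regrets: List[float]) -> List[float]:
    """Play each action in proportion to its positive regret (uniform if none)."""
    positive = [max(r, 0.0) for r in regrets]
    total = sum(positive)
    if total <= 0.0:
        return [1.0 / len(regrets)] * len(regrets)
    return [p / total for p in positive]


def solve_lookahead(leaf_values: List[List[float]], iterations: int = 20000) -> List[float]:
    """Return our average strategy for a one-step zero-sum look-ahead.

    leaf_values[i][j] is our estimated payoff when we take action i and the
    opponent answers with action j; the opponent's payoff is the negative.
    """
    n_rows, n_cols = len(leaf_values), len(leaf_values[0])
    row_regret = [0.0] * n_rows
    col_regret = [0.0] * n_cols
    row_strategy_sum = [0.0] * n_rows

    for _ in range(iterations):
        row = regret_matching_strategy(row_regret)
        col = regret_matching_strategy(col_regret)
        for i in range(n_rows):
            row_strategy_sum[i] += row[i]

        # Expected value of each of our actions against the opponent's current mix.
        row_values = [sum(leaf_values[i][j] * col[j] for j in range(n_cols))
                      for i in range(n_rows)]
        # Expected value of each opponent action against our current mix.
        col_values = [-sum(leaf_values[i][j] * row[i] for i in range(n_rows))
                      for j in range(n_cols)]

        row_ev = sum(row[i] * row_values[i] for i in range(n_rows))
        col_ev = sum(col[j] * col_values[j] for j in range(n_cols))

        # Regret: how much better each action would have done than the current mix.
        for i in range(n_rows):
            row_regret[i] += row_values[i] - row_ev
        for j in range(n_cols):
            col_regret[j] += col_values[j] - col_ev

    total = sum(row_strategy_sum)
    return [s / total for s in row_strategy_sum]


# Hypothetical leaf values for two of our actions versus two opponent replies.
LEAF_VALUES = [[2.0, -1.0],
               [-1.0, 1.0]]

action_probabilities = solve_lookahead(LEAF_VALUES)
```

For this particular table, the equilibrium mix works out by hand to 0.4 on the first action and 0.6 on the second, and the averaged strategy should settle close to that. In the spirit of continual re-solving as described above, a player built this way would rebuild the table and call `solve_lookahead` again at its next decision rather than reuse the old answer.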

While such sophisticated AIs typically require a supercomputer to run, DeepStack can run on ordinary hardware, the researchers say. The algorithm used a single GPU of the kind you can find in a gaming laptop. If you had the code, you could theoretically adapt it for an online poker website and start earning, but that wasn’t the point of this research.

“Real life decisions are much closer to poker decisions than to decisions in chess or Go. Algorithms that can handle these types of situations make AI more generally applicable and open up many more areas for AI to have an impact,” Bowling said.

Potential applications for DeepStack’s approach include making robust medical treatment recommendations, strategic defense planning, and negotiation, to name a few. The stock market is another example: if a stock performs well for years, that historical record does not guarantee it won’t fall next year. Like poker, the stock market is a decision-making arena where you have to look at the whole distribution of possible outcomes rather than just the average outcome, and, as in poker, an approach like DeepStack’s could conceivably do very well in the financial markets.
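The point about distributions versus averages is easy to see with a toy example. The two return series below are invented for illustration and do not describe any real stock or strategy; the helper name `summarize` is likewise hypothetical.

```python
from statistics import mean
from typing import Dict, List


def summarize(outcomes: List[float]) -> Dict[str, float]:
    """Report the average alongside the extremes of a list of outcomes."""
    return {
        "average": mean(outcomes),
        "worst": min(outcomes),
        "best": max(outcomes),
    }


# Made-up yearly returns (in percent) for two hypothetical investments.
steady = [4.0, 5.0, 6.0, 5.0, 5.0]
volatile = [20.0, 20.0, 20.0, 20.0, -55.0]

steady_summary = summarize(steady)
volatile_summary = summarize(volatile)
```

Both series average 5 percent a year, yet the second hides a year with a 55 percent loss. A decision-maker who looks only at the average cannot tell them apart.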
