Home Lex Fridman Notes
Lex Fridman · 2018-12-28 · 1h 06m

Tuomas Sandholm: Poker and Game Theory | Lex Fridman Podcast #12

CMU's Tuomas Sandholm explains how his AI Libratus beat top human poker pros and why game theory will reshape business and military strategy.

Tuomas Sandholm: Poker and Game Theory | Lex Fridman Podcast #12
The guest

Tuomas Sandholm — Carnegie Mellon professor and co-creator of Libratus, the first AI to beat top human players at heads-up no-limit Texas Hold'em; founder of startups Strategic Machine and Strategy Robot.

The gist

Tuomas Sandholm walks Lex Fridman through the 2017 'Brains vs AI' event where his system Libratus beat four of the world's top heads-up no-limit Texas Hold'em players over 120,000 hands. He explains why imperfect-information games are far harder than chess or Go, how abstraction (information and action) and Nash equilibrium let an AI derive beliefs without any opponent data, and why two-player zero-sum games are tractable while three-player and collusion games are not. The discussion moves into mechanism design, including impossibility results and finding 'islands of possibility' within them. Sandholm shares his strong optimism about AI's positive real-world impact (kidney exchange, combinatorial sourcing auctions) and his worries about climate change and nuclear war. He closes on his goal of bringing computational game theory into business and military strategy at scale.

Big reveals

  • Libratus beat four top-10 heads-up players over 120,000 hands at Rivers Casino across 20 days, winning close to $2 million in (non-real) chips.
  • International betting sites made the AI a 4-to-1 or 5-to-1 underdog, and stayed 50/50 even after 105-108 hands of the AI beating the humans.
  • In imperfect-information games an evaluation function isn't enough: a state's value depends on both players' belief distributions, not just the cards.
  • Nash equilibrium itself derives the belief distributions via Bayes' rule, so Libratus needs no opponent data or play history at all.
  • Two-player zero-sum games are conceptually and computationally tractable, but three-player or general-sum games fall apart due to equilibrium selection and collusion.
  • Sandholm says he became 'the most hated person in the world of poker,' receiving aggressive comments just short of death threats.
  • His nationwide kidney exchange has kept hundreds of people alive who otherwise would not be, while also increasing employment.

Things worth remembering

  • Heads-up means two players, making poker like chess or Go but with imperfect information since each holds two private cards.
  • The poker game tree is on the order of 10^161 states, far too large to solve directly even with the fastest algorithms.
  • In games, a finer-grained abstraction can paradoxically yield a worse real-game strategy than a coarse one (abstraction pathologies).
  • No-limit Texas Hold'em has such high variance that statistical significance requires over 100,000 hands, not just thousands.
  • A Norwegian poker player reportedly won a tournament without looking at her own cards, though Sandholm calls it extremely rare.
  • Nash equilibrium was introduced by John Nash in 1950 and defines not just strategies but beliefs for each information set.
  • The Myerson-Satterthwaite theorem (1983) proves the impossibility of efficient trade under imperfect information.
  • Sandholm's CombineNet ran 800 combinatorial sourcing auctions (2001-2010), improving $60B of spend efficiency by 12.6%, over $6B in savings.
  • He names climate change and nuclear war as the two biggest threats facing mankind, and considers nuclear risk higher now than ever before.
  • Sandholm published the first automated algorithm-configuration paper with theoretical generalization guarantees, after 17 years of the field lacking such theory.