Berkeley professor Pieter Abbeel on deep reinforcement learning, robot manipulation, self-play, imitation learning, and the hard problem of hierarchical reasoning.

Pieter Abbeel is a professor at UC Berkeley and director of the Berkeley Robot Learning Lab, and a leading researcher on teaching robots to understand and interact with the world using imitation and deep reinforcement learning.
Lex Fridman talks with UC Berkeley robotics professor Pieter Abbeel about the state of deep reinforcement learning and robotics. They discuss why beating Roger Federer at tennis is as much a hardware problem as a software problem, the psychology of interacting with robots, and why RL works despite sparse and delayed rewards. Abbeel shares his intuition that neural-net control benefits from being a gradual tiling of linear feedback controllers, and explains the open challenges of hierarchical reasoning and credit assignment over long time horizons. The conversation covers transfer learning, self-play, imitation and third-person learning, simulation ensembles for sim-to-real transfer, and AI safety and testing, and ends on whether RL robots could be taught kindness and affection.