Lex Fridman · 2024-06-19 · 3h 02m

Aravind Srinivas: Perplexity CEO on Future of AI, Search & the Internet | Lex Fridman Podcast #434

Perplexity CEO Aravind Srinivas on building an answer engine, how search and LLMs combine, and the future of AI reasoning.

The guest

Aravind Srinivas — Co-founder and CEO of Perplexity, an AI answer engine that backs every answer with citations. Formerly a Berkeley PhD student and AI researcher at DeepMind, Google, and OpenAI.

The gist

Aravind Srinivas explains how Perplexity works as an answer engine that combines traditional search with large language models, forcing every sentence of an answer to be backed by a retrieved source to reduce hallucinations. He and Lex dig into the technical machinery: retrieval-augmented generation, web crawling and indexing, ranking signals like BM25, and the tradeoffs of latency, context windows, and model choice. The conversation ranges across Google's ad business model, lessons from founders like Bezos, Musk, Jensen Huang, and Larry Page, and the history of breakthroughs in deep learning from attention to Transformers to scaling laws. Srinivas argues the next frontier is decoupling reasoning from facts and scaling inference compute, and frames Perplexity's mission as serving human curiosity and knowledge discovery rather than beating Google at its own game.

Big reveals

  • Perplexity is best described as an answer engine where every answer is backed by sources, inspired by how academics cite peer-reviewed work in papers.
  • The idea was born out of frustration with health-insurance searches on Google, leading the founders to build a Slack bot on GPT-3.5 that initially gave incorrect answers.
  • Srinivas says Perplexity never tried to beat Google at its own 10-blue-links game; the disruption comes from rethinking the search UI to lead with answers, not links.
  • Perplexity's first product searched over Twitter using academic API accounts, generating SQL queries from natural-language questions, before pivoting to general web search.
  • Perplexity launched December 7, 2022, and the founders' early ambition was just to build a small business serving enterprises before usage unexpectedly exploded.
  • Srinivas argues a breakthrough that decouples reasoning from facts could yield much smaller models that reason well, removing the need for million-GPU clusters.
  • Perplexity built its own model, Sonar, on top of Llama 3 70B, post-training it for summarization, citations, and long-context support, but the product stays model-agnostic.
  • Srinivas deliberately rejected investor advice to build AI girlfriend/boyfriend products where hallucination is a feature, choosing the harder path of truth-grounded answers.

Things worth remembering

  • Google Cloud and YouTube together were announced as being on a $100 billion annual revenue run rate, which Srinivas says could alone justify a trillion-dollar valuation at a 10x multiple.
  • The Google AdWords auction model was first conceived by a company called Overture; Google added a small change to the bidding system to make it more mathematically robust.
  • 'Answer engine optimization' lets people embed invisible text in websites instructing AIs to say specific things, a form of prompt injection.
  • Bezos's relentless.com domain redirects to amazon.com; it was among the first names he registered for the company in 1994.
  • Yann LeCun famously said at NeurIPS that reinforcement learning is just the cherry on the cake, supervised learning the icing, and unsupervised learning the bulk of the cake.
  • The Transformer architecture has remained essentially unchanged since 2017, with only minor tweaks to nonlinearities and scaling.
  • BM25, a refinement of the decades-old TF-IDF, still beats most embeddings on ranking; OpenAI's embeddings initially failed to beat BM25 on many benchmarks.
  • Instagram founder Mike Krieger told Srinivas the most common Instagram search is people searching for themselves.
  • Perplexity tracks P90 and P99 tail latencies across every component, an approach inspired by a Jeff Dean paper, and works with Nvidia on TensorRT-LLM kernel optimization.
  • Around 2010, roughly one-third of Google's three billion daily queries were answered directly by instant answers from the Knowledge Graph.
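
For readers unfamiliar with the BM25 ranking function mentioned in the list above, here is a minimal sketch of the standard Okapi BM25 formula in Python. It is illustrative only and says nothing about how Perplexity actually ranks results. The function name bm25_scores, the toy documents, and the parameter defaults k1=1.5 and b=0.75 are assumptions chosen for the example.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each tokenized document against a tokenized query (Okapi BM25)."""
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: how many documents contain each term at least once.
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)  # term frequency within this document
        score = 0.0
        for term in query:
            if term not in tf:
                continue
            # Smoothed IDF; the "1 +" inside the log keeps it non-negative.
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Length normalization: long documents are penalized via b.
            norm = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


# Hypothetical toy corpus, tokenized by whitespace.
docs = [
    "perplexity is an answer engine".split(),
    "google is a search engine with ads".split(),
    "transformers use attention".split(),
]
scores = bm25_scores("answer engine".split(), docs)
best = max(range(len(docs)), key=scores.__getitem__)
```

Compared with plain TF-IDF, the k1 term makes repeated occurrences of a word saturate rather than grow without bound, and b controls how strongly document length is normalized.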

Recommended in this episode

Books, products and media the guest or host genuinely endorsed here — with the buy link.

Guest’s own · Product

Perplexity AI

“So, you know, I got together with my co-founders Dennis and Johnny, and all we wanted to do was build cool products with LLMs” — Aravind Srinivas 01:41:21