
The Information Bottleneck

Ravid Shwartz-Ziv & Allen Roush
Latest episode

37 episodes

  • Reasoning Models and Planning - with Rao Kambhampati (Arizona State)

    29/04/2026 | 1 h 11 min
    We sat down with Rao Kambhampati, a Professor of CS at Arizona State University and former President of AAAI, to talk about reasoning models: what they are, when they work, and when they break.
    Rao has been working on planning and decision-making since long before deep learning, which makes him one of the most grounded voices on what today's reasoning systems actually do. We start with definitions of what reasoning is, why planning is the hard subset of it, and what changed when systems like o1 and DeepSeek R1 moved the verifier from inference into post-training. From there we get into where these models generalize, where they don't, and why benchmarks can be misleading about both.
    A big chunk of the conversation is on chain-of-thought: what intermediate tokens are actually doing, why they help the model more than they help the reader, and what outcome-based RL does to whatever semantic content was there to begin with. We also cover world models and why Rao thinks the video-only framing is the wrong bet, the difference between agentic safety and existential risk, and what the planning community figured out decades ago that the LLM community keeps rediscovering.

    Timeline
    (00:12) Intros
    (01:32) Defining "reasoning" and the System 1 / System 2 framing
    (04:12) Blocksworld vs Sokoban, and non-ergodicity
    (06:42) Pre-o1: PlanBench and "LLMs are zero-shot X" papers
    (07:42) LLM-Modulo and moving the verifier into post-training
    (10:12) Is RL post-training reasoning, or case-based retrieval?
    (13:12) τ-Bench and benchmarks that avoid action interactions
    (14:12) OOD generalization and what we don't know about post-training data
    (19:02) Does it matter how they work if they answer the questions we care about?
    (21:27) Architecture lotteries and why no one tries different designs
    (23:42) Intermediate tokens and the "reduce thinking effort" cottage industry
    (26:12) The 30×30 maze experiment
    (27:42) Sokoban, NetHack, and Mystery Blocksworld
    (34:58) Stop Anthropomorphizing Intermediate Tokens — the swapped-trace experiment
    (46:12) Latent reasoning, Coconut, and why R0 beat R1
    (50:12) How outcome-based RL erodes CoT semantics
    (52:12) Dot-dot-dot and Anthropic's CoT monitoring paper
    (53:42) Safety: Hinton, Bengio, LeCun
    (57:12) Existential risk vs real safety work
    (59:42) World models, transition models, and video-only approaches
    (1:03:12) Why linguistic abstractions matter — pick and roll
    (1:05:42) What the planning community knew in 2005
    (1:08:12) Multi-agent LLMs
    (1:09:57) Closing thoughts: the bridge analogy

    Music:
    "Kid Kodi" - Blue Dot Sessions - via Free Music Archive - CC BY-NC 4.0.
    "Palms Down" - Blue Dot Sessions - via Free Music Archive - CC BY-NC 4.0.
    Changes: trimmed
    About: The Information Bottleneck is hosted by Ravid Shwartz-Ziv and Allen Roush, featuring in-depth conversations with leading AI researchers about the ideas shaping the future of machine learning.
  • What Actually Matters in AI? - with Zhuang Liu (Princeton)

    24/04/2026 | 1 h 9 min
    In this episode, we hosted Zhuang Liu, Assistant Professor at Princeton and former researcher at Meta, for a conversation about what actually matters in modern AI and what turns out to be a historical accident.
    Zhuang is behind some of the most important papers in recent years (with more than 100k citations): ConvNeXt (showing ConvNets can match Transformers if you get the details right), Transformers Without Normalization (replacing LayerNorm with dynamic tanh), ImageBind, Eyes Wide Shut on CLIP's blind spots, the dataset bias work showing that even our biggest "diverse" datasets are still distinguishable from each other, and more.
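For context on the normalization result mentioned above: the "Transformers Without Normalization" paper replaces each LayerNorm with an elementwise "dynamic tanh", DyT(x) = γ · tanh(αx) + β, where α is a learnable scalar and γ, β are per-channel affine parameters. A minimal NumPy sketch (the parameter values below are illustrative placeholders, not the paper's trained values):

```python
import numpy as np

def dyt(x, alpha=0.5, gamma=1.0, beta=0.0):
    """Dynamic tanh (DyT): an elementwise, normalization-free stand-in
    for LayerNorm. alpha is a learnable input scale; gamma and beta play
    the role of LayerNorm's affine parameters. Defaults are illustrative."""
    return gamma * np.tanh(alpha * x) + beta

x = np.array([-1e3, 0.0, 1e3])
print(dyt(x))  # saturates to [-1, 0, 1]: bounded, like a normalized activation
```

Unlike LayerNorm, DyT computes no per-token statistics; the tanh saturation alone keeps activations in a bounded range.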
    We got into whether architecture research is even worth doing anymore, what "good data" actually means, why vision is the natural bridge across modalities but language drove the adoption wave, whether we need per-lab RL environments or better continual learning, whether LLMs have world models (and for which tasks you'd need one), why LLM outputs carry fingerprints that survive paraphrasing, and where coding agents like Claude Code fit into research workflows today and where they still fall short.

    Timeline
    00:13 — Intro
    01:15 — ConvNeXt and whether architecture still matters
    06:35 — What actually drove the jump from GPT-1 to GPT-3
    08:24 — Setting the bar for architecture papers today
    11:14 — Dataset bias: why "diverse" datasets still aren't
    22:52 — What good data actually looks like
    26:49 — ImageBind and vision as the bridge across modalities
    29:09 — Why language drove the adoption wave, not vision
    32:24 — Eyes Wide Shut: CLIP's blind spots
    34:57 — RL environments, continual learning, and memory as the real bottleneck
    43:06 — Are inductive biases just historical accidents?
    44:30 — Do LLMs have world models?
    48:15 — Which tasks actually need a vision world model
    50:14 — Idiosyncrasy in LLMs: pre-training vs post-training fingerprints
    53:39 — The future of pre-training, mid-training, and post-training
    57:57 — Claude Code, Codex, and coding agents in research
    59:11 — Do we still need students in the age of autonomous research?
    1:04:19 — Transformers Without Normalization and the four pillars that survived
    1:06:53 — MetaMorph: Does generation help understanding, or the other way around?
    1:09:17 — Wrap

  • The Future of Coding Agents with Sasha Rush (Cursor/Cornell)

    15/04/2026 | 1 h 24 min
    We talked with Sasha Rush, researcher at Cursor and professor at Cornell, about what it actually feels like to be in the heart of the AI revolution and build coding agents right now. Sasha shared how these tools are changing day-to-day engineering work and what it is like to develop them.
    A big part of the conversation was about why coding has become such a powerful setting for these tools. We discussed what makes code different from other domains, why agents seem to work especially well there, and how much of today’s progress comes not just from better models, but from better ways of using them. Sasha also gave an inside look at how Cursor thinks about training coding models, long-running agents, context limits, bug finding, and the balance between autonomy and human oversight.
    We also talked about the broader shift happening in software engineering. Are developers moving to a higher level of abstraction? Is this just a phase where we “babysit” models, or the beginning of a deeper change in how software gets built? Sasha had a very thoughtful perspective here, including what he’s seeing from students, researchers, and engineers who are growing up native to these tools.
    More broadly, this episode is about what it means to do serious technical work in a moment when the tools are changing incredibly fast. Sasha brought both optimism and skepticism to the discussion, and that made this a really grounded conversation about where coding agents are today, what they are already surprisingly good at, and where all of this might be going next.

    Timeline
    00:00 Intro and Sasha joins us
    01:11 What “coding agents” actually mean
    02:34 Why coding became the breakout use case
    08:56 Long-running agents and autonomous workflows
    15:08 How these tools are changing the work of engineers
    17:15 Are people just babysitting models right now?
    22:11 How Cursor builds its coding models
    26:29 Rewards, training, and what makes agents work
    34:53 Memory, continual learning, and agent communication
    38:00 How context compaction works in practice
    41:29 Why coding agents recently got much better
    50:31 Refactoring, maintenance, and self-improving codebases
    52:16 Bug finding, oversight, and verification
    54:43 Will this pace of progress continue?
    56:42 Can this spread beyond coding?
    58:27 The future of Cursor and coding agents
    1:03:08 Model architectures beyond standard transformers
    1:05:37 World models, diffusion, and what may come next
  • The Hidden Engine of Vision with Peyman Milanfar (Google)

    10/04/2026 | 1 h 24 min
    How Denoising Secretly Powers Everything in AI

    Peyman Milanfar is a Distinguished Scientist at Google, leading its Computational Imaging team. He's a member of the National Academy of Engineering, an IEEE Fellow, and one of the key people behind the Pixel camera pipeline. Before Google, he was a professor at UC Santa Cruz for 15 years and helped build the imaging pipeline for Google Glass at Google X. His work has over 35,000 citations.
    Peyman makes a provocative case that denoising, long dismissed as a boring cleanup task, is actually one of the most fundamental operations in modern ML, on par with SGD and backprop. Knowing how to remove noise from a signal basically means you have a map of the manifold that signals live on, and that insight connects everything from classical inverse problems to diffusion models.
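The manifold claim has a precise classical form (the Tweedie/Miyasawa identity, not spelled out in the show notes): for a Gaussian-noised observation, the MMSE-optimal denoiser is a step along the score of the noisy data distribution,

```latex
y = x + n, \quad n \sim \mathcal{N}(0, \sigma^2 I)
\quad\Longrightarrow\quad
D^*(y) \;=\; \mathbb{E}[x \mid y] \;=\; y + \sigma^2\, \nabla_y \log p(y).
```

So a good denoiser implicitly knows $\nabla \log p$, the local geometry of the data distribution, which is exactly the quantity diffusion models are trained to estimate.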
    We go from early patch-based denoisers to his 2010 "Is Denoising Dead?" paper, and then to the question that redirected his research: if denoising is nearly solved, what else can denoisers do? That led to Regularization by Denoising (RED), which, if you unroll it, looks a lot like a diffusion process, years before diffusion models existed. We also cover how his team shipped a one-step diffusion model on the Pixel phone for 100x ProRes Zoom, the perception-distortion-authenticity tradeoff in generative imaging, and a new paper on why diffusion models don't actually need noise conditioning. The conversation wraps with a debate on why language has dominated the AI spotlight while vision lags, and Peyman's argument that visual intelligence, grounded in physics and robotics, is coming next.
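For reference, the RED objective mentioned above (from the 2017 Romano, Elad, and Milanfar paper) regularizes an inverse problem with the denoiser's own residual; under the paper's conditions (local homogeneity and a symmetric denoiser Jacobian), its gradient is simply that residual:

```latex
\rho_{\mathrm{RED}}(x) \;=\; \tfrac{1}{2}\, x^{\top}\!\bigl(x - D(x)\bigr),
\qquad
\nabla \rho_{\mathrm{RED}}(x) \;=\; x - D(x).
```

Gradient descent on a data-fidelity term plus this regularizer applies the denoiser again and again, which is the sense in which unrolled RED resembles a diffusion process.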

    Timeline
    0:00 Intro and Peyman's background
    1:22 Why denoising matters more than you think; sensor diversity and Tesla's vision-only bet
    15:04 BM3D and why it was secretly an MMSE estimator
    17:02 "Is Denoising Dead?" and what else denoisers can do
    18:07 Plug-and-play methods and Regularization by Denoising (RED)
    26:18 Denoising, manifolds, and the compression connection
    28:12 Energy-based models vs. diffusion: "The Geometry of Noise"
    31:40 Natural gradient descent and why flow models work
    34:48 Gradient-free optimization and high-dimensional noise
    45:13 Image quality and the perception-distortion tradeoff
    48:39 Information theory, rate-distortion, and generative models
    52:57 Denoising vs. editing
    54:25 The changing role of theory
    57:07 Hobbyist tools vs. shipping consumer products
    59:40 Coding agents, vibe coding, and domain expertise
    1:05:00 Vision and higher-dimensional signals
    1:09:31 Do models need to interact with the physical world?
    1:11:28 Continual learning and novelty-driven updates
    1:13:00 On-device learning and privacy
    1:15:01 Why has language dominated AI? Is vision next?
    1:17:14 How kids learn: vision first, language later
    1:19:36 Academia vs. industry
    1:22:28 10,000 citations vs. shipping to millions, why choose?
  • How to Build the Smartest Camera in Your Pocket - with Peyman Milanfar (Google)

    05/04/2026 | 1 h 26 min
    In this episode, we sit down with Peyman Milanfar, Distinguished Scientist at Google, where he leads the Computational Imaging team. Peyman is a member of the National Academy of Engineering, an IEEE Fellow, and one of the key minds behind the imaging pipeline in Google Pixel phones. Before joining Google, he was a professor of Electrical Engineering at UC Santa Cruz for 15 years, and he helped develop the imaging pipeline for Google Glass during his time at Google X. With over 35,000 citations and decades of work at the intersection of image processing and AI, Peyman makes a compelling case that denoising, long dismissed as a "digital janitor" task, is actually one of the most fundamental operations in modern machine learning, on par with SGD and backpropagation.
    We trace the full arc from classical denoising algorithms to modern diffusion models. Peyman explains how early denoisers implicitly learned from image patches, how the "Is Denoising Dead?" paper in 2010 led him to ask what else denoisers could do beyond cleaning up noise, and how that question opened the door to regularization by denoising and, eventually, to the diffusion models powering image generation today.
    We also dig into the practical side, including how Peyman's team shipped a one-step diffusion model on the Pixel phone for 100x ProRes Zoom, the challenges of controlling hallucinations in generative models for consumer products, and why understanding physics and the image formation process still matters in the age of large models.
    The conversation wraps with a big-picture debate: why has language dominated the AI spotlight while vision lags behind? Peyman argues that visual intelligence is coming next, and that, unlike language, vision requires grounding in the physical world through robotics, world models, and continuous learning. He also reflects on his journey from professor to industry researcher and why he wouldn't trade the ability to take ideas from theory to millions of users.

    Timeline
    0:13 Intro
    1:42 Why denoising matters
    3:20 History of denoising
    5:57 How denoisers work
    9:39 Why phones need denoising
    12:54 Tesla's vision-only bet
    14:14 BM3D's dominance
    16:58 "Is Denoising Dead?"
    18:21 Regularization by Denoising (RED)
    24:26 RED looks like diffusion
    26:19 Denoising & manifolds
    28:42 Energy-based vs. diffusion models
    33:46 Blind denoisers
    40:30 Diffusion for text
    45:44 Perception-distortion tradeoff
    53:05 Denoising vs. editing
    57:01 ComfyUI & democratization
    58:51 One-step diffusion on Pixel
    59:51 Coding agents & domain expertise
    1:02:45 Diffusion for music
    1:06:53 World models & continuous learning
    1:15:01 Why vision will overtake language
    1:21:12 Professor vs. Google
    1:25:08 Wrap-up


About The Information Bottleneck

Two AI researchers, Ravid Shwartz-Ziv and Allen Roush, discuss the latest trends, news, and research in generative AI, LLMs, GPUs, and cloud systems.
Podcast website
