
The $100M Problem: How Lyft's Data Platform Prevents ML Failures with Ritesh Varyani at Lyft
16/12/2025 | 25 min
What if your data platform could serve AI-native workloads while scaling reliably across your entire organization? In this episode, Benjamin sits down with Ritesh, Staff Engineer at Lyft, to explore how to build a unified data stack with Spark, Trino, and ClickHouse, why AI is reshaping infrastructure decisions, and the strategies powering one of the industry's most sophisticated data platforms. Whether you're architecting data systems at scale or integrating AI into your analytics workflow, this conversation delivers actionable insights into reliability, modernization, and the future of data engineering. Tune in to discover how Lyft is balancing open-source investments with cutting-edge AI capabilities to unlock better insights from data.

60 Billion Predictions Daily: Inside Credit Karma’s Agentic Data Layer with Maddie Daianu
19/11/2025 | 19 min
What does MLOps look like when you are deploying 60 billion machine learning predictions a day? Maddie Daianu, Head of Data and AI at Intuit Credit Karma, joins the Data Bros to pull back the curtain on one of the highest-volume data environments in FinTech. With a 100-person team serving 140 million members, standard data practices break down. Maddie shares how her team manages terabytes of daily data on Google Cloud and explains the major strategic pivot they are undertaking right now: the move from "Information" to "Agency."

Block Bad Data Before the Write with Nike’s Ashok Singamaneni
07/10/2025 | 20 min
Nike’s Principal Data Engineer Ashok Singamaneni joins Benjamin and Eldad to discuss his open-source data quality framework, Spark Expectations. Ashok explains how the tool, which was inspired by Databricks DLT Expectations, moves data quality checks to the point before data is written to a final table. This proactive approach uses row-level, aggregation-level, and query-level data quality checks to fail jobs, drop bad records, or alert teams, which saves substantial recompute cost and engineering effort in mission-critical data pipelines.

Postgres vs. Elasticsearch: The Unexpected Winner in High-Stakes Search for Instacart with Ankit Mittal
17/09/2025 | 21 min
In this episode of The Data Engineering Show, host Benjamin Wagner speaks with Ankit Mittal, former senior engineer at Instacart, about how the company modernized its search infrastructure by moving single-retailer search from Elasticsearch to PostgreSQL for better performance and simplicity.

Is Self-Service BI a False Promise? Lei Tang of Fabi.ai Thinks So
28/08/2025 | 21 min
AI is reshaping business intelligence by enabling true self-service analytics and letting organizations query their data in natural language. In this episode of The Data Engineering Show, host Benjamin interviews Lei, Co-founder and CTO of Fabi.ai, to explore how AI-native BI platforms are changing data analytics and helping non-technical users derive meaningful insights from complex datasets.



The Data Engineering Show