This week on Ship It Weekly, Brian covers three “automation meets reality” stories that every DevOps, SRE, and platform team can learn from. Cloudflare accidentally withdrew customer BYOIP prefixes due to a buggy cleanup task, Clerk got knocked over by a Postgres auto-analyze query plan flip, and AWS responded to reports about its internal Kiro tooling by framing the incident as misconfigured access controls. Plus: a quick EKS node monitoring update, and a tight security lightning round.

Links
Cloudflare BYOIP outage postmortem https://blog.cloudflare.com/cloudflare-outage-february-20-2026/
Clerk outage postmortem (Feb 19, 2026) https://clerk.com/blog/2026-02-19-system-outage-postmortem
AWS outage report (Reuters) https://www.reuters.com/business/retail-consumer/amazons-cloud-unit-hit-by-least-two-outages-involving-ai-tools-ft-says-2026-02-20/
AWS response on Kiro + access controls https://www.aboutamazon.com/news/aws/aws-service-outage-ai-bot-kiro
EKS Node Monitoring Agent (open source) https://aws.amazon.com/about-aws/whats-new/2026/02/amazon-eks-node-monitoring-agent-open-source/
Grafana CVE-2026-21721 https://grafana.com/security/security-advisories/cve-2026-21721/
runc CVEs (AWS-2025-024) https://aws.amazon.com/security/security-bulletins/rss/aws-2025-024/
GitLab patch releases https://about.gitlab.com/releases/2025/11/26/patch-release-gitlab-18-6-1-released/
Atlassian Feb 2026 security bulletin https://confluence.atlassian.com/security/security-bulletin-february-17-2026-1722256046.html
Human story: SRE Is Anti-Transactional (ACM Queue) https://queue.acm.org/detail.cfm?id=3773094

More episodes and show notes at https://shipitweekly.fm
On Call Briefs at: https://oncallbrief.com
Ship It Conversations: Mike Lady on Day Two Readiness + Guardrails in the AI Era
24/02/2026 | 34 min
This is a guest conversation episode of Ship It Weekly (separate from the weekly news recaps). In this Ship It: Conversations episode I talk with Mike Lady (Senior DevOps Engineer, distributed systems) from Enterprise Vibe Code on YouTube. We talk day two readiness, guardrails/quality gates, and why shipping safely matters even more now that AI can generate code fast.

Highlights
Day 0 vs Day 1 vs Day 2 (launching vs operating and evolving safely)
What teams look like without guardrails (“hope is not a strategy”)
Why guardrails speed you up long-term (less firefighting, more predictable delivery)
Day-two audit checklist: source control/branches/PRs, branch protection, CI quality gates, secrets/config, staging→prod flow
AI agents: they’ll “lie, cheat, and steal” to satisfy the goal unless you gate them
Multi-model reviews (Claude/Gemini/Codex) as different perspectives
AI in prod: start read-only (logs/traces), then earn trust slowly

Mike’s links
YouTube: https://www.youtube.com/@EnterpriseVibeCode
Site: https://www.enterprisevibecode.com/
LinkedIn: https://www.linkedin.com/in/mikelady/

Stuff mentioned
Vibe Coding (Gene Kim + Steve Yegge): https://www.simonandschuster.com/books/Vibe-Coding/Gene-Kim/9781966280026
Beads (agent memory/issue tracker): https://github.com/steveyegge/beads
Gas Town (agent orchestration): https://github.com/steveyegge/gastown
AGENTS.md (agent instructions file): https://agents.md/
OpenAI Codex: https://openai.com/codex/

More episodes + details: https://shipitweekly.fm
Ship It Weekly – DevOps and SRE News for Engineers Who Run Production
22/02/2026 | 0 min
Ship It Weekly is a DevOps and SRE news podcast for engineers who run real systems. Every week I break down what actually matters in cloud, Kubernetes, CI/CD, infrastructure as code, and production reliability. No hype. No vendor spin. Just practical analysis from someone who’s been on call and shipped systems at scale.

This isn’t a tutorial show. It’s a signal filter. I cover major industry shifts, security incidents, cloud provider changes, and tooling updates, then explain what they mean for platform teams and engineers operating in production. If you work in DevOps, SRE, platform engineering, or cloud infrastructure and want context instead of clickbait, you’re in the right place. New episodes weekly.

You can also find detailed write-ups at: https://shipitweekly.fm
And curated production-focused briefs at: https://oncallbrief.com

Subscribe, and let’s ship.
This week on Ship It Weekly, Brian hits five stories where the “defaults” are shifting under ops teams. GitHub is bringing Agentic Workflows into Actions, Gentoo is migrating off GitHub to Codeberg, Argo CD upgrades are forcing Server-Side Apply in some paths, AWS Config quietly expanded coverage again, and EC2 nested virtualization is now possible on virtual instances.

Links
YouTube episodes https://www.youtube.com/watch?v=tuuLlo2rbI0&list=PLYLi5KINFnO7dVMbhsJQTKRFXfSSwPmuL&pp=sAgC
OnCallBrief https://oncallbrief.com
Teller’s Tech Substack https://tellerstech.substack.com/
GitHub Agentic Workflows (preview) https://github.blog/changelog/2026-02-13-github-agentic-workflows-are-now-in-technical-preview/
Gentoo moves to Codeberg https://www.theregister.com/2026/02/17/gentoo_moves_to_codeberg_amid/
Argo CD upgrade guide: 3.2 -> 3.3 (SSA) https://argo-cd.readthedocs.io/en/latest/operator-manual/upgrading/3.2-3.3/
AWS Config: 30 new resource types https://aws.amazon.com/about-aws/whats-new/2026/02/aws-config-new-resource-types
EC2 nested virtualization (virtual instances) https://aws.amazon.com/about-aws/whats-new/2026/02/amazon-ec2-nested-virtualization-on-virtual/
GitHub status page update https://github.blog/changelog/2026-02-13-updated-status-experience/
GitHub Actions: early Feb updates https://github.blog/changelog/2026-02-05-github-actions-early-february-2026-updates/
Runner min version enforcement extended https://github.blog/changelog/2026-02-05-github-actions-self-hosted-runner-minimum-version-enforcement-extended/
Open Build Service postmortem https://openbuildservice.org/2026/02/02/post-mortem/
Human story: AI SRE vs incident management https://surfingcomplexity.blog/2026/02/14/lots-of-ai-sre-no-ai-incident-management/

More episodes and show info on https://shipitweekly.fm
Special: OpenClaw Security Timeline and Fallout: CVE-2026-25253 One-Click Token Leak, Malicious ClawHub Skills, Exposed Agent Control Panels, and Why Local AI Agents Are a New DevOps/SRE Control Plane (OpenAI Hires Founder)
17/02/2026 | 18 min
In this Ship It Weekly special, Brian breaks down the OpenClaw situation and why it’s bigger than “another CVE.” OpenClaw is a preview of what platform teams are about to deal with: autonomous agents running locally, wired into real tools, real APIs, and real credentials. When the trust model breaks, it’s not just data exposure. It’s an operator compromise.

We walk through the recent timeline: mass internet exposure of OpenClaw control panels, CVE-2026-25253 (a one-click token leak that can turn your browser into the bridge to your local gateway), a skills marketplace that quickly became a malware delivery channel, and the Moltbook incident showing how “agent content” becomes a new supply chain problem. We close with the signal that agents are going mainstream: OpenAI hiring the OpenClaw creator.

Chapters
1. What OpenClaw Actually Is
2. The Situation in One Line
3. Localhost Is Not a Boundary (The CVE Lesson)
4. Exposed Control Panels (How “Local” Went Public)
5. The Marketplace Problem (Skills Are Supply Chain)
6. The Ecosystem Spills (Agent Platforms Leaking Real Data)
7. Minimum Viable Safety for Local Agents
8. The Plot Twist (OpenAI Hires the Creator)

Links from this episode
Censys exposure research https://censys.com/blog/openclaw-in-the-wild-mapping-the-public-exposure-of-a-viral-ai-assistant
GitHub advisory (CVE-2026-25253) https://github.com/advisories/GHSA-g8p2-7wf7-98mq
NVD entry https://nvd.nist.gov/vuln/detail/CVE-2026-25253
Koi Security: ClawHavoc / malicious skills https://www.koi.ai/blog/clawhavoc-341-malicious-clawedbot-skills-found-by-the-bot-they-were-targeting
Moltbook leak coverage (Reuters) https://www.reuters.com/legal/litigation/moltbook-social-media-site-ai-agents-had-big-security-hole-cyber-firm-wiz-says-2026-02-02/
OpenClaw security docs https://docs.openclaw.ai/gateway/security
OpenAI hire coverage (FT) https://www.ft.com/content/45b172e6-df8c-41a7-bba9-3e21e361d3aa

More information and past episodes on https://shipitweekly.fm
About Ship It Weekly - DevOps, SRE, and Platform Engineering News
Ship It Weekly is a short, practical recap of what actually matters in DevOps, SRE, and platform engineering.

Each episode, your host Brian Teller walks through the latest outages, releases, tools, and incident writeups, then translates them into “here’s what this means for your systems” instead of just reading headlines. Expect a couple of main stories with context, a quick hit of tools or releases worth bookmarking, and the occasional segment on on-call, burnout, or team culture.

This isn’t a certification prep show or a lab walkthrough. It’s aimed at people who are already working in the space and want to stay sharp without scrolling status pages and blogs all week. You’ll hear about things like cloud provider incidents, Kubernetes and platform trends, Terraform and infrastructure changes, and real postmortems that are actually worth your time.

Most episodes are 10–25 minutes, so you can catch up on the way to work or between meetings. Every now and then there will be a “special” focused on a big outage or a specific theme, but the default format is simple: what happened, why it matters, and what you might want to do about it in your own environment.

If you’re the person people DM when something is broken in prod, or you’re building the platform everyone else ships on top of, Ship It Weekly is meant to be in your rotation.