
AI Agents Fail in Production. Here's Why State Management Matters

Mark Fussell, Co-creator of Dapr, explains how Dapr Agents 1.0 solves Day 2 operational challenges for running AI agents at scale in production on Kubernetes.

Mark Fussell

CEO & Co-Founder

April 10, 2026

Most AI agent prototypes never make it to production. The reason? They fail spectacularly when networks drop, machines crash, or state gets lost mid-transaction. Imagine a workflow that is processing a Stripe payment when the system crashes: on restart it replays the charge step and bills the customer twice. That's the reliability gap killing enterprise AI adoption today.

In this interview with Swapnil Bhartiya, Mark Fussell, Co-creator and Core Maintainer of Dapr, explains how Dapr Agents 1.0 solves the Day 2 operational nightmare of running AI agents at scale. Built on Dapr's durable workflow engine and battle-tested in Kubernetes environments, this CNCF graduated project provides the recovery guarantees that microservices-plus-LLM architectures desperately need.

Key topics covered:

  • Durable execution patterns for stateful AI workflows with automatic crash recovery and checkpoint logging
  • How Dapr's workflow engine prevents duplicate transactions and data loss during network failures in distributed agent systems
  • Production deployment strategies for agentic applications on Kubernetes with vendor-neutral, multi-state store flexibility
  • Real-world case study: Zeiss Vision Care using Dapr Agents for personalized prescription glass manufacturing workflows
  • The evolution from microservices to agentic applications and why workflow reliability is the new competitive advantage
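
To make the duplicate-charge scenario concrete, here is a minimal sketch of the general checkpoint-and-replay idea behind durable execution. It is not Dapr's API or implementation: `CheckpointLog`, `run_step`, `charge_customer`, and `payment_workflow` are hypothetical names invented for illustration, a JSON file stands in for a state store, and an in-memory list stands in for the payment provider.

```python
import json
import os
import tempfile


class CheckpointLog:
    """Hypothetical record of completed steps, persisted as a JSON file."""

    def __init__(self, path):
        self.path = path
        self.results = {}
        if os.path.exists(path):
            with open(path) as f:
                self.results = json.load(f)

    def run_step(self, name, fn):
        # On replay, return the recorded result instead of repeating the side effect.
        if name in self.results:
            return self.results[name]
        result = fn()
        self.results[name] = result
        with open(self.path, "w") as f:
            json.dump(self.results, f)
        return result


charges = []  # stands in for the external payment provider (assumption)


def charge_customer():
    charges.append(100)
    return {"charge_id": len(charges)}


def payment_workflow(log, crash_after_charge=False):
    receipt = log.run_step("charge", charge_customer)
    if crash_after_charge:
        raise RuntimeError("simulated crash")
    return log.run_step(
        "send_receipt", lambda: f"receipt for charge {receipt['charge_id']}"
    )


def demo():
    path = os.path.join(tempfile.mkdtemp(), "checkpoints.json")
    try:
        payment_workflow(CheckpointLog(path), crash_after_charge=True)
    except RuntimeError:
        pass  # the "restart" below reloads the log from disk
    message = payment_workflow(CheckpointLog(path))
    return message, len(charges)
```

Calling `demo()` runs the workflow, crashes it after the charge, and restarts it from the persisted log; the second run skips the already-recorded charge step and only sends the receipt. The sketch still leaves a window between the side effect and the checkpoint write, which is why production systems typically also pass an idempotency key to the payment provider.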

Read the full story and transcript at www.tfir.io