Category
video
Date
02 Jul 25
Author
Yaron Schneider

Build Robust Agentic Workflows with Dapr & Dapr Agents

Build durable agents that are resilient to failures, network outages, and full system shutdowns

Yaron Schneider, Diagrid CTO and co-creator of Dapr, demonstrates how Dapr and Dapr Agents provide a proven framework for developing AI-powered applications. You'll learn how to build intelligent agents that can call tools, interact with external systems, and maintain memory, all through consistent, provider-agnostic APIs.

You'll learn:

  • What Dapr Agents is, and how it works
  • How Dapr is uniquely equipped to help build reliable agentic workflows
  • How to use the Conversation API to integrate with multiple LLM providers
  • How to build reliable agents with MCP support and long-term memory

TL;DR: Keep your agent logic where it is. Drop Dapr underneath for durability, async A2A, discovery, auth, state, and Pub/Sub. That turns experimental agents into production-ready systems without framework lock-in.

Video Summary:


What “agentic” means (working definition)

  • Agents ≠ single LLM calls. A chat UI with one prompt/response is not an agent.
  • Augmented LLMs: LLM + “tools” (e.g., weather, CRM) + short-term memory; may or may not act autonomously.
  • Workflow agents: Deterministic orchestrations of multiple LLM/tool calls (control, validation, external events).
  • Reasoning/Planning agents: Task → reason → plan → tool actions; optionally review loops & confidence thresholds.
  • Agent meshes: Multiple domain agents collaborating. This brings scale, and with it new observability, security, and safety needs.

Why agentic systems are hard (production gaps in common frameworks)

Typical agent frameworks (AutoGen, LangGraph, LangChain, Semantic Kernel, etc.) focus on LLM abstraction, tool use, roles, and basic flows, but are often missing:

  • Security: No built-in mTLS between agents, weak/inline auth for MCP tools, no uniform ACL/authorization.
  • Resilience: Sparse retries, timeouts, circuit breakers, or backoff; limited error handling.
  • Durability: Lack of state persistence for long workflows; crash at step N ⇒ restart from step 0.
  • Asynchrony: A2A (agent-to-agent) interactions are typically synchronous HTTP; no durable, event-driven handoff.
  • Discovery: Agents are hard-coded endpoints; no service discovery/identity.
  • Ambient operation: Little first-class support to ingest and react to event streams (Kafka, queues, webhooks).
  • Data governance: No consistent guardrails for PII obfuscation, prompt caching, or per-hop data policy.

Where Dapr fits (foundational, vendor-neutral “plumbing”)

Dapr provides cloud-native building blocks you can drop under any agent framework to make it enterprise-ready:

  • Workflows: Wrap agent steps in durable workflows; resume from last successful activity after failures.
  • Pub/Sub: Turn meshes into event-driven systems (async A2A); durable handoff instead of brittle HTTP chaining.
  • Service Invocation: mTLS + identity + discovery for synchronous calls; now with SSE streaming (v1.16) for A2A/MCP.
  • State: Agent memory/checkpoints in your DB of choice (Redis, Postgres, DynamoDB, Cosmos DB, …) with etags, encryption.
  • Conversation API (beta): Built-in PII obfuscation and prompt caching across LLM calls.
  • Auth & Policy: Externalize OAuth2 flows, rate limits, and ACLs outside agent code; shrink attack surface.
  • Resiliency & Observability: Retries, timeouts, circuit breakers, OpenTelemetry traces/metrics by default.
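
To make the Service Invocation building block concrete, here is a minimal sketch of calling another agent by its Dapr app ID through the local sidecar's HTTP API, rather than through a hard-coded endpoint. The app ID `billing-agent`, the method name `tasks`, the payload, and the sidecar port 3500 are assumptions for illustration; the request is only built here, not sent.

```python
import json
from urllib.request import Request

DAPR_HTTP = "http://localhost:3500"  # assumed sidecar HTTP address


def build_invoke_request(app_id: str, method: str, payload: dict) -> Request:
    """Build (but do not send) a service-invocation call addressed by app ID."""
    return Request(
        # The caller names the target app; the sidecar resolves its location.
        url=f"{DAPR_HTTP}/v1.0/invoke/{app_id}/method/{method}",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical agent and task, for illustration only.
req = build_invoke_request("billing-agent", "tasks", {"task": "refund order 1001"})
print(req.get_method(), req.full_url)
```

With a sidecar running, passing `req` to `urllib.request.urlopen` would send the call. The agent code never sees the callee's host, port, or certificates, which is the point of pushing discovery and mTLS below the framework.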

Practical patterns

1) Make any agent workflow durable

  • Problem: CrewAI/LangChain pipelines restart from scratch on failure.
  • Dapr fix: Model each agent/tool call as a Dapr Workflow activity. On crash at step 99/100, resume at 99, not 0.
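
The resume-from-last-step behavior can be illustrated without any Dapr dependency. The sketch below is a conceptual stand-in, not the Dapr Workflow API: a plain dictionary plays the role of the durable history store, and `run_durable`, `make_step`, and the step names are hypothetical.

```python
from typing import Callable, Dict, List, Tuple

Step = Tuple[str, Callable[[], str]]


def run_durable(steps: List[Step], history: Dict[str, str]) -> List[str]:
    """Run steps in order, skipping any whose result is already in history."""
    results = []
    for name, action in steps:
        if name not in history:
            # Only steps that never completed are executed; the result is
            # recorded before moving on to the next step.
            history[name] = action()
        results.append(history[name])
    return results


# Demo: the "review" step crashes once, then succeeds on the second run.
calls = {"research": 0, "draft": 0, "review": 0}
fail_once = {"review": True}


def make_step(name: str) -> Step:
    def action() -> str:
        calls[name] += 1
        if fail_once.pop(name, False):
            raise RuntimeError(f"simulated crash in {name}")
        return f"{name}-done"
    return (name, action)


steps = [make_step(n) for n in ("research", "draft", "review")]
history: Dict[str, str] = {}  # stand-in for a durable state store

try:
    run_durable(steps, history)
except RuntimeError:
    pass  # the process "crashed" during review

final = run_durable(steps, history)  # "restart": completed steps are not re-run
print(final, calls)
```

After the simulated restart, `research` and `draft` have each run once and only `review` has run twice. A real workflow engine adds what this sketch leaves out: history that survives process death, deterministic replay, and timers.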

2) Durable agent-to-agent (A2A) collaboration

  • Problem: Pure HTTP A2A loses in-flight tasks when a callee restarts.
  • Dapr fix: Publish A2A tasks via Pub/Sub; consumers ack on completion; producer receives a durable completion event.
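
The "ack on completion" idea is what keeps a task alive across a callee restart. The following toy in-memory broker is a sketch of that delivery contract only; it is not Dapr Pub/Sub, and `InMemoryBroker`, the topic names, and the task fields are all hypothetical.

```python
from collections import deque
from typing import Callable, Deque, Dict, List


class InMemoryBroker:
    """Toy at-least-once broker: a message is removed only after its handler succeeds."""

    def __init__(self) -> None:
        self.queues: Dict[str, Deque[dict]] = {}

    def publish(self, topic: str, message: dict) -> None:
        self.queues.setdefault(topic, deque()).append(message)

    def deliver(self, topic: str, handler: Callable[[dict], None]) -> int:
        """Deliver pending messages in order; return how many were acknowledged."""
        queue = self.queues.setdefault(topic, deque())
        acked = 0
        while queue:
            message = queue[0]  # peek without removing
            try:
                handler(message)
            except Exception:
                break  # no ack: the message stays queued for redelivery
            queue.popleft()  # ack: the handler completed
            acked += 1
        return acked


broker = InMemoryBroker()
attempts: List[str] = []
callee_up = {"value": False}


def research_agent(task: dict) -> None:
    attempts.append(task["id"])
    if not callee_up["value"]:
        raise ConnectionError("callee restarting")
    # On success, report back to the originator on a completion topic.
    broker.publish("tasks.done", {"id": task["id"], "status": "completed"})


broker.publish("tasks.research", {"id": "t-1", "prompt": "summarize Q2 incidents"})
first = broker.deliver("tasks.research", research_agent)   # callee down
callee_up["value"] = True
second = broker.deliver("tasks.research", research_agent)  # redelivered

completions: List[dict] = []
broker.deliver("tasks.done", completions.append)
print(first, second, completions)
```

The task is attempted twice but completed once, and the producer learns about it through an event rather than an open HTTP connection. With plain synchronous HTTP, the first failed attempt would have been the end of the task unless the caller implemented its own retry and bookkeeping.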

3) Pluggable, secure memory/checkpoints

  • LangGraph: Use a Dapr checkpointer to persist graph state to any Dapr state store (adds encryption, metrics, etags).
  • General: Swap in enterprise DBs without changing app code; add client-side encryption with your own keys.
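
As a rough sketch of what an ETag-guarded checkpoint write looks like, the snippet below builds a request for the Dapr state HTTP API with first-write-wins concurrency. The store name `agent-memory`, the key, the value shape, the ETag value, and the sidecar port are assumptions for illustration; the request is constructed but not sent.

```python
import json
from typing import Any, Optional
from urllib.request import Request

DAPR_HTTP = "http://localhost:3500"  # assumed sidecar HTTP address


def build_save_state_request(store: str, key: str, value: Any,
                             etag: Optional[str] = None) -> Request:
    """Build (but do not send) a state-save request for a single key."""
    item = {"key": key, "value": value}
    if etag is not None:
        # With an ETag and first-write concurrency, a writer holding a stale
        # ETag is rejected instead of silently overwriting newer memory.
        item["etag"] = etag
        item["options"] = {"concurrency": "first-write"}
    return Request(
        url=f"{DAPR_HTTP}/v1.0/state/{store}",
        data=json.dumps([item]).encode("utf-8"),  # the API takes a list of items
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical store, key, and ETag, for illustration only.
req = build_save_state_request(
    "agent-memory", "session-42", {"messages": ["hi"]}, etag="3"
)
print(req.full_url)
```

Which database sits behind `agent-memory` is decided by the state store component configuration, so the same request works whether the backing store is Redis, Postgres, or something else.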

4) Ambient agents

  • Subscribe agents to topics (Kafka/SQS/Service Bus, etc.) via Dapr components. Agents react to events, not just chats.
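
A minimal sketch of an ambient agent, using only the standard library: the app advertises its subscriptions at `/dapr/subscribe` and handles events posted to the advertised route. The component name `agent-pubsub`, the topic `support-tickets`, the route, and the ticket fields are hypothetical, and the handler only acknowledges the event where a real agent would call an LLM or tool.

```python
import json
from http.server import BaseHTTPRequestHandler
from typing import Dict, List


def subscriptions() -> List[Dict[str, str]]:
    """Topics this agent wants; the sidecar reads this from GET /dapr/subscribe."""
    return [
        {"pubsubname": "agent-pubsub", "topic": "support-tickets",
         "route": "/events/tickets"},
    ]


def handle_ticket(event: dict) -> dict:
    """React to one delivered event; the payload sits under the "data" field."""
    ticket = event.get("data", {})
    # A real agent would reason about the ticket here; this sketch only logs it.
    print(f"agent received ticket {ticket.get('id')}")
    return {"status": "SUCCESS"}  # tells the sidecar the event was processed


class AgentHandler(BaseHTTPRequestHandler):
    def _send_json(self, payload) -> None:
        body = json.dumps(payload).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def do_GET(self) -> None:
        if self.path == "/dapr/subscribe":
            self._send_json(subscriptions())
        else:
            self.send_error(404)

    def do_POST(self) -> None:
        if self.path == "/events/tickets":
            length = int(self.headers.get("Content-Length", 0))
            event = json.loads(self.rfile.read(length) or b"{}")
            self._send_json(handle_ticket(event))
        else:
            self.send_error(404)
```

Serving `AgentHandler` with `http.server.HTTPServer` on the app port the sidecar is configured for would complete the loop. Whether the events come from Kafka, SQS, or Service Bus is decided by the `agent-pubsub` component definition, not by this code.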

5) Safer tool use (MCP and beyond)

  • Terminate TLS at Dapr, enforce OAuth2/ACL/rate limits before forwarding to tools; keep auth out of agent code.

Dapr Agents (the “batteries-included” path)

  • Durable by default (soon to be named “Durable Agents”): steps and memory are persisted automatically.
  • Cloud-native & decoupled: Pub/Sub and State are configured, not coded; switch Kafka↔Service Bus↔SQS, Redis↔Postgres↔Cosmos without code changes.
  • Vendor-neutral: No model/cloud lock-in; polyglot SDKs (Python, Java, .NET, etc.).
  • Mesh orchestration: Random/round-robin/LLM-guided routing; JSON audit trails; integrates with standard tracing stacks.

Security & governance checklist (with Dapr)

  • Identity & mTLS between agents/services.
  • OAuth2 / OIDC brokering to enterprise IdP for A2A and MCP tools.
  • PII controls via Conversation API; client-side encryption for state.
  • Policy: rate limits, ACLs, and circuit breakers at the Dapr layer.

Minimal “how-to” recipes

Durable CrewAI

  1. Wrap each CrewAI task as a Dapr Workflow activity.
  2. Use a Dapr state store for workflow history.
  3. On restart, the workflow engine replays the stored history for the same instance ID and resumes from the last completed activity.
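
Giving each pipeline run an explicit instance ID makes it easy to look the same run up again after a restart. The sketch below builds the start and status requests for the Dapr workflow HTTP API; the workflow name `crew_pipeline`, the instance ID, the input payload, and the sidecar port are assumptions, and older runtimes may expose the same routes under a `v1.0-beta1` prefix instead of `v1.0`. Nothing is sent.

```python
import json
from urllib.parse import quote, urlencode
from urllib.request import Request

DAPR_HTTP = "http://localhost:3500"  # assumed sidecar HTTP address


def build_start_workflow_request(workflow_name: str, instance_id: str,
                                 payload: dict) -> Request:
    """Build (but do not send) a request that starts a workflow with a fixed ID."""
    query = urlencode({"instanceID": instance_id})
    return Request(
        url=f"{DAPR_HTTP}/v1.0/workflows/dapr/{quote(workflow_name)}/start?{query}",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def build_status_request(instance_id: str) -> Request:
    """Build a request that fetches the status of an existing instance."""
    return Request(
        url=f"{DAPR_HTTP}/v1.0/workflows/dapr/{quote(instance_id)}",
        method="GET",
    )


# Hypothetical workflow name and instance ID, for illustration only.
start = build_start_workflow_request(
    "crew_pipeline", "crew-run-2025-07-02", {"topic": "quarterly report"}
)
status = build_status_request("crew-run-2025-07-02")
print(start.full_url)
print(status.full_url)
```

The workflow and its activities (one per CrewAI task) still have to be registered with the workflow runtime through a Dapr SDK; that part is framework-specific and not shown here.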

LangGraph with Dapr state

  1. Install a Dapr Checkpointer (library/adapter).
  2. Configure target state store (component.yaml) for your DB.
  3. Compile graph with the Dapr checkpointer; no app logic changes.

Async A2A

  1. Define a topic for task delegation; agents publish/subscribe with Dapr Pub/Sub.
  2. Return a completion event to the originator topic.
  3. Add Dapr resiliency policies (retries/backoff/circuit breakers).
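
Steps 1 and 2 can be sketched as a single publish call through the sidecar's HTTP API, with the task carrying the topic on which the originator expects the completion event. The component name `agent-pubsub`, both topic names, and the `task_id` and `reply_topic` fields are hypothetical conventions for this example, not part of any Dapr or A2A schema. The request is built but not sent.

```python
import json
from urllib.request import Request

DAPR_HTTP = "http://localhost:3500"  # assumed sidecar HTTP address


def build_publish_request(pubsub: str, topic: str, event: dict) -> Request:
    """Build (but do not send) a publish call for one event."""
    return Request(
        url=f"{DAPR_HTTP}/v1.0/publish/{pubsub}/{topic}",
        data=json.dumps(event).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical delegation message: the worker publishes its result to
# reply_topic, echoing task_id so the originator can correlate it.
task = {
    "task_id": "t-1001",
    "instruction": "draft a refund email for order 1001",
    "reply_topic": "tasks.completed",
}
req = build_publish_request("agent-pubsub", "tasks.delegated", task)
print(req.full_url)
```

Step 3 lives in configuration rather than code: retries, backoff, and circuit breakers are declared in a Dapr resiliency policy and applied by the sidecar, so the publish call above stays unchanged.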

Adoption reality and ROI

  • Most agent projects are early-stage; many demos run single-node.
  • Expect cancellations where ROI isn’t proven and where platforms lack the production plumbing above.
  • Using Dapr to supply durability, security, and ops can de-risk the path to production and clarify ROI.

When to reach for Dapr

  • You need durable, observable, secure agent workflows.
  • You want event-driven collaboration or ambient agents.
  • You need enterprise data stores, policy, and auth without rewriting frameworks.
  • You want a vendor-neutral foundation that works with LangGraph, LangChain, CrewAI, and Semantic Kernel, or on its own via Dapr Agents.
