Category: Blog
Date: Jan 14, 2026
Author: Bilgin Ibryam

The Tiniest Durable Agent

How to go from a 10-line demo to a reliable and secure ops-ready agent.

AI agents are easy to demo and still hard to ship to production. You can spin up a chatbot in minutes, but the moment you care about crashes, retries, state, identity, or observability, the code balloons. Suddenly, you’re wiring databases, message brokers, auth, tracing, and retry logic around what was supposed to be “just an agent.”

A production-ready durable agent with Dapr

In this post, I’ll show how to build a durable, secure, operationally ready AI agent using about 10 lines of Python and a few YAML files. The entire agent logic fits in those few lines, while everything else, such as state, retries, identity, workflows, and observability, is handled for you. No magic demo. No skipped code. A fully working example you can run from here.

Install Dapr 

To follow this example end-to-end, you’ll need: Python 3.11+, Docker, and an OpenAI API key (or another LLM provider).

First, install Dapr.

On macOS:

brew install dapr/tap/dapr-cli
dapr init

This sets up everything you need locally: the Dapr runtime, Redis, and Zipkin. For Windows and Linux, see how to install Dapr here.

Create a virtual environment

python3.11 -m venv .venv && source .venv/bin/activate
pip install "dapr-agents>=0.10.5"

At this point, you haven’t written any agent code yet, but you already have durability, security, and observability available.

Create the tiniest durable agent

Now for the agent itself. This is the complete implementation of a durable agent exposed over HTTP:

from dapr_agents import DurableAgent
from dapr_agents.workflow.runners import AgentRunner

runner = AgentRunner()
agent = DurableAgent(name="Assistant", system_prompt="You are a helpful assistant")

try:
	runner.serve(agent, host="0.0.0.0", port=8001)
finally:
	runner.shutdown(agent)

That’s it. No retry logic. No workflow code. No state handling. No security setup. And yet, this agent is durable and production-ready.

What this code actually gives you

That small block of Python code does far more than it appears to:

  • Exposes an HTTP endpoint to trigger the agent
  • Executes agent logic inside a durable workflow
  • Persists conversation history and workflow execution state automatically
  • Stores agent memory using ~30 supported state stores
  • Abstracts LLM usage through a Conversation API backed by 10+ LLM providers
  • Assigns the agent a workload identity via SPIFFE
  • Emits distributed traces, metrics, and logs by default

You didn’t write any of that code, but you still get the guarantees.

This is the core idea: Agent logic stays business focused. Operational guarantees live outside your code.

Configure your LLM provider

Next, we configure how the agent talks to an LLM. This configuration lives entirely outside the agent code. The agent does not import an SDK, manage API keys, or bind itself to a specific model. All of that is handled declaratively.

apiVersion: dapr.io/v1alpha1
kind: Component
metadata:
  name: llm-provider
spec:
  type: conversation.openai
  version: v1
  metadata:
    - name: key
      value: OPEN_AI_API_KEY
    - name: model
      value: gpt-4.1-2025-04-14

While I'm not doing it in this example, in production the API key would come from an external secret store such as Kubernetes Secrets or a cloud service, which takes just one more YAML file.
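As a sketch of what that extra YAML file could look like, here is a hypothetical Kubernetes secret store component plus the changed `spec` section of the LLM component that references it. The component name `kubernetes-secrets`, the secret name `openai-secret`, and the key `api-key` are assumptions for illustration, not values from the repository:

```yaml
# Hypothetical secret store component (sketch)
apiVersion: dapr.io/v1alpha1
kind: Component
metadata:
  name: kubernetes-secrets
spec:
  type: secretstores.kubernetes
  version: v1
---
# The LLM component would then reference the secret instead of a literal key
apiVersion: dapr.io/v1alpha1
kind: Component
metadata:
  name: llm-provider
spec:
  type: conversation.openai
  version: v1
  metadata:
    - name: key
      secretKeyRef:
        name: openai-secret  # assumed secret name
        key: api-key         # assumed key inside the secret
auth:
  secretStore: kubernetes-secrets
```

The agent code doesn't change at all; only the component YAML does.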

Configure state for conversation memory

Next, we configure where conversation memory lives:

apiVersion: dapr.io/v1alpha1
kind: Component
metadata:
  name: agent-statestore
spec:
  type: state.redis
  version: v1
  metadata:
    - name: redisHost
      value: localhost:6379
    - name: redisPassword
      value: ""
    - name: actorStateStore
      value: "false"

This store is used for conversation history and long-term memory.

Configure workflow execution state

Durability requires a separate execution store:

apiVersion: dapr.io/v1alpha1
kind: Component
metadata:
  name: agent-wfstatestore
spec:
  type: state.redis
  version: v1
  metadata:
    - name: redisHost
      value: localhost:6379
    - name: redisPassword
      value: ""
    - name: actorStateStore
      value: "true"

Both state stores use Redis, but they serve different purposes:

  • One stores memory
  • One stores workflow execution progress

This is what allows the agent to resume safely after failures.

Run the agent

I could run the Dapr sidecar and the agent in separate terminals, but Dapr lets me start both at once with a multi-app run file:

dapr run -f tiny-durable-agent.yaml

This launches:

  • A Dapr sidecar configured with the YAML components defined above
  • Our Python agent
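The post doesn't show the contents of `tiny-durable-agent.yaml`, so here is a sketch of what a Dapr multi-app run file for this setup could look like. The app ID, directory layout, and entrypoint filename are assumptions, not the repository's actual values:

```yaml
# Sketch of a multi-app run file (assumed layout; see the repo for the real one)
version: 1
apps:
  - appID: tiny-durable-agent
    appDirPath: .
    resourcesPath: ./components   # folder holding the component YAMLs above
    appPort: 8001
    command: ["python3", "app.py"]
```

One `dapr run -f` against a file like this starts the sidecar, loads the components, and launches the Python process together.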

Trigger the agent over HTTP

In a separate terminal:

curl -i -X POST http://localhost:8001/run \
  -H "Content-Type: application/json" \
  -d '{"task": "Write a haiku about programming."}'
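The same request can be built from Python with nothing but the standard library. This is a sketch that assumes the agent from this post is listening on `localhost:8001`; the send itself is left commented out so the snippet also works with no server running:

```python
import json
import urllib.request

# Build the same POST request the curl command above sends.
payload = json.dumps({"task": "Write a haiku about programming."}).encode("utf-8")
req = urllib.request.Request(
    "http://localhost:8001/run",
    data=payload,
    headers={"Content-Type": "application/json"},
    method="POST",
)
print(req.get_method(), req.get_full_url())

# With the agent running, send it with:
#   with urllib.request.urlopen(req) as resp:
#       print(resp.read().decode())
```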

Each request that prompts the agent starts a durable workflow execution behind the scenes. In this example, the workflow happens to be simple with a single interaction with the LLM. But in a real production agent, this execution could involve multiple turns between the LLM and tool calls, some of which may take a long time to complete or depend on external systems.

Because the agent runs inside a durable workflow:

  • Execution state is persisted automatically
  • A crash or restart does not lose progress
  • The agent resumes from where it left off instead of starting over
  • Any failing step is automatically retried
  • The user does not need to re-prompt the agent

This is the key difference between a demo agent and a reliable one.

Triggering the Agent Asynchronously

This agent is also one YAML file away from being triggered asynchronously. By adding a Pub/Sub component, the same agent automatically subscribes to a topic and can be invoked via messaging. To see how to enable and trigger this, check out the full source in the GitHub repository, which shows how the agent can be invoked through messaging as well as synchronous HTTP calls.
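As a sketch of that one extra YAML file, a Redis-backed Pub/Sub component could look like the following. The component name is an assumption; the actual name and topic wiring in the repository may differ:

```yaml
# Hypothetical Pub/Sub component (sketch); reuses the local Redis from dapr init
apiVersion: dapr.io/v1alpha1
kind: Component
metadata:
  name: agent-pubsub
spec:
  type: pubsub.redis
  version: v1
  metadata:
    - name: redisHost
      value: localhost:6379
    - name: redisPassword
      value: ""
```

Swapping `pubsub.redis` for another broker type later changes only this file, not the agent code.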

What this agent does (out of the box)

Despite its size, this agent:

  • Exposes an HTTP endpoint at localhost:8001/run
  • Uses a provider-agnostic LLM interface
  • Executes logic durably using workflows
  • Persists conversation history
  • Has built-in identity and mTLS
  • Has built-in retry logic
  • Emits traces, metrics, and logs automatically

None of these concerns leak into your agent code.

Examine workflow executions locally

Once you’ve triggered the agent, you can inspect exactly what happened during execution. Start the Diagrid Dashboard locally in a separate terminal:

docker run -p 8080:8080 \
  ghcr.io/diagridio/diagrid-dashboard:latest

Then open: http://localhost:8080

This dashboard shows all workflow executions created by your agent.

Diagrid Dashboard for local workflow executions

To see workflow executions in more detail, with inputs and outputs for every step, every LLM call, and every tool invocation, try Diagrid Catalyst here.

Diagrid Catalyst with detailed Workflow Visualizer

What we built vs. typical agent frameworks

In many agent frameworks:

  • Identity, access control, and mTLS are bolted on later by external gateways or a mesh
  • Durable execution is provided by a separate workflow engine
  • Messaging, recurring tasks, infrastructure decoupling, and configuration are additional responsibilities

This is the difference between a demo and something you can run in production.

| Capability | Typical Agent Frameworks | Dapr Agents |
| --- | --- | --- |
| Durable execution | Requires additional workflow engine | ✅ Included |
| Identity & mTLS | Requires additional mesh or gateway | ✅ Included |
| Messaging, jobs, etc. | Requires additional frameworks | ✅ Included |
| Infrastructure lock-in | High | ✅ Low |
| Vendor lock-in | High | ✅ CNCF |

This approach avoids both infrastructure lock-in and agent framework lock-in.

Try it yourself

The full, runnable example is here. Clone it, run it locally, and break it on purpose. Kill the process. Restart it. Trigger it again.

Want to go deeper or try this in your environment?
