Deep Agents Cloud Deployment Guide

This guide documents how to deploy Deep Agents to LangSmith cloud with a focus on workspace/file lifecycle - a critical topic for production deployments.

Overview

Deep Agents CAN be deployed to LangSmith cloud. From the official LangChain documentation:

"Deep agents applications can be deployed via LangSmith Deployment and monitored with LangSmith Observability."

Note: LangGraph Platform was renamed to LangSmith Deployment as of October 2025.


Notebook Workspace Analysis (FAQ)

This section addresses common questions about the notebook's approach to workspace management.

What the Notebook Does (Cell 19)

from pathlib import Path
from deepagents.backends import FilesystemBackend

workspace_path = Path("workspace").absolute()
filesystem_backend = FilesystemBackend(
    root_dir=str(workspace_path),
    virtual_mode=True  # Sandboxing only, NOT persistence
)

Q: Is persisting workspace files without cleanup atypical?

NO - for local development, this is the standard pattern:

  • FilesystemBackend with virtual_mode=True is the documented local dev approach
  • Files persist to disk for easy inspection and debugging
  • No automatic cleanup is intentional - lets developers see accumulated results

YES - this would be atypical (problematic) in production:

  • Files would be lost on container restart
  • Files don't migrate between containers during scaling
  • Production should use StateBackend or CompositeBackend

| Context | Typical Pattern | Notebook Approach |
|---|---|---|
| Local development | FilesystemBackend + real directory | ✅ Matches |
| Production cloud | StateBackend or CompositeBackend | ❌ Would need change |

Q: What determines "end of thread" in a notebook?

Thread ends when .invoke() returns - but workspace files are NOT deleted because they're written to real disk, not agent state.

| Event | In-Memory State | Workspace Files |
|---|---|---|
| .invoke() returns | Thread complete | PERSIST on disk |
| Run another cell | New thread starts | Previous files remain |
| Kernel restart | All memory lost | PERSIST on disk |
| Delete workspace/ | N/A | Finally cleaned |

Why files persist: FilesystemBackend writes to real disk (workspace/), not agent state. The "transient per-thread" rule applies to StateBackend, not real filesystem writes.
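
A quick way to verify this from the notebook, assuming the agent object from the earlier cells: after .invoke() returns, the files are still listable on disk.

from pathlib import Path

# Run in a fresh cell after any agent.invoke(...) call
result = agent.invoke({"messages": [("user", "Summarize my notes")]})
for f in sorted(Path("workspace").rglob("*")):
    print(f)  # still here - only deleting workspace/ removes these files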

Timeline Example

Cell 20: agent.invoke(...) → Thread 1 ends → Files written to workspace/
Cell 30: agent.invoke(...) → Thread 2 ends → More files added
Cell 50: agent.invoke(...) → Thread 3 ends → Files accumulate
Kernel restart:            → Memory cleared → Files still on disk
rm -rf workspace/:         → Finally cleaned

Notebook Pattern vs Cloud Pattern

NOTEBOOK (Local Dev):
  Agent → FilesystemBackend → Real disk (workspace/)
  Files persist indefinitely until manually deleted

CLOUD (Production):
  Agent → StateBackend → Agent state (ephemeral)
  OR
  Agent → CompositeBackend:
      /memories/* → StoreBackend (PostgreSQL - persistent)
      /* → StateBackend (ephemeral per-thread)

Single-User Research Pattern vs Multi-User Production

The notebook implements a single-user deep research pattern - this is intentional for local development but requires significant changes for production.

What Makes It "Single-User"?

| Characteristic | Notebook (Single-User) | Production (Multi-User) |
|---|---|---|
| User isolation | None - shared workspace/ | (user_id, "...") namespace per user |
| Store backend | InMemoryStore | PostgresStore with user scoping |
| File backend | FilesystemBackend → local disk | CompositeBackend → StateBackend + StoreBackend |
| Authentication | None | Required (JWT, OAuth, etc.) |
| Session management | Jupyter kernel lifecycle | Thread IDs + checkpointer |
| Cleanup policy | Manual rm -rf workspace/ | TTL policies per user namespace |

Why Single-User is Appropriate for the Notebook

  1. Learning focus: Students see all agent outputs in one place
  2. Debugging convenience: Inspect workspace/ files directly
  3. Iterative development: Accumulated context helps experimentation
  4. No infrastructure: No database, no auth, no deployment

Transitioning to Multi-User Production

# NOTEBOOK (Single-User)
store = InMemoryStore()
backend = FilesystemBackend(root_dir="workspace", virtual_mode=True)

# PRODUCTION (Multi-User)
store = PostgresStore.from_conn_string(DB_URI)
backend = CompositeBackend(
    routes={
        "/memories/": StoreBackend(store=store),  # User-scoped persistence
        "/": StateBackend(),  # Ephemeral per-thread
    }
)

# User isolation via namespaces
def get_user_namespace(user_id: str, category: str):
    return (user_id, category)  # e.g., ("user_123", "profile")

Key Architectural Differences

NOTEBOOK (Single-User Deep Research):
┌─────────────────────────────────────┐
│  Jupyter Notebook                   │
│  └─ Deep Agent                      │
│      ├─ InMemoryStore (shared)      │
│      └─ FilesystemBackend           │
│          └─ workspace/ (shared)     │
└─────────────────────────────────────┘

PRODUCTION (Multi-User):
┌─────────────────────────────────────┐
│  API Server + Auth                  │
│  └─ Deep Agent (per request)        │
│      ├─ PostgresStore               │
│      │   └─ (user_id, category)     │
│      └─ CompositeBackend            │
│          ├─ /memories/* → Store     │
│          └─ /* → State (ephemeral)  │
└─────────────────────────────────────┘
         │
         ▼
    PostgreSQL (durable, user-isolated)

1. Workspace & File Lifecycle (KEY FINDING)

Files Are TRANSIENT by Default

From official LangChain documentation:

"Deep agents come with a local filesystem to offload memory. By default, this filesystem is stored in agent state and is transient to a single thread—files are lost when the conversation ends."

Source: Long-term memory - Deep Agents

Storage Lifecycle Summary

| Storage Location | Scope | Lifetime | Cleanup |
|---|---|---|---|
| Agent filesystem | Per-thread | Thread ends → LOST | Automatic on thread end |
| Container local FS | Per-container | Container restart → LOST | Container termination |
| StateBackend | Per-thread | Thread ends → LOST | Automatic |
| StoreBackend (PostgreSQL) | Cross-thread | Configurable TTL | TTL policy or manual |
| Redis | Ephemeral only | Auto-expires | Built-in TTL |
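
The lifecycle table suggests a simple pattern: pick the backend by environment. A minimal sketch, assuming a hypothetical ENV variable and the backend classes documented in this guide:

import os
from deepagents.backends import (
    CompositeBackend, FilesystemBackend, StateBackend, StoreBackend,
)

def make_backend(store=None):
    """Hypothetical helper: route by deployment environment."""
    if os.getenv("ENV") == "production":
        return CompositeBackend(
            routes={
                "/memories/": StoreBackend(store=store),  # durable
                "/": StateBackend(),                      # per-thread scratch
            }
        )
    # Local dev: real disk for easy inspection (the notebook pattern)
    return FilesystemBackend(root_dir="workspace", virtual_mode=True)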

When Are Workspace Files Cleaned Up?

  1. Thread ends → Files in agent state are garbage collected
  2. Container restarts → All local filesystem data lost
  3. Scaling events → Files don't migrate between containers
  4. TTL expiration → Configured via langgraph.json

Recommended Pattern: CompositeBackend Router

Deep Agent → Path Router
    /memories/*    → StoreBackend (persistent across threads)
    anything else  → StateBackend (ephemeral, single thread)

from deepagents.backends import CompositeBackend, StateBackend, StoreBackend

# `store` is a LangGraph store instance (e.g. PostgresStore - see section 6)
backend = CompositeBackend(
    routes={
        "/memories/": StoreBackend(store=store),  # Persistent
        "/": StateBackend(),  # Ephemeral working files
    }
)

2. Deployment Options

| Mode | Control Plane | Data Plane | Best For | Pricing |
|---|---|---|---|---|
| Cloud (SaaS) | LangChain | LangChain | Startups, rapid prototyping | $0.005/run |
| Hybrid (BYOC) | LangChain | Your cloud | Data residency requirements | Enterprise ($100k+/yr) |
| Self-Hosted | Your cloud | Your cloud | Air-gapped, maximum control | Enterprise ($100k+/yr) |

3. Infrastructure Architecture

From LangGraph Data Plane docs:

Container Infrastructure

  • Agent Server: Containerized instances (Docker/Kubernetes)
  • Stateless Design: No resources maintained in memory between requests
  • Auto-scaling: Production scales up to 10 containers automatically
  • Load Balancing: AWS ALB, GCP Load Balancer, or Nginx

Storage Architecture

| Component | Purpose | Persistence |
|---|---|---|
| PostgreSQL | Checkpoints, threads, memory store, runs | Durable |
| Redis | Communication, streaming, heartbeats | Ephemeral |
| Container FS | Working files during execution | Ephemeral |

Critical: "No user or run data is stored in Redis" - all durable data goes to PostgreSQL.

Container Lifecycle

  1. Container receives SIGINT → stops accepting new requests
  2. Grace period (default 180s) for in-progress runs
  3. Incomplete runs requeued for other instances
  4. Sweeper task runs every 2 minutes to detect stalled runs
  5. PostgreSQL MVCC ensures exactly-once semantics

4. Backend Selection (CRITICAL)

| Backend | Cloud-Safe | Notes |
|---|---|---|
| StateBackend | Yes | Ephemeral, per-thread - good for stateless APIs |
| StoreBackend | Yes | Persistent via LangGraph Store - recommended for production |
| FilesystemBackend | Conditional | Sandboxes writes, but container disk is still ephemeral |
| LocalShellBackend | Never | Arbitrary shell execution - severe security risk |
| CompositeBackend | Yes | Recommended - hybrid routing to appropriate backends |

FilesystemBackend Clarification

virtual_mode=True provides sandbox isolation (prevents writing outside root_dir) but does NOT provide durable persistence in the cloud:

  • Files survive the end of a thread (they are written to real disk), but they are lost whenever the container restarts or scales
  • Use it for security sandboxing, not durability

from deepagents.backends import FilesystemBackend

backend = FilesystemBackend(
    root_dir=str(workspace_path),
    virtual_mode=True  # REQUIRED for security, but container disk is still ephemeral!
)

5. TTL Configuration

Configure cleanup policies in langgraph.json (the store TTL below, 10080 minutes, is seven days):

{
  "checkpointer": {
    "ttl": {
      "strategy": "delete",
      "sweep_interval_minutes": 60
    }
  },
  "store": {
    "ttl": {
      "default_ttl_minutes": 10080
    }
  }
}

Source: How to add TTLs to your application


6. State Persistence with PostgreSQL

For production deployments, use PostgreSQL for both checkpoints and store:

from langgraph.checkpoint.postgres import PostgresSaver
from langgraph.store.postgres import PostgresStore

DB_URI = "postgresql://user:pass@host:5432/db?sslmode=require"

with PostgresStore.from_conn_string(DB_URI) as store, \
     PostgresSaver.from_conn_string(DB_URI) as checkpointer:
    store.setup()         # create tables on first run
    checkpointer.setup()  # create tables on first run
    graph = builder.compile(checkpointer=checkpointer, store=store)
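
Once a checkpointer is compiled in, each call must carry a thread_id so LangGraph knows which conversation to save and resume; a minimal sketch (the thread ID is illustrative):

# Each conversation gets a thread_id; reusing it resumes the same thread
config = {"configurable": {"thread_id": "session-42"}}
result = graph.invoke({"messages": [("user", "hi")]}, config=config)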

Why PostgreSQL?

  • Durability: Survives restarts and deployments
  • Scalability: Read replicas for high-traffic scenarios
  • Transactions: ACID guarantees for state consistency
  • Managed options: AWS RDS, GCP Cloud SQL, Supabase, Neon

7. Memory Namespace Strategy

# User-scoped namespaces (persistent via StoreBackend)
(user_id, "profile")      # Demographics, goals
(user_id, "preferences")  # Communication style
(user_id, "progress")     # Milestones, check-ins

# Global namespaces
("wellness", "knowledge") # Shared knowledge base

Benefits

  • User isolation: Each user's data is namespaced
  • Easy queries: Filter by namespace tuple prefix
  • Compliance: GDPR deletion targets specific user namespaces
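
On the compliance point, a hedged sketch of a per-user purge using the store's search and delete methods (the batch size is an assumption; the loop repeats until the user's namespaces are empty):

def delete_user_data(store, user_id: str) -> None:
    """Hypothetical GDPR helper: remove every item under a user's namespaces."""
    while True:
        items = store.search((user_id,), limit=100)
        if not items:
            break
        for item in items:
            store.delete(item.namespace, item.key)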

8. Security Checklist

Before deploying to production:

  • NEVER use LocalShellBackend in production
  • Use CompositeBackend to route persistent data to StoreBackend
  • Store API keys in environment variables (never in code)
  • Never commit .env files (only .env.example)
  • Enable SSL/TLS for database connections (sslmode=require)
  • Implement user isolation via namespace strategy
  • Configure human-in-the-loop for sensitive operations:
# Interrupt before sensitive tool executions
graph = builder.compile(
    checkpointer=checkpointer,
    interrupt_before=["sensitive_tool_node"]
)
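
When the interrupt fires, the thread pauses at a checkpoint; a human can inspect it and resume. A minimal sketch (node and thread names are illustrative):

# Run until the graph pauses before the sensitive node
config = {"configurable": {"thread_id": "thread-1"}}
graph.invoke({"messages": [("user", "Delete my account")]}, config)

# Inspect the pending state, then resume from the checkpoint
snapshot = graph.get_state(config)  # shows what the agent is about to do
graph.invoke(None, config)          # passing None resumes execution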

9. Deployment Configuration

Required langgraph.json

{
  "dependencies": ["."],
  "graphs": {
    "my_agent": "./src/agent.py:graph"
  },
  "env": ".env",
  "checkpointer": {
    "ttl": { "strategy": "delete", "sweep_interval_minutes": 60 }
  }
}

Deploy Command

langgraph deploy --config langgraph.json

Connect to Deployed Agent

from langgraph.pregel.remote import RemoteGraph

remote_graph = RemoteGraph(
    name="my_agent",
    url="https://your-deployment.langgraph.com",
    api_key="your-langsmith-api-key"
)

# Use the remote graph just like a local one
result = remote_graph.invoke({"messages": [...]})
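
Passing the same thread_id on each call continues one conversation, assuming the deployment's built-in PostgreSQL checkpointing described in section 3:

config = {"configurable": {"thread_id": "user-123-session-1"}}
first = remote_graph.invoke({"messages": [("user", "Hi, I'm Don")]}, config=config)
# Same thread_id → the deployment restores prior messages before this call
second = remote_graph.invoke({"messages": [("user", "What's my name?")]}, config=config)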

10. Production Architecture

┌─────────────────────────────────────────────────────┐
│              LangSmith Deployment                    │
│  ┌───────────────────────────────────────────────┐  │
│  │            Deep Agent Graph                    │  │
│  │  - Planning (TodoListMiddleware)               │  │
│  │  - Context (CompositeBackend)                  │  │
│  │     └─ /memories/* → StoreBackend (persistent) │  │
│  │     └─ /* → StateBackend (ephemeral)           │  │
│  │  - Subagents (SubAgentMiddleware)              │  │
│  │  - Memory (MemoryMiddleware)                   │  │
│  └───────────────────┬───────────────────────────┘  │
│                      │                               │
│   Container FS       │         Redis                 │
│   (ephemeral)        │         (ephemeral comms)     │
└──────────────────────┼───────────────────────────────┘
                       │
       ┌───────────────┼───────────────┐
       ▼               ▼               ▼
  PostgreSQL     LangSmith        Vector DB
  (DURABLE:      Observability    (Optional)
   checkpoints,
   store items,
   threads)

11. Scaling Considerations

| Component | Scaling Strategy |
|---|---|
| StateBackend | Horizontal scaling works out of the box (stateless) |
| StoreBackend | PostgreSQL with read replicas for high traffic |
| Subagents | Run in isolated contexts, scale independently |
| Container failures | Transparent - sweeper requeues interrupted runs |

Cost Optimization

  • Coordinator agent: Use capable model (Claude Sonnet, GPT-4o)
  • Subagents: Use cheaper models (openai:gpt-4o-mini ~10x cheaper)
  • Embeddings: Use text-embedding-3-small for semantic search
  • Context offloading: Use ephemeral files for working data, StoreBackend only for persistent

12. Implementation Roadmap

| Phase | Activities |
|---|---|
| 1. Local Dev | Set up env, test with langgraph dev, use InMemoryStore |
| 2. Database | Provision PostgreSQL, migrate to PostgresStore/PostgresSaver |
| 3. Deploy | Create langgraph.json with TTL config, deploy to LangSmith |
| 4. Harden | Enable tracing, configure HITL, set up monitoring |

13. Example: Deploying the Wellness Coach

Here's how to deploy the Wellness Coach agent from the notebook:

Step 1: Refactor for Production

# src/wellness_agent.py
from deepagents import DeepAgent, TodoListMiddleware, MemoryMiddleware
from deepagents.backends import CompositeBackend, StateBackend, StoreBackend
from langgraph.checkpoint.postgres import PostgresSaver
from langgraph.store.postgres import PostgresStore
import os
from contextlib import ExitStack

DB_URI = os.environ["DATABASE_URL"]

def create_wellness_agent():
    """Create a production-ready wellness coach agent."""

    # Keep the DB connections open for the life of the process: a plain
    # `with` block would close them as soon as this factory returned,
    # leaving the compiled graph holding dead connections.
    stack = ExitStack()
    store = stack.enter_context(PostgresStore.from_conn_string(DB_URI))
    checkpointer = stack.enter_context(PostgresSaver.from_conn_string(DB_URI))
    store.setup()         # create tables on first run
    checkpointer.setup()  # create tables on first run

    # Use CompositeBackend for hybrid persistence
    backend = CompositeBackend(
        routes={
            "/memories/": StoreBackend(store=store),  # Persistent
            "/": StateBackend(),  # Ephemeral working files
        }
    )

    agent = DeepAgent(
        model="anthropic:claude-sonnet-4-20250514",
        backend=backend,
        middlewares=[
            TodoListMiddleware(),
            MemoryMiddleware(store=store),
        ],
        checkpointer=checkpointer,
    )

    return agent.compile()

# Export for langgraph.json
graph = create_wellness_agent()
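
Before wiring up the config, a quick local smoke test helps; a hypothetical one (thread ID and prompt are illustrative):

if __name__ == "__main__":
    config = {"configurable": {"thread_id": "smoke-test-1"}}
    out = graph.invoke(
        {"messages": [("user", "Set a hydration goal for this week")]},
        config=config,
    )
    print(out["messages"][-1].content)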

Step 2: Create langgraph.json

{
  "dependencies": ["."],
  "graphs": {
    "wellness_coach": "./src/wellness_agent.py:graph"
  },
  "env": ".env",
  "checkpointer": {
    "ttl": { "strategy": "delete", "sweep_interval_minutes": 60 }
  },
  "store": {
    "ttl": { "default_ttl_minutes": 10080 }
  }
}

Step 3: Deploy

# Test locally first
langgraph dev

# Deploy to LangSmith
langgraph deploy

14. Monitoring and Observability

LangSmith provides built-in observability:

# Enable tracing (already enabled by default in LangSmith deployments)
import os
os.environ["LANGCHAIN_TRACING_V2"] = "true"
os.environ["LANGCHAIN_PROJECT"] = "wellness-coach-prod"

Key Metrics to Monitor

  • Latency: P50, P95, P99 response times
  • Token usage: Input/output tokens per request
  • Error rates: Failed invocations, timeouts
  • Memory usage: Store operations per session

Key Answers to Common Questions

Q: When are workspace files cleaned up in LangSmith Cloud?

A: Files are cleaned up at multiple points:

  1. Thread ends - Agent filesystem in state is garbage collected immediately
  2. Container restarts - All local filesystem data lost (scaling, deploys, crashes)
  3. TTL expiration - Configurable via langgraph.json for StoreBackend items

Q: What is the scope of workspace usage?

A:

  • Default scope: Per-thread (single conversation)
  • Container scope: Ephemeral, lost on container lifecycle events
  • For persistence: Route paths to StoreBackend via CompositeBackend

Q: How long does data in each storage layer live?

| Storage Type | Duration |
|---|---|
| Agent filesystem (default) | Until thread ends |
| Container local FS | Until container restart |
| StoreBackend items | Until TTL expires or manual deletion |
| PostgreSQL checkpoints | Configurable TTL |

Quick Reference

Minimum Viable Production Config

# 1. Use PostgreSQL
from langgraph.checkpoint.postgres import PostgresSaver
from langgraph.store.postgres import PostgresStore

# 2. Use CompositeBackend (not FilesystemBackend alone)
from deepagents.backends import CompositeBackend, StateBackend, StoreBackend

# 3. Enable SSL
DB_URI = "postgresql://...?sslmode=require"

# 4. Configure hybrid persistence
backend = CompositeBackend(
    routes={
        "/memories/": StoreBackend(store=store),
        "/": StateBackend(),
    }
)

# 5. Deploy
# langgraph deploy --config langgraph.json

Environment Variables for Production

# .env.example (commit this, not .env!)
DATABASE_URL=postgresql://user:pass@host:5432/db?sslmode=require
ANTHROPIC_API_KEY=your-key-here
OPENAI_API_KEY=your-key-here
LANGCHAIN_API_KEY=your-key-here
LANGCHAIN_TRACING_V2=true
LANGCHAIN_PROJECT=your-project-name
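
For local runs, load these from .env before creating the agent (LangSmith deployments inject them from the "env" entry in langgraph.json). A minimal sketch, assuming python-dotenv is installed:

from dotenv import load_dotenv

load_dotenv()  # reads .env from the working directory into os.environ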

Verification Steps

  1. Check official docs: https://docs.langchain.com/oss/python/deepagents/long-term-memory
  2. Test locally with langgraph dev - observe files lost between threads
  3. Deploy test agent - verify files don't persist across container restarts
  4. Configure TTL in langgraph.json - verify cleanup via LangSmith UI