Deep Agents Cloud Deployment Guide

This guide documents how to deploy Deep Agents to LangSmith cloud with a focus on workspace/file lifecycle - a critical topic for production deployments.

Overview

Deep Agents CAN be deployed to LangSmith cloud. From the official LangChain documentation:

"Deep agents applications can be deployed via LangSmith Deployment and monitored with LangSmith Observability."

Note: LangGraph Platform was renamed to LangSmith Deployment as of October 2025.


Notebook Workspace Analysis (FAQ)

This section addresses common questions about the notebook's approach to workspace management.

What the Notebook Does (Cell 19)

from pathlib import Path
from deepagents.backends import FilesystemBackend

workspace_path = Path("workspace").absolute()
filesystem_backend = FilesystemBackend(
    root_dir=str(workspace_path),
    virtual_mode=True  # Sandboxing only, NOT persistence
)

Q: Is persisting workspace files without cleanup atypical?

NO - for local development, this is the standard pattern:

  • FilesystemBackend with virtual_mode=True is the documented local dev approach
  • Files persist to disk for easy inspection and debugging
  • No automatic cleanup is intentional - lets developers see accumulated results

YES - this would be atypical (problematic) in production:

  • Files would be lost on container restart
  • Files don't migrate between containers during scaling
  • Production should use StateBackend or CompositeBackend

| Context | Typical Pattern | Notebook Approach |
|---|---|---|
| Local development | FilesystemBackend + real directory | ✅ Matches |
| Production cloud | StateBackend or CompositeBackend | ❌ Would need change |

Q: What determines "end of thread" in a notebook?

Thread ends when .invoke() returns - but workspace files are NOT deleted because they're written to real disk, not agent state.

| Event | In-Memory State | Workspace Files |
|---|---|---|
| .invoke() returns | Thread complete | PERSIST on disk |
| Run another cell | New thread starts | Previous files remain |
| Kernel restart | All memory lost | PERSIST on disk |
| Delete workspace/ | N/A | Finally cleaned |

Why files persist: FilesystemBackend writes to real disk (workspace/), not agent state. The "transient per-thread" rule applies to StateBackend, not real filesystem writes.
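
A quick way to verify this from the notebook, assuming the agent object from the earlier cells: after .invoke() returns, the files are still listable on disk.

from pathlib import Path

# Run in a fresh cell after any agent.invoke(...) call
result = agent.invoke({"messages": [("user", "Summarize my notes")]})
for f in sorted(Path("workspace").rglob("*")):
    print(f)  # still here - only deleting workspace/ removes these files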

Timeline Example

Cell 20: agent.invoke(...) → Thread 1 ends → Files written to workspace/
Cell 30: agent.invoke(...) → Thread 2 ends → More files added
Cell 50: agent.invoke(...) → Thread 3 ends → Files accumulate
Kernel restart:            → Memory cleared → Files still on disk
rm -rf workspace/:         → Finally cleaned

Notebook Pattern vs Cloud Pattern

NOTEBOOK (Local Dev):
  Agent → FilesystemBackend → Real disk (workspace/)
  Files persist indefinitely until manually deleted

CLOUD (Production):
  Agent → StateBackend → Agent state (ephemeral)
  OR
  Agent → CompositeBackend:
      /memories/* → StoreBackend (PostgreSQL - persistent)
      /* → StateBackend (ephemeral per-thread)

Single-User Research Pattern vs Multi-User Production

The notebook implements a single-user deep research pattern - this is intentional for local development but requires significant changes for production.

What Makes It "Single-User"?

| Characteristic | Notebook (Single-User) | Production (Multi-User) |
|---|---|---|
| User isolation | None - shared workspace/ | (user_id, "...") namespace per user |
| Store backend | InMemoryStore | PostgresStore with user scoping |
| File backend | FilesystemBackend → local disk | CompositeBackend → StateBackend + StoreBackend |
| Authentication | None | Required (JWT, OAuth, etc.) |
| Session management | Jupyter kernel lifecycle | Thread IDs + checkpointer |
| Cleanup policy | Manual rm -rf workspace/ | TTL policies per user namespace |

Why Single-User is Appropriate for the Notebook

  1. Learning focus: Students see all agent outputs in one place
  2. Debugging convenience: Inspect workspace/ files directly
  3. Iterative development: Accumulated context helps experimentation
  4. No infrastructure: No database, no auth, no deployment

Transitioning to Multi-User Production

# NOTEBOOK (Single-User)
store = InMemoryStore()
backend = FilesystemBackend(root_dir="workspace", virtual_mode=True)

# PRODUCTION (Multi-User)
store = PostgresStore.from_conn_string(DB_URI)
backend = CompositeBackend(
    routes={
        "/memories/": StoreBackend(store=store),  # User-scoped persistence
        "/": StateBackend(),  # Ephemeral per-thread
    }
)

# User isolation via namespaces
def get_user_namespace(user_id: str, category: str):
    return (user_id, category)  # e.g., ("user_123", "profile")

Key Architectural Differences

NOTEBOOK (Single-User Deep Research):
┌─────────────────────────────────────┐
│  Jupyter Notebook                   │
│  └─ Deep Agent                      │
│      ├─ InMemoryStore (shared)      │
│      └─ FilesystemBackend           │
│          └─ workspace/ (shared)     │
└─────────────────────────────────────┘

PRODUCTION (Multi-User):
┌─────────────────────────────────────┐
│  API Server + Auth                  │
│  └─ Deep Agent (per request)        │
│      ├─ PostgresStore               │
│      │   └─ (user_id, category)     │
│      └─ CompositeBackend            │
│          ├─ /memories/* → Store     │
│          └─ /* → State (ephemeral)  │
└─────────────────────────────────────┘
         │
         ▼
    PostgreSQL (durable, user-isolated)

1. Workspace & File Lifecycle (KEY FINDING)

Files Are TRANSIENT by Default

From official LangChain documentation:

"Deep agents come with a local filesystem to offload memory. By default, this filesystem is stored in agent state and is transient to a single thread—files are lost when the conversation ends."

Source: Long-term memory - Deep Agents

Storage Lifecycle Summary

| Storage Location | Scope | Lifetime | Cleanup |
|---|---|---|---|
| Agent filesystem | Per-thread | Thread ends → LOST | Automatic on thread end |
| Container local FS | Per-container | Container restart → LOST | Container termination |
| StateBackend | Per-thread | Thread ends → LOST | Automatic |
| StoreBackend (PostgreSQL) | Cross-thread | Configurable TTL | TTL policy or manual |
| Redis | Ephemeral only | Auto-expires | Built-in TTL |
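
The lifecycle table suggests a simple pattern: pick the backend by environment. A minimal sketch, assuming a hypothetical ENV variable and the backend classes documented in this guide:

import os
from deepagents.backends import (
    CompositeBackend, FilesystemBackend, StateBackend, StoreBackend,
)

def make_backend(store=None):
    """Hypothetical helper: route by deployment environment."""
    if os.getenv("ENV") == "production":
        return CompositeBackend(
            routes={
                "/memories/": StoreBackend(store=store),  # durable
                "/": StateBackend(),                      # per-thread scratch
            }
        )
    # Local dev: real disk for easy inspection (the notebook pattern)
    return FilesystemBackend(root_dir="workspace", virtual_mode=True)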

When Are Workspace Files Cleaned Up?

  1. Thread ends → Files in agent state are garbage collected
  2. Container restarts → All local filesystem data lost
  3. Scaling events → Files don't migrate between containers
  4. TTL expiration → Configured via langgraph.json

Recommended Pattern: CompositeBackend Router

Deep Agent → Path Router
    /memories/*    → StoreBackend (persistent across threads)
    anything else  → StateBackend (ephemeral, single thread)

from deepagents.backends import CompositeBackend, StateBackend, StoreBackend

# `store` is a LangGraph store instance (e.g. PostgresStore - see section 6)
backend = CompositeBackend(
    routes={
        "/memories/": StoreBackend(store=store),  # Persistent
        "/": StateBackend(),  # Ephemeral working files
    }
)

2. Deployment Options

| Mode | Control Plane | Data Plane | Best For | Pricing |
|---|---|---|---|---|
| Cloud (SaaS) | LangChain | LangChain | Startups, rapid prototyping | $0.005/run |
| Hybrid (BYOC) | LangChain | Your cloud | Data residency requirements | Enterprise ($100k+/yr) |
| Self-Hosted | Your cloud | Your cloud | Air-gapped, maximum control | Enterprise ($100k+/yr) |

3. Infrastructure Architecture

From LangGraph Data Plane docs:

Container Infrastructure

  • Agent Server: Containerized instances (Docker/Kubernetes)
  • Stateless Design: No resources maintained in memory between requests
  • Auto-scaling: Production scales up to 10 containers automatically
  • Load Balancing: AWS ALB, GCP Load Balancer, or Nginx

Storage Architecture

| Component | Purpose | Persistence |
|---|---|---|
| PostgreSQL | Checkpoints, threads, memory store, runs | Durable |
| Redis | Communication, streaming, heartbeats | Ephemeral |
| Container FS | Working files during execution | Ephemeral |

Critical: "No user or run data is stored in Redis" - all durable data goes to PostgreSQL.

Container Lifecycle

  1. Container receives SIGINT → stops accepting new requests
  2. Grace period (default 180s) for in-progress runs
  3. Incomplete runs requeued for other instances
  4. Sweeper task runs every 2 minutes to detect stalled runs
  5. PostgreSQL MVCC ensures exactly-once semantics

4. Backend Selection (CRITICAL)

| Backend | Cloud-Safe | Notes |
|---|---|---|
| StateBackend | Yes | Ephemeral, per-thread - good for stateless APIs |
| StoreBackend | Yes | Persistent via LangGraph Store - recommended for production |
| FilesystemBackend | Conditional | Sandboxes writes, but container disk is still ephemeral |
| LocalShellBackend | Never | Arbitrary shell execution - severe security risk |
| CompositeBackend | Yes | Recommended - hybrid routing to appropriate backends |

FilesystemBackend Clarification

virtual_mode=True provides sandbox isolation (prevents writing outside root_dir) but does NOT provide durable persistence in the cloud:

  • Files survive the end of a thread (they are written to real disk), but they are lost whenever the container restarts or scales
  • Use it for security sandboxing, not durability

from deepagents.backends import FilesystemBackend

backend = FilesystemBackend(
    root_dir=str(workspace_path),
    virtual_mode=True  # REQUIRED for security, but container disk is still ephemeral!
)

5. TTL Configuration

Configure cleanup policies in langgraph.json (the store TTL below, 10080 minutes, is seven days):

{
  "checkpointer": {
    "ttl": {
      "strategy": "delete",
      "sweep_interval_minutes": 60
    }
  },
  "store": {
    "ttl": {
      "default_ttl_minutes": 10080
    }
  }
}

Source: How to add TTLs to your application


6. State Persistence with PostgreSQL

For production deployments, use PostgreSQL for both checkpoints and store:

from langgraph.checkpoint.postgres import PostgresSaver
from langgraph.store.postgres import PostgresStore

DB_URI = "postgresql://user:pass@host:5432/db?sslmode=require"

with PostgresStore.from_conn_string(DB_URI) as store, \
     PostgresSaver.from_conn_string(DB_URI) as checkpointer:
    store.setup()         # create tables on first run
    checkpointer.setup()  # create tables on first run
    graph = builder.compile(checkpointer=checkpointer, store=store)
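
Once a checkpointer is compiled in, each call must carry a thread_id so LangGraph knows which conversation to save and resume; a minimal sketch (the thread ID is illustrative):

# Each conversation gets a thread_id; reusing it resumes the same thread
config = {"configurable": {"thread_id": "session-42"}}
result = graph.invoke({"messages": [("user", "hi")]}, config=config)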

Why PostgreSQL?

  • Durability: Survives restarts and deployments
  • Scalability: Read replicas for high-traffic scenarios
  • Transactions: ACID guarantees for state consistency
  • Managed options: AWS RDS, GCP Cloud SQL, Supabase, Neon

7. Memory Namespace Strategy

# User-scoped namespaces (persistent via StoreBackend)
(user_id, "profile")      # Demographics, goals
(user_id, "preferences")  # Communication style
(user_id, "progress")     # Milestones, check-ins

# Global namespaces
("wellness", "knowledge") # Shared knowledge base

Benefits

  • User isolation: Each user's data is namespaced
  • Easy queries: Filter by namespace tuple prefix
  • Compliance: GDPR deletion targets specific user namespaces
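
On the compliance point, a hedged sketch of a per-user purge using the store's search and delete methods (the batch size is an assumption; the loop repeats until the user's namespaces are empty):

def delete_user_data(store, user_id: str) -> None:
    """Hypothetical GDPR helper: remove every item under a user's namespaces."""
    while True:
        items = store.search((user_id,), limit=100)
        if not items:
            break
        for item in items:
            store.delete(item.namespace, item.key)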

8. Security Checklist

Before deploying to production:

  • NEVER use LocalShellBackend in production
  • Use CompositeBackend to route persistent data to StoreBackend
  • Store API keys in environment variables (never in code)
  • Never commit .env files (only .env.example)
  • Enable SSL/TLS for database connections (sslmode=require)
  • Implement user isolation via namespace strategy
  • Configure human-in-the-loop for sensitive operations:
# Interrupt before sensitive tool executions
graph = builder.compile(
    checkpointer=checkpointer,
    interrupt_before=["sensitive_tool_node"]
)
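
When the interrupt fires, the thread pauses at a checkpoint; a human can inspect it and resume. A minimal sketch (node and thread names are illustrative):

# Run until the graph pauses before the sensitive node
config = {"configurable": {"thread_id": "thread-1"}}
graph.invoke({"messages": [("user", "Delete my account")]}, config)

# Inspect the pending state, then resume from the checkpoint
snapshot = graph.get_state(config)  # shows what the agent is about to do
graph.invoke(None, config)          # passing None resumes execution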

9. Deployment Configuration

Required langgraph.json

{
  "dependencies": ["."],
  "graphs": {
    "my_agent": "./src/agent.py:graph"
  },
  "env": ".env",
  "checkpointer": {
    "ttl": { "strategy": "delete", "sweep_interval_minutes": 60 }
  }
}

Deploy Command

langgraph deploy --config langgraph.json

Connect to Deployed Agent

from langgraph.pregel.remote import RemoteGraph

remote_graph = RemoteGraph(
    name="my_agent",
    url="https://your-deployment.langgraph.com",
    api_key="your-langsmith-api-key"
)

# Use the remote graph just like a local one
result = remote_graph.invoke({"messages": [...]})
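
Passing the same thread_id on each call continues one conversation, assuming the deployment's built-in PostgreSQL checkpointing described in section 3:

config = {"configurable": {"thread_id": "user-123-session-1"}}
first = remote_graph.invoke({"messages": [("user", "Hi, I'm Don")]}, config=config)
# Same thread_id → the deployment restores prior messages before this call
second = remote_graph.invoke({"messages": [("user", "What's my name?")]}, config=config)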

10. Production Architecture

┌─────────────────────────────────────────────────────┐
│              LangSmith Deployment                    │
│  ┌───────────────────────────────────────────────┐  │
│  │            Deep Agent Graph                    │  │
│  │  - Planning (TodoListMiddleware)               │  │
│  │  - Context (CompositeBackend)                  │  │
│  │     └─ /memories/* → StoreBackend (persistent) │  │
│  │     └─ /* → StateBackend (ephemeral)           │  │
│  │  - Subagents (SubAgentMiddleware)              │  │
│  │  - Memory (MemoryMiddleware)                   │  │
│  └───────────────────┬───────────────────────────┘  │
│                      │                               │
│   Container FS       │         Redis                 │
│   (ephemeral)        │         (ephemeral comms)     │
└──────────────────────┼───────────────────────────────┘
                       │
       ┌───────────────┼───────────────┐
       ▼               ▼               ▼
  PostgreSQL     LangSmith        Vector DB
  (DURABLE:      Observability    (Optional)
   checkpoints,
   store items,
   threads)

11. Scaling Considerations

| Component | Scaling Strategy |
|---|---|
| StateBackend | Horizontal scaling works out of the box (stateless) |
| StoreBackend | PostgreSQL with read replicas for high traffic |
| Subagents | Run in isolated contexts, scale independently |
| Container failures | Transparent - sweeper requeues interrupted runs |

Cost Optimization

  • Coordinator agent: Use capable model (Claude Sonnet, GPT-4o)
  • Subagents: Use cheaper models (openai:gpt-4o-mini ~10x cheaper)
  • Embeddings: Use text-embedding-3-small for semantic search
  • Context offloading: Use ephemeral files for working data, StoreBackend only for persistent

12. Implementation Roadmap

| Phase | Activities |
|---|---|
| 1. Local Dev | Set up env, test with langgraph dev, use InMemoryStore |
| 2. Database | Provision PostgreSQL, migrate to PostgresStore/PostgresSaver |
| 3. Deploy | Create langgraph.json with TTL config, deploy to LangSmith |
| 4. Harden | Enable tracing, configure HITL, set up monitoring |

13. Example: Deploying the Wellness Coach

Here's how to deploy the Wellness Coach agent from the notebook:

Step 1: Refactor for Production

# src/wellness_agent.py
from deepagents import DeepAgent, TodoListMiddleware, MemoryMiddleware
from deepagents.backends import CompositeBackend, StateBackend, StoreBackend
from langgraph.checkpoint.postgres import PostgresSaver
from langgraph.store.postgres import PostgresStore
import os
from contextlib import ExitStack

DB_URI = os.environ["DATABASE_URL"]

def create_wellness_agent():
    """Create a production-ready wellness coach agent."""

    # Keep the DB connections open for the life of the process: a plain
    # `with` block would close them as soon as this factory returned,
    # leaving the compiled graph holding dead connections.
    stack = ExitStack()
    store = stack.enter_context(PostgresStore.from_conn_string(DB_URI))
    checkpointer = stack.enter_context(PostgresSaver.from_conn_string(DB_URI))
    store.setup()         # create tables on first run
    checkpointer.setup()  # create tables on first run

    # Use CompositeBackend for hybrid persistence
    backend = CompositeBackend(
        routes={
            "/memories/": StoreBackend(store=store),  # Persistent
            "/": StateBackend(),  # Ephemeral working files
        }
    )

    agent = DeepAgent(
        model="anthropic:claude-sonnet-4-20250514",
        backend=backend,
        middlewares=[
            TodoListMiddleware(),
            MemoryMiddleware(store=store),
        ],
        checkpointer=checkpointer,
    )

    return agent.compile()

# Export for langgraph.json
graph = create_wellness_agent()
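
Before wiring up the config, a quick local smoke test helps; a hypothetical one (thread ID and prompt are illustrative):

if __name__ == "__main__":
    config = {"configurable": {"thread_id": "smoke-test-1"}}
    out = graph.invoke(
        {"messages": [("user", "Set a hydration goal for this week")]},
        config=config,
    )
    print(out["messages"][-1].content)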

Step 2: Create langgraph.json

{
  "dependencies": ["."],
  "graphs": {
    "wellness_coach": "./src/wellness_agent.py:graph"
  },
  "env": ".env",
  "checkpointer": {
    "ttl": { "strategy": "delete", "sweep_interval_minutes": 60 }
  },
  "store": {
    "ttl": { "default_ttl_minutes": 10080 }
  }
}

Step 3: Deploy

# Test locally first
langgraph dev

# Deploy to LangSmith
langgraph deploy

14. Monitoring and Observability

LangSmith provides built-in observability:

# Enable tracing (already enabled by default in LangSmith deployments)
import os
os.environ["LANGCHAIN_TRACING_V2"] = "true"
os.environ["LANGCHAIN_PROJECT"] = "wellness-coach-prod"

Key Metrics to Monitor

  • Latency: P50, P95, P99 response times
  • Token usage: Input/output tokens per request
  • Error rates: Failed invocations, timeouts
  • Memory usage: Store operations per session

Key Answers to Common Questions

Q: When are workspace files cleaned up in LangSmith Cloud?

A: Files are cleaned up at multiple points:

  1. Thread ends - Agent filesystem in state is garbage collected immediately
  2. Container restarts - All local filesystem data lost (scaling, deploys, crashes)
  3. TTL expiration - Configurable via langgraph.json for StoreBackend items

Q: What is the scope of workspace usage?

A:

  • Default scope: Per-thread (single conversation)
  • Container scope: Ephemeral, lost on container lifecycle events
  • For persistence: Route paths to StoreBackend via CompositeBackend

Q: How long does data in each storage layer live?

| Storage Type | Duration |
|---|---|
| Agent filesystem (default) | Until thread ends |
| Container local FS | Until container restart |
| StoreBackend items | Until TTL expires or manual deletion |
| PostgreSQL checkpoints | Configurable TTL |

Quick Reference

Minimum Viable Production Config

# 1. Use PostgreSQL
from langgraph.checkpoint.postgres import PostgresSaver
from langgraph.store.postgres import PostgresStore

# 2. Use CompositeBackend (not FilesystemBackend alone)
from deepagents.backends import CompositeBackend, StateBackend, StoreBackend

# 3. Enable SSL
DB_URI = "postgresql://...?sslmode=require"

# 4. Configure hybrid persistence
backend = CompositeBackend(
    routes={
        "/memories/": StoreBackend(store=store),
        "/": StateBackend(),
    }
)

# 5. Deploy
# langgraph deploy --config langgraph.json

Environment Variables for Production

# .env.example (commit this, not .env!)
DATABASE_URL=postgresql://user:pass@host:5432/db?sslmode=require
ANTHROPIC_API_KEY=your-key-here
OPENAI_API_KEY=your-key-here
LANGCHAIN_API_KEY=your-key-here
LANGCHAIN_TRACING_V2=true
LANGCHAIN_PROJECT=your-project-name
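
For local runs, load these from .env before creating the agent (LangSmith deployments inject them from the "env" entry in langgraph.json). A minimal sketch, assuming python-dotenv is installed:

from dotenv import load_dotenv

load_dotenv()  # reads .env from the working directory into os.environ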

Verification Steps

  1. Check official docs: https://docs.langchain.com/oss/python/deepagents/long-term-memory
  2. Test locally with langgraph dev - observe files lost between threads
  3. Deploy test agent - verify files don't persist across container restarts
  4. Configure TTL in langgraph.json - verify cleanup via LangSmith UI