History Notebooks Phase 10: Agent Integration

The Problem

History Notebooks capture narrative alongside computation — but right now, only humans write. The notebook is a passive document: the user opens the editor, types markdown, inserts dataset references via toolbox or drag-and-drop, and saves. The AI agent framework (PR #21434, merged Dec 2025) gives Galaxy a multi-agent system with structured output, tool calling, and multi-turn conversation — but it has no surface for writing or editing documents tied to histories. These two systems sit side-by-side with no connection.

The original vision (from THE_PROBLEM_AND_GOAL.md):

History Notebooks are living documents that grow alongside your analysis. As analysis progresses — whether driven by human clicks, agent actions, or conversation between the two — the narrative builds up iteratively.

This creates a "Claude Code for data analysis" experience. The agent doesn't just run tools — it builds up a polished document with rich visualizations, updates figures when parameters change, and refines the presentation in response to human feedback.

Phase 10 bridges the gap: the agent can read the notebook, propose changes, and write new revisions — all while the human stays in control.

What We're Building

A split-view experience where the user sees the notebook editor on one side and a chat panel on the other. The agent can:

Read the current notebook content + history datasets
Propose markdown changes (with diff preview)
Write new revisions when the user approves (tracked as edit_source="agent")
Discuss the analysis in multi-turn conversation, with full context of what's in the history and notebook

The user can accept, reject, or modify agent proposals before they become revisions. Every agent edit creates a new revision — never destructively overwrites.
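
The approve-then-append rule above can be sketched in a few lines. This is a minimal in-memory illustration, not Galaxy's manager code: the `Notebook` and `Revision` classes and the `resolve_proposal` helper are hypothetical stand-ins, and only the `edit_source` values and the `save_new_revision` name come from this document.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(frozen=True)
class Revision:
    content: str
    edit_source: str  # "user" | "agent" | "restore"


@dataclass
class Notebook:
    # Hypothetical stand-in for a notebook and its revision history.
    revisions: List[Revision] = field(default_factory=list)

    def save_new_revision(self, content: str, edit_source: str = "user") -> Revision:
        # Append only: earlier revisions are never mutated or removed.
        revision = Revision(content=content, edit_source=edit_source)
        self.revisions.append(revision)
        return revision


def resolve_proposal(notebook: Notebook, proposed: str, approved: bool) -> Optional[Revision]:
    """Turn an agent proposal into a revision only when the user approved it."""
    if not approved:
        return None  # A rejected proposal leaves the notebook untouched.
    return notebook.save_new_revision(proposed, edit_source="agent")
```

A user-modified proposal fits the same shape: the edited text is passed as `proposed` once the user accepts it.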

Infrastructure Already Built

History Notebooks (Phases 1-8, complete)

Backend:

HistoryNotebook + HistoryNotebookRevision SQLAlchemy models
Revision model with edit_source field: "user" | "agent" | "restore" (16-char column)
HistoryNotebookManager with full CRUD + save_new_revision(trans, notebook, payload, edit_source="user")
The edit_source parameter is already accepted by the manager, but nothing calls it with "agent" yet
Content stored with raw HID references (hid=N), resolved at read time via resolve_history_markdown()
Two-step read pipeline: resolve_history_markdown() (hid→internal_id) then ready_galaxy_markdown_for_export() (internal_id→encoded_id)
9 API endpoints under /api/histories/{history_id}/notebooks
prepare-for-page endpoint for Page export with full HID resolution
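
The two-step read pipeline described in the list above can be illustrated with a small sketch. It is not the code behind resolve_history_markdown() or ready_galaxy_markdown_for_export(): the function names `resolve_hids` and `encode_ids`, the `history_dataset_id` argument name, the dataset-only scope, and the choice to leave unknown HIDs untouched are all assumptions made for illustration.

```python
import re
from typing import Callable, Dict

HID_ARG = re.compile(r"\bhid=(\d+)")
INTERNAL_ID_ARG = re.compile(r"\bhistory_dataset_id=(\d+)")


def resolve_hids(markdown: str, hid_to_id: Dict[int, int]) -> str:
    """Step 1 (sketch): rewrite stored hid=N references to internal IDs."""

    def replace(match: "re.Match[str]") -> str:
        hid = int(match.group(1))
        if hid not in hid_to_id:
            return match.group(0)  # Assumption: unknown HIDs are left as-is.
        return f"history_dataset_id={hid_to_id[hid]}"

    return HID_ARG.sub(replace, markdown)


def encode_ids(markdown: str, encode: Callable[[int], str]) -> str:
    """Step 2 (sketch): rewrite internal IDs to encoded IDs for the client."""
    return INTERNAL_ID_ARG.sub(
        lambda m: f"history_dataset_id={encode(int(m.group(1)))}", markdown
    )
```

Keeping hid=N in storage and resolving at read time means the stored text stays meaningful to a human or agent looking at the history panel, while clients only ever see encoded IDs.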

Frontend:

Pinia store (useHistoryNotebookStore) with dirty tracking, save/discard, revision management
HistoryNotebookView.vue orchestrator (list mode, editor mode, displayOnly mode, revision panel)
HistoryNotebookEditor.vue wrapping MarkdownEditor in history_notebook mode
NotebookRevisionList.vue with edit_source labels: "Manual" / "AI" / "Restored"
Drag-and-drop from history panel (dataset + collection, inserts hid=N directives)
Window Manager integration (rendered Markdown.vue in WinBox iframe)
Routes at /histories/:historyId/notebooks[/:notebookId]

Key notebook API endpoints:

Method	Path	Purpose
GET	`/api/histories/{id}/notebooks`	List notebooks
POST	`/api/histories/{id}/notebooks`	Create notebook
GET	`/api/histories/{id}/notebooks/{nbId}`	Get notebook (content resolved+encoded)
PUT	`/api/histories/{id}/notebooks/{nbId}`	Update (creates revision)
GET	`/api/histories/{id}/notebooks/{nbId}/revisions`	List revisions
GET	`/api/histories/{id}/notebooks/{nbId}/revisions/{revId}`	Get revision content
POST	`/api/histories/{id}/notebooks/{nbId}/revisions/{revId}/revert`	Revert to old revision

Agent Framework (PR #21434, merged)

Architecture:

BaseGalaxyAgent (ABC) built on pydantic-ai with GalaxyAgentDependencies dataclass injection
Dependencies carry: trans, user, config, optional job_manager, dataset_manager, workflow_manager, tool_cache, toolbox, get_agent (for inter-agent calls)
AgentRegistry maps agent_type strings → agent classes; singleton in lib/galaxy/agents/__init__.py
AgentService in lib/galaxy/managers/agents.py creates dependencies and executes agents
5 registered agents: router, error_analysis, custom_tool, orchestrator, tool_recommendation

Key base class capabilities:

class BaseGalaxyAgent(ABC):
    # Outline only: signatures are abbreviated and bodies are elided.
    agent_type: str                          # Class attribute
    deps: GalaxyAgentDependencies            # Injected
    agent: Agent[GalaxyAgentDependencies, T] # pydantic-ai agent

    async def process(self, query, context) -> AgentResponse: ...
    def _create_agent(self) -> Agent: ...                     # Abstract: define pydantic-ai agent + tools
    def get_system_prompt(self) -> str: ...                   # Abstract: system prompt (typically from prompts/*.md)
    async def _call_agent_from_tool(self, *args, **kwargs): ...  # Agent-to-agent delegation
    def _get_model(self): ...                                 # Returns the model; multi-provider: OpenAI, Anthropic, Google, local
    def _get_agent_config(self, key, default): ...            # 4-level config cascade

Tool calling pattern (pydantic-ai):

def _create_agent(self):
    agent = Agent(self._get_model(), deps_type=GalaxyAgentDependencies, ...)

    @agent.tool
    async def search_tools(ctx: RunContext[GalaxyAgentDependencies], query: str) -> str:
        toolbox = ctx.deps.toolbox
        # ... search and return results
        return formatted_results

    return agent

Response schema:

class AgentResponse(BaseModel):
    content: str                         # Main response (markdown)
    confidence: ConfidenceLevel          # low|medium|high
    agent_type: str                      # Which agent answered
    suggestions: list[ActionSuggestion]  # Actionable next steps
    metadata: dict[str, Any]             # Token usage, model, etc.
    reasoning: Optional[str]             # Agent's reasoning chain

class ActionSuggestion(BaseModel):
    action_type: ActionType              # tool_run|save_tool|contact_support|view_external|documentation
    description: str
    parameters: dict[str, Any]
    confidence: ConfidenceLevel
    priority: int                        # 1=high, 2=medium, 3=low

Chat API & conversation persistence:

POST /api/chat — main endpoint, supports exchange_id for multi-turn, agent_type selection
ChatExchange model stores conversations per user (with optional job_id)
ChatExchangeMessage stores individual messages as JSON (query + response + agent metadata)
GET /api/chat/history — list user's conversations
Frontend ChatGXY.vue — full-page chat with conversation sidebar, agent selector, action cards, feedback

Config system:

inference_services:
  default:
    model: "llama-4-scout"
    api_base_url: "http://localhost:4000/v1/"
  notebook_assistant:          # Per-agent config
    model: "anthropic:claude-sonnet-4-5"
    temperature: 0.3
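
How a per-agent section might fall back to the default section can be sketched as follows. The document mentions a 4-level cascade in `_get_agent_config` without listing the levels, so this sketch shows only two assumed levels (per-agent section, then `default`) plus a caller-supplied fallback; the `get_agent_config` function is hypothetical.

```python
from typing import Any, Dict, Optional

# Mirrors the YAML above as a plain dict.
INFERENCE_SERVICES: Dict[str, Dict[str, Any]] = {
    "default": {"model": "llama-4-scout", "api_base_url": "http://localhost:4000/v1/"},
    "notebook_assistant": {"model": "anthropic:claude-sonnet-4-5", "temperature": 0.3},
}


def get_agent_config(
    services: Dict[str, Dict[str, Any]],
    agent_type: str,
    key: str,
    default: Optional[Any] = None,
) -> Any:
    """Look up a key in the agent's own section first, then in "default"."""
    for section in (agent_type, "default"):
        value = services.get(section, {}).get(key)
        if value is not None:
            return value
    return default  # Nothing configured: use the caller's fallback.
```

Under these assumptions, notebook_assistant gets its own model and temperature but inherits api_base_url from the default section.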

Integration Seams Already Present

edit_source="agent" — the revision model accepts it, the manager accepts the parameter, the frontend already displays "AI" badge in revision list. Just not called yet.
GalaxyAgentDependencies — extensible dataclass. Can add history_notebook_manager: Optional[HistoryNotebookManager] alongside existing dataset_manager, workflow_manager.
ActionType enum — currently has tool_run, save_tool, contact_support, view_external, documentation. Can add notebook-specific action types.
Context dict — AgentService.route_and_execute() passes arbitrary context: dict to agents. Can include history_id, notebook_id, notebook_content.
Chat history — ChatManager.get_chat_history() returns conversation for multi-turn. Can include notebook context in system prompt.
Markdown infrastructure — both agents and notebooks produce/consume Galaxy-flavored markdown. Agents already render markdown responses. Notebooks store and resolve Galaxy markdown directives.
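
Three of these seams (the dependencies dataclass, the ActionType enum, and the context dict) can be sketched together. The classes below are simplified stand-ins rather than the real Galaxy definitions: the enum member names, the `notebook_propose_edit` value, and the `build_notebook_context` helper are hypothetical, while the existing action values and the context keys are taken from this document.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Any, Dict, Optional


class ActionType(str, Enum):
    # Existing values listed in this document.
    TOOL_RUN = "tool_run"
    SAVE_TOOL = "save_tool"
    CONTACT_SUPPORT = "contact_support"
    VIEW_EXTERNAL = "view_external"
    DOCUMENTATION = "documentation"
    # Hypothetical notebook-specific addition.
    NOTEBOOK_PROPOSE_EDIT = "notebook_propose_edit"


@dataclass
class GalaxyAgentDependencies:
    # Simplified stand-in; the real dataclass carries more fields.
    trans: Any
    user: Any
    config: Any
    dataset_manager: Optional[Any] = None
    workflow_manager: Optional[Any] = None
    history_notebook_manager: Optional[Any] = None  # Proposed new optional seam.


def build_notebook_context(history_id: str, notebook_id: str, notebook_content: str) -> Dict[str, str]:
    """Context dict an endpoint could pass through to the agent."""
    return {
        "history_id": history_id,
        "notebook_id": notebook_id,
        "notebook_content": notebook_content,
    }
```

Because the new dependency is optional with a None default, existing agents that never touch notebooks would construct their dependencies exactly as before.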

Key Design Questions

1. Chat Panel Architecture

Option A: Embedded chat panel alongside notebook editor. A split-view HistoryNotebookSplit.vue component, with the editor on the left and chat on the right. The chat panel is notebook-specific (scoped to one notebook + its history). This is what the original Phase 10 plan sketched.

Option B: Extend ChatGXY with notebook context. Add a "Notebook Assistant" mode to the existing ChatGXY page. When activated from a notebook, ChatGXY receives notebook_id + history_id in context and the agent operates on that notebook.

Option C: Window Manager split. The user opens the notebook in the main view and ChatGXY in a WinBox window (or vice versa). Communication happens via shared state (Pinia store) or postMessage.

User Assessment:

I think we want to do A & B together in the MVP (if the answer to the question below looks like what I suspect it does).
When drafting this into a concrete plan, please be sure to include Option C as a step after the MVP. We've built this on top of the window manager and we should continue in that direction.

User Questions:

Please research how ChatGXY works and how it is structured now. Could we not put a ChatGXY window side-by-side with the HistoryMarkdown? I suspect we wouldn't want to rewrite that, but please create a list of pros and cons. If any of the cons are features lacking in ChatGXY, could we add those, or leverage the existing Vue components and APIs in some way (composition/specialization)?

2. Agent Scope

User reviewed this section and agrees.

What should the notebook agent be able to do?

Read current notebook content
Read history datasets (names, types, metadata, peek at content)
Read history dataset collection structure
Propose full notebook rewrites or targeted section edits
Insert Galaxy markdown directives with correct hid=N references
Suggest tool runs that would add new datasets to the history
Explain existing datasets and their relationships
Generate summaries, methods sections, figure legends
Insert visualizations into a History Markdown document

What should it NOT do (at least initially)?

Directly run Galaxy tools (that's the orchestrator agent's job)
Modify datasets or history metadata
Auto-save without user approval

User Assessment: Agreed!

3. Proposal/Apply Workflow

How should agent-proposed changes flow?

Option A: Full replacement proposals. Agent proposes complete new notebook content. User sees diff, clicks "Apply" or "Reject". Apply calls save_new_revision(edit_source="agent").

Option B: Section-level patches. Agent proposes changes to specific sections (identified by markdown headers or line ranges). Multiple proposals can be pending. User applies individually.

Option C: Streaming insertion. Agent streams markdown directly into the editor at cursor position (like GitHub Copilot inline suggestions). User accepts with Tab/Enter or dismisses.

User Assessment: My hunch is all of these are useful. Maybe we start with Option A and then move to Option B or C as user experience optimizations once we have it working.
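
For the Option A starting point, the diff preview itself is straightforward to produce. The sketch below uses Python's standard difflib on the server side; the `preview_diff` name and the file labels are illustrative, and whether the diff is computed on the server or in the client is an open choice that this document does not settle.

```python
import difflib


def preview_diff(current: str, proposed: str) -> str:
    """Return a unified diff between the current and proposed notebook text."""
    diff = difflib.unified_diff(
        current.splitlines(),
        proposed.splitlines(),
        fromfile="notebook (current)",
        tofile="notebook (proposed)",
        lineterm="",  # Lines were split without newlines, so add none back.
    )
    return "\n".join(diff)
```

An empty result means the proposal changes nothing, which is a cheap signal for skipping the Apply/Reject prompt entirely.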

User Questions:

Are there any Vue libraries that would help us do this, whether for the full diff, the section-level patches, or streaming with Tab/Enter?
Do we think current models would be able to decide which of the first two interactions is more appropriate for a given question? How would we prompt them to decide between the two? What would the API look like for this?

4. Conversation Scoping

Should notebook chat conversations be:

Option A: Stored as ChatExchange (existing model). Reuse the chat infrastructure. Add notebook_id FK to ChatExchange. Conversations persist across sessions.

Option B: Stored as notebook revisions with chat metadata. Each agent interaction creates a revision. Chat history reconstructed from revision sequence. Simpler model but loses non-notebook-modifying conversation turns.

Option C: Separate model. New NotebookChatMessage table linked to notebook. Keeps notebook-specific conversations separate from general ChatGXY conversations.

User Assessment: Probably A works best - I don't see a lot of value in these other two when the A path will be the most heavily used in practice.
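
What Option A implies for the data model can be sketched with an in-memory SQLite schema. This is illustrative only: the table and column names are hypothetical simplifications rather than Galaxy's actual schema, and the real change would be made through the SQLAlchemy models and a migration.

```python
import sqlite3
from typing import List

SCHEMA = """
CREATE TABLE history_notebook (id INTEGER PRIMARY KEY, title TEXT);
CREATE TABLE chat_exchange (
    id INTEGER PRIMARY KEY,
    user_id INTEGER NOT NULL,
    job_id INTEGER,                                       -- optional, as today
    notebook_id INTEGER REFERENCES history_notebook(id)   -- hypothetical, nullable
);
"""


def exchanges_for_notebook(conn: sqlite3.Connection, user_id: int, notebook_id: int) -> List[int]:
    """List a user's conversation IDs scoped to one notebook."""
    rows = conn.execute(
        "SELECT id FROM chat_exchange WHERE user_id = ? AND notebook_id = ? ORDER BY id",
        (user_id, notebook_id),
    ).fetchall()
    return [row[0] for row in rows]
```

Keeping notebook_id nullable means general ChatGXY conversations are unaffected, and a notebook's chat panel can reload its conversations with a single filtered query.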

User Question: What do the foreign keys on ChatExchange look like? Do they link to other artifacts? Would an association object between ChatExchange and HistoryNotebooks buy us anything useful?

5. Agent Registration

Option A: New dedicated agent. Register notebook_assistant in the agent registry. Has its own system prompt, tools, and structured output type. Router can delegate to it when notebook context is present.

Option B: Extend existing router. Add notebook tools to the router agent. No new agent type: the router handles notebook queries when notebook context is in the request.

Option C: Orchestrator-based. The orchestrator coordinates between a "notebook writer" agent and other specialists (error_analysis for failed datasets, tool_recommendation for next steps). More powerful but more complex.

User Assessment: When we scope out a plan for this, let's do Option A in the MVP and sketch an outline of Option C in subsequent steps of the implementation plan. After Option C is implemented, also scope out implementing an agent for visualizing history contents. That agent should know about common visualizations and how to build and embed them in History Markdown documents.
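
A rough sketch of what Option A amounts to at the registry level follows. The `AgentRegistry` here is a simplified stand-in for the real one (only the idea of mapping agent_type strings to classes comes from this document); `NotebookAssistantAgent`, `get_agent_class`, and `select_agent_type` are hypothetical, as is the rule of choosing the agent from the presence of notebook_id in the context.

```python
from typing import Dict, List, Optional


class AgentRegistry:
    """Stand-in registry: maps agent_type strings to agent classes."""

    def __init__(self) -> None:
        self._agents: Dict[str, type] = {}

    def register(self, agent_type: str, agent_class: type) -> None:
        self._agents[agent_type] = agent_class

    def get_agent_class(self, agent_type: str) -> type:
        return self._agents[agent_type]

    def list_agents(self) -> List[str]:
        return sorted(self._agents)


class QueryRouterAgent:
    agent_type = "router"


class NotebookAssistantAgent:  # Hypothetical new agent for Option A.
    agent_type = "notebook_assistant"


def select_agent_type(context: Dict[str, str], requested: Optional[str] = None) -> str:
    """Honor an explicit agent choice; otherwise pick based on notebook context."""
    if requested:
        return requested
    return "notebook_assistant" if context.get("notebook_id") else "router"
```

The point of the sketch is that Option A is additive: one more registration and one more branch in agent selection, with the router's own prompt and tools left alone.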

User Question: What are the ramifications of Option A vs Option B? What are the practical design, runtime, and implementation implications?

6. History Context Injection

How does the agent learn about the history's contents?

Option A: Tool-based discovery. Agent has tools like list_history_datasets(history_id), get_dataset_peek(hid), get_dataset_metadata(hid). Agent calls tools as needed. Scales well because it only fetches what's relevant.

Option B: Full context in system prompt. Inject a summary of all history items (HID, name, type, state, size) into the system prompt. Agent has immediate context but burns tokens on large histories.

Option C: Hybrid. System prompt includes a compact summary (HID + name + type for all items). Tools available for deep-dive (peek at content, metadata, collection structure).
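
The tool-based discovery idea can be sketched with plain functions over an in-memory stand-in for a history. This is only an illustration: the sample data and signatures are invented, and in the agent framework such functions would presumably be registered with @agent.tool and read from the injected dependencies rather than take the history as an argument.

```python
from typing import Dict, List

# Hypothetical in-memory stand-in for a history; real tools would query Galaxy.
HISTORY: List[Dict] = [
    {"hid": 1, "name": "reads.fastq", "type": "fastqsanger", "state": "ok",
     "content": "@read1\nACGT\n+\nIIII\n"},
    {"hid": 2, "name": "counts.tsv", "type": "tabular", "state": "ok",
     "content": "gene\tcount\nBRCA1\t12\nTP53\t7\n"},
]


def list_history_datasets(history: List[Dict]) -> str:
    """Cheap overview the agent can request first: one line per item."""
    return "\n".join(
        f"{item['hid']}: {item['name']} ({item['type']}, {item['state']})"
        for item in history
    )


def get_dataset_peek(history: List[Dict], hid: int, max_lines: int = 2) -> str:
    """Deeper look at one dataset, fetched only when the agent asks for it."""
    for item in history:
        if item["hid"] == hid:
            return "\n".join(item["content"].splitlines()[:max_lines])
    return f"No dataset with hid={hid}"
```

The overview costs one short line per item, while content is only pulled for the HIDs the agent decides it needs.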

User Assessment:

Yeah - I don't see a world where Option B or C seem viable at this time. We will need something else.

User Question:

I think we're waiting on an MCP branch to do this right, but please review PR 21706 (summarized at "/Users/jxc755/projects/repositories/galaxy-brain/vault/research/PR 21706 - Data Analysis Agent Integration.md") and provide me an answer for what this should look like now.

Relevant Files

Notebook System

File	What's There
`lib/galaxy/model/__init__.py` (lines 11374-11461)	HistoryNotebook + HistoryNotebookRevision models, edit_source field
`lib/galaxy/managers/history_notebooks.py`	Manager: CRUD, save_new_revision(edit_source=), resolve, prepare-for-page
`lib/galaxy/webapps/galaxy/api/history_notebooks.py`	9 API endpoints
`lib/galaxy/schema/schema.py` (lines 4146-4219)	Notebook request/response schemas
`lib/galaxy/managers/markdown_util.py` (lines 1364-1391)	resolve_history_markdown(), HID resolution
`lib/galaxy/managers/markdown_parse.py`	VALID_ARGUMENTS with hid= support
`client/src/stores/historyNotebookStore.ts`	Pinia store (dirty tracking, revisions)
`client/src/components/HistoryNotebook/HistoryNotebookView.vue`	Main orchestrator component
`client/src/components/HistoryNotebook/HistoryNotebookEditor.vue`	Editor wrapper
`client/src/components/HistoryNotebook/NotebookRevisionList.vue`	Revision sidebar (shows edit_source labels)
`client/src/api/historyNotebooks.ts`	TypeScript API client

Agent Framework

File	What's There
`lib/galaxy/agents/__init__.py`	Agent registry, 5 registered agents
`lib/galaxy/agents/base.py`	BaseGalaxyAgent ABC, GalaxyAgentDependencies, tool calling, retry logic
`lib/galaxy/agents/registry.py`	AgentRegistry (register, get_agent, list_agents)
`lib/galaxy/managers/agents.py`	AgentService (create_dependencies, execute_agent, route_and_execute)
`lib/galaxy/agents/router.py`	QueryRouterAgent (handoff via output functions)
`lib/galaxy/agents/tools.py`	ToolRecommendationAgent (example of @agent.tool pattern with Galaxy toolbox)
`lib/galaxy/agents/error_analysis.py`	ErrorAnalysisAgent (structured output example)
`lib/galaxy/schema/agents.py`	AgentResponse, ActionSuggestion, ActionType, ConfidenceLevel
`lib/galaxy/webapps/galaxy/api/chat.py`	Chat API (multi-turn, exchange_id, agent routing)
`lib/galaxy/managers/chat.py`	ChatManager (ChatExchange persistence, message storage, history)
`client/src/components/ChatGXY.vue`	Full-page chat UI (conversation, agent selector, actions)
`client/src/composables/agentActions.ts`	Frontend action dispatch (tool_run, save_tool, etc.)
`lib/galaxy/config/schemas/config_schema.yml`	inference_services per-agent config

Shared Infrastructure

File	What's There
`client/src/components/Markdown/MarkdownEditor.vue`	Shared editor (modes: report, page, history_notebook)
`client/src/components/Markdown/Markdown.vue`	Rendered markdown view (used by Pages, notebook displayOnly)
`client/src/components/Markdown/Editor/TextEditor.vue`	CodeMirror textarea with drag-drop support

Success Criteria

User can chat with an agent that understands their history contents and current notebook
Agent can propose notebook edits that the user previews before applying
Applied agent edits create revisions with edit_source="agent", visible in revision sidebar
Multi-turn conversation persists across page reloads
Agent uses hid=N references (not encoded IDs) when writing notebook content
Agent can reference specific datasets by HID in its explanations
The experience feels like collaborative editing, not a separate chat that happens to modify a file
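
The hid=N criterion lends itself to an automated check on agent output before a proposal is shown to the user. The sketch below is one hypothetical way to do that: the `check_agent_markdown` function is not part of Galaxy, and the encoded-ID argument names it looks for (`history_dataset_id`, `history_dataset_collection_id`) are assumptions about what an agent might emit by mistake.

```python
import re
from typing import List, Set

HID_REF = re.compile(r"\bhid=(\d+)")
ENCODED_ID_REF = re.compile(r"\b(history_dataset_id|history_dataset_collection_id)=")


def check_agent_markdown(markdown: str, known_hids: Set[int]) -> List[str]:
    """Return a list of problems; an empty list means the proposal passes."""
    problems: List[str] = []
    for match in ENCODED_ID_REF.finditer(markdown):
        problems.append(f"uses {match.group(1)}= instead of hid=")
    for match in HID_REF.finditer(markdown):
        hid = int(match.group(1))
        if hid not in known_hids:
            problems.append(f"hid={hid} is not in the history")
    return problems
```

A non-empty result could be fed back to the agent as a retry hint instead of surfacing a broken proposal to the user.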

jmchilton/HISTORY_MARKDOWN_AGENT_INTEGRATION_PROBLEM_AND_GOAL.md
