History Notebooks capture narrative alongside computation — but right now, only humans write. The notebook is a passive document: the user opens the editor, types markdown, inserts dataset references via toolbox or drag-and-drop, and saves. The AI agent framework (PR #21434, merged Dec 2025) gives Galaxy a multi-agent system with structured output, tool calling, and multi-turn conversation — but it has no surface for writing or editing documents tied to histories. These two systems sit side-by-side with no connection.
The original vision (from THE_PROBLEM_AND_GOAL.md):
History Notebooks are living documents that grow alongside your analysis. As analysis progresses — whether driven by human clicks, agent actions, or conversation between the two — the narrative builds up iteratively.
This creates a "Claude Code for data analysis" experience. The agent doesn't just run tools — it builds up a polished document with rich visualizations, updates figures when parameters change, and refines the presentation in response to human feedback.
Phase 10 bridges the gap: the agent can read the notebook, propose changes, and write new revisions — all while the human stays in control.
A split-view experience where the user sees the notebook editor on one side and a chat panel on the other. The agent can:
- Read the current notebook content + history datasets
- Propose markdown changes (with diff preview)
- Write new revisions when the user approves (tracked as edit_source="agent")
- Discuss the analysis in multi-turn conversation, with full context of what's in the history and notebook
The user can accept, reject, or modify agent proposals before they become revisions. Every agent edit creates a new revision — never destructively overwrites.
Backend:
- HistoryNotebook + HistoryNotebookRevision SQLAlchemy models
- Revision model with edit_source field: "user" | "agent" | "restore" (16-char column)
- HistoryNotebookManager with full CRUD + save_new_revision(trans, notebook, payload, edit_source="user")
- The edit_source parameter is already accepted by the manager but never called with "agent" yet
- Content stored with raw HID references (hid=N), resolved at read time via resolve_history_markdown()
- Two-step read pipeline: resolve_history_markdown() (hid→internal_id) then ready_galaxy_markdown_for_export() (internal_id→encoded_id)
- 9 API endpoints under /api/histories/{history_id}/notebooks
- prepare-for-page endpoint for Page export with full HID resolution
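The two-step read pipeline can be illustrated with a minimal sketch. This is not Galaxy's implementation: `resolve_hids` and `encode_ids` are hypothetical stand-ins for resolve_history_markdown() and ready_galaxy_markdown_for_export(), the regex-based matching is a simplification, and only dataset references are handled (collections are ignored here).

```python
import re
from typing import Callable

# Matches a hid=N argument, e.g. history_dataset_display(hid=3).
HID_ARG = re.compile(r"\bhid=(\d+)")
# Matches an already-resolved internal ID argument.
INTERNAL_ID_ARG = re.compile(r"\bhistory_dataset_id=(\d+)")


def resolve_hids(markdown: str, hid_to_internal: dict[int, int]) -> str:
    """Step 1 (stand-in): rewrite hid=N to history_dataset_id=<internal id>."""

    def swap(match: re.Match) -> str:
        hid = int(match.group(1))
        if hid not in hid_to_internal:
            return match.group(0)  # leave unknown HIDs untouched
        return f"history_dataset_id={hid_to_internal[hid]}"

    return HID_ARG.sub(swap, markdown)


def encode_ids(markdown: str, encode: Callable[[int], str]) -> str:
    """Step 2 (stand-in): rewrite internal IDs to encoded IDs for the client."""
    return INTERNAL_ID_ARG.sub(
        lambda m: f"history_dataset_id={encode(int(m.group(1)))}", markdown
    )


# Stored content keeps the history-relative reference.
stored = "history_dataset_display(hid=3)"
resolved = resolve_hids(stored, {3: 42})
exported = encode_ids(resolved, lambda i: f"enc{i:04x}")  # toy encoder, an assumption
```

Keeping hid=N in the stored content means the agent never has to see or produce encoded IDs; only the read path performs the translation.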
Frontend:
- Pinia store (useHistoryNotebookStore) with dirty tracking, save/discard, revision management
- HistoryNotebookView.vue orchestrator (list mode, editor mode, displayOnly mode, revision panel)
- HistoryNotebookEditor.vue wrapping MarkdownEditor in history_notebook mode
- NotebookRevisionList.vue with edit_source labels: "Manual" / "AI" / "Restored"
- Drag-and-drop from history panel (dataset + collection, inserts hid=N directives)
- Window Manager integration (rendered Markdown.vue in WinBox iframe)
- Routes at /histories/:historyId/notebooks[/:notebookId]
Key notebook API endpoints:
| Method | Path | Purpose |
|---|---|---|
| GET | /api/histories/{id}/notebooks | List notebooks |
| POST | /api/histories/{id}/notebooks | Create notebook |
| GET | /api/histories/{id}/notebooks/{nbId} | Get notebook (content resolved+encoded) |
| PUT | /api/histories/{id}/notebooks/{nbId} | Update (creates revision) |
| GET | /api/histories/{id}/notebooks/{nbId}/revisions | List revisions |
| GET | /api/histories/{id}/notebooks/{nbId}/revisions/{revId} | Get revision content |
| POST | /api/histories/{id}/notebooks/{nbId}/revisions/{revId}/revert | Revert to old revision |
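As a rough illustration of how a client might address these endpoints, the sketch below builds (but does not send) requests using only the standard library. The helper `notebook_request`, the host name, the IDs, and the "content" payload field are assumptions made for the example; only the paths and methods come from the table above.

```python
import json
from typing import Optional
from urllib.request import Request


def notebook_request(base_url: str, method: str, history_id: str,
                     suffix: str = "", body: Optional[dict] = None) -> Request:
    """Build a request against the notebook endpoints without sending it."""
    url = f"{base_url.rstrip('/')}/api/histories/{history_id}/notebooks{suffix}"
    data = json.dumps(body).encode("utf-8") if body is not None else None
    headers = {"Content-Type": "application/json"} if data else {}
    return Request(url, data=data, method=method, headers=headers)


BASE = "https://galaxy.example.org"  # placeholder host

# List notebooks, save an edit (which creates a revision), then revert.
list_req = notebook_request(BASE, "GET", "hist123")
save_req = notebook_request(BASE, "PUT", "hist123", "/nb456",
                            body={"content": "# Results\n"})  # assumed field name
revert_req = notebook_request(BASE, "POST", "hist123",
                              "/nb456/revisions/rev789/revert")
```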
Architecture:
- BaseGalaxyAgent (ABC) built on pydantic-ai with GalaxyAgentDependencies dataclass injection
- Dependencies carry: trans, user, config, optional job_manager, dataset_manager, workflow_manager, tool_cache, toolbox, get_agent (for inter-agent calls)
- AgentRegistry maps agent_type strings → agent classes; singleton in lib/galaxy/agents/__init__.py
- AgentService in lib/galaxy/managers/agents.py creates dependencies and executes agents
- 5 registered agents: router, error_analysis, custom_tool, orchestrator, tool_recommendation
Key base class capabilities:
class BaseGalaxyAgent(ABC):
    agent_type: str                             # Class attribute
    deps: GalaxyAgentDependencies               # Injected
    agent: Agent[GalaxyAgentDependencies, T]    # pydantic-ai agent

    async def process(query, context) -> AgentResponse
    def _create_agent() -> Agent                # Abstract: define pydantic-ai agent + tools
    def get_system_prompt() -> str              # Abstract: system prompt (typically from prompts/*.md)
    async def _call_agent_from_tool(...)        # Agent-to-agent delegation
    def _get_model() -> model                   # Multi-provider: OpenAI, Anthropic, Google, local
    def _get_agent_config(key, default)         # 4-level config cascade
Tool calling pattern (pydantic-ai):
def _create_agent(self):
    agent = Agent(self._get_model(), deps_type=GalaxyAgentDependencies, ...)

    @agent.tool
    async def search_tools(ctx: RunContext[GalaxyAgentDependencies], query: str) -> str:
        toolbox = ctx.deps.toolbox
        # ... search and return results
        return formatted_results

    return agent
Response schema:
class AgentResponse(BaseModel):
    content: str                          # Main response (markdown)
    confidence: ConfidenceLevel           # low|medium|high
    agent_type: str                       # Which agent answered
    suggestions: list[ActionSuggestion]   # Actionable next steps
    metadata: dict[str, Any]              # Token usage, model, etc.
    reasoning: Optional[str]              # Agent's reasoning chain

class ActionSuggestion(BaseModel):
    action_type: ActionType               # tool_run|save_tool|contact_support|view_external|documentation
    description: str
    parameters: dict[str, Any]
    confidence: ConfidenceLevel
    priority: int                         # 1=high, 2=medium, 3=low
Chat API & conversation persistence:
- POST /api/chat: main endpoint, supports exchange_id for multi-turn and agent_type selection
- ChatExchange model stores conversations per user (with optional job_id)
- ChatExchangeMessage stores individual messages as JSON (query + response + agent metadata)
- GET /api/chat/history: list user's conversations
- Frontend ChatGXY.vue: full-page chat with conversation sidebar, agent selector, action cards, feedback
Config system:
inference_services:
  default:
    model: "llama-4-scout"
    api_base_url: "http://localhost:4000/v1/"
  notebook_assistant:    # Per-agent config
    model: "anthropic:claude-sonnet-4-5"
    temperature: 0.3
- edit_source="agent": the revision model accepts it, the manager accepts the parameter, and the frontend already displays an "AI" badge in the revision list. It is just not called yet.
- GalaxyAgentDependencies: extensible dataclass. Can add history_notebook_manager: Optional[HistoryNotebookManager] alongside the existing dataset_manager and workflow_manager.
- ActionType enum: currently has tool_run, save_tool, contact_support, view_external, documentation. Can add notebook-specific action types.
- Context dict: AgentService.route_and_execute() passes an arbitrary context: dict to agents. Can include history_id, notebook_id, notebook_content.
- Chat history: ChatManager.get_chat_history() returns the conversation for multi-turn. Can include notebook context in the system prompt.
- Markdown infrastructure: both agents and notebooks produce/consume Galaxy-flavored markdown. Agents already render markdown responses. Notebooks store and resolve Galaxy markdown directives.
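The extension points listed above could be sketched roughly as follows. This is a simplified stand-in, not the real schema: the dataclass carries only a subset of the actual fields and types everything loosely, the `NOTEBOOK_EDIT` action type is a hypothetical name, and `notebook_context` is an illustrative helper rather than existing code.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Any, Optional


class ActionType(str, Enum):
    # Existing values listed in this document.
    TOOL_RUN = "tool_run"
    SAVE_TOOL = "save_tool"
    CONTACT_SUPPORT = "contact_support"
    VIEW_EXTERNAL = "view_external"
    DOCUMENTATION = "documentation"
    # Hypothetical notebook-specific addition (the name is an assumption).
    NOTEBOOK_EDIT = "notebook_edit"


@dataclass
class GalaxyAgentDependencies:
    # Subset of the real fields, typed as Any for the sketch.
    trans: Any
    user: Any
    config: Any
    dataset_manager: Optional[Any] = None
    workflow_manager: Optional[Any] = None
    history_notebook_manager: Optional[Any] = None  # proposed new field


def notebook_context(history_id: str, notebook_id: str, notebook_content: str) -> dict[str, Any]:
    """Context dict of the kind route_and_execute() could pass to the agent."""
    return {
        "history_id": history_id,
        "notebook_id": notebook_id,
        "notebook_content": notebook_content,
    }


deps = GalaxyAgentDependencies(trans=None, user=None, config=None)
context = notebook_context("hist123", "nb456", "# Results\n")
```

Because the new field defaults to None, existing agents that never touch notebooks would be unaffected by the addition.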
Option A: Embedded chat panel alongside notebook editor
A split-view HistoryNotebookSplit.vue component — editor on left, chat on right. The chat panel is notebook-specific (scoped to one notebook + its history). This is what the original Phase 10 plan sketched.
Option B: Extend ChatGXY with notebook context
Add a "Notebook Assistant" mode to the existing ChatGXY page. When activated from a notebook, ChatGXY receives notebook_id + history_id in context and the agent operates on that notebook.
Option C: Window Manager split
User opens notebook in main view and ChatGXY in a WinBox window (or vice versa). Communication via shared state (Pinia store) or postMessage.
User Assessment:
- I think we want to do A & B together in the MVP (if the answer to the question below looks like what I suspect).
- When drafting this into a concrete plan, please be sure to include Option C as a step after the MVP. We've built this on top of the window manager and we should continue in that direction.
User Questions:
- Please research how ChatGXY works and is structured now. Could we not put a ChatGXY window side-by-side with the HistoryMarkdown? I suspect we wouldn't want to rewrite that, but could you create a list of pros and cons? If any of the cons are features lacking in ChatGXY, could we add those or leverage the existing Vue components and APIs in some way (composition/specialization)?
User reviewed this section and agrees.
What should the notebook agent be able to do?
- Read current notebook content
- Read history datasets (names, types, metadata, peek at content)
- Read history dataset collection structure
- Propose full notebook rewrites or targeted section edits
- Insert Galaxy markdown directives with correct hid=N references
- Suggest tool runs that would add new datasets to the history
- Explain existing datasets and their relationships
- Generate summaries, methods sections, figure legends
- Insert visualizations into a history markdown.
What should it NOT do (at least initially)?
- Directly run Galaxy tools (that's the orchestrator agent's job)
- Modify datasets or history metadata
- Auto-save without user approval
User Assessment: Agreed!
How should agent-proposed changes flow?
Option A: Full replacement proposals
Agent proposes complete new notebook content. User sees diff, clicks "Apply" or "Reject". Apply calls save_new_revision(edit_source="agent").
Option B: Section-level patches
Agent proposes changes to specific sections (identified by markdown headers or line ranges). Multiple proposals can be pending. User applies individually.
Option C: Streaming insertion
Agent streams markdown directly into the editor at cursor position (like GitHub Copilot inline suggestions). User accepts with Tab/Enter or dismisses.
User Assessment: My hunch is all of these are useful. Maybe we start with Option A and then move to Option B or C as user experience optimizations once we have it working.
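A minimal sketch of the Option A flow (full replacement with a diff preview and explicit approval) might look like the following. The `Notebook` and `Revision` classes are in-memory stand-ins for the real models, and `preview` and `apply_proposal` are hypothetical helper names; in Galaxy the apply step would go through save_new_revision(edit_source="agent").

```python
import difflib
from dataclasses import dataclass, field


@dataclass
class Revision:
    content: str
    edit_source: str  # "user" | "agent" | "restore"


@dataclass
class Notebook:
    revisions: list[Revision] = field(default_factory=list)

    @property
    def content(self) -> str:
        # The latest revision is the current content.
        return self.revisions[-1].content if self.revisions else ""


def preview(notebook: Notebook, proposed: str) -> str:
    """Unified diff between the current content and the agent's proposal."""
    diff = difflib.unified_diff(
        notebook.content.splitlines(keepends=True),
        proposed.splitlines(keepends=True),
        fromfile="current",
        tofile="proposed",
    )
    return "".join(diff)


def apply_proposal(notebook: Notebook, proposed: str, approved: bool) -> bool:
    """Append an agent revision only if the user approved; never overwrite."""
    if not approved or proposed == notebook.content:
        return False
    notebook.revisions.append(Revision(proposed, edit_source="agent"))
    return True


nb = Notebook([Revision("# Title\nold line\n", "user")])
proposal = "# Title\nnew line\n"
diff_text = preview(nb, proposal)
```

Rejecting a proposal leaves the revision list untouched, and accepting it only ever appends, which matches the requirement that agent edits never destructively overwrite.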
User Questions:
- Are there any Vue libraries that would help us do this? Either full diff, section-level patches, or streaming with Tab/Enter?
- Do we think current models would be able to decide which of the first two interactions is more appropriate based on a question? How would we prompt them to decide between the two? What would the API look like for this?
Should notebook chat conversations be:
Option A: Stored as ChatExchange (existing model)
Reuse the chat infrastructure. Add notebook_id FK to ChatExchange. Conversations persist across sessions.
Option B: Stored as notebook revisions with chat metadata
Each agent interaction creates a revision. Chat history reconstructed from revision sequence. Simpler model but loses non-notebook-modifying conversation turns.
Option C: Separate model
New NotebookChatMessage table linked to notebook. Keeps notebook-specific conversations separate from general ChatGXY conversations.
User Assessment: Probably A works best - I don't see a lot of value in these other two when the A path will be the most heavily used in practice.
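To make Option A concrete, here is a small sketch using the standard-library sqlite3 module rather than Galaxy's SQLAlchemy models. The table and column names are simplified assumptions for illustration; the point is only that a nullable notebook_id foreign key lets general ChatGXY conversations and notebook-scoped conversations share one table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.executescript(
    """
    CREATE TABLE history_notebook (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE chat_exchange (
        id INTEGER PRIMARY KEY,
        user_id INTEGER NOT NULL,
        job_id INTEGER,                                       -- existing optional link
        notebook_id INTEGER REFERENCES history_notebook(id)   -- proposed, nullable
    );
    """
)


def exchanges_for_notebook(db: sqlite3.Connection, user_id: int, notebook_id: int) -> list[int]:
    """IDs of one user's conversations that are scoped to one notebook."""
    rows = db.execute(
        "SELECT id FROM chat_exchange WHERE user_id = ? AND notebook_id = ? ORDER BY id",
        (user_id, notebook_id),
    )
    return [row[0] for row in rows]


conn.execute("INSERT INTO history_notebook (id, title) VALUES (1, 'RNA-seq notes')")
# A general ChatGXY conversation (no notebook) and a notebook-scoped one.
conn.execute("INSERT INTO chat_exchange (id, user_id, notebook_id) VALUES (1, 7, NULL)")
conn.execute("INSERT INTO chat_exchange (id, user_id, notebook_id) VALUES (2, 7, 1)")
```

Reloading the notebook page would then only need the notebook ID to find the conversations to restore.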
User Question: What are foreign-keys like on ChatExchange? Do they link to other artifacts? Would an association object between ChatExchange and HistoryNotebooks buy us anything useful?
Option A: New dedicated agent
Register notebook_assistant in the agent registry. Has its own system prompt, tools, and structured output type. Router can delegate to it when notebook context is present.
Option B: Extend existing router
Add notebook tools to the router agent. No new agent type; the router handles notebook queries when notebook context is in the request.
Option C: Orchestrator-based
The orchestrator coordinates between a "notebook writer" agent and other specialists (error_analysis for failed datasets, tool_recommendation for next steps). More powerful but more complex.
User Assessment: When we scope out a plan for this, let's do Option A in the MVP and sketch an outline of Option C in subsequent steps of the implementation plan. After Option C is implemented, also scope out implementing an agent for visualizing history contents. That agent should know about common visualizations and how to build and embed them in History Markdown documents.
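The Option A registration could be sketched as below. The registry here is a minimal stand-in that borrows the method names this document lists for AgentRegistry (register, get_agent, list_agents) without claiming their real signatures, the agent classes are empty placeholders, and `choose_agent_type` is a hypothetical shortcut that approximates "the router delegates when notebook context is present".

```python
from typing import Any, Optional


class AgentRegistry:
    """Minimal stand-in: maps agent_type strings to agent classes."""

    def __init__(self) -> None:
        self._agents: dict[str, type] = {}

    def register(self, agent_type: str, agent_class: type) -> None:
        self._agents[agent_type] = agent_class

    def get_agent(self, agent_type: str) -> type:
        return self._agents[agent_type]

    def list_agents(self) -> list[str]:
        return sorted(self._agents)


class QueryRouterAgent:
    agent_type = "router"


class NotebookAssistantAgent:
    # Placeholder; a real agent would define a system prompt and tools.
    agent_type = "notebook_assistant"


def choose_agent_type(requested: Optional[str], context: dict[str, Any]) -> str:
    """Explicit selection wins; otherwise notebook context picks the notebook agent."""
    if requested:
        return requested
    if context.get("notebook_id"):
        return NotebookAssistantAgent.agent_type
    return QueryRouterAgent.agent_type


registry = AgentRegistry()
for agent_class in (QueryRouterAgent, NotebookAssistantAgent):
    registry.register(agent_class.agent_type, agent_class)
```

For example, `choose_agent_type(None, {"notebook_id": "nb456"})` selects the notebook assistant, while a request without notebook context still goes to the router.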
User Question: What are the ramifications of Option A vs Option B? What are the practical design, runtime, and implementation implications?
How does the agent learn about the history's contents?
Option A: Tool-based discovery
Agent has tools like list_history_datasets(history_id), get_dataset_peek(hid), get_dataset_metadata(hid). Agent calls tools as needed. Scales well — only fetches what's relevant.
Option B: Full context in system prompt
Inject a summary of all history items (HID, name, type, state, size) into the system prompt. Agent has immediate context but burns tokens on large histories.
Option C: Hybrid
System prompt includes a compact summary (HID + name + type for all items). Tools available for deep-dive (peek at content, metadata, collection structure).
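The difference between the options can be sketched over an in-memory history. Everything here is illustrative: `HistoryItem` is a hypothetical record, and the functions reuse the tool names from Option A but take the item list directly, whereas real tools would presumably reach the history through the injected dependencies.

```python
from dataclasses import dataclass


@dataclass
class HistoryItem:
    hid: int
    name: str
    extension: str
    state: str
    peek: str = ""


def compact_summary(items: list[HistoryItem], limit: int = 50) -> str:
    """One line per item (HID, name, type) for the system prompt, as in Option C."""
    lines = [f"{i.hid}: {i.name} ({i.extension})" for i in items[:limit]]
    if len(items) > limit:
        omitted = len(items) - limit
        lines.append(f"... {omitted} more item(s) omitted; call list_history_datasets for details")
    return "\n".join(lines)


def list_history_datasets(items: list[HistoryItem], name_contains: str = "") -> list[dict]:
    """Tool-style lookup (Option A): fetch only the items relevant to the question."""
    needle = name_contains.lower()
    return [
        {"hid": i.hid, "name": i.name, "type": i.extension, "state": i.state}
        for i in items
        if needle in i.name.lower()
    ]


def get_dataset_peek(items: list[HistoryItem], hid: int, max_chars: int = 200) -> str:
    """Tool-style deep dive: a truncated content preview for one dataset."""
    for item in items:
        if item.hid == hid:
            return item.peek[:max_chars]
    raise KeyError(f"no dataset with hid={hid}")


history = [
    HistoryItem(1, "reads.fastq", "fastqsanger", "ok", "@SEQ1\nACGT"),
    HistoryItem(2, "counts.tsv", "tabular", "ok", "gene\tcount\nA\t10"),
]
```

The `limit` parameter is one way to keep the prompt bounded on large histories, at the cost of forcing a tool call for anything past the cutoff.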
User Assessment:
Yeah - I don't see a world where Option B or C seem viable at this time. We will need something else.
User Question:
I think we're waiting on an MCP branch here to do this right, but please review PR 21706, summarized at "/Users/jxc755/projects/repositories/galaxy-brain/vault/research/PR 21706 - Data Analysis Agent Integration.md", and provide me an answer for what this should look like now.
| File | What's There |
|---|---|
| lib/galaxy/model/__init__.py (lines 11374-11461) | HistoryNotebook + HistoryNotebookRevision models, edit_source field |
| lib/galaxy/managers/history_notebooks.py | Manager: CRUD, save_new_revision(edit_source=), resolve, prepare-for-page |
| lib/galaxy/webapps/galaxy/api/history_notebooks.py | 9 API endpoints |
| lib/galaxy/schema/schema.py (lines 4146-4219) | Notebook request/response schemas |
| lib/galaxy/managers/markdown_util.py (lines 1364-1391) | resolve_history_markdown(), HID resolution |
| lib/galaxy/managers/markdown_parse.py | VALID_ARGUMENTS with hid= support |
| client/src/stores/historyNotebookStore.ts | Pinia store (dirty tracking, revisions) |
| client/src/components/HistoryNotebook/HistoryNotebookView.vue | Main orchestrator component |
| client/src/components/HistoryNotebook/HistoryNotebookEditor.vue | Editor wrapper |
| client/src/components/HistoryNotebook/NotebookRevisionList.vue | Revision sidebar (shows edit_source labels) |
| client/src/api/historyNotebooks.ts | TypeScript API client |
| File | What's There |
|---|---|
| lib/galaxy/agents/__init__.py | Agent registry, 5 registered agents |
| lib/galaxy/agents/base.py | BaseGalaxyAgent ABC, GalaxyAgentDependencies, tool calling, retry logic |
| lib/galaxy/agents/registry.py | AgentRegistry (register, get_agent, list_agents) |
| lib/galaxy/managers/agents.py | AgentService (create_dependencies, execute_agent, route_and_execute) |
| lib/galaxy/agents/router.py | QueryRouterAgent (handoff via output functions) |
| lib/galaxy/agents/tools.py | ToolRecommendationAgent (example of @agent.tool pattern with Galaxy toolbox) |
| lib/galaxy/agents/error_analysis.py | ErrorAnalysisAgent (structured output example) |
| lib/galaxy/schema/agents.py | AgentResponse, ActionSuggestion, ActionType, ConfidenceLevel |
| lib/galaxy/webapps/galaxy/api/chat.py | Chat API (multi-turn, exchange_id, agent routing) |
| lib/galaxy/managers/chat.py | ChatManager (ChatExchange persistence, message storage, history) |
| client/src/components/ChatGXY.vue | Full-page chat UI (conversation, agent selector, actions) |
| client/src/composables/agentActions.ts | Frontend action dispatch (tool_run, save_tool, etc.) |
| lib/galaxy/config/schemas/config_schema.yml | inference_services per-agent config |
| File | What's There |
|---|---|
| client/src/components/Markdown/MarkdownEditor.vue | Shared editor (modes: report, page, history_notebook) |
| client/src/components/Markdown/Markdown.vue | Rendered markdown view (used by Pages, notebook displayOnly) |
| client/src/components/Markdown/Editor/TextEditor.vue | CodeMirror textarea with drag-drop support |
- User can chat with an agent that understands their history contents and current notebook
- Agent can propose notebook edits that the user previews before applying
- Applied agent edits create revisions with edit_source="agent", visible in revision sidebar
- Multi-turn conversation persists across page reloads
- Agent uses hid=N references (not encoded IDs) when writing notebook content
- Agent can reference specific datasets by HID in its explanations
- The experience feels like collaborative editing, not a separate chat that happens to modify a file
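One of the criteria above, that agent-written content uses hid=N rather than encoded IDs, lends itself to an automated check before a proposal is shown to the user. The sketch below is an assumption about how such a check might look: `check_agent_content` is a hypothetical helper, the list of ID-bearing argument names is illustrative rather than exhaustive, and matching is done with simple regexes instead of Galaxy's markdown parser.

```python
import re

# Arguments that carry IDs instead of history-relative HIDs (assumed list).
ID_ARGS = re.compile(r"\b(history_dataset_id|history_dataset_collection_id)=([\w-]+)")
HID_ARGS = re.compile(r"\bhid=(\d+)")


def check_agent_content(markdown: str, known_hids: set[int]) -> list[str]:
    """Return a list of problems; an empty list means the content passes."""
    problems = []
    for match in ID_ARGS.finditer(markdown):
        problems.append(f"{match.group(1)}={match.group(2)} should be written as hid=N")
    for match in HID_ARGS.finditer(markdown):
        hid = int(match.group(1))
        if hid not in known_hids:
            problems.append(f"hid={hid} does not exist in this history")
    return problems


ok = check_agent_content("history_dataset_display(hid=3)", {1, 2, 3})
bad_id = check_agent_content("history_dataset_display(history_dataset_id=abc123)", {1})
bad_hid = check_agent_content("history_dataset_as_table(hid=9)", {1})
```

Problems found this way could be fed back to the agent as a retry prompt rather than surfaced to the user.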