Summary of open GitHub issues related to workflow extraction limitations in Galaxy.
#17506 - Convert workflow extraction interface to Vue
Status: Open | Labels: enhancement, UI-UX, refactoring, backend
The build_from_current_history Mako template is the last Mako template in Galaxy that is not used for data display. It needs to be converted to FastAPI + Vue.
Discussion points:
- Option A: Keep selection UI, convert to Vue
- Option B: Extract full workflow automatically, let users edit in workflow editor
- Concern about large histories generating "a lot of crap" without selection
- jmchilton: "extracting workflows from histories is pretty much the core idea of Galaxy"
#9161 - Extracting workflow from history with copied datasets breaks
Status: Open | Labels: bug, workflows
When datasets are copied from other histories:
- All connections are broken
- The extracted workflow includes tools from the original history that were never run in the target history
- Copied datasets are not treated as workflow inputs (they should behave like uploads)
Workaround: Download and re-upload datasets instead of copying.
#21336 - Extract workflow from history misses connections
Status: Open (Nov 2025)
- Extracted workflow has 5 tools without any inputs
- May be related to job cache usage
- Reported on usegalaxy.eu (v25.0)
#13823 - Workflow extraction fails (in specific identified cases)
Status: Open | Labels: bug
Fails when:
- A tool has multiple collection outputs (e.g., FastQC)
- One output collection is copied to a new history
- Another tool runs on that copied collection
Result: extraction produces an empty workflow.
#12590 - Expression tool disconnected when extracting workflow
Status: Open | Labels: bug, workflows, paper-cut
- Multiple root datasets cause label mix-ups
- "Compose text parameter value" left disconnected
- Input dataset labels assigned incorrectly
#14541 - Extract Workflow does not include Extract Dataset Tool
Status: Open | Labels: bug
- Extract Dataset tool missing from extracted workflows
- Creates duplicate input files instead of preserving tool chain
#12236 - Extracting workflows with Unzip Collection for Copied Collections is Broken
Status: Open | Labels: bug, workflows, dataset-collections
#14423 - build_from_current_history crashed with key error
Status: Open | Labels: bug, workflows
AttributeError: 'dict' object has no attribute 'hid'
- Crash in `extract.py:416` within `__cleanup_param_values`
- Occurs with collections and deferred datasets (DL)
- Expected HDA object, received dict with `{'values': [{'src': 'hda', 'id': '...'}]}`
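The type mismatch behind this crash can be reproduced in isolation. The sketch below is not Galaxy code and not a proposed patch: `FakeHDA`, `hid_or_none`, and `referenced_ids` are hypothetical names, and the id `abc123` is made up. It only shows why reading `.hid` from the unresolved dict raises `AttributeError`, and how a caller could tell the two shapes apart.

```python
from typing import Any, Optional


def hid_or_none(value: Any) -> Optional[int]:
    """Return value.hid when the attribute exists, else None instead of raising."""
    return getattr(value, "hid", None)


def referenced_ids(value: Any) -> list[str]:
    """Collect dataset ids from a {'values': [{'src': ..., 'id': ...}]} dict."""
    if isinstance(value, dict):
        return [
            ref["id"]
            for ref in value.get("values", [])
            if isinstance(ref, dict) and "id" in ref
        ]
    return []


class FakeHDA:
    """Hypothetical stand-in for a dataset object that carries a HID."""

    def __init__(self, hid: int) -> None:
        self.hid = hid


# Same shape as the dict reported in the issue; the id value is invented.
unresolved = {"values": [{"src": "hda", "id": "abc123"}]}

print(hid_or_none(FakeHDA(7)))     # 7
print(hid_or_none(unresolved))     # None
print(referenced_ids(unresolved))  # ['abc123']
```

Whether such a reference should be resolved back to a dataset or skipped during extraction is a design decision this sketch does not settle.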
#5189 - workflow extraction failed
Status: Open (2018) | Labels: bug
#6126 - cannot extract workflow with history that contains 'quast' tool
Status: Open | Labels: bug, workflows
#6714 - Option of applying dataset annotations to extracted workflow
Status: Open | Labels: feature-request
#7003 - Feature: Extract workflow also to include queued jobs
Status: Open | Labels: feature-request, workflows
Include jobs that are queued but not yet completed in extraction.
#17194 - Set a label for checkboxes in workflow extraction view
Status: Open | Labels: UI-UX, feature-request, paper-cut, accessibility
Related issues already fixed:
- #18484 - Workflow extraction from history with empty collection will fail (Fixed)
- #11059 - Workflow extraction fails for list:paired inputs copied from another history (Fixed)
- #19524 - Extract workflow from purged history does not work (Fixed)
Common patterns across the open issues:
- Copied datasets are a major source of extraction failures
- Multi-output tools with collections cause issues
- Expression tools and parameter-based tools often disconnect
- Deferred/DL datasets can cause crashes
- Legacy Mako UI limits modern UX improvements
The current extraction architecture relies on tracing HDAs/HDCAs back to their creating jobs. When this tracing fails (no job exists, job in wrong history, etc.), extraction breaks. See WORKFLOW_EXTRACTION_LIMITATIONS.md for detailed evidence.
Issues where job-based extraction is the likely cause:
| Issue | Problem | Why Job-Based Extraction Fails |
|---|---|---|
| #9161 | Copied datasets break extraction | creating_job_associations points to the job in the original history, which yields wrong HIDs and pulls in foreign jobs |
| #21336 | Missing connections | Job cache causes associations to point to cached jobs in different history contexts |
| #13823 | Multi-output copied collection fails | Partial copy of collection outputs breaks job tracing chain |
| #14541 | Extract Dataset tool missing | Association chain incomplete for __EXTRACT_DATASET__ tool |
| #7003 | Queued jobs not included | JobParameter records may not be available until job completes |

Issues possibly related to the job-based model:
| Issue | Problem | Possible Model Relationship |
|---|---|---|
| #12590 | Expression tool disconnected | Parameter-based inputs handled differently than dataset inputs in job associations |
| #12236 | Unzip Collection with copied collections | Variant of copied collection problem |
| #14423 | Key error crash with deferred datasets | Object reconstruction failure when tracing job parameters |

Issues unrelated to the extraction model:
| Issue | Problem | Actual Cause |
|---|---|---|
| #17506 | Convert to Vue | UI framework modernization |
| #17194 | Checkbox labels | UI accessibility |
| #6714 | Dataset annotations | Metadata feature request |
| #5189 | Extraction failed (2018) | Insufficient information |
| #6126 | QUAST tool fails | Tool-specific compatibility |
The copied dataset problem is the most prevalent root cause, affecting at least three open issues (#9161, #13823, #12236, and possibly #21336).
Why it happens:
- User copies dataset/collection from History A to History B
- The copied item's `creating_job_associations` still points to the job in History A
- When extracting from History B:
  - Extraction follows the association to History A's job
  - That job's inputs/outputs reference HIDs in History A
  - HID mismatches cause broken connections
  - Jobs from History A incorrectly appear in the extracted workflow
Current workaround: Download and re-upload datasets instead of copying.
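The failure mode described above can be illustrated with a toy model. This is a minimal sketch under stated assumptions, not Galaxy's data model: `Job`, `Dataset`, `copy_to`, and `foreign_jobs` are hypothetical names, and the single `creating_job` field stands in for `creating_job_associations`.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Job:
    tool_id: str
    history_id: str
    input_hids: list[int] = field(default_factory=list)


@dataclass
class Dataset:
    hid: int
    history_id: str
    creating_job: Optional[Job] = None  # stand-in for creating_job_associations

    def copy_to(self, history_id: str, new_hid: int) -> "Dataset":
        # The copy gets a new HID in the target history but keeps the
        # association to the job that ran in the source history.
        return Dataset(hid=new_hid, history_id=history_id, creating_job=self.creating_job)


def foreign_jobs(history_id: str, datasets: list[Dataset]) -> list[Job]:
    """Return creating jobs that did not run in the history being extracted."""
    return [
        d.creating_job
        for d in datasets
        if d.creating_job is not None and d.creating_job.history_id != history_id
    ]


# History A: an upload (HID 1) is trimmed into HID 2.
trim = Job(tool_id="trim", history_id="A", input_hids=[1])
trimmed_in_a = Dataset(hid=2, history_id="A", creating_job=trim)

# History B: the trimmed dataset is copied in and receives HID 1 there.
copied_into_b = trimmed_in_a.copy_to("B", new_hid=1)

leaked = foreign_jobs("B", [copied_into_b])
print([job.tool_id for job in leaked])  # ['trim']
print(leaked[0].input_hids)             # [1]
```

The leaked job's input HID 1 refers to the upload in History A, while HID 1 in History B is the copy itself, which is the kind of mismatch that produces broken connections.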
The ToolRequest model (which already exists in the codebase) would fix the issues attributed to job-based extraction (#9161, #21336, #13823, #14541, #7003) because it is:
- Created at request time: before the job cache lookup and before jobs are queued
- Captures the current history context: not dependent on where the data originated
- Stores the original request dict: parameters are available regardless of job state
- Links to output collections via `ToolRequestImplicitCollectionAssociation`
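To make the contrast with job tracing concrete, here is a minimal sketch of request-based extraction. It assumes a heavily simplified record: `RequestRecord` and `steps_for_history` are hypothetical names and do not reflect the actual ToolRequest schema or any Galaxy API. The point is only that filtering on the history where the request was made never touches jobs from another history.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class RequestRecord:
    """Hypothetical simplification; not Galaxy's actual ToolRequest schema."""

    history_id: str
    tool_id: str
    request: dict[str, Any]
    output_collection_ids: list[str] = field(default_factory=list)


def steps_for_history(records: list[RequestRecord], history_id: str) -> list[dict[str, Any]]:
    """Build workflow steps from the requests made in one history, in order."""
    return [
        {"tool_id": r.tool_id, "params": r.request}
        for r in records
        if r.history_id == history_id
    ]


records = [
    RequestRecord("A", "trim", {"input": {"src": "hda", "id": "a1"}}),
    # Made in History B on a dataset copied from A; the job may still be queued.
    RequestRecord("B", "align", {"input": {"src": "hda", "id": "b1"}}),
]

steps = steps_for_history(records, "B")
print([s["tool_id"] for s in steps])  # ['align']
```

Because the record is written when the request is made, the step is available even if the job is still queued or was satisfied from the job cache.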
Existing TODO comments confirm this is the intended solution:
- `extract.py:323`: "TODO track this via tool request model"
- `test_workflow_extraction.py:265`: "TODO: after adding request models we should be able to recover implicit collection job requests"