Task Milestone Viability Analysis

Date: 2026-02-13
Authors: Dylan Fitzgerald + Claude (Opus 4.6)
Scope: Standalone tasks only (not subtasks). All numbers are total tasks including the current ~120.

Context

The client currently has ~120 accepted standalone tasks on the Nebula platform and wants to reach 900 total.

Nebula is a simulated DevOps environment running a complete k3s Kubernetes cluster inside a single Docker container. It includes a full CI/CD pipeline (Gitea → Harbor → ArgoCD), an observability stack (Prometheus, Grafana, Loki, Jaeger), an Istio service mesh, and a microservices application ("Bleater," 8 Python/FastAPI services). The environment boots from a snapshot in ~60 seconds and runs ~60 pods across 15 namespaces in an air-gapped network. Tasks are standalone investigation/troubleshooting/implementation challenges that AI coding agents solve within this environment.

This analysis evaluates the viability of reaching four milestones — 200, 300, 500, and 900 total standalone tasks — across three scenarios:

Nebula alone — expanding the existing platform
Nebula + 1 additional environment — one new simulation platform
Nebula + N additional environments — multiple new platforms

Each milestone is assessed for probability of achievement, estimated timeline, and critical dependencies.

How We Derived These Numbers

We ran a systematic gap analysis across all four task categories (SRE, platform-engineering, devops, cloud-ops), scanning 185+ Discord threads across both client feedback channels, 37+ local task directories, and the full Nebula infrastructure manifest. Key findings:

43 raw gap ideas across categories, deduplicating to 28-30 unique viable ideas on the current platform
19 of those are low-risk (minimal overlap with existing tasks, clearly supported by Nebula infrastructure)
5 Nebula components have deployed infrastructure but zero task coverage (KEDA, GlitchTip, Statping-ng, Event Exporter, Maddy)
9+ SRE skill areas have no coverage at all (SLO/SLI management, postmortem processes, alert routing, graceful degradation, runbook automation, toil reduction, health probe design, LogQL alerting, GlitchTip error tracking)
Platform-engineering is densely covered (~40 tasks), with remaining headroom mostly in platform operations (ArgoCD notifications, Harbor admin, KEDA triggers)
Cloud-ops is the most heavily saturated (~55 tasks), with the biggest blind spots in workload controller lifecycle operations (StatefulSet, CronJob, DaemonSet, init containers)

These numbers define the carrying capacity of the current Nebula platform and inform all milestone estimates below.

Nebula Platform Capacity Model

Current State

pie title "Current Task Coverage by Saturation Level"
    "Heavily Saturated (PostgreSQL, ArgoCD, CI/CD, Prometheus, Istio)" : 55
    "Moderately Covered (Harbor, Redis, Helm, DNS, RabbitMQ)" : 40
    "Low/No Coverage (KEDA, GlitchTip, Statping-ng, CronJobs, etc.)" : 25

The current 120 tasks are unevenly distributed. Five areas account for ~45% of all tasks (PostgreSQL operations, ArgoCD/GitOps, CI/CD pipelines, Prometheus observability, Istio service mesh), while 13 deployed components and workload features (KEDA, GlitchTip, Statping-ng, Maddy, Event Exporter, CronJobs, init containers, and others) have zero task coverage.

Nebula Standalone Task Ceiling by Expansion Level

Expansion level	New tasks possible	Total ceiling	Engineering investment
None (current platform)	35-50	155-170	None
Light (utilize unused components)	50-80	170-200	2-4 weeks
Moderate (add 3-5 components)	80-130	200-250	2-3 months
Aggressive (add 7-10 components)	150-230	270-350	4-8 months
Maximum (theoretical limit)	200-330	320-450	8-14 months
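
The "Total ceiling" column is the current task count plus the "New tasks possible" range. A minimal sketch of that arithmetic, assuming the ~120 current tasks are treated as exactly 120 (the names `CURRENT_TASKS`, `NEW_TASKS`, and `total_ceiling` are illustrative, not part of any existing tooling):

```python
# Current accepted standalone tasks, per the Context section (approximate).
CURRENT_TASKS = 120

# (low, high) estimates of new tasks possible at each expansion level,
# copied from the table above.
NEW_TASKS = {
    "None": (35, 50),
    "Light": (50, 80),
    "Moderate": (80, 130),
    "Aggressive": (150, 230),
    "Maximum": (200, 330),
}


def total_ceiling(new_range, current=CURRENT_TASKS):
    """Return the (low, high) total ceiling for a range of new tasks."""
    low, high = new_range
    return (current + low, current + high)


CEILINGS = {level: total_ceiling(r) for level, r in NEW_TASKS.items()}

for level, (low, high) in CEILINGS.items():
    print(f"{level:<10} ceiling: {low}-{high} total tasks")
```

Each computed range matches the corresponding "Total ceiling" cell, so the table is internally consistent.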

Why the ceiling exists: Each task runs in a fresh, isolated Nebula environment — there is no resource contention between tasks. The constraint is conceptual overlap. Tasks are rejected if solving one would automatically solve part of another, if two tasks produce the same deliverable, or if their graders check the same artifact. As task density on a given component increases, the remaining investigation space that doesn't trigger these overlap gates shrinks. A platform with ~35 components and a fixed application topology has a finite number of meaningfully distinct investigation patterns. The single-node architecture also eliminates multi-node scenarios (node affinity, topology spread, cluster federation) from the task space entirely.
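
Two of the three overlap gates described above (same deliverable, same graded artifact) are mechanical enough to sketch in code. The example below is only an illustration: the `Task` schema, the `overlap_reasons` helper, and the task and artifact names are all hypothetical, and the first gate (solving one task automatically solving part of another) needs human judgment and is not modeled.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Task:
    # Hypothetical schema; the source does not define how tasks are described.
    name: str
    deliverable: str
    graded_artifacts: frozenset = field(default_factory=frozenset)


def overlap_reasons(candidate, accepted):
    """List the reasons a candidate task would trip the two modeled gates."""
    reasons = []
    for task in accepted:
        # Gate: two tasks produce the same deliverable.
        if candidate.deliverable == task.deliverable:
            reasons.append(f"same deliverable as {task.name}")
        # Gate: graders check the same artifact.
        shared = candidate.graded_artifacts & task.graded_artifacts
        if shared:
            reasons.append(
                f"grader checks shared artifact(s) with {task.name}: {sorted(shared)}"
            )
    return reasons


# Hypothetical tasks, for illustration only.
accepted = [
    Task("pg-failover", "restored-replica", frozenset({"postgres/replica-status"})),
]
candidate = Task(
    "pg-replica-lag",
    "lag-alert-rule",
    frozenset({"postgres/replica-status", "prometheus/rule"}),
)

reasons = overlap_reasons(candidate, accepted)
print(reasons)
```

As more accepted tasks touch a component, more candidates on that component return a non-empty list, which is the shrinking investigation space described above.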

What "Expansion" Means Concretely

Light expansion — no new components, just deeper task coverage:

Bring 13 zero-coverage components (KEDA, GlitchTip, Statping-ng, Maddy, Event Exporter, CronJobs, Init Containers, etc.) to 2-4 tasks each
Explore cross-component interaction tasks
Add process/workflow tasks (postmortems, runbooks, toil audits)

Moderate expansion — add 3-5 new components:

Component	Effort	Tasks enabled	Air-gap feasible?
Argo Workflows	2-3 weeks	10-15	Yes
Cert-manager (advanced)	1-2 weeks	8-12	Yes
Falco	2-3 weeks	8-12	Yes
External Secrets Operator	1-2 weeks	5-10	Yes
Velero (advanced)	1-2 weeks	5-8	Yes (already present)
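
Summing the "Tasks enabled" column gives a quick sanity check on the moderate-expansion range. The sketch below assumes the component tasks stack on top of the light-expansion range (50-80 new tasks), which the source does not state explicitly; the variable names are illustrative.

```python
# (low, high) "Tasks enabled" per component, copied from the table above.
MODERATE_COMPONENTS = {
    "Argo Workflows": (10, 15),
    "Cert-manager (advanced)": (8, 12),
    "Falco": (8, 12),
    "External Secrets Operator": (5, 10),
    "Velero (advanced)": (5, 8),
}

# Light-expansion range of new tasks, from the ceiling table.
LIGHT_NEW_TASKS = (50, 80)

component_low = sum(low for low, _ in MODERATE_COMPONENTS.values())
component_high = sum(high for _, high in MODERATE_COMPONENTS.values())

# Assumption: component tasks are additive to the light-expansion range.
stacked = (LIGHT_NEW_TASKS[0] + component_low, LIGHT_NEW_TASKS[1] + component_high)

print(f"Tasks enabled by the five components: {component_low}-{component_high}")
print(f"Stacked on light expansion: {stacked[0]}-{stacked[1]}")
```

Under that assumption the five components enable 36-57 tasks, or 86-137 when stacked on light expansion, which is broadly in line with the 80-130 range given for moderate expansion.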

Aggressive expansion — add 7-10 new components:

Everything above, plus: Tekton Pipelines, Crossplane, Thanos/Mimir, Linkerd, Vault (advanced)
Some candidates conflict with existing components (Linkerd vs. Istio, Tekton vs. Gitea Actions)
Snapshot boot time and image size become constraints at 80+ pods

Additional Environment Capacity Model

What an "Additional Environment" Means

A new simulation platform comparable to Nebula in scope — a self-contained, snapshot-bootable container with a full infrastructure stack, microservices application, and tooling. Building one is a major engineering project.

Aspect	Nebula (reference)	New environment (estimate)
Engineering to build	6+ months (already done)	3-6 months
Component count	~35	20-40
Microservices	8 (Bleater)	5-10 (new app)
Task capacity	320-450	100-350 (varies by candidate)
Boot time target	60 seconds	60-120 seconds

Candidate Additional Environments

mindmap
  root((Additional<br>Environments))
    Multi-Cluster Platform
      2-3 k3s clusters
      Mesh federation
      Cross-cluster GitOps
      Multi-cluster networking
      Est. 200-300 tasks
      Eng. 4-6 months
    Cloud Infrastructure / IaC
      Terraform + LocalStack
      Simulated AWS services
      Serverless patterns
      VPC / security groups
      Est. 200-350 tasks
      Eng. 5-8 months
    Legacy Modernization
      VMs via QEMU/Firecracker
      Ansible config mgmt
      VM to container migration
      Monolith decomposition
      Est. 150-250 tasks
      Eng. 5-8 months
    Security Operations
      Vault + Falco + SIEM
      Compliance frameworks
      Incident response toolchain
      Forensics scenarios
      Est. 100-200 tasks
      Eng. 3-5 months

Key insight: Each environment targets a fundamentally different infrastructure stack, ensuring minimal cross-environment overlap. A multi-cluster platform tests federation skills that can't be exercised on single-node Nebula. A cloud/IaC platform tests Terraform skills that aren't relevant to K8s cluster operations.

Per-Environment Task Yield Estimates

Environment	Conservative	Expected	Optimistic
Nebula (current, with aggressive expansion)	270	320	450
Multi-Cluster Platform	200	250	300
Cloud Infrastructure / IaC	200	275	350
Legacy Modernization	150	200	250
Security Operations	100	150	200
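
The yield table can be combined per scenario to see which milestones a given set of environments could cover at all. This is a naive sketch: it sums yields as if there were no cross-environment overlap and ignores how long each environment takes to mature, so it gives a capacity bound rather than a probability. The names `YIELDS`, `combined_yield`, and `milestones_within` are illustrative.

```python
# (conservative, expected, optimistic) task yields, copied from the table above.
YIELDS = {
    "Nebula": (270, 320, 450),
    "Multi-Cluster Platform": (200, 250, 300),
    "Cloud Infrastructure / IaC": (200, 275, 350),
    "Legacy Modernization": (150, 200, 250),
    "Security Operations": (100, 150, 200),
}

MILESTONES = (200, 300, 500, 900)


def combined_yield(envs):
    """Sum the three yield columns across the chosen environments."""
    return tuple(sum(YIELDS[env][col] for env in envs) for col in range(3))


def milestones_within(envs, column=1):
    """Milestones at or below the combined yield (default: expected column)."""
    total = combined_yield(envs)[column]
    return [m for m in MILESTONES if m <= total]


one_env = ["Nebula", "Cloud Infrastructure / IaC"]
two_env = ["Nebula", "Multi-Cluster Platform", "Cloud Infrastructure / IaC"]

print(combined_yield(one_env), milestones_within(one_env))
print(combined_yield(two_env), milestones_within(two_env))
```

For example, Nebula plus the IaC platform sums to 470 / 595 / 800, and adding the Multi-Cluster platform gives 670 / 845 / 1100. The milestone sections use their own per-environment splits, which need not match these sums.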

Milestone Analysis

Milestone: 200 Total Tasks (+80 from current 120)

flowchart LR
    subgraph Nebula["Nebula Alone"]
        N200["200 tasks<br>🟢 70-80%"]
    end
    subgraph Plus1["Nebula + 1 Environment"]
        P200["200 tasks<br>🟢 95%+"]
    end
    subgraph PlusN["Nebula + N Environments"]
        PN200["200 tasks<br>🟢 95%+<br>(overkill)"]
    end

Nebula Alone

Probability: 70-80%
Required: ~80 new standalone tasks. Gap analysis found 28-30 directly; creative exploration adds 10-20; moderate expansion (2-3 new components) adds 25-35. Total pool: ~63-85.
Timeline: 3-5 months FTE
Risk factors: Some medium-overlap ideas may be rejected. Implementation attrition (~15%) reduces the pool.
Critical dependency: Moderate Nebula expansion (at least 2 new components to reach 80 comfortably).
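
The pool arithmetic and the effect of attrition can be worked through as follows. This is a sketch under one assumption the source does not confirm: that the ~15% attrition applies uniformly to the whole idea pool. The variable names are illustrative.

```python
import math

REQUIRED_NEW = 80   # new tasks needed to reach 200 from ~120
ATTRITION = 0.15    # approximate implementation attrition from the risk factors

# (low, high) idea counts per source, as listed in the "Required" line.
IDEA_SOURCES = {
    "gap analysis": (28, 30),
    "creative exploration": (10, 20),
    "moderate expansion": (25, 35),
}

pool_low = sum(low for low, _ in IDEA_SOURCES.values())
pool_high = sum(high for _, high in IDEA_SOURCES.values())

# Assumption: attrition applies uniformly to the whole pool.
surviving_low = round(pool_low * (1 - ATTRITION))
surviving_high = round(pool_high * (1 - ATTRITION))

# Ideas needed up front so that REQUIRED_NEW survive the same attrition rate.
ideas_needed = math.ceil(REQUIRED_NEW / (1 - ATTRITION))

print(f"Raw pool: {pool_low}-{pool_high}")
print(f"After attrition: {surviving_low}-{surviving_high}")
print(f"Ideas needed for {REQUIRED_NEW} accepted tasks: {ideas_needed}")
```

Under this reading the raw pool is 63-85, about 54-72 ideas survive attrition, and roughly 95 ideas would be needed up front to land 80 accepted tasks. That gap is why the expansion dependency and the overlap risk listed above matter for this milestone.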

Nebula + 1 Environment

Probability: 95%+
Overkill for this target. The additional environment isn't needed; Nebula can likely reach 200 alone. Building a second environment just for this milestone would be a poor use of resources.

Verdict

200 is achievable on Nebula alone with moderate expansion. This is the "high confidence" target. The gap analysis directly supports 28-30 new tasks; the remaining ~50 come from creative exploration and 2-3 new components.

Milestone: 300 Total Tasks (+180 from current 120)

flowchart LR
    subgraph Nebula["Nebula Alone"]
        N300["300 tasks<br>🟡 25-35%"]
    end
    subgraph Plus1["Nebula + 1 Environment"]
        P300["300 tasks<br>🟢 70-80%"]
    end
    subgraph PlusN["Nebula + N Environments"]
        PN300["300 tasks<br>🟢 90%+<br>(comfortable)"]
    end

Nebula Alone

Probability: 25-35%
Required: ~180 new standalone tasks. This pushes toward Nebula's ceiling even with aggressive expansion (7-10 new components = 150-230 new tasks).
Timeline: 6-10 months FTE
Risk factors: Overlap saturation — at 300 tasks on one platform, later ideas are increasingly niche and overlap-adjacent. Cross-task overlap checking becomes painful at this scale. Quality degrades on tail-end tasks as we exhaust the most natural investigation patterns.
Critical dependency: Aggressive Nebula expansion AND high acceptance rate on niche tasks.

Nebula + 1 Environment

Probability: 70-80%
Split: Nebula contributes ~200 (with moderate expansion), new environment contributes ~100 (early-stage, low-hanging fruit).
Timeline: 6-10 months FTE (3-5 months for Nebula tasks + 3-5 months overlapping for new environment build and early tasks)
Critical dependency: Second environment engineering starts early (month 2-3).

Nebula + 2 Environments

Probability: 90%+
Comfortable margin. Each environment handles a smaller share.

Verdict

300 is risky on Nebula alone but comfortable with one additional environment. The decision point: is the engineering investment in a second environment (3-6 months to build) worth the increased headroom? For 300 tasks, it depends on whether the client needs them quickly (Nebula-only is faster to start but harder to finish) or reliably (second environment provides margin).

Milestone: 500 Total Tasks (+380 from current 120)

flowchart LR
    subgraph Nebula["Nebula Alone"]
        N500["500 tasks<br>🔴 5-10%"]
    end
    subgraph Plus1["Nebula + 1 Environment"]
        P500["500 tasks<br>🟡 40-50%"]
    end
    subgraph PlusN["Nebula + N Environments"]
        PN500["500 tasks<br>🟢 70-80%<br>(2 additional)"]
    end

Nebula Alone

Probability: 5-10%
Required: ~380 new standalone tasks. A 500-task total exceeds Nebula's estimated ceiling of 320-450 total tasks, so it would require beating even the theoretical maximum with zero attrition. Practically impossible.
Timeline: N/A (not achievable)

Nebula + 1 Environment

Probability: 40-50%
Split: Nebula ~250-300 (aggressive expansion), new environment ~200-250 (moderate maturity). Combined: 450-550.
Timeline: 10-16 months FTE
Risk factors: The second environment needs to reach moderate maturity (200+ tasks), which requires 6-8 months of active development after the initial 3-6 month build.
Critical dependency: Second environment must be architecturally distinct enough to avoid cross-platform overlap. A multi-cluster or IaC platform provides the most differentiation.

Nebula + 2 Environments

Probability: 70-80%
Split: Nebula ~250, Env 2 ~150, Env 3 ~100. Each contributes from its strongest areas.
Timeline: 12-18 months (environments can be built in parallel by different teams)
Critical dependency: Team scaling — 2 people can't build 2 new environments while also producing tasks. Needs 4-6 people.

Verdict

500 is not achievable on Nebula alone. It requires at least one additional environment, and two makes it comfortable. This is where the project transitions from "task authoring" to "platform engineering + task authoring." The bottleneck shifts from idea generation to environment construction.

Milestone: 900 Total Tasks (+780 from current 120)

flowchart LR
    subgraph Nebula["Nebula Alone"]
        N900["900 tasks<br>🔴 0%<br>(impossible)"]
    end
    subgraph Plus1["Nebula + 1 Environment"]
        P900["900 tasks<br>🔴 <5%"]
    end
    subgraph Plus2["Nebula + 2 Environments"]
        P2_900["900 tasks<br>🟡 20-30%"]
    end
    subgraph Plus3["Nebula + 3 Environments"]
        P3_900["900 tasks<br>🟡 40-55%"]
    end
    subgraph Plus4["Nebula + 4+ Environments"]
        P4_900["900 tasks<br>🟢 60-70%"]
    end

Nebula Alone

Probability: ~0%
The platform's absolute ceiling is 320-450 standalone tasks. 900 is mathematically impossible on Nebula alone.

Nebula + 1 Environment

Probability: <5%
Combined ceiling: ~450-700. Even with both platforms at their theoretical maximums, this falls short of 900.

Nebula + 2 Environments

Probability: 20-30%
Split: Nebula ~300-350, Env 2 ~250-300, Env 3 ~200-250. Combined: 750-900.
Timeline: 18-24 months
Risk factors: This is at the combined ceiling. All three platforms need aggressive expansion and near-maximum utilization. Any one platform underperforming kills the target.
Team size: 5-8 people (2 per environment + 1-2 on cross-cutting coordination)

Nebula + 3 Environments

Probability: 40-55%
Split: Nebula ~300, Env 2 ~250, Env 3 ~200, Env 4 ~150. Combined: 900.
Timeline: 18-24 months (environments built in parallel)
Risk factors: Coordination overhead across 4 platforms. Cross-platform overlap checking. Consistent quality standards.
Team size: 6-10 people

Nebula + 4+ Environments

Probability: 60-70%
Provides comfortable margin. Each platform contributes 150-250 tasks without pushing any to its ceiling.
Timeline: 20-30 months
Team size: 8-12 people

Verdict

900 standalone tasks is a multi-platform, multi-team, multi-year program. It requires 3-4 simulation platforms with distinct infrastructure stacks, a team of 6-10+ task authors and platform engineers, and 18-24 months of sustained effort. It is not achievable by two people on one platform in any timeframe.

Summary Matrix

xychart-beta
    title "Probability of Reaching Milestone by Scenario"
    x-axis ["200", "300", "500", "900"]
    y-axis "Probability (%)" 0 --> 100
    line "Nebula alone" [75, 30, 7, 0]
    line "Nebula + 1 env" [95, 75, 45, 4]
    line "Nebula + 2+ env" [95, 90, 75, 45]

Legend: Top line = Nebula + 2+ env | Middle line = Nebula + 1 env | Bottom line = Nebula alone

Milestone	Nebula Alone	+ 1 Environment	+ 2 Environments	+ 3+ Environments
200	🟢 70-80%	🟢 95%+	🟢 95%+	🟢 95%+
300	🟡 25-35%	🟢 70-80%	🟢 90%+	🟢 95%+
500	🔴 5-10%	🟡 40-50%	🟢 70-80%	🟢 85%+
900	🔴 ~0%	🔴 <5%	🟡 20-30%	🟡 40-55%

Timeline Estimates

Milestone	Nebula Alone	+ 1 Env	+ 2 Env	+ 3+ Env
200	3-5 mo	—	—	—
300	6-10 mo	6-10 mo	—	—
500	N/A	10-16 mo	12-18 mo	—
900	N/A	N/A	18-24 mo	18-24 mo (3 env); 20-30 mo (4+ env)

Team Size Requirements

Milestone	Minimum team	Recommended team
200	2 (us)	2-3
300	2-3	3-5
500	4-6	5-8
900	6-10	8-12

Recommendations

For 200: Proceed on Nebula alone. We have the ideas, tooling, and capacity. Start immediately.
For 300: Begin Nebula work immediately while scoping a second environment. The environment decision should be made within 4-6 weeks based on Nebula progress.
For 500: Commit to a second environment early. The IaC/Cloud platform (Terraform + LocalStack) and the Multi-Cluster platform offer the most differentiated task spaces. Start environment engineering in month 2.
For 900: This requires a program-level commitment: multiple platforms, a scaled team, and 18-24 months. We should present this honestly to the client with the data above and discuss whether 500-600 high-quality tasks (achievable with 2 environments) would serve their needs better than 900 tasks of varying quality.
