@spboyer
Created February 1, 2026 13:01
Agent Skills: References, Recipes & Token Loading Behavior - Best practices documentation

Agent Skills: References, Recipes & Token Loading Behavior

Overview

This document captures learnings about how Agent Skills handle references/, recipes/, and services/ folders, including token budget implications and best practices based on the AgentSkills.io specification and GitHub Copilot implementation behavior.

Progressive Disclosure Model

Agent Skills use a three-tier loading model to efficiently manage LLM context windows:

| Tier | Content | When Loaded | Token Budget | Notes |
|------|---------|-------------|--------------|-------|
| 1. Metadata | name + description | Startup (ALL skills) | ~30 tokens/skill | Enables discovery |
| 2. Instructions | SKILL.md body | On skill activation | < 5,000 tokens (recommended) | Full file loads |
| 3. Resources | references/*, scripts/*, assets/* | Explicitly referenced | Unlimited (JIT) | Not preloaded |
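
The tiers above can be read as a simple additive cost model. The following is a minimal sketch under that reading, not a description of how any agent actually accounts for tokens; the function and parameter names are hypothetical, and the default of 30 tokens per skill is the approximate figure from the table.

```python
def context_tokens(
    installed_skills: int,
    active_skill_body: int = 0,
    loaded_resources: tuple[int, ...] = (),
    metadata_per_skill: int = 30,  # approximate figure from the table above
) -> int:
    """Estimate context tokens under the three-tier model (illustrative only)."""
    tier1 = installed_skills * metadata_per_skill  # paid at startup for ALL skills
    tier2 = active_skill_body                      # paid once a skill activates
    tier3 = sum(loaded_resources)                  # paid only for files actually loaded
    return tier1 + tier2 + tier3


# 10 installed skills, none active: metadata only.
print(context_tokens(10))                    # 300
# Same setup, one active skill (400-token body) that loaded two references.
print(context_tokens(10, 400, (250, 150)))   # 1100
```

The point of the model is that Tier 1 scales with the number of installed skills, while Tiers 2 and 3 scale only with what the current task touches.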

Key Insight: References are NOT Loaded on Activation

A common misconception is that all references/ files load when a skill activates. This is incorrect.

  • References load only when SKILL.md explicitly links to them or the agent needs them during task execution
  • The agent must encounter a link like [guide](references/guide.md) to load that file
  • Unreferenced files remain unloaded regardless of folder structure
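
To see which files a SKILL.md makes reachable, one option is to extract its relative Markdown links. This is a rough sketch of that check, assuming inline `[text](target)` links only; `linked_references` is a hypothetical helper, not part of any agent or the specification.

```python
import re

# Matches inline Markdown links of the form [text](target).
LINK_RE = re.compile(r"\[[^\]]*\]\(([^)\s]+)\)")


def linked_references(skill_md: str) -> list[str]:
    """Return relative link targets in order of first appearance."""
    seen: list[str] = []
    for target in LINK_RE.findall(skill_md):
        if "://" in target or target.startswith("#"):
            continue  # skip external URLs and in-page anchors
        if target not in seen:
            seen.append(target)
    return seen


text = (
    "See [guide](references/guide.md) and [site](https://example.com).\n"
    "Error documentation is in the references folder.\n"
)
print(linked_references(text))  # ['references/guide.md']
```

The second sentence of the sample mentions the references folder without linking to anything, so it contributes no loadable file.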

Reference File Loading Behavior

Just-In-Time (JIT) Loading

From the specification:

"Files (e.g. those in scripts/, references/, or assets/) are loaded only when required"

No Caching Across Requests

Per Issue #97:

"Reference files should be fully loaded each time they are referenced, regardless of whether they were previously read."

This means:

  • Each reference to a file triggers a fresh load
  • Different tasks may need different sections of the same file
  • Write references as self-contained units - don't assume prior context

Whole File Loading

When a reference file is loaded, the entire file is loaded - not just a section. This has implications:

  • Split large topics into separate files
  • Avoid monolithic reference documents
  • Keep individual files < 1,000 tokens (spec recommendation)

Folder Structure Patterns

Standard Structure

skill-name/
├── SKILL.md              # Required: main instructions (< 5,000 tokens)
├── references/           # Optional: detailed documentation
│   ├── topic-a.md        # Loaded when referenced
│   └── topic-b.md        # Loaded when referenced
├── scripts/              # Optional: executable code
│   └── helper.sh
└── assets/               # Optional: templates, data files
    └── template.json

Recipes Pattern (for deployment/tooling skills)

Used by skills that support multiple implementation approaches:

skill-name/
├── SKILL.md
└── references/
    ├── recipes/              # Implementation patterns by tool
    │   ├── azd/              # Azure Developer CLI
    │   │   ├── README.md
    │   │   ├── errors.md
    │   │   └── verify.md
    │   ├── bicep/            # Infrastructure as Code
    │   │   ├── README.md
    │   │   ├── patterns.md
    │   │   └── errors.md
    │   ├── terraform/
    │   └── azcli/
    └── other-references.md

When to use recipes:

  • Skill supports multiple deployment tools
  • Each tool has distinct commands, patterns, and error handling
  • User chooses tool based on existing infrastructure or preference

Token efficiency:

  • Only the selected recipe's files are loaded
  • User choosing "azd" doesn't load bicep/terraform content
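
As an illustration of the selection step, the sketch below filters a skill's file list down to one recipe folder. It assumes the `references/recipes/<tool>/` layout shown above; `recipe_files` is a hypothetical helper, and a real agent reaches these files by following links rather than by listing directories.

```python
from pathlib import PurePosixPath


def recipe_files(all_files: list[str], tool: str) -> list[str]:
    """Return only the files that live under references/recipes/<tool>/."""
    prefix = PurePosixPath("references/recipes") / tool
    # Compare whole path components so that "az" does not match "azd".
    return [f for f in all_files if prefix in PurePosixPath(f).parents]


files = [
    "references/recipes/azd/README.md",
    "references/recipes/azd/errors.md",
    "references/recipes/bicep/README.md",
    "references/other-references.md",
]
print(recipe_files(files, "azd"))
# ['references/recipes/azd/README.md', 'references/recipes/azd/errors.md']
```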

Services Pattern (for Azure/cloud skills)

Used for skills that work with multiple cloud services:

skill-name/
├── SKILL.md
└── references/
    ├── services/             # One file per service
    │   ├── container-apps.md
    │   ├── static-web-apps.md
    │   ├── functions.md
    │   ├── cosmos-db.md
    │   └── ...
    └── architecture.md

When to use services:

  • Skill provisions or configures cloud resources
  • Each service has unique configuration, Bicep/Terraform, and gotchas
  • Task determines which service files to load

Token Budget Guidelines

File Size Limits

| File Type | Soft Limit | Hard Limit | Action if Exceeded |
|-----------|------------|------------|--------------------|
| SKILL.md | 500 tokens | 5,000 tokens | Split into references |
| references/*.md | 1,000 tokens | 2,000 tokens | Split further |
| Total skill metadata | ~30 tokens | - | Keep description concise |
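
A small check against these limits might look like the sketch below. The soft and hard values come from the table; everything else is an assumption, in particular the four-characters-per-token estimate, which is a crude heuristic and not a real tokenizer. `estimate_tokens` and `check_budget` are hypothetical names.

```python
# (soft, hard) limits in tokens, taken from the table above.
LIMITS = {"SKILL.md": (500, 5000), "reference": (1000, 2000)}


def estimate_tokens(text: str) -> int:
    """Crude estimate: about 4 characters per token (assumption, rounds up)."""
    return (len(text) + 3) // 4


def check_budget(kind: str, tokens: int) -> str:
    """Classify a token count against the soft and hard limits for its file type."""
    soft, hard = LIMITS[kind]
    if tokens > hard:
        return "over hard limit"
    if tokens > soft:
        return "over soft limit"
    return "ok"


print(check_budget("SKILL.md", 400))     # ok
print(check_budget("SKILL.md", 1200))    # over soft limit
print(check_budget("reference", 2500))   # over hard limit
```

For real measurements, swap `estimate_tokens` for the tokenizer of the model in use; the classification logic stays the same.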

Skill Visibility Limits

From GitHub Copilot CLI Issue #1130:

  • With many skills installed, not all appear in <available_skills>
  • Example: Only 31 of 49 skills visible due to token limits
  • Hidden skills can still be invoked, but their absence from the list creates discovery issues

Implications:

  • Keep skill descriptions concise but keyword-rich
  • Prioritize trigger phrases in description for discoverability
  • Don't rely on users seeing all available skills

Best Practices

DO: Structure for Selective Loading

<!-- In SKILL.md -->
## Deployment

Choose your deployment method:
- [Azure Developer CLI (azd)](references/recipes/azd/README.md) - Recommended for new projects
- [Bicep](references/recipes/bicep/README.md) - IaC-first approach
- [Terraform](references/recipes/terraform/README.md) - Multi-cloud requirements

The agent loads ONLY the recipe the user selects.

DO: Keep Reference Files Self-Contained

Each reference file should be usable without requiring other files:

<!-- references/recipes/azd/errors.md -->
# AZD Deployment Errors

## Quick Reference
| Error | Cause | Fix |
|-------|-------|-----|
| ... | ... | ... |

## Detailed Troubleshooting
...

DO: Use Explicit Links

The agent only loads files it encounters links to:

<!-- Good: Explicit link triggers load -->
See [error handling](references/errors.md) for troubleshooting.

<!-- Bad: Agent won't find this file -->
Error documentation is in the references folder.

DON'T: Create Deep Reference Chains

<!-- Avoid: Multiple hops to reach content -->
references/
├── guide.md          → links to details/setup.md
│   └── details/
│       └── setup.md  → links to advanced/config.md
│           └── advanced/
│               └── config.md

Spec recommends: One level deep from SKILL.md
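
One way to catch deep chains is to measure how many link hops separate each file from SKILL.md. The sketch below does this with a breadth-first search over a hand-written link map; `link_depths` and `too_deep` are hypothetical helpers, and the map mirrors the chain in the diagram above.

```python
from collections import deque


def link_depths(links: dict[str, list[str]], root: str = "SKILL.md") -> dict[str, int]:
    """Breadth-first search: number of link hops from root to each reachable file."""
    depths = {root: 0}
    queue = deque([root])
    while queue:
        current = queue.popleft()
        for target in links.get(current, []):
            if target not in depths:  # first visit gives the shortest hop count
                depths[target] = depths[current] + 1
                queue.append(target)
    return depths


def too_deep(links: dict[str, list[str]], root: str = "SKILL.md", max_depth: int = 1) -> list[str]:
    """Files that sit more than max_depth hops away from root."""
    return sorted(f for f, d in link_depths(links, root).items() if d > max_depth)


chain = {
    "SKILL.md": ["references/guide.md"],
    "references/guide.md": ["references/details/setup.md"],
    "references/details/setup.md": ["references/details/advanced/config.md"],
}
print(too_deep(chain))
# ['references/details/advanced/config.md', 'references/details/setup.md']
```

Linking `setup.md` and `config.md` directly from SKILL.md would bring every file to depth 1 and empty the list.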

DON'T: Assume Caching

<!-- Wrong assumption -->
# Task 2 Guide
Refer to the authentication setup from Task 1.

<!-- Correct: Self-contained -->
# Task 2 Guide
## Authentication
[Configure auth](references/auth.md) before proceeding.

Example: Well-Structured Skill

azure-deploy/
├── SKILL.md (400 tokens)
│   - Overview and workflow
│   - Links to recipes by deployment method
│   - Links to troubleshooting
│
└── references/
    ├── TROUBLESHOOTING.md (300 tokens)
    │   - Common errors across all methods
    │
    └── recipes/
        ├── azd/
        │   ├── README.md (250 tokens) - Commands and workflow
        │   ├── errors.md (150 tokens) - AZD-specific errors
        │   └── verify.md (100 tokens) - Verification steps
        ├── bicep/
        │   └── ... (similar structure)
        └── terraform/
            └── ... (similar structure)

Token analysis:

  • Skill activation: ~400 tokens (SKILL.md only)
  • AZD deployment task: ~400 + 250 + 150 + 100 = ~900 tokens
  • Bicep deployment task: ~400 + (bicep files) = ~800 tokens

Without recipes pattern (monolithic):

  • Every deployment: ~2,500 tokens (all content loaded)
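
The comparison can be worked through with the approximate figures from the example above. This is only arithmetic on those estimates; the variable names are illustrative.

```python
# Approximate token counts taken from the example skill above.
SKILL_MD = 400
AZD_RECIPE = {"README.md": 250, "errors.md": 150, "verify.md": 100}
MONOLITHIC = 2500  # everything loaded on every deployment

azd_task = SKILL_MD + sum(AZD_RECIPE.values())
saved = MONOLITHIC - azd_task

print(azd_task)                      # 900
print(saved)                         # 1600
print(f"{saved / MONOLITHIC:.0%}")   # 64%
```

Under these estimates, an AZD deployment with the recipes pattern uses roughly a third of the tokens of the monolithic layout.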

Changelog

  • 2026-02-01: Initial documentation based on spec research and implementation analysis