Claude Skills: How SKILL.md Works

Research findings on Anthropic's Skills system for Claude Code.

Overview

Skills are a filesystem-based mechanism for extending Claude's capabilities with domain-specific expertise. Each skill is a directory containing a SKILL.md file with optional supporting resources.

Directory Structure

Skills can be stored in three locations:

Location	Scope	Use Case
`~/.claude/skills/`	Personal (all projects)	Individual workflows
`.claude/skills/`	Project (version controlled)	Team-shared expertise
Plugin-bundled	Plugin scope	Distributed with plugins

.claude/skills/
└── my-skill/
    ├── SKILL.md          # Required: main instructions
    ├── REFERENCE.md      # Optional: detailed docs
    ├── scripts/          # Optional: executable code
    └── templates/        # Optional: output templates

SKILL.md Format

Two parts: YAML frontmatter + Markdown content.

---
name: my-skill-name
description: What it does AND when to use it. Include trigger contexts.
version: 1.0.0
allowed-tools: "Read, Write, Bash"
---

# My Skill Name

Instructions go here...

Required Fields

name: Lowercase, hyphens only, max 64 chars
description: Max 1024 chars - THE most critical field

Optional Fields

version: Semantic versioning (e.g., "1.0.0")
license: SPDX identifier (e.g., "MIT")
allowed-tools: Comma-separated tool restrictions
model: Override default model for execution

Progressive Disclosure Architecture

Skills use a three-level loading hierarchy to minimize context window usage:

Level	What Loads	When	Token Cost
1	Metadata (name + description)	Session startup	~30-50 tokens/skill
2	Full SKILL.md body	When Claude invokes skill	1,000-5,000 tokens
3	Supporting files	On-demand via filesystem	Variable

This means 100 skills installed = only ~3,000-5,000 tokens at startup, not the full content of all skills.

How Skill Invocation Works

Phase 1: Discovery (Startup)

Claude Code scans skill directories
Parses YAML frontmatter from each SKILL.md
Embeds skill names + descriptions into the Skill tool's description
Full SKILL.md content is NOT loaded yet

Phase 2: Selection (During Conversation)

User sends a message
Claude sees available skills in the Skill tool description
Claude uses pure LLM reasoning (not embeddings/classifiers) to match user intent
If relevant, Claude calls: Skill({ "command": "skill-name" })

Phase 3: Injection (On Invocation)

Two user messages are injected into conversation:

Visible message: Status like "The 'pdf' skill is running..."
Hidden message (isMeta: true): Full SKILL.md body (minus frontmatter)

Key: Skills inject as user messages, NOT system prompt modifications. This keeps the effect scoped to the current task rather than persisting globally.

Skills and Tool Calling

Important distinction: Skills are NOT a new type of tool. Skills provide instructions that tell Claude how to use existing tools.

How Bundled Scripts Work

A skill can bundle executable scripts in a scripts/ directory:

my-skill/
├── SKILL.md
└── scripts/
    └── validate.py

The SKILL.md contains instructions referencing the script:

## Validation

To validate input files, run:

`python {baseDir}/scripts/validate.py input.txt`

Execution Flow

When Claude follows these instructions, it uses standard tool calling:

1. Skill loads → SKILL.md content injected into conversation
2. Claude reads instruction: "run python {baseDir}/scripts/validate.py"
3. Claude calls Bash tool: Bash({ command: "python /resolved/path/scripts/validate.py input.txt" })
4. Script executes in sandbox
5. Output returns to Claude
6. Claude uses output to continue the task

What This Means

Skills = Instructions + Resources (text, scripts, templates)
Tool Calling = How Claude acts on those instructions
The allowed-tools frontmatter field restricts which tools the skill can use
Scripts run in Claude Code's sandbox with normal security restrictions

Example: PDF Skill with Script

---
name: pdf-processor
description: Extract and analyze PDF content. Use when user mentions PDFs or document extraction.
allowed-tools: "Bash, Read, Write"
---

# PDF Processor

## Text Extraction

To extract text from a PDF:

`python {baseDir}/scripts/extract_text.py input.pdf output.txt`

## Table Extraction

For tables, use:

`python {baseDir}/scripts/extract_tables.py input.pdf --format csv`

When invoked, Claude reads these instructions and executes the appropriate script via the Bash tool based on what the user needs.

Example: Production-Grade Skill (Hugging Face Model Trainer)

For complex domains, skills can be much more comprehensive. The HF Model Trainer skill demonstrates advanced patterns:

Rich description with multiple triggers:

description: This skill should be used when users want to train or fine-tune
language models using TRL on Hugging Face Jobs infrastructure. Covers SFT, DPO,
GRPO and reward modeling... Should be invoked for tasks involving cloud GPU
training, GGUF conversion, or when users mention training on Hugging Face Jobs.

Structured sections for complex workflows:

Prerequisites - Required accounts, tokens, configurations
Key Directives - Always-do rules (e.g., "Always use hf_jobs() MCP tool")
Multiple Quick Start Approaches - Different methods for different needs
Hardware Selection Guide - Match GPU to model size
Timeout Management - Domain-specific gotchas
Common Failure Modes - Proactive troubleshooting with solutions

Integration with external tools:

References MCP tools: "Always use hf_jobs() MCP tool (not bash commands)"
Bundled scripts: scripts/estimate_cost.py for cost estimation
External monitoring: Trackio integration for job tracking

Key takeaway: Production skills anticipate failure modes, provide multiple approaches, and include domain-specific operational guidance beyond basic instructions.

The Description Field is Critical

From Anthropic's official documentation:

"The description is critical for skill selection: Claude uses it to choose the right Skill from potentially 100+ available Skills."

"The description determines when your skill activates, making it the most critical component."

What to Include

Capabilities: What the skill does
Triggers: When Claude should use it
Context: Relevant scenarios
Boundaries: What it doesn't do

Good vs Bad Descriptions

Good:

Extract text and tables from PDF files, fill forms, merge documents.
Use when working with PDF files or when the user mentions PDFs, forms,
or document extraction.

Bad:

Helps with documents.

Writing Style

Write in third person (not "you can use this to...")
Include explicit trigger words users might say
Be specific about file types, actions, contexts

Best Practices

Keep SKILL.md Concise

Target under 500 lines
Put detailed reference material in separate files
Use progressive disclosure: "For advanced usage, see REFERENCE.md"

Use {baseDir} for Portability

Run `python {baseDir}/scripts/validate.py input.txt`

The {baseDir} variable resolves to the skill's installation directory at runtime.

Assume Claude's Knowledge

Don't explain what PDFs are. Focus on domain-specific procedures:

Good: "Use pdfplumber to extract text from page boundaries"
Bad: "A PDF is a Portable Document Format file..."

Iterate Based on Real Usage

Work with Claude on representative tasks
Observe where it succeeds/struggles
Ask Claude to self-reflect on what went wrong
Update SKILL.md based on what Claude actually needs

Security Considerations

Only install skills from trusted sources
Audit all bundled scripts before enabling
Review allowed-tools permissions
Check for external network calls in instructions

Cross-Platform Support

Skills work consistently across:

Claude.ai (web interface)
Claude Code (CLI)
Claude Agent SDK
Claude API (with code execution)

Official Documentation

Research compiled: December 2024

raoulbia-ai/claude-skills-how-skill-md-works.md

Select an option

No results found