Skills: the agent’s training manual
Reusable expert knowledge, loaded into context.
Skills are what the agent KNOWS
Tools operate outside the model. Skills operate inside the model's reasoning loop. A skill is a packaged piece of expert knowledge — instructions, and optionally scripts and references — that the agent loads on demand when a task calls for it.
- A tool lets an agent send an email.
- A skill tells the agent how your team writes a customer-facing email: brand voice, length constraints, the sign-off, when to include the account manager.
If tools are hands, skills are training manuals. Anthropic's own framing is an "onboarding guide you'd write for a new team member" — small, specific, trigger-aware.
Four types of skill
| Type | What it encodes | Example |
|---|---|---|
| Domain expertise | Rules, heuristics, exceptions specific to your field. | Legal contract review clauses; medical triage protocols. |
| Decision frameworks | Rubrics, scoring systems, approval thresholds. | Risk scoring for credit decisions; investment criteria. |
| Persona & tone | Brand voice guidelines, audience-specific communication rules. | How to write for a retail customer vs an institutional one. |
| Process knowledge | Step-by-step workflows, escalation procedures, handoff criteria. | The exact sequence of checks before marking a ticket "resolved". |
The SKILL.md standard
Anthropic's Agent Skills
convention is now concrete. A skill lives in its own folder with a SKILL.md at the root, plus
optional bundled files. The markdown starts with YAML frontmatter:
---
name: compliance-source-citing
description: Cross-checks external claims against two independent sources before they appear in
a compliance report. Use when drafting, editing, or reviewing anything that cites regulators,
statutes, or third-party research.
---
# Compliance source-citing
For every external claim in a report:
1. Find two independent sources. Record both URLs and the date retrieved.
2. If only one source exists, keep the claim but flag it `unverified` in the report body.
3. Never paraphrase regulatory text — quote verbatim and include a pinpoint citation
(section, paragraph, or page).
4. When sources disagree, surface the disagreement rather than picking one.
Two fields are required: name (lowercase, hyphenated, ≤64 chars) and description (≤1024
chars). Everything else — scripts, reference docs, templates — sits alongside the file.
Progressive disclosure: why skills scale
An agent might have a hundred skills available. It can't load them all every turn. Anthropic's design uses three levels of disclosure:
- Metadata only — name + description (~100 tokens per skill) is injected into the system prompt at startup. The agent reads this to decide which skill is relevant.
- Skill body — the markdown below the frontmatter is read only when the agent decides the skill applies. Aim for under 500 lines / 5k tokens.
- Bundled resources —
scripts/,references/,assets/are read or executed only when needed. A script can do real work without its source code ever entering the context window.
Writing skills that actually trigger
A few conventions from Anthropic's skill-creator guide that aren't obvious from the spec alone:
- Name in the gerund form.
processing-invoices,reviewing-contracts,writing-release-notes— verbs-as-nouns read as capabilities. Avoidhelper,utils, or anything generic. - Imperative mood in the body. "Quote verbatim. Record the date retrieved." reads better than stacks of ALL-CAPS rules.
- Match the degrees of freedom to the task. Fragile, high-stakes steps should become exact scripts the skill runs. Creative or judgement-heavy steps stay as prose the model interprets.
- Keep it short. If a skill grows past ~500 lines, split by domain into
references/and link from the main file — nested references stay out of context until they're needed. - Test across models. A description that triggers reliably on Opus may be ignored by Haiku. Iterate with real prompts and measure trigger accuracy with and without the skill loaded.
When to write a skill vs fine-tune a model
Rule of thumb: if the knowledge is company-specific and changes more than once a quarter, it belongs in a skill. Fine-tuning bakes knowledge into weights; skills keep it editable and versionable alongside your code. Fine-tune only when skills have plateaued and the latency of re-reading them matters.
Further reading
- Agent Skills overview — the canonical spec: SKILL.md, frontmatter fields, progressive disclosure, surfaces (API, Claude Code, Claude.ai).
- Agent Skills best practices — description writing, naming, structure, how to think about degrees of freedom.
- skill-creator SKILL.md — a working skill whose job is writing other skills, with an evaluation loop for tuning trigger accuracy.