🔨 Build

Skills: the agent’s training manual

Reusable expert knowledge, loaded into context.

Skills are what the agent KNOWS

Tools operate outside the model. Skills operate inside the model's reasoning loop. A skill is a packaged piece of expert knowledge — instructions, and optionally scripts and references — that the agent loads on demand when a task calls for it.

A tool lets an agent send an email.
A skill tells the agent how your team writes a customer-facing email: brand voice, length constraints, the sign-off, when to include the account manager.

If tools are hands, skills are training manuals. Anthropic's own framing is an "onboarding guide you'd write for a new team member" — small, specific, trigger-aware.

Four types of skill

Type	What it encodes	Example
Domain expertise	Rules, heuristics, exceptions specific to your field.	Legal contract review clauses; medical triage protocols.
Decision frameworks	Rubrics, scoring systems, approval thresholds.	Risk scoring for credit decisions; investment criteria.
Persona & tone	Brand voice guidelines, audience-specific communication rules.	How to write for a retail customer vs an institutional one.
Process knowledge	Step-by-step workflows, escalation procedures, handoff criteria.	The exact sequence of checks before marking a ticket "resolved".

The SKILL.md standard

Anthropic's Agent Skills convention is now concrete. A skill lives in its own folder with a SKILL.md at the root, plus optional bundled files. The markdown starts with YAML frontmatter:

---
name: compliance-source-citing
description: Cross-checks external claims against two independent sources before they appear in
  a compliance report. Use when drafting, editing, or reviewing anything that cites regulators,
  statutes, or third-party research.
---

# Compliance source-citing

For every external claim in a report:

1. Find two independent sources. Record both URLs and the date retrieved.
2. If only one source exists, keep the claim but flag it `unverified` in the report body.
3. Never paraphrase regulatory text — quote verbatim and include a pinpoint citation
   (section, paragraph, or page).
4. When sources disagree, surface the disagreement rather than picking one.

Two fields are required: name (lowercase, hyphenated, ≤64 chars) and description (≤1024 chars). Everything else — scripts, reference docs, templates — sits alongside the file.

Progressive disclosure: why skills scale

An agent might have a hundred skills available. It can't load them all every turn. Anthropic's design uses three levels of disclosure:

Metadata only — name + description (~100 tokens per skill) is injected into the system prompt at startup. The agent reads this to decide which skill is relevant.
Skill body — the markdown below the frontmatter is read only when the agent decides the skill applies. Aim for under 500 lines / 5k tokens.
Bundled resources — scripts/, references/, assets/ are read or executed only when needed. A script can do real work without its source code ever entering the context window.

Writing skills that actually trigger

A few conventions from Anthropic's skill-creator guide that aren't obvious from the spec alone:

Name in the gerund form. processing-invoices, reviewing-contracts, writing-release-notes — verbs-as-nouns read as capabilities. Avoid helper, utils, or anything generic.
Imperative mood in the body. "Quote verbatim. Record the date retrieved." reads better than stacks of ALL-CAPS rules.
Match the degrees of freedom to the task. Fragile, high-stakes steps should become exact scripts the skill runs. Creative or judgement-heavy steps stay as prose the model interprets.
Keep it short. If a skill grows past ~500 lines, split by domain into references/ and link from the main file — nested references stay out of context until they're needed.
Test across models. A description that triggers reliably on Opus may be ignored by Haiku. Iterate with real prompts and measure trigger accuracy with and without the skill loaded.

When to write a skill vs fine-tune a model

Rule of thumb: if the knowledge is company-specific and changes more than once a quarter, it belongs in a skill. Fine-tuning bakes knowledge into weights; skills keep it editable and versionable alongside your code. Fine-tune only when skills have plateaued and the latency of re-reading them matters.