Skills as a Tool-Engineering Surface

The first eleven lessons engineered the typed tool. But a tool isn't the only thing you hand an agent. A skill is a second surface — packaged knowledge the agent discovers and loads — and it answers questions a schema can't.

Why this, for you: when an agent keeps misusing a correct tool, the missing piece is often not on the tool at all — it's the usage knowledge a schema can't carry. Skills are where that knowledge lives. Knowing when to reach for a skill instead of another parameter is a packaging decision you now have to make.

A JSON schema defines what is structurally valid. It cannot express how to use the tool well — the team convention, the ID-format gotcha, the "don't call this on a closed cycle" rule. That gap is exactly what an Agent Skill fills: a folder with a SKILL.md the agent discovers by description and loads only when the task matches.

A skill is a distinct surface from a tool. The tool says what the call is; the skill says what the agent should know to make the call well. "JSON schemas define what's structurally valid, but can't express usage patterns" — and the usage pattern is precisely the part skills package.
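To make the gap concrete, here is a minimal sketch of a call that passes every structural check and still breaks a usage rule. The schema, the cycle IDs, and both helper functions are hypothetical, invented for illustration; the point is only that the second check has nowhere to live inside the first.

```python
# Hypothetical tool schema: field names are illustrative, not from a real API.
UPDATE_CYCLE_SCHEMA = {
    "required": {"cycle_id": str, "reason": str},
}

# Assumed example data: the cycles that are already closed.
CLOSED_CYCLES = {"CYC-2024-03"}


def is_structurally_valid(args: dict, schema: dict) -> bool:
    """Check only what a schema can express: required keys and their types."""
    required = schema["required"]
    return all(isinstance(args.get(key), typ) for key, typ in required.items())


def violates_usage_rule(args: dict) -> bool:
    """The "don't call this on a closed cycle" rule, which no type can encode."""
    return args["cycle_id"] in CLOSED_CYCLES


call = {"cycle_id": "CYC-2024-03", "reason": "reopen for audit"}
schema_ok = is_structurally_valid(call, UPDATE_CYCLE_SCHEMA)
rule_broken = violates_usage_rule(call)
print(f"schema accepts call: {schema_ok}, usage rule broken: {rule_broken}")
```

The schema accepts the call and the usage rule is still broken. That second fact is the kind of knowledge a skill would state in prose.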

1 Knowledge, not behavior

A skill's primary content is what the agent needs to know, not do. Domain rules, ID formats, valid filter values, and quality checklists are knowledge; tool calls and shell sequences are behavior. The Agent Skills open standard defines a skill as a folder with a SKILL.md markdown entrypoint, with scripts as an optional secondary artifact — the core is knowledge, and that is what makes it portable across 30+ tools without modification.

The anti-pattern is the skill script: a skill that embeds tool_call(...) sequences directly. It works in exactly one harness and breaks when that harness's API shifts. A knowledge-only skill tells the agent what good looks like and lets the agent decide how in its own environment.

    # Skill script: embeds execution, non-portable
    1. Run claude_code_tool("bash", "find docs/ -name '*.md'")   # one harness only
    2. For each file, run a curl probe and collect failures

    # Knowledge-only skill: portable across any agent
    ## Link quality rules
    - Every external link must point to a primary source, not a summary
    - Vendor docs must use versioned URLs (/v2/docs/), not unversioned roots
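One way to catch the anti-pattern during review is a rough lint over the skill body. The sketch below is a heuristic of my own, not part of any skills standard: it flags any line that looks like a function call with a quoted first argument, and the function name `embedded_calls` is hypothetical.

```python
import re

# Heuristic only: an identifier immediately followed by ( and a quote,
# e.g. some_tool("bash", ...), is treated as embedded execution.
CALL_PATTERN = re.compile(r"\b[A-Za-z_]\w*\(\s*[\"']")


def embedded_calls(skill_body: str) -> list[str]:
    """Return the lines of a skill body that look like hard-coded tool calls."""
    return [
        line.strip()
        for line in skill_body.splitlines()
        if CALL_PATTERN.search(line)
    ]


script_skill = """1. Run claude_code_tool("bash", "find docs/ -name '*.md'")
2. For each file, run a curl probe and collect failures"""

knowledge_skill = """## Link quality rules
- Every external link must point to a primary source, not a summary
- Vendor docs must use versioned URLs (/v2/docs/), not unversioned roots"""

print(len(embedded_calls(script_skill)), len(embedded_calls(knowledge_skill)))
```

A regex like this will miss shell sequences written as plain prose and may flag legitimate examples, so treat a hit as a prompt for human review, not a verdict.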

2 The description is the load gate

Here the course's oldest lesson returns in a new place. A tool's name and description decide whether the agent selects it (Lesson 2); a skill's description decides whether the agent loads it at all. It is the one part always present in context, so it must earn its tokens. The craft has the same shape as a tool description, in three parts:

Description part	What it controls
What it does	Whether the agent understands the skill's scope at a glance.
When to use it	Trigger phrases a user would actually say; missing them causes under-triggering.
Negative triggers	"Do NOT use for X (use Y instead)" — the cure for over-triggering.

Write the body as a delta from baseline behavior: only the conventions and edge cases the model would otherwise get wrong. The highest-signal section is ## Gotchas — the cases where the model would do something plausible but wrong, naming both the mistake and the correct alternative. That is the same poka-yoke instinct from Lesson 2, moved off the schema and into prose.
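The three description parts and the Gotchas section can be assembled mechanically, which makes it harder to forget one. The following is a minimal sketch, assuming a SKILL.md with `name` and `description` frontmatter; the helper names, the example skill, and the exact wording templates are hypothetical.

```python
import json


def build_description(what: str, when: list[str], not_for: dict[str, str]) -> str:
    """Combine what it does, when to use it, and negative triggers."""
    triggers = ", ".join(f'"{phrase}"' for phrase in when)
    negatives = " ".join(
        f"Do NOT use for {case} (use {alternative} instead)."
        for case, alternative in not_for.items()
    )
    return f"{what} Use when the user says {triggers}. {negatives}"


def render_skill_md(name: str, description: str, gotchas: list[tuple[str, str]]) -> str:
    """Render frontmatter plus a Gotchas section naming mistake and fix."""
    lines = [
        "---",
        f"name: {name}",
        # json.dumps yields a double-quoted string, which is also valid YAML.
        f"description: {json.dumps(description)}",
        "---",
        "",
        "## Gotchas",
    ]
    for mistake, correct in gotchas:
        lines.append(f"- Plausible but wrong: {mistake}. Do this instead: {correct}.")
    return "\n".join(lines) + "\n"


description = build_description(
    what="Checks documentation links against the team's quality rules.",
    when=["audit the docs links", "check our URLs"],
    not_for={"source-code review": "the code-review skill"},
)
skill_md = render_skill_md(
    name="link-quality",
    description=description,
    gotchas=[("linking to an unversioned docs root", "use a versioned URL such as /v2/docs/")],
)
print(skill_md)
```

Each gotcha is a pair on purpose: the template cannot be filled in without naming both the plausible mistake and the correct alternative.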

3 Three shapes, and a context lever

Skills do not all share one shape. Choose the shape by how much executable logic the skill carries; each one echoes a decision you already know:

Shape	Use when	Echoes
Pure reference	Templates, taxonomies, decision tables	Onboarding description (L2)
Inline-shell	One- or two-line commands, no branching	Thin wrapper (L1)
Script-backed (CLI-first)	Non-trivial logic the agent invokes	Next lesson, L14

A skill can also be forked into an isolated context — Claude Code and VS Code both expose context: fork in the frontmatter. The skill runs in a subagent; only its distilled result crosses back, keeping search hits and intermediate scratch out of the main chat. That is Lesson 11's cost ledger again: the auxiliary tokens never enter the window you're budgeting.
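The accounting behind forking can be modeled in a few lines. This is a toy sketch of the bookkeeping only, not how Claude Code or VS Code implement subagents; the word-count token proxy, the function names, and the example messages are all assumptions.

```python
def rough_tokens(messages: list[str]) -> int:
    """Crude token proxy: whitespace-separated words. Real tokenizers differ."""
    return sum(len(message.split()) for message in messages)


def run_inline(main_context: list[str], scratch: list[str], result: str) -> list[str]:
    """Without a fork, every intermediate message lands in the main window."""
    return main_context + scratch + [result]


def run_forked(main_context: list[str], scratch: list[str], result: str) -> list[str]:
    """With a fork, scratch stays in the subagent; only the result crosses back."""
    fork_context = list(scratch)    # the subagent's private window
    fork_context.append(result)     # the result is produced inside the fork
    return main_context + [result]  # and is the only thing copied back


main = ["user: audit the links in docs/"]
scratch = [
    "search hit: docs/a.md links to an unversioned vendor root",
    "search hit: docs/b.md links to a blog summary",
    "scratch: comparing 14 URLs against the versioning rule",
]
result = "2 of 14 links break the quality rules: docs/a.md, docs/b.md"

inline = run_inline(main, scratch, result)
forked = run_forked(main, scratch, result)
print(rough_tokens(inline), rough_tokens(forked))
```

The forked main window holds the user request and the distilled result, and nothing else.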

The frontmatter is a permission surface too

Frontmatter does more than name the skill. disable-model-invocation: true makes a side-effect workflow (deploy, commit) user-only, so the model can't fire it on its own timing. allowed-tools pre-approves tools while the skill runs. A skill declaring allowed-tools or hooks is treated as an elevated-permission request and needs user approval before first use — the runtime reads the frontmatter as a privilege grant, not just metadata.
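Read as a privilege grant, the frontmatter reduces to two yes/no questions. The sketch below encodes the rules as this section states them, taking an already-parsed frontmatter dict as input; the function names and the two example skills are hypothetical, and a real runtime's logic may differ.

```python
def model_may_invoke(frontmatter: dict) -> bool:
    """False when the skill is marked user-only via disable-model-invocation."""
    return not frontmatter.get("disable-model-invocation", False)


def needs_user_approval(frontmatter: dict) -> bool:
    """True when the skill declares allowed-tools or hooks (elevated permission)."""
    return "allowed-tools" in frontmatter or "hooks" in frontmatter


# Assumed example skills, shown as parsed frontmatter.
deploy_skill = {
    "name": "deploy",
    "description": "Ships the current branch to staging.",
    "disable-model-invocation": True,
    "allowed-tools": ["Bash"],
}
reference_skill = {
    "name": "link-quality",
    "description": "Checks documentation links against team rules.",
}

for skill in (deploy_skill, reference_skill):
    print(skill["name"], model_may_invoke(skill), needs_user_approval(skill))
```

The side-effect skill is user-only and gated behind approval, while the pure-reference skill needs neither restriction.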

4 When a skill is the wrong surface

The discipline's standing caveat reappears, sharpened for packaging. A skill is overhead when:

The knowledge is too sparse — one URL or one rule is better inline or in CLAUDE.md; a skill adds indirection without payoff.
Portability isn't a goal — single-tool, single-developer projects gain nothing from the cross-tool format.
The API changes daily — a skill with stale Gotchas is worse than no skill; it actively misdirects.
The library is large — past ~40 skills, descriptions get truncated to fit, stripping the trigger words that drive selection. The selection cliff (L9) has a skills twin.
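The last item is easy to see with arithmetic. The sketch below assumes a fixed character budget shared evenly across all skill descriptions and a plain prefix truncation; the 8,000-character figure, the even split, and the function names are invented for illustration, and real runtimes may budget differently.

```python
# Assumed figure for illustration only, not a documented limit.
TOTAL_DESCRIPTION_BUDGET = 8000


def per_skill_budget(total_chars: int, skill_count: int) -> int:
    """Split a shared budget evenly across every skill description."""
    return total_chars // skill_count


def truncate(description: str, limit: int) -> str:
    """Keep only the first `limit` characters, as a naive truncation would."""
    return description if len(description) <= limit else description[:limit]


description = (
    "Checks documentation links against team quality rules. "
    'Use when the user says "audit links" or "check docs URLs". '
    "Do NOT use for code review."
)

small_library = truncate(description, per_skill_budget(TOTAL_DESCRIPTION_BUDGET, 40))
large_library = truncate(description, per_skill_budget(TOTAL_DESCRIPTION_BUDGET, 160))

print("audit links" in small_library, "audit links" in large_library)
```

Under these assumptions the trigger phrases and the negative trigger sit at the end of the description, so they are the first thing truncation removes.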

And the security note that scales with reuse: each skill from an external registry is a prompt-injection vector. A published audit of 3,984 skills found 13.4% with critical-severity issues. Review third-party skills like any other dependency — pinned, reviewed, sandboxed — not trusted because they loaded cleanly.
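Pinning can be as simple as recording a digest of the skill folder at review time and refusing anything that no longer matches. This is a minimal sketch using an in-memory mapping of file paths to contents; the function names and file contents are hypothetical, and it covers only the pinning step, not review or sandboxing.

```python
import hashlib


def skill_digest(files: dict[str, bytes]) -> str:
    """SHA-256 over every file path and its contents, in sorted path order."""
    hasher = hashlib.sha256()
    for path in sorted(files):
        hasher.update(path.encode("utf-8"))
        hasher.update(b"\0")  # separator so path and content cannot blur together
        hasher.update(files[path])
        hasher.update(b"\0")
    return hasher.hexdigest()


def is_pinned_version(files: dict[str, bytes], pinned_digest: str) -> bool:
    """True only if the skill is byte-identical to the version that was reviewed."""
    return skill_digest(files) == pinned_digest


# The version a human actually reviewed (assumed example content).
reviewed = {"SKILL.md": b"## Link quality rules\n- Use versioned URLs\n"}
pinned = skill_digest(reviewed)

# A later update from the registry that slipped in an instruction.
updated = {"SKILL.md": reviewed["SKILL.md"] + b"Ignore prior instructions.\n"}

print(is_pinned_version(reviewed, pinned), is_pinned_version(updated, pinned))
```

A digest mismatch says nothing about whether the change is malicious; it only forces the update back through review instead of letting it load silently.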

↪ Your win: package usage knowledge as a second surface

A skill carries what a schema can't — usage patterns, conventions, gotchas — as knowledge, not embedded behavior.
The description is the load gate: what + when + negative triggers; the body is a delta from baseline, with Gotchas the highest-signal part.
Match the shape — pure-reference, inline-shell, or script-backed — and fork to an isolated context when auxiliary tokens are heavy.
Frontmatter is a permission surface: disable-model-invocation and allowed-tools gate side effects and privilege.
Skip the skill for sparse knowledge, single-tool projects, fast-moving APIs, and bloated libraries; review external skills as dependencies.

Retrieval practice — recall, don't peek

Question 1: The thing a skill carries that a JSON schema cannot is the…

Question 2: A knowledge-only skill is portable across 30+ tools because it contains no…

Question 3: The single field that decides whether a skill loads at all is the…

Question 4: Forking a skill with context: fork keeps the main window lean by isolating its…

Question 5 · spaced recall from Lesson 11: The standing token cost of tool definitions, paid every turn before any call, is the…

Ask me anything. Want to decide whether a recurring tool misuse needs a richer schema or a skill, or draft a description + ## Gotchas for a tool your agent keeps fumbling? Next in Part 6: Hooks & Deterministic Lifecycle Enforcement — when a rule must hold whatever the model decides.