The first eleven lessons engineered the typed tool. But a tool isn't the only thing you hand an agent. A skill is a second surface — packaged knowledge the agent discovers and loads — and it answers questions a schema can't.
A JSON schema defines what is structurally valid. It cannot express how to use the tool well — the
team convention, the ID-format gotcha, the "don't call this on a closed cycle" rule. That gap is exactly what an Agent
Skill fills: a folder with a SKILL.md the agent discovers by description and loads only when the task
matches.
A skill's primary content is what the agent needs to know, not do. Domain rules,
ID formats, valid filter values, and quality checklists are knowledge; tool calls and shell sequences are behavior.
The Agent Skills open standard defines a skill as a folder with a SKILL.md markdown entrypoint, with
scripts as an optional secondary artifact — the core is knowledge, and that is what makes it portable across
30+ tools without modification.
The anti-pattern is the skill script: a skill that embeds tool_call(...) sequences
directly. It works in exactly one harness and breaks when that harness's API shifts. A knowledge-only skill tells the
agent what good looks like and lets the agent decide how in its own environment.
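One way to catch the skill-script anti-pattern before it ships is a small lint pass over the SKILL.md body. The sketch below is a minimal illustration, assuming embedded calls appear as the literal tool_call(...) text named above. The function name find_embedded_calls, the sample skill bodies, and the CYC-2024-03 ID format are hypothetical and not part of the Agent Skills standard.

```python
import re

# Assumption: harness-specific calls show up as literal `tool_call(...)` text.
# Real harnesses use other syntaxes; extend the pattern for your own.
CALL_PATTERN = re.compile(r"\btool_call\s*\(")


def find_embedded_calls(skill_body: str) -> list[int]:
    """Return the 1-based numbers of lines that embed a tool_call(...)."""
    return [
        number
        for number, line in enumerate(skill_body.splitlines(), start=1)
        if CALL_PATTERN.search(line)
    ]


# Knowledge: rules and formats the agent applies in its own environment.
knowledge_only = """\
Billing cycle IDs look like CYC-2024-03.
Never modify a closed cycle; open an adjustment instead.
"""

# Behavior: a call sequence tied to one harness's API.
skill_script = """\
Step 1: tool_call("list_cycles", status="open")
Step 2: tool_call("update_cycle", id=cycle_id)
"""

print(find_embedded_calls(knowledge_only))
print(find_embedded_calls(skill_script))
```

A check like this only flags the obvious cases. Whether a flagged line should become prose or move into an optional script remains a judgment call.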
Here the course's oldest lesson returns in a new place. A tool's name and description decide whether the agent
selects it (Lesson 2); a skill's description decides whether the agent loads it at all.
The description is the one part of a skill that is always present in context, so it must earn its tokens. The craft has the same shape:
| Description part | What it controls |
|---|---|
| What it does | Whether the agent understands the skill's scope at a glance. |
| When to use it | Trigger phrases a user would actually say; missing them causes under-triggering. |
| Negative triggers | "Do NOT use for X (use Y instead)" — the cure for over-triggering. |
Write the body as a delta from baseline behavior: only the conventions and edge cases the model
would otherwise get wrong. The highest-signal section is ## Gotchas — the cases where the model would do
something plausible but wrong, naming both the mistake and the correct alternative. That is the same poka-yoke instinct
from Lesson 2, moved off the schema and into prose.
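The three description parts from the table above can be treated as required fields when drafting. The following is a rough sketch rather than a prescribed format: the SkillDescription class, its field names, the rendered sentence layout, and the billing example are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class SkillDescription:
    """The three description parts from the table (names are illustrative)."""

    what: str          # what the skill does
    when: list[str]    # trigger phrases a user would actually say
    not_for: str       # negative trigger, naming the alternative

    def missing_parts(self) -> list[str]:
        """Name the parts left empty; each gap has its own failure mode."""
        gaps = []
        if not self.what.strip():
            gaps.append("what it does")
        if not self.when:
            gaps.append("when to use it")        # risks under-triggering
        if not self.not_for.strip():
            gaps.append("negative triggers")     # risks over-triggering
        return gaps

    def render(self) -> str:
        """Join the parts into one description string (layout is an assumption)."""
        triggers = ", ".join(f'"{phrase}"' for phrase in self.when)
        return (
            f"{self.what} Use when the user says {triggers}. "
            f"Do NOT use for {self.not_for}."
        )


desc = SkillDescription(
    what="Conventions for querying billing cycles.",
    when=["reopen a cycle", "find last month's invoice run"],
    not_for="refunds (use the refunds skill instead)",
)
print(desc.missing_parts())
print(desc.render())
```

The check is structural only. It confirms that each part exists, not that the trigger phrases match what users really say.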
Skills are not all the same shape. Match the shape to whether the skill carries executable logic — and notice the shapes echo decisions you already know:
| Shape | Use when | Echoes |
|---|---|---|
| Pure reference | Templates, taxonomies, decision tables | Onboarding description (L2) |
| Inline-shell | One- or two-line commands, no branching | Thin wrapper (L1) |
| Script-backed (CLI-first) | Non-trivial logic the agent invokes | Next lesson, L14 |
A skill can also be forked into an isolated context — Claude Code and VS Code both expose
context: fork in the frontmatter. The skill runs in a subagent; only its distilled result crosses back,
keeping search hits and intermediate scratch out of the main chat. That is Lesson 11's cost ledger again: the auxiliary
tokens never enter the window you're budgeting.
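The accounting behind a forked skill can be pictured with a toy model. This sketch does not call any real harness: the functions run_inline and run_forked are hypothetical, and the token estimate assumes whitespace-separated words stand in for tokens, which a real tokenizer would not match.

```python
def rough_tokens(text: str) -> int:
    # Assumption: one whitespace-separated word counts as one token.
    return len(text.split())


def window_cost(window: list[str]) -> int:
    """Total rough tokens held in a context window."""
    return sum(rough_tokens(message) for message in window)


def run_inline(main_window: list[str], scratch: list[str], result: str) -> None:
    """Without a fork, every intermediate message lands in the main window."""
    main_window.extend(scratch)
    main_window.append(result)


def run_forked(main_window: list[str], scratch: list[str], result: str) -> None:
    """With a fork, scratch stays in the subagent's own context."""
    subagent_window = list(scratch)   # isolated; discarded after the run
    subagent_window.append(result)
    main_window.append(result)        # only the distilled result crosses back


scratch = [
    "search hit one two three",
    "search hit four five six",
    "draft notes a b c d",
]
result = "final answer here"

inline_window: list[str] = []
forked_window: list[str] = []
run_inline(inline_window, scratch, result)
run_forked(forked_window, scratch, result)

print(window_cost(inline_window), window_cost(forked_window))
```

In this model the forked run still spends the scratch tokens, but in the subagent's window, so the main window holds only the result.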
Frontmatter does more than name the skill. disable-model-invocation: true makes a side-effect
workflow (deploy, commit) user-only, so the model can't fire it on its own timing. allowed-tools pre-approves
tools while the skill runs. A skill declaring allowed-tools or hooks is treated as an
elevated-permission request and needs user approval before first use — the runtime reads
the frontmatter as a privilege grant, not just metadata.
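To make the privilege reading concrete, here is a minimal sketch of how a runtime might inspect frontmatter. It is not any harness's actual implementation: parse_frontmatter is a deliberately simple line-based reader rather than a YAML parser, the helper names are hypothetical, and the sample skill (including the Bash value for allowed-tools) is invented for illustration. Only the field names disable-model-invocation, allowed-tools, and hooks come from the text above.

```python
def parse_frontmatter(skill_md: str) -> dict[str, str]:
    """Read `key: value` pairs between the leading '---' fences.

    Minimal on purpose: ignores nesting, lists, and quoting.
    """
    lines = skill_md.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    fields: dict[str, str] = {}
    for line in lines[1:]:
        if line.strip() == "---":
            break
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def needs_user_approval(fields: dict[str, str]) -> bool:
    """Declaring allowed-tools or hooks counts as a privilege request."""
    return "allowed-tools" in fields or "hooks" in fields


def model_may_invoke(fields: dict[str, str]) -> bool:
    """disable-model-invocation: true makes the skill user-only."""
    return fields.get("disable-model-invocation", "false").lower() != "true"


# Hypothetical side-effect skill used only for this example.
deploy_skill = """\
---
name: deploy-staging
description: Deploys the current branch to staging. Do NOT use for production.
disable-model-invocation: true
allowed-tools: Bash
---
Staging deploys must pass the smoke test before they are announced.
"""

fields = parse_frontmatter(deploy_skill)
print(needs_user_approval(fields), model_may_invoke(fields))
```

Treating the two checks as separate functions mirrors the distinction in the text: one field gates who may start the skill, the others gate what it may do once running.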
The discipline's standing caveat reappears, sharpened for packaging. A skill is overhead when:
CLAUDE.md; a skill adds indirection without payoff.
And the security note that scales with reuse: each skill from an external registry is a prompt-injection vector. A published audit of 3,984 skills found 13.4% with critical-severity issues. Review third-party skills like any other dependency (pinned, reviewed, sandboxed), not trusted because they loaded cleanly.
disable-model-invocation and allowed-tools gate side effects and privilege.
Retrieval practice: recall, don't peek
Question 1: The thing a skill carries that a JSON schema cannot is the…
Question 2: A knowledge-only skill is portable across 30+ tools because it contains no…
Question 3: The single field that decides whether a skill loads at all is the…
Question 4: Forking a skill with context: fork keeps the main window lean by isolating its…
Question 5 · spaced recall from Lesson 11: The standing token cost of tool definitions, paid every turn before any call, is the…
description + ## Gotchas for a tool your agent keeps fumbling?
Next in Part 6: Hooks & Deterministic Lifecycle Enforcement, for when a rule must hold whatever the model decides.