Part 6 · Beyond the Tool Catalog

Tool Engineering · ~7 min

Skills as a Tool-Engineering Surface

The first eleven lessons engineered the typed tool. But a tool isn't the only thing you hand an agent. A skill is a second surface — packaged knowledge the agent discovers and loads — and it answers questions a schema can't.

Why this, for you: when an agent keeps misusing a correct tool, the missing piece is often not on the tool at all — it's the usage knowledge a schema can't carry. Skills are where that knowledge lives. Knowing when to reach for a skill instead of another parameter is a packaging decision you now have to make.

A JSON schema defines what is structurally valid. It cannot express how to use the tool well — the team convention, the ID-format gotcha, the "don't call this on a closed cycle" rule. That gap is exactly what an Agent Skill fills: a folder with a SKILL.md the agent discovers by description and loads only when the task matches.

A skill is a distinct surface from a tool. The tool says what the call is; the skill says what the agent should know to make the call well. "JSON schemas define what's structurally valid, but can't express usage patterns" — and the usage pattern is precisely the part skills package.

1 Knowledge, not behavior

A skill's primary content is what the agent needs to know, not do. Domain rules, ID formats, valid filter values, and quality checklists are knowledge; tool calls and shell sequences are behavior. The Agent Skills open standard defines a skill as a folder with a SKILL.md markdown entrypoint, with scripts as an optional secondary artifact — the core is knowledge, and that is what makes it portable across 30+ tools without modification.

The anti-pattern is the skill script: a skill that embeds tool_call(...) sequences directly. It works in exactly one harness and breaks when that harness's API shifts. A knowledge-only skill tells the agent what good looks like and lets the agent decide how in its own environment.

# Skill script — embeds execution, non-portable 1. Run claude_code_tool("bash", "find docs/ -name '*.md'") # one harness only 2. For each file, run a curl probe and collect failures # Knowledge-only skill — portable across any agent ## Link quality rules - Every external link must point to a primary source, not a summary - Vendor docs must use versioned URLs (/v2/docs/), not unversioned roots

2 The description is the load gate

Here the course's oldest lesson returns in a new place. A tool's name and description decide whether the agent selects it (Lesson 2); a skill's description decides whether the agent loads it at all. It is the one part always present in context, so it must earn its tokens. The craft is the same shape:

Description partWhat it controls
What it doesWhether the agent understands the skill's scope at a glance.
When to use itTrigger phrases a user would actually say; missing them causes under-triggering.
Negative triggers"Do NOT use for X (use Y instead)" — the cure for over-triggering.

Write the body as a delta from baseline behavior: only the conventions and edge cases the model would otherwise get wrong. The highest-signal section is ## Gotchas — the cases where the model would do something plausible but wrong, naming both the mistake and the correct alternative. That is the same poka-yoke instinct from Lesson 2, moved off the schema and into prose.

3 Three shapes, and a context lever

Skills are not all the same shape. Match the shape to whether the skill carries executable logic — and notice the shapes echo decisions you already know:

ShapeUse whenEchoes
Pure referenceTemplates, taxonomies, decision tablesOnboarding description (L2)
Inline-shellOne- or two-line commands, no branchingThin wrapper (L1)
Script-backed (CLI-first)Non-trivial logic the agent invokesNext lesson, L14

A skill can also be forked into an isolated context — Claude Code and VS Code both expose context: fork in the frontmatter. The skill runs in a subagent; only its distilled result crosses back, keeping search hits and intermediate scratch out of the main chat. That is Lesson 11's cost ledger again: the auxiliary tokens never enter the window you're budgeting.

The frontmatter is a permission surface too

Frontmatter does more than name the skill. disable-model-invocation: true makes a side-effect workflow (deploy, commit) user-only, so the model can't fire it on its own timing. allowed-tools pre-approves tools while the skill runs. A skill declaring allowed-tools or hooks is treated as an elevated-permission request and needs user approval before first use — the runtime reads the frontmatter as a privilege grant, not just metadata.

4 When a skill is the wrong surface

The discipline's standing caveat reappears, sharpened for packaging. A skill is overhead when:

And the security note that scales with reuse: each skill from an external registry is a prompt-injection vector. A published audit of 3,984 skills found 13.4% with critical-severity issues. Review third-party skills like any other dependency — pinned, reviewed, sandboxed — not trusted because they loaded cleanly.

↪ Your win: package usage knowledge as a second surface

Retrieval practice — recall, don't peek

Question 1The thing a skill carries that a JSON schema cannot is the…

Question 2A knowledge-only skill is portable across 30+ tools because it contains no…

Question 3The single field that decides whether a skill loads at all is the…

Question 4Forking a skill with context: fork keeps the main window lean by isolating its…

Question 5 · spaced recall from Lesson 11The standing token cost of tool definitions, paid every turn before any call, is the…

Ask me anything. Want to decide whether a recurring tool misuse needs a richer schema or a skill, or draft a description + ## Gotchas for a tool your agent keeps fumbling? Next in Part 6: Hooks & Deterministic Lifecycle Enforcement — when a rule must hold whatever the model decides.
✎ Feedback