Part 1 · Foundations

Tool Engineering · ~7 min

Schema & Description Altitude

The agent never browses your tool catalog. It selects by reasoning over descriptions — so the description is where you win or lose the call, before any work begins.

Why this, for you: the highest-return edit you can make to an existing tool. You don't rewrite the implementation — you rewrite the description and tighten the schema. Anthropic measured a 40% drop in task completion time from improving tool ergonomics, descriptions included. This is that lever.

Selection is a reasoning step, not a lookup. The agent reads the descriptions in its context and picks the tool whose description best matches its current intent. A tool whose description fails to communicate a use case is invisible for that use case — even when the implementation would have handled it.

A description accurate enough to say what the tool does but not when to prefer it is the most common failure mode. The fix is positive selection signals: "Use this when X. Prefer this over other_tool when Y." Those are instructions to the agent, not documentation of the interface.
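As a minimal sketch of positive selection signals, assume two tools with overlapping scope. The names `search_issues` and `get_sprint_issues`, their wording, and the `missing_selection_signals` helper are all invented here for illustration. Each description says when to use the tool and when to prefer its sibling, and the helper is a crude check that flags descriptions carrying no such signal.

```python
# Hypothetical tool pair; names and wording are illustrative assumptions.
TOOLS = [
    {
        "name": "search_issues",
        "description": (
            "Full-text search across all issues in a project. "
            "Use this when you have keywords but no sprint_id. "
            "Prefer get_sprint_issues when you already know the sprint."
        ),
    },
    {
        "name": "get_sprint_issues",
        "description": (
            "List every issue in one sprint. "
            "Use this when you have a sprint_id and need the full set. "
            "Prefer this over search_issues for sprint-level reporting."
        ),
    },
]


def missing_selection_signals(tools):
    """Return names of tools whose description never says when to use them."""
    markers = ("use this when", "prefer")
    return [
        t["name"]
        for t in tools
        if not any(m in t["description"].lower() for m in markers)
    ]
```

A string match like this only catches the absence of a signal; whether the signal is the right one still needs a human read or a selection test.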

1 Write the description like onboarding, not API reference

A terse reference is enough for a developer who already knows the system. The agent knows nothing — it cannot infer what "user" means in your domain, whether a date is ISO 8601 or a Unix timestamp, or that you must call list_sprints before you have a sprint_id. Write as if onboarding a competent new hire on day one: explicit about the things docs omit because experienced users already know them.

# Terse: agent must guess ID type, filter values, traversal order
"description": "Get issues for a sprint."

# Onboarding: domain conventions, ID format, valid filters, where the ID comes from
"description": "Retrieve all issues in a Jira sprint. sprint_id is a numeric string (e.g. '42'), not the name; get it from list_sprints first. status accepts exactly: 'To Do', 'In Progress', 'Done'; omit for all. Returns id, summary, status, assignee, story_points."

Three things a new hire learns day one — IDs are numeric strings, call list_sprints first, status takes those exact strings — are now on the surface. The terse version forces a guess on each.
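For context, here is a sketch of how the onboarding description might sit inside a complete tool definition. It assumes the common name / description / input_schema layout with a JSON Schema body. The tool name `get_sprint_issues` and the per-parameter descriptions are assumptions added for illustration, not part of the original example.

```python
import json

# Sketch only: tool name and parameter wording are assumed for illustration.
get_sprint_issues = {
    "name": "get_sprint_issues",
    "description": (
        "Retrieve all issues in a Jira sprint. "
        "sprint_id is a numeric string (e.g. '42'), not the name; "
        "get it from list_sprints. "
        "status accepts exactly: 'To Do', 'In Progress', 'Done'; omit for all. "
        "Returns id, summary, status, assignee, story_points."
    ),
    "input_schema": {
        "type": "object",
        "properties": {
            "sprint_id": {
                "type": "string",
                "pattern": "^[0-9]+$",  # numeric string, never the sprint name
                "description": "Numeric sprint ID as returned by list_sprints.",
            },
            "status": {
                "type": "string",
                "enum": ["To Do", "In Progress", "Done"],
                "description": "Optional filter; omit to return all statuses.",
            },
        },
        "required": ["sprint_id"],
    },
}

# The definition is plain data, so it serializes directly for an API request.
payload = json.dumps(get_sprint_issues, indent=2)
```

The description and the schema repeat some facts on purpose: the prose tells the agent what to do, and the schema catches the cases where it does something else.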

2 Make the wrong call uncallable: poka-yoke the schema

The description tells the agent how to call the tool correctly. The schema can make the wrong call impossible. This is poka-yoke, the mistake-proofing practice borrowed from Toyota: redesign the process so the defect can't occur.

Mechanism | Schema move | Example
Contact: shape blocks misuse | Enumerate valid values | ["python","typescript","all"], not free text
Fixed-value: bound the range | Clamp with a default | max_results 1–100, default 20
Motion-step: enforce order | Prerequisite gate | Edit rejects a file not yet read
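The three mechanisms can be sketched in a few lines of Python. Everything named here (`SEARCH_SCHEMA`, `check_args`, `EditGate`) is hypothetical. `check_args` is a tiny stand-in for a real JSON Schema validator and handles only the keywords used in this schema, and it clamps out-of-range integers rather than rejecting them, which is a design choice rather than standard validator behavior.

```python
# Contact: enum. Fixed-value: bounded integer with a default.
SEARCH_SCHEMA = {
    "type": "object",
    "properties": {
        "language": {"type": "string", "enum": ["python", "typescript", "all"]},
        "max_results": {"type": "integer", "minimum": 1, "maximum": 100, "default": 20},
    },
    "required": ["language"],
}


def check_args(schema, args):
    """Apply required, enum, default and min/max clamping; return clean args."""
    out = dict(args)
    for name in schema.get("required", []):
        if name not in out:
            raise ValueError(f"missing required parameter: {name}")
    for name, spec in schema["properties"].items():
        if name not in out:
            if "default" in spec:
                out[name] = spec["default"]
            continue
        value = out[name]
        if "enum" in spec and value not in spec["enum"]:
            raise ValueError(f"{name} must be one of {spec['enum']}")
        if spec.get("type") == "integer":
            if not isinstance(value, int):
                raise ValueError(f"{name} must be an integer")
            # Clamp into the allowed range instead of failing the call.
            value = max(spec.get("minimum", value), value)
            value = min(spec.get("maximum", value), value)
            out[name] = value
    return out


class EditGate:
    """Motion-step: edit_file refuses any path that read_file has not seen."""

    def __init__(self):
        self._read = set()

    def read_file(self, path):
        self._read.add(path)
        return f"<contents of {path}>"  # placeholder, no real file I/O

    def edit_file(self, path, new_text):
        if path not in self._read:
            raise PermissionError(f"read {path} before editing it")
        return f"edited {path}"
```

Calling `check_args(SEARCH_SCHEMA, {"language": "all", "max_results": 500})` yields a `max_results` of 100, while a `language` of `"ruby"` raises before any work starts. The error messages matter too: they are what the agent reads when deciding how to retry.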

Unambiguous names do the same work: user_id not user, start_date not date — the name carries the type and format so the agent can't substitute the wrong thing. And keep formats close to training data (JSON, markdown, prose); inputs that need line counting or string-escaping raise error rates. Adding concrete sample calls to a tool definition moved accuracy from 72% to 90% on complex parameter handling in Anthropic's testing.
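One way to attach sample calls, sketched under assumptions: the helper `with_examples`, the "Example calls:" layout, and the `get_sprint_issues` tool name are invented here. Some APIs offer a dedicated field for examples, so check the one you target before appending them to the description text like this.

```python
import json


def with_examples(tool, examples):
    """Return a copy of tool whose description ends with sample calls.

    Each example is a dict of arguments, rendered as compact JSON so the
    agent sees the exact shape it is expected to produce.
    """
    lines = [tool["description"].rstrip(), "", "Example calls:"]
    for args in examples:
        lines.append(f'{tool["name"]}({json.dumps(args, sort_keys=True)})')
    return {**tool, "description": "\n".join(lines)}


# Hypothetical tool used only to exercise the helper.
tool = {
    "name": "get_sprint_issues",
    "description": "Retrieve all issues in a Jira sprint.",
}
documented = with_examples(
    tool,
    [
        {"sprint_id": "42"},
        {"sprint_id": "42", "status": "In Progress"},
    ],
)
```

Two or three examples that cover the optional parameters are usually enough; each one adds tokens to every request that carries the definition.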

3 Altitude: specific without being brittle

There's a level above the schema, too. A description that over-prescribes ("always call list_sprints first") strips the agent of valid paths when the ID is already known. Saying where a sprint_id comes from is context the agent can use; mandating the call order is a constraint it can't reason around. Describe what a parameter requires, not how to obtain it from scratch every time. State the contract and the constraints; leave the sequencing to the agent's reasoning, which has information the schema author didn't.

The cost of getting altitude wrong

Too low (a prescriptive sequence baked into the description): brittle, because it breaks when the task varies. Too high (a vague "search for things"): the agent burns tokens resolving the ambiguity before it can call. The right altitude is the minimum detail that prevents misuse, and no more, because every extra token is paid each time the description is loaded into the agent's context.
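To make the three altitudes concrete, here is a rough sketch. The three descriptions are hypothetical, and `altitude_warnings` is an invented string-matching heuristic with an arbitrary word threshold, not an established lint rule.

```python
# Three hypothetical descriptions of the same tool at different altitudes.
DESCRIPTIONS = {
    "too_low": (
        "Always call list_sprints first, then pick the first active sprint, "
        "then call this tool with its id, then filter by status 'Done'."
    ),
    "too_high": "Search for things in Jira.",
    "right": (
        "Retrieve all issues in a Jira sprint. sprint_id is a numeric string "
        "as returned by list_sprints. status accepts 'To Do', 'In Progress', "
        "'Done'; omit for all."
    ),
}


def altitude_warnings(text, min_words=8):
    """Flag descriptions that dictate a sequence or say too little."""
    warnings = []
    lowered = text.lower()
    if any(marker in lowered for marker in ("always ", "never ", "then call")):
        warnings.append("prescribes a sequence")
    if len(text.split()) < min_words:  # word count as a crude token proxy
        warnings.append("too vague to select on")
    return warnings


report = {name: altitude_warnings(text) for name, text in DESCRIPTIONS.items()}
```

A check like this is only a first pass for reviewing a catalog; the real test of altitude is whether the agent picks and calls the tool correctly across varied tasks.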

↪ Your win: rewrite the description, tighten the schema

Retrieval practice — recall, don't peek

Question 1: The most common tool-description failure is one that states what the tool does but not…

Question 2: Replacing a free-text parameter with an enum is an example of…

Question 3: The "onboarding" framing says to write descriptions as if the reader is…

Question 4: Baking "always call list_sprints first" into a description risks making the tool…

Question 5 · spaced recall from Lesson 01: When an agent keeps misusing a tool, the first place to look is…

Ask me anything. Want the enum-vs-validation decision rule, or how to test tool selection by logging which tool the agent picks per task type? Next in Part 2: Token-Efficient Tool Design — once the agent picks the right tool, how much of its context does the call cost?