Part 2 · Shaping the Surface

Tool Engineering · ~7 min

Result Shaping

Sizing output was the budget. Shaping it is the craft: which fields, in what language, paginated how — and what to do when even the trimmed result overflows.

Why this, for you: the difference between a tool the agent acts on confidently and one it hallucinates around. Opaque IDs, unbounded lists, and silent truncation each produce a distinct, recurring bug. This lesson is four moves that close them — and the daily payoff is fewer wrong-field references in agent output.

The agent reasons over your output as natural language. So output format is a reliability lever independent of model capability: the same tool, reshaped, produces fewer hallucinations and more accurate downstream calls — without touching the model.

1 Replace opaque identifiers with semantic ones

UUIDs, MIME types, and internal codes are arbitrary byte sequences to a model. Resolving them to natural-language fields measurably reduces hallucination in retrieval tasks — the agent can reference the object by name instead of miscopying a 36-character UUID.

# Opaque — the agent will eventually miscopy this {"id": "a3f4b2c1-47c1-4e2d-...", "type": "application/vnd...wordprocessingml.document"} # Semantic — the agent references it by name, no transcription risk {"name": "Q3 Budget Review", "file_type": "Word document"}

2 Return only the decision-relevant fields

A get_customer that returns twelve fields when the agent needs three forces it to parse nine irrelevant ones — and every extra field is a hallucination opportunity: the agent may reference, invent, or misattribute stripe_id or feature_flags in an email it had no business mentioning. Design the schema around decision-relevant fields. Keep developer-debugging output behind a separate debug mode — don't make the firehose the default an agent sees.

3 Filter and paginate at the tool layer

A tool that returns the full dataset pushes filtering onto the agent, which then either hallucinates the filter or loads everything into context. Move that work into the tool.

MoveWhat it does
Filter params (status=open)Apply before returning; agent doesn't filter in its head
Pagination with a cursorReturn one page + a handle, not an unbounded list
Sensible defaults (limit=20)Prevent accidental context flooding
response_format enumLet the agent pick concise vs detailed per its budget

4 When it still overflows: the PARTIAL contract

Even filtered output can blow the token budget. The wrong answer is a hard error — the agent spends a turn, gains nothing, and has to guess retry parameters. The right answer is a graceful truncation contract with three load-bearing parts. Drop any one and it collapses back into the failure it was meant to fix.

A useful prefix (work gets done this turn) + a structurally distinct marker (the agent knows it's incomplete) + a continuation handle (a defined next call: offset, cursor, page token). Together, not separately.

The marker is load-bearing — and trailing prose is ignored

A bottom-of-output [PARTIAL] line is the easiest implementation and the most ignored: the model reads the prefix as the whole document and the trailing line as commentary. Anthropic's own Read tool hit this as a bug — the agent treated a preview as the complete file and silently dropped the un-read rules. Put the marker in a structurally distinct slot: a leading banner, a separate JSON field, or typed metadata.

5 Make errors part of the shape

An error is output too. "400 Bad Request" tells the agent nothing; "Invalid date range: end_date must be after start_date. Received start=2024-03-01, end=2024-02-01." tells it exactly what to fix. Actionable errors let the agent self-correct on the next call without a human — the topic Lesson 5 takes apart in full.

Where shaping backfires

A task-specific schema that omits a field the agent unexpectedly needs forces a hallucination or an extra round-trip — sometimes pricier than returning the full record once. And a graceful tool paired with an agent that has no reaction policy for the marker is worse than a hard error: the system prompt must say what to do (paginate, summarise, ask, proceed). For security-critical reads — policy files, guardrails — fail-closed beats return-prefix: a silently truncated rules file is the same as a missing one.

↪ Your win: shape output for an agent reader

Retrieval practice — recall, don't peek

Question 1Resolving a UUID to a natural-language name mainly reduces…

Question 2Filtering and pagination belong…

Question 3A trailing [PARTIAL] line tends to be ignored because the model…

Question 4For a guardrail or policy-file read, truncation should…

Question 5 · spaced recall from Lesson 03The right design question for tool output is…

Ask me anything. Want the full PARTIAL authoring checklist (which bytes were dropped, where to place the marker), or how response_format enums pair with a "reason about your data needs first" prompt? Next in Part 2: Errors as a Teaching Signal — turning failures into the agent's next correct move.
✎ Feedback