Tool Engineering · ~7 min
A tool annotation looks like a passive safety badge. The moment a harness reads it to decide whether to run calls in parallel, it stops being advisory and starts governing execution.
MCP tools carry four advisory annotations: readOnlyHint, destructiveHint,
idempotentHint, and openWorldHint. For a long time these were cosmetic — they changed a
confirmation prompt, nothing more. Then harnesses started wiring them into the dispatch path, and a misannotation
stopped being a UX detail and became a correctness bug.
When a harness reads readOnlyHint: true and lifts the sequential gate on that basis,
the annotation governs execution semantics, not just UX. Codex CLI 0.134.0 shipped exactly this:
read-only tools automatically qualify for parallel dispatch. A tool that declares readOnlyHint: true
but secretly mutates now produces racing writes the moment the agent issues two calls in one turn.
Read-only tools, by the contract, don't mutate state shared across calls — so two concurrent invocations can't
interfere through tool effects. The only thing they share is the transport and the server's process budget. That
lets the harness collapse wall-clock cost for N read calls from sum(latency) toward
max(latency) plus dispatch overhead.
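As a rough worked example of that arithmetic, the sketch below compares the two cost models for one batch of read calls. The function names, the sample latencies, and the flat dispatch_overhead term are illustrative assumptions, not part of any harness API.

```python
def sequential_cost(latencies: list[float]) -> float:
    """Wall-clock cost when calls run one after another."""
    return sum(latencies)


def parallel_cost(latencies: list[float], dispatch_overhead: float = 0.0) -> float:
    """Idealized wall-clock cost when all calls run concurrently."""
    if not latencies:
        return 0.0
    return max(latencies) + dispatch_overhead


# Three hypothetical read calls, latencies in seconds.
latencies = [0.5, 0.25, 1.0]
print(sequential_cost(latencies))       # 1.75
print(parallel_cost(latencies, 0.125))  # 1.125
```

The parallel figure is a lower bound: it assumes the server can actually serve all calls at once, which the conditions discussed later in this lesson can break.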
It's also cheap to wire: no planner, no dependency DAG, just a static lookup of the annotation in the tool list.
The hint defaults to false, so a server that omits annotations stays sequential — conservative by
design. The author opts into concurrency by setting one boolean, and accepts the responsibility that comes with it.
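The static lookup could look something like the sketch below. It assumes the harness holds the tool list as plain dictionaries keyed by tool name; the tool names and the split_calls helper are hypothetical, and a real harness would also have to preserve ordering between the two groups.

```python
from typing import Any


def is_read_only(tool: dict[str, Any]) -> bool:
    """Return the tool's readOnlyHint, treating a missing hint as False."""
    annotations = tool.get("annotations") or {}
    return annotations.get("readOnlyHint", False) is True


def split_calls(
    calls: list[str], tools: dict[str, dict[str, Any]]
) -> tuple[list[str], list[str]]:
    """Partition the tool names of one turn into (parallel-eligible, sequential)."""
    parallel: list[str] = []
    sequential: list[str] = []
    for name in calls:
        tool = tools.get(name, {})  # unknown tools fall back to sequential
        (parallel if is_read_only(tool) else sequential).append(name)
    return parallel, sequential


# Hypothetical tool list as a harness might hold it after listing tools.
TOOLS = {
    "search_docs": {"annotations": {"readOnlyHint": True}},
    "write_file": {"annotations": {"readOnlyHint": False, "destructiveHint": True}},
    "legacy_lookup": {},  # no annotations: stays sequential
}

parallel, sequential = split_calls(["search_docs", "legacy_lookup", "write_file"], TOOLS)
print(parallel)    # ['search_docs']
print(sequential)  # ['legacy_lookup', 'write_file']
```

Checking `is True` rather than truthiness keeps a malformed value such as the string "false" from opting a tool into concurrency by accident.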
The MCP spec is blunt: clients must treat annotations as untrusted unless they come from a trusted
server. The flag is a claim by the tool author, not a guarantee the harness can verify. The most common and
most dangerous misannotation is a tool marked readOnlyHint: true that actually writes — logs the access,
bumps a last_seen timestamp, increments a counter. Sequentially that's invisible; under parallel
dispatch, two calls race on the same write and the agent reasons over an inconsistent result.
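The lost update is easy to reproduce in miniature. The sketch below is a toy model, not a real MCP server: lookup and AccessCounter are hypothetical, and asyncio.sleep(0) stands in for whatever I/O sits between the tool's read and its write.

```python
import asyncio


class AccessCounter:
    """Shared state that the 'read-only' tool quietly mutates."""

    def __init__(self) -> None:
        self.count = 0


async def lookup(store: AccessCounter, key: str) -> str:
    """Claims to be read-only, but bumps a counter with a read-modify-write."""
    seen = store.count        # read the current value
    await asyncio.sleep(0)    # stand-in for I/O; yields to other calls
    store.count = seen + 1    # write based on a possibly stale read
    return f"value-for-{key}"


async def run_sequential() -> int:
    store = AccessCounter()
    await lookup(store, "a")
    await lookup(store, "b")
    return store.count


async def run_parallel() -> int:
    store = AccessCounter()
    await asyncio.gather(lookup(store, "a"), lookup(store, "b"))
    return store.count


print(asyncio.run(run_sequential()))  # 2
print(asyncio.run(run_parallel()))    # 1, one update was lost
```

Run sequentially, the hidden write is harmless. Run concurrently, both calls read the same starting value and one increment disappears, which is the behavior the annotation promised could not happen.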
idempotentHint with readOnlyHint
A read that fails transiently must be safe to retry. Pure reads are idempotent by definition, but setting
idempotentHint: true alongside readOnlyHint makes that explicit and gives the harness a
safe recovery path on a dropped call. This is the same safe-to-retry property from the
previous lesson, now declared on the surface instead of enforced in the body.
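One way a harness might consume that declaration is to gate its retry loop on the hint. This is a minimal sketch under stated assumptions: TransientError, call_with_retry, and make_flaky are hypothetical names, and a real client would add backoff and distinguish error types.

```python
from typing import Any, Callable


class TransientError(Exception):
    """Hypothetical marker for a dropped or timed-out call."""


def call_with_retry(
    tool: dict[str, Any], invoke: Callable[[], Any], max_attempts: int = 3
) -> Any:
    """Retry on TransientError only when the tool declares idempotentHint: true."""
    annotations = tool.get("annotations") or {}
    retryable = annotations.get("idempotentHint", False) is True
    attempts = max(1, max_attempts) if retryable else 1
    for attempt in range(1, attempts + 1):
        try:
            return invoke()
        except TransientError:
            if attempt == attempts:
                raise  # out of attempts, or the tool never opted in


def make_flaky(failures: int) -> Callable[[], str]:
    """Build a call that fails `failures` times, then succeeds."""
    state = {"left": failures}

    def invoke() -> str:
        if state["left"] > 0:
            state["left"] -= 1
            raise TransientError("connection dropped")
        return "ok"

    return invoke


safe_read = {"annotations": {"readOnlyHint": True, "idempotentHint": True}}
print(call_with_retry(safe_read, make_flaky(1)))  # ok
```

A tool without the hint gets exactly one attempt here, so the first transient failure surfaces to the caller instead of being retried blindly.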
Even with honest annotations, concurrency isn't free. Audit for these before flipping the flag:
| Condition | Why parallel hurts |
|---|---|
| Rate-limited backend | Concurrent reads against a per-second-capped API hit 429s that sequential calls would have spaced out — wall-clock gain traded for recovery turns. |
| Weakly consistent replicas | List-then-get across read replicas can return divergent views to concurrent reads; the agent reasons over a self-inconsistent picture. |
| No per-server cap | Ten reads fan out against a server sized for sequential traffic and blow its connection or process budget — the harness win becomes a server outage. |
| Model can't interleave | Concurrent dispatch returns results out of order; a model that degrades on interleaved-ledger reasoning underperforms the sequential baseline. |
The safe operator posture for many third-party servers is to leave per-server concurrency disabled until an upstream idempotency audit has confirmed every "read-only" tool really is. The annotation is a promise; the audit is what makes it load-bearing.
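A per-server cap of the kind the table calls for can be sketched with a semaphore. The dispatch_capped helper and the demo harness below are hypothetical; the default of 1 mirrors the posture of leaving concurrency off until the audit is done.

```python
import asyncio
from typing import Awaitable, Callable


async def dispatch_capped(
    calls: list[Callable[[], Awaitable[str]]], max_concurrency: int = 1
) -> list[str]:
    """Run calls concurrently, never more than max_concurrency at once.

    Results are returned in call order, regardless of completion order.
    """
    gate = asyncio.Semaphore(max_concurrency)

    async def guarded(call: Callable[[], Awaitable[str]]) -> str:
        async with gate:
            return await call()

    return await asyncio.gather(*(guarded(c) for c in calls))


async def demo(cap: int) -> tuple[list[str], int]:
    """Fan out ten fake reads and record the peak number in flight."""
    in_flight = 0
    peak = 0

    async def read(i: int) -> str:
        nonlocal in_flight, peak
        in_flight += 1
        peak = max(peak, in_flight)
        await asyncio.sleep(0.01)  # stand-in for the server round trip
        in_flight -= 1
        return f"result-{i}"

    calls = [lambda i=i: read(i) for i in range(10)]
    results = await dispatch_capped(calls, max_concurrency=cap)
    return results, peak


results, peak = asyncio.run(demo(3))
print(peak)         # 3
print(results[:2])  # ['result-0', 'result-1']
```

Because asyncio.gather returns results in argument order, this sketch also sidesteps the out-of-order concern from the table, at the cost of waiting for the whole batch before handing anything back to the model.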
readOnlyHint.
sum to max at near-zero wiring cost.
idempotentHint with readOnlyHint so transient-failure retries are safe.
Retrieval practice: recall, don't peek
Question 1: Once a harness dispatches in parallel on it, readOnlyHint governs…
Question 2: Honest read-only tools let the harness cut wall-clock cost from…
Question 3: The MCP spec says tool annotations should be treated as…
Question 4: The most dangerous misannotation under parallel dispatch is a tool that…
Question 5 · spaced recall from Lesson 07: The foundational technique for making a re-run safe is to…
readOnlyHint: true, or
how hint-driven concurrency trades against an explicit dependency DAG? Next, Part 5 opens with
Tool Discoverability at Scale — keeping selection sharp as the catalog grows past the point a model can
hold it all in view.