Skip to content
4 min read · 841 words

Thought

Primitives covers the eight-primitive overview.

A Thought is reasoning. The model thinking out loud about what to do next — the kind of content a modern model emits before it commits to an answer, the kind a planner needs to read back to itself on the next iteration, the kind an audit trail wants to capture so a human can see why the agent did what it did. It is separate from Message because reasoning is not user-facing dialogue: collapsing the two means a stray reasoning fragment can be read as something the assistant said to the user, and a stray user-facing reply can be read as reasoning. Both are bad outcomes.

A Thought always carries the reasoning text. If it carries a vendor-shaped Thought.payload — the opaque blob some providers expect to see round-tripped verbatim on the next call (OpenAI's encrypted_content, Anthropic's signed reasoning items, Gemini's thought signatures, anything else in that lineage) — it also carries a Thought.replayCompatibility tag. No tag, no safe replay. The payload is an opaque blob only the right executor knows how to round-trip. Those two fields aren't additive at the wire: when a payload is present the executor sends the payload to the model and keeps the text purely for token accounting and human inspection; when no payload is present the executor sends the text. The model sees one or the other, never both at once. The replayCompatibility tag is a stable identifier for the wire shape so a future executor can recognise the blob it knows how to send back, route it to the right channel, and drop the ones it can't. The tag is the only thing between a payload that helps the next turn and one that silently corrupts it.

Reasoning text is also one of the highest-leverage targets for context-window compaction. A modern model emits a lot of it — exploring branches, second-guessing, restating the question — and most of that volume does not have to survive into the next turn. A well-summarised synthetic Thought that preserves the intent and the conclusions while shedding the deliberation saves dramatic amounts of token budget, often more than compacting any other primitive. Treat the original reasoning as transcript, and the Thought the next turn reads as the dense version that earned its place.

Reasoning is load-bearing — in both directions

A model leans harder on content it reads as its own reasoning than on content it reads as dialogue or retrieved text. That asymmetry is the whole reason Thought is a distinct primitive, and it is also the most under-used lever in the ADK.

The hostile direction is the one people already know about: chain-of-thought hijacking, where reasoning text smuggled in from somewhere it shouldn't be coming from gets treated as authoritative. See Reasoning fences for the pattern an executor should follow when rendering thoughts into the context.

The useful direction is the same lever pointed the other way, and most implementors either don't reach for it, don't realise it's there, or reach for it wrong. A small, cheap model that reads a good Thought will outperform a much bigger model reasoning from scratch on the same problem — because the hard part of reasoning is the exploration, and a Thought lets you do that exploration somewhere else and hand the conclusions to the executor for free. The frontier reasoning models already do this on their own: a single forward pass that thinks for a long time, then answers from that thinking. The pass is opaque — you don't get to pick the planner, the budget, the persistence, or what the reasoning is allowed to see — but the leverage is real, and that's where the model's apparent intelligence is actually coming from.

You can build the same loop yourself, in the open. Run a sub-agent (or a sequence of them, or a cheaper model, or a domain-specialist model) on the parts that need thought, then write what they produced into the parent turn as Thoughts. The main model reads them the same way it reads its own reasoning — with the same lean — and skips the work it would have had to do to derive them. Done well, this is how a lightweight base model punches at frontier weight on the workloads you care about: you choose where to spend the reasoning budget, you choose what is worth persisting, and you choose which model is qualified to think about which problem. The ADK gives you the seam; what travels across it is your call.

The rules at the seam are the same in both directions. A Thought you injected from a sub-agent and a Thought an attacker injected through a poisoned retrieval are indistinguishable to the model the moment they land in the context — which is why the fencing pattern is non-negotiable on the executor side, and why the provenance of every Thought you write needs to be a decision you made on purpose, not a default your retrieval middleware fell into.