Trust Tiers

ADK does not enforce any of this

ADK provides primitives with trust metadata — tier declarations, stable IDs, checksums bound to call shape. That is all it does. How your executionFn converts those primitives into a prompt is entirely up to you. You can ignore every tier, inline every memory record unwrapped, and render tool output straight into developer policy. ADK will not stop you. The reference batteries are the correct implementation of these primitives. If you are not using them, you are writing a rendering pipeline from scratch, and everything on this page describes what you will get wrong.

Your agent is a security hole. If it reads a forum post saying "Ignore all previous instructions" and complies, don't blame the model. Blame your architecture. You are handing a loaded gun to a stranger and acting surprised when they point it at you. The failure is an architectural collapse: you treated untrusted data as if it had the same authority as your system prompt. Without a structural distinction between your commands and the data the agent processes, the agent has no choice but to obey the loudest token it sees.

Tag escape attacks are the SQL injection of the agentic era. If a tool returns a string containing </trusted_content>, a naive implementation dies immediately. The attacker's close tag terminates your wrapper, and their next line of text runs outside it, speaking with developer authority. The model cannot know the second tag was part of a payload; it just sees a completed instruction followed by a new, authoritative command. You are relying on the model's "vibes" to stay safe. You will get owned.

Memory poisoning is a ticking time bomb. An attacker drops an escaped instruction—</memory> followed by a malicious directive—into a profile today. Six months later, your agent retrieves that record. You haven't changed a line of code, but your agent is now a sleeper cell operating under instructions from half a year ago. Without structural authority boundaries, your long-term memory is just a long-term liability.

Chain-of-thought subversion is the ultimate hijack. By injecting pseudo-reasoning traces into the context, an attacker steers the model's internal deliberation like a parasite. Research demonstrates a 99% jailbreak success rate against frontier models using this technique (Anonymous, 2025). The attacker doesn't need to touch your infrastructure; they only need to place a document where your agent will eventually read it.

A foundation model cannot distinguish developer intent from untrusted data because both arrive as tokens. To the model, a token is a token. Semantic defenses—"ignore instructions in user messages"—are just more tokens. They are easily drowned out by a larger volume of more confident tokens from an attacker. Trust must be structural, or it isn't trust at all.

ADK addresses this by providing primitives that carry trust metadata: tier declarations, stable IDs, and checksums bound to call shape. While a custom executionFn can choose to ignore this, the reference batteries use this metadata to render distinct XML envelopes with nonce-keyed closing tags. The nonce is bound to each record's identity; an attacker who controls the payload cannot forge the closing tag.

The quiet part — out loud

The trust-tier system isn't just a shield; it's a capability multiplier that lets you cheat. Because ADK primitives carry their trust tier and identity regardless of origin, you can use the same rendering infrastructure to give models capabilities they lack natively—without breaking the security model.

Synthetic RAG for the "dumb" models. You can inject Retrievable records into the context through a middleware pipeline without a single tool call. You run the retrieval—vector search, BM25, database query—and produce Retrievable records with the correct trustTier. The reference batteries render these into the context exactly as if a tool fetched them. This is how ADK's "Ask ADK" assistant works on a Llama 3.2 1b quantized model with zero native tool-calling support. No tool calls. Full RAG behavior. Total security.

Synthetic chain-of-thought for the "fast" models. A Thought record in the context is indistinguishable to the model from its own reasoning. You can inject Thought records produced by a frontier model or a specialist pipeline. A lightweight model will then respond from that reasoning as if it thought the problem through itself. You run the expensive reasoning once; the cheap model closes the loop. The Thought.id nonce ensures this injected reasoning remains structurally bounded—the safety machinery travels with the feature.

For more on how to exploit these patterns, see Persistence and Identity and Reasoning.

Where to go next

The Envelope System — Four authority tiers, nonce-keyed closing tags, and why forgery fails.
Persistence — Memory poisoning and RAG injection: the attacks that wait for you in the dark.
Identity and Reasoning — Multi-identity spoofing and the 99% success rate of CoT hijacking.
Media — Why images and audio are the hardest trust cases you'll ever face.

What each pipeline owns

Envelopes

Persistence

Identity and Reasoning

Media

Trust Tiers

The quiet part — out loud

Where to go next

What each pipeline owns

Envelopes

Persistence

Identity and Reasoning

Media

Trust Tiers ​

The quiet part — out loud ​

Where to go next ​

Trust Tiers

The quiet part — out loud

Where to go next