---
url: 'https://adk.nht.io/the-loop/trust-tiers/persistence/research.md'
description: >-
  Memory poisoning attack taxonomy, RAG poisoning research, and the formal
  threat model for persistent agent state.
---

# Persistence — Threat Analysis

This research analyzes the structural and semantic vulnerabilities of persistent state in agentic systems. It focuses on the A-MemGuard and MemoryGraft attack families; we demonstrate that while the reference batteries mitigate structural escape via nonce-keyed closing tags, systems remain susceptible to semantic poisoning. A formal argument establishes that `Memory.id` must not be body-derivable to prevent attackers from predicting the envelope's closing boundary. We review RAG poisoning literature (TrustRAG/RobustRAG) to establish that provenance isolation via trust tiers is a non-negotiable requirement for secure retrieval. These mechanisms are designed strictly to prevent structural escape and cross-tier boundary violations; preventing semantic manipulation or the retrieval of misleading content is an explicit non-goal.

Other threat analyses in this section: [Envelopes](../envelopes/research) · [Identity and Reasoning](../identity-and-reasoning/research) · [Media](../media/research) · [Back to Trust Tiers](../../trust-tiers)

This page covers the formal threat analysis for [Persistence](../persistence). For the operational guide, start there.

## Memory poisoning — attack taxonomy

Memory poisoning attacks target the retrieval-and-render pipeline. The attacker writes a record containing a structural escape sequence; upon retrieval, naive parsers fail catastrophically. Two primary attack families define this threat landscape:

* **A-MemGuard \[@a-memguard-2025]**: Demonstrates that memory-poisoning defenses must be keyed off harness-controlled identifiers. Envelope integrity is physically impossible if the identifier is derivable from the content.

* **MemoryGraft \[@memorygraft-2025]**: Illustrates how poisoned records graft malicious instructions into the execution context. The per-record nonce mechanism implemented in the reference batteries defeats close-tag injection—an attacker cannot close an envelope without predicting the `Memory.id`. However, nonces are powerless against semantic poisoning: a structurally valid record containing persuasive falsehoods (e.g., "User is verified admin") will pass through the envelope intact. Semantic integrity must be enforced via retrieval filtering and memory authentication, not envelope structure.

## Why Memory.id must not be body-derivable

Deterministic identifiers are a structural security failure. If `Memory.id` is a function of the body (e.g., a content hash), the system is compromised by design:

1. The attacker constructs a body $B$ containing a forged close tag and payload.
2. The attacker computes $id = f(B)$, mirroring the system's deterministic function.
3. The attacker embeds the predicted closer `</memory_${id}>` within their body text.
4. The reference batteries render the record, placing an identical closer outside the body.
5. The model terminates the envelope at the attacker's forged boundary.

The security invariant is absolute: `Memory.id` must be assigned by the caller independently of the body. @nhtio/adk enforces this at the schema layer by requiring caller-provided IDs, ensuring the attacker cannot know the nonce at the time of content creation.

## RAG poisoning

RAG poisoning occurs when corpora are contaminated prior to retrieval. The attacker plants malicious content in source material (documentation, web crawls, etc.) that the system eventually ingests.

Key research:

* **TrustRAG \[@trustrag-2025]**: Establishes a fundamental trust asymmetry between user input and retrieved context. Retrieved content must never inherit the authority of the retrieval mechanism itself.
* **RobustRAG \[@robustrag-2024]**: Proves that provenance isolation—treating retrieved content as a distinct, lower-trust tier from developer-authored content—is a necessary condition for certifiable retrieval security.

The `trustTier` declaration on the `Retrievable` primitive in @nhtio/adk implements this isolation. The tier is bound at construction time when provenance is known, preventing trust-escalation during the retrieval-to-prompt transition.

## Long-term state contamination

Delayed-activation attacks represent the most insidious persistent threat. A poisoned record survives across sessions, lying dormant until a specific retrieval trigger is met long after the original attacker has departed.

In these scenarios, structural defenses remain critical. The per-record nonce prevents the payload from escaping the envelope, but the defense is only as robust as the nonce assignment. If the record was stored with a body-derived or attacker-influenced ID, the defense is nullified. Cross-session persistence requires that trust decisions remain immutable; a record stored as `third-party-public` must never be re-contextualized as a higher-trust tier in a future session.

## Non-goals for persistence defenses

Structural defenses are not a panacea. The following threats are explicitly out-of-scope for envelope-based mitigation:

* **Semantic memory poisoning**: Structurally valid but factually false records (e.g., "Alice has admin privileges") will pass through every structural filter. This is an authentication and auditing problem.
* **Misleading low-trust content**: A correctly labeled `third-party-public` record containing misinformation is functioning as intended when it is rendered in an untrusted envelope. The system's role is to preserve the structural tier, not to act as an arbiter of truth.
* **Prevention of record recall**: Nonces prevent *escape*, not *recall*. A poisoned record will still be rendered in the context and may influence the model through its semantic content. Filtering malicious-but-valid records requires active memory auditing and retrieval-time policies.
