
Bring your own memory

Memory is not conversation history.

An agent's history is not its memory. Conversation history is a log of utterances — a chronological stream of messages. Memory is the distillation of facts learned over time. History records what was said; memory records what is true.

Conflate these two concepts and the agent becomes slow, expensive, and fragile. A scrapbook is not a brain, despite what every product demo keeps implying.

Memory vs. Messages vs. Retrievables

Three primitives carry different kinds of context into the executor:

| Primitive | Use for | Scope |
| --- | --- | --- |
| Message | Conversation history: what was said in this session | Current turn and recent history |
| Retrievable | Knowledge base content: documents, wikis, manuals | Fetched fresh per turn from external stores |
| Memory | Durable facts about the user or domain: preferences, decisions, names | Persists across turns and sessions |

If the user says they prefer metric units: that is a Memory. If the user says their name is Alex: that is a Memory. If the project deadline moves to December 15: that is a Memory.

If the user asks "what is the capital of France": the answer does not need to be a Memory. If you have a knowledge base article about France: that is a Retrievable.

Conflating these primitives is a direct path to performance degradation. Shoving conversation logs into memory slots bloats your context window. Treat them as distinct. Memory is the right slot only when the fact is durable, personal, and worth recalling across future sessions.
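The routing rule above can be sketched as a small decision function. This is an illustration only: `Candidate`, its flags, and `choosePrimitive` are hypothetical names rather than ADK types, and deciding whether a fact is durable is left to your own extraction logic.

```typescript
// Hypothetical shapes for illustration; none of these are ADK exports.
type Primitive = 'message' | 'retrievable' | 'memory'

interface Candidate {
  text: string
  // True when the content comes from a document store, wiki, or manual.
  fromKnowledgeBase: boolean
  // True when the fact should still hold in future sessions.
  durable: boolean
  // True when the fact describes the user or the domain they work in.
  aboutUserOrDomain: boolean
}

function choosePrimitive(candidate: Candidate): Primitive {
  // Knowledge base content is fetched fresh per turn, so it is never a memory.
  if (candidate.fromKnowledgeBase) return 'retrievable'
  // Only durable facts about the user or domain earn a memory slot.
  if (candidate.durable && candidate.aboutUserOrDomain) return 'memory'
  // Everything else stays in the conversation history.
  return 'message'
}
```

Under these assumptions, "I prefer metric units" routes to a memory, while a one-off question about the capital of France stays a message.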

The Memory Lifecycle

Memory flows through a strict, manual lifecycle on every turn:

  1. Load: Your turnInputPipeline middleware calls TurnContext.fetchMemories, which invokes your TurnRunnerConfig.fetchMemoriesCallback. Your middleware must iterate these results and call ctx.turnMemories.add(m) manually. ADK does not auto-hydrate context. For the canonical pipeline setup, see Context Hydration.
  2. Use: Your executor (or the Chat Completions battery executor) reads ctx.turnMemories when building the provider request. The Chat Completions battery renders loaded memories inside its prompt envelopes.
  3. Write: During execution, the executor or output middleware calls TurnContext.storeMemory to persist a new fact.
  4. Persist: Your TurnRunnerConfig.storeMemoryCallback writes the record to your storage layer.
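The Load step can be sketched as an input middleware. To keep the sketch self-contained it uses local stand-in types: `MemoryLike`, `TurnContextLike`, and `MiddlewareLike` are hypothetical, and the assumption that `fetchMemories` takes no arguments and resolves to an array is not confirmed here. Check the Context Hydration page for the real signatures.

```typescript
// Hypothetical stand-ins for the ADK types; the real shapes may differ.
interface MemoryLike {
  id: string
  content: string
}

interface TurnContextLike {
  // Assumed to resolve to whatever your fetchMemoriesCallback returns.
  fetchMemories(): Promise<MemoryLike[]>
  turnMemories: { add(memory: MemoryLike): void }
}

type MiddlewareLike = (ctx: TurnContextLike, next: () => Promise<void>) => Promise<void>

const hydrateMemories: MiddlewareLike = async (ctx, next) => {
  // Load: ask the context for memories, which delegates to your callback.
  const memories = await ctx.fetchMemories()

  // Hydrate manually: nothing lands in turnMemories unless it is added here.
  for (const memory of memories) {
    ctx.turnMemories.add(memory)
  }

  // Continue down the input pipeline with the memories in place.
  await next()
}
```

Hydrating before `next()` means every later input middleware, and the executor, sees the same loaded set.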

Use the Memory Battery

If you want model-managed memory CRUD without writing custom callbacks, import memoryTools from @nhtio/adk/batteries/tools/memory. The battery provides the tooling; your storage callbacks still decide what persists.

Write Patterns

Choose one audited path for your memory writes — either inside the executor or within output middleware. Scattered memory writes across arbitrary event listeners become an incident report with extra steps.

```ts
// Write path: inside the executor.
import type { DispatchExecutorFn } from '@nhtio/adk'
import { Memory } from '@nhtio/adk'

type ModelResponse = {
  detectedPreference?: string
}

declare function callModel(ctx: Parameters<DispatchExecutorFn>[0]): Promise<ModelResponse>

const executor: DispatchExecutorFn = async (ctx) => {
  try {
    // Run the model, get a response
    const response = await callModel(ctx)

    // Extract a preference from the model's response or from the user's message
    if (response.detectedPreference) {
      await ctx.storeMemory(new Memory({
        id: crypto.randomUUID(),
        content: response.detectedPreference,
        confidence: 0.8,
        importance: 0.6,
        createdAt: new Date(),
        updatedAt: new Date(),
      }))
    }

    ctx.ack()
  } catch (error) {
    ctx.nack(error instanceof Error ? error : new Error(String(error)))
  }
}
```
```ts
// Write path: output middleware, after the turn has succeeded.
import type { TurnPipelineMiddlewareFn } from '@nhtio/adk'
import { Memory } from '@nhtio/adk'

// Hypothetical helper supplied by your application, not part of ADK.
declare function extractFacts(messages: readonly unknown[]): Promise<string[]>

const memoryExtractionMiddleware: TurnPipelineMiddlewareFn = async (ctx, next) => {
  await next() // Let the turn complete successfully first

  // Analyze the turn's messages for memorable facts
  const newFacts = await extractFacts([...ctx.turnMessages])

  for (const fact of newFacts) {
    await ctx.storeMemory(new Memory({
      id: crypto.randomUUID(),
      content: fact,
      confidence: 0.8,
      importance: 0.6,
      createdAt: new Date(),
      updatedAt: new Date(),
    }))
  }
}
```
```ts
// A background worker runs out-of-band to synthesize facts across sessions.
// It writes memory records directly to your database.
// ADK has no role in this process; it only fetches the records during the turn.
```

Output pipeline only runs on success

turnOutputPipeline runs only when the turn succeeds. If the executor throws or the dispatch ends nacked, it is skipped, so critical session cleanup and non-negotiable state transitions belong somewhere else.
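One place such cleanup can live is a `finally` around whatever call starts the turn. The sketch below is a generic pattern, not an ADK API: `runTurnWithCleanup`, `runTurn`, and `cleanup` are hypothetical names, and ADK may offer its own hooks that are not shown here.

```typescript
// Hypothetical wrapper: `runTurn` stands in for however your app starts a turn.
async function runTurnWithCleanup<T>(
  runTurn: () => Promise<T>,
  cleanup: () => Promise<void>,
): Promise<T> {
  try {
    return await runTurn()
  } finally {
    // Runs whether the turn resolved or rejected, unlike output middleware.
    await cleanup()
  }
}
```

The original error still propagates to the caller after cleanup finishes, so failure handling upstream is unaffected.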

fetchMemoriesCallback — Read with Ranking

Your TurnRunnerConfig.fetchMemoriesCallback is your primary lever for context management. Rank and filter the results. Flooding the context window with raw database dumps is an expensive way to degrade model performance.

```typescript
import type { Memory, MemoryRetrievalFn } from '@nhtio/adk'

declare const memoryStore: {
  findBySession(sessionId: string): Promise<Memory[]>
}

declare function rankByRelevance(memories: Memory[], query: string): Memory[]

const fetchMemoriesCallback: MemoryRetrievalFn = async (ctx) => {
  // Keyed by session here; key by user ID instead if memories must
  // follow the user across sessions.
  const sessionId = ctx.stash.get<string | undefined>('sessionId')
  if (!sessionId) {
    return []
  }

  const allMemories = await memoryStore.findBySession(sessionId)

  // Rank by relevance to the current message
  const lastMessage = [...ctx.turnMessages].at(-1)
  const ranked = rankByRelevance(allMemories, lastMessage?.content?.toString() ?? '')

  // Respect context budget: return only the top N
  return ranked.slice(0, 10)
}
```

Context budget is your responsibility. If your callback returns hundreds of memories on every turn, your context window fills with noise before the executor even starts reasoning.
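As a rough illustration of ranking under a budget, here is a deliberately naive keyword-overlap ranker with a character cap. Everything in it is an assumption for the sake of the sketch: `RankableMemory` is a local stand-in that treats `content` as a plain string, and `rankByOverlap` and `withinBudget` are hypothetical helpers. A production ranker would more likely use embeddings or a search index.

```typescript
// Local stand-in; assumes memory content is a plain string.
interface RankableMemory {
  content: string
  importance: number
}

// Lowercase, split on non-alphanumerics, and drop very short tokens.
function tokenize(text: string): string[] {
  return text.toLowerCase().split(/[^a-z0-9]+/).filter((token) => token.length > 2)
}

// Score = number of query tokens found in the memory, plus its importance.
function rankByOverlap<T extends RankableMemory>(memories: T[], query: string): T[] {
  const queryTokens = tokenize(query)
  const scored = memories.map((memory) => {
    const memoryTokens = tokenize(memory.content)
    let overlap = 0
    for (const token of queryTokens) {
      if (memoryTokens.indexOf(token) !== -1) overlap += 1
    }
    return { memory, score: overlap + memory.importance }
  })
  scored.sort((a, b) => b.score - a.score)
  return scored.map((entry) => entry.memory)
}

// Keep the best-ranked memories that fit both an item cap and a character cap.
function withinBudget<T extends RankableMemory>(ranked: T[], maxItems: number, maxChars: number): T[] {
  const kept: T[] = []
  let usedChars = 0
  for (const memory of ranked) {
    if (kept.length >= maxItems) break
    if (usedChars + memory.content.length > maxChars) continue
    kept.push(memory)
    usedChars += memory.content.length
  }
  return kept
}
```

A character cap is only a proxy for tokens; swap in your provider's tokenizer if you need an exact budget.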

Memory Poisoning Defense

Memory is the highest-value attack target in an agentic system. Unlike a prompt injection that targets a single turn, poisoning your memory store corrupts every subsequent turn — indefinitely — until you query and scrub the database manually.

Unvalidated memory is a backdoor. Memory poisoning persists across every future session until you find the bad row.

If an attacker can feed malicious text to the agent, they can trick the model into calling storeMemory with instructions designed to hijack future execution.

Defend your system:

  1. Keep a single write path. Route memory writes through one audited path: the executor or output middleware. If telemetry or event handlers can also write to your database, you have created an untraceable side channel.
  2. Never store raw user text as trusted. If you must store unstructured user input as memories, keep trust and source metadata in your own persistence schema or validation layer. ADK Memory only has id, content, confidence, importance, createdAt, and updatedAt, so it has no field of its own for provenance.
  3. Audit memory writes. Log every invocation of storeMemoryCallback with its source, session context, and content.
  4. Enforce structured extraction. Force the model to extract validated schemas (key-value pairs, tagged entities) rather than accepting raw, unparsed text.
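The structured-extraction and audit steps above might look like the following in your own validation layer. This is a sketch under stated assumptions: `CandidateFact`, `AuditEntry`, the allow-list, the length cap, and the instruction-detecting regex are all hypothetical, and the regex is a crude heuristic rather than a complete defense.

```typescript
// Hypothetical shapes living in your own validation layer, not in ADK.
interface CandidateFact {
  key: string
  value: string
  source: 'user' | 'model' | 'tool'
}

interface AuditEntry {
  sessionId: string
  source: string
  key: string
  accepted: boolean
  reason: string | null
}

const ALLOWED_KEYS: readonly string[] = ['preferred_units', 'display_name', 'project_deadline']
const MAX_VALUE_LENGTH = 120
// Crude heuristic for instruction-like text; tune or replace for real use.
const INSTRUCTION_PATTERN = /\b(ignore|disregard|override)\b.*\b(instructions?|rules?|prompt)\b/i

// Returns null when the fact is acceptable, otherwise the rejection reason.
function rejectionReason(fact: CandidateFact): string | null {
  if (ALLOWED_KEYS.indexOf(fact.key) === -1) return 'unknown key'
  const value = fact.value.trim()
  if (value.length === 0 || value.length > MAX_VALUE_LENGTH) return 'value length out of bounds'
  if (INSTRUCTION_PATTERN.test(value)) return 'value looks like an instruction'
  return null
}

// Records every attempt, accepted or not, then reports whether to persist.
function validateAndAudit(fact: CandidateFact, sessionId: string, auditLog: AuditEntry[]): boolean {
  const reason = rejectionReason(fact)
  auditLog.push({
    sessionId,
    source: fact.source,
    key: fact.key,
    accepted: reason === null,
    reason,
  })
  return reason === null
}
```

Logging rejected attempts alongside accepted ones is the point: a burst of rejections from one session is often the first visible sign of a poisoning attempt.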

Memory Lifecycle Policy

Databases do not clean themselves. If you do not write a deletion and pruning strategy, your context window will eventually choke on outdated junk. Explicitly define:

  • Update vs. overwrite: If the user says they liked blue on Monday and red on Tuesday, does the old preference get overwritten? Appended as a history entry? Flagged for conflict resolution?
  • Expiry: Do memories from six months ago still apply? Implement TTL or staleness scoring in TurnRunnerConfig.fetchMemoriesCallback — stale facts belong in storage, not in the prompt.
  • Deletion: When a user asks the agent to "forget" something, your TurnRunnerConfig.deleteMemoryCallback must physically purge or soft-delete it from your storage layer.

Your TurnRunnerConfig.fetchMemoriesCallback encodes the read policy. Your TurnRunnerConfig.storeMemoryCallback and TurnRunnerConfig.mutateMemoryCallback encode the write policy. Your application logic (executor or output middleware) encodes the lifecycle policy. All three are yours to implement.
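For the expiry rule, one option is a staleness score applied inside your fetch callback before ranking. The sketch below assumes an exponential half-life decay weighted by `importance`; `DatedMemory`, `freshness`, `dropStale`, the 90-day half-life, and the 0.1 threshold are illustrative choices, not ADK behavior.

```typescript
// Local stand-in with only the fields this policy needs.
interface DatedMemory {
  content: string
  importance: number
  updatedAt: Date
}

const DAY_MS = 24 * 60 * 60 * 1000

// Exponential decay: 1.0 when just updated, 0.5 after one half-life.
function freshness(memory: DatedMemory, now: Date, halfLifeDays: number): number {
  const ageDays = Math.max(0, (now.getTime() - memory.updatedAt.getTime()) / DAY_MS)
  return Math.pow(0.5, ageDays / halfLifeDays)
}

// Keep memories whose importance-weighted freshness clears the threshold.
// Dropped records stay in storage; they are only kept out of the prompt.
function dropStale<T extends DatedMemory>(
  memories: T[],
  now: Date,
  halfLifeDays: number = 90,
  minScore: number = 0.1,
): T[] {
  return memories.filter(
    (memory) => memory.importance * freshness(memory, now, halfLifeDays) >= minScore,
  )
}
```

Weighting by importance lets a high-importance fact outlive a trivial one of the same age, which a flat TTL cannot express.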

What You Must Implement

  1. Memory storage — a database table, collection, or store that holds Memory records keyed by session or user ID.
  2. TurnRunnerConfig.fetchMemoriesCallback — query your storage, rank by relevance, and return within your context budget.
  3. turnInputPipeline middleware — call TurnContext.fetchMemories and .add() each result to ctx.turnMemories. ADK will not do this for you automatically.
  4. Write logic — choose one audited pattern (executor or output middleware) and implement memory extraction and validation.
  5. Lifecycle policy — define rules for updates, conflicts, expiry, and deletion.
  6. TurnRunnerConfig.mutateMemoryCallback and TurnRunnerConfig.deleteMemoryCallback — wire these to your storage layer so that updates and explicit deletions persist correctly.
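To make the storage side of this checklist concrete, here is a minimal in-memory store that your callbacks could delegate to while prototyping. It is a sketch, not ADK's storage contract: `StoredMemory` mirrors the Memory fields listed earlier, while `InMemoryMemoryStore` and its method signatures are hypothetical, and the actual callback signatures on TurnRunnerConfig are not shown on this page.

```typescript
// Mirrors the Memory fields named above; content is assumed to be a string.
interface StoredMemory {
  id: string
  content: string
  confidence: number
  importance: number
  createdAt: Date
  updatedAt: Date
}

type MemoryPatch = Partial<Pick<StoredMemory, 'content' | 'confidence' | 'importance'>>

// Hypothetical prototype store keyed by session ID, then by memory ID.
class InMemoryMemoryStore {
  private readonly sessions = new Map<string, Map<string, StoredMemory>>()

  // Storing a record with an existing ID overwrites it.
  async store(sessionId: string, memory: StoredMemory): Promise<void> {
    let records = this.sessions.get(sessionId)
    if (!records) {
      records = new Map<string, StoredMemory>()
      this.sessions.set(sessionId, records)
    }
    records.set(memory.id, memory)
  }

  // Applies a partial update and bumps updatedAt; false if the record is missing.
  async mutate(sessionId: string, id: string, patch: MemoryPatch, now: Date): Promise<boolean> {
    const records = this.sessions.get(sessionId)
    const existing = records ? records.get(id) : undefined
    if (!records || !existing) return false
    records.set(id, { ...existing, ...patch, updatedAt: now })
    return true
  }

  // Hard delete; returns whether a record was actually removed.
  async delete(sessionId: string, id: string): Promise<boolean> {
    const records = this.sessions.get(sessionId)
    return records ? records.delete(id) : false
  }

  async findBySession(sessionId: string): Promise<StoredMemory[]> {
    const records = this.sessions.get(sessionId)
    return records ? Array.from(records.values()) : []
  }
}
```

Swapping this for a real database later only requires keeping the same four methods, so the callbacks that delegate to it do not need to change.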