The executor seam

The executor is one DispatchExecutorFn callback. The runner invokes it once per iteration with two arguments — the dispatch context ctx (DispatchContext) and the streaming helpers helpers (DispatchExecutorHelpers) — and reads what the executor signals to decide whether to loop again. What happens between the invocation and the signal is the executor's territory.

LLM Dispatch covers the dispatch contract and the iteration loop overview.

The callback shape

```ts
type DispatchExecutorFn = (ctx: DispatchContext, helpers: DispatchExecutorHelpers) => Promise<void>
```
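As a minimal sketch of that shape, here is a fixture-style executor with no model behind it. The two interfaces are hypothetical, heavily trimmed stand-ins so the block compiles on its own; the zero-argument ack and the opts.isComplete field are assumptions, and persistence is left out for brevity.

```ts
// Hypothetical, heavily trimmed stand-ins so the sketch compiles on its own.
// The real DispatchContext and DispatchExecutorHelpers expose far more.
interface DispatchContext {
  iteration: number
  ack(): void // assumption: takes no arguments
  nack(error: Error): void
}

interface DispatchExecutorHelpers {
  // assumption: opts carries the isComplete flag that seals the stream
  reportMessage(id: string, aDelta: string, opts?: { isComplete?: boolean }): void
}

type DispatchExecutorFn = (ctx: DispatchContext, helpers: DispatchExecutorHelpers) => Promise<void>

// A fixture-style executor: no model on the other side, just a canned reply.
const cannedExecutor: DispatchExecutorFn = async (ctx, helpers) => {
  helpers.reportMessage('msg-1', 'Hello from a fixture.', { isComplete: true })
  ctx.ack() // signal that the dispatch is done
}
```

Because the seam is only this callback, swapping the canned reply for a hosted API or a local runtime changes the body and nothing else.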

The executor is the integration

Your model client stays yours. The runner does not embed it, does not assume its wire shape, and does not care whether there is a model at all. A hosted API, a local runtime, an in-browser runtime, a recorded fixture, a deterministic policy module — same seam, same contract. ADK is permissive about what is on the other side and intolerant about the boundary itself.

What the runner puts on ctx

What the runner puts on ctx is what the ADK has decided the model should see this iteration:

  • ctx.systemPrompt (see DispatchContext.systemPrompt)
  • ctx.standingInstructions (see DispatchContext.standingInstructions)
  • ctx.turnMemories (see DispatchContext.turnMemories)
  • ctx.turnRetrievables (see DispatchContext.turnRetrievables)
  • ctx.turnMessages (see DispatchContext.turnMessages)
  • ctx.turnThoughts (see DispatchContext.turnThoughts)
  • ctx.turnToolCalls (see DispatchContext.turnToolCalls)
  • ctx.tools (see DispatchContext.tools)

Earlier middleware filled those collections; later iterations see whatever the previous iteration persisted. The executor reads.

What helpers gives the executor

What the runner gives the executor through helpers is a streaming surface — helpers.reportMessage(id, aDelta, opts?), helpers.reportThought(id, aDelta, opts?), helpers.reportToolCall(id, partial), plus the structured helpers.log channel. Helpers accumulate per-id state across iterations, emit normalised TurnStreamableContent / TurnToolCallContent payloads to whoever is listening, and seal a stream when isComplete: true is set. Helpers do not persist; they stream the wire shape.

aDelta is an additive delta. Pass the new chunk, not the running total.

The aDelta argument is the incremental text added since the previous emission for that id: the new chunk to append. The helper concatenates it onto the per-id buffer and emits the running full. If the executor passes the full accumulated text every time, the helper appends that onto what it already holds, so every emission repeats all earlier text and the emitted full balloons with each chunk. For the chunks "He", "llo", "!", passing running totals yields "HeHelloHello!" instead of "Hello!".

Wire shape from a streaming SDK is almost always already in additive-delta form (chunk.delta, chunk.content, etc.) — pass it through unchanged. If you only have a running total from your provider, compute the delta yourself before calling report*.
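For providers that only expose a running total, the delta can be derived by remembering the previous total per id. The sketch below is one way to do that; createDeltaTracker and appendAll are illustrative names, not ADK functions, and appendAll is only a toy stand-in for the helper's per-id buffer.

```ts
// Returns a function that turns a provider's running total into an additive
// delta, tracked per stream id. Names here are illustrative, not ADK API.
function createDeltaTracker(): (id: string, runningTotal: string) => string {
  const lastTotal = new Map<string, string>()
  return (id, runningTotal) => {
    const previous = lastTotal.get(id) ?? ''
    if (runningTotal.slice(0, previous.length) !== previous) {
      // The provider rewrote earlier text; an append-only delta cannot express that.
      throw new Error(`stream ${id}: running total no longer extends the previous one`)
    }
    lastTotal.set(id, runningTotal)
    return runningTotal.slice(previous.length)
  }
}

// Toy stand-in for the helper's per-id buffer: append every reported delta.
function appendAll(deltas: string[]): string {
  return deltas.reduce((buffer, delta) => buffer + delta, '')
}

const totals = ['He', 'Hello', 'Hello!']
const toDelta = createDeltaTracker()

const wrong = appendAll(totals) // running totals passed straight through
const right = appendAll(totals.map((total) => toDelta('msg-1', total)))

console.log(wrong) // "HeHelloHello!"
console.log(right) // "Hello!"
```

The tracker throws when a new total does not extend the previous one; a real executor might prefer to nack or start a fresh id in that case.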

What the executor calls on ctx to write

What the executor calls on ctx to write is the persistence surface — ctx.storeMessage(record) (see DispatchContext.storeMessage), ctx.storeThought(record) (see DispatchContext.storeThought), ctx.storeToolCall(record) (see DispatchContext.storeToolCall), plus the matching mutate* and delete* family. Persistence stores the canonical record, which carries fields the wire shape does not (role, identity, Tokenizable content, replayCompatibility, …).

Helpers and persistence are deliberately decoupled for two reasons. Storage is asynchronous; event consumption usually is not — observers and UI listeners want to render the next delta the moment it lands, not after a database round-trip. And storage has latency and per-write cost; streaming deltas through it would thrash any real storage layer. The convention is to emit per delta via helpers.report* and persist once per logical record via ctx.store* after the stream seals. You can wire your storage adapter to the event bus — the ADK will not stop you — but you are taking on the latency and write-amplification yourself.
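The emit-per-delta, persist-once convention can be sketched as follows. The interfaces are hypothetical and trimmed (the real canonical record carries more fields than id, role, and content), and sealing with an empty delta plus isComplete: true is an assumption about how the helper accepts a final call.

```ts
// Hypothetical, trimmed shapes so the sketch stands alone.
interface StreamOpts { isComplete?: boolean }
interface StreamHelpers {
  reportMessage(id: string, aDelta: string, opts?: StreamOpts): void
}
interface MessageRecord { id: string; role: 'assistant'; content: string }
interface PersistCtx {
  storeMessage(record: MessageRecord): Promise<void>
}

async function streamThenPersist(
  ctx: PersistCtx,
  helpers: StreamHelpers,
  id: string,
  chunks: AsyncIterable<string>,
): Promise<void> {
  let full = ''
  for await (const delta of chunks) {
    full += delta
    helpers.reportMessage(id, delta) // one emission per delta, no storage round-trip
  }
  // Assumption: an empty delta with isComplete seals the stream.
  helpers.reportMessage(id, '', { isComplete: true })
  // One write for the whole logical record, after the stream has sealed.
  await ctx.storeMessage({ id, role: 'assistant', content: full })
}
```

Listeners see every chunk as it arrives, while the storage layer sees a single write regardless of how many chunks the provider produced.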

Invoking tools

Tool handlers belong inside the executor iteration that proposed them

Nothing in the runtime enforces this; the convention is load-bearing anyway. The executor invokes tool handlers. Move tool execution later and the model never sees the result. Invoke a handler twice and its side effects fire twice. That is the boundary. When the model returns a tool call on iteration N, the executor calls tool.executor(ctx)(args) inside that same iteration, persists the completed ToolCall record (with results populated) via ctx.storeToolCall(...), and only then returns, so that iteration N+1's model call sees the tool result in ctx.turnToolCalls and can reason about it. That two-iteration round trip (model proposes → executor calls handler → executor persists → next iteration sees result) is the convention that makes the dispatch loop useful.

What this means for middleware authors: do not re-invoke tool handlers from turnOutputPipeline or anywhere else after the executor already handled them. Doing it after the loop has exited means the model never saw the result, defeating the loop. Doing it from dispatchOutputPipeline is also wrong if the executor in use already invoked them — you double-fire side effects. Tool execution is the executor's responsibility by convention; pipeline middleware sees the resulting ToolCall records and reacts to them, rather than re-running them.

The reference OpenAIChatCompletionsAdapter follows this convention: it drains streamed tool-call deltas, validates args, calls tool.executor(ctx)(args), wraps the result, and stores the completed record — all before returning from the executor body.

tool.executor(ctx)(args) (see Tool.executor) is the only authorised entry point to a tool's handler — it validates args against the tool's schema, fires toolExecutionStart / toolExecutionEnd, computes the stable callId checksum, and wraps downstream errors as E_TOOL_DOWNSTREAM_ERROR. See Tools.
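A rough sketch of the in-iteration round trip, under stated assumptions: the shapes below are hypothetical and trimmed, ctx.tools is assumed to be an array searchable by name, and tool.executor(ctx)(args) is assumed to return a promise. Only the curried call shape and ctx.storeToolCall come from the text above.

```ts
// Hypothetical, trimmed shapes; the real Tool and ToolCall types carry more.
interface ToolCallRecord { id: string; name: string; args: unknown; results?: unknown }
interface ToolCtx {
  tools: Tool[]
  storeToolCall(record: ToolCallRecord): Promise<void>
}
interface Tool {
  name: string
  executor(ctx: ToolCtx): (args: unknown) => Promise<unknown>
}

// Runs one proposed tool call inside the iteration that proposed it.
async function handleProposedToolCall(ctx: ToolCtx, proposed: ToolCallRecord): Promise<void> {
  const tool = ctx.tools.find((candidate) => candidate.name === proposed.name)
  if (!tool) throw new Error(`model proposed an unknown tool: ${proposed.name}`)
  // Invoke the handler exactly once, through the tool's own entry point.
  const results = await tool.executor(ctx)(proposed.args)
  // Persist the completed record so the next iteration can see the result.
  await ctx.storeToolCall({ ...proposed, results })
}
```

An executor would call this for each tool call the model proposed and then return without signalling, leaving the next iteration to read the results.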

What the executor returns

The callback's return type is Promise<void>, so what the executor "returns" is really a signal. There are three, and they are how the runner decides what to do next:

  • ctx.ack (see DispatchContext.ack): the dispatch is done. The runner runs the iteration's output middleware, flushes deltas, and exits the loop.
  • ctx.nack(error) (see DispatchContext.nack): the dispatch failed. Same flush and exit, but dispatchEnd.status === 'nack' and dispatchEnd.error carries the cause.
  • Return without signalling. The runner increments ctx.iteration (see DispatchContext.iteration) and re-enters the loop. The next iteration sees ctx.turnMessages, ctx.turnThoughts, and ctx.turnToolCalls populated with whatever the executor persisted during this one. This is how the loop "gives the model its tool results back": persist completed ToolCall records (with ToolCall.results), return, and they appear in the next iteration's context.

Apart from throwing, nothing else the executor does terminates the loop. Emitting through helpers does not. Persisting does not. Throwing does (wrapped as E_LLM_EXECUTION_EXECUTOR_ERROR), but that is a failure surface, not a control-flow primitive.
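To make the three signals concrete, here is a toy model of the loop. It is not the ADK runner: toyRunner, the 'threw' and 'gave-up' statuses, and the maxIterations guard are all invented for illustration, and output middleware and delta flushing are left out.

```ts
// Toy model only; every name below is hypothetical.
interface ToyCtx {
  iteration: number
  ack(): void
  nack(error: Error): void
}
type ToyExecutor = (ctx: ToyCtx) => Promise<void>
interface ToyOutcome {
  status: 'ack' | 'nack' | 'threw' | 'gave-up'
  iteration: number
  error?: unknown
}

async function toyRunner(executor: ToyExecutor, maxIterations = 8): Promise<ToyOutcome> {
  const state: { signal?: 'ack' | 'nack'; error?: Error } = {}
  const ctx: ToyCtx = {
    iteration: 0,
    ack: () => { state.signal = 'ack' },
    nack: (error) => { state.signal = 'nack'; state.error = error },
  }
  while (ctx.iteration < maxIterations) {
    try {
      await executor(ctx)
    } catch (error) {
      // A throw ends the loop, but as a failure surface rather than a signal.
      return { status: 'threw', iteration: ctx.iteration, error }
    }
    if (state.signal) {
      // ack or nack: exit the loop with that status.
      return { status: state.signal, iteration: ctx.iteration, error: state.error }
    }
    // Returned without signalling: bump the counter and go around again.
    ctx.iteration += 1
  }
  return { status: 'gave-up', iteration: ctx.iteration }
}
```

An executor that returns silently on iterations 0 and 1 and calls ctx.ack() on iteration 2 ends this toy loop with status 'ack' at iteration 2.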

The reference battery is an example, not a template

The OpenAIChatCompletionsAdapter battery is one executor: it projects ADK primitives into chat-completions wire shape, streams SSE, retries with backoff, dispatches tool calls inline, and nacks with stable exception codes. Read it to see the seam exercised end-to-end. Don't copy its provider plumbing blindly. Do copy the boundary discipline: stream through helpers, persist canonical records, invoke tool handlers inside the iteration, and signal deliberately. Your executor can look different; it does not get to blur those seams.

ADK-side facts and helpers vs persistence

A handful of ADK-side facts constrain the executor's surface (store queueing, sealed-stream rules, abort wiring), and helpers and persistence are deliberately decoupled because the wire shape and the canonical record are not the same data.

→ Continue reading: ADK-side facts and helpers vs persistence