LLM batteries

Writing custom HTTP fetch blocks, manual Server-Sent Events (SSE) streaming loops, and retry logic by hand is usually not the interesting part of your agent. Use the batteries unless you are deliberately replacing the execution loop.

typescript
executorCallback: new OpenAIChatCompletionsAdapter({ model, apiKey, autoAck: true }).executor()

That is it. That single line resolves the entire DispatchExecutorFn interface. autoAck: true tells the executor to call ctx.ack() automatically after a tool-call-free response; the default is false, meaning the implementor owns turn completion.

ADK ships two LLM batteries: OpenAIChatCompletionsAdapter and WebLLMChatCompletionsAdapter. They satisfy DispatchExecutorFn directly.

They handle SSE streaming, token math, safety envelopes, tool call dispatching, artifact forging, and transient error recovery. They are not small convenience helpers. They are executor implementations.

Compatible Endpoints

OpenAIChatCompletionsAdapter

The adapter works against any endpoint speaking the OpenAI Chat Completions wire format:

  • Cloud model APIs — anything that natively speaks this wire shape.
  • Self-hosted inference servers — any server that exposes a Chat Completions-compatible HTTP interface.
  • Proxy gateways and routing layers that expose a standard /v1/chat/completions interface.

Point baseURL at your endpoint. The adapter sends standard HTTP. It does not care what sits behind it.

WebLLMChatCompletionsAdapter

Runs models locally in the browser or in supported JS runtimes via WebGPU. Use it for local-first or zero-server deployments. It accepts WebLLMChatCompletionsAdapterOptions to configure loading and cache policies.

Construction and Validation

The constructor validates baseline options as soon as the adapter is constructed, so config bugs fail loud and fast. Pass junk into OpenAIChatCompletionsAdapter and it throws E_INVALID_OPENAI_CHAT_COMPLETIONS_OPTIONS right away; pass junk into WebLLMChatCompletionsAdapter and it throws E_INVALID_WEBLLM_CHAT_COMPLETIONS_OPTIONS. Merged executor and stash overrides are revalidated at dispatch time.

typescript
import { OpenAIChatCompletionsAdapter } from '@nhtio/adk/batteries/llm/openai_chat_completions'

const adapter = new OpenAIChatCompletionsAdapter({
  model: process.env.MODEL_ID!,
  apiKey: process.env.API_KEY,
})

model is the only strictly required field. Everything else is optional; some fields have runtime defaults.

Validation on Overrides

Bypassing the constructor does not bypass validation. If you inject malformed config into executor overrides or the iteration stash, OpenAIChatCompletionsAdapter will throw E_INVALID_OPENAI_CHAT_COMPLETIONS_OPTIONS and WebLLMChatCompletionsAdapter will throw E_INVALID_WEBLLM_CHAT_COMPLETIONS_OPTIONS at dispatch time.

Three-Layer Options Merging

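The adapter merges configuration from three sources at each iteration. In ascending order of precedence they are the constructor options, the overrides passed to executor(), and the iteration stash: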

typescript
// Lowest precedence - the global fallback config
const adapter = new OpenAIChatCompletionsAdapter({
  model: 'gpt-4o',
  apiKey: process.env.OPENAI_API_KEY,
  temperature: 0.7,
  autoAck: true,
})

typescript
// Mid precedence - applies to every turn run by this TurnRunner
const runner = new TurnRunner({
  ...storageAdapter,
  executorCallback: adapter.executor({
    temperature: 0.2, // Overrides 0.7 constructor baseline
    max_completion_tokens: 1024,
  }),
})

typescript
// Highest precedence - dynamic adjustments for a single iteration
const costControlMiddleware: DispatchPipelineMiddlewareFn = async (ctx, next) => {
  ctx.stash.set(OpenAIChatCompletionsAdapter.STASH_KEY, {
    model: 'gpt-4o-mini', // Downgrade model dynamically
    temperature: 0.0,
  })
  await next()
}

Bracket Access Mismatch

ctx.stash is a Registry instance, not a plain object. Bracket assignment like ctx.stash[STASH_KEY] = ... will not type-check, and the adapter reads only via .get(). Use ctx.stash.set(OpenAIChatCompletionsAdapter.STASH_KEY, ...).

Merging Rules

  • For headers, helpers, and retry: layers are merged key-by-key. A stash override that sets one custom header does not clear the headers defined in your constructor.
  • For all other fields: the highest-precedence layer with a defined value completely replaces lower-precedence configurations.
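
The two merging rules can be illustrated with a minimal, self-contained sketch. SketchOptions and mergeLayers are hypothetical names invented for this example and cover only a few fields (helpers is left out for brevity); the adapter's real merge code may differ in detail.

```typescript
// Hypothetical, trimmed-down option shape used only for this illustration.
type SketchOptions = {
  model?: string
  temperature?: number
  headers?: Record<string, string>
  retry?: Record<string, number>
}

// Merge layers given in ascending precedence (constructor, executor, stash).
function mergeLayers(...layers: SketchOptions[]): SketchOptions {
  const merged: SketchOptions = {}
  for (const layer of layers) {
    // Plain fields: a defined value in a higher layer replaces the lower one.
    if (layer.model !== undefined) merged.model = layer.model
    if (layer.temperature !== undefined) merged.temperature = layer.temperature
    // Keyed fields: merge key-by-key so keys from lower layers survive.
    if (layer.headers !== undefined) merged.headers = { ...merged.headers, ...layer.headers }
    if (layer.retry !== undefined) merged.retry = { ...merged.retry, ...layer.retry }
  }
  return merged
}

const constructorLayer: SketchOptions = {
  model: 'gpt-4o',
  temperature: 0.7,
  headers: { 'x-team': 'search' },
}
const executorLayer: SketchOptions = { temperature: 0.2 }
const stashLayer: SketchOptions = { model: 'gpt-4o-mini', headers: { 'x-trace': 'abc' } }

// model comes from the stash, temperature from the executor layer,
// and headers contains both x-team and x-trace.
const effective = mergeLayers(constructorLayer, executorLayer, stashLayer)
console.log(effective)
```

Note that a layer which leaves a field undefined never clears a value set by a lower layer; only a defined value overrides.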

ADK Control Fields

These fields configure the adapter's runtime behavior:

| Field | Type | Purpose |
| --- | --- | --- |
| model | string | Required. Model identifier passed to the model endpoint. |
| apiKey | string | Bearer token for endpoint authentication. |
| baseURL | string | Endpoint URL. Defaults to https://api.openai.com/v1. |
| headers | Record<string, string> | Custom HTTP headers sent with every request. |
| stream | boolean | Toggles SSE streaming. Default true. |
| streamIdleTimeoutMs | number | Aborts request if the stream goes silent for this period. |
| requestTimeoutMs | number | Absolute timeout limit for the entire HTTP transaction. |
| retry | ChatCompletionsRetryConfig | Custom retry configuration for handling transient errors. |
| fetch | typeof globalThis.fetch | Custom HTTP fetch engine. |
| contextWindow | number | Total context budget; the adapter throws if this threshold is crossed. |
| tokenEncoding | TokenEncoding \| null | Token encoding used for local context calculations. Non-null requires contextWindow. |
| selfIdentity | string | Identifies the model for cleaning up raw reasoning traces. |
| thoughtSurfacing | 'all-self' \| 'latest-self' \| 'all' | Controls which persisted thoughts are replayed into model history. |
| replayCompatibility | ReadonlyArray<string> | Forwards reasoning steps to compatibility-constrained endpoints. |
| bucketOrder | ChatCompletionsBucketOrder | Sets the sorting order for system prompt segments. |
| helpers | Partial<ChatCompletionsHelpers> | Overrides specific translation steps. |
| autoAck | boolean | Automatically calls ctx.ack() after a tool-call-free response. Default false. |
| strictToolChoice | boolean | Halts execution if tool_choice demands an ephemeral artifact tool. Default false. |
| unsupportedMediaPolicy | string | Strategy when media inputs are incompatible with model modalities. Default 'throw'. |

autoAck

autoAck defaults to false. When false, the executor stores the assistant message and reports it, but does not call ctx.ack() — turn completion is the implementor's responsibility. This is the right default: auto-acking seizes turn-completion control from the output pipeline and prevents any quality gate (output filter, confidence check, human-in-the-loop approval) from running before the turn is declared done.

Set autoAck: true when you are building a single-shot executor with no output-side gate and you want the executor to own the full lifecycle. Every example in this page that wires an adapter directly into a TurnRunner sets autoAck: true so the turn ends after the first tool-call-free response. If you are building a pipeline that gates on output content, omit autoAck and call ctx.ack() yourself after your gate passes.
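
To make the gated case concrete, here is a minimal sketch of an output gate that acks only after a check passes. AckContext, passesGate, and completeTurnIfAcceptable are hypothetical names for this example, and the fake context stands in for the real dispatch context, of which only ack() is assumed.

```typescript
// Hypothetical stand-in for the small part of the dispatch context used here.
interface AckContext {
  ack(): void
}

// A hypothetical quality gate: accept only non-empty replies under a length cap.
function passesGate(reply: string, maxChars: number): boolean {
  const trimmed = reply.trim()
  return trimmed.length > 0 && trimmed.length <= maxChars
}

// With autoAck omitted, the caller decides when the turn is done.
function completeTurnIfAcceptable(ctx: AckContext, reply: string): boolean {
  if (!passesGate(reply, 2000)) {
    return false // no ack; what happens next is up to your pipeline
  }
  ctx.ack()
  return true
}

// Usage with a fake context that counts ack() calls.
let ackCount = 0
const fakeCtx: AckContext = {
  ack: () => {
    ackCount += 1
  },
}

const rejected = completeTurnIfAcceptable(fakeCtx, '   ')
const accepted = completeTurnIfAcceptable(fakeCtx, 'The invoice total is 42 EUR.')
console.log({ rejected, accepted, ackCount })
```

The point of the design is that ack() sits behind the gate, so a blank or oversized reply never ends the turn by accident.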

Model Request Body Fields

Request body fields that the schema supports but that are not part of the ADK control group are forwarded in the JSON request body:

typescript
const adapter = new OpenAIChatCompletionsAdapter({
  model: 'gpt-4o',
  temperature: 0.7,
  max_completion_tokens: 2048,
  response_format: { type: 'json_object' },
  reasoning_effort: 'high',
  seed: 42,
})

Supported fields include: temperature, top_p, max_tokens, max_completion_tokens, stop, seed, presence_penalty, frequency_penalty, logit_bias, logprobs, top_logprobs, n, parallel_tool_calls, tool_choice, response_format, reasoning_effort, service_tier, store, metadata, and user.

Automatic Tool Forging

The Chat Completions adapter handles SpooledArtifact tool forging internally — it calls SpooledArtifact.forgeTools() for you.

Manual .bindContext() plumbing in your pipelines is unnecessary for local iteration-scope tools. During each dispatch iteration the adapter merges the forged tools into the registry via ToolRegistry.merge([ctx.tools, ...forged], { onCollision: 'replace' }), then calls mergedRegistry.bindContext(ctx).
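
The collision behavior can be pictured with a small sketch. It assumes that onCollision: 'replace' means a later tool with the same name replaces an earlier one; SketchTool, mergeReplacing, and the tool names are hypothetical and do not reflect the real ToolRegistry implementation.

```typescript
// Hypothetical minimal tool shape; real ADK tools carry much more.
interface SketchTool {
  name: string
  description: string
}

// Merge tool lists in order. On a name collision the later entry replaces the
// earlier one, which is the assumed meaning of onCollision: 'replace'.
function mergeReplacing(groups: SketchTool[][]): SketchTool[] {
  const byName = new Map<string, SketchTool>()
  for (const group of groups) {
    for (const tool of group) {
      byName.set(tool.name, tool)
    }
  }
  return Array.from(byName.values())
}

const contextTools: SketchTool[] = [
  { name: 'search', description: 'registered by the pipeline' },
  { name: 'read_artifact', description: 'stale copy' },
]
const forgedTools: SketchTool[] = [
  { name: 'read_artifact', description: 'forged for this iteration' },
]

// Two tools remain; read_artifact is the forged version.
const mergedTools = mergeReplacing([contextTools, forgedTools])
console.log(mergedTools.map((tool) => `${tool.name}: ${tool.description}`))
```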

Overriding Translation Helpers

The adapter uses 18 translation hooks defined under ChatCompletionsHelpers to format core ADK types into standard Chat Completions message payloads. You do not need to rewrite all 18 from scratch; pass the specific fields you want to override via options.helpers.

typescript
const adapter = new OpenAIChatCompletionsAdapter({
  model: 'gpt-4o',
  autoAck: true,
  helpers: {
    renderStandingInstructions: (items) => {
      // items is Iterable<Tokenizable>
      return Array.from(items, (item) => `[INSTRUCTION]: ${String(item)}`).join('\n')
    },
    renderUntrustedContent: (content, attrs) => {
      return `[UNTRUSTED DATA id=${attrs.nonce}]\n${content}\n[END UNTRUSTED]`
    },
  },
})

The translation interface functions:

| Helper Hook | Purpose |
| --- | --- |
| renderUntrustedContent | Fences third-party content using randomized nonces. |
| renderTrustedContent | Formats safe, first-party content blocks. |
| renderStandingInstructions | Compiles Iterable<Tokenizable> into a system prompt section. |
| renderMemories | Translates Iterable<{ memory: Memory; attrs: MemoryAttrs }> loaded memory records. |
| renderRetrievableSafetyDirective | Prepends instructions alerting the model to retrieval content boundaries. |
| renderFirstPartyRetrievables | Formats safe Iterable<{ retrievable: Retrievable; attrs: RetrievableAttrs }> records. |
| renderThirdPartyPublicRetrievables | Formats untrusted public search indexing records. |
| renderThirdPartyPrivateRetrievables | Formats restricted third-party data extractions. |
| renderRetrievables | Top-level dispatcher orchestrating the safe rendering of all retrievals. |
| renderTimelineMessage | Translates a single Message timeline record. |
| renderThought | Encapsulates model-generated chain-of-thought metadata. |
| filterThoughts | Truncates or selects thoughts according to the configured thoughtSurfacing policy. |
| toolsToChatCompletionsTools | Formats ADK Tool instances into API tool declarations. |
| renderChatCompletionsSystemPrompt | Concatenates all context blocks into the final primary system instructions. |
| renderChatCompletionsToolCallResult | Tool result → tool message content. |
| descriptionToChatCompletionsJsonSchema | Maps ADK type descriptions down to strict JSON schemas. |
| buildChatCompletionsHistory | Constructs the absolute request message list combining history, memories, system prompts, and tool sequences. |
| createChatCompletionsToolCallDeltaAccumulator | Manages streaming string accumulation for building completed tool structures. |
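
To show what the last hook in the list has to deal with, here is a minimal sketch of accumulating streamed tool-call fragments. It follows the Chat Completions streaming shape, where each fragment carries an index and the arguments string arrives in pieces. createSketchAccumulator and its push/finish methods are hypothetical and are not the ADK helper's actual signature.

```typescript
// One streamed tool-call fragment, following the Chat Completions wire shape.
interface ToolCallDelta {
  index: number
  id?: string
  function?: { name?: string; arguments?: string }
}

interface CompletedToolCall {
  id: string
  name: string
  arguments: string
}

// Hypothetical accumulator for illustration only.
function createSketchAccumulator() {
  const partial = new Map<number, CompletedToolCall>()
  return {
    push(delta: ToolCallDelta): void {
      const entry = partial.get(delta.index) ?? { id: '', name: '', arguments: '' }
      // id and name normally arrive once, in the first fragment for an index.
      if (delta.id) entry.id = delta.id
      if (delta.function?.name) entry.name = delta.function.name
      // arguments is a JSON string split across fragments; concatenate in order.
      if (delta.function?.arguments) entry.arguments += delta.function.arguments
      partial.set(delta.index, entry)
    },
    finish(): CompletedToolCall[] {
      return Array.from(partial.entries())
        .sort((a, b) => a[0] - b[0])
        .map((pair) => pair[1])
    },
  }
}

const accumulator = createSketchAccumulator()
accumulator.push({ index: 0, id: 'call_1', function: { name: 'get_weather', arguments: '' } })
accumulator.push({ index: 0, function: { arguments: '{"city":' } })
accumulator.push({ index: 0, function: { arguments: '"Oslo"}' } })

// One completed call whose arguments string is now valid JSON.
const completedCalls = accumulator.finish()
console.log(completedCalls)
```

The arguments string should only be parsed after the stream finishes, since any intermediate state is usually incomplete JSON.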

The Battery as Reference Implementation

If you are determined to write a custom executor, study the OpenAIChatCompletionsAdapter source first. It is the broadest execution loop in the codebase. Pay specific attention to:

  • How configuration layers are merged securely and validated before calling the model.
  • How context components (ctx.turnMessages, ctx.turnMemories, ctx.turnRetrievables, and ctx.tools) are merged dynamically.
  • How SSE chunks are parsed, and how streamIdleTimeoutMs prevents silent hangs.
  • How the executor reports messages, thoughts, and tool calls via DispatchExecutorHelpers.
  • How the system ensures ctx.ack() and ctx.nack() are executed deterministically, especially when requests fail.
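
As a companion to the SSE point in the list above, here is a minimal sketch of incremental SSE parsing. createSseParser is a hypothetical name, not the adapter's internals. The sketch assumes events are separated by a blank line with \n line endings, reads only data: fields, and does not show the idle-timeout handling.

```typescript
// Hypothetical incremental parser: feed raw text chunks as they arrive and
// get back the data payloads of every event completed so far.
function createSseParser(): (chunk: string) => string[] {
  let buffer = ''
  return (chunk: string): string[] => {
    buffer += chunk
    const payloads: string[] = []
    let boundary = buffer.indexOf('\n\n')
    while (boundary !== -1) {
      const rawEvent = buffer.slice(0, boundary)
      buffer = buffer.slice(boundary + 2)
      const data = rawEvent
        .split('\n')
        .filter((line) => line.startsWith('data:'))
        // Drop the field name and the single optional space after the colon.
        .map((line) => line.slice(5).replace(/^ /, ''))
        .join('\n')
      if (data.length > 0) payloads.push(data)
      boundary = buffer.indexOf('\n\n')
    }
    return payloads
  }
}

// Network chunks can split an event anywhere, even inside the [DONE] marker.
const feedSse = createSseParser()
const receivedPayloads: string[] = []
for (const chunk of ['data: {"a":', '1}\n\ndata: [DO', 'NE]\n\n']) {
  receivedPayloads.push(...feedSse(chunk))
}

// receivedPayloads is now ['{"a":1}', '[DONE]'].
console.log(receivedPayloads)
```

Buffering until the blank-line boundary is what keeps a payload that was split across two network chunks from being parsed as broken JSON.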