@nhtio/adk/batteries/llm/webllm_chat_completions/adapter
Cross-environment executor adapter for WebLLM Chat Completions compatible endpoints.
Remarks
Cross-environment LLM adapter for the WebLLM Chat Completions wire shape. Chat Completions was chosen as the ADK's reference adapter because it is the de-facto interchange format for the majority of OpenAI-compatible gateways (vLLM, Together, Groq, Fireworks, Ollama, Azure OpenAI, OpenRouter, DeepSeek, Mistral La Plateforme, and most self-hosted deployments). Its tool-call synthetic-history shape (`role: 'assistant', tool_calls: [...]` followed by `role: 'tool'` with `tool_call_id`) is the lowest common denominator that every conformant gateway accepts.
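As a rough illustration of that synthetic-history shape, the sketch below builds the assistant/tool message pair for one completed tool call. The local types and the `toolCallToHistory` helper are hypothetical and written only for this example; they are not the adapter's exports, and the tool name and payloads are made up.

```typescript
// Minimal local types mirroring the Chat Completions message shapes.
// These are illustrative stand-ins, not types exported by the adapter.
type ToolCallWire = {
  id: string;
  type: "function";
  // `arguments` is a JSON-encoded string on the wire, not an object.
  function: { name: string; arguments: string };
};

type ChatMessage =
  | { role: "system" | "user"; content: string }
  | { role: "assistant"; content: string | null; tool_calls?: ToolCallWire[] }
  | { role: "tool"; tool_call_id: string; content: string };

// Hypothetical helper: renders one finished tool call as the two-message
// pair (assistant request, then tool result) linked by the call id.
function toolCallToHistory(
  id: string,
  name: string,
  args: unknown,
  result: string,
): ChatMessage[] {
  return [
    {
      role: "assistant",
      content: null,
      tool_calls: [
        { id, type: "function", function: { name, arguments: JSON.stringify(args) } },
      ],
    },
    { role: "tool", tool_call_id: id, content: result },
  ];
}

const history: ChatMessage[] = [
  { role: "user", content: "What is the weather in Oslo?" },
  ...toolCallToHistory("call_1", "get_weather", { city: "Oslo" }, '{"tempC": 4}'),
];
```

The `tool_call_id` on the `tool` message must match the `id` inside the preceding assistant message's `tool_calls` entry; that link is what lets a gateway pair a result with its request.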
The adapter is built around three pluggable layers:
- Translation helpers: the thirteen swappable functions exported from `./helpers` turn ADK primitives (@nhtio/adk!Tokenizable, @nhtio/adk!Memory, @nhtio/adk!Message, @nhtio/adk!Thought, @nhtio/adk!ToolCall, @nhtio/adk!Tool, @nhtio/adk!ArtifactTool, @nhtio/adk!SpooledArtifact) into Chat Completions wire shapes. Consumers override individual helpers via `options.helpers.*` to customise envelope formats, bucket ordering, thought surfacing, or JSON Schema generation without forking the adapter.
- Three-layer options merging: constructor baseline, per-`executor()` overrides, and per-iteration `ctx.stash.webLLMChatCompletions` overrides combine with key-by-key precedence for `helpers` and wholesale replacement for everything else. The merged shape is re-validated on every iteration so a malformed stash override fails loud, not silently.
- WebLLM engine invocation: accepts a preloaded `engine` or lazy `createEngine` factory. The resolved request body is passed directly to WebLLM's OpenAI-compatible chat API.
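The merging rule described above (key-by-key for `helpers`, wholesale replacement for everything else) can be sketched as follows. This is a minimal sketch under stated assumptions, not the adapter's actual code: `mergeOptions`, the `sampling` option, and the helper names `renderMessage` / `renderThought` are hypothetical; only `helpers` and `contextWindow` appear in the text above.

```typescript
// Hypothetical option shape for illustration only.
type Helpers = Record<string, (input: string) => string>;

interface AdapterOptions {
  contextWindow?: number;
  sampling?: { temperature?: number; top_p?: number }; // assumed option
  helpers?: Helpers;
}

// Later layers win. `helpers` merges per key; every other key is
// replaced wholesale (no deep merge of nested objects).
function mergeOptions(
  ...layers: Array<Partial<AdapterOptions> | undefined>
): AdapterOptions {
  const merged: AdapterOptions = {};
  let helpers: Helpers = {};
  for (const layer of layers) {
    if (!layer) continue;
    const { helpers: layerHelpers, ...rest } = layer;
    Object.assign(merged, rest); // wholesale replacement
    if (layerHelpers) helpers = { ...helpers, ...layerHelpers }; // key-by-key
  }
  merged.helpers = helpers;
  return merged;
}

const upper = (s: string) => s.toUpperCase();
const bracket = (s: string) => `[${s}]`;
const quiet = (_s: string) => "";

const constructorLayer: AdapterOptions = {
  contextWindow: 4096,
  sampling: { temperature: 0.7, top_p: 0.9 },
  helpers: { renderMessage: upper, renderThought: bracket },
};
const executorLayer: AdapterOptions = { sampling: { temperature: 0.2 } };
const stashLayer: AdapterOptions = { helpers: { renderThought: quiet } };

const merged = mergeOptions(constructorLayer, executorLayer, stashLayer);
```

In this sketch the executor layer's `sampling` object replaces the constructor's entirely, so `top_p` is dropped, while the stash layer swaps only `renderThought` and leaves `renderMessage` from the constructor in place. A real implementation would re-validate `merged` after this step, as the text describes.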
Per-iteration flow (steps 1–9 of the plan):
1. Merge constructor / executor / stash options and re-validate.
2. Resolve helpers, falling back to bundled `default*` for each unset field.
3. Forge artifact-query tools by walking `ctx.turnToolCalls`, collecting unique `SpooledArtifact` constructors, calling `<Ctor>.forgeTools(ctx)` on each, and merging the results with `ctx.tools`.
4. Pre-render every persisted tool-call result into the prompt-ready string the timeline will use, cached by `tc.id`.
5. When `tokenEncoding !== null`, sum the token weight of every persisted bucket and throw @nhtio/adk/batteries!E_WEBLLM_CHAT_COMPLETIONS_CONTEXT_OVERFLOW when the total exceeds `contextWindow`.
6. Build the request body via `buildChatCompletionsHistory`; carry vendor-opaque reasoning blocks through the `_adk_reasoning_payloads` side-channel.
7. Resolve or lazily create a WebLLM engine and call `engine.chat.completions.create(body)`.
8. Streaming path: consume WebLLM's async chunk iterable; surface deltas through `helpers.reportMessage` / `reportThought` / `reportToolCall`; assemble tool-call deltas via the accumulator; persist `Message` / `Thought` / `ToolCall` records on stream end.
9. Non-streaming path: consume the returned Chat Completion object; same persistence + tool-execution loop.
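The "assemble tool-call deltas via the accumulator" part of the streaming path can be pictured with the sketch below. It assumes the usual Chat Completions streaming convention, where each chunk carries partial tool calls keyed by `index` and the `arguments` string arrives in fragments. The `ToolCallAccumulator` class and its method names are hypothetical and are not the adapter's internal accumulator.

```typescript
// Shape of one streamed tool-call fragment (Chat Completions convention).
type ToolCallDelta = {
  index: number;
  id?: string;
  type?: "function";
  function?: { name?: string; arguments?: string };
};

type AssembledToolCall = { id: string; name: string; arguments: string };

// Hypothetical accumulator: concatenates fragments per `index`.
// Assumes a gateway never re-sends an already delivered name fragment.
class ToolCallAccumulator {
  private readonly slots = new Map<number, AssembledToolCall>();

  push(delta: ToolCallDelta): void {
    const slot =
      this.slots.get(delta.index) ?? { id: "", name: "", arguments: "" };
    if (delta.id) slot.id = delta.id;
    if (delta.function?.name) slot.name += delta.function.name;
    if (delta.function?.arguments) slot.arguments += delta.function.arguments;
    this.slots.set(delta.index, slot);
  }

  // Returns the assembled calls ordered by their stream index.
  finish(): AssembledToolCall[] {
    return Array.from(this.slots.entries())
      .sort(([a], [b]) => a - b)
      .map(([, call]) => call);
  }
}

// Made-up fragments, as they might arrive across three chunks.
const deltas: ToolCallDelta[] = [
  { index: 0, id: "call_1", type: "function", function: { name: "get_weather", arguments: "" } },
  { index: 0, function: { arguments: '{"city":' } },
  { index: 0, function: { arguments: '"Oslo"}' } },
];

const accumulator = new ToolCallAccumulator();
for (const delta of deltas) accumulator.push(delta);
const calls = accumulator.finish();
```

Only after the stream ends is each `arguments` string guaranteed to be complete JSON, which is why parsing and persistence are deferred to stream end rather than attempted per chunk.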
Classes
| Class | Description |
|---|---|
| WebLLMChatCompletionsAdapter | Opinionated cross-environment LLM adapter for the WebLLM Chat Completions wire shape. |