---
url: 'https://adk.nht.io/assembly/batteries-embeddings.md'
description: >-
  OpenAI and WebLLM embedders with one shared shape — wire them into your own
  retrieval.
---

# Embeddings batteries

## LLM summary — Embeddings batteries

* `@nhtio/adk/batteries/embeddings/openai` ships the `OpenAIEmbeddingsAdapter` class (raw `fetch` against `/v1/embeddings`; runs in Node, browser, edge, workers).
* `@nhtio/adk/batteries/embeddings/webllm` ships the `WebLLMEmbeddingsAdapter` class (in-process WebGPU embeddings via `@mlc-ai/web-llm`; browser-only).
* The two batteries are the **same shape** — identical methods, identical return types, identical option base — differing only in the engine.
* Method surface (both): `static isAvailable()`, `get dimensions`, `preload()`, `reset()`, `embed(text, opts?)`, `embedMany(texts, opts?)`.
* `embed` returns `Promise<number[]>`; `embedMany` returns `Promise<number[][]>` (one vector per input, in input order). This is the native wire shape of both engines — NOT `Float32Array`. Coerce to `Float32Array` at your own boundary if you pack vectors into a buffer.
* Required constructor field: `model: string` — **no default**. Naming the model is the caller's responsibility.
* Shared option fields: `model`, `queryPrefix?`, `documentPrefix?`, `dimensions?`.
* `embed`/`embedMany` accept `{ kind?: 'query' | 'document' }`. Default `'document'`. `kind: 'query'` prepends `queryPrefix`; `kind: 'document'` prepends `documentPrefix` — each only when that prefix option is set. Asymmetric models like Snowflake Arctic Embed need a query prefix (`'Represent this sentence for searching relevant passages: '`) and no document prefix.
* OpenAI-only fields: `apiKey`, `baseURL`, `headers`, `fetch`, `retry`, `requestTimeoutMs`, plus `dimensions` (forwarded to the API for output-dim truncation on `text-embedding-3-*`).
* WebLLM-only fields: `engine?` (pre-loaded), `createEngine?`, `engineConfig?`, `chatOptions?`, `onInitProgress?`, `isWebGPUAvailable?`.
* Construction validates eagerly: bad options throw `E_INVALID_OPENAI_EMBEDDINGS_OPTIONS` / `E_INVALID_WEBLLM_EMBEDDINGS_OPTIONS`.
* These batteries are NOT executors and ADK has no embeddings primitive. They are tools you call from your own retrieval middleware (`turnInputPipeline`) to score and inject [`Retrievable`](https://adk.nht.io/api/@nhtio/adk/common/classes/Retrievable) records. See [Bring your own retrieval](/assembly/byo-retrieval).
* `@mlc-ai/web-llm` is an optional peer dependency, imported lazily — non-WebGPU consumers pay nothing.

ADK has no embeddings primitive. The harness only ever sees a [`Retrievable`](https://adk.nht.io/api/@nhtio/adk/common/classes/Retrievable) and its `score` — never a vector. So these batteries are not executors and they do not plug into a callback slot. They are embedders: you call them from your own retrieval middleware to turn text into vectors, rank your corpus, and inject the winners as `Retrievable` records. See [Bring your own retrieval](/assembly/byo-retrieval) for where that middleware lives.

ADK ships two embedders, and they are deliberately the **same battery in every respect except the engine**:

* [`OpenAIEmbeddingsAdapter`](https://adk.nht.io/api/@nhtio/adk/batteries/embeddings/openai/adapter/classes/OpenAIEmbeddingsAdapter) — POSTs to an OpenAI-`/v1/embeddings`-compatible endpoint over raw `fetch`. No SDK, so it runs unchanged in Node, the browser, edge runtimes, and workers. Point `baseURL` at OpenAI, Azure-behind-a-proxy, vLLM, Together, or a local gateway.
* [`WebLLMEmbeddingsAdapter`](https://adk.nht.io/api/@nhtio/adk/batteries/embeddings/webllm/adapter/classes/WebLLMEmbeddingsAdapter) — embeds in-process on WebGPU via `@mlc-ai/web-llm`'s `engine.embeddings.create()`. No network, no API key. Browser-only.

Both expose `embed` / `embedMany` / `dimensions` / `preload` / `reset` / `isAvailable` with the same signatures, return plain `number[]`, and handle query/document prefixes identically. You can swap one for the other by changing the constructor and nothing else.

## One shape, two engines

```typescript
import { OpenAIEmbeddingsAdapter } from '@nhtio/adk/batteries/embeddings/openai'

const embedder = new OpenAIEmbeddingsAdapter({
  model: 'text-embedding-3-small',
  apiKey: process.env.OPENAI_API_KEY,
})

const qv = await embedder.embed('how do trust tiers work?', { kind: 'query' })
const docs = await embedder.embedMany(corpusTexts) // kind defaults to 'document'
```

The WebLLM battery is the same call shape — only the constructor differs:

```typescript
import { WebLLMEmbeddingsAdapter } from '@nhtio/adk/batteries/embeddings/webllm'

const embedder = new WebLLMEmbeddingsAdapter({
  model: 'snowflake-arctic-embed-m-q0f32-MLC',
  // Arctic is asymmetric: prefix queries, leave documents bare.
  queryPrefix: 'Represent this sentence for searching relevant passages: ',
  onInitProgress: (r) => console.log(r.text),
})

if (!embedder.isAvailable()) throw new Error('WebGPU required for the WebLLM embedder')
await embedder.preload() // warm the engine before the first query

const qv = await embedder.embed('how do trust tiers work?', { kind: 'query' })
```

## Return type: `number[]`, not `Float32Array`

`embed` returns `number[]` and `embedMany` returns `number[][]` — the native shape of both the OpenAI `/v1/embeddings` response (`encoding_format: 'float'`) and WebLLM's `Embedding.embedding`. If you pack vectors into a contiguous typed-array buffer for fast cosine math, coerce at your boundary:

```typescript
const vec = new Float32Array(await embedder.embed(text, { kind: 'query' }))
```

## `model` is required — no default

Neither battery defaults the model. The right embedding model is a deployment decision (dimensionality, language, cost, latency), so you must name it. A missing or empty `model` throws at construction:

```typescript
new OpenAIEmbeddingsAdapter({}) // throws E_INVALID_OPENAI_EMBEDDINGS_OPTIONS
```

## Query vs document prefixes

Asymmetric embedding models expect an instruction prefix on the **query** side and none on the **document** side. The `kind` option drives this from one shared code path:

* `kind: 'query'` → prepend `queryPrefix` (if set).
* `kind: 'document'` → prepend `documentPrefix` (if set). Default when `kind` is omitted.

Set the prefixes once on the constructor; the battery applies them per call. Symmetric models (e.g. OpenAI `text-embedding-3-*`) need no prefix — leave both unset.

## Wiring an embedder into retrieval

Embedders produce vectors; **you** own the vector store and the ranking. The pattern is: embed the query in `turnInputPipeline`, search your store, and inject the hits as [`Retrievable`](https://adk.nht.io/api/@nhtio/adk/common/classes/Retrievable) records with the similarity in `score`. The executor renders those records inside trust-tier envelopes — see [Bring your own retrieval](/assembly/byo-retrieval).

```typescript
import { Retrievable } from '@nhtio/adk/common'
import { OpenAIEmbeddingsAdapter } from '@nhtio/adk/batteries/embeddings/openai'

const embedder = new OpenAIEmbeddingsAdapter({
  model: 'text-embedding-3-small',
  apiKey: process.env.OPENAI_API_KEY,
})

const retrievalMiddleware = async (ctx, next) => {
  const query = [...ctx.turnMessages].at(-1)?.content.toString() ?? ''
  if (query) {
    const qv = await embedder.embed(query, { kind: 'query' })
    const hits = await myVectorStore.search(qv, { topK: 5 }) // your store, your search

    for (const hit of hits) {
      const now = new Date()
      ctx.turnRetrievables.add(
        new Retrievable({
          id: hit.id,
          content: hit.text,
          trustTier: 'first-party',
          source: hit.url,
          score: hit.similarity, // the embedding-derived rank lands here
          createdAt: now,
          updatedAt: now,
        })
      )
    }
  }
  return next()
}
```

Embed your corpus the same way — with `embedMany(texts)` (documents) — at ingest time, store the vectors, and you are done. ADK plays no part in ingestion; it only consumes the `Retrievable` records your middleware produces.

## Optional peer dependency

The WebLLM battery imports `@mlc-ai/web-llm` lazily, and the package is declared as an **optional** peer dependency. Consumers who only use the OpenAI battery (or no embeddings at all) install nothing extra and pay nothing in their bundle.
