Embeddings batteries
ADK has no embeddings primitive. The harness only ever sees a Retrievable and its score — never a vector. So these batteries are not executors and they do not plug into a callback slot. They are embedders: you call them from your own retrieval middleware to turn text into vectors, rank your corpus, and inject the winners as Retrievable records. See Bring your own retrieval for where that middleware lives.
ADK ships two embedders, and they are deliberately the same battery in every respect except the engine:
OpenAIEmbeddingsAdapter: POSTs to an OpenAI-compatible /v1/embeddings endpoint over raw fetch. No SDK, so it runs unchanged in Node, the browser, edge runtimes, and workers. Point baseURL at OpenAI, Azure-behind-a-proxy, vLLM, Together, or a local gateway.
WebLLMEmbeddingsAdapter: embeds in-process on WebGPU via @mlc-ai/web-llm's engine.embeddings.create(). No network, no API key. Browser-only.
Both expose embed / embedMany / dimensions / preload / reset / isAvailable with the same signatures, return plain number[], and handle query/document prefixes identically. You can swap one for the other by changing the constructor and nothing else.
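Because the two batteries share a call shape, retrieval code can depend on a small structural type rather than on either class. The following is a minimal sketch that assumes only the calls demonstrated on this page (embed, embedMany, preload, isAvailable). EmbedderLike, EmbedKind, embedQueryAndDocs, and FakeEmbedder are hypothetical names, not ADK exports; the options parameter on embedMany is an assumption, and dimensions and reset are omitted because their exact signatures are not shown here.

```typescript
// Hypothetical structural type, NOT an ADK export. It lists only the calls
// demonstrated on this page; `dimensions` and `reset` are intentionally left out.
type EmbedKind = 'query' | 'document'

interface EmbedderLike {
  embed(text: string, options?: { kind?: EmbedKind }): Promise<number[]>
  // Assumption: embedMany accepts the same options object as embed.
  embedMany(texts: string[], options?: { kind?: EmbedKind }): Promise<number[][]>
  preload(): Promise<unknown>
  isAvailable(): boolean
}

// Retrieval code written against the structural type works with either engine.
async function embedQueryAndDocs(embedder: EmbedderLike, query: string, docs: string[]) {
  if (!embedder.isAvailable()) throw new Error('embedder unavailable in this runtime')
  await embedder.preload()
  const queryVector = await embedder.embed(query, { kind: 'query' })
  const docVectors = await embedder.embedMany(docs) // kind omitted, so 'document'
  return { queryVector, docVectors }
}

// Stand-in engine for unit tests: deterministic toy vectors, no network, no WebGPU.
class FakeEmbedder implements EmbedderLike {
  async embed(text: string): Promise<number[]> {
    // Toy 2-d "embedding": [character count, space-separated word count].
    return [text.length, text.split(' ').length]
  }
  async embedMany(texts: string[]): Promise<number[][]> {
    return Promise.all(texts.map((t) => this.embed(t)))
  }
  async preload(): Promise<void> {}
  isAvailable(): boolean {
    return true
  }
}
```

A fake like this keeps middleware tests independent of API keys and WebGPU; in production either adapter could be passed in its place, assuming its methods match the shape above.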
One shape, two engines
import { OpenAIEmbeddingsAdapter } from '@nhtio/adk/batteries/embeddings/openai'

const embedder = new OpenAIEmbeddingsAdapter({
  model: 'text-embedding-3-small',
  apiKey: process.env.OPENAI_API_KEY,
})

const qv = await embedder.embed('how do trust tiers work?', { kind: 'query' })
const docs = await embedder.embedMany(corpusTexts) // kind defaults to 'document'

The WebLLM battery is the same call shape; only the constructor differs:
import { WebLLMEmbeddingsAdapter } from '@nhtio/adk/batteries/embeddings/webllm'

const embedder = new WebLLMEmbeddingsAdapter({
  model: 'snowflake-arctic-embed-m-q0f32-MLC',
  // Arctic is asymmetric: prefix queries, leave documents bare.
  queryPrefix: 'Represent this sentence for searching relevant passages: ',
  onInitProgress: (r) => console.log(r.text),
})

if (!embedder.isAvailable()) throw new Error('WebGPU required for the WebLLM embedder')
await embedder.preload() // warm the engine before the first query
const qv = await embedder.embed('how do trust tiers work?', { kind: 'query' })

Return type: number[], not Float32Array
embed returns number[] and embedMany returns number[][] — the native shape of both the OpenAI /v1/embeddings response (encoding_format: 'float') and WebLLM's Embedding.embedding. If you pack vectors into a contiguous typed-array buffer for fast cosine math, coerce at your boundary:
const vec = new Float32Array(await embedder.embed(text, { kind: 'query' }))

model is required, no default
Neither battery defaults the model. The right embedding model is a deployment decision (dimensionality, language, cost, latency), so you must name it. A missing or empty model throws at construction:
new OpenAIEmbeddingsAdapter({}) // throws E_INVALID_OPENAI_EMBEDDINGS_OPTIONS

Query vs document prefixes
Asymmetric embedding models expect an instruction prefix on the query side and none on the document side. The kind option drives this from one shared code path:
kind: 'query' → prepend queryPrefix (if set).
kind: 'document' → prepend documentPrefix (if set). Default when kind is omitted.
Set the prefixes once on the constructor; the battery applies them per call. Symmetric models (e.g. OpenAI text-embedding-3-*) need no prefix — leave both unset.
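The prefix rule above can be sketched as a single pure function. This illustrates the described behavior only and is not the batteries' actual source; applyPrefix and PrefixOptions are hypothetical names.

```typescript
type EmbedKind = 'query' | 'document'

// Hypothetical options shape mirroring the constructor's two prefix settings.
interface PrefixOptions {
  queryPrefix?: string
  documentPrefix?: string
}

// Prepend the prefix configured for `kind`, if one is set.
// An omitted kind is treated as 'document'.
function applyPrefix(text: string, prefixes: PrefixOptions, kind: EmbedKind = 'document'): string {
  const prefix = kind === 'query' ? prefixes.queryPrefix : prefixes.documentPrefix
  return prefix ? prefix + text : text
}

// Asymmetric setup, as in the Arctic example: queries are prefixed, documents stay bare.
const arctic: PrefixOptions = {
  queryPrefix: 'Represent this sentence for searching relevant passages: ',
}
const prefixedQuery = applyPrefix('how do trust tiers work?', arctic, 'query')
const bareDocument = applyPrefix('Trust tiers rank sources.', arctic)

// Symmetric setup: no prefixes configured, so text passes through unchanged.
const symmetricQuery = applyPrefix('how do trust tiers work?', {}, 'query')
```

One consequence worth keeping in mind: vectors embedded with one prefix configuration are generally not comparable with vectors embedded under another, so a prefix change usually means re-embedding the corpus.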
Wiring an embedder into retrieval
Embedders produce vectors; you own the vector store and the ranking. The pattern is: embed the query in turnInputPipeline, search your store, and inject the hits as Retrievable records with the similarity in score. The executor renders those records inside trust-tier envelopes — see Bring your own retrieval.
import { Retrievable } from '@nhtio/adk/common'
import { OpenAIEmbeddingsAdapter } from '@nhtio/adk/batteries/embeddings/openai'

const embedder = new OpenAIEmbeddingsAdapter({
  model: 'text-embedding-3-small',
  apiKey: process.env.OPENAI_API_KEY,
})

const retrievalMiddleware = async (ctx, next) => {
  // Use the text of the most recent turn message as the query.
  const query = [...ctx.turnMessages].at(-1)?.content.toString() ?? ''
  if (query) {
    const qv = await embedder.embed(query, { kind: 'query' })
    const hits = await myVectorStore.search(qv, { topK: 5 }) // your store, your search
    for (const hit of hits) {
      const now = new Date()
      ctx.turnRetrievables.add(
        new Retrievable({
          id: hit.id,
          content: hit.text,
          trustTier: 'first-party',
          source: hit.url,
          score: hit.similarity, // the embedding-derived rank lands here
          createdAt: now,
          updatedAt: now,
        })
      )
    }
  }
  return next()
}

Embed your corpus the same way at ingest time: call embedMany(texts), which embeds as documents by default, store the vectors, and you are done. ADK plays no part in ingestion; it only consumes the Retrievable records your middleware produces.
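For a small corpus, the myVectorStore used in the middleware above could be as simple as an in-memory array scanned with cosine similarity. The following is a minimal sketch under that assumption. InMemoryVectorStore, StoredDoc, Hit, and cosineSimilarity are hypothetical names, not ADK exports, and the { id, text, url, similarity } hit shape is chosen only to match the fields the middleware reads. Vectors are coerced from number[] to Float32Array at the store boundary.

```typescript
interface StoredDoc { id: string; text: string; url: string; vector: Float32Array }
interface Hit { id: string; text: string; url: string; similarity: number }

// Cosine similarity of two equal-length vectors; returns 0 if either is all zeros.
function cosineSimilarity(a: ArrayLike<number>, b: ArrayLike<number>): number {
  if (a.length !== b.length) throw new Error('dimension mismatch')
  let dot = 0
  let normA = 0
  let normB = 0
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i]
    normA += a[i] * a[i]
    normB += b[i] * b[i]
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB)
  return denom === 0 ? 0 : dot / denom
}

// Hypothetical store: brute-force search, suitable only for small corpora.
class InMemoryVectorStore {
  private docs: StoredDoc[] = []

  // Ingest: pass the vector that embedMany() produced for this document.
  add(doc: { id: string; text: string; url: string }, vector: number[]): void {
    this.docs.push({ ...doc, vector: new Float32Array(vector) })
  }

  // Score every stored document against the query and return the best topK.
  // Async only so it matches the awaited call in the middleware.
  async search(queryVector: number[], options: { topK: number }): Promise<Hit[]> {
    return this.docs
      .map(({ id, text, url, vector }) => ({
        id,
        text,
        url,
        similarity: cosineSimilarity(queryVector, vector),
      }))
      .sort((x, y) => y.similarity - x.similarity)
      .slice(0, options.topK)
  }
}
```

At ingest time, each row returned by embedMany(texts) would be passed to add alongside its document. Larger corpora would normally swap this linear scan for an approximate-nearest-neighbor index without changing the middleware.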
Optional peer dependency
The WebLLM battery imports @mlc-ai/web-llm lazily, and the package is declared as an optional peer dependency. Consumers who only use the OpenAI battery (or no embeddings at all) install nothing extra and pay nothing in their bundle.