Schema & Migrations

The schema surface is the other half of the knex bet. You don't learn a new "create index" call per backend — you describe a collection the way you'd describe a table, and the adapter materializes it however its backend wants. The same description produces a CREATE TABLE with a vector(384) column on pgvector, a vec0 virtual table on sqlite-vec, and a recorded shape hint on a schemaless store like Qdrant.

Creating a collection

vs.schema.createCollection(name, cb) takes a callback that receives a CollectionBuilder, exactly as knex's createTable(name, tableBuilder => …) hands its callback a table builder:

```typescript
await vs.schema.createCollection('docs', (c) => {
  c.vector({ dimensions: 384, metric: 'cosine' }) // the similarity field — REQUIRED
  c.string('source')
  c.integer('year').index()
  c.string('kind')
  c.boolean('published').nullable()
})
```

The one rule the table analogy doesn't have: a collection must declare a .vector(). It's the similarity field — the reason the collection exists — so a builder that never calls .vector() throws at build time. metric defaults to 'cosine' if you omit it.
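To make the rule concrete, here is a minimal sketch of a builder that enforces it. The names SketchCollectionBuilder and buildCollection are hypothetical stand-ins, not the kit's real classes, and the error is a plain Error because the source does not say which exception the real builder throws.

```typescript
// Hypothetical stand-in for CollectionBuilder; only the .vector() rule is modelled.
interface VectorSpec {
  dimensions: number
  metric: string
}

class SketchCollectionBuilder {
  private vectorSpec: VectorSpec | null = null

  vector(opts: { dimensions: number; metric?: string }): this {
    // metric falls back to 'cosine' when omitted, as described above
    this.vectorSpec = { dimensions: opts.dimensions, metric: opts.metric ?? 'cosine' }
    return this
  }

  // Called once the user callback has run; refuses a collection with no vector.
  build(name: string): { name: string; vector: VectorSpec } {
    if (this.vectorSpec === null) {
      throw new Error(`collection "${name}" must declare a .vector()`)
    }
    return { name, vector: this.vectorSpec }
  }
}

// Mirrors the createCollection(name, cb) shape: run the callback, then build.
function buildCollection(name: string, cb: (c: SketchCollectionBuilder) => void) {
  const builder = new SketchCollectionBuilder()
  cb(builder)
  return builder.build(name)
}
```

Calling buildCollection('docs', (c) => { c.vector({ dimensions: 384 }) }) yields a spec whose metric is 'cosine', while a callback that never calls .vector() throws before anything would reach a backend.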

The payload fields use the knex column vocabulary you already know:

Method	Declares	Chain
`.vector({ dimensions, metric? })`	The similarity field (required)	—
`.string(name)`	A text payload field	`.index()`, `.nullable()`
`.integer(name)`	An integer field	`.index()`, `.nullable()`
`.number(name)`	A float field	`.index()`, `.nullable()`
`.boolean(name)`	A boolean field	`.index()`, `.nullable()`
`.json(name)`	A nested/object field	`.index()`, `.nullable()`

Advisory on some backends, authoritative on others

Payload-field declarations mean different things depending on what's underneath, and the battery is honest about which:

Schemaless backends (Qdrant, Pinecone, Chroma, …) store arbitrary metadata regardless. Here the declarations are advisory: the adapter records the declared shape so it can validate writes against it and pass index hints to the backend, but it will not stop you from upserting a field you didn't declare.
SQL backends (pgvector, sqlite-vec, …) need real columns. Here the declarations are authoritative — pgvector emits actual CREATE TABLE columns, a vector(N) column, and the correct index opclass; sqlite-vec emits the vec0 virtual-table column spec. What you declare is what exists.

This isn't a leak in the abstraction — it's the abstraction telling the truth. A schemaless store can't enforce a column it doesn't have, and pretending otherwise would be the kind of fake guarantee the kit refuses to make.
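The split can be sketched as two renderings of one declaration. Everything here is illustrative: CollectionDecl, the type mapping, the id column, the NOT NULL default, and the shape of the recorded hint are all assumptions, not what the shipped adapters emit. The only pgvector specifics relied on are the vector(N) column type and the vector_cosine_ops operator class, and the sketch handles the cosine metric only.

```typescript
type FieldType = 'string' | 'integer' | 'number' | 'boolean' | 'json'

interface FieldDecl {
  name: string
  type: FieldType
  nullable?: boolean
  index?: boolean
}

interface CollectionDecl {
  name: string
  dimensions: number
  fields: FieldDecl[]
}

// Assumed mapping; a real pgvector adapter may choose different column types.
const PG_TYPES: Record<FieldType, string> = {
  string: 'text',
  integer: 'integer',
  number: 'double precision',
  boolean: 'boolean',
  json: 'jsonb',
}

// Authoritative rendering: every declared field becomes a real column.
// Identifiers are interpolated unquoted for brevity; real code must quote them.
function toPgvectorStatements(decl: CollectionDecl): string[] {
  const columns = [
    'id text PRIMARY KEY', // assumed id column
    `embedding vector(${decl.dimensions}) NOT NULL`,
    // assumption: fields are NOT NULL unless declared .nullable()
    ...decl.fields.map((f) => `${f.name} ${PG_TYPES[f.type]}${f.nullable ? '' : ' NOT NULL'}`),
  ]
  const statements = [`CREATE TABLE ${decl.name} (${columns.join(', ')})`]
  statements.push(`CREATE INDEX ON ${decl.name} USING hnsw (embedding vector_cosine_ops)`)
  for (const f of decl.fields) {
    if (f.index) statements.push(`CREATE INDEX ON ${decl.name} (${f.name})`)
  }
  return statements
}

// Advisory rendering: nothing is created, the declared shape is only recorded.
function toShapeHint(decl: CollectionDecl) {
  return {
    collection: decl.name,
    declared: decl.fields.map((f) => f.name),
    indexHints: decl.fields.filter((f) => f.index).map((f) => f.name),
  }
}

const docsDecl: CollectionDecl = {
  name: 'docs',
  dimensions: 384,
  fields: [
    { name: 'source', type: 'string' },
    { name: 'year', type: 'integer', index: true },
    { name: 'published', type: 'boolean', nullable: true },
  ],
}
```

toPgvectorStatements(docsDecl) produces DDL that the database then enforces, whereas toShapeHint(docsDecl) produces a record the adapter can consult but the backend never sees as a constraint.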

The rest of the lifecycle

```typescript
await vs.schema.hasCollection('docs')               // → boolean   (knex: hasTable)
await vs.schema.createCollectionIfNotExists('docs', cb)
await vs.schema.dropCollection('docs')              // knex: dropTable
await vs.schema.dropCollectionIfExists('docs')      // knex: dropTableIfExists
await vs.schema.renameCollection('docs', 'documents')
```

renameCollection is the one operation not every backend can honour. Where the backend can't rename in place (most managed and several server backends), it throws E_VECTOR_STORE_UNSUPPORTED_OPERATION rather than faking it with a copy-and-drop you didn't ask for. Check vs.capabilities.rename first if you need it portably — see Consistency & Capabilities.
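A portable call site might guard the rename like this. RenameCapableStore is a hypothetical structural type covering only the two members named above (vs.capabilities.rename and vs.schema.renameCollection), and renameIfSupported is an illustrative helper, not part of the kit.

```typescript
// Hypothetical minimal view of a store: just what the guard needs.
interface RenameCapableStore {
  capabilities: { rename: boolean }
  schema: { renameCollection(from: string, to: string): Promise<void> }
}

// Returns true if the rename ran, false if the backend cannot rename in place.
// It deliberately does not fall back to copy-and-drop; that choice stays with the caller.
async function renameIfSupported(
  vs: RenameCapableStore,
  from: string,
  to: string
): Promise<boolean> {
  if (!vs.capabilities.rename) {
    return false
  }
  await vs.schema.renameCollection(from, to)
  return true
}
```

Checking the flag up front keeps the unsupported case out of your error handling: the caller decides what a false result means instead of catching E_VECTOR_STORE_UNSUPPORTED_OPERATION after the fact.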

connect() / close() are not schema verbs

Opening the connection is await vs.connect(), closing it is await vs.close() — plain lifecycle on the store, not part of vs.schema. knex doesn't put "open a pool" in knex.schema either; neither do we. Construct the store, connect(), then build schema.
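One way to keep that ordering honest is a small wrapper that always pairs connect() with close(). ConnectableStore and withStore are hypothetical names for this sketch; the only things taken from the kit are the two lifecycle methods themselves.

```typescript
// Hypothetical minimal view of a store's lifecycle surface.
interface ConnectableStore {
  connect(): Promise<void>
  close(): Promise<void>
}

// Connect, run the work, and close again even if the work throws.
async function withStore<S extends ConnectableStore, T>(
  vs: S,
  work: (vs: S) => Promise<T>
): Promise<T> {
  await vs.connect()
  try {
    return await work(vs)
  } finally {
    await vs.close()
  }
}
```

Schema calls such as createCollection would then live inside the work callback, after the connection is open and before it is closed.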

Migrations

Migrations mirror a knex migration module: a named pair of up / down functions, each handed a context whose schema is the same VectorSchemaBuilder you used above.

```typescript
// migrations/0001_docs.ts
import type { VectorMigrationContext } from '@nhtio/adk/batteries/vector'

export const name = '0001_docs'

export async function up({ schema }: VectorMigrationContext) {
  await schema.createCollection('docs', (c) => {
    c.vector({ dimensions: 384, metric: 'cosine' })
    c.string('source')
    c.integer('year').index()
  })
}

export async function down({ schema }: VectorMigrationContext) {
  await schema.dropCollectionIfExists('docs')
}
```

The runner is vs.migrate, with the two verbs knex taught everyone:

```typescript
await vs.migrate.latest()   // apply all pending migrations, in order → returns names applied this run
await vs.migrate.rollback() // run the last applied migration's down() → returns its name, or null if none
```

latest() reads the ledger, filters out already-applied migrations, and runs the rest in order — recording each in the ledger as it succeeds. rollback() takes the most recently applied migration, runs its down(), and removes it from the ledger.

Failure stops the line

A migration whose up() (or down()) throws surfaces E_VECTOR_STORE_MIGRATION_FAILED([name, message]), and the ledger does not advance past the failure. The migrations that ran before it stay recorded; the one that threw is not recorded; nothing after it runs. You fix the broken migration and re-run latest() — the already-applied ones are skipped, and the run resumes from the one that failed. No half-applied limbo, no "it recorded success but the collection isn't there."
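The latest(), rollback(), and failure behaviour described above can be modelled in a few lines. This is a sketch under stated assumptions, not the kit's runner: SketchRunner, SketchMigration, and SketchMigrationFailed are hypothetical, the ledger is a plain in-memory array, and the error class only borrows the E_VECTOR_STORE_MIGRATION_FAILED code as a string because the real exception's shape is not shown here.

```typescript
interface SketchMigration {
  name: string
  up(): Promise<void>
  down(): Promise<void>
}

// Hypothetical error type; only the code string comes from the text above.
class SketchMigrationFailed extends Error {
  code = 'E_VECTOR_STORE_MIGRATION_FAILED'
  migration: string
  constructor(migration: string, cause: unknown) {
    super(`${migration}: ${cause instanceof Error ? cause.message : String(cause)}`)
    this.migration = migration
  }
}

class SketchRunner {
  private readonly migrations: SketchMigration[]
  private applied: string[] = [] // stand-in ledger, oldest first

  constructor(migrations: SketchMigration[]) {
    this.migrations = migrations
  }

  appliedNames(): string[] {
    return [...this.applied]
  }

  // Apply pending migrations in order; record each one only after its up() succeeds.
  async latest(): Promise<string[]> {
    const ran: string[] = []
    for (const m of this.migrations) {
      if (this.applied.includes(m.name)) continue
      try {
        await m.up()
      } catch (err) {
        // The failed migration is not recorded and nothing after it runs.
        throw new SketchMigrationFailed(m.name, err)
      }
      this.applied.push(m.name)
      ran.push(m.name)
    }
    return ran
  }

  // Undo the most recently applied migration, or return null if none is applied.
  async rollback(): Promise<string | null> {
    const last = this.applied[this.applied.length - 1]
    if (last === undefined) return null
    const m = this.migrations.find((x) => x.name === last)
    if (!m) throw new Error(`no migration module for ${last}`)
    try {
      await m.down()
    } catch (err) {
      throw new SketchMigrationFailed(last, err)
    }
    this.applied.pop()
    return last
  }
}
```

Because a name is pushed only after its up() resolves, a second latest() after fixing the broken migration picks up at exactly the one that failed.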

The ledger

Applied-migration state lives in a MigrationLedger the adapter owns — applied(), record(name), remove(name) — persisted in a backend-appropriate place (a _vector_migrations collection or table), exactly as knex uses a knex_migrations table. You don't implement the ledger unless you're writing an adapter; the shipped adapters provide it.
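For adapter authors, or for tests, the three-method surface is small enough to sketch in memory. SketchLedger mirrors the applied(), record(name), and remove(name) names given above; the Promise-returning signatures, the oldest-first ordering, and the idempotent record() are assumptions of this sketch, not the kit's documented contract.

```typescript
// Assumed shape: the real MigrationLedger's exact signatures are not shown here.
interface SketchLedger {
  applied(): Promise<string[]>
  record(name: string): Promise<void>
  remove(name: string): Promise<void>
}

// In-memory stand-in for what a shipped adapter would persist in _vector_migrations.
class InMemoryLedger implements SketchLedger {
  private names: string[] = []

  // Names of applied migrations, oldest first; a copy, so callers cannot mutate state.
  async applied(): Promise<string[]> {
    return [...this.names]
  }

  // Recording the same name twice is treated as a no-op.
  async record(name: string): Promise<void> {
    if (!this.names.includes(name)) this.names.push(name)
  }

  async remove(name: string): Promise<void> {
    this.names = this.names.filter((n) => n !== name)
  }
}
```

A persistent ledger would implement the same three methods against a table or collection instead of an array.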

Where to go next

The Query Builder & Filters — query the collection you just created.
Consistency & Capabilities — vs.capabilities.rename and the rest of the static-truth flags.
Writing an Adapter — implement createCollection/dropCollection/hasCollection/renameCollection and the migration ledger for a new backend.
