Semantic Search

Semantic search finds objects by meaning rather than exact keywords. You declare which fields to embed in the @smrt() decorator, and SMRT generates embedding vectors, stores them in the _smrt_embeddings table, and ranks results by cosine similarity.
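
To make the ranking step concrete, here is a minimal sketch of cosine similarity between two vectors. It is not SMRT's internal code. The function name `cosineSimilarity` and the zero-vector handling are assumptions made for illustration.

```typescript
// Illustrative only: cosine similarity between two embedding vectors.
// Not taken from SMRT; the name and edge-case handling are assumptions.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error('Vectors must have the same number of dimensions');
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  // A zero vector has no direction, so treat it as matching nothing.
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}
```

Vectors pointing in the same direction score 1 regardless of length, and orthogonal vectors score 0, which is why the measure suits comparing embeddings of texts with different lengths.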

Configure embeddings on a class

Add an embeddings block to @smrt() listing the fields to index. Everything else has a project default, so the minimal config is just fields.

typescript

import { smrt, SmrtObject } from '@happyvertical/smrt-core';

@smrt({
  embeddings: {
    fields: ['title', 'content']
  }
})
class Article extends SmrtObject {
  title: string = '';
  content: string = '';
}

The per-class options (ClassEmbeddingConfig):

| Option | Type | Default | Purpose |
| --- | --- | --- | --- |
| `fields` | string[] | — | Which fields to embed (required). |
| `provider` | `'local' \| 'ai' \| 'auto'` | project setting | Override the provider for this class. |
| `autoGenerate` | boolean | `true` | Generate embeddings automatically on save. |
| `regenerateOnChange` | boolean | `true` | Only re-embed when the source content's hash changes. |
| `combinedField` | object | — | Create one virtual embedding from a template of several fields. |
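
The idea behind `regenerateOnChange` can be sketched as a comparison between a stored hash and the hash of the current text. The hash below is a simple djb2-style string hash chosen for brevity. Which hash SMRT actually uses is not stated here, and `contentHash` and `needsReembedding` are hypothetical names.

```typescript
// Illustrative only: hash-based change detection.
// The djb2-style hash is an assumption; SMRT's real hash function may differ.
function contentHash(text: string): number {
  let hash = 5381;
  for (let i = 0; i < text.length; i++) {
    // hash * 33 + char code, wrapped to a signed 32-bit integer
    hash = ((hash << 5) + hash + text.charCodeAt(i)) | 0;
  }
  return hash;
}

// Hypothetical helper: re-embed when no hash is stored or the content changed.
function needsReembedding(storedHash: number | undefined, currentText: string): boolean {
  return storedHash === undefined || storedHash !== contentHash(currentText);
}
```

Comparing hashes avoids recomputing a vector when a save touches only fields that are not embedded.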

A combinedField indexes several fields as a single searchable vector — useful when a query should match title and body together:

typescript

import { smrt, SmrtObject } from '@happyvertical/smrt-core';

@smrt({
  embeddings: {
    fields: ['title', 'body'],
    combinedField: {
      name: 'content',
      template: '{title}\n\n{body}'
    }
  }
})
class Post extends SmrtObject {
  title: string = '';
  body: string = '';
}
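
As a rough sketch of what a template such as `'{title}\n\n{body}'` expands to before it is embedded, the helper below substitutes `{field}` placeholders with values. `renderTemplate` is a hypothetical function, and the rule that missing fields become empty strings is an assumption, not documented SMRT behavior.

```typescript
// Hypothetical helper: fills {field} placeholders from an object's values.
// Missing or null fields become empty strings (an assumption for this sketch).
function renderTemplate(template: string, values: Record<string, unknown>): string {
  return template.replace(/\{(\w+)\}/g, (_match: string, field: string) => {
    const value = values[field];
    return value === undefined || value === null ? '' : String(value);
  });
}

// The combined text that would be embedded as one vector.
const text = renderTemplate('{title}\n\n{body}', {
  title: 'Vector search basics',
  body: 'Embeddings map text to points in space.'
});
```

Because title and body end up in one string, a single vector carries both, so a query can match on their combined meaning.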

Providers and storage

By default SMRT uses a local embedding model (no API key, vectors computed in-process) and stores the vectors as JSON, so search works on any database. To change these defaults for the whole project, set smrt.embeddings in smrt.config.ts.

typescript

// smrt.config.ts
import { defineConfig } from '@happyvertical/smrt-config';

export default defineConfig({
  smrt: {
    embeddings: {
      provider: 'local',                    // 'local' | 'ai' | 'auto'
      localModel: 'Xenova/bge-base-en-v1.5', // default local model
      dimensions: 768,                       // default
      storage: 'json'                        // 'json' (portable) | 'native' (DB vectors)
    }
  }
});

| Provider | Behavior |
| --- | --- |
| `'local'` | Run a local model in Node. No API key required. Default. |
| `'ai'` | Use your configured AI library (e.g. OpenAI `text-embedding-3-small`). |
| `'auto'` | Prefer AI embeddings when an AI client is configured, otherwise use the local model. |
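
The provider rules above can be read as a small decision function. This is a sketch of the described behavior only. `effectiveSetting` and `resolveProvider` are hypothetical names, and how SMRT detects a configured AI client is not shown here.

```typescript
type EmbeddingProvider = 'local' | 'ai' | 'auto';

// Hypothetical helper: a per-class provider overrides the project setting.
function effectiveSetting(
  classProvider: EmbeddingProvider | undefined,
  projectProvider: EmbeddingProvider
): EmbeddingProvider {
  return classProvider === undefined ? projectProvider : classProvider;
}

// Hypothetical helper: 'auto' prefers AI when a client exists, else local.
function resolveProvider(setting: EmbeddingProvider, hasAiClient: boolean): 'local' | 'ai' {
  if (setting === 'auto') return hasAiClient ? 'ai' : 'local';
  return setting;
}
```

For example, a class with no override in a project set to `'auto'` would resolve to `'local'` when no AI client is configured.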

Generating embeddings

With autoGenerate on (the default), saving an object whose embedded content changed refreshes its embeddings in the background. This happens only when an AI client is configured, so a save() never loads a local model unexpectedly. Generate embeddings explicitly when you need to:

typescript

// One object, all configured fields
await article.generateEmbeddings();

// Only specific fields
await article.generateEmbeddings({ fields: ['title'] });

// Force regeneration even if content is unchanged
await article.generateEmbeddings({ force: true });

To backfill an existing collection, batch-generate the missing embeddings:

typescript

const result = await articles.generateMissingEmbeddings({
  batchSize: 50,
  onProgress: ({ completed, total }) => {
    console.log(`Embedded ${completed}/${total}`);
  }
});
console.log(result); // { generated, skipped }
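
To show how `batchSize` and `onProgress` relate, the sketch below partitions a list into batches and reports progress after each one. It is a stand-in for the batching idea, not SMRT's implementation. `splitIntoBatches` and the `Progress` interface are hypothetical, and a real backfill would embed each batch where the comment indicates.

```typescript
// Hypothetical shape of a progress update, mirroring { completed, total }.
interface Progress {
  completed: number;
  total: number;
}

// Illustrative only: split work into batches and report progress per batch.
function splitIntoBatches<T>(
  items: T[],
  batchSize: number,
  onProgress?: (progress: Progress) => void
): T[][] {
  if (batchSize < 1) throw new Error('batchSize must be at least 1');
  const batches: T[][] = [];
  for (let start = 0; start < items.length; start += batchSize) {
    const batch = items.slice(start, start + batchSize);
    batches.push(batch);
    // A real backfill would generate embeddings for `batch` here.
    if (onProgress) {
      onProgress({ completed: start + batch.length, total: items.length });
    }
  }
  return batches;
}
```

With 120 items and a batch size of 50, this yields batches of 50, 50, and 20, with progress reported at 50, 100, and 120.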

Searching

semanticSearch(query, options) embeds the query text and returns matching objects, each annotated with a _similarity score (0–1).

typescript

const results = await articles.semanticSearch('machine learning trends', {
  limit: 10,
  minSimilarity: 0.7,
  where: { status: 'published' } // combine with regular filters
});

for (const article of results) {
  console.log(`${article.title} (${article._similarity.toFixed(3)})`);
}
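
The effect of `limit` and `minSimilarity` can be sketched as a filter, sort, and slice over scored candidates. This is an illustration, not SMRT's query path. `rankMatches` and `Scored` are hypothetical names, the default limit of 10 comes from the method table on this page, and the default threshold of 0 is an assumption.

```typescript
// Hypothetical pairing of an item with its similarity score.
interface Scored<T> {
  item: T;
  similarity: number;
}

// Illustrative only: drop weak matches, order by score, keep the top results.
function rankMatches<T>(
  scored: Scored<T>[],
  options: { limit?: number; minSimilarity?: number } = {}
): Scored<T>[] {
  const limit = options.limit === undefined ? 10 : options.limit;
  const minSimilarity = options.minSimilarity === undefined ? 0 : options.minSimilarity;
  return scored
    .filter((entry) => entry.similarity >= minSimilarity) // filter returns a new array
    .sort((a, b) => b.similarity - a.similarity) // highest similarity first
    .slice(0, limit);
}
```

A higher `minSimilarity` trades recall for precision: at 0.7, a candidate scoring 0.65 is dropped even if fewer than `limit` results remain.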

findSimilar(object, options) finds objects close to an existing one — for "related items" or "more like this":

typescript

const article = await articles.get('article-123');
const related = await articles.findSimilar(article, {
  limit: 5,
  excludeSelf: true // default
});

| Method | Default limit | Use |
| --- | --- | --- |
| `semanticSearch(query, opts)` | 10 | Search by free-text query. |
| `findSimilar(obj, opts)` | 5 | Find items similar to a given object (or its ID). |
| `generateMissingEmbeddings(opts)` | — | Backfill embeddings in batches. |

Related

Objects → Semantic Search — the API on collections.
Configuration — the smrt.embeddings project defaults.
Guide: add semantic search to a model — end to end.

Verified against SMRT v0.29.34.
