Semantic Search

Semantic search finds objects by meaning rather than exact keywords. You declare which fields to embed in the @smrt() decorator, and SMRT generates embedding vectors, stores them in the _smrt_embeddings table, and ranks results by cosine similarity.

Configure embeddings on a class

Add an embeddings block to @smrt() listing the fields to index. Everything else has a project default, so the minimal config is just fields.

typescript
import { smrt, SmrtObject } from '@happyvertical/smrt-core';

@smrt({
  embeddings: {
    fields: ['title', 'content']
  }
})
class Article extends SmrtObject {
  title: string = '';
  content: string = '';
}

The per-class options (ClassEmbeddingConfig):

Option              Type                     Default          Purpose
fields              string[]                 -                Which fields to embed (required).
provider            'local' | 'ai' | 'auto'  project setting  Override the provider for this class.
autoGenerate        boolean                  true             Generate embeddings automatically on save.
regenerateOnChange  boolean                  true             Only re-embed when the source content's hash changes.
combinedField       object                   -                Create one virtual embedding from a template of several fields.

A combinedField indexes several fields as a single searchable vector — useful when a query should match title and body together:

typescript
@smrt({
  embeddings: {
    fields: ['title', 'body'],
    combinedField: {
      name: 'content',
      template: '{title}\n\n{body}'
    }
  }
})
class Post extends SmrtObject {
  title: string = '';
  body: string = '';
}
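To make the template idea concrete, here is a minimal sketch of how a combinedField-style template could be expanded into the text that gets embedded. `renderTemplate` is a hypothetical helper written for this illustration, not an SMRT API, and treating missing fields as empty strings is an assumption rather than documented behavior.

```typescript
// Hypothetical helper (not part of SMRT): fills {field} placeholders in a
// combinedField-style template with values taken from an object.
function renderTemplate(
  template: string,
  values: Record<string, unknown>
): string {
  return template.replace(/\{(\w+)\}/g, (_match: string, key: string) => {
    const value = values[key];
    // Assumption: a missing or null field contributes an empty string.
    return value === undefined || value === null ? '' : String(value);
  });
}

const post = {
  title: 'Vector search basics',
  body: 'Embeddings map text to vectors.'
};

// Title and body end up in one string, so one vector covers both.
const combined = renderTemplate('{title}\n\n{body}', post);
console.log(combined);
```

Because the title and body are joined before embedding, a query that touches both can match the single combined vector.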

Providers and storage

By default SMRT uses a local embedding model — no API key, vectors computed in-process — and stores them as JSON so search works on any database. You change this globally in smrt.config.ts under smrt.embeddings.

typescript
// smrt.config.ts
import { defineConfig } from '@happyvertical/smrt-config';

export default defineConfig({
  smrt: {
    embeddings: {
      provider: 'local',                    // 'local' | 'ai' | 'auto'
      localModel: 'Xenova/bge-base-en-v1.5', // default local model
      dimensions: 768,                       // default
      storage: 'json'                        // 'json' (portable) | 'native' (DB vectors)
    }
  }
});

Provider  Behavior
'local'   Run a local model in Node. No API key required. Default.
'ai'      Use your configured AI library (e.g. OpenAI text-embedding-3-small).
'auto'    Prefer AI embeddings when an AI client is configured, otherwise use the local model.
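The interaction between the project setting, a per-class override, and 'auto' can be sketched as a small resolution function. `resolveProvider` and its parameters are hypothetical names for this illustration. SMRT's internal logic may differ, and what happens when 'ai' is selected without a configured client is not covered here.

```typescript
type Provider = 'local' | 'ai' | 'auto';

// Hypothetical helper (not an SMRT API): decides which provider would run.
function resolveProvider(
  projectProvider: Provider,
  classProvider: Provider | undefined,
  hasAiClient: boolean
): 'local' | 'ai' {
  // A per-class provider, when set, overrides the project setting.
  const chosen = classProvider ?? projectProvider;
  if (chosen === 'auto') {
    // 'auto' prefers AI embeddings when a client exists, else the local model.
    return hasAiClient ? 'ai' : 'local';
  }
  // Explicit 'local' or 'ai' is returned as-is in this sketch.
  return chosen;
}

console.log(resolveProvider('local', undefined, true)); // local
console.log(resolveProvider('local', 'auto', true));    // ai
console.log(resolveProvider('auto', undefined, false)); // local
```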

Generating embeddings

With autoGenerate on (the default), save() refreshes embeddings in the background whenever an object's embedded content has changed. This automatic path runs only when an AI client is configured, so a save never unexpectedly loads a local model. To generate embeddings at any other time, call generateEmbeddings() explicitly:

typescript
// One object, all configured fields
await article.generateEmbeddings();

// Only specific fields
await article.generateEmbeddings({ fields: ['title'] });

// Force regeneration even if content is unchanged
await article.generateEmbeddings({ force: true });
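The hash check behind regenerateOnChange, and the way force bypasses it, can be pictured with the sketch below. It assumes a SHA-256 digest of the source text is stored next to each embedding. The actual hash algorithm and storage layout are not specified here, and `contentHash` and `needsEmbedding` are hypothetical helpers.

```typescript
import { createHash } from 'node:crypto';

// Hash of the text that would be embedded. SHA-256 is an assumption.
function contentHash(text: string): string {
  return createHash('sha256').update(text, 'utf8').digest('hex');
}

// Hypothetical decision helper mirroring regenerateOnChange and { force: true }.
function needsEmbedding(
  text: string,
  storedHash: string | undefined,
  force = false
): boolean {
  if (force) return true;                    // force always regenerates
  if (storedHash === undefined) return true; // nothing embedded yet
  return contentHash(text) !== storedHash;   // re-embed only when content changed
}

const stored = contentHash('Hello world');
console.log(needsEmbedding('Hello world', stored));       // false
console.log(needsEmbedding('Hello world!', stored));      // true
console.log(needsEmbedding('Hello world', stored, true)); // true
```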

To backfill an existing collection, batch-generate the missing ones:

typescript
const result = await articles.generateMissingEmbeddings({
  batchSize: 50,
  onProgress: ({ completed, total }) => {
    console.log(`Embedded ${completed}/${total}`);
  }
});
console.log(result); // { generated, skipped }
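For intuition about what a batched backfill does, the following sketch walks a list in fixed-size batches, skips items that already have an embedding, and reports progress after each batch. `backfill`, `hasEmbedding`, and `embed` are hypothetical stand-ins. SMRT's own batching and progress semantics may differ.

```typescript
interface BackfillOptions {
  batchSize: number;
  onProgress?: (progress: { completed: number; total: number }) => void;
}

// Hypothetical backfill loop (not SMRT's implementation).
async function backfill<T>(
  items: T[],
  hasEmbedding: (item: T) => boolean,
  embed: (item: T) => Promise<void>,
  { batchSize, onProgress }: BackfillOptions
): Promise<{ generated: number; skipped: number }> {
  if (!Number.isInteger(batchSize) || batchSize < 1) {
    throw new RangeError('batchSize must be a positive integer');
  }
  let generated = 0;
  let skipped = 0;
  for (let start = 0; start < items.length; start += batchSize) {
    const batch = items.slice(start, start + batchSize);
    // Only items without an embedding need work.
    const missing = batch.filter((item) => !hasEmbedding(item));
    skipped += batch.length - missing.length;
    // Embed the items of one batch concurrently, batches sequentially.
    await Promise.all(missing.map((item) => embed(item)));
    generated += missing.length;
    onProgress?.({ completed: generated + skipped, total: items.length });
  }
  return { generated, skipped };
}
```

Processing batches one after another keeps the number of in-flight embedding calls bounded by batchSize, which matters when the provider is a rate-limited API.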

Searching

semanticSearch(query, options) embeds the query text and returns matching objects, each annotated with a _similarity score (0–1).

typescript
const results = await articles.semanticSearch('machine learning trends', {
  limit: 10,
  minSimilarity: 0.7,
  where: { status: 'published' } // combine with regular filters
});

for (const article of results) {
  console.log(`${article.title} (${article._similarity.toFixed(3)})`);
}
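The ranking step can be illustrated in isolation: score each stored vector against the query vector with cosine similarity, drop anything below minSimilarity, sort, and truncate to limit. This is a sketch under assumptions, not SMRT's implementation. `rank` and `Candidate` are hypothetical, the default minSimilarity of 0 is chosen for the example, and raw cosine similarity can be negative for arbitrary vectors, so how SMRT arrives at its 0–1 score is not shown.

```typescript
// Cosine similarity between two equal-length vectors.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error('Vector lengths differ');
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom; // zero vectors score 0
}

interface Candidate {
  id: string;
  vector: number[];
}

// Hypothetical ranking helper: filter by threshold, sort descending, cap at limit.
function rank(
  query: number[],
  candidates: Candidate[],
  opts: { limit?: number; minSimilarity?: number } = {}
): { id: string; _similarity: number }[] {
  const { limit = 10, minSimilarity = 0 } = opts;
  return candidates
    .map((c) => ({ id: c.id, _similarity: cosineSimilarity(query, c.vector) }))
    .filter((r) => r._similarity >= minSimilarity)
    .sort((x, y) => y._similarity - x._similarity)
    .slice(0, limit);
}

const hits = rank(
  [1, 0],
  [
    { id: 'a', vector: [1, 0] }, // same direction as the query
    { id: 'b', vector: [1, 1] }, // 45 degrees away, about 0.707
    { id: 'c', vector: [0, 1] }  // orthogonal, filtered out
  ],
  { minSimilarity: 0.7 }
);
console.log(hits.map((h) => h.id).join(', ')); // a, b
```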

findSimilar(object, options) finds objects close to an existing one — for "related items" or "more like this":

typescript
const article = await articles.get('article-123');
const related = await articles.findSimilar(article, {
  limit: 5,
  excludeSelf: true // default
});

Method                           Default limit  Use
semanticSearch(query, opts)      10             Search by free-text query.
findSimilar(obj, opts)           5              Find items similar to a given object (or its ID).
generateMissingEmbeddings(opts)  -              Backfill embeddings in batches.

Verified against SMRT v0.29.34.