Add semantic search to a model

This guide takes an existing model from no search to meaning-based search in four steps: declare which fields to embed, install an embedding provider, backfill vectors for existing rows, then query with semanticSearch(). By default, embedding runs locally and needs no API key.

Step 1 — Declare the embedded fields

Add an embeddings block to the model's @smrt() decorator listing the fields whose text should be searchable.

typescript
// src/lib/models/Article.ts — before
import { smrt, SmrtObject } from '@happyvertical/smrt-core';

@smrt({ api: true })
export class Article extends SmrtObject {
  title: string = '';
  content: string = '';
}

typescript
// src/lib/models/Article.ts — after
import { smrt, SmrtObject } from '@happyvertical/smrt-core';

@smrt({
  api: true,
  embeddings: {
    fields: ['title', 'content'] // index both fields
  }
})
export class Article extends SmrtObject {
  title: string = '';
  content: string = '';
}

Want title and body matched as a single unit? Add a combinedField with a template:

typescript
embeddings: {
  fields: ['title', 'content'],
  combinedField: { name: 'fulltext', template: '{title}\n\n{content}' }
}
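
To make the template concrete, here is a minimal sketch of how a string such as '{title}\n\n{content}' could be expanded into one piece of text before embedding. The renderTemplate helper and the Fields type are hypothetical names for illustration only; the guide does not show how SMRT expands the template internally, and the handling of unknown placeholders here is an assumption.

```typescript
// Hypothetical helper, not part of SMRT: fills {field} placeholders from an object.
type Fields = Record<string, string>;

function renderTemplate(template: string, fields: Fields): string {
  // Replace each {name} with the matching field value; unknown names become ''.
  return template.replace(
    /\{(\w+)\}/g,
    (_match: string, name: string) => fields[name] ?? ''
  );
}

const sampleArticle: Fields = {
  title: 'Pruning apple trees',
  content: 'Cut back crossing branches in late winter.'
};

// Same template string as the combinedField example above.
const fulltext = renderTemplate('{title}\n\n{content}', sampleArticle);
console.log(fulltext);
```

The point of a combined field is that title and body end up in one vector, so a query can match on the two together rather than on each field separately.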

Step 2 — Pick a provider

The default provider is local: the embedding model runs in-process with no API key, and vectors are stored as JSON, so it works on any database. Install the local transformer runtime:

bash
pnpm add @huggingface/transformers

That is enough to start. To use a hosted model instead, set the provider in smrt.config.ts and configure an AI client:

typescript
// smrt.config.ts
import { defineConfig } from '@happyvertical/smrt-config';

export default defineConfig({
  smrt: {
    embeddings: {
      provider: 'ai',                  // use a hosted model
      aiModel: 'text-embedding-3-small',
      dimensions: 1536
    }
  },
  packages: {
    ai: { provider: 'openai', apiKey: process.env.OPENAI_API_KEY }
  }
});

Step 3 — Backfill existing rows

New and updated objects embed automatically on save() (when autoGenerate is on and an AI client is available). Rows that already existed have no vectors yet, so backfill them once:

typescript
// scripts/backfill-embeddings.ts
import { ArticleCollection } from '$lib/models/ArticleCollection.js';

const articles = await ArticleCollection.create({
  db: { type: 'postgres', url: process.env.DATABASE_URL! }
});

const result = await articles.generateMissingEmbeddings({
  batchSize: 50,
  onProgress: ({ completed, total }) => {
    console.log(`Embedded ${completed}/${total}`);
  }
});

console.log(result); // { generated: N, skipped: M }
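
To show what batchSize and onProgress control, here is a minimal in-memory sketch of a batched backfill. It is not the SMRT implementation: Row, Embedder, chunk, backfillMissing and fakeEmbed are hypothetical names, the embedder is a stand-in that needs no model, and treating "skipped" as rows that already had a vector is an assumption.

```typescript
// Hypothetical types and helpers for illustration; not the SMRT implementation.
interface Row { id: string; text: string; embedding: number[] | null }
interface Progress { completed: number; total: number }
type Embedder = (texts: string[]) => Promise<number[][]>;

// Split a list into consecutive batches of at most `size` items.
function chunk<T>(list: T[], size: number): T[][] {
  if (size < 1) throw new Error('size must be at least 1');
  const batches: T[][] = [];
  for (let i = 0; i < list.length; i += size) batches.push(list.slice(i, i + size));
  return batches;
}

async function backfillMissing(
  rows: Row[],
  embed: Embedder,
  opts: { batchSize: number; onProgress?: (p: Progress) => void }
): Promise<{ generated: number; skipped: number }> {
  const missing = rows.filter((row) => row.embedding === null);
  let completed = 0;
  for (const batch of chunk(missing, opts.batchSize)) {
    const vectors = await embed(batch.map((row) => row.text)); // one call per batch
    batch.forEach((row, j) => { row.embedding = vectors[j] ?? null; });
    completed += batch.length;
    opts.onProgress?.({ completed, total: missing.length });
  }
  // Assumption: "skipped" counts rows that already had a vector.
  return { generated: missing.length, skipped: rows.length - missing.length };
}

// Stand-in embedder so the sketch runs without a model: one number per text.
const fakeEmbed: Embedder = async (texts) => texts.map((t) => [t.length]);

const sampleRows: Row[] = [
  { id: '1', text: 'alpha', embedding: null },
  { id: '2', text: 'beta', embedding: [4] },
  { id: '3', text: 'gamma', embedding: null }
];

backfillMissing(sampleRows, fakeEmbed, {
  batchSize: 1,
  onProgress: ({ completed, total }) => console.log(`Embedded ${completed}/${total}`)
}).then((result) => console.log(result));
```

A larger batch size means fewer calls to the embedder but more text per call, which is the usual trade-off when choosing the value.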

To force a full re-embed (for example, after changing the embedding model), call generateEmbeddings() on each object with force: true:

typescript
for (const article of await articles.list()) {
  await article.generateEmbeddings({ force: true });
}
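
Before forcing a re-embed, it can be useful to see which stored vectors predate a model change. The sketch below is a minimal illustration, assuming vectors are stored as JSON arrays (as Step 2 describes for the default setup) and that the new model produces a different number of dimensions. StoredRow and findStaleRows are hypothetical names, not SMRT APIs, and the check cannot detect a switch between two models with the same dimension count.

```typescript
// Hypothetical shape of a stored row: the vector is kept as a JSON string.
interface StoredRow { id: string; embeddingJson: string | null }

// Returns ids of rows whose vector is missing or has the wrong length.
function findStaleRows(rows: StoredRow[], expectedDimensions: number): string[] {
  return rows
    .filter((row) => {
      if (row.embeddingJson === null) return true; // never embedded
      const vector: unknown = JSON.parse(row.embeddingJson);
      return !Array.isArray(vector) || vector.length !== expectedDimensions;
    })
    .map((row) => row.id);
}

const storedRows: StoredRow[] = [
  { id: 'a', embeddingJson: JSON.stringify([0.1, 0.2, 0.3]) }, // matches 3 dimensions
  { id: 'b', embeddingJson: JSON.stringify([0.1, 0.2]) },      // older, 2 dimensions
  { id: 'c', embeddingJson: null }                              // no vector yet
];

console.log(findStaleRows(storedRows, 3)); // ids 'b' and 'c'
```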

Step 4 — Search

Query by meaning with semanticSearch(). Each result carries a _similarity score (0–1); raise minSimilarity to tighten relevance.

typescript
// src/routes/search/+page.server.ts
import { ArticleCollection } from '$lib/models/ArticleCollection.js';

export async function load({ url }) {
  const q = url.searchParams.get('q') ?? '';
  if (!q) return { results: [] };

  const articles = await ArticleCollection.create({
    db: { type: 'postgres', url: process.env.DATABASE_URL! }
  });

  const results = await articles.semanticSearch(q, {
    limit: 10,
    minSimilarity: 0.7,
    where: { status: 'published' } // combine with normal filters
  });

  return {
    results: results.map((a) => ({
      id: a.id,
      title: a.title,
      score: a._similarity
    }))
  };
}
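
To give a feel for what limit and minSimilarity do, here is a minimal in-memory sketch that scores vectors with cosine similarity, drops hits below a threshold, and keeps the top results. It is an illustration under stated assumptions, not SMRT's implementation: the guide does not say which metric produces _similarity or whether the threshold is inclusive, and Embedded, cosineSimilarity and rankBySimilarity are hypothetical names. Plain cosine similarity can be negative in general, whereas the guide describes scores as 0–1.

```typescript
// Hypothetical in-memory sketch; SMRT's actual scoring and storage may differ.
interface Embedded { id: string; vector: number[] }

function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error('vector lengths differ');
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    const x = a[i] ?? 0;
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  }
  if (normA === 0 || normB === 0) return 0; // zero vectors match nothing
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

function rankBySimilarity(
  query: number[],
  candidates: Embedded[],
  opts: { limit: number; minSimilarity: number }
): { id: string; _similarity: number }[] {
  return candidates
    .map((c) => ({ id: c.id, _similarity: cosineSimilarity(query, c.vector) }))
    .filter((hit) => hit._similarity >= opts.minSimilarity) // assumed inclusive
    .sort((x, y) => y._similarity - x._similarity)          // best match first
    .slice(0, opts.limit);
}

const candidates: Embedded[] = [
  { id: 'a', vector: [1, 0] },     // same direction as the query
  { id: 'b', vector: [0.8, 0.6] }, // similarity of about 0.8
  { id: 'c', vector: [0, 1] }      // orthogonal, similarity 0
];

const loose = rankBySimilarity([1, 0], candidates, { limit: 10, minSimilarity: 0.7 });
const strict = rankBySimilarity([1, 0], candidates, { limit: 10, minSimilarity: 0.9 });
console.log(loose.map((h) => h.id), strict.map((h) => h.id));
```

With the threshold at 0.7 both 'a' and 'b' pass; at 0.9 only 'a' remains, which is the tightening effect described above.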

Add a "related articles" section with findSimilar():

typescript
const article = await articles.get(params.id);
const related = await articles.findSimilar(article, { limit: 5 });

Tuning

| Symptom | Lever |
| --- | --- |
| Too many loosely related hits | Raise minSimilarity (e.g. 0.7 → 0.8). |
| Relevant items missing | Lower minSimilarity, or embed more fields / add a combinedField. |
| Search returns nothing | Confirm rows are embedded: run generateMissingEmbeddings(). |
| Slow on large tables | Use storage: 'native' with pgvector on Postgres instead of JSON. |

Verified against SMRT v0.29.34.