Add semantic search to a model

This guide takes an existing model from no search to meaning-based search in four steps: declare which fields to embed, install an embedding provider, backfill vectors for existing rows, then query with semanticSearch(). By default, embeddings are generated locally, so no API key is needed.

Step 1 — Declare the embedded fields

Add an embeddings block to the model's @smrt() decorator listing the fields whose text should be searchable.

typescript

// src/lib/models/Article.ts — before
import { smrt, SmrtObject } from '@happyvertical/smrt-core';

@smrt({ api: true })
export class Article extends SmrtObject {
  title: string = '';
  content: string = '';
}

typescript

// src/lib/models/Article.ts — after
import { smrt, SmrtObject } from '@happyvertical/smrt-core';

@smrt({
  api: true,
  embeddings: {
    fields: ['title', 'content'] // index both fields
  }
})
export class Article extends SmrtObject {
  title: string = '';
  content: string = '';
}

Want title and content matched as a single unit? Add a combinedField with a template:

typescript

embeddings: {
  fields: ['title', 'content'],
  combinedField: { name: 'fulltext', template: '{title}\n\n{content}' }
}
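
To make the template concrete, here is a minimal sketch of how a placeholder template such as '{title}\n\n{content}' could be expanded into the single string that gets embedded. The renderTemplate helper is hypothetical and is not part of SMRT's API; how SMRT itself handles missing fields or other placeholder syntax is an assumption here.

```typescript
// Hypothetical helper, not SMRT's implementation: replaces each {field}
// placeholder with the matching value from the record.
function renderTemplate(template: string, record: Record<string, unknown>): string {
  return template.replace(/\{(\w+)\}/g, (_match: string, field: string) => {
    const value = record[field];
    // Assumption: missing or null fields render as an empty string.
    return value === undefined || value === null ? '' : String(value);
  });
}

const fulltext = renderTemplate('{title}\n\n{content}', {
  title: 'Caching strategies',
  content: 'How to pick a TTL.'
});

console.log(fulltext);
```

The combined string carries both fields, so a query can match on a phrase that spans title and content rather than on each field separately.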

Step 2 — Pick a provider

The default provider is local: the embedding model runs in-process, needs no API key, and stores vectors as JSON, so it works on any database. Install the local transformer runtime:

bash

pnpm add @huggingface/transformers

That is enough to start. To use a hosted model instead, set the provider in smrt.config.ts and configure an AI client:

typescript

// smrt.config.ts
import { defineConfig } from '@happyvertical/smrt-config';

export default defineConfig({
  smrt: {
    embeddings: {
      provider: 'ai',                  // use a hosted model
      aiModel: 'text-embedding-3-small',
      dimensions: 1536
    }
  },
  packages: {
    ai: { provider: 'openai', apiKey: process.env.OPENAI_API_KEY }
  }
});

Step 3 — Backfill existing rows

New and updated objects embed automatically on save() (when autoGenerate is on and an AI client is available). Rows that already existed have no vectors yet, so backfill them once:

typescript

// scripts/backfill-embeddings.ts
import { ArticleCollection } from '$lib/models/ArticleCollection.js';

const articles = await ArticleCollection.create({
  db: { type: 'postgres', url: process.env.DATABASE_URL! }
});

const result = await articles.generateMissingEmbeddings({
  batchSize: 50,
  onProgress: ({ completed, total }) => {
    console.log(`Embedded ${completed}/${total}`);
  }
});

console.log(result); // { generated: N, skipped: M }
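
To illustrate what batchSize and onProgress control, the following is a rough sketch of a batched backfill loop. It is not SMRT's implementation: toBatches, backfill, and the embedBatch callback are hypothetical names used only for this example.

```typescript
// Hypothetical sketch of a batched backfill; all names are illustrative.
type Progress = { completed: number; total: number };

// Split a list into consecutive batches of at most `size` items.
function toBatches<T>(items: T[], size: number): T[][] {
  if (size < 1) throw new Error('batch size must be at least 1');
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

// Embed each batch in turn and report progress after every batch.
async function backfill<T>(
  rows: T[],
  embedBatch: (batch: T[]) => Promise<void>,
  batchSize: number,
  onProgress: (progress: Progress) => void
): Promise<number> {
  let completed = 0;
  for (const batch of toBatches(rows, batchSize)) {
    await embedBatch(batch);
    completed += batch.length;
    onProgress({ completed, total: rows.length });
  }
  return completed;
}
```

In a loop of this shape, a larger batch means fewer round trips to the embedding provider but more memory per step and coarser progress reporting.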

To force a full re-embed (e.g. after changing the embedding model), call generateEmbeddings() on each object with force: true:

typescript

for (const article of await articles.list()) {
  await article.generateEmbeddings({ force: true });
}
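
Why a model change calls for a re-embed: vectors produced by different models generally cannot be compared with each other, even when their lengths happen to match. As a sketch only, a check for stale rows might look like the following. The StoredEmbedding shape and findStale are hypothetical; SMRT's actual storage schema may differ.

```typescript
// Hypothetical record shape; SMRT's actual storage schema may differ.
type StoredEmbedding = { id: string; model: string; vector: number[] | null };

// Return the ids of rows whose vector is missing, was produced by a
// different model, or has the wrong number of dimensions.
function findStale(rows: StoredEmbedding[], currentModel: string, dimensions: number): string[] {
  return rows
    .filter((r) => r.vector === null || r.model !== currentModel || r.vector.length !== dimensions)
    .map((r) => r.id);
}

const rows: StoredEmbedding[] = [
  { id: '1', model: 'text-embedding-3-small', vector: [0.1, 0.2, 0.3] },
  { id: '2', model: 'old-model', vector: [0.1, 0.2, 0.3] },
  { id: '3', model: 'text-embedding-3-small', vector: [0.1, 0.2] },
  { id: '4', model: 'text-embedding-3-small', vector: null }
];

// Toy dimension of 3 for readability; a real model uses far more.
console.log(findStale(rows, 'text-embedding-3-small', 3));
```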

Step 4 — Search

Query by meaning with semanticSearch(). Each result carries a _similarity score (0–1); raise minSimilarity to tighten relevance.

typescript

// src/routes/search/+page.server.ts
import { ArticleCollection } from '$lib/models/ArticleCollection.js';

export async function load({ url }) {
  const q = url.searchParams.get('q') ?? '';
  if (!q) return { results: [] };

  const articles = await ArticleCollection.create({
    db: { type: 'postgres', url: process.env.DATABASE_URL! }
  });

  const results = await articles.semanticSearch(q, {
    limit: 10,
    minSimilarity: 0.7,
    where: { status: 'published' } // combine with normal filters
  });

  return {
    results: results.map((a) => ({
      id: a.id,
      title: a.title,
      score: a._similarity
    }))
  };
}
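
For intuition about what minSimilarity and limit do, here is a small sketch of scoring and filtering. It assumes the score is cosine similarity, a common choice for embeddings; this guide does not state which metric SMRT uses, and cosineSimilarity and rank are hypothetical helpers rather than SMRT functions.

```typescript
// Assumption: similarity is cosine similarity between embedding vectors.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error('vector dimensions differ');
  let dot = 0;
  let normA = 0;
  let normB = 0;
  a.forEach((x, i) => {
    const y = b[i] ?? 0; // lengths match, so the fallback is never used
    dot += x * y;
    normA += x * x;
    normB += y * y;
  });
  if (normA === 0 || normB === 0) return 0; // zero vector: treat as no similarity
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

type Scored<T> = T & { _similarity: number };

// Score every row, drop rows below the threshold, sort best first, cap at limit.
function rank<T extends { vector: number[] }>(
  query: number[],
  rows: T[],
  options: { limit: number; minSimilarity: number }
): Scored<T>[] {
  return rows
    .map((row) => ({ ...row, _similarity: cosineSimilarity(query, row.vector) }))
    .filter((row) => row._similarity >= options.minSimilarity)
    .sort((x, y) => y._similarity - x._similarity)
    .slice(0, options.limit);
}

const hits = rank(
  [1, 0],
  [
    { id: 'a', vector: [1, 0] },
    { id: 'b', vector: [1, 1] },
    { id: 'c', vector: [0, 1] }
  ],
  { limit: 10, minSimilarity: 0.7 }
);

console.log(hits.map((h) => h.id));
```

With these toy vectors, 'a' and 'b' pass the 0.7 threshold and 'c' is dropped; raising the threshold to 0.8 would leave only 'a'.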

Add a "related articles" section with findSimilar():

typescript

const article = await articles.get(params.id);
const related = await articles.findSimilar(article, { limit: 5 });

Tuning

Symptom	Lever
Too many loosely-related hits	Raise `minSimilarity` (e.g. 0.7 → 0.8).
Relevant items missing	Lower `minSimilarity`, or embed more fields / add a `combinedField`.
Search returns nothing	Confirm rows are embedded — run `generateMissingEmbeddings()`.
Slow on large tables	Use `storage: 'native'` with pgvector on Postgres instead of JSON.
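
One factor behind the last row can be sketched with a size comparison: a vector serialized as JSON text takes several times more space than the same vector packed as 4-byte floats, which is how pgvector's vector type stores each dimension. This is an illustration, not a benchmark, and it says nothing about SMRT's exact storage layout. The helper names are hypothetical.

```typescript
// Size of one vector serialized as JSON text (ASCII, so characters = bytes).
function jsonBytes(vector: number[]): number {
  return JSON.stringify(vector).length;
}

// Size of the same vector packed as single-precision floats.
function float32Bytes(vector: number[]): number {
  return new Float32Array(vector).byteLength;
}

// A synthetic 1536-dimension vector, matching the dimensions in the hosted example.
const sample = Array.from({ length: 1536 }, (_, i) => Math.sin(i) / 10);

console.log({ json: jsonBytes(sample), float32: float32Bytes(sample) });
```

Every JSON vector also has to be parsed before it can be compared, which adds work per row on top of the extra bytes read.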

Related

Concept: Semantic Search — providers, storage, and the full method reference.
Configuration — the smrt.embeddings project defaults.
Objects → Semantic Search — API on collections.

Verified against SMRT v0.29.34.
