Semantic Search
Semantic search finds objects by meaning rather than exact keywords. You declare which
fields to embed in the @smrt() decorator, and SMRT generates embedding vectors,
stores them in the _smrt_embeddings table, and ranks results by cosine similarity.
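To make the ranking step concrete, here is a minimal, self-contained sketch of cosine similarity over toy vectors. It is not SMRT's implementation: the `cosineSimilarity` helper, the three-dimensional vectors, and the document IDs are all hypothetical and only illustrate the idea of scoring stored vectors against a query vector.

```typescript
// Illustrative only: SMRT's internal ranking code is not shown in this page.
// Cosine similarity between two equal-length vectors.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error(`Dimension mismatch: ${a.length} vs ${b.length}`);
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  // A zero vector has no direction; treat its similarity as 0.
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Hypothetical toy data standing in for a query embedding and stored embeddings.
const query = [1, 0, 1];
const docs = [
  { id: 'a', vector: [1, 0, 1] },
  { id: 'b', vector: [0, 1, 0] },
  { id: 'c', vector: [1, 1, 1] },
];

// Score every document, then sort by descending similarity.
const ranked = docs
  .map((d) => ({ id: d.id, similarity: cosineSimilarity(query, d.vector) }))
  .sort((x, y) => y.similarity - x.similarity);

console.log(ranked.map((r) => r.id)); // [ 'a', 'c', 'b' ]
```

Real embeddings have hundreds of dimensions (768 with the default local model), but the arithmetic is the same.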
Configure embeddings on a class
Add an embeddings block to @smrt() listing the fields to index.
Everything else has a project default, so the minimal config is just fields.
import { smrt, SmrtObject } from '@happyvertical/smrt-core';

@smrt({
  embeddings: {
    fields: ['title', 'content']
  }
})
class Article extends SmrtObject {
  title: string = '';
  content: string = '';
}

The per-class options (ClassEmbeddingConfig):
| Option | Type | Default | Purpose |
|---|---|---|---|
| fields | string[] | none | Which fields to embed (required). |
| provider | 'local' \| 'ai' \| 'auto' | project setting | Override the provider for this class. |
| autoGenerate | boolean | true | Generate embeddings automatically on save. |
| regenerateOnChange | boolean | true | Only re-embed when the source content's hash changes. |
| combinedField | object | none | Create one virtual embedding from a template of several fields. |
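The regenerateOnChange behavior in the table can be pictured as a content-hash check. The sketch below is an assumption about how such a check could work, not SMRT's actual code: the `shouldReembed` and `markEmbedded` helpers, the in-memory `embeddedHashes` map, the key format, and the choice of SHA-256 are all hypothetical.

```typescript
import { createHash } from 'node:crypto';

// Hypothetical store: last embedded content hash, keyed by "<objectId>:<field>".
const embeddedHashes = new Map<string, string>();

// SHA-256 is an assumption; the page does not say which hash SMRT uses.
function contentHash(text: string): string {
  return createHash('sha256').update(text, 'utf8').digest('hex');
}

// True when the text differs from what was last embedded, or when forced.
function shouldReembed(key: string, text: string, force = false): boolean {
  if (force) return true;
  return embeddedHashes.get(key) !== contentHash(text);
}

// Record that the given text has now been embedded.
function markEmbedded(key: string, text: string): void {
  embeddedHashes.set(key, contentHash(text));
}

const key = 'article-123:title';
console.log(shouldReembed(key, 'Hello world')); // true: nothing embedded yet
markEmbedded(key, 'Hello world');
console.log(shouldReembed(key, 'Hello world')); // false: content unchanged
console.log(shouldReembed(key, 'Hello, world!')); // true: content changed
console.log(shouldReembed(key, 'Hello world', true)); // true: forced
```

Comparing hashes rather than full text keeps the check cheap and avoids storing a second copy of the source content.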
A combinedField indexes several fields as a single searchable vector — useful when
a query should match title and body together:
@smrt({
  embeddings: {
    fields: ['title', 'body'],
    combinedField: {
      name: 'content',
      template: '{title}\n\n{body}'
    }
  }
})
class Post extends SmrtObject {
  title: string = '';
  body: string = '';
}

Providers and storage
By default SMRT uses a local embedding model — no API key, vectors computed
in-process — and stores them as JSON so search works on any database. You change this globally
in smrt.config.ts under smrt.embeddings.
// smrt.config.ts
import { defineConfig } from '@happyvertical/smrt-config';

export default defineConfig({
  smrt: {
    embeddings: {
      provider: 'local', // 'local' | 'ai' | 'auto'
      localModel: 'Xenova/bge-base-en-v1.5', // default local model
      dimensions: 768, // default
      storage: 'json' // 'json' (portable) | 'native' (DB vectors)
    }
  }
});

| Provider | Behavior |
|---|---|
| 'local' | Run a local model in Node. No API key required. Default. |
| 'ai' | Use your configured AI library (e.g. OpenAI text-embedding-3-small). |
| 'auto' | Prefer AI embeddings when an AI client is configured, otherwise use the local model. |
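The interaction between the project setting, a per-class override, and the 'auto' rule can be sketched as a small resolution function. This is only an illustration of the rules described above: `resolveProvider` and its parameters are hypothetical names, not part of SMRT's API, and the sketch does not cover what SMRT does when 'ai' is requested without a configured client.

```typescript
type EmbeddingProvider = 'local' | 'ai' | 'auto';
type ResolvedProvider = 'local' | 'ai';

// Hypothetical helper: picks the provider that would actually compute vectors.
function resolveProvider(
  projectSetting: EmbeddingProvider,
  classOverride: EmbeddingProvider | undefined,
  aiClientConfigured: boolean,
): ResolvedProvider {
  // A per-class provider, when present, takes precedence over the project setting.
  const requested = classOverride ?? projectSetting;
  if (requested === 'auto') {
    // 'auto' prefers AI embeddings when a client exists, otherwise the local model.
    return aiClientConfigured ? 'ai' : 'local';
  }
  // Explicit 'local' or 'ai' is returned as-is; missing-client handling is out of scope.
  return requested;
}

console.log(resolveProvider('local', undefined, true)); // 'local'
console.log(resolveProvider('local', 'auto', true)); // 'ai'
console.log(resolveProvider('auto', undefined, false)); // 'local'
console.log(resolveProvider('auto', 'ai', false)); // 'ai'
```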
Generating embeddings
With autoGenerate on (the default), saving an object whose embedded content changed refreshes its embeddings in the background. This happens only when an AI client is configured, so a save() never unexpectedly loads a local model. Generate embeddings explicitly when you need to:
// One object, all configured fields
await article.generateEmbeddings();

// Only specific fields
await article.generateEmbeddings({ fields: ['title'] });

// Force regeneration even if content is unchanged
await article.generateEmbeddings({ force: true });

To backfill an existing collection, batch-generate the missing embeddings:
const result = await articles.generateMissingEmbeddings({
  batchSize: 50,
  onProgress: ({ completed, total }) => {
    console.log(`Embedded ${completed}/${total}`);
  }
});
console.log(result); // { generated, skipped }

Searching
semanticSearch(query, options) embeds the query text and returns matching
objects, each annotated with a _similarity score (0–1).
const results = await articles.semanticSearch('machine learning trends', {
  limit: 10,
  minSimilarity: 0.7,
  where: { status: 'published' } // combine with regular filters
});
for (const article of results) {
  console.log(`${article.title} (${article._similarity.toFixed(3)})`);
}

findSimilar(object, options) finds objects close to an existing one, for "related items" or "more like this" features:
const article = await articles.get('article-123');
const related = await articles.findSimilar(article, {
  limit: 5,
  excludeSelf: true // default
});

| Method | Default limit | Use |
|---|---|---|
| semanticSearch(query, opts) | 10 | Search by free-text query. |
| findSimilar(obj, opts) | 5 | Find items similar to a given object (or its ID). |
| generateMissingEmbeddings(opts) | n/a | Backfill embeddings in batches. |
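The limit, minSimilarity, and excludeSelf options can be read as a filter, sort, and slice over scored candidates. The sketch below shows that reading under stated assumptions: `selectMatches`, the `Scored` shape, the `excludeId` option, and a default minSimilarity of 0 are hypothetical and are not taken from SMRT's source.

```typescript
// Hypothetical shape: an object ID plus its similarity to the query (0 to 1).
interface Scored {
  id: string;
  _similarity: number;
}

interface SelectOptions {
  limit?: number; // maximum results to return
  minSimilarity?: number; // drop candidates scoring below this (assumed default: 0)
  excludeId?: string; // stands in for excludeSelf in findSimilar
}

// Filter by threshold and exclusion, sort best-first, then cap the result count.
function selectMatches(scored: Scored[], opts: SelectOptions = {}): Scored[] {
  const { limit = 10, minSimilarity = 0, excludeId } = opts;
  return scored
    .filter((s) => s.id !== excludeId && s._similarity >= minSimilarity)
    .sort((a, b) => b._similarity - a._similarity)
    .slice(0, limit);
}

// Hypothetical scores for four articles.
const candidates: Scored[] = [
  { id: 'article-123', _similarity: 1 },
  { id: 'article-7', _similarity: 0.82 },
  { id: 'article-9', _similarity: 0.64 },
  { id: 'article-4', _similarity: 0.91 },
];

// Search-style call: threshold of 0.7, default limit of 10.
const searched = selectMatches(candidates, { minSimilarity: 0.7 });
console.log(searched.map((s) => s.id)); // [ 'article-123', 'article-4', 'article-7' ]

// "More like this" call: skip the source object, keep the top 2.
const similar = selectMatches(candidates, { limit: 2, excludeId: 'article-123' });
console.log(similar.map((s) => s.id)); // [ 'article-4', 'article-7' ]
```

Applying the threshold before the limit means a high minSimilarity can return fewer results than the limit allows.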
Related
- Objects → Semantic Search: the API on collections.
- Configuration: the smrt.embeddings project defaults.
- Guide: add semantic search to a model, end to end.
Verified against SMRT v0.29.34.