Do I need a vector database to start?

No. For a few thousand documents you can store embeddings in your existing database or even a JSON file and compute cosine similarity in memory. Dedicated vector search, whether the pgvector extension for Postgres or a managed service like Pinecone, only pays off once you have tens of thousands of items or need sub-100ms search at scale.

How is semantic search different from keyword search?

Keyword search matches exact words, so "cheap flights" misses a page titled "budget airfare". Semantic search compares meaning by turning text into vectors, so conceptually related results surface even with no shared words. That is why it feels smarter and handles synonyms and phrasing differences for free.

Where should the embedding calls happen in Next.js?

Index-time embedding belongs in a build step, a cron job, or a webhook that fires when content changes. Query-time embedding belongs in a server route handler or server action so your API key never reaches the browser. Never embed content on every page render.
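As a rough illustration of the query-time half, here is a minimal sketch of a route handler built on the standard `Request` and `Response` objects that Next.js route handlers use. The file path, the `createSearchHandler` factory, the `EmbedFn` type, and the `q` query parameter are all assumptions made for this example, not part of any library. The embedding call is passed in as a function so that the code that reads your API key lives only in server code.

```typescript
// Sketch of what could live in app/api/search/route.ts (path is an assumption).
// EmbedFn is a hypothetical type: any server-side function that turns text
// into a vector, e.g. a wrapper around your embeddings provider that reads
// its API key from a server-only environment variable.
type EmbedFn = (text: string) => Promise<number[]>;

// Small helper so every response is sent as JSON with the right header.
function jsonResponse(body: unknown, status = 200): Response {
  return new Response(JSON.stringify(body), {
    status,
    headers: { "Content-Type": "application/json" },
  });
}

// Builds a GET handler around whichever embed function you supply.
function createSearchHandler(embed: EmbedFn) {
  return async function GET(request: Request): Promise<Response> {
    // Read the search text from ?q=... and reject empty queries early,
    // before spending an embedding call on them.
    const query = new URL(request.url).searchParams.get("q")?.trim() ?? "";
    if (query.length === 0) {
      return jsonResponse({ error: "Missing q parameter" }, 400);
    }

    // The embedding call runs here, on the server, never in the browser.
    const vector = await embed(query);

    // A real route would rank stored documents against `vector`;
    // this sketch only reports the vector size.
    return jsonResponse({ query, dimensions: vector.length });
  };
}
```

In a real route file you would export the returned function under the name `GET`, for example `export const GET = createSearchHandler(myEmbed)`, where `myEmbed` is your own provider wrapper.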

How much does this cost to run?

Embeddings are among the cheapest model calls, typically a tiny fraction of a cent per thousand tokens. You pay once per document at index time (and again only when that document changes) and once per search query, so a site with thousands of pages and thousands of daily searches usually costs no more than a few dollars a month.
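To sanity-check the cost for your own site, a back-of-the-envelope estimate like the one below may help. Every number in the usage example is a placeholder assumption, including the price per million tokens, which is not a quote from any provider; substitute your provider's current rate and your own traffic figures.

```typescript
// All fields are inputs you supply; none of these values come from a provider.
interface CostInputs {
  documents: number;                // pages or items in the index
  avgTokensPerDocument: number;     // rough average length of one document
  searchesPerDay: number;           // search queries served per day
  avgTokensPerQuery: number;        // rough average length of one query
  pricePerMillionTokensUsd: number; // ASSUMPTION: look up your provider's rate
  daysPerMonth?: number;            // defaults to 30
}

interface CostEstimate {
  oneTimeIndexUsd: number; // paid once, then again only for changed content
  monthlyQueryUsd: number; // paid every month, scales with search traffic
}

function estimateEmbeddingCost(inputs: CostInputs): CostEstimate {
  const days = inputs.daysPerMonth ?? 30;
  const indexTokens = inputs.documents * inputs.avgTokensPerDocument;
  const monthlyQueryTokens = inputs.searchesPerDay * inputs.avgTokensPerQuery * days;
  const usdPerToken = inputs.pricePerMillionTokensUsd / 1_000_000;
  return {
    oneTimeIndexUsd: indexTokens * usdPerToken,
    monthlyQueryUsd: monthlyQueryTokens * usdPerToken,
  };
}

// Example with placeholder numbers: 5,000 documents of ~800 tokens each,
// 3,000 searches a day of ~20 tokens each, at an assumed $0.02 per million tokens.
const estimate = estimateEmbeddingCost({
  documents: 5_000,
  avgTokensPerDocument: 800,
  searchesPerDay: 3_000,
  avgTokensPerQuery: 20,
  pricePerMillionTokensUsd: 0.02,
});
console.log(estimate.oneTimeIndexUsd.toFixed(2), estimate.monthlyQueryUsd.toFixed(2));
```

Under those assumed inputs, indexing covers 4 million tokens and a month of queries covers 1.8 million, so both figures come out at a few cents. Short queries are the reason the recurring cost stays small even with heavy search traffic.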

What is cosine similarity and why use it?

Cosine similarity measures the angle between two vectors, giving a score from -1 to 1 where 1 means identical direction, i.e. very similar meaning. It ignores vector length and focuses purely on direction, which is exactly what you want when comparing the meaning of two pieces of text.
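Written out as a formula, with a small worked example (the vectors are hand-picked for illustration and are far shorter than real embeddings):

```latex
\[
\cos(\theta)
  = \frac{\mathbf{a} \cdot \mathbf{b}}{\lVert \mathbf{a} \rVert \, \lVert \mathbf{b} \rVert}
  = \frac{\sum_{i=1}^{n} a_i b_i}
         {\sqrt{\sum_{i=1}^{n} a_i^2} \; \sqrt{\sum_{i=1}^{n} b_i^2}}
\]

% Same direction, different length: the score is still 1.
\[
\mathbf{a} = (1, 2, 2), \quad \mathbf{b} = (2, 4, 4): \qquad
\cos(\theta) = \frac{2 + 8 + 8}{3 \cdot 6} = 1
\]

% Different direction, same length: the score drops below 1.
\[
\mathbf{a} = (1, 2, 2), \quad \mathbf{c} = (2, 1, 2): \qquad
\cos(\theta) = \frac{2 + 2 + 4}{3 \cdot 3} = \frac{8}{9} \approx 0.889
\]
```

The first example shows why length is ignored: doubling every component of a vector leaves the score at 1.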

How to Build AI-Powered Search in Next.js

Why semantic search beats keyword matching

Traditional search matches the literal characters a user types, so a query for “cheap flights” never finds a page titled “budget airfare deals” even though they mean the same thing. AI-powered search fixes this by converting both your content and the user’s query into embeddings — lists of numbers that capture meaning — and ranking results by how close those numbers are. The payoff is search that understands synonyms, paraphrases, and intent out of the box, with no manually maintained synonym lists. In Next.js it slots in cleanly: embed content in a background job, embed the query in a server route, and render ranked results on the client.

How it works

There are two phases. At index time you loop over every document, send its text to an embeddings model once, and store the returned vector next to the original text. You only repeat this when content changes.

At query time a user types a search; your server route embeds that query string into a vector of the same shape, then computes cosine similarity between the query vector and every stored document vector. Cosine similarity returns a score where higher means more similar in meaning. You sort the documents by that score, drop anything below a relevance threshold, and return the top handful as JSON.

The interactive demo below runs the same rank-and-threshold pipeline on a small sample corpus, but substitutes a transparent keyword-overlap proxy for real embeddings, so you can see how ranking and scoring behave without needing an API key.
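The query-time steps described above (score every stored vector, drop weak matches, sort, keep the top few) can be sketched as follows. The type names, the `rankDocuments` function, and the default threshold of 0.3 and limit of 5 are assumptions chosen for illustration; a sensible threshold depends on your embeddings model and your content.

```typescript
// A document as stored at index time: original text plus its vector.
interface IndexedDoc {
  id: string;
  text: string;
  vector: number[];
}

interface SearchHit {
  id: string;
  text: string;
  score: number;
}

// Cosine similarity: dot product divided by the product of the vector lengths.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("Vectors must have the same number of dimensions");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  // A zero vector has no direction, so treat it as matching nothing.
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Scores every document, hides anything under minScore, returns the best topK.
function rankDocuments(
  queryVector: number[],
  docs: IndexedDoc[],
  minScore = 0.3, // ASSUMPTION: tune against your own data
  topK = 5,
): SearchHit[] {
  return docs
    .map((doc) => ({
      id: doc.id,
      text: doc.text,
      score: cosineSimilarity(queryVector, doc.vector),
    }))
    .filter((hit) => hit.score >= minScore)
    .sort((x, y) => y.score - x.score)
    .slice(0, topK);
}

// Hand-made 3-dimensional vectors stand in for real embeddings here.
const sampleDocs: IndexedDoc[] = [
  { id: "a", text: "Budget airfare deals", vector: [0.9, 0.1, 0] },
  { id: "b", text: "Office chair reviews", vector: [0, 1, 0] },
  { id: "c", text: "Travel packing tips", vector: [0.7, 0.7, 0] },
];
console.log(rankDocuments([1, 0, 0], sampleDocs).map((hit) => hit.id));
```

With these toy vectors, document "a" ranks first, "c" second, and "b" is dropped because its score of 0 falls below the threshold. The linear scan over every document is the design choice that keeps this simple; it is also the part you would swap out if the corpus outgrows memory.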

Tips and gotchas

Keep your API key server-side: query embedding must run in a route handler or server action, never in the browser. Cache index-time embeddings aggressively, because re-embedding unchanged content on every deploy wastes money and time. Set a minimum similarity threshold so irrelevant results are hidden rather than shown at the bottom with a near-zero score. For up to a few thousand documents, an in-memory similarity scan is fast enough and far simpler than running a vector database; reach for pgvector or a managed vector store only when your corpus or traffic grows. Finally, show the relevance score in the UI, for example as a percentage: users tend to trust ranked results more when they can see why one result outranked another.
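One possible way to avoid re-embedding unchanged content is to store a hash of each document's text alongside its vector and call the model only when the hash differs. The sketch below assumes an in-memory `Map` as the cache and a hypothetical `embed` function supplied by the caller; in practice the cache would more likely be a JSON file or a database table that survives between deploys.

```typescript
import { createHash } from "node:crypto";

// Hypothetical: any function that turns text into a vector.
type EmbedFn = (text: string) => Promise<number[]>;

// What the cache stores for each document id.
interface CachedEmbedding {
  hash: string;
  vector: number[];
}

// SHA-256 of the text; any change to the content changes the hash.
function contentHash(text: string): string {
  return createHash("sha256").update(text, "utf8").digest("hex");
}

// Returns the cached vector when the text is unchanged, otherwise
// calls `embed` once and updates the cache entry for that id.
async function embedIfChanged(
  id: string,
  text: string,
  cache: Map<string, CachedEmbedding>,
  embed: EmbedFn,
): Promise<{ vector: number[]; reused: boolean }> {
  const hash = contentHash(text);
  const cached = cache.get(id);
  if (cached !== undefined && cached.hash === hash) {
    return { vector: cached.vector, reused: true };
  }
  const vector = await embed(text);
  cache.set(id, { hash, vector });
  return { vector, reused: false };
}
```

Hashing the text rather than comparing timestamps means a redeploy that touches file modification times without changing content still triggers no new embedding calls.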

Try the ranking demo

Enter a search query below. The demo scores it against a sample knowledge base using the keyword-overlap proxy in place of real embeddings, then ranks and filters the results the same way a real Next.js semantic-search route would.