How is embedding cost calculated?

Cost equals total tokens divided by one million, multiplied by the model's per-million-token price. Embeddings are priced on input tokens only, since they produce a vector rather than generated text.

Why include a self-hosted option?

Open-source embedding models like those in the sentence-transformers family run on your own hardware with no per-token API fee. The calculator sets the API cost to zero so you can compare against paid APIs and budget your own compute separately.

Should I add re-embedding frequency?

If your documents change or you switch embedding models, you must re-embed, which repeats the cost. Setting a monthly frequency turns the one-off figure into a recurring budget line.

Is my corpus uploaded anywhere?

No. The calculator runs entirely in your browser. You enter only a word count and settings; no document text is sent, stored, or logged.

Embeddings Cost Calculator

Embeddings cost calculator

Before you build a vector index, find out what it costs to embed your whole corpus. This calculator converts your document word count to tokens and prices it across the major embedding providers — OpenAI, Cohere, Voyage — plus a self-hosted option, and lets you add a re-embedding frequency for recurring index rebuilds.

How it works

Embeddings are billed on input tokens only; there is no generated output. The calculator converts your word count to tokens (1 word ≈ 1.3 tokens), divides by one million, and multiplies by the selected model’s price to give the one-off cost to embed everything once.

If your data changes or you migrate to a new embedding model, you re-embed, which repeats the cost. Entering a monthly re-embedding frequency multiplies the one-off figure into a recurring monthly line so your budget reflects maintenance, not just the initial build.

Tips and notes

Embedding is usually cheap relative to generation. Even large corpora often cost only a few dollars to embed once — the recurring cost from frequent re-embedding is what adds up.
Self-hosting trades API fees for compute and ops. A local model removes the per-token fee but you pay for GPU or CPU time and maintenance; compare against your cloud GPU rate.
Dimensions affect storage, not embedding price. Larger vectors cost more to store and search in your vector database, even though the embedding API price is the same per token.