Embeddings Cost Calculator

Calculate how much it costs to embed your document corpus

Before you build a vector index, find out what it costs to embed your whole corpus. This calculator converts your document word count to tokens and prices it across the major embedding providers (OpenAI, Cohere, and Voyage) plus a self-hosted option. You can also add a re-embedding frequency to account for recurring index rebuilds.

How it works

Embeddings are billed on input tokens only; there is no generated output to pay for. The calculator converts your word count to tokens at about 1.3 tokens per word (a rough average for English text; actual counts vary by tokenizer and language), divides by one million, and multiplies by the selected model’s price per million tokens. The result is the one-off cost of embedding everything once.
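
The arithmetic above can be sketched in a few lines of Python. This is a minimal illustration, not the calculator's actual source: the function names are made up for this example, and the price is a placeholder rather than any provider's real rate.

```python
def words_to_tokens(word_count: float, tokens_per_word: float = 1.3) -> float:
    """Estimate token count from word count (1 word ~ 1.3 tokens)."""
    return word_count * tokens_per_word


def one_off_cost(word_count: float,
                 price_per_million_tokens: float,
                 tokens_per_word: float = 1.3) -> float:
    """Cost in dollars to embed the whole corpus once."""
    tokens = words_to_tokens(word_count, tokens_per_word)
    return tokens / 1_000_000 * price_per_million_tokens


# Placeholder price per million tokens (an assumption, not a provider quote).
EXAMPLE_PRICE = 0.10

# A 5-million-word corpus is roughly 6.5 million tokens.
cost = one_off_cost(5_000_000, EXAMPLE_PRICE)
print(f"${cost:.2f}")  # prints $0.65
```

Swap in the current per-million-token price from your provider's pricing page, and adjust tokens_per_word if your text is not English prose.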

If your data changes or you migrate to a new embedding model, you have to re-embed, and each full re-embed repeats the one-off cost. Entering a monthly re-embedding frequency multiplies the one-off figure by the number of rebuilds per month, giving a recurring monthly line so your budget reflects maintenance, not just the initial build.
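
As a rough sketch of the recurring line, assuming every rebuild re-embeds the full corpus: the function names are hypothetical, and the first-year total (initial build plus twelve months of rebuilds) is an added illustration, not something the calculator is stated to report.

```python
def monthly_recurring_cost(one_off_cost_usd: float,
                           reembeds_per_month: float) -> float:
    """Recurring monthly cost: one-off cost times full rebuilds per month."""
    if reembeds_per_month < 0:
        raise ValueError("reembeds_per_month must be non-negative")
    return one_off_cost_usd * reembeds_per_month


def first_year_cost(one_off_cost_usd: float,
                    reembeds_per_month: float) -> float:
    """Initial build plus twelve months of recurring rebuilds (an assumption)."""
    monthly = monthly_recurring_cost(one_off_cost_usd, reembeds_per_month)
    return one_off_cost_usd + 12 * monthly


# Example: a $0.65 one-off cost, rebuilt weekly (about 4 times a month).
print(f"${monthly_recurring_cost(0.65, 4):.2f} per month")
print(f"${first_year_cost(0.65, 4):.2f} in the first year")
```

A fractional frequency works too: one rebuild per quarter is about 1/3 per month. If you only re-embed changed documents, scale the one-off cost by the fraction of the corpus that changes before applying the frequency.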

Tips and notes

  • Embedding is usually cheap relative to generation. Even large corpora often cost only a few dollars to embed once; the recurring cost from frequent re-embedding is what adds up.
  • Self-hosting trades API fees for compute and ops. A local model removes the per-token fee but you pay for GPU or CPU time and maintenance; compare against your cloud GPU rate.
  • Dimensions affect storage, not embedding price. Larger vectors cost more to store and search in your vector database, even though the embedding API price per token stays the same.
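
The self-hosting and storage points above can be estimated with a short sketch. All inputs here are assumptions for illustration: the throughput, GPU hourly rate, vector count, and dimension sizes are placeholders, and the storage figure covers raw float32 values only (4 bytes each), not index overhead or metadata.

```python
def self_hosted_cost(total_tokens: float,
                     tokens_per_second: float,
                     gpu_rate_per_hour: float) -> float:
    """Compute cost in dollars to embed total_tokens on rented hardware."""
    hours = total_tokens / tokens_per_second / 3600
    return hours * gpu_rate_per_hour


def vector_storage_bytes(num_vectors: int,
                         dimensions: int,
                         bytes_per_value: int = 4) -> int:
    """Raw storage for the vectors alone (float32 = 4 bytes per value)."""
    return num_vectors * dimensions * bytes_per_value


# Hypothetical: 6.5M tokens at 10,000 tokens/s on a $1.50/hour GPU.
print(f"self-hosted: ${self_hosted_cost(6_500_000, 10_000, 1.50):.2f}")

# Hypothetical: one million chunks stored at two different vector sizes.
for dims in (1536, 3072):
    gb = vector_storage_bytes(1_000_000, dims) / 1e9
    print(f"{dims} dims: {gb:.3f} GB")
```

Doubling the dimensions doubles the raw storage, while the embedding bill is unchanged. The self-hosted figure ignores idle time, setup, and maintenance, so treat it as a lower bound when comparing against API fees.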