How many vectors will my corpus produce?

Vectors ≈ total tokens ÷ chunk size. Each chunk of text becomes one embedding vector, so a larger corpus or smaller chunks means more vectors to store and search.

How big is one embedding vector?

Raw storage ≈ dimension × 4 bytes (float32). A 1536-dimension vector is about 6 KB before metadata and index overhead. Indexes typically add 1.5-3× on top of the raw float storage.

Are the vector-DB prices exact?

No — they are simplified per-GB or per-vector estimates clearly labelled as such. Real pricing depends on pods, replicas, tiers and query volume. Use this to size the order of magnitude, then confirm with the vendor.

Is my data uploaded anywhere?

No. The entire calculation runs in your browser. Nothing you enter is stored or transmitted.

What is the Embedding Storage Cost Calculator?

It calculates the number of embedding vectors, the raw storage size in bytes, and the estimated monthly cost on Pinecone, Weaviate and Qdrant for a given document corpus and embedding dimension. It runs free in your browser on Gera Tools, with nothing uploaded.

Name: Embedding Storage Cost Calculator
Creator: Gera Tools
License: https://creativecommons.org/licenses/by/4.0/

Embedding storage cost calculator

Before building a RAG pipeline it pays to size the vector store. This tool turns a document corpus into a vector count, estimates raw storage, and projects monthly cost on Pinecone, Weaviate or Qdrant — so you know whether you are looking at a few dollars or a few thousand.

How it works

The math starts from your corpus and chunking:

total_tokens = documents × avg_doc_tokens
vectors      = total_tokens ÷ chunk_size
raw_bytes    = vectors × dimension × 4   (float32)
stored_bytes ≈ raw_bytes × index_overhead   (~2×)
monthly_cost ≈ stored_GB × provider_rate

Embedding dimension is the biggest lever on per-vector storage: a 3072-dim vector is twice the size of a 1536-dim one. Index structures (HNSW, IVF) add overhead on top of the raw floats, which the calculator approximates with the ~2× multiplier shown above.
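The formulas above can be sketched in a few lines of Python. This is a minimal illustration rather than the calculator's actual code: the function and field names are hypothetical, GB is taken as 10^9 bytes, and the default rate of 0.30 per GB-month is a made-up placeholder, not any vendor's price.

```python
from dataclasses import dataclass

BYTES_PER_FLOAT32 = 4
BYTES_PER_GB = 1_000_000_000  # assumption: decimal gigabytes


@dataclass
class StorageEstimate:
    total_tokens: int
    vectors: int
    raw_bytes: int
    stored_bytes: float
    monthly_cost: float


def estimate_storage(documents, avg_doc_tokens, chunk_size, dimension,
                     index_overhead=2.0, rate_per_gb_month=0.30):
    """Apply the sizing formulas step by step.

    rate_per_gb_month is a hypothetical placeholder; substitute a figure
    confirmed with your vendor.
    """
    total_tokens = documents * avg_doc_tokens
    # Ceiling division: a partial final chunk still becomes one vector.
    vectors = -(-total_tokens // chunk_size)
    raw_bytes = vectors * dimension * BYTES_PER_FLOAT32
    stored_bytes = raw_bytes * index_overhead
    monthly_cost = stored_bytes / BYTES_PER_GB * rate_per_gb_month
    return StorageEstimate(total_tokens, vectors, raw_bytes,
                           stored_bytes, monthly_cost)


# 50,000 documents of 600 tokens, 300-token chunks, 1536 dimensions.
example = estimate_storage(documents=50_000, avg_doc_tokens=600,
                           chunk_size=300, dimension=1536)
print(example)
```

With these inputs the sketch yields 100,000 vectors, because it follows the basic formula and ignores chunk overlap; the overhead multiplier and the rate are the two inputs most worth replacing with real figures.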

Worked example: a 50,000-document knowledge base

Consider a company with 50,000 support articles, each averaging 600 tokens, chunked at 300 tokens with a 50-token overlap. The overlap adds roughly 50/300 ≈ 17% more chunks, hence the 1.17 factor below:

chunks per document ≈ 600 / 300 × 1.17 ≈ 2.3
total vectors       ≈ 50,000 × 2.3 = 115,000
raw bytes (1536-dim) = 115,000 × 1536 × 4 ≈ 707 MB
with 2× index overhead ≈ 1.4 GB stored

At a typical managed vector DB rate, 1.4 GB of vector storage is a small cost — often under $10/month at that scale. The real cost pressure starts when the corpus reaches tens of millions of vectors, where storage can climb into the hundreds of GB and query compute becomes the dominant line item.
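The arithmetic in this example can be reproduced with a short script. It is a sketch under stated assumptions: the helper names are hypothetical, the overlap factor is taken as 1 + overlap / chunk_size, and chunks per document is rounded to one decimal place to match the 2.3 used above. For comparison it also counts whole chunks for a simple fixed-stride sliding-window splitter, which is one possible chunking strategy and not necessarily the one your pipeline uses.

```python
def overlap_factor(chunk_size, overlap):
    """Approximate growth in chunk count caused by overlapping chunks."""
    return 1 + overlap / chunk_size


def sliding_window_chunks(doc_tokens, chunk_size, overlap):
    """Whole chunks emitted by a fixed-stride sliding-window splitter."""
    if doc_tokens <= chunk_size:
        return 1
    stride = chunk_size - overlap
    # Integer ceiling of (doc_tokens - overlap) / stride.
    return (doc_tokens - overlap + stride - 1) // stride


documents, avg_tokens, chunk_size, overlap, dimension = 50_000, 600, 300, 50, 1536

# Fractional average, rounded to one decimal as in the worked example.
chunks_per_doc = round(avg_tokens / chunk_size * overlap_factor(chunk_size, overlap), 1)
vectors = round(documents * chunks_per_doc)
raw_bytes = vectors * dimension * 4   # float32
stored_bytes = raw_bytes * 2          # ~2x index overhead

# Whole-chunk count if every document is exactly avg_tokens long.
exact_vectors = documents * sliding_window_chunks(avg_tokens, chunk_size, overlap)

print(f"estimated vectors: {vectors:,}")
print(f"raw: {raw_bytes / 1e6:.0f} MB, stored: {stored_bytes / 1e9:.1f} GB")
print(f"sliding-window vectors: {exact_vectors:,}")
```

Under the sliding-window assumption a 600-token document splits into three chunks (the last one short), so the whole-chunk count comes out higher than the fractional estimate. Either figure is adequate for order-of-magnitude sizing.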

Dimension and storage: the key trade-off

Dimension	Float32 bytes per vector	Example model
256	1,024 bytes	OpenAI text-embedding-3-small (truncated)
512	2,048 bytes	Various open-source models
768	3,072 bytes	Many sentence-transformer models
1,024	4,096 bytes	Cohere Embed v3
1,536	6,144 bytes	OpenAI text-embedding-3-small (full)
3,072	12,288 bytes	OpenAI text-embedding-3-large (full)

Moving from 1,536 to 3,072 dimensions doubles raw storage without changing the vector count. For most retrieval use cases, 1,536 dimensions provide excellent quality; larger models are most useful when you need the additional precision for tasks like fine-grained semantic similarity or reranking.
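The table's byte counts follow directly from dimension × 4. As a rough sketch (the function names are hypothetical, and the 115,000-vector corpus is borrowed from the worked example), the snippet below regenerates them and shows how raw storage scales with dimension for a fixed vector count:

```python
DIMENSIONS = [256, 512, 768, 1024, 1536, 3072]


def float32_bytes_per_vector(dimension):
    """Each float32 component occupies 4 bytes."""
    return dimension * 4


def raw_megabytes(vectors, dimension):
    """Raw float32 storage in decimal megabytes, before index overhead."""
    return vectors * float32_bytes_per_vector(dimension) / 1e6


vectors = 115_000  # vector count from the worked example
for dim in DIMENSIONS:
    per_vector = float32_bytes_per_vector(dim)
    print(f"{dim:>5} dims  {per_vector:>6,} bytes/vector  "
          f"{raw_megabytes(vectors, dim):8.1f} MB raw")
```

Because storage is linear in dimension, any row can be scaled from another: halving the dimension halves the raw bytes for the same corpus.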

Quantization: a practical storage lever

Most production vector databases support quantization, which reduces precision to cut storage:

Scalar (int8) quantization: reduces storage to 25% of float32 with minimal recall loss at moderate corpus sizes. A common default for cost-optimized indexes.
Binary quantization: reduces storage to about 3% of float32 with a more significant but often acceptable recall drop. Needs a two-stage retrieval (coarse binary search + re-score with full vectors).

Qdrant, Weaviate, and Pinecone all support at least scalar quantization. For corpora over 10 million vectors, it is worth factoring quantization into the estimate before comparing costs.
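The storage ratios quoted above come from the number of bits kept per vector component: 32 for float32, 8 for int8 and 1 for binary. A minimal sketch of that arithmetic follows; the names are hypothetical, and it sizes only the quantized representation, not any particular database's on-disk layout.

```python
BITS_PER_COMPONENT = {"float32": 32, "int8": 8, "binary": 1}


def vector_bytes(dimension, precision):
    """Bytes for one vector at the given precision, rounded up to whole bytes."""
    bits = dimension * BITS_PER_COMPONENT[precision]
    return (bits + 7) // 8


def storage_fraction(precision):
    """Size relative to float32 (1.0 means no reduction)."""
    return BITS_PER_COMPONENT[precision] / BITS_PER_COMPONENT["float32"]


dimension = 1536
for precision in BITS_PER_COMPONENT:
    print(f"{precision:>8}: {vector_bytes(dimension, precision):>5,} bytes/vector "
          f"({storage_fraction(precision):.1%} of float32)")
```

Depending on the database, the full-precision vectors may still be kept (often on disk) for the re-scoring stage, so the total footprint can be larger than the quantized size alone suggests.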

Tips

Use a smaller-dimension model where quality allows — text-embedding-3-small (1536) is half the storage of large (3072) and often good enough.
Many vector DBs support quantization (int8 / binary) that can cut storage 4-32× with modest recall loss — huge at scale.
Larger chunks mean fewer vectors and less storage, but worse retrieval precision. Tune chunk size against retrieval quality, not just cost.
Storage is often the smaller cost — query volume and compute frequently dominate at scale, so pair this with a per-query cost estimate.