Embedding storage cost calculator
Before building a RAG pipeline, it pays to size the vector store. This tool converts a document corpus into a vector count, estimates raw and indexed storage, and projects monthly cost on Pinecone, Weaviate, or Qdrant, so you know whether you are looking at a few dollars or a few thousand.
How it works
The math starts from your corpus and chunking:
total_tokens = documents × avg_doc_tokens
vectors ≈ total_tokens ÷ chunk_size (assumes non-overlapping chunks)
raw_bytes = vectors × dimension × 4 (float32)
stored_bytes ≈ raw_bytes × index_overhead (~2×)
monthly_cost ≈ stored_GB × provider_rate
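Embedding dimension is the biggest lever on storage because raw size scales linearly with it: a 3072-dim vector is twice the size of a 1536-dim one (12,288 bytes versus 6,144 bytes as float32). Index structures (HNSW, IVF) add overhead on top of the raw floats, which the calculator approximates with the ~2× multiplier above. The actual overhead depends on the index type and its settings.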
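The formulas above can be sketched in a few lines of Python. This is a minimal illustration, not the calculator's actual implementation: the function name estimate_storage, the StorageEstimate container, the choice of 1 GB = 10^9 bytes, rounding the vector count up, and the 0.30 rate in the example are all assumptions made here. The rate is a placeholder, not a quoted price from any provider.

```python
import math
from dataclasses import dataclass


@dataclass
class StorageEstimate:
    vectors: int
    raw_bytes: int
    stored_bytes: float
    monthly_cost: float


def estimate_storage(documents, avg_doc_tokens, chunk_size, dimension,
                     rate_per_gb_month, index_overhead=2.0):
    """Apply the sizing formulas for float32 vectors.

    rate_per_gb_month is whatever your provider charges per stored GB;
    look it up yourself, no real price is built in here.
    """
    total_tokens = documents * avg_doc_tokens
    # Assumption: non-overlapping chunks, partial chunk rounded up.
    vectors = math.ceil(total_tokens / chunk_size)
    raw_bytes = vectors * dimension * 4           # 4 bytes per float32
    stored_bytes = raw_bytes * index_overhead     # ~2x for index structures
    stored_gb = stored_bytes / 1e9                # assumption: decimal GB
    monthly_cost = stored_gb * rate_per_gb_month
    return StorageEstimate(vectors, raw_bytes, stored_bytes, monthly_cost)


if __name__ == "__main__":
    # 100,000 documents of ~2,000 tokens, 500-token chunks, 1536-dim vectors,
    # and a placeholder rate of 0.30 per GB per month.
    est = estimate_storage(100_000, 2_000, 500, 1536, rate_per_gb_month=0.30)
    print(f"{est.vectors:,} vectors, {est.stored_bytes / 1e9:.2f} GB stored, "
          f"about {est.monthly_cost:.2f} per month")
```

With these inputs the corpus becomes 400,000 vectors and roughly 4.9 GB of indexed storage. Doubling the dimension to 3072 or halving the chunk size to 250 would each double that figure.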
Tips
- Use a smaller-dimension model where quality allows: text-embedding-3-small (1536 dimensions) needs half the storage of text-embedding-3-large (3072) and is often good enough.
- Many vector DBs support quantization, which can cut vector storage by about 4× (int8) to 32× (binary) relative to float32, often with modest recall loss. The savings are substantial at scale.
- Larger chunks mean fewer vectors and less storage, but worse retrieval precision. Tune chunk size against retrieval quality, not just cost.
- Storage is often the smaller cost: query volume and compute frequently dominate, so pair this with a per-query cost estimate.
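The dimension and quantization tips come down to bytes per vector, which is easy to check by hand. The sketch below is illustrative only: bytes_per_vector and savings_factor are hypothetical helper names, and the figures cover the vector payload alone. They ignore index overhead, any quantization metadata, and the fact that some databases keep the original float32 vectors alongside the quantized ones for rescoring, which reduces the real saving.

```python
# Bits used to store one dimension at each precision.
BITS_PER_DIMENSION = {"float32": 32, "int8": 8, "binary": 1}


def bytes_per_vector(dimension, precision="float32"):
    """Payload size of one vector, rounded up to a whole byte."""
    bits = dimension * BITS_PER_DIMENSION[precision]
    return (bits + 7) // 8


def savings_factor(dimension, precision):
    """How many times smaller a vector is than its float32 form."""
    return bytes_per_vector(dimension, "float32") / bytes_per_vector(dimension, precision)


if __name__ == "__main__":
    for dim in (1536, 3072):
        for precision in BITS_PER_DIMENSION:
            size = bytes_per_vector(dim, precision)
            factor = savings_factor(dim, precision)
            print(f"{dim}-dim {precision}: {size} bytes ({factor:.0f}x smaller than float32)")
```

A 1536-dim vector takes 6,144 bytes as float32, 1,536 bytes as int8, and 192 bytes as binary, which is where the 4× and 32× figures come from.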