How is raw storage calculated?

Raw bytes = vector_count × dimensions × bytes_per_value, where float32 is 4 bytes, float16 is 2, and int8 is 1. The tool converts that to MB and GB.

Why is indexed storage larger than raw?

Approximate-nearest-neighbour indexes like HNSW add graph links and metadata. The tool applies a typical multiplier (around 1.5-2x) so your estimate reflects real on-disk size, not just the raw vectors.

Are the per-database costs exact?

No. They are order-of-magnitude estimates based on published storage and serverless pricing and are clearly labelled. Real bills depend on tier, replicas, queries, and region — confirm with each provider.

Should I quantize my vectors?

Often yes. Relative to float32, float16 halves storage and int8 cuts it 4x, usually with minor recall loss for most workloads, and some databases support quantization natively. Test recall on your data before committing.

What is the Vector Dimensionality Cost Calculator?

Enter vector count, dimensions, and precision (float32/float16/int8) to compute raw storage in MB/GB plus indexed-storage estimates and rough monthly cost for Pinecone, Qdrant, Weaviate, and pgvector. It runs free in your browser on Gera Tools, with nothing uploaded.

Vector Dimensionality Cost Calculator

Name: Vector Dimensionality Cost Calculator
Creator: Gera Tools
License: https://creativecommons.org/licenses/by/4.0/

Size and price your vector index

Embedding indexes get expensive quietly: one million 1536-dimension float32 vectors already take about 6 GB before index overhead. Enter your vector count, dimensions, and precision to see raw storage, a realistic indexed-storage estimate, and rough monthly cost across popular vector databases.

How it works

Raw storage is simply vectors × dimensions × bytes_per_value (4 bytes for float32, 2 for float16, 1 for int8). On top of that, ANN indexes such as HNSW add graph links and metadata, so the tool applies a typical overhead multiplier to estimate real on-disk size. The per-database figures convert that size into a ballpark monthly cost using published storage pricing — useful for comparison, not for an exact invoice.
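The arithmetic above can be sketched in a few lines of Python. This is an illustration, not the tool's actual source: the function names are hypothetical, the 1.5 and 2.0 multipliers are the typical range quoted on this page, decimal gigabytes (1 GB = 10^9 bytes) are assumed, and price_per_gb_month is a placeholder you would fill in from a provider's published pricing.

```python
# Bytes per stored value for each precision the calculator offers.
BYTES_PER_VALUE = {"float32": 4, "float16": 2, "int8": 1}


def raw_bytes(vector_count: int, dimensions: int, precision: str) -> int:
    """Raw vector payload: vectors x dimensions x bytes_per_value."""
    return vector_count * dimensions * BYTES_PER_VALUE[precision]


def indexed_bytes_range(raw: float, low: float = 1.5, high: float = 2.0) -> tuple[float, float]:
    """Apply an assumed overhead multiplier range to the raw size."""
    return raw * low, raw * high


def monthly_cost(size_bytes: float, price_per_gb_month: float) -> float:
    """Ballpark storage cost; the price is supplied by the caller (assumption)."""
    return size_bytes / 1e9 * price_per_gb_month  # decimal GB


raw = raw_bytes(1_000_000, 1536, "float32")
low, high = indexed_bytes_range(raw)
print(f"raw: {raw / 1e9:.2f} GB, indexed: {low / 1e9:.2f}-{high / 1e9:.2f} GB")
# prints: raw: 6.14 GB, indexed: 9.22-12.29 GB
```

Returning a low and high bound rather than a single number keeps the uncertainty in the multiplier visible in the result.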

Tips

Quantize when you can. Moving from float32 to int8 cuts storage 4x; test recall on your own queries first.
Truncate dimensions. Matryoshka-capable models let you store 256–512 dims instead of 1536 with little quality loss — often the biggest single saving.
Mind query cost too. Storage is only part of the bill; serverless vector DBs also charge per read, so high-QPS workloads can cost more than storage.
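The first two tips compound, which a short sketch makes concrete. It compares raw bytes per vector only (no index overhead), and the helper names are hypothetical; whether recall survives any of these options still has to be tested on your own data.

```python
BYTES_PER_VALUE = {"float32": 4, "float16": 2, "int8": 1}


def bytes_per_vector(dimensions: int, precision: str) -> int:
    """Raw bytes needed to store one vector."""
    return dimensions * BYTES_PER_VALUE[precision]


def reduction_factor(before: tuple[int, str], after: tuple[int, str]) -> float:
    """How many times smaller `after` is than `before` (raw bytes only)."""
    return bytes_per_vector(*before) / bytes_per_vector(*after)


baseline = (1536, "float32")
for option in [(1536, "int8"), (512, "float32"), (512, "int8")]:
    print(option, f"{reduction_factor(baseline, option):.0f}x smaller")
# (1536, 'int8') 4x smaller
# (512, 'float32') 3x smaller
# (512, 'int8') 12x smaller
```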

Why dimensions matter so much

Raw storage scales linearly with dimensions, but query latency and recall respond to dimension changes in less predictable ways. A 1536-dimension index of 1 million vectors takes roughly 6 GB in float32 before HNSW graph overhead. Moving to 768 dimensions halves that to about 3 GB and often cuts P95 query latency noticeably, because each distance computation reads half as many values per candidate vector. The question is whether recall at 768 dims is acceptable for your use case.

Models trained with Matryoshka representation learning (some OpenAI text-embedding-3 models, for instance) are designed to perform well even at drastically reduced dimensions, sometimes as low as 256 or 512. If your model supports this, testing at 512 dims can give you storage and speed benefits while keeping most of the semantic quality of the full-dimension embedding.
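If your provider does not truncate for you, a common way to shorten such an embedding is to keep the leading values and rescale the result to unit length. The sketch below assumes exactly that and nothing more: the function name is hypothetical, it uses plain Python lists, and it only makes sense for models trained so that the leading dimensions carry most of the signal.

```python
import math


def truncate_embedding(vector: list[float], target_dims: int) -> list[float]:
    """Keep the first target_dims values and rescale to unit length."""
    if not 0 < target_dims <= len(vector):
        raise ValueError("target_dims must be between 1 and len(vector)")
    head = vector[:target_dims]
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        return head  # all zeros: nothing to rescale
    return [x / norm for x in head]


full = [3.0, 4.0, 12.0, 0.0]           # toy 4-dim embedding
short = truncate_embedding(full, 2)    # approximately [0.6, 0.8]
```

Rescaling matters when the database compares vectors by dot product, since a truncated vector is no longer unit length on its own.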

Choosing a precision tier

float32 (4 bytes per dimension) is the default and serves as the full-precision baseline for dot-product or cosine-similarity calculations. float16 (2 bytes) halves storage with minimal quality loss for most text embeddings, because the values rarely need full 32-bit precision. int8 (1 byte) requires quantization: a calibration step that maps the float range to 256 integer levels. Several databases, including Qdrant and Weaviate, support int8 quantization natively, with a reported recall loss of around 1–2% on benchmark datasets, which is acceptable for most retrieval tasks. The tool shows the raw storage for each precision so you can make the trade-off explicit.
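To make the calibration step concrete, here is a deliberately simplified min/max scalar quantizer. It is a sketch and not how any particular database implements quantization: production engines typically calibrate over a sample of the whole collection and may clip outliers, and all three function names are hypothetical.

```python
def calibrate(values: list[float]) -> tuple[float, float]:
    """Return (offset, scale) mapping [min, max] onto 256 levels."""
    lo, hi = min(values), max(values)
    scale = (hi - lo) / 255 or 1.0  # fall back to 1.0 for constant input
    return lo, scale


def quantize(vector: list[float], offset: float, scale: float) -> list[int]:
    """Map each float to a level 0..255, shifted into the int8 range -128..127."""
    return [max(0, min(255, round((x - offset) / scale))) - 128 for x in vector]


def dequantize(codes: list[int], offset: float, scale: float) -> list[float]:
    """Approximate reconstruction of the original floats."""
    return [(c + 128) * scale + offset for c in codes]


vec = [-1.0, -0.5, 0.0, 0.5, 1.0]
offset, scale = calibrate(vec)
codes = quantize(vec, offset, scale)
restored = dequantize(codes, offset, scale)
```

With this scheme each restored value differs from the original by at most about half of scale, which is the source of the small recall loss mentioned above.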

Per-database overhead differences

The indexed-storage multiplier varies by database. HNSW-based indexes store graph edges (each node links to up to M neighbors, typically 16–32), plus metadata and, in some engines, auxiliary structures such as bloom filters. A multiplier of 1.5–2x over raw vector bytes is a common real-world observation. pgvector, which runs inside PostgreSQL, also stores the full row alongside the vector, adding more overhead. Others, like Qdrant with in-memory HNSW, can be more compact. Treat the estimates here as planning figures and benchmark against a sample of your real data before committing to a tier.