HyDE Embedding Cost Calculator

Cost of Hypothetical Document Embedding (HyDE) RAG vs standard RAG

HyDE improves retrieval by generating a hypothetical answer before embedding — but that LLM step is paid on every query. This calculator compares HyDE against standard RAG so you can decide whether the recall gain justifies the recurring cost.

How it works

Standard RAG embeds the raw query once:

standard cost/query = query_tokens × embed_price / 1e6

HyDE adds an LLM generation step, then embeds the generated document in place of the raw query (all prices are in dollars per 1M tokens):

hyde cost/query = (gen_input × in_price + gen_output × out_price) / 1e6
                + (gen_output × embed_price) / 1e6

The tool multiplies both per-query costs by your daily query volume to get monthly figures, then shows the absolute and percentage overhead that HyDE adds over standard RAG.
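The formulas and the monthly scaling described above can be sketched in Python as follows. This is a minimal illustration, not the calculator's actual source: the function names are hypothetical, and the 30-day month is an assumption chosen because it matches the worked example's figures.

```python
def standard_cost_per_query(query_tokens, embed_price):
    """Dollar cost of embedding one raw query (embed_price is $ per 1M tokens)."""
    return query_tokens * embed_price / 1e6


def hyde_cost_per_query(gen_input, gen_output, in_price, out_price, embed_price):
    """Dollar cost of one HyDE query: LLM generation plus embedding the output."""
    generation = (gen_input * in_price + gen_output * out_price) / 1e6
    embedding = gen_output * embed_price / 1e6
    return generation + embedding


def monthly_overhead(standard, hyde, queries_per_day, days_per_month=30):
    """Return (absolute monthly overhead in dollars, percentage overhead).

    days_per_month=30 is an assumption, not a value stated by the tool.
    """
    absolute = (hyde - standard) * queries_per_day * days_per_month
    percent = (hyde - standard) / standard * 100
    return absolute, percent
```

Keeping the per-query costs separate from the scaling step makes it easy to change the volume or the month length without touching the pricing formulas.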

Worked example

Assume 10,000 queries/day, a 20-token query, HyDE generation with 120 input and 200 output tokens, an LLM priced at $0.50/1M input tokens and $1.50/1M output tokens, and embeddings at $0.13/1M tokens:

  • Standard cost/query: 20 × $0.13 / 1e6 = $0.0000026
  • HyDE generation: (120 × $0.50 + 200 × $1.50) / 1e6 = $0.00036
  • HyDE embedding: 200 × $0.13 / 1e6 = $0.000026
  • HyDE cost/query: $0.00036 + $0.000026 = $0.000386
  • Monthly overhead: ($0.000386 − $0.0000026) × 10,000 × 30 days ≈ $115/month; HyDE costs ~148× standard RAG per query

The dollar amount is modest at this volume — the question is whether the recall lift moves a business metric.
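To try other volumes or prices, the worked example can be reproduced with a few lines of arithmetic. The sketch below uses the example's inputs; the variable names and the 30-day month are assumptions made for illustration.

```python
# Inputs from the worked example (prices are $ per 1M tokens).
queries_per_day = 10_000
days_per_month = 30          # assumption: a 30-day month
query_tokens = 20
gen_input, gen_output = 120, 200
in_price, out_price = 0.50, 1.50
embed_price = 0.13

# Per-query costs in dollars.
standard = query_tokens * embed_price / 1e6
hyde_generation = (gen_input * in_price + gen_output * out_price) / 1e6
hyde_embedding = gen_output * embed_price / 1e6
hyde = hyde_generation + hyde_embedding

# Monthly overhead and cost ratio of HyDE over standard RAG.
overhead_per_month = (hyde - standard) * queries_per_day * days_per_month
ratio = hyde / standard

print(f"standard/query: ${standard:.7f}")
print(f"HyDE/query:     ${hyde:.6f}")
print(f"overhead/month: ${overhead_per_month:.2f}")
print(f"cost ratio:     {ratio:.1f}x")
```

Because generation dominates the HyDE cost, changing `out_price` or `gen_output` moves the result far more than changing `embed_price`.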

Tips

  • Use a cheap, small model for the hypothetical document; retrieval quality rarely depends on a frontier model here.
  • Apply HyDE selectively — only to queries where standard retrieval returns low-confidence matches.
  • Cap the generated length; you are embedding it, so longer is not always better.
  • Model overall RAG spend with the LLM API Cost Calculator.
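The cost effect of applying HyDE selectively can be estimated with a simple blend. The sketch below assumes every query first pays the standard embedding cost (to check retrieval confidence) and only a fraction then also pays the full HyDE cost; the function name and the `fallback_rate` parameter are hypothetical, and a real pipeline may account for costs differently.

```python
def selective_cost_per_query(standard, hyde, fallback_rate):
    """Average dollar cost per query when HyDE runs only on low-confidence queries.

    Assumption: all queries pay `standard`; a `fallback_rate` share (0.0 to 1.0)
    additionally pays `hyde`.
    """
    if not 0.0 <= fallback_rate <= 1.0:
        raise ValueError("fallback_rate must be between 0 and 1")
    return standard + fallback_rate * hyde


# Example with the worked example's per-query costs and a 20% fallback rate.
blended = selective_cost_per_query(0.0000026, 0.000386, 0.20)
print(f"blended cost/query: ${blended:.7f}")
```

Under these assumptions, the blended cost scales almost linearly with the fallback rate, since the standard embedding cost is small next to the generation step.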