RAG Chunk Size & Overlap Calculator

Find the optimal chunk size and overlap for your RAG pipeline.

RAG chunk size and overlap calculator

Chunking is the quiet decision that makes or breaks a retrieval-augmented generation pipeline. Chunk too large and retrieval returns diluted, low-relevance passages; chunk too small and you lose the surrounding context the model needs. Overlap helps preserve context across boundaries but multiplies your token count and cost. This calculator turns those tradeoffs into concrete numbers for your specific document length.

How it works

Given a document length in tokens, a chunk size, and an overlap percentage, the tool computes the stride — chunk size minus overlap tokens — which is how far each chunk advances. The chunk count is the document length divided by the stride, rounded up. From there it derives the overlap in tokens, the total number of tokens you will actually embed (chunk count × chunk size, which exceeds the document length because of overlap), and the overlap ratio. It also divides your retrieval context window by the chunk size to show how many chunks you can pack into a single prompt at query time.
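The arithmetic described above can be sketched in a few lines of Python. This is a minimal illustration, not the calculator's actual source: the names `ChunkPlan` and `plan_chunks` are hypothetical, rounding the overlap down to whole tokens is an assumption, and "overlap ratio" is assumed here to mean overlap tokens divided by chunk size.

```python
import math
from dataclasses import dataclass


@dataclass
class ChunkPlan:
    overlap_tokens: int
    stride: int
    chunk_count: int
    total_embedded_tokens: int
    overlap_ratio: float
    chunks_per_prompt: int


def plan_chunks(doc_tokens: int, chunk_size: int,
                overlap_pct: float, context_window: int) -> ChunkPlan:
    """Hypothetical helper mirroring the steps described in the text."""
    if doc_tokens <= 0 or chunk_size <= 0 or context_window < 0:
        raise ValueError("lengths must be positive")
    if not 0 <= overlap_pct < 100:
        raise ValueError("overlap_pct must be in [0, 100)")

    # Assumption: overlap is rounded down to a whole number of tokens.
    overlap_tokens = math.floor(chunk_size * overlap_pct / 100)
    stride = chunk_size - overlap_tokens  # how far each chunk advances
    if stride <= 0:
        raise ValueError("overlap leaves no room to advance")

    # Document length divided by the stride, rounded up.
    chunk_count = math.ceil(doc_tokens / stride)
    # Every chunk is counted at full size, as in the description.
    total_embedded_tokens = chunk_count * chunk_size
    # Assumption: ratio of overlap tokens to chunk size.
    overlap_ratio = overlap_tokens / chunk_size
    # Whole chunks that fit in the retrieval context window.
    chunks_per_prompt = context_window // chunk_size

    return ChunkPlan(overlap_tokens, stride, chunk_count,
                     total_embedded_tokens, overlap_ratio, chunks_per_prompt)


plan = plan_chunks(doc_tokens=10_000, chunk_size=512,
                   overlap_pct=15, context_window=4096)
print(plan)
```

For a 10,000-token document at 512 tokens and 15% overlap, this sketch gives a stride of 436 tokens, 23 chunks, and 11,776 embedded tokens, with eight chunks fitting in a 4,096-token window.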

Tips and notes

  • Mind the embed-cost multiplier. Total embedded tokens grow as overlap rises, by a factor of roughly 1 / (1 − overlap). At 20% overlap that is 1 / 0.8 = 1.25, so you embed about 25% more tokens than the raw document.
  • Match top-k to the context window. If only eight chunks fit in your retrieval window, retrieving twelve wastes tokens or truncates context.
  • Start at 512 tokens / 15% overlap. A reasonable default for prose; shrink chunks for fact-dense data like tables or specs, grow them for narrative text.
  • Evaluate, do not guess. Use these numbers to set up experiments, then measure retrieval quality on a labelled set.
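
The first two tips reduce to simple arithmetic, sketched below. Both function names (`embed_multiplier`, `cap_top_k`) are hypothetical, and the multiplier is an approximation that ignores the rounding of the final chunk.

```python
def embed_multiplier(overlap_pct: float) -> float:
    """Approximate ratio of embedded tokens to raw document tokens."""
    if not 0 <= overlap_pct < 100:
        raise ValueError("overlap_pct must be in [0, 100)")
    # Each chunk advances by (1 - overlap) of its size, so the
    # document is covered about 1 / (1 - overlap) times over.
    return 1 / (1 - overlap_pct / 100)


def cap_top_k(requested_k: int, context_window: int, chunk_size: int) -> int:
    """Limit top-k to the number of whole chunks the window can hold."""
    return min(requested_k, context_window // chunk_size)


for pct in (0, 10, 20, 30):
    print(f"{pct}% overlap -> x{embed_multiplier(pct):.2f} embedded tokens")

# Twelve requested chunks, but a 4096-token window holds eight 512-token chunks.
print(cap_top_k(requested_k=12, context_window=4096, chunk_size=512))
```

Sweeping the overlap this way gives a quick feel for cost before running a real evaluation; the retrieval-quality side of the tradeoff still has to be measured.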