Parent-Child Chunker Preview

Preview parent-document + child-chunk splitting for hierarchical RAG.


Preview parent-child chunking before you index

Hierarchical retrieval, also called small-to-big retrieval, is a widely used RAG upgrade: you search over small child chunks for precise matches, but feed the larger parent chunk they came from into the model so it has enough surrounding context to answer well. This tool lets you preview that split. Set a parent size and a child size, paste your document, and see exactly how the parents and nested children come out before you commit to an indexing pipeline.

How it works

The document is split in two passes. The first pass walks the text by sentence and paragraph boundaries, accumulating until it reaches roughly the parent size, producing the large context chunks. The second pass repeats the same boundary-aware accumulation inside each parent using the child size, producing the small chunks you would actually embed and search. Every child records which parent it belongs to, so the preview shows a clean nested tree with character counts at each level. Because the splitter respects sentence ends, chunks rarely break in the middle of a thought.
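The two passes described above can be sketched in Python. This is a minimal illustration, not the tool's actual code: the function names (`split_units`, `accumulate`, `parent_child_split`), the sentence-boundary regex, and the default sizes are all assumptions made for the example.

```python
import re


def split_units(text):
    """Split text into sentence-like units.

    Assumption: a unit ends at '.', '!' or '?' followed by whitespace,
    or at a blank line (paragraph break). Real text needs a better rule.
    """
    parts = re.split(r"(?<=[.!?])\s+|\n\s*\n", text)
    return [p.strip() for p in parts if p and p.strip()]


def accumulate(units, target):
    """Greedily pack units into chunks of roughly `target` characters.

    A unit is never cut, so a single unit longer than `target`
    becomes its own oversized chunk.
    """
    chunks, current = [], ""
    for unit in units:
        candidate = f"{current} {unit}" if current else unit
        if current and len(candidate) > target:
            chunks.append(current)
            current = unit
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


def parent_child_split(text, parent_size=1500, child_size=300):
    """First pass builds parents; second pass builds children inside each parent."""
    tree = []
    for parent_id, parent in enumerate(accumulate(split_units(text), parent_size)):
        children = [
            {"parent_id": parent_id, "text": child, "chars": len(child)}
            for child in accumulate(split_units(parent), child_size)
        ]
        tree.append(
            {"id": parent_id, "text": parent, "chars": len(parent), "children": children}
        )
    return tree


if __name__ == "__main__":
    sample = "One two three. Four five six. Seven eight nine. Ten eleven twelve."
    for parent in parent_child_split(sample, parent_size=40, child_size=20):
        print(f"parent {parent['id']} ({parent['chars']} chars)")
        for child in parent["children"]:
            print(f"  child ({child['chars']} chars): {child['text']}")
```

In this sketch units are rejoined with a single space, so paragraph breaks inside a parent are not preserved. Each child carries a `parent_id`, which is the link a retriever would follow to swap a matched child for its parent at query time.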

Tips and notes

A common starting point is parents of roughly 1,000–2,000 characters with children of 200–400 characters, but the right values depend on your model’s context budget and how dense your source material is. If children come out larger than your target, your document probably contains long unbroken sentences: shorten them or accept the overshoot. Watch the parent-to-child ratio: too many tiny children per parent inflate your index, while too few defeat the precision benefit. Tune here, then mirror the same sizes in your real LangChain, LlamaIndex or custom retriever configuration.
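One way to keep an eye on the parent-to-child ratio is to summarize how many children each parent produced. The sketch below is illustrative only: `ratio_report` is a hypothetical helper, and the `low` and `high` thresholds are arbitrary placeholders to adjust for your own corpus, not recommended values.

```python
from statistics import mean


def ratio_report(child_counts, low=2, high=12):
    """Summarize children per parent.

    child_counts: one integer per parent, the number of children it holds.
    low / high: assumed thresholds for flagging parents, chosen arbitrarily here.
    """
    return {
        "parents": len(child_counts),
        "children": sum(child_counts),
        "mean_children_per_parent": mean(child_counts) if child_counts else 0.0,
        # Parents with so few children that child search adds little precision.
        "too_few": [i for i, n in enumerate(child_counts) if n < low],
        # Parents with so many children that they inflate the index.
        "too_many": [i for i, n in enumerate(child_counts) if n > high],
    }


if __name__ == "__main__":
    print(ratio_report([5, 1, 20]))
```

Flagged parent indexes point at the parts of the document worth inspecting in the preview before changing the sizes globally.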
