RAG Context Pruner

Drop low-relevance sentences from retrieved context to cut tokens before the LLM call.

Retrieval-augmented generation pipelines often dump whole chunks into the prompt, and a lot of those sentences have nothing to do with the question. This tool scores every sentence by keyword overlap with your query, drops the ones below a threshold you set, and shows how many tokens you saved. It is a quick way to trim a bloated context window before you pay for it.

How it works

The query is tokenised into lowercase keywords with common stop words removed. Each sentence in the context is scored by how many distinct query keywords it contains, divided by the number of query keywords, giving a 0–1 relevance value. Sentences scoring at or above your threshold are kept in their original order; the rest are dropped. Token counts use the rough ~4-characters-per-token rule of thumb for English text, so treat the savings as an estimate rather than an exact tokenizer count. Everything runs locally.
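
The steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the tool's actual code: the stop-word list, the sentence splitter, the rounding in the token estimate, and the names `keywords`, `split_sentences`, `estimate_tokens`, and `prune` are all hypothetical.

```python
import re

# Assumed stop-word list; the tool's real list is not documented here.
STOP_WORDS = {
    "a", "an", "the", "is", "are", "of", "to", "in", "on", "and", "or",
    "for", "with", "what", "how", "does", "do", "it", "this", "that",
}


def keywords(text):
    """Return the set of lowercase word tokens with stop words removed."""
    return {w for w in re.findall(r"[a-z0-9]+", text.lower()) if w not in STOP_WORDS}


def split_sentences(context):
    """Naive splitter (an assumption): break after ., ! or ? plus whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", context) if s.strip()]


def estimate_tokens(text):
    """Rough estimate at ~4 characters per token, rounded up (rounding is assumed)."""
    return -(-len(text) // 4)


def prune(query, context, threshold=0.15):
    """Keep sentences whose keyword-overlap score is at or above the threshold.

    Returns (pruned_text, tokens_before, tokens_after).
    """
    query_kw = keywords(query)
    kept = []
    for sentence in split_sentences(context):
        # Score = distinct query keywords present / total query keywords.
        if query_kw:
            score = len(query_kw & keywords(sentence)) / len(query_kw)
        else:
            score = 0.0
        if score >= threshold:
            kept.append(sentence)  # original order is preserved
    pruned = " ".join(kept)
    return pruned, estimate_tokens(context), estimate_tokens(pruned)


if __name__ == "__main__":
    ctx = (
        "Vector indexing speeds up search. "
        "The office is closed on Fridays. "
        "An index maps each vector to a bucket."
    )
    pruned, before, after = prune("How does vector indexing work?", ctx)
    print(pruned)
    print(f"~{before} tokens before, ~{after} after")
```

In this sample the query reduces to three keywords (`vector`, `indexing`, `work`). The first sentence matches two of them, the third matches one, and the sentence about Fridays matches none, so it is the only one dropped at the default threshold.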

Tips and notes

  • Start low and tighten. A threshold around 0.1–0.2 removes clearly off-topic sentences without gutting useful context. Push higher only if you still need to save tokens.
  • Lexical pruning is a pre-filter. Use it before an embedding reranker, not instead of one: it catches the easy wins cheaply.
  • Watch for orphaned references. Dropping a sentence that defines a term used later can hurt; skim the kept set before shipping.
  • Expand the query for better recall. Including synonyms in the query field helps the scorer keep sentences that phrase the same idea differently.
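
The threshold and query-expansion tips interact, and a small sketch may help show how. Because the score divides by the number of query keywords, adding synonyms rescues differently phrased sentences but also lowers the score of sentences that only match the original terms. The code below assumes the scoring rule described under "How it works"; the stop-word list, the sample sentences, and the names `keywords` and `score` are hypothetical.

```python
import re

# Assumed stop-word list, for illustration only.
STOP_WORDS = frozenset({"the", "a", "an", "of", "to", "is", "in", "and"})


def keywords(text):
    """Return the set of lowercase word tokens with stop words removed."""
    return {w for w in re.findall(r"[a-z0-9]+", text.lower()) if w not in STOP_WORDS}


def score(query, sentence):
    """Distinct query keywords found in the sentence / total query keywords."""
    q = keywords(query)
    return len(q & keywords(sentence)) / len(q) if q else 0.0


sentences = [
    "Latency dropped after we cached the embeddings.",
    "Response time improved once the index warmed up.",
]

base_query = "reduce latency"
expanded_query = "reduce latency response time"  # synonyms added by hand

if __name__ == "__main__":
    for s in sentences:
        print(score(base_query, s), score(expanded_query, s), s)
```

With the base query, the first sentence scores 0.5 and the second scores 0, so the second would be dropped at any positive threshold. With the expanded query, the second sentence rises to 0.5 while the first falls to 0.25, because the denominator doubled. A threshold tuned for a short query may therefore need lowering after you add synonyms.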