Multi-Query RAG Query Expander (BYO-key)

Generate multiple retrieval queries from one user question for better recall.


Multi-query RAG query expander

The cheapest big win for retrieval quality is often not a better embedding model but asking the same question several ways. This tool uses your own API key to turn one user question into a set of complementary retrieval queries (a broad rephrase, a keyword-dense version, a narrower clarification), so your vector store is queried from multiple angles and surfaces chunks that a single query embedding would miss.

How it works

You paste your own OpenAI or Anthropic key and a question. The tool sends a single prompt asking the model to produce N distinct rewrites optimized for semantic search, returned one per line. The request goes directly from your browser to the provider; for Anthropic, the call includes the anthropic-dangerous-direct-browser-access header so it works without a proxy. The response is split on line breaks into individual queries you can copy. In your own pipeline you would embed each query, retrieve the top-k chunks per query, then union and de-duplicate the results before reranking.
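The prompt-and-split step can be sketched in a few lines of Python for use in your own pipeline. This is a minimal illustration, not the tool's actual code: the prompt wording and the names build_prompt and parse_queries are assumptions, and the parser also strips list markers in case the model numbers its output despite being asked not to.

```python
import re


def build_prompt(question: str, n: int) -> str:
    """Build one prompt asking for n distinct rewrites, one per line.

    The wording is illustrative only; tune it for your model.
    """
    return (
        f"Rewrite the question below as {n} distinct search queries "
        "optimized for semantic retrieval. Return one query per line, "
        "with no numbering or commentary.\n\n"
        f"Question: {question}"
    )


# Matches a leading bullet ("-", "*", "•") or a number like "1." / "2)".
_LIST_MARKER = re.compile(r"^\s*(?:[-*•]|\d+[.)])\s+")


def parse_queries(response_text: str, n: int) -> list[str]:
    """Split a one-per-line response into at most n unique queries."""
    queries: list[str] = []
    seen: set[str] = set()
    for line in response_text.splitlines():
        query = _LIST_MARKER.sub("", line).strip()
        key = query.lower()  # case-insensitive duplicate check
        if query and key not in seen:
            seen.add(key)
            queries.append(query)
    return queries[:n]
```

Pass the model's text output to parse_queries, then embed each returned query separately. Capping at n guards against a model that returns more lines than requested.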

Tips and notes

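  • Union, then rerank. Retrieve with every variant, merge the candidate chunks, drop duplicates, and run a single reranking pass against the original question. Do not just concatenate the top-k lists: that leaves duplicates in place, and scores from different queries are not directly comparable.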
  • Keep one keyword-heavy variant. Pairing a keyword-dense query with a hybrid BM25 retriever catches exact identifiers that pure semantic search drops.
  • Mind the cost. Each variant multiplies your retrieval calls; three to five variants is the usual sweet spot.
  • Your key goes only to the provider. It is sent directly from your browser in the provider request and is never stored or logged by this tool.
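
The "union, then rerank" tip can be sketched as follows. This is a hedged outline under stated assumptions: the retrieve and rerank callables stand in for whatever vector store and reranker you use, and the (chunk_id, text) tuple shape and the name multi_query_retrieve are hypothetical.

```python
from typing import Callable

# Hypothetical chunk shape: (chunk_id, text). Adapt to your store's records.
Chunk = tuple[str, str]


def multi_query_retrieve(
    queries: list[str],
    retrieve: Callable[[str, int], list[Chunk]],
    rerank: Callable[[str, list[Chunk]], list[Chunk]],
    question: str,
    top_k: int = 5,
) -> list[Chunk]:
    """Retrieve per variant, merge by chunk id, then rerank once."""
    merged: dict[str, Chunk] = {}
    for query in queries:
        # One retrieval call per variant: cost grows linearly with len(queries).
        for chunk in retrieve(query, top_k):
            # Keep the first copy of each chunk id; later duplicates are dropped.
            merged.setdefault(chunk[0], chunk)
    # A single reranking pass, scored against the original user question.
    return rerank(question, list(merged.values()))
```

With three to five variants and top_k of 5, the reranker sees at most 15 to 25 unique candidates, which keeps the single reranking pass cheap.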