BM25 Relevance Scorer

Score candidate documents against a query using BM25 — no backend needed.

BM25 is the default lexical ranking function in Elasticsearch, OpenSearch, and many other production search engines, and it is the usual keyword half of a hybrid RAG retriever. This tool runs a faithful Okapi BM25 implementation entirely in your browser: paste a query and a set of candidate chunks, and see exactly how a lexical retriever would rank them. That makes it useful for sanity-checking whether your retrieval problem even needs embeddings.

How it works

The query and every document are lowercased and split into word tokens. The tool counts how many documents contain each query term to compute the inverse document frequency, IDF(t) = ln(1 + (N - n + 0.5) / (n + 0.5)), where N is the document count and n is the number of documents containing the term. Each document’s score is the sum over query terms of IDF(t) · f · (k1 + 1) / (f + k1 · (1 - b + b · len/avglen)), where f is the term frequency in that document, len is the document length in tokens, and avglen is the average document length. The two knobs, k1 and b, let you tune term saturation and length normalization respectively.
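The scoring described above can be sketched in a few lines of Python. This is a minimal illustration and not the tool's actual source: the function names, the `\w+` tokenizer regex, and the choice to count a repeated query term only once are all assumptions made for the example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Lowercase and split into word tokens. The exact regex is an assumption;
    # the tool's own tokenizer may split differently.
    return re.findall(r"\w+", text.lower())


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Return one BM25 score per document, in the same order as docs."""
    doc_tokens = [tokenize(d) for d in docs]
    n_docs = len(doc_tokens)
    if n_docs == 0:
        return []
    avglen = sum(len(t) for t in doc_tokens) / n_docs
    term_freqs = [Counter(t) for t in doc_tokens]

    # IDF(t) = ln(1 + (N - n + 0.5) / (n + 0.5)), with n = documents containing t.
    # Using a set means a repeated query term is counted once (an assumption).
    idf = {}
    for term in set(tokenize(query)):
        n = sum(1 for tf in term_freqs if term in tf)
        idf[term] = math.log(1 + (n_docs - n + 0.5) / (n + 0.5))

    scores = []
    for tokens, tf in zip(doc_tokens, term_freqs):
        # Length normalization factor: 1 - b + b * len / avglen.
        norm = 1 - b + b * len(tokens) / avglen if avglen else 1.0
        score = 0.0
        for term, weight in idf.items():
            f = tf[term]  # term frequency in this document (0 if absent)
            if f:
                score += weight * f * (k1 + 1) / (f + k1 * norm)
        scores.append(score)
    return scores


docs = [
    "the cat sat on the mat",
    "dogs chase the cat",
    "quantum flux capacitor manual",
]
scores = bm25_scores("the capacitor", docs)
# Document indices ordered from most to least relevant.
ranking = sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)
```

In this toy corpus "capacitor" occurs in one document and "the" in two, so the third document ranks first even though the first document contains "the" twice. Setting `b=0` in the sketch switches length normalization off, and a smaller `k1` makes repeated occurrences of a term saturate sooner.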

Tips and notes

  • Hybrid usually beats either alone. Teams commonly run BM25 and embedding search in parallel and fuse the two rankings, for example with reciprocal rank fusion, because BM25 nails exact identifiers and rare terms while embeddings catch paraphrase.
  • Start with the defaults. k1=1.5 and b=0.75 are common starting values (Lucene and Elasticsearch default to k1=1.2 and b=0.75); only tune after you have a labeled relevance set to measure against.
  • Short queries reward IDF. A rare query term that appears in only one document will dominate the ranking — that is the intended behavior, not a bug.
  • Everything stays local. Scoring runs in your browser and nothing is uploaded, so you can evaluate proprietary documents and customer queries without sending them to a server.
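For the hybrid tip above, reciprocal rank fusion can be sketched as follows. This is an illustration under stated assumptions, not part of this tool: the function name and the document IDs are hypothetical, and `k=60` is the constant conventionally used for RRF rather than a value the page specifies.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one list.

    Each document receives 1 / (k + rank) from every list it appears in,
    with rank starting at 1. k=60 is a conventional choice (an assumption).
    """
    fused = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            fused[doc_id] = fused.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first.
    return sorted(fused, key=fused.get, reverse=True)


# Hypothetical rankings from a lexical and an embedding retriever.
bm25_ranking = ["doc_c", "doc_a", "doc_b"]
embedding_ranking = ["doc_a", "doc_b", "doc_c"]
fused_ranking = reciprocal_rank_fusion([bm25_ranking, embedding_ranking])
```

Because the fusion uses only rank positions, the BM25 scores and the embedding similarities never need to be put on a common scale.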