BM25 relevance scorer
BM25 is the lexical ranking function behind Elasticsearch, OpenSearch, and most production search engines, and it remains the default keyword half of nearly every hybrid RAG retriever. This tool runs a faithful Okapi BM25 implementation entirely in your browser, so you can paste a query and a set of candidate chunks and see exactly how a lexical retriever would rank them. That makes it useful for sanity-checking whether your retrieval problem even needs embeddings.
How it works
The query and every document are lowercased and split into word tokens. The
tool counts how many documents contain each query term to compute the inverse
document frequency, IDF(t) = ln(1 + (N - n + 0.5) / (n + 0.5)), where N is
the document count and n is the number of documents containing the term. Each
document’s score is the sum over query terms of
IDF(t) · f · (k1 + 1) / (f + k1 · (1 - b + b · len/avglen)), where f is the
term frequency in that document, len is the document length in tokens, and
avglen is the average document length. The two knobs, k1 and b, let you
tune term saturation and length normalization respectively.
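The scoring steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the tool's actual source: the names tokenize and bm25_scores are hypothetical, and the regex used to split text into word tokens is a guess at what "word tokens" means here.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Assumption: a "word token" is a run of letters, digits, or underscores.
    return re.findall(r"\w+", text.lower())


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Return one BM25 score per document, in the same order as docs."""
    doc_tokens = [tokenize(d) for d in docs]
    n_docs = len(doc_tokens)
    if n_docs == 0:
        return []
    avglen = sum(len(t) for t in doc_tokens) / n_docs
    term_counts = [Counter(t) for t in doc_tokens]

    scores = [0.0] * n_docs
    # Assumption: a term repeated in the query contributes once per occurrence.
    for term in tokenize(query):
        # n = number of documents that contain the term at least once.
        n = sum(1 for counts in term_counts if term in counts)
        idf = math.log(1 + (n_docs - n + 0.5) / (n + 0.5))
        for i, counts in enumerate(term_counts):
            f = counts[term]
            if f == 0:
                continue  # the term adds nothing to this document's score
            # Length normalization: b=0 ignores length, b=1 applies it fully.
            norm = 1 - b + b * len(doc_tokens[i]) / avglen
            scores[i] += idf * f * (k1 + 1) / (f + k1 * norm)
    return scores


if __name__ == "__main__":
    chunks = [
        "the cat sat on the mat",
        "dogs chase cats",
        "error code E1234 in module",
    ]
    for chunk, score in zip(chunks, bm25_scores("E1234 error", chunks)):
        print(f"{score:.4f}  {chunk}")
```

Only the third chunk contains either query term, so it is the only one with a nonzero score. Raising k1 lets repeated occurrences of a term keep adding to the score for longer before saturating, and lowering b toward 0 reduces the penalty on long documents.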
Tips and notes
- Hybrid usually beats either alone. In practice, many teams run BM25 and embedding search in parallel and fuse the rankings (for example with reciprocal rank fusion), because BM25 nails exact identifiers and rare terms while embeddings catch paraphrase.
- Start with the defaults. k1=1.5 and b=0.75 are the standard starting values; only tune after you have a labeled relevance set to measure against.
- Short queries reward IDF. A rare query term that appears in only one document will dominate the ranking; that is the intended behavior, not a bug.
- Everything stays local. Because nothing is uploaded, you can evaluate proprietary documents and customer queries without them leaving your machine.
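For the hybrid tip, reciprocal rank fusion can be sketched in a few lines of Python. This is an illustration only and is not part of the tool: the function name reciprocal_rank_fusion and the document ids are hypothetical, and k=60 is a commonly used constant rather than a value taken from this page.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document ids into a single ranking.

    Each document earns 1 / (k + rank) from every list it appears in,
    with rank starting at 1. k=60 is a common choice (an assumption here).
    """
    fused = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            fused[doc_id] = fused.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first.
    return sorted(fused, key=lambda doc_id: fused[doc_id], reverse=True)


if __name__ == "__main__":
    bm25_ranking = ["a", "b", "c"]       # hypothetical ids, best first
    embedding_ranking = ["b", "c", "a"]  # hypothetical ids, best first
    print(reciprocal_rank_fusion([bm25_ranking, embedding_ranking]))
```

Because the fusion uses only rank positions, it needs no calibration between BM25 scores and embedding similarities, which live on different scales.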