What is cosine similarity?

Cosine similarity is the cosine of the angle between two embedding vectors: a value from −1 to 1, where higher means more semantically similar. It depends only on the direction of the vectors, not their length, so it compares meaning rather than magnitude.

Which embedding model is used?

By default the tool calls OpenAI's text-embedding-3-small; text-embedding-3-large is available for higher accuracy. Both return vectors normalized to unit length, which suits cosine similarity ranking.
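
Because normalized vectors have length 1, the denominator of the cosine formula is 1 and the score reduces to a plain dot product. A minimal Python sketch of that point, using made-up 3-dimensional vectors rather than real embeddings (the helper names are illustrative, not part of the tool):

```python
import math

def dot(a, b):
    """Sum of element-wise products of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def norm(a):
    """Euclidean length of a vector."""
    return math.sqrt(dot(a, a))

def normalize(a):
    """Scale a vector to unit length."""
    n = norm(a)
    return [x / n for x in a]

# Toy vectors, not real embeddings.
u = normalize([3.0, 4.0, 0.0])
v = normalize([4.0, 3.0, 0.0])

cosine = dot(u, v) / (norm(u) * norm(v))

# For unit vectors, cosine similarity and the dot product agree
# up to floating-point error.
print(abs(cosine - dot(u, v)) < 1e-12)
```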

Does this cost money?

It uses your own API key, so you pay OpenAI's embedding rate directly — which is very cheap (fractions of a cent for a handful of short candidates). Nothing is billed by this tool.
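
As a rough worked example of "fractions of a cent", the arithmetic can be sketched in Python as below. The price of 0.02 USD per million tokens and the figure of about 20 tokens per string are assumptions for illustration only; check OpenAI's current pricing page for the real rate.

```python
def embedding_cost_usd(total_tokens, usd_per_million_tokens):
    """Estimated cost of embedding `total_tokens` tokens at a given rate."""
    return total_tokens / 1_000_000 * usd_per_million_tokens

# ASSUMPTION: illustrative rate, not taken from this page.
ASSUMED_USD_PER_MILLION_TOKENS = 0.02

# One query plus ten candidates, assumed to be about 20 tokens each.
tokens = 11 * 20

cost = embedding_cost_usd(tokens, ASSUMED_USD_PER_MILLION_TOKENS)
print(f"{tokens} tokens cost about ${cost:.7f}")
```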

Is my data private?

Your query and candidates are sent only to OpenAI to compute embeddings, using the key you provide. The tool itself stores nothing and runs the ranking math locally in your browser.

What is the Similarity Score Ranker?

The Similarity Score Ranker is a free embedding similarity tool on Gera Tools. Enter a query and a list of candidate strings, supply your own OpenAI API key, and it ranks every candidate by cosine similarity to the query, showing the exact scores and ordering. The ranking runs in your browser; your text is sent only to OpenAI to compute the embeddings.

Similarity Score Ranker

Name: Similarity Score Ranker
Creator: Gera Tools
License: https://creativecommons.org/licenses/by/4.0/

Rank candidates by real semantic similarity

Want to know which of several texts best matches a query? This tool fetches real embeddings for your query and every candidate using your own API key, then ranks the candidates by cosine similarity so you can see exactly how a semantic search would order them — scores and all.
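
For readers who want to reproduce the embedding step outside the browser, here is a minimal Python sketch of a request to OpenAI's embeddings endpoint using only the standard library. The function names are hypothetical and this is not the tool's own code; it only illustrates the shape of the call (model name, list of input strings, bearer key).

```python
import json
import urllib.request

EMBEDDINGS_URL = "https://api.openai.com/v1/embeddings"

def build_embedding_request(api_key, texts, model="text-embedding-3-small"):
    """Build (but do not send) a POST request for a batch of texts."""
    body = json.dumps({"model": model, "input": texts}).encode("utf-8")
    return urllib.request.Request(
        EMBEDDINGS_URL,
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

def fetch_embeddings(api_key, texts, model="text-embedding-3-small"):
    """Send the request and return one vector per input text, in input order."""
    request = build_embedding_request(api_key, texts, model)
    with urllib.request.urlopen(request) as response:
        payload = json.load(response)
    # Each item carries an index; sort so vectors line up with the inputs.
    items = sorted(payload["data"], key=lambda item: item["index"])
    return [item["embedding"] for item in items]
```

Sending the query and all candidates in one `input` list keeps it to a single request; the first returned vector is then the query and the rest are the candidates.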

How embedding similarity works

An embedding model turns a piece of text into a vector: a list of numbers that captures its meaning. Two texts that mean similar things end up with vectors pointing in similar directions. Cosine similarity is the cosine of the angle between two vectors:

similarity = (A · B) / (‖A‖ × ‖B‖)

The result ranges from −1 to 1; for typical text embeddings it sits between roughly 0 and 1, where higher means more related. Because it uses the angle and not the magnitude, cosine similarity compares meaning rather than length.
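
The formula translates directly into a few lines of Python. This is a minimal sketch with toy vectors rather than real embeddings, and it also shows the magnitude point: doubling a vector's length leaves the score unchanged.

```python
import math

def cosine_similarity(a, b):
    """Return (A · B) / (‖A‖ × ‖B‖) for two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)

a = [1.0, 2.0, 3.0]
b = [2.0, 4.0, 6.0]     # same direction as a, twice the length
c = [-1.0, -2.0, -3.0]  # opposite direction to a
d = [3.0, 0.0, -1.0]    # perpendicular to a (dot product is 0)

print(cosine_similarity(a, b))  # close to 1
print(cosine_similarity(a, c))  # close to -1
print(cosine_similarity(a, d))  # 0
```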

This is the exact operation behind semantic search and RAG retrieval: embed the query, embed each chunk, and return the highest-scoring matches.
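
The retrieval loop described above can be sketched in Python as follows. The 2-dimensional vectors and candidate texts are invented for illustration; in practice each vector would come from the embedding model, and `rank_candidates` is a hypothetical helper name.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def rank_candidates(query_vec, candidates, top_k=None):
    """Score (text, vector) pairs against the query; highest score first."""
    scored = [(text, cosine_similarity(query_vec, vec)) for text, vec in candidates]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored if top_k is None else scored[:top_k]

# Toy data: made-up vectors standing in for real embeddings.
query_vec = [1.0, 0.0]
candidates = [
    ("careers page", [0.0, 1.0]),
    ("refund policy", [0.9, 0.1]),
    ("shipping times", [0.5, 0.5]),
]

for text, score in rank_candidates(query_vec, candidates):
    print(f"{score:.4f}  {text}")
```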

What this tool is useful for

Debugging a RAG retrieval system

If your retrieval step keeps surfacing irrelevant chunks, paste your actual query and the top-retrieved chunks as candidates. The similarity scores give you evidence, rather than a gut feeling, for whether the problem lies in the query wording, the chunk granularity, or the choice of embedding model.

Comparing candidate answers or documents

Paste a reference answer as the query and multiple candidate answers as candidates. The ranking shows which candidates are semantically closest to your gold standard, which is useful for automated evaluation of generated text.

Tuning chunking strategy

Embed the same query against short chunks (200 tokens) versus long chunks (800 tokens) of the same document. Compare scores to see which granularity gives a stronger signal for your content type. Technical documentation often favours smaller chunks; narrative content often favours larger ones.
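
To prepare the two candidate sets, a simple splitter is enough. The Python sketch below counts whitespace-separated words as a stand-in for tokens, which is only an approximation (a real tokenizer will give different counts); `chunk_words` is a hypothetical helper, not something the tool provides.

```python
def chunk_words(text, size):
    """Split text into consecutive chunks of at most `size` words."""
    if size <= 0:
        raise ValueError("size must be positive")
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]

# Placeholder document of 1,000 words for illustration.
document = " ".join(f"word{i}" for i in range(1000))

short_chunks = chunk_words(document, 200)  # roughly the 200-token case
long_chunks = chunk_words(document, 800)   # roughly the 800-token case

print(len(short_chunks), len(long_chunks))
```

Each list can then be pasted in as candidates against the same query, and the top scores of the two runs compared.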

Checking synonym coverage

If you suspect a keyword gap, for example users ask for “affordable” but your copy says “low-cost”, enter one term as the query and the other as a candidate. A high similarity score (above roughly 0.85 for text-embedding-3-small, as a rule of thumb rather than a fixed cutoff) suggests the model treats them as near-equivalent, so retrieval should work without exact keyword matching.

Tips for using the scores

Compare scores within a batch, not across batches. A score of 0.55 may be excellent in one candidate set and mediocre in another. What matters is relative ranking.
Use the large model (text-embedding-3-large) when you need finer discrimination between close candidates; use the small model for bulk ranking at lower cost.
Keep candidates focused. Embedding a whole document dilutes its meaning across many topics; embed the specific paragraphs you want to retrieve.
Watch for ties at the top. If several candidates score within 0.01 of each other, the model cannot reliably distinguish them — consider reformulating the query to be more specific, or chunking the candidates differently.
Your API key is used directly. The key goes only to OpenAI for embedding requests and is never stored by this tool. You pay OpenAI’s embedding rate for each request, which is typically fractions of a cent for a handful of short strings.
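
The tie check from the tips above is easy to automate once you have a ranked list. A minimal Python sketch, assuming the results are (text, score) pairs already sorted from highest to lowest; the 0.01 tolerance comes from the tip, while the function name and sample scores are made up:

```python
def near_ties(ranked, tolerance=0.01):
    """Return the candidates whose score is within `tolerance` of the top score."""
    if not ranked:
        return []
    top_score = ranked[0][1]
    return [(text, score) for text, score in ranked if top_score - score <= tolerance]

# Invented scores for illustration.
ranked = [
    ("reset your password", 0.620),
    ("change your password", 0.615),
    ("delete your account", 0.480),
]

tied = near_ties(ranked)
if len(tied) > 1:
    print("Top candidates are too close to call:", [text for text, _ in tied])
```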