What metrics does it use?

Unigram (single-word) Jaccard overlap, bigram (word-pair) Jaccard overlap, and a smoothed sentence-level BLEU score averaged in both directions. These are combined into one weighted composite score.
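For the BLEU part of that answer, here is a minimal Python sketch of a smoothed sentence-level score averaged in both directions. The page does not say which smoothing method or tokeniser the tool uses, so the add-one smoothing, the regex tokeniser, and the function names `sentence_bleu` and `bidirectional_bleu` are all assumptions made for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Assumed tokeniser: lowercase, then runs of letters, digits, apostrophes.
    return re.findall(r"[a-z0-9']+", text.lower())


def ngram_counts(tokens, n):
    # Count every contiguous n-gram; empty if the text is shorter than n.
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def sentence_bleu(candidate, reference, max_n=4):
    # One-directional BLEU over 1- to max_n-grams with a brevity penalty.
    if not candidate or not reference:
        return 0.0
    log_precision_sum = 0.0
    for n in range(1, max_n + 1):
        cand = ngram_counts(candidate, n)
        ref = ngram_counts(reference, n)
        # Clipped matches: an n-gram counts at most as often as the reference has it.
        matches = sum(min(count, ref[gram]) for gram, count in cand.items())
        total = sum(cand.values())
        # Add-one smoothing (an assumption) keeps zero matches from zeroing the score.
        # Orders longer than the text end up with a neutral precision of 1.
        log_precision_sum += math.log((matches + 1) / (total + 1))
    c, r = len(candidate), len(reference)
    brevity_penalty = 1.0 if c >= r else math.exp(1 - r / c)
    return brevity_penalty * math.exp(log_precision_sum / max_n)


def bidirectional_bleu(text_a, text_b):
    # Score A against B and B against A, then average, so neither text is "the reference".
    a, b = tokenize(text_a), tokenize(text_b)
    return (sentence_bleu(a, b) + sentence_bleu(b, a)) / 2
```

Averaging both directions makes the score symmetric: swapping the two texts gives the same result, which suits a comparison where neither side is the reference.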

Can two paraphrases score low?

Yes. These are surface metrics that look at shared words, not meaning. A genuine paraphrase that reuses few of the same words will score lower than its meaning warrants, so always read both texts.

Can two different texts score high?

Also yes — texts about the same topic share vocabulary even when they make different claims. Use the score as a fast signal, not a final judgement.

Is my text uploaded anywhere?

No. All comparison runs locally in your browser. Nothing is sent to a server.

What is the Paraphrase Similarity Checker?

It compares two texts using word and bigram overlap plus a sentence-level BLEU score to estimate whether they are paraphrases, and highlights which words are shared and which are unique to each side. It is free to use on Gera Tools and runs entirely in your browser, with nothing uploaded.

Paraphrase Similarity Checker

Name: Paraphrase Similarity Checker
Creator: Gera Tools
License: https://creativecommons.org/licenses/by/4.0/

What this checks

Sometimes you regenerate an LLM response and want to know whether the new answer actually differs from the old one, or whether two model outputs are saying the same thing in different words. This tool measures the lexical similarity between two texts — how much vocabulary and phrasing they share — and gives you a single composite score plus a word-level breakdown.

It is a fast, deterministic, in-browser check. It does not call a model and does not claim to understand meaning.

How it works

The texts are lowercased and tokenised into words. The tool then computes three complementary signals:

Unigram overlap — the Jaccard similarity of the two word sets: how much of the combined vocabulary appears in both texts.
Bigram overlap — the same measure on adjacent word pairs, which captures shared phrasing and word order, not just shared words.
BLEU — a smoothed, bidirectional sentence-level BLEU score over 1- to 4-grams with a brevity penalty, the standard surface metric from machine translation.
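The two overlap signals in the list above can be sketched in a few lines of Python. This is an illustration under assumptions, not the tool's own code: the regex tokeniser and the helper names `tokenize`, `bigrams`, and `jaccard` are hypothetical, and the example sentences are made up.

```python
import re


def tokenize(text: str) -> list[str]:
    # Lowercase, then keep runs of letters, digits, and apostrophes as words.
    # The tool's real tokeniser is not documented; this regex is an assumption.
    return re.findall(r"[a-z0-9']+", text.lower())


def bigrams(tokens: list[str]) -> set[tuple[str, str]]:
    # Adjacent word pairs, collected as a set.
    return set(zip(tokens, tokens[1:]))


def jaccard(a: set, b: set) -> float:
    # |A ∩ B| / |A ∪ B|; two empty sets are treated as identical here.
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


old = tokenize("The cat sat on the mat")
new = tokenize("The cat lay on the mat")

unigram_score = jaccard(set(old), set(new))          # 4 shared of 6 distinct words
bigram_score = jaccard(bigrams(old), bigrams(new))   # 3 shared of 7 distinct pairs
```

Changing one word removes one entry from the shared vocabulary but breaks two adjacent pairs, which is why the bigram score (3/7) drops further than the unigram score (4/6) for the same edit.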

These are blended into one composite percentage and turned into a verdict: likely paraphrases, partial overlap, or likely different content. The word lists below show exactly which terms are shared and which are unique to each side.
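As a rough sketch of that blending step, the Python below combines three scores in the 0 to 1 range into a percentage, maps it to one of the three verdicts, and builds the shared and unique word lists. The page does not publish the weights or the verdict cut-offs, so every number here (`WEIGHTS`, `PARAPHRASE_CUTOFF`, `PARTIAL_CUTOFF`) is a hypothetical placeholder, as are the function names.

```python
# Hypothetical weights and cut-offs: the tool's real values are not published.
WEIGHTS = {"unigram": 0.4, "bigram": 0.3, "bleu": 0.3}
PARAPHRASE_CUTOFF = 60.0
PARTIAL_CUTOFF = 30.0


def composite_percent(unigram: float, bigram: float, bleu: float) -> float:
    # Weighted average of the three 0..1 signals, reported as a percentage.
    blended = (
        WEIGHTS["unigram"] * unigram
        + WEIGHTS["bigram"] * bigram
        + WEIGHTS["bleu"] * bleu
    )
    return round(100 * blended, 1)


def verdict(percent: float) -> str:
    # Map the composite percentage to one of the three labels.
    if percent >= PARAPHRASE_CUTOFF:
        return "likely paraphrases"
    if percent >= PARTIAL_CUTOFF:
        return "partial overlap"
    return "likely different content"


def word_breakdown(tokens_a: list[str], tokens_b: list[str]) -> dict[str, list[str]]:
    # Which words appear on both sides, and which only on one.
    a, b = set(tokens_a), set(tokens_b)
    return {
        "shared": sorted(a & b),
        "only_a": sorted(a - b),
        "only_b": sorted(b - a),
    }
```

Keeping the weights and cut-offs as named constants makes it easy to see how sensitive a verdict is to them: a borderline pair can change label when a cut-off moves a few points.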

How to read it — and its limits

Lexical metrics are a starting point, not the truth. Two important failure modes:

A real paraphrase that swaps in synonyms (for example, a sentence rewritten as “The feline rested on the ledge”) shares few words with the original and will score lower than it should.
Two texts on the same topic but with opposite claims (“profits rose” vs “profits fell”) share most of their words and will score high despite meaning the reverse.
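Both failure modes are easy to reproduce with plain unigram overlap. In the Python sketch below, the original sentence for the synonym case and the longer "profits" sentences are assumed examples written for illustration, and `unigram_jaccard` is a hypothetical helper that splits on whitespace rather than using the tool's tokeniser.

```python
def unigram_jaccard(a: str, b: str) -> float:
    # Jaccard similarity of the two lowercase word sets (whitespace split).
    words_a, words_b = set(a.lower().split()), set(b.lower().split())
    if not words_a and not words_b:
        return 1.0
    return len(words_a & words_b) / len(words_a | words_b)


# Same meaning, different words: only "the" and "on" are shared.
synonym_score = unigram_jaccard(
    "The cat sat on the windowsill",   # assumed original, not from the page
    "The feline rested on the ledge",
)

# Opposite meaning, nearly identical words: one word of six differs.
opposite_score = unigram_jaccard(
    "Quarterly profits rose sharply this year",
    "Quarterly profits fell sharply this year",
)
```

Here the genuine paraphrase scores 2/8 while the contradictory pair scores 5/7, so the ranking by score is the reverse of the ranking by meaning.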

So use the score to triage: high scores are worth a closer read for redundancy, and low scores suggest divergence but do not prove it, since a synonym-heavy paraphrase also scores low. Make the final call by reading both texts. For meaning-level similarity, you need embeddings and cosine similarity, not word overlap.
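For reference, cosine similarity itself is a short calculation once you have one embedding vector per text. The Python sketch below covers only that last step. Producing the vectors requires a sentence-embedding model, which is outside this tool and not shown, and the three-dimensional vectors are made-up placeholders rather than real embeddings.

```python
import math


def cosine_similarity(u: list[float], v: list[float]) -> float:
    # Dot product divided by the product of the vector lengths.
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_u * norm_v)


# Placeholder vectors standing in for embeddings of two texts.
embedding_a = [0.2, 0.7, 0.1]
embedding_b = [0.4, 1.4, 0.2]   # same direction as embedding_a
similarity = cosine_similarity(embedding_a, embedding_b)
```

Because the measure depends only on direction, vectors that point the same way score close to 1 regardless of their length.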