Diff two AI responses
When you iterate on a prompt (change the system message, bump the temperature, upgrade the model), you need to know what actually changed in the output. Eyeballing two long responses misses subtle regressions. This tool computes a real diff between version A and version B and marks every span as an addition, a deletion, or unchanged.
How it works
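The diff computes the longest common subsequence (LCS) between the two token streams
(lines, or words within lines), the same idea that underlies git diff. Tokens in A that
fall outside the common subsequence are deletions, tokens in B that fall outside it are
additions, and tokens inside it are unchanged. A token that only moves therefore shows
up as a deletion at its old position and an addition at its new one. The result is
rendered with colour coding so the differences jump out.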
A: The capital is Paris.
B: The capital city is Paris.
-> The capital [+city] is Paris.
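The example above can be reproduced with a minimal word-level LCS diff. This is only a sketch of the general technique, not this tool's actual implementation: the function names (lcs_table, diff_tokens, render) and the [+word] / [-word] rendering are assumptions made for illustration, and tokens are split on whitespace only.

```python
def lcs_table(a, b):
    # dp[i][j] holds the LCS length of the suffixes a[i:] and b[j:].
    dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i in range(len(a) - 1, -1, -1):
        for j in range(len(b) - 1, -1, -1):
            if a[i] == b[j]:
                dp[i][j] = dp[i + 1][j + 1] + 1
            else:
                dp[i][j] = max(dp[i + 1][j], dp[i][j + 1])
    return dp


def diff_tokens(a, b):
    # Walk the table from the start, emitting ("=", tok), ("-", tok) or ("+", tok).
    dp = lcs_table(a, b)
    ops = []
    i = j = 0
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            ops.append(("=", a[i]))
            i += 1
            j += 1
        elif dp[i + 1][j] >= dp[i][j + 1]:
            ops.append(("-", a[i]))  # token of A outside the common subsequence
            i += 1
        else:
            ops.append(("+", b[j]))  # token of B outside the common subsequence
            j += 1
    # Whatever is left on either side is a pure deletion or addition.
    ops.extend(("-", tok) for tok in a[i:])
    ops.extend(("+", tok) for tok in b[j:])
    return ops


def render(ops):
    # Hypothetical plain-text rendering; a real UI would use colour instead.
    return " ".join(tok if op == "=" else f"[{op}{tok}]" for op, tok in ops)


a = "The capital is Paris.".split()
b = "The capital city is Paris.".split()
print(render(diff_tokens(a, b)))  # The capital [+city] is Paris.
```

The table costs O(len(A) x len(B)) time and memory, which is fine for a pair of responses but is one reason production diff tools use more economical algorithms to reach an equivalent result.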
Tips and notes
Use line-level diff to catch reordered paragraphs or added list items, and word-level diff to catch a single changed number inside a sentence. Enable trimming to ignore trailing whitespace so only meaningful edits show. For regression testing, keep a golden output in A and paste each new run into B: an empty diff is a pass, and any highlighted span shows exactly what to review.
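If you want to script the golden-output check outside the tool, a rough equivalent can be built on Python's standard difflib module. This is a sketch under stated assumptions: trim and regression_check are hypothetical helper names, the trimming step only strips trailing whitespace per line, and difflib uses its own matching heuristic rather than a strict LCS, so its hunks may not match this tool's highlighting exactly.

```python
import difflib


def trim(text):
    # Split into lines and drop trailing whitespace, mirroring the trimming option.
    return [line.rstrip() for line in text.splitlines()]


def regression_check(golden, candidate):
    # Line-level comparison; returns (passed, diff_lines).
    a, b = trim(golden), trim(candidate)
    diff = list(
        difflib.unified_diff(
            a, b, fromfile="A (golden)", tofile="B (new run)", lineterm=""
        )
    )
    return (not diff), diff


passed, diff = regression_check("Total: 42\nDone\n", "Total: 43\nDone\n")
print("PASS" if passed else "FAIL")
print("\n".join(diff))
```

An exact-match check like this suits deterministic settings (for example, temperature 0); with sampled outputs, some differences are expected and the diff is better read as a review aid than as a strict gate.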