Redaction pipelines fail quietly: a regex that looks right in review can still leak a phone number with an unusual separator or an email in an attachment field. This tool lets you exercise your redaction regex against realistic text and immediately see both what it catches and, just as importantly, what it misses.
How it works
The tool does two passes over your sample text:
- Your regex is compiled with the global flag and run across the text. Every non-empty match is highlighted in green — these are the spans your pipeline would replace with a redaction marker.
- Built-in PII detectors run independently for emails, phone numbers, credit cards, IPv4 addresses, US Social Security numbers, IBANs, and JWTs. The credit-card detector additionally verifies each candidate with the Luhn checksum so random digit runs are not reported.
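The Luhn check mentioned above can be sketched as follows. This is a minimal illustration, not the tool's actual code: the function name `passesLuhn` and the 13 to 19 digit length bounds are assumptions made for the example.

```typescript
// Hypothetical helper: true when the digits in `candidate` pass the Luhn checksum.
function passesLuhn(candidate: string): boolean {
  // Keep digits only, so "4111 1111 1111 1111" and "4111-1111-1111-1111" are treated alike.
  const digits = candidate.replace(/\D/g, "");
  // Assumed bounds for card-number length; not taken from the tool itself.
  if (digits.length < 13 || digits.length > 19) return false;

  let sum = 0;
  let doubleNext = false; // the rightmost digit is not doubled
  for (let i = digits.length - 1; i >= 0; i--) {
    let d = digits.charCodeAt(i) - 48; // "0" is char code 48
    if (doubleNext) {
      d *= 2;
      if (d > 9) d -= 9; // same as summing the two digits of the product
    }
    sum += d;
    doubleNext = !doubleNext;
  }
  return sum % 10 === 0;
}
```

A random 16-digit run passes this check only about one time in ten, which is why the checksum filters out most digit runs that are not card numbers.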
For each detected PII span, the tool checks whether it overlaps any of your regex matches. If it does not, the value is shown in the red leak list — a piece of sensitive-looking data your regex would have let through.
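The overlap check can be sketched as below, assuming spans are stored as half-open character ranges. The `Span` type and the `overlaps` and `findLeaks` names are hypothetical and chosen only for this illustration.

```typescript
// A half-open character range [start, end) within the sample text.
interface Span {
  start: number;
  end: number;
}

// Two half-open ranges overlap when each one starts before the other ends.
function overlaps(a: Span, b: Span): boolean {
  return a.start < b.end && b.start < a.end;
}

// Returns the detected PII spans that no regex match touches.
function findLeaks(detected: Span[], redacted: Span[]): Span[] {
  return detected.filter((pii) => !redacted.some((m) => overlaps(pii, m)));
}
```

Under this rule a regex match that covers only part of a value still counts as covering it, so a span that is absent from the leak list is not necessarily fully redacted.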
Example
Suppose your regex only matches emails. Paste text that contains both an email and a credit-card number. The email is highlighted green (redacted), but the Luhn-valid card number appears in the red leak list because your pattern never covered it. That is the exact gap that causes production incidents, surfaced before you ship.
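The scenario above can be reproduced in a few lines. This is a sketch under stated assumptions: both patterns are deliberately simple stand-ins, the helper `spansOf` is hypothetical, and the stand-in card detector omits the Luhn verification that the built-in detector performs.

```typescript
// Sample text with one email and one card-shaped number
// (4111 1111 1111 1111 is a widely used Luhn-valid test number).
const sample = "Contact jane@example.com, card 4111 1111 1111 1111.";

// Collects [start, end) ranges for every match of a global regex.
// Assumes the pattern cannot match the empty string.
function spansOf(re: RegExp, text: string): Array<[number, number]> {
  const out: Array<[number, number]> = [];
  re.lastIndex = 0;
  let m: RegExpExecArray | null;
  while ((m = re.exec(text)) !== null) {
    out.push([m.index, m.index + m[0].length]);
  }
  return out;
}

// The user's regex: emails only.
const userRegex = /[\w.+-]+@[\w-]+(?:\.[\w-]+)+/g;
// Stand-in card detector: 16 digits in groups of four.
const cardDetector = /\b\d{4}(?:[ -]?\d{4}){3}\b/g;

const redacted = spansOf(userRegex, sample);
const leaks = spansOf(cardDetector, sample)
  // Keep card spans that no redacted span overlaps.
  .filter(([s, e]) => !redacted.some(([rStart, rEnd]) => s < rEnd && rStart < e))
  .map(([s, e]) => sample.slice(s, e));
// leaks → ["4111 1111 1111 1111"]
```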
Notes and tips
- The global flag is always applied so matching is exhaustive; add i, m, or s as your pattern needs.
- An empty-match regex (for example a*) is guarded so it cannot loop forever: zero-width matches are skipped.
- The detectors are heuristics tuned to reduce false positives. They will not know about your internal account IDs or custom tokens, so pair this with domain-specific tests.
- Everything runs locally, so it is safe to test with real-shaped sample data.
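The forced global flag and the zero-width guard from the notes above might look like this. It is one common way to write such a loop, not necessarily how the tool implements it, and `collectMatches` is a hypothetical name.

```typescript
// Collects every non-empty match of `source`, always applying the global flag.
function collectMatches(source: string, flags: string, text: string): string[] {
  // Add "g" only if it is missing; a duplicated flag would throw a SyntaxError.
  const re = new RegExp(source, flags.includes("g") ? flags : flags + "g");
  const found: string[] = [];
  let m: RegExpExecArray | null;
  while ((m = re.exec(text)) !== null) {
    if (m[0].length === 0) {
      // A zero-width match leaves lastIndex unchanged, so exec would return
      // the same position forever. Advance by one and skip the match.
      re.lastIndex++;
      continue;
    }
    found.push(m[0]);
  }
  return found;
}
```

With this guard, `collectMatches("a*", "", "baab")` terminates and returns `["aa"]`, because the empty matches at the other positions are skipped.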