CSV Outlier Detector

Flag statistical outliers in numeric CSV columns using IQR or Z-score.

Spot anomalies in tabular data

Bad data hides in spreadsheets — a price typed with an extra zero, a sensor that spiked, a duplicated row. This tool reads a CSV, finds the numeric columns, and flags the values that sit far outside the rest of their column using one of two standard statistical methods: the IQR fence or the Z-score. Everything is computed locally, so you can vet a confidential dataset without uploading it.

How it works

The CSV is parsed in the browser (handling quoted fields and commas inside quotes). For each column the tool decides whether it is numeric by checking that most non-empty cells parse as numbers. Then:
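The column-detection step can be sketched in a few lines of Python. This is an illustration only, not the tool's own code: the function names, the assumption that the first row is a header, and the 0.8 cutoff standing in for "most" are all hypothetical choices.

```python
import csv
import io
import math


def parse_number(cell):
    """Return the cell as a float, or None if it is empty or not a finite number."""
    text = cell.strip()
    if not text:
        return None
    try:
        value = float(text)
    except ValueError:
        return None
    # float() also accepts "nan" and "inf"; this sketch treats them as non-numeric.
    return value if math.isfinite(value) else None


def numeric_columns(csv_text, min_fraction=0.8):
    """Map each numeric column name to its parsed values.

    min_fraction is an assumed stand-in for "most": a column counts as
    numeric when at least that share of its non-empty cells parse as numbers.
    The first row is assumed to be a header.
    """
    # csv.reader handles quoted fields, embedded commas and doubled quotes.
    rows = list(csv.reader(io.StringIO(csv_text)))
    if not rows:
        return {}
    header, body = rows[0], rows[1:]
    result = {}
    for index, name in enumerate(header):
        # Short (ragged) rows simply contribute no cell to this column.
        cells = [row[index] for row in body if index < len(row)]
        non_empty = [cell for cell in cells if cell.strip()]
        numbers = [v for v in map(parse_number, non_empty) if v is not None]
        if non_empty and len(numbers) / len(non_empty) >= min_fraction:
            result[name] = numbers
    return result
```

Non-numeric cells in a numeric column (such as "n/a") are dropped from the returned values, so they never enter the statistics.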

  • IQR method: the column is sorted, Q1 and Q3 are taken at the 25th and 75th percentiles (linear interpolation), and IQR = Q3 − Q1. A value is an outlier if it is below Q1 − k·IQR or above Q3 + k·IQR. The multiplier k defaults to 1.5; raising it to 3.0 flags only extreme outliers.
  • Z-score method: the column’s mean μ and sample standard deviation σ are computed, and each value x is assigned z = (x − μ) / σ. A value is an outlier when |z| exceeds your threshold (default 3).
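Both rules are short enough to write out. The Python sketch below is a minimal illustration under stated assumptions, not the tool's implementation: the function names are hypothetical, and the percentile position p·(n − 1) is one common linear-interpolation convention that the tool may or may not share.

```python
import statistics


def percentile(sorted_values, p):
    """Percentile by linear interpolation at position p * (n - 1), with p in [0, 1]."""
    if not sorted_values:
        raise ValueError("percentile() needs at least one value")
    position = p * (len(sorted_values) - 1)
    lower = int(position)  # floor, since position is never negative
    upper = min(lower + 1, len(sorted_values) - 1)
    fraction = position - lower
    return sorted_values[lower] + fraction * (sorted_values[upper] - sorted_values[lower])


def iqr_bounds(values, k=1.5):
    """Return the fences (Q1 - k*IQR, Q3 + k*IQR)."""
    ordered = sorted(values)
    q1 = percentile(ordered, 0.25)
    q3 = percentile(ordered, 0.75)
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def iqr_outliers(values, k=1.5):
    """Values strictly outside the IQR fences."""
    low, high = iqr_bounds(values, k)
    return [x for x in values if x < low or x > high]


def zscore_outliers(values, threshold=3.0):
    """Values whose |z| exceeds the threshold, using the sample standard deviation."""
    if len(values) < 2:
        return []  # the sample standard deviation is undefined
    mean = statistics.fmean(values)
    sigma = statistics.stdev(values)
    if sigma == 0:
        return []  # every value is identical, so nothing stands out
    return [x for x in values if abs((x - mean) / sigma) > threshold]
```

For the eight values 10, 12, 11, 13, 12, 11, 10, 100, the IQR fences come out at 8.5 and 14.5, so 100 is flagged. The Z-score rule at threshold 3 flags nothing here: with the sample standard deviation, |z| in a column of n values can never exceed (n − 1)/√n, which is about 2.47 for n = 8.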

Empty and non-numeric cells are ignored in the statistics and never flagged.

Tips and notes

  • IQR is the safer default for skewed or heavy-tailed data because quartiles are not dragged around by the outliers themselves.
  • With Z-score, a single huge value inflates σ and can hide smaller anomalies — if you suspect that, switch to IQR.
  • The bounds used for each column are shown so you can sanity-check whether the threshold matches your domain knowledge.
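To see the masking effect described in the tips, here is a small Python comparison on made-up readings. The data and names are purely illustrative, and `statistics.quantiles` with `method="inclusive"` is used as one possible linear-interpolation quartile rule, which may differ from the one the tool applies.

```python
import statistics

# Hypothetical sensor readings: 18 ordinary values, one modest anomaly (45)
# and one huge spike (5000).
readings = [20, 21, 19, 20, 22, 21, 20, 19, 21, 20,
            20, 21, 19, 20, 22, 21, 20, 19, 45, 5000]

# Z-score rule: the spike inflates both the mean and sigma.
mean = statistics.fmean(readings)
sigma = statistics.stdev(readings)  # sample standard deviation
z_flagged = [x for x in readings if abs((x - mean) / sigma) > 3]

# IQR rule: quartiles depend on the middle of the data, not on the extremes.
q1, _, q3 = statistics.quantiles(readings, n=4, method="inclusive")
iqr = q3 - q1
low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
iqr_flagged = [x for x in readings if x < low or x > high]

print("Z-score flags:", z_flagged)   # only the spike
print("IQR flags:", iqr_flagged)     # the spike and the smaller anomaly
print("IQR bounds:", (low, high))
```

The spike pushes σ above 1000, so 45 ends up with |z| of roughly 0.2 and goes unnoticed, while the IQR fences of 18.5 and 22.5 catch both values.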