Model bias is a systematic, unfair difference in a model's behaviour across groups of people. It usually originates in unrepresentative or historically skewed training data, but can also come from labels, features, objective functions and feedback loops.

What is demographic parity versus equalised odds?

Demographic parity asks whether each group receives positive predictions at the same rate, regardless of the true outcome. Equalised odds asks whether error rates are equal across groups, meaning both the false positive rate and the false negative rate match. The two can conflict, particularly when base rates differ between groups, so choose the metric that matches your harm model.
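As a rough illustration of the difference, the sketch below computes both metrics on a small invented dataset. The function names (`group_rates`, `gap`) and the toy labels are assumptions made for this example, not part of the checklist tool. Predictions and labels are assumed to be 0 or 1.

```python
import math
from collections import defaultdict

def group_rates(y_true, y_pred, groups):
    """Per-group selection rate, true positive rate and false positive rate.

    Assumes labels and predictions are 0 or 1.
    """
    counts = defaultdict(lambda: {"n": 0, "pred_pos": 0, "pos": 0, "neg": 0, "tp": 0, "fp": 0})
    for truth, pred, group in zip(y_true, y_pred, groups):
        c = counts[group]
        c["n"] += 1
        c["pred_pos"] += pred
        if truth == 1:
            c["pos"] += 1
            c["tp"] += pred
        else:
            c["neg"] += 1
            c["fp"] += pred
    rates = {}
    for group, c in counts.items():
        rates[group] = {
            "selection_rate": c["pred_pos"] / c["n"],
            # Undefined when a group has no positives (or no negatives).
            "tpr": c["tp"] / c["pos"] if c["pos"] else math.nan,
            "fpr": c["fp"] / c["neg"] if c["neg"] else math.nan,
        }
    return rates

def gap(rates, key):
    """Largest minus smallest value of one rate across groups, ignoring undefined rates."""
    values = [r[key] for r in rates.values() if not math.isnan(r[key])]
    return max(values) - min(values)

# Invented toy data: group "a" has a 50% base rate, group "b" a 25% base rate.
y_true = [1, 1, 0, 0, 1, 0, 0, 0]
y_pred = [1, 1, 0, 0, 1, 1, 0, 0]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]

rates = group_rates(y_true, y_pred, groups)
demographic_parity_gap = gap(rates, "selection_rate")               # 0.0
equalised_odds_gap = max(gap(rates, "tpr"), gap(rates, "fpr"))      # about 0.33
```

Here both groups are selected at the same rate, so demographic parity holds, yet group "b" has a higher false positive rate, so equalised odds does not. What size of gap is tolerable is a judgment the code cannot make for you.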

Should I remove protected attributes to avoid bias?

Often no. Dropping a protected attribute rarely removes bias because proxies (postcode, name, purchase history) leak it back in. You usually need the attribute available for measurement even if it is excluded from the model's inputs.
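One simple way to look for a proxy is to ask how well a single feature predicts the protected attribute on its own. The sketch below is a minimal, assumption-laden version: `proxy_leakage` is a hypothetical helper, the postcodes and group labels are invented, and the accuracy is measured on the same rows it is fitted to, so it is optimistic.

```python
from collections import Counter, defaultdict

def proxy_leakage(feature_values, protected_values):
    """Return (baseline, proxy_accuracy).

    baseline: accuracy of always guessing the most common protected value.
    proxy_accuracy: accuracy of guessing the most common protected value
    within each feature value (in-sample, so an upper bound in practice).
    """
    n = len(protected_values)
    baseline = Counter(protected_values).most_common(1)[0][1] / n
    by_value = defaultdict(Counter)
    for feature, protected in zip(feature_values, protected_values):
        by_value[feature][protected] += 1
    correct = sum(c.most_common(1)[0][1] for c in by_value.values())
    return baseline, correct / n

# Invented data: postcode almost determines group membership.
postcodes = ["N1", "N1", "N1", "S2", "S2", "S2", "E3", "E3"]
group = ["x", "x", "x", "y", "y", "y", "x", "y"]

baseline, proxy_accuracy = proxy_leakage(postcodes, group)
# baseline is 0.5 and proxy_accuracy is 0.875: postcode leaks the attribute.
```

A proxy accuracy well above the baseline suggests the feature carries the protected attribute even when the attribute itself is excluded. Note that the check needs the attribute to be available, which is the reason to keep it for measurement.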

What is a feedback loop in ML fairness?

A feedback loop occurs when a model's outputs shape the future data it learns from. For example, a predictive policing model concentrates patrols in one area, the extra patrols record more incidents there, and those records then appear to confirm the model's prediction. This can amplify small initial biases over time.
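The amplification can be seen in a toy simulation. The sketch below is purely illustrative and does not model any real system: it assumes two districts with identical true incident rates, recorded incidents proportional to patrol presence, and a hypothetical allocation rule (`sharpness` above 1) that over-weights whichever district has more recorded incidents.

```python
def simulate_patrol_share(initial_share, rounds, sharpness=2.0):
    """Track the share of patrols sent to district A over retraining rounds.

    Both districts have the same true incident rate, so any drift in the
    share comes from the loop itself, not from the underlying reality.
    """
    share = initial_share
    history = [share]
    for _ in range(rounds):
        # Recorded incidents scale with patrol presence, since true rates are equal.
        recorded_a, recorded_b = share, 1.0 - share
        # Hypothetical rule: the model over-weights hot spots when sharpness > 1.
        weight_a = recorded_a ** sharpness
        weight_b = recorded_b ** sharpness
        share = weight_a / (weight_a + weight_b)
        history.append(share)
    return history

amplified = simulate_patrol_share(initial_share=0.55, rounds=5)
stable = simulate_patrol_share(initial_share=0.55, rounds=5, sharpness=1.0)
```

Starting from a 55/45 split, the amplified run ends with nearly all patrols in district A, while the proportional run (`sharpness=1.0`) stays at 55/45. Even in the stable case the initial skew never corrects itself, because the data never reveals that the two districts are the same.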

Does passing this checklist mean my model is fair?

No. The checklist ensures you have looked in the right places and documented your reasoning. Fairness is contextual and contested — the goal is a defensible, evidenced process, not a single pass/fail score.

What is the Model Bias Review Checklist?

The Model Bias Review Checklist walks you through a systematic bias review covering training-data representativeness, demographic parity, protected attributes, feedback loops and documentation gaps, then exports a reviewable record for your model card. It runs free in your browser on Gera Tools, with nothing uploaded.

Model Bias Review Checklist

Name: Model Bias Review Checklist
Creator: Gera Tools
License: https://creativecommons.org/licenses/by/4.0/

Catch model bias before it reaches production

Bias rarely announces itself. A model can post excellent aggregate accuracy while quietly performing far worse for a subgroup that is under-represented in the training data, and you will not see it unless you deliberately look. This checklist gives you a systematic review to run before deployment, covering the five places bias actually hides: the data, protected attributes and their proxies, the metrics, the feedback loops, and the documentation.

It is built to produce evidence, not just a green tick. The exported record becomes part of your model card or fairness report, demonstrating that you examined the right questions.

How a structured bias review works

A good review walks the model’s whole lifecycle rather than fixating on one fairness number:

Training-data representativeness: does the data reflect the population the model will serve, including the tails? Under-representation is a very common root cause of bias.
Protected attributes and proxies: you need group labels available to measure parity, even if you exclude them from the model. Watch for proxies that smuggle the attribute back in.
Metric choice: demographic parity, equalised odds and calibration are each a defensible definition of “fair”, yet they generally cannot all be satisfied at once. Pick the one that matches the harm you are trying to prevent, and justify it.
Feedback loops: will the model’s own outputs shape future training data? If so, small biases compound. Plan for monitoring drift after launch.
Documentation: record the protected groups considered, the metrics chosen, the thresholds, and the residual risk you accepted.
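The first step in the list, training-data representativeness, can be approximated by comparing each group's share of the training set with its share of the population the model will serve. The sketch below assumes you already have reference population shares. The age bands, the counts and the `min_ratio` cut-off of 0.8 are all invented for illustration, and the right cut-off depends on your context.

```python
def representation_gaps(train_counts, population_shares, min_ratio=0.8):
    """Compare training-set shares with reference population shares.

    A group is flagged when its training share falls below min_ratio times
    its population share. Assumes every population share is greater than zero.
    """
    total = sum(train_counts.values())
    report = {}
    for group, expected in population_shares.items():
        observed = train_counts.get(group, 0) / total
        ratio = observed / expected
        report[group] = {
            "train_share": observed,
            "population_share": expected,
            "ratio": ratio,
            "under_represented": ratio < min_ratio,
        }
    return report

# Invented example: older users are a fifth of the population but 5% of the data.
train_counts = {"18-34": 500, "35-64": 450, "65+": 50}
population_shares = {"18-34": 0.30, "35-64": 0.50, "65+": 0.20}

report = representation_gaps(train_counts, population_shares)
flagged = [group for group, row in report.items() if row["under_represented"]]
# flagged contains only "65+"
```

The resulting report is the kind of evidence the documentation step asks for: which groups were considered, what threshold was applied, and which gaps remain.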

Notes and tips

Always slice your metrics by subgroup; aggregate accuracy hides the failures that matter for fairness.
Removing a protected attribute is usually the wrong fix — keep it for measurement and address bias at the data, objective or threshold level.
For generative models, bias shows up as representational harm (stereotyped or skewed outputs) rather than classification error; test prompts across groups.
This checklist documents process, not proof. Treat residual bias as a risk you consciously accept and monitor, and pair it with the AI Risk Classifier for the regulatory angle.
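To make the first tip concrete, the sketch below slices accuracy by subgroup on a small invented dataset. `accuracy_by_group` is a hypothetical helper written for this example, and the group names and labels are made up.

```python
from collections import defaultdict

def accuracy_by_group(y_true, y_pred, groups):
    """Return overall accuracy and a dict of accuracy per subgroup."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for truth, pred, group in zip(y_true, y_pred, groups):
        totals[group] += 1
        hits[group] += int(truth == pred)
    overall = sum(hits.values()) / sum(totals.values())
    per_group = {group: hits[group] / totals[group] for group in totals}
    return overall, per_group

# Invented data: eight majority-group rows, two minority-group rows.
groups = ["majority"] * 8 + ["minority"] * 2
y_true = [1, 0, 1, 0, 1, 0, 1, 0, 1, 1]
y_pred = [1, 0, 1, 0, 1, 0, 1, 0, 1, 0]

overall, per_group = accuracy_by_group(y_true, y_pred, groups)
# overall is 0.9, but the minority group sits at 0.5
```

The aggregate figure looks healthy while the smaller group is served no better than a coin flip. With subgroups this small the per-group numbers are noisy, so treat them as a prompt to collect more data rather than a precise estimate.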