What AI bias actually means
AI bias is a systematic, unfair difference in how a machine learning system treats people or groups. A model is biased when its predictions are consistently less accurate, less favourable, or more harmful for some populations than for others, not by random chance but in a repeatable pattern. Crucially, the bias usually does not come from malice or a coding mistake. It comes from the model faithfully learning patterns that were already present in its training data or in the way the task was set up.
The main types of bias
Researchers distinguish several sources. Historical bias exists when the data reflects past inequality even if it was collected perfectly: for example, lending data from decades when certain neighbourhoods were systematically denied credit. Representation bias arises when some groups are under-sampled, so the model has too few examples to learn them well. Measurement bias happens when the features or labels are poor proxies for what they are meant to capture: using “arrests” as a stand-in for “crime”, for instance, imports policing disparities. Aggregation bias appears when one model is forced to serve groups that genuinely behave differently, so it fits the majority and fails the rest. Each enters at a different stage of the pipeline: historical bias before any data is collected, representation bias during sampling, measurement bias when features and labels are chosen, and aggregation bias when the model is designed.
Real-world examples
Bias is not theoretical. Automated hiring tools have down-ranked CVs containing words associated with women because past hiring favoured men. Lending and credit models have offered worse terms to applicants from certain postcodes that correlate with race. In healthcare, a widely used risk algorithm relied on past spending as a proxy for need; because less money was historically spent on Black patients, the model underestimated how ill they were. In each case the system was technically “accurate” on its training objective while producing unfair outcomes in the real world.
How bias is measured
Because a single accuracy figure can hide large gaps, teams use disaggregated evaluation: they split test results by sensitive attributes and compare metrics like false-positive rate, false-negative rate, and selection rate across groups. Common fairness criteria include demographic parity (equal positive rates), equal opportunity (equal true-positive rates), and calibration (a given score means the same thing in every group). These criteria can conflict: when the underlying rate of the outcome differs between groups, it is typically impossible to satisfy all of them at once. Measuring bias therefore means first deciding which harms matter for the specific decision being automated.
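As a rough illustration of disaggregated evaluation, the sketch below computes per-group selection rate, true-positive rate, false-positive rate, and false-negative rate for a binary classifier, then reports the gap between groups for two of the criteria named above. The function names (rates_by_group, gap) and the toy data are invented for this example and do not come from any particular library. It assumes binary labels, hard 0/1 predictions, and a single sensitive attribute.

```python
from collections import defaultdict


def rates_by_group(y_true, y_pred, groups):
    """Return per-group selection rate, TPR, FPR and FNR for binary labels."""
    counts = defaultdict(lambda: {"tp": 0, "fp": 0, "tn": 0, "fn": 0})
    for truth, pred, group in zip(y_true, y_pred, groups):
        if pred == 1:
            key = "tp" if truth == 1 else "fp"
        else:
            key = "fn" if truth == 1 else "tn"
        counts[group][key] += 1

    def ratio(num, den):
        # A rate is undefined when the group has no examples of that kind.
        return num / den if den else float("nan")

    report = {}
    for group, c in counts.items():
        total = sum(c.values())
        positives = c["tp"] + c["fn"]
        negatives = c["fp"] + c["tn"]
        report[group] = {
            "selection_rate": ratio(c["tp"] + c["fp"], total),
            "tpr": ratio(c["tp"], positives),
            "fpr": ratio(c["fp"], negatives),
            "fnr": ratio(c["fn"], positives),
        }
    return report


def gap(report, metric):
    """Largest minus smallest value of one metric across groups."""
    values = [r[metric] for r in report.values()]
    return max(values) - min(values)


# Invented toy data: eight test examples split across two groups.
y_true = [1, 1, 0, 0, 1, 1, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]

report = rates_by_group(y_true, y_pred, groups)
for group, metrics in report.items():
    print(group, metrics)
print("demographic parity gap:", gap(report, "selection_rate"))  # 0.5
print("equal opportunity gap:", gap(report, "tpr"))              # 0.5
```

On this toy data both groups contain the same share of true positives, yet group "a" is selected three times as often as group "b" and has twice the true-positive rate. An overall accuracy of 75% would not reveal either gap.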
How to reduce it
Mitigation spans the whole lifecycle. Before training, teams rebalance or augment data, audit labels, and remove or carefully handle sensitive features. During training, they can add fairness constraints or reweight examples. After training, they adjust decision thresholds per group or otherwise post-process outputs. Just as important are process safeguards: documenting datasets and models, running independent audits, keeping humans in the loop for high-stakes decisions, and monitoring live systems, since bias can drift as the world changes. There is no one-click fix, only deliberate, measured, ongoing work.
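Two of the techniques above are small enough to sketch directly. The first function implements one common way to reweight examples, often called reweighing: each example gets the weight P(group) × P(label) / P(group, label), so that group and label look statistically independent in the weighted data. The second applies a separate decision threshold to each group. The function names, data, and threshold values are all invented for illustration. In practice the weights would be passed to a training routine that accepts per-example weights, and the thresholds would be chosen on held-out data against whichever fairness criterion has been selected.

```python
from collections import Counter


def reweighing_weights(groups, labels):
    """Weight each example by P(group) * P(label) / P(group, label)."""
    n = len(labels)
    group_counts = Counter(groups)
    label_counts = Counter(labels)
    joint_counts = Counter(zip(groups, labels))
    return [
        (group_counts[g] * label_counts[y]) / (n * joint_counts[(g, y)])
        for g, y in zip(groups, labels)
    ]


def apply_group_thresholds(scores, groups, thresholds):
    """Turn scores into 0/1 decisions using a per-group cut-off."""
    return [int(s >= thresholds[g]) for s, g in zip(scores, groups)]


# Invented training data: group "a" has mostly positive labels,
# group "b" mostly negative ones.
train_groups = ["a", "a", "a", "a", "b", "b", "b", "b"]
train_labels = [1, 1, 1, 0, 1, 0, 0, 0]
weights = reweighing_weights(train_groups, train_labels)


def weighted_positive_rate(group):
    """Share of a group's total weight that sits on positive labels."""
    rows = [(w, y) for w, g, y in zip(weights, train_groups, train_labels)
            if g == group]
    return sum(w for w, y in rows if y == 1) / sum(w for w, _ in rows)


print(weighted_positive_rate("a"), weighted_positive_rate("b"))

# Invented scores and thresholds for the post-processing step.
scores = [0.9, 0.6, 0.4, 0.7, 0.5, 0.2]
score_groups = ["a", "a", "a", "b", "b", "b"]
decisions = apply_group_thresholds(scores, score_groups, {"a": 0.65, "b": 0.45})
print(decisions)  # [1, 0, 0, 1, 1, 0]
```

After reweighing, the weighted share of positive labels is the same in both groups, even though the raw shares are 75% and 25%. Rare combinations, such as a negative label in group "a", receive larger weights and common ones smaller weights, while the total weight stays equal to the number of examples.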