The core idea
Federated learning flips the usual machine learning setup. Instead of moving data to the model, it moves the model to the data. A central server holds a shared global model and sends copies to many participants: millions of phones, a handful of hospitals, or a group of banks. Each participant trains the model on its own local data and sends back only the updates (changes to the model’s weights), never the raw data. The server combines those updates into an improved global model and repeats. The raw data never leaves the device or institution that holds it.
The training protocol round by round
A federated learning run proceeds in rounds. First, the server selects a subset of available clients and sends them the current global model. Each selected client trains locally for a few steps on its own data, producing a model update: the difference between its locally trained weights and the global weights it received. The clients upload those updates, and the server aggregates them, most commonly with Federated Averaging (FedAvg), which takes a mean of the updates weighted by how much data each client trained on. The averaged result becomes the new global model, and the next round begins. Over many rounds the global model can approach the quality of a model trained on everyone’s pooled data, even though the server never sees that data. How close it gets depends in part on how similar the clients’ data distributions are.
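The round described above can be sketched in a few lines of plain Python. This is a minimal illustration, not a production implementation: the model is a toy linear fit y = w*x + b, and the names local_train, fed_avg and run_round, along with the learning rate, step count and client fraction, are assumptions made for this example.

```python
import random

def local_train(weights, data, lr=0.1, steps=5):
    """Hypothetical client step: a few gradient-descent steps on local (x, y) pairs."""
    w, b = weights
    for _ in range(steps):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / len(data)
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / len(data)
        w -= lr * grad_w
        b -= lr * grad_b
    # Only the weight delta and the example count leave the client, never the data.
    return [w - weights[0], b - weights[1]], len(data)

def fed_avg(updates):
    """Mean of client deltas, weighted by each client's number of examples."""
    total = sum(n for _, n in updates)
    dim = len(updates[0][0])
    return [sum(delta[i] * n for delta, n in updates) / total for i in range(dim)]

def run_round(global_weights, clients, rng, fraction=0.5):
    """One round: select clients, train locally, aggregate, apply the averaged delta."""
    k = max(1, int(len(clients) * fraction))
    selected = rng.sample(clients, k)
    updates = [local_train(global_weights, data) for data in selected]
    averaged = fed_avg(updates)
    return [g + d for g, d in zip(global_weights, averaged)]
```

Calling run_round repeatedly with the returned weights plays the part of the server loop. In this toy setting every client's data comes from the same underlying line, which is the easy case; real deployments usually face clients whose data differ from one another.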
Secure aggregation
Raw model updates can still reveal something about the underlying data, so federated systems often add secure aggregation. Using cryptographic masking, each client’s update is hidden under random masks that cancel out only when the cohort’s updates are summed together. The server can compute the aggregate but cannot inspect any single client’s contribution. This means even the central operator never sees an individual update, only the combined result of the whole cohort.
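The cancelling trick can be illustrated with pairwise masks. The sketch below assumes every pair of clients already shares a secret seed (real protocols derive these through key agreement and also handle clients that drop out, which is omitted here), that updates are encoded as integers, and that Python's random module stands in for a cryptographic generator, which it is not. The names pairwise_mask, mask_update and aggregate are hypothetical.

```python
import random

MOD = 2**32  # all arithmetic is modulo MOD so a masked value looks uniformly random

def pairwise_mask(seed, dim):
    """Expand a shared seed into a mask vector (stand-in for a cryptographic PRG)."""
    rng = random.Random(seed)
    return [rng.randrange(MOD) for _ in range(dim)]

def mask_update(client_id, update, pair_seeds):
    """Hide one client's integer update under the masks it shares with each peer.

    pair_seeds maps frozenset({i, j}) to the seed shared by clients i and j.
    """
    masked = list(update)
    for pair, seed in pair_seeds.items():
        if client_id not in pair:
            continue
        other = next(c for c in pair if c != client_id)
        mask = pairwise_mask(seed, len(update))
        # The lower-numbered client adds the mask and the other subtracts it,
        # so each pair's mask vanishes once both masked updates are summed.
        sign = 1 if client_id < other else -1
        masked = [(m + sign * r) % MOD for m, r in zip(masked, mask)]
    return masked

def aggregate(masked_updates):
    """Server side: sum the masked vectors; the masks cancel and the true sum remains."""
    dim = len(masked_updates[0])
    return [sum(u[i] for u in masked_updates) % MOD for i in range(dim)]
```

The server only ever handles the output of mask_update, which on its own looks like random numbers, yet aggregate recovers the exact sum of the original updates as long as every client in the cohort reports.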
Adding differential privacy
For stronger, mathematically provable guarantees, federated learning is combined with differential privacy. Each update is first clipped to a bounded size, and carefully calibrated noise is then added, either by each client to its own update or by the server to the aggregate, so that the final model changes very little whether or not any one individual took part. This bounds how much any one person’s data can influence the result, which limits attempts to reconstruct training examples or test whether someone took part.
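A minimal sketch of the server-side variant follows, assuming the Gaussian mechanism: each update is clipped to an L2 norm of at most clip_norm, the clipped updates are summed, and noise with standard deviation noise_multiplier * clip_norm is added before averaging. The function names and parameters are assumptions for illustration, and the sketch does not compute the resulting privacy budget, which needs a separate accounting step.

```python
import math
import random

def clip_update(update, clip_norm):
    """Scale the update down so its L2 norm is at most clip_norm."""
    norm = math.sqrt(sum(v * v for v in update))
    scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
    return [v * scale for v in update]

def dp_aggregate(updates, clip_norm, noise_multiplier, rng):
    """Clip each update, sum, add Gaussian noise to the sum, then average."""
    clipped = [clip_update(u, clip_norm) for u in updates]
    dim = len(clipped[0])
    total = [sum(u[i] for u in clipped) for i in range(dim)]
    # Clipping caps any one client's contribution to the sum at clip_norm,
    # so the noise scale is set relative to that cap.
    sigma = noise_multiplier * clip_norm
    noisy = [t + rng.gauss(0.0, sigma) for t in total]
    return [v / len(updates) for v in noisy]
```

Clipping is what makes the noise meaningful: without a cap on each contribution, no fixed amount of noise could hide a single very large update. A larger noise_multiplier gives stronger privacy at the cost of a noisier model.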
Where it is used
The best-known deployment is mobile keyboards: predictive text and next-word suggestions improve across millions of phones without typed messages ever being uploaded. In healthcare, hospitals collaborate to train diagnostic and imaging models across their patient populations while keeping records inside each institution, which helps them meet privacy regulations. Banks use it for fraud detection across branches or institutions. The common thread is data that is valuable in aggregate but sensitive, regulated, or simply too large to centralise, which is exactly where federated learning earns its place.