What Is Model Drift

Model drift is the degradation of a deployed model's performance over time because the real world has changed since the model was trained. Detecting it before users notice is one of the most important ongoing responsibilities of an ML engineer.

Why this appears in interviews

Model drift is the reason ML engineering is not a one-time job. Every deployed model will eventually drift. Interviewers ask about it to test whether you understand the fundamental nature of the problem — that a model trained on yesterday's world is not guaranteed to work on tomorrow's — and whether you can reason about the different ways drift manifests and how to catch it.

The mental model — a map of a city that keeps changing

Imagine you have a map of a city drawn in 2022. In 2022, the map is accurate. By 2026, new roads have been built, old ones renamed, entire neighbourhoods redeveloped. The map is still internally consistent and looks correct, but it no longer reflects the real city. A trained ML model is like this map — a compressed mathematical representation of patterns that existed in the world when you collected training data. As the world changes, the model's representation becomes less accurate. The model has not changed. The world has.

The three types of model drift

Data drift (input distribution shift): The statistical properties of the model's input features change over time. A fraud detection model trained on 2022 transactions starts seeing 2026 transaction patterns — different merchants, currencies, transaction sizes. The inputs have changed; the model has not adapted.

Concept drift: The relationship between the inputs and the correct output changes. The feature values look similar to training data, but what those values mean for the prediction has changed. Example: a spam filter in 2023 was trained to classify emails with "NFT" in the subject as spam. By 2026, "NFT" appears in legitimate business emails. The feature is the same; its predictive meaning has changed.
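
The spam example can be made concrete with a toy sketch. Everything here is hypothetical: the rule `frozen_spam_rule` stands in for a trained model, and the two tiny email sets are invented for illustration. The point it tries to show is that the share of emails containing "NFT" is the same in both years, so the input looks unchanged, while accuracy falls.

```python
# Toy illustration of concept drift: the rule is frozen, the feature rate
# stays the same, and accuracy still falls. All data below is made up.

def frozen_spam_rule(subject: str) -> bool:
    """Stand-in for a model trained in 2023: flag any subject mentioning NFT."""
    return "nft" in subject.lower()

def feature_rate(examples):
    """Share of emails whose subject triggers the rule (an input statistic)."""
    return sum(frozen_spam_rule(subject) for subject, _ in examples) / len(examples)

def accuracy(examples):
    """Share of (subject, is_spam) pairs the frozen rule classifies correctly."""
    correct = sum(frozen_spam_rule(subject) == is_spam for subject, is_spam in examples)
    return correct / len(examples)

emails_2023 = [
    ("Free NFT drop, claim now", True),
    ("NFT giveaway inside", True),
    ("Quarterly report attached", False),
    ("Lunch on Friday?", False),
]
emails_2026 = [
    ("NFT licensing contract for review", False),
    ("Invoice: NFT ticketing platform", False),
    ("Quarterly report attached", False),
    ("You won a prize, click here", True),
]

for year, emails in (("2023", emails_2023), ("2026", emails_2026)):
    print(year, "feature rate:", feature_rate(emails), "accuracy:", accuracy(emails))
```

An input-only monitor would see a constant feature rate of 0.5 in both years. Only the comparison against true labels reveals the change.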

Label shift (prior probability shift): The distribution of the target variable changes. The way each class looks in the features is unchanged, but the base rate of what the model predicts has shifted, so predicted probabilities and any decision thresholds tuned at the old base rate are now miscalibrated. Example: a disease detection model trained when a disease has 1% prevalence is deployed into a population where prevalence has risen to 8% due to an outbreak.
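
One standard response to label shift is to rescale the model's predicted probabilities to the new base rate using Bayes' rule. The sketch below is a minimal version under two assumptions that the text does not state: the model outputs calibrated probabilities, and the distribution of features within each class has not changed. The function name `adjust_for_new_prevalence` is illustrative.

```python
def adjust_for_new_prevalence(p: float, train_prev: float, new_prev: float) -> float:
    """Rescale a calibrated probability from the training base rate to a new one.

    Assumes pure label shift: P(features | class) is unchanged and only the
    class prior P(class) has moved from train_prev to new_prev.
    """
    # Reweight the positive and negative evidence by the ratio of new to old priors.
    pos = p * new_prev / train_prev
    neg = (1.0 - p) * (1.0 - new_prev) / (1.0 - train_prev)
    return pos / (pos + neg)

# The disease example: trained at 1% prevalence, deployed at 8%.
# A patient scored at 0.10 by the original model is rescored at roughly 0.49.
print(round(adjust_for_new_prevalence(0.10, train_prev=0.01, new_prev=0.08), 3))
```

The correction is only as good as the estimate of the new prevalence, which in practice has to come from somewhere outside the model (for example, recent confirmed cases).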

Why concept drift is the hardest to detect

Data drift can be detected without ground truth — you can measure whether feature distributions are changing just by looking at incoming data.
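
A minimal sketch of such a label-free check is the Population Stability Index (PSI), which compares the binned distribution of one feature at training time with its distribution in recent traffic. PSI is one common choice rather than the only one, and the binning scheme, the `eps` smoothing, and the synthetic "transaction amount" data below are all assumptions made for illustration.

```python
import math
import random

def population_stability_index(reference, current, n_bins=10, eps=1e-6):
    """PSI between two samples of one numeric feature.

    Bins are equal-width over the reference range; values outside that
    range are counted in the first or last bin.
    """
    lo, hi = min(reference), max(reference)
    width = (hi - lo) / n_bins or 1.0  # avoid zero width for constant features

    def bin_shares(values):
        counts = [0] * n_bins
        for v in values:
            idx = int((v - lo) / width)
            counts[min(max(idx, 0), n_bins - 1)] += 1
        # eps keeps empty bins from producing log(0) or division by zero
        return [max(c / len(values), eps) for c in counts]

    ref, cur = bin_shares(reference), bin_shares(current)
    return sum((c - r) * math.log(c / r) for r, c in zip(ref, cur))

random.seed(0)
training_amounts = [random.gauss(50, 10) for _ in range(5000)]
stable_week = [random.gauss(50, 10) for _ in range(5000)]
shifted_week = [random.gauss(65, 10) for _ in range(5000)]

print("stable :", population_stability_index(training_amounts, stable_week))
print("shifted:", population_stability_index(training_amounts, shifted_week))
```

A widely quoted rule of thumb treats PSI below 0.1 as stable and above 0.25 as a significant shift. Those cutoffs are conventions, not guarantees, and are worth tuning per feature.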

Concept drift requires ground truth to detect: the inputs still look like the training data, so the only way to see that predictions have gone wrong is to compare them with what actually happened. And ground truth often arrives late. For a fraud detection model, you may not know a transaction was fraudulent until weeks after it occurred.

This latency between prediction and ground truth is one of the most challenging properties of production ML systems.
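
In practice this means any accuracy figure describes the past, not the present. The sketch below, with a hypothetical prediction log and label store held as plain dictionaries, scores only the predictions whose outcome is already known and reports how much of the log that covers and how recent the newest scored prediction is.

```python
from datetime import date

def accuracy_on_labelled(predictions, labels):
    """Score only predictions whose true outcome has arrived.

    predictions: {id: (made_on, predicted_label)}
    labels:      {id: true_label} for outcomes known so far
    Returns (accuracy, label coverage, date of the newest scored prediction).
    """
    matched = [
        (made_on, predicted == labels[pred_id])
        for pred_id, (made_on, predicted) in predictions.items()
        if pred_id in labels
    ]
    if not matched:
        return None, 0.0, None
    accuracy = sum(ok for _, ok in matched) / len(matched)
    coverage = len(matched) / len(predictions)
    newest_scored = max(made_on for made_on, _ in matched)
    return accuracy, coverage, newest_scored

# Hypothetical fraud predictions; the outcome for t4 is not yet known.
predictions = {
    "t1": (date(2026, 1, 5), False),
    "t2": (date(2026, 1, 6), True),
    "t3": (date(2026, 1, 20), False),
    "t4": (date(2026, 2, 1), False),
}
labels = {"t1": False, "t2": True, "t3": True}

accuracy, coverage, newest_scored = accuracy_on_labelled(predictions, labels)
print(accuracy, coverage, newest_scored)
```

Reporting coverage and the newest scored date alongside accuracy makes the blind spot explicit: here the metric says nothing about anything predicted after 20 January.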

The cost of undetected drift

Undetected model drift is insidious because it causes systems to fail gradually and silently. A decline of 2 percentage points in accuracy per month may be hard to notice in any single week. After six months, accuracy has dropped 12 points, and because the decline was gradual, there is no clear "this is when it broke" moment. For business-critical ML systems such as credit scoring, fraud detection, and personalised pricing, silent drift can cause significant financial or reputational damage before it is discovered.
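
The six-month figure depends on how the monthly 2% is read: as percentage points lost each month, or as a relative loss that compounds. A short worked example, assuming a hypothetical starting accuracy of 0.90, shows both readings.

```python
def accuracy_after(start: float, monthly_drop: float, months: int, relative: bool = False) -> float:
    """Accuracy after a steady monthly decline.

    relative=False: lose `monthly_drop` as absolute points each month.
    relative=True:  lose `monthly_drop` as a fraction of the current level.
    """
    if relative:
        return start * (1.0 - monthly_drop) ** months
    return start - monthly_drop * months

start = 0.90  # hypothetical starting accuracy, not taken from the text

absolute = accuracy_after(start, 0.02, 6)                    # 12 points lost
compounded = accuracy_after(start, 0.02, 6, relative=True)   # about 11.4% relative loss

print(f"absolute points: {absolute:.3f}")
print(f"compounding    : {compounded:.3f}")
```

Either way the monthly step is small compared with normal week-to-week noise, which is why a trend over a longer window is usually more informative than any single reading.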

Common interview mistakes

Mistake 1: Conflating data drift and concept drift. They require different detection strategies and different responses. Data drift can be detected by monitoring feature distributions alone. Concept drift requires monitoring prediction quality, which requires delayed ground truth.

Mistake 2: Treating drift as a one-time deployment check rather than an ongoing concern. "We tested the model thoroughly before deploying it" says nothing about performance six months later.

Mistake 3: Not thinking about drift during model design. The right time to design a drift detection strategy is before deployment, not after the first drift incident.

Key vocabulary

Data drift — Change in the statistical distribution of model inputs between training and serving time.

Concept drift — Change in the relationship between model inputs and the correct output over time.

Label shift — Change in the distribution of the target variable over time.

Ground truth latency — The delay between a model making a prediction and the true outcome being known. Critical for designing drift detection for time-delayed outcomes.
