MLOps (Machine Learning Operations) is the practice of applying software engineering and DevOps principles (automation, CI/CD, monitoring, and reproducibility) to the full lifecycle of machine learning systems in production, closing the gap between model development and reliable model operation.
Why this appears in interviews
MLOps is both a philosophy and a set of practices, and the term is frequently misused. Interviewers ask about it to see whether you understand the operational challenges of production ML, not just the tooling, and whether you can articulate why the ML development process breaks down without it.
The mental model — why ML is harder than software to operate
The model has three inputs, not one. A software system's behaviour is determined by its code. An ML model's behaviour is determined by its code, its training data, and its hyperparameters. Change any one of these and the model changes — even if you did not touch the deployment.
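One way to make the three inputs concrete is to derive a model identifier from all of them, so that a change to any one produces a new version. The sketch below is illustrative only: `model_fingerprint` and its arguments are hypothetical names, and it assumes the code version is a commit string and the training data is available as bytes.

```python
import hashlib
import json


def model_fingerprint(code_version: str, data: bytes, hyperparams: dict) -> str:
    """Return a short ID that changes when code, data, or hyperparameters change."""
    h = hashlib.sha256()
    h.update(code_version.encode("utf-8"))
    h.update(b"\x00")  # separator so adjacent fields cannot run together
    h.update(hashlib.sha256(data).digest())
    h.update(b"\x00")
    # sort_keys makes the ID independent of dictionary ordering
    h.update(json.dumps(hyperparams, sort_keys=True).encode("utf-8"))
    return h.hexdigest()[:12]


# Hypothetical inputs: same code throughout, with data or hyperparameters varied.
base = model_fingerprint("abc123", b"rows-v1", {"lr": 0.01, "depth": 6})
new_data = model_fingerprint("abc123", b"rows-v2", {"lr": 0.01, "depth": 6})
new_params = model_fingerprint("abc123", b"rows-v1", {"lr": 0.02, "depth": 6})

print(base != new_data)    # True: new data means a different model
print(base != new_params)  # True: new hyperparameters mean a different model
```

In an interview, the point to draw from this is that versioning only the code, as ordinary software does, leaves two of the three inputs untracked.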
"Testing" a model is probabilistic, not deterministic. You cannot exhaustively test an ML model the way you can test software. You sample from a distribution of inputs and measure aggregate performance. This means bugs can slip through that deterministic tests would catch.
Production is part of the system. In traditional software, production is where you deploy your finished product. In ML, production is where you collect the data that trains your next model. The boundary between development and production is blurry.
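The second point above, that model testing measures aggregate performance on a sample, can be illustrated with a small example in which the overall metric clears the bar while one slice of the data does not. The records, slice names, and the 0.9 threshold are all made up for illustration.

```python
def accuracy(pairs):
    """Fraction of (prediction, label) pairs that match."""
    return sum(p == y for p, y in pairs) / len(pairs)


# Hypothetical evaluation records: (slice name, prediction, label).
records = (
    [("desktop", 1, 1)] * 90   # 90 correct desktop predictions
    + [("mobile", 1, 1)] * 4   # 4 correct mobile predictions
    + [("mobile", 0, 1)] * 6   # 6 wrong mobile predictions
)

overall = accuracy([(p, y) for _, p, y in records])

# Group the same records by slice and score each slice separately.
by_slice = {}
for name, p, y in records:
    by_slice.setdefault(name, []).append((p, y))
slice_acc = {name: accuracy(pairs) for name, pairs in by_slice.items()}

THRESHOLD = 0.9  # assumed quality bar
print(overall >= THRESHOLD)              # True: the aggregate check passes
print(slice_acc["mobile"] >= THRESHOLD)  # False: the mobile slice fails
```

Here the aggregate accuracy is 0.94 while mobile accuracy is 0.4, which is the kind of failure a single summary metric hides.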
The maturity model — three levels
Level 0 — Manual ML: Data scientists train models in notebooks and export weights, and a DevOps engineer deploys them manually. No automated retraining. No monitoring. Most ML in industry is at this level.
Level 1 — ML Pipeline Automation: The training pipeline is automated — new data triggers a training run, the trained model is automatically evaluated, and if it passes quality gates it is promoted to staging. Human approval is still required for production deployment.
Level 2 — CI/CD for ML: Training, evaluation, and deployment are all automated in a CI/CD pipeline. Model performance monitoring triggers automated retraining. New model versions are deployed using canary releases with automated rollback. Engineers are alerted when intervention is needed but the system manages itself day-to-day.
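The two automation mechanisms named in Levels 1 and 2, quality gates and canary releases with automated rollback, can be sketched in a few lines. This is a simplified illustration, not a production design: the function names, the accuracy metric, the thresholds, and the `error_rate_at` callback are all assumptions made for the example.

```python
def passes_quality_gates(candidate, production,
                         min_accuracy=0.85, max_regression=0.01):
    """Level 1 gate: the candidate must clear an absolute bar and must not
    regress against the current production model by more than max_regression."""
    if candidate["accuracy"] < min_accuracy:
        return False
    if candidate["accuracy"] < production["accuracy"] - max_regression:
        return False
    return True


def canary_rollout(error_rate_at, baseline_error,
                   steps=(0.05, 0.25, 1.0), tolerance=0.02):
    """Level 2 release: shift traffic to the new model step by step.

    error_rate_at(fraction) is a hypothetical callback that returns the new
    model's observed error rate while it serves that fraction of traffic.
    Returns (final_traffic_fraction, status).
    """
    for fraction in steps:
        observed = error_rate_at(fraction)
        if observed > baseline_error + tolerance:
            # Automated rollback: send all traffic back to the old model.
            return 0.0, f"rolled back at {fraction:.0%}"
    return steps[-1], "fully deployed"


gate_ok = passes_quality_gates({"accuracy": 0.91}, {"accuracy": 0.90})
print(gate_ok)  # True

healthy = canary_rollout(lambda fraction: 0.03, baseline_error=0.03)
degraded = canary_rollout(lambda fraction: 0.03 + 0.1 * fraction,
                          baseline_error=0.03)
print(healthy)   # (1.0, 'fully deployed')
print(degraded)  # (0.0, 'rolled back at 25%')
```

The difference between the levels shows up in who acts on these results: at Level 1 a human still approves the production step after the gate passes, while at Level 2 the rollout and the rollback both happen without manual intervention.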
What MLOps actually looks like in practice
- A model registry: versioned storage of trained models with metadata, enabling instant rollback to any previous model version
- Automated training pipelines: a data change triggers preprocessing, training, evaluation, and promotion
- Feature stores: a shared system that computes features consistently between training and serving (prevents training-serving skew)
- Monitoring and alerting: automated checks on data drift, prediction distribution shifts, system performance, and business metrics
- Reproducibility: every model in the registry can be re-trained from scratch by re-running the training pipeline with the logged data version, code version, and hyperparameters
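To show how the registry and reproducibility items above fit together, here is a minimal in-memory sketch. It is not the API of MLflow or any other registry product; `ModelVersion`, `ModelRegistry`, the field names, and the artifact paths are hypothetical. The idea it illustrates is that each version records the data version, code version, and hyperparameters needed to retrain it, and that rollback is just moving a pointer.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class ModelVersion:
    version: int
    artifact_uri: str   # where the weights live (hypothetical path)
    data_version: str   # logged so the model can be retrained
    code_version: str
    hyperparams: dict
    metrics: dict


@dataclass
class ModelRegistry:
    versions: list = field(default_factory=list)
    production: Optional[int] = None  # version number currently serving

    def register(self, **metadata) -> ModelVersion:
        mv = ModelVersion(version=len(self.versions) + 1, **metadata)
        self.versions.append(mv)
        return mv

    def promote(self, version: int) -> None:
        if not 1 <= version <= len(self.versions):
            raise ValueError(f"unknown version {version}")
        self.production = version

    def rollback(self) -> None:
        """Point production at the previously registered version."""
        if self.production is None or self.production <= 1:
            raise RuntimeError("no earlier version to roll back to")
        self.production -= 1


registry = ModelRegistry()
for i, acc in enumerate([0.88, 0.91], start=1):
    registry.register(
        artifact_uri=f"models/churn/v{i}",
        data_version=f"snapshot-{i}",
        code_version="abc123",
        hyperparams={"lr": 0.01},
        metrics={"accuracy": acc},
    )

registry.promote(2)
registry.rollback()
print(registry.production)  # 1
```

Because the stored metadata names the exact data snapshot and code commit, re-running the training pipeline with those values is what makes a registered version reproducible.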
The MLOps tool landscape
Experiment tracking: MLflow, Weights & Biases — track training runs, hyperparameters, metrics
Pipeline orchestration: Kubeflow Pipelines, Metaflow, Vertex AI Pipelines — automate the training workflow
Model registry: MLflow Model Registry, SageMaker Model Registry — version and serve models
Feature stores: Feast, Tecton, Vertex AI Feature Store — consistent feature computation
Monitoring: Evidently, Arize AI, WhyLabs — track data and model quality in production
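As a rough idea of what a data drift check computes, the sketch below implements the Population Stability Index (PSI) for a single numeric feature in plain Python. It does not use or reproduce the API of any tool listed above; the `psi` function, the equal-width binning, and the 0.2 alert threshold (a commonly cited rule of thumb, not a standard) are assumptions for illustration.

```python
import math


def psi(expected, actual, bins=10, eps=1e-6):
    """Population Stability Index between a training sample (expected)
    and a production sample (actual) of one numeric feature."""
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / bins or 1.0  # avoid zero width for constant features

    def proportions(values):
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            # Clamp values outside the training range into the edge bins.
            counts[min(max(idx, 0), bins - 1)] += 1
        # eps keeps empty bins from producing log(0) or division by zero.
        return [max(c / len(values), eps) for c in counts]

    e, a = proportions(expected), proportions(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))


training = list(range(100))              # hypothetical training feature values
shifted = [v + 50 for v in training]     # production values after a shift

ALERT_THRESHOLD = 0.2  # assumed alerting threshold
print(psi(training, training))                  # 0.0
print(psi(training, shifted) > ALERT_THRESHOLD) # True: drift alert fires
```

A monitoring system would typically run a check like this per feature on a schedule and alert, or trigger retraining, when the score crosses the threshold.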
Common interview mistakes
Mistake 1: Listing MLOps tools without explaining the problem they solve. Always ground your answer in the underlying operational challenge.
Mistake 2: Treating MLOps as only about deployment. MLOps covers the entire lifecycle, from data collection through training and deployment to monitoring and retraining.
Mistake 3: Not knowing what a model registry is. The model registry is the most fundamental MLOps concept.
Key vocabulary
Model registry — A versioned store of trained models with metadata, enabling rollback, auditing, and deployment management.
Reproducibility — The ability to recreate any past model exactly by re-running the training pipeline with the same inputs.
Training pipeline — An automated sequence of steps (data preprocessing, model training, evaluation, promotion) triggered by new data or a schedule.
ML maturity level — A framework (Level 0-2) describing how automated and robust an organisation's ML operations are.