This guide uses everyday analogies to explain Machine Learning concepts and the MLOps ecosystem tools: Kubeflow, MLflow, and KServe.
Imagine you want to create a restaurant chain that serves the perfect dish to every customer. To achieve this, you need:
- A culinary school to train chefs (Kubeflow)
- A professional registry to catalog and version chefs (MLflow)
- Industrial kitchens where chefs work (KServe)
┌─────────────────────────────────────────────────────────────────────────────┐
│ REAL WORLD vs MACHINE LEARNING │
├─────────────────────────────────────────────────────────────────────────────┤
│ │
│ RESTAURANT CHAIN MLOPS │
│ ──────────────── ───── │
│ Culinary school = Kubeflow (orchestrated training) │
│ Professional chef registry = MLflow (model registry) │
│ Industrial kitchen = KServe (model serving) │
│ │
│ INSIDE EACH RESTAURANT MACHINE LEARNING │
│ ────────────────────── ──────────────── │
│ Apprentice chef = Algorithm (code) │
│ Recipes and practice = Training data │
│ Experienced chef = Trained model │
│ Cooking a dish = Making an inference │
│ Customer order = Input (input data) │
│ Served dish = Output (prediction) │
│ │
└─────────────────────────────────────────────────────────────────────────────┘
TRAINING (happens once, before "opening the restaurant")
─────────────────────────────────────────────────────────
┌──────────────┐ ┌──────────────┐ ┌──────────────┐
│ RECIPES │ │ PRACTICE │ │ EXPERIENCED │
│ (Data) │ ───▶│ (Training) │ ───▶│ CHEF │
│ │ │ │ │ (Model) │
└──────────────┘ └──────────────┘ └──────────────┘
Real example:
- Data: 1000 labeled photos of cats and dogs
- Training: Algorithm learns patterns (ears, snouts, etc.)
- Model: File that "knows" how to distinguish cats from dogs
INFERENCE (happens every time a customer places an order)
────────────────────────────────────────────────────────
┌──────────────┐ ┌──────────────┐ ┌──────────────┐
│ ORDER │ │ CHEF │ │ DISH │
│ (Input) │ ───▶│ KITCHEN │ ───▶│ (Output) │
│ │ │ (Model) │ │ │
└──────────────┘ └──────────────┘ └──────────────┘
Real example:
- Input: New photo of an animal
- Model: Analyzes the photo using what it learned
- Output: "It's a cat" (with 95% confidence)
| Aspect | Training | Inference |
|---|---|---|
| Analogy | Culinary school | Running restaurant |
| Frequency | Once (or periodically) | Thousands of times per day |
| Resources | Heavy CPU/GPU, hours/days | Light, milliseconds |
| Where it happens | Notebooks, training clusters | Production servers (KServe!) |
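In code, the two phases look like this. A minimal sketch using scikit-learn's bundled iris dataset (hyperparameters chosen purely for illustration):

from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

# TRAINING - happens once, offline ("culinary school")
X, y = load_iris(return_X_y=True)
model = RandomForestClassifier(n_estimators=50, max_depth=4)
model.fit(X, y)

# INFERENCE - happens per request, online ("running restaurant")
print(model.predict([[5.1, 3.5, 1.4, 0.2]]))  # e.g. [0] -> Iris Setosa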
┌─────────────────────────────────────────────────────────────────────────────┐
│ │
│ ML MODEL JOURNEY │
│ │
│ ┌───────────────┐ ┌───────────────┐ ┌───────────────┐ │
│ │ KUBEFLOW │ │ MLFLOW │ │ KSERVE │ │
│ │ │ │ │ │ │ │
│ │ "Culinary │ ───▶│ "Professional │ ───▶│ "Industrial │ │
│ │ School" │ │ Registry" │ │ Kitchen" │ │
│ │ │ │ │ │ │ │
│ │ Trains the │ │ Catalogs and │ │ Puts chef │ │
│ │ chef │ │ versions │ │ to work │ │
│ │ │ │ the chef │ │ │ │
│ └───────────────┘ └───────────────┘ └───────────────┘ │
│ │
│ DEVELOPMENT MANAGEMENT/GOVERNANCE PRODUCTION │
│ │
└─────────────────────────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────────────────────┐
│ │
│ YOU WANT: A restaurant chain where each location serves perfect dishes, │
│ with certified chefs and standardized kitchens. │
│ │
│ ═══════════════════════════════════════════════════════════════════════ │
│ │
│ KUBEFLOW = CULINARY SCHOOL │
│ ────────────────────────── │
│ • Where chefs are trained │
│ • Has structured curriculum (Pipelines) │
│ • Laboratories for experiments (Notebooks) │
│ • Classrooms with heavy equipment (GPUs for training) │
│ • Can train multiple chefs simultaneously (distributed jobs) │
│ │
│ ═══════════════════════════════════════════════════════════════════════ │
│ │
│ MLFLOW = PROFESSIONAL CHEF REGISTRY │
│ ─────────────────────────────────── │
│ • History of all courses taken (experiment tracking) │
│ • Certificates and diplomas (model artifacts) │
│ • Versions: Junior Chef, Mid Chef, Senior Chef (model versions) │
│ • Who's working vs retired (staging, production, archived) │
│ • Compare performance between chefs (metrics comparison) │
│ │
│ ═══════════════════════════════════════════════════════════════════════ │
│ │
│ KSERVE = INDUSTRIAL KITCHEN │
│ ─────────────────────────── │
│ • Where the certified chef goes to work │
│ • Kitchen ready to use, just bring the chef │
│ • Serves thousands of orders per day │
│ • Opens more kitchens if needed (auto-scaling) │
│ • Closes at night if no customers (scale-to-zero) │
│ │
└─────────────────────────────────────────────────────────────────────────────┘
Kubeflow is an ML platform for Kubernetes that orchestrates the entire model development lifecycle. It's like a complete culinary school.
┌─────────────────────────────────────────────────────────────────────────────┐
│ KUBEFLOW = CULINARY SCHOOL │
├─────────────────────────────────────────────────────────────────────────────┤
│ │
│ KUBEFLOW COMPONENT ANALOGY │
│ ────────────────── ─────── │
│ │
│ Kubeflow Notebooks = Experimental laboratory │
│ (where you test new recipes) │
│ │
│ Kubeflow Pipelines = Structured curriculum │
│ (sequence of classes until graduation) │
│ │
│ Training Operators = Specialized classrooms │
│ (TFJob, PyTorchJob) (with specific equipment) │
│ │
│ Katib = Optimization program │
│ (hyperparameter tuning) (finding the best teaching technique) │
│ │
│ KServe (integrated) = Internship program │
│ (putting graduates to work) │
│ │
└─────────────────────────────────────────────────────────────────────────────┘
A Pipeline is a sequence of steps that transform an apprentice into a chef:
PIPELINE = TRAINING CURRICULUM
──────────────────────────────
┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐
│ Fetch │ │ Prepare │ │ Train │ │ Evaluate │ │ Register │
│ Data │──▶│ Data │──▶│ Model │──▶│ Model │──▶│in MLflow │
│ │ │ │ │ │ │ │ │ │
│"Buy │ │"Organize │ │"Intensive│ │"Final │ │"Issue │
│ingredi- │ │ pantry" │ │ course" │ │ exam" │ │ diploma" │
│ ents" │ │ │ │ │ │ │ │ │
└──────────┘ └──────────┘ └──────────┘ └──────────┘ └──────────┘
Each box is a pipeline "component" that runs in a separate container.
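The components themselves (fetch_data, prepare_data, and so on) are not shown in full here. As a rough sketch, one of them might look like this with the KFP v2 SDK (the function body and base image are illustrative assumptions):

# component.py - one "lesson" as a containerized step (illustrative sketch)
from kfp import dsl

@dsl.component(base_image="python:3.11")
def fetch_data(source: str, data: dsl.Output[dsl.Dataset]):
    """Buy ingredients: copy the raw data into the pipeline's artifact store."""
    import urllib.request
    urllib.request.urlretrieve(source, data.path)  # assumes an HTTP(S)-reachable source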
# pipeline.py - Chef training curriculum
from kfp import dsl

@dsl.pipeline(name="train-iris-chef")
def train_chef():
    # Lesson 1: Get ingredients
    data = fetch_data(source="s3://data/iris.csv")

    # Lesson 2: Prepare ingredients
    prepared_data = prepare_data(data=data.output)

    # Lesson 3: Intensive course (training)
    model = train_model(
        data=prepared_data.output,
        algorithm="sklearn.RandomForest",
        epochs=100,
    )

    # Lesson 4: Final exam (evaluation)
    metrics = evaluate_model(model=model.output)

    # Lesson 5: Issue diploma (register in MLflow)
    register_mlflow(
        model=model.output,
        metrics=metrics.output,
        name="iris-chef",
    )

┌─────────────────────────────────────────────────────────────────────────────┐
│ TRAINING OPERATORS │
├─────────────────────────────────────────────────────────────────────────────┤
│ │
│ TFJob For training with TensorFlow │
│ (TensorFlow room) "Room with Google equipment" │
│ │
│ PyTorchJob For training with PyTorch │
│ (PyTorch room) "Room with Meta equipment" │
│ │
│ MPIJob For distributed training │
│ (distributed room) "Multiple connected rooms for large classes" │
│ │
│ XGBoostJob For training with XGBoost │
│ (XGBoost room) "Room specialized in tabular data" │
│ │
│ │
│ Why different rooms? │
│ Each framework has specific resource and configuration needs. │
│ Operators abstract this complexity away. │
│ │
└─────────────────────────────────────────────────────────────────────────────┘
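To actually run the curriculum, the pipeline defined above is compiled and submitted to the cluster. A minimal sketch with the KFP v2 SDK; the host URL is an assumption for an in-cluster Kubeflow Pipelines endpoint:

# run_pipeline.py - compile the curriculum and enroll the apprentice
from kfp import compiler, Client
from pipeline import train_chef

# Compile the @dsl.pipeline function into a portable YAML definition
compiler.Compiler().compile(train_chef, "train_chef_pipeline.yaml")

# Submit a run to the Kubeflow Pipelines API (host is an assumption)
client = Client(host="http://ml-pipeline.kubeflow.svc.cluster.local:8888")
client.create_run_from_pipeline_package("train_chef_pipeline.yaml")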
MLflow is a platform for managing the ML lifecycle. It's like the professional registry that keeps history, certificates, and versions of all chefs.
┌─────────────────────────────────────────────────────────────────────────────┐
│ MLFLOW = PROFESSIONAL CHEF REGISTRY │
├─────────────────────────────────────────────────────────────────────────────┤
│ │
│ MLFLOW COMPONENT ANALOGY │
│ ──────────────── ─────── │
│ │
│ MLflow Tracking = Grade book │
│ (all attempts, parameters, results) │
│ │
│ MLflow Models = Standard diploma format │
│ (certificate any kitchen accepts) │
│ │
│ MLflow Model Registry = Records office │
│ (official versions, status, approvals) │
│ │
│ MLflow Projects = Instruction manual │
│ (how to reproduce the training) │
│ │
└─────────────────────────────────────────────────────────────────────────────┘
Each training attempt is recorded with all details:
┌─────────────────────────────────────────────────────────────────────────────┐
│ EXPERIMENT: train-iris-chef │
├─────────────────────────────────────────────────────────────────────────────┤
│ │
│ RUN #1 (first attempt) │
│ ────────────────────── │
│ Parameters: n_estimators=10, max_depth=3 │
│ Metrics: accuracy=0.85, f1_score=0.83 │
│ Artifacts: model.pkl, confusion_matrix.png │
│ Status: ❌ Failed (accuracy < 0.90) │
│ │
│ RUN #2 (second attempt) │
│ ─────────────────────── │
│ Parameters: n_estimators=100, max_depth=5 │
│ Metrics: accuracy=0.92, f1_score=0.91 │
│ Artifacts: model.pkl, confusion_matrix.png │
│ Status: ❌ Failed (overfitting detected) │
│ │
│ RUN #3 (third attempt) │
│ ────────────────────── │
│ Parameters: n_estimators=50, max_depth=4 │
│ Metrics: accuracy=0.95, f1_score=0.94 │
│ Artifacts: model.pkl, confusion_matrix.png │
│ Status: ✅ Passed! Register in Model Registry │
│ │
│ Analogy: It's like keeping all exams, with grades and comments, │
│ to know exactly what worked and what didn't. │
│ │
└─────────────────────────────────────────────────────────────────────────────┘
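Runs like these can also be queried and ranked programmatically instead of through the UI. A minimal sketch, assuming the experiment name used above:

import mlflow

# Grade book query: list all runs of the experiment, best accuracy first
runs = mlflow.search_runs(
    experiment_names=["train-iris-chef"],
    order_by=["metrics.accuracy DESC"],
)
print(runs[["run_id", "params.max_depth", "metrics.accuracy"]])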
┌─────────────────────────────────────────────────────────────────────────────┐
│ MODEL REGISTRY: iris-chef │
├─────────────────────────────────────────────────────────────────────────────┤
│ │
│ VERSION STAGE DESCRIPTION │
│ ─────── ───── ─────────── │
│ │
│ v1 Archived "Retired chef" │
│ First model, accuracy 0.85 │
│ Replaced by v2 │
│ │
│ v2 Production "Working chef" ← CURRENT IN KSERVE │
│ Current model, accuracy 0.95 │
│ In production since 2024-01-15 │
│ │
│ v3 Staging "Chef in testing" │
│ New model, accuracy 0.97 │
│ Awaiting approval for production │
│ │
│ ═══════════════════════════════════════════════════════════════════════ │
│ │
│ STAGES: │
│ │
│ None → Model registered but not classified │
│ Staging → In testing/validation (intern chef) │
│ Production → In active use (hired chef) │
│ Archived → Retired/historical (retired chef) │
│ │
└─────────────────────────────────────────────────────────────────────────────┘
import mlflow
from mlflow.tracking import MlflowClient

# During training - log experiment
with mlflow.start_run():
    # Parameters (recipe ingredients)
    mlflow.log_param("n_estimators", 50)
    mlflow.log_param("max_depth", 4)

    # ... train model ...

    # Metrics (exam grades)
    mlflow.log_metric("accuracy", 0.95)
    mlflow.log_metric("f1_score", 0.94)

    # Artifacts (diploma + portfolio)
    mlflow.sklearn.log_model(model, "model")
    mlflow.log_artifact("confusion_matrix.png")

# After approval - register official version
client = MlflowClient()
client.create_registered_model("iris-chef")
client.create_model_version(
    name="iris-chef",
    source="runs:/abc123/model",
    run_id="abc123",
)

# Promote to production
client.transition_model_version_stage(
    name="iris-chef",
    version="2",
    stage="Production",
)

┌─────────────────────────────────────────────────────────────────────────────┐
│ │
│ COMPLETE MLOPS FLOW │
│ │
│ ┌─────────────────────────────────────────────────────────────────────┐ │
│ │ 1. KUBEFLOW │ │
│ │ (Culinary School) │ │
│ │ │ │
│ │ ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐ │ │
│ │ │ Notebook │─▶│ Pipeline │─▶│ Training │─▶│ Trained │ │ │
│ │ │(experim.)│ │ (ETL) │ │ Job │ │ Model │ │ │
│ │ └──────────┘ └──────────┘ └──────────┘ └─────┬────┘ │ │
│ │ │ │ │
│ └──────────────────────────────────────────────────────┼──────────────┘ │
│ │ │
│ ▼ │
│ ┌─────────────────────────────────────────────────────────────────────┐ │
│ │ 2. MLFLOW │ │
│ │ (Professional Registry) │ │
│ │ │ │
│ │ ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐ │ │
│ │ │ Track │─▶│ Compare │─▶│ Register │─▶│ Promote │ │ │
│ │ │experiment│ │ runs │ │ model │ │to "Prod" │ │ │
│ │ └──────────┘ └──────────┘ └──────────┘ └─────┬────┘ │ │
│ │ │ │ │
│ └──────────────────────────────────────────────────────┼──────────────┘ │
│ │ │
│ ▼ │
│ ┌─────────────────────────────────────────────────────────────────────┐ │
│ │ 3. KSERVE │ │
│ │ (Industrial Kitchen) │ │
│ │ │ │
│ │ ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐ │ │
│ │ │ Deploy │─▶│ Serve │─▶│ Scale │─▶│ Monitor │ │ │
│ │ │ model │ │ requests │ │ as needed│ │ & logs │ │ │
│ │ └──────────┘ └──────────┘ └──────────┘ └──────────┘ │ │
│ │ │ │
│ └─────────────────────────────────────────────────────────────────────┘ │
│ │
└─────────────────────────────────────────────────────────────────────────────┘
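Once a version is promoted, any consumer can pull it straight from the registry by stage. A minimal sketch with the MLflow Python API (assumes the tracking URI is configured, e.g. via MLFLOW_TRACKING_URI):

import mlflow.pyfunc

# "models:/<name>/<stage>" resolves to whatever version is in that stage
model = mlflow.pyfunc.load_model("models:/iris-chef/Production")
print(model.predict([[5.1, 3.5, 1.4, 0.2]]))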
KServe can fetch models directly from MLflow Model Registry:
# inference-service-mlflow.yaml
apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: iris-chef-prod
spec:
  predictor:
    model:
      modelFormat:
        name: sklearn
      # Artifact path of the "iris-chef" version promoted to "Production" in MLflow
      storageUri: "gs://mlflow-bucket/mlartifacts/123/abc/artifacts/model"

┌────────────────────────────────────────────────────────────────────────────┐
│ TOOLS COMPARISON │
├────────────────┬───────────────────┬───────────────────┬───────────────────┤
│ │ KUBEFLOW │ MLFLOW │ KSERVE │
├────────────────┼───────────────────┼───────────────────┼───────────────────┤
│ Analogy │ Culinary │ Professional │ Industrial │
│ │ School │ Registry │ Kitchen │
├────────────────┼───────────────────┼───────────────────┼───────────────────┤
│ Focus │ Training and │ Tracking and │ Production │
│ │ experimentation │ versioning │ serving │
├────────────────┼───────────────────┼───────────────────┼───────────────────┤
│ Lifecycle │ Development │ Management/ │ Production │
│ phase │ │ Governance │ │
├────────────────┼───────────────────┼───────────────────┼───────────────────┤
│ Main │ • Pipelines │ • Tracking │ • InferenceService│
│ features │ • Notebooks │ • Model Registry │ • Auto-scaling │
│ │ • Training Jobs │ • Artifacts │ • Canary deploy │
├────────────────┼───────────────────┼───────────────────┼───────────────────┤
│ Runs on │ Kubernetes │ Standalone or K8s │ Kubernetes │
├────────────────┼───────────────────┼───────────────────┼───────────────────┤
│ Used by │ Data Scientists │ DS + ML Engineers │ ML Engineers + │
│ │ │ │ DevOps │
└────────────────┴───────────────────┴───────────────────┴───────────────────┘
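In practice the storageUri in the manifest above is looked up from the registry rather than hand-copied. A minimal sketch (note that get_latest_versions is deprecated in newer MLflow releases in favor of aliases, but still works):

from mlflow.tracking import MlflowClient

client = MlflowClient()

# Artifact location of the version currently in "Production" -
# this is what goes into the InferenceService storageUri
for mv in client.get_latest_versions("iris-chef", stages=["Production"]):
    print(mv.version, mv.source)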
DAY 1: EXPERIMENTATION (Data Scientist in Kubeflow)
───────────────────────────────────────────────────
1. Opens Kubeflow Notebook
2. Loads data, explores, tests algorithms
3. Finds a promising approach
4. MLflow automatically logs each experiment
DAY 2: TRAINING PIPELINE (Kubeflow + MLflow)
────────────────────────────────────────────
1. Creates Pipeline in Kubeflow with steps:
- Fetch data → Preprocess → Train → Evaluate → Register
2. Executes Pipeline
3. MLflow logs metrics and artifacts
4. Model approved → registers in Model Registry
DAY 3: DEPLOY TO PRODUCTION (KServe)
────────────────────────────────────
1. ML Engineer creates InferenceService pointing to MLflow
2. KServe downloads model from Model Registry
3. Model starts serving requests
4. Active monitoring
DAY 30: NEW MODEL (Cycle continues)
────────────────────────────────────
1. New model trained (v3) with better accuracy
2. Deploy as canary (10% of traffic)
3. Good metrics → promote to 100%
4. Old model (v2) archived
You trained your chef (model). Now you need a professional kitchen to serve thousands of customers. This is where KServe comes in.
┌─────────────────────────────────────────────────────────────────────────────┐
│ │
│ WITHOUT KSERVE (improvised kitchen) WITH KSERVE (industrial kitchen) │
│ ─────────────────────────────────── ──────────────────────────────── │
│ │
│ - You set up everything manually - Kitchen ready to use │
│ - Configure stove, sink, fridge - Just bring the chef (model) │
│ - Hire waiters - Automatic service │
│ - Manage queues manually - Managed queues │
│ - If crowded, customers wait - Opens more kitchens (auto-scaling)│
│ - Kitchen always on - Turns off when empty (scale-zero) │
│ │
└─────────────────────────────────────────────────────────────────────────────┘
What you provide: What KServe provides:
──────────────────── ────────────────────────
┌──────────────────┐ ┌──────────────────────────────────┐
│ │ │ • Web server (waiter) │
│ YOUR MODEL │ │ • Load balancing │
│ (chef.pkl) │───────────▶│ • Auto-scaling │
│ │ │ • Health checks │
└──────────────────┘ │ • Logging and metrics │
│ • Automatic model download │
└──────────────────────────────────┘
The InferenceService is like a contract you sign with the industrial kitchen:
"I want to serve this model, it's stored at this address, and it's sklearn type"
# "Contract" with the KServe kitchen
apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
name: sklearn-iris # Your "restaurant" name
spec:
predictor:
model:
modelFormat:
name: sklearn # Chef type (sklearn, tensorflow, pytorch...)
storageUri: "gs://..." # Where the chef is storedDifferent chefs need different kitchens:
┌─────────────────────────────────────────────────────────────────────────────┐
│ KITCHEN TYPES (ServingRuntimes) │
├─────────────────────────────────────────────────────────────────────────────┤
│ │
│ Sklearn Chef ──▶ kserve-sklearnserver kitchen (simple, light) │
│ TensorFlow Chef ──▶ kserve-tensorflow kitchen (heavy, GPUs) │
│ PyTorch Chef ──▶ kserve-torchserve kitchen (flexible) │
│ XGBoost Chef ──▶ kserve-xgbserver kitchen (tabular data) │
│ HuggingFace Chef ──▶ kserve-huggingfaceserver kitchen (LLMs, NLP) │
│ │
│ You don't need to choose! KServe automatically detects based on │
│ the modelFormat you specified. │
│ │
└─────────────────────────────────────────────────────────────────────────────┘
The model needs to be somewhere accessible:
┌─────────────────────────────────────────────────────────────────────────────┐
│ WHERE TO STORE THE MODEL │
├─────────────────────────────────────────────────────────────────────────────┤
│ │
│ gs://bucket/model Google Cloud Storage (Google cloud) │
│ s3://bucket/model AWS S3 or MinIO (Amazon cloud or self-hosted) │
│ http://server/model Public HTTP server │
│ pvc://my-pvc/model Volume inside Kubernetes │
│ │
│ Analogy: It's like the warehouse address where your chef is waiting │
│ to be picked up and taken to the kitchen. │
│ │
└─────────────────────────────────────────────────────────────────────────────┘
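Getting the chef into the warehouse is a plain object upload. A minimal sketch with boto3 against a self-hosted MinIO (endpoint, credentials, bucket, and file names are all assumptions):

import boto3

# Upload the serialized model to the "warehouse" KServe will download from
s3 = boto3.client(
    "s3",
    endpoint_url="http://minio.minio-system.svc:9000",  # assumed MinIO endpoint
    aws_access_key_id="minio",                          # assumed credentials
    aws_secret_access_key="minio123",
)
s3.upload_file("model.joblib", "models", "iris/model.joblib")
# The InferenceService would then use: storageUri: "s3://models/iris/"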
┌─────────────────────────────────────────────────────────────────────────────┐
│ ORDER FLOW │
├─────────────────────────────────────────────────────────────────────────────┤
│ │
│ ┌─────────────┐ │
│ ORDER │ TRANSFORMER │ Prep cook (assistant) │
│ (request) ───▶ │ (optional) │ - Washes ingredients │
│ │ │ - Cuts, seasons │
│ └──────┬──────┘ - Prepares for the chef │
│ │ │
│ ▼ │
│ ┌─────────────┐ │
│ │ PREDICTOR │ Head chef (REQUIRED) │
│ │ (model) │ - Makes the dish (inference) │
│ │ │ - Is your ML model │
│ └──────┬──────┘ │
│ │ │
│ ▼ │
│ ┌─────────────┐ │
│ RESPONSE │ EXPLAINER │ Sommelier (optional) │
│ (response) ◀─── │ (optional) │ - Explains why this dish │
│ │ │ - "I chose this wine because..." │
│ └─────────────┘ - Useful for understanding ML decisions│
│ │
└─────────────────────────────────────────────────────────────────────────────┘
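The prep cook can be implemented with the kserve Python SDK by overriding preprocess/postprocess. A rough sketch following the pattern in the KServe docs; the class name, scaling step, and label map are illustrative assumptions:

from kserve import Model, ModelServer

class IrisTransformer(Model):
    """Prep cook: adjusts each order before it reaches the chef (predictor)."""

    def __init__(self, name: str, predictor_host: str):
        super().__init__(name)
        self.predictor_host = predictor_host  # where the predictor pod listens
        self.ready = True

    def preprocess(self, payload: dict, headers: dict = None) -> dict:
        # Illustrative prep step: assume raw inputs arrive in millimeters
        instances = [[v / 10 for v in row] for row in payload["instances"]]
        return {"instances": instances}

    def postprocess(self, response: dict, headers: dict = None) -> dict:
        # Garnish: translate class indices into species names
        names = {0: "setosa", 1: "versicolor", 2: "virginica"}
        return {"predictions": [names[p] for p in response["predictions"]]}

if __name__ == "__main__":
    ModelServer().start([IrisTransformer("sklearn-iris", "sklearn-iris-predictor")])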
PROBLEM: Identify flower species based on petal/sepal measurements
┌─────────────────────────────────────────────────────────────────────────┐
│ │
│ INPUT (flower measurements): │
│ ┌─────────────────────────────────────────┐ │
│ │ sepal_length: 5.1 cm │ │
│ │ sepal_width: 3.5 cm │ ─────▶ Model │
│ │ petal_length: 1.4 cm │ │ │
│ │ petal_width: 0.2 cm │ │ │
│ └─────────────────────────────────────────┘ │ │
│ ▼ │
│ OUTPUT (prediction): ◀─────── Class: 0 │
│ ┌─────────────────────────────────────────┐ (Iris Setosa) │
│ │ 0 = Iris Setosa │ │
│ │ 1 = Iris Versicolor │ │
│ │ 2 = Iris Virginica │ │
│ └─────────────────────────────────────────┘ │
│ │
└─────────────────────────────────────────────────────────────────────────┘
1. YOU APPLY THE YAML
─────────────────────────────────────────────────────────────────────
kubectl apply -f sklearn-iris.yaml
"Hey KServe, I want to open a restaurant called sklearn-iris"
2. KSERVE CREATES THE INFRASTRUCTURE
─────────────────────────────────────────────────────────────────────
┌──────────────────────────────────────────────────────────┐
│ KServe creates: │
│ │
│ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ │
│ │ Deployment │ │ Service │ │ Ingress │ │
│ │ (kitchen) │ │ (waiter) │ │ (entrance) │ │
│ └─────────────┘ └─────────────┘ └─────────────┘ │
│ │
└──────────────────────────────────────────────────────────┘
3. STORAGE-INITIALIZER DOWNLOADS THE MODEL
─────────────────────────────────────────────────────────────────────
gs://bucket/model ────download────▶ /mnt/models/
"Go to the warehouse and bring the chef to the kitchen"
4. SERVER STARTS AND LOADS THE MODEL
─────────────────────────────────────────────────────────────────────
kserve-sklearnserver loads sklearn-iris.pkl into memory
"Chef is in the kitchen, ready to cook!"
5. CLIENT MAKES REQUEST
─────────────────────────────────────────────────────────────────────
curl -X POST .../v1/models/sklearn-iris:predict
-d '{"instances": [[5.1, 3.5, 1.4, 0.2]]}'
"Table 7 wants a dish with these ingredients"
6. MODEL RETURNS PREDICTION
─────────────────────────────────────────────────────────────────────
{"predictions": [0]}
"Here it is: Iris Setosa! (class 0)"
# sklearn-iris.yaml - Each line explained
apiVersion: serving.kserve.io/v1beta1   # KServe "recipe" version
kind: InferenceService                  # Resource type: inference service
metadata:
  name: sklearn-iris                    # Your service name (how you'll call it)
spec:
  predictor:                            # "Head chef" configuration
    model:
      modelFormat:
        name: sklearn                   # Model type (sklearn, tensorflow, etc.)
                                        # KServe uses this to choose the right kitchen
      storageUri: "gs://kfserving-examples/models/sklearn/1.0/model"
                                        # Address where the model is stored
                                        # KServe will download it automatically
      resources:                        # How many resources the kitchen needs
        requests:                       # Minimum guaranteed
          cpu: "100m"                   # 0.1 CPU (100 millicores)
          memory: "256Mi"               # 256 MiB of RAM
        limits:                         # Maximum allowed
          cpu: "500m"                   # 0.5 CPU
          memory: "512Mi"               # 512 MiB of RAM

┌─────────────────────────────────────────────────────────────────────────────┐
│ V1 PROTOCOL │
├─────────────────────────────────────────────────────────────────────────────┤
│ │
│ Endpoint: POST /v1/models/{model-name}:predict │
│ │
│ Request (order): │
│ { │
│ "instances": [ // List of "orders" │
│ [5.1, 3.5, 1.4, 0.2], // Order 1: one flower │
│ [6.7, 3.1, 4.4, 1.4] // Order 2: another flower │
│ ] │
│ } │
│ │
│ Response: │
│ { │
│ "predictions": [0, 1] // Answers: Setosa, Versicolor │
│ } │
│ │
│ Analogy: "I want these two dishes" → "Here are the two dishes" │
│ │
└─────────────────────────────────────────────────────────────────────────────┘
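From Python, a V1 call is a single JSON POST. A minimal sketch with requests, assuming the port-forward shown in the cheat sheet later in this guide:

import requests

# Two "orders" in one request, as in the example above
url = "http://localhost:8080/v1/models/sklearn-iris:predict"
payload = {"instances": [[5.1, 3.5, 1.4, 0.2], [6.7, 3.1, 4.4, 1.4]]}

resp = requests.post(url, json=payload, timeout=10)
resp.raise_for_status()
print(resp.json())  # expected: {"predictions": [0, 1]}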
┌─────────────────────────────────────────────────────────────────────────────┐
│ V2 PROTOCOL │
├─────────────────────────────────────────────────────────────────────────────┤
│ │
│ Endpoint: POST /v2/models/{model-name}/infer │
│ │
│ Request (more detailed): │
│ { │
│ "inputs": [{ │
│ "name": "input-0", // Ingredient name │
│ "shape": [1, 4], // Format: 1 order, 4 values │
│ "datatype": "FP32", // Type: decimal numbers │
│ "data": [[5.1, 3.5, 1.4, 0.2]] // The values │
│ }] │
│ } │
│ │
│ Advantage: More precise, industry standard, works with any │
│ framework. It's like an order with all specifications. │
│ │
└─────────────────────────────────────────────────────────────────────────────┘
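The same order in the V2 protocol carries explicit tensor metadata. A minimal sketch, again assuming a local port-forward:

import requests

url = "http://localhost:8080/v2/models/sklearn-iris/infer"
payload = {
    "inputs": [{
        "name": "input-0",        # tensor name
        "shape": [1, 4],          # one flower, four measurements
        "datatype": "FP32",       # 32-bit floats
        "data": [[5.1, 3.5, 1.4, 0.2]],
    }]
}
print(requests.post(url, json=payload, timeout=10).json())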
┌─────────────────────────────────────────────────────────────────────────────┐
│ INFERENCESERVICE STATES │
├─────────────────────────────────────────────────────────────────────────────┤
│ │
│ ┌─────────┐ │
│ │ Unknown │ "Order received" │
│ └────┬────┘ │
│ │ │
│ ▼ │
│ ┌─────────┐ │
│ │ Pending │ "Setting up kitchen, fetching chef" │
│ └────┬────┘ │
│ │ │
│ ┌─────────┴─────────┐ │
│ ▼ ▼ │
│ ┌─────────┐ ┌─────────┐ │
│ │ Ready │ │ Failed │ │
│ │ ✓ │ │ ✗ │ │
│ └─────────┘ └─────────┘ │
│ "Restaurant open!" "Something went wrong" │
│ │
│ Check: kubectl get isvc sklearn-iris │
│ READY=True means it's working! │
│ │
└─────────────────────────────────────────────────────────────────────────────┘
# See all conditions
kubectl get isvc sklearn-iris -o jsonpath='{range .status.conditions[*]}{.type}: {.status}{"\n"}{end}'
# Result:
# IngressReady: True ← Restaurant door is open
# PredictorReady: True ← Chef is in the kitchen
# Ready: True           ← Everything working!

┌─────────────────────────────────────────────────────────────────────────────┐
│ PROBLEM │ ANALOGY │ SOLUTION │
├───────────────────────────────┼────────────────────────┼────────────────────┤
│ │ │ │
│ Pod in "Pending" │ Kitchen doesn't have │ Check if there │
│ │ space/resources │ are resources in │
│ │ │ the cluster │
│ │ │ │
├───────────────────────────────┼────────────────────────┼────────────────────┤
│ │ │ │
│ Pod in "CrashLoopBackOff" │ Chef arrives and │ Check logs: │
│ │ faints │ kubectl logs ... │
│ │ │ │
├───────────────────────────────┼────────────────────────┼────────────────────┤
│ │ │ │
│ "Model not found" │ Chef isn't at the │ Check │
│ │ given address │ storageUri │
│ │ │ │
├───────────────────────────────┼────────────────────────┼────────────────────┤
│ │ │ │
│ Download timeout │ Warehouse too far │ Model too large │
│ │ or closed │ or URL │
│ │ │ inaccessible │
│ │ │ │
└─────────────────────────────────────────────────────────────────────────────┘
# 1. Check general status
kubectl get isvc sklearn-iris
# READY=False? Something's wrong!
# 2. See details and events
kubectl describe isvc sklearn-iris
# Look for "Events" at the bottom
# 3. Check pods
kubectl get pods -l serving.kserve.io/inferenceservice=sklearn-iris
# STATUS other than "Running"? Problem!
# 4. Check model download logs (storage-initializer)
kubectl logs <pod-name> -c storage-initializer
# Download errors appear here
# 5. Check server logs (kserve-container)
kubectl logs <pod-name> -c kserve-container
# Model loading errors appear here

# ══════════════════════════════════════════════════════════════════════════════
# DAILY COMMANDS
# ══════════════════════════════════════════════════════════════════════════════
# APPLY a model
kubectl apply -f sklearn-iris.yaml
# LIST all models
kubectl get isvc
# SEE DETAILS of a model
kubectl describe isvc sklearn-iris
# CHECK LOGS of the model
kubectl logs -l serving.kserve.io/inferenceservice=sklearn-iris -c kserve-container
# DELETE a model
kubectl delete isvc sklearn-iris
# ══════════════════════════════════════════════════════════════════════════════
# TEST INFERENCE
# ══════════════════════════════════════════════════════════════════════════════
# 1. Find the pod
POD=$(kubectl get pods -l serving.kserve.io/inferenceservice=sklearn-iris \
-o jsonpath='{.items[0].metadata.name}')
# 2. Open tunnel (in one terminal)
kubectl port-forward pod/$POD 8080:8080
# 3. Make request (in another terminal)
curl -X POST http://localhost:8080/v1/models/sklearn-iris:predict \
-H "Content-Type: application/json" \
-d '{"instances": [[5.1, 3.5, 1.4, 0.2]]}'
# Expected response: {"predictions": [0]}

| Term | Meaning | Analogy |
|---|---|---|
| Model | File with learned "knowledge" | Trained chef |
| Training | Process of creating the model | Culinary school course |
| Inference | Using model to predict | Cooking a dish |
| Serving | Making model available as service | Opening a restaurant |
| MLOps | Practices for operationalizing ML | Restaurant chain management |
| Term | Meaning | Analogy |
|---|---|---|
| Pipeline | Sequence of automated steps | Training curriculum |
| Component | Individual pipeline step | One class in the course |
| Experiment | Set of pipeline executions | Class of students |
| Run | One pipeline execution | One student taking the course |
| Notebook | Interactive development environment | Experimental laboratory |
| Training Operator | Manages training jobs | Specialized classroom |
| Katib | Hyperparameter optimization | Finding best teaching method |
| Term | Meaning | Analogy |
|---|---|---|
| Experiment | Group of related attempts | Course record |
| Run | One training attempt | One exam/evaluation |
| Parameter | Configuration used in training | Recipe ingredients |
| Metric | Performance measurement | Exam grade |
| Artifact | Generated file (model, charts) | Diploma + portfolio |
| Model Registry | Catalog of official models | Records office |
| Stage | Model status (Staging/Production) | Intern vs Hired |
| Term | Meaning | Analogy |
|---|---|---|
| InferenceService | Resource that defines serving | Restaurant contract |
| Predictor | Component that makes predictions | Head chef |
| Transformer | Pre/post-processing | Kitchen assistant |
| Explainer | Explains predictions | Sommelier |
| StorageUri | Where model is stored | Warehouse address |
| ServingRuntime | Execution environment | Kitchen type |
| Scale-to-zero | Turn off when no orders | Close kitchen at night |
| Auto-scaling | Automatically adjust capacity | Open more kitchens |
| Canary | Gradual deploy of new version | Test new chef with few customers |
Now that you understand the concepts, you can:
- Create a Notebook: Experiment with data in the Kubeflow environment
- Create a Pipeline: Automate the training flow
- Use Katib: Automatically optimize hyperparameters
- Instrument your code: Add `mlflow.log_*` calls in training
- Use Model Registry: Version and promote models
- Compare experiments: Use MLflow UI for analysis
- Try other models: XGBoost, TensorFlow
- Add Transformer: Preprocess data before inference
- Configure auto-scaling: Adjust replicas based on demand
- Canary deploy: Test new model with part of the traffic
- Complete pipeline: Kubeflow → MLflow → KServe automated
- Monitor: Prometheus/Grafana for inference metrics
- CI/CD for ML: GitHub Actions + ArgoCD for automatic deploy
┌─────────────────────────────────────────────────────────────────────────────┐
│ │
│ MLOPS ECOSYSTEM = RESTAURANT CHAIN │
│ │
│ ┌─────────────────────────────────────────────────────────────────────┐ │
│ │ │ │
│ │ KUBEFLOW MLFLOW KSERVE │ │
│ │ ════════ ══════ ══════ │ │
│ │ │ │
│ │ Culinary Professional Industrial │ │
│ │ School Registry Kitchen │ │
│ │ │ │
│ │ "Where the chef "Where the chef "Where the chef │ │
│ │ is trained" is cataloged" works" │ │
│ │ │ │
│ │ • Notebooks • Tracking • Serving │ │
│ │ • Pipelines • Model Registry • Auto-scaling │ │
│ │ • Training Jobs • Artifacts • Canary deploy │ │
│ │ │ │
│ └─────────────────────────────────────────────────────────────────────┘ │
│ │
│ Flow: Train (Kubeflow) → Register (MLflow) → Serve (KServe) │
│ │
└─────────────────────────────────────────────────────────────────────────────┘
Remember: Kubeflow is the school, MLflow is the registry, KServe is the kitchen. Together, they form the complete infrastructure to take your model from notebook to production!