Zero-Downtime Deployments in Kubernetes: A Practical Guide

Zero Downtime Is Harder Than It Looks

Kubernetes rolling updates sound like a free zero-downtime deployment — replace old pods with new ones gradually. In practice, teams still see errors during deployments because they skip the configuration details that make rolling updates actually safe: readiness probes, graceful shutdown, pod disruption budgets, and database migrations that maintain backward compatibility. This guide covers each piece.

Readiness Probes: The Foundation

A readiness probe tells Kubernetes when a pod is ready to receive traffic. Without it, Kubernetes routes requests to new pods the moment they start — before the application has finished initialising, connecting to the database, or warming up caches. Configure a readiness probe for every production workload.

spec:
  containers:
    - name: api
      image: myapp:v2
      readinessProbe:
        httpGet:
          path: /health/ready
          port: 3000
        initialDelaySeconds: 5
        periodSeconds: 5
        failureThreshold: 3
        successThreshold: 1
      livenessProbe:
        httpGet:
          path: /health/live
          port: 3000
        initialDelaySeconds: 15
        periodSeconds: 20
        failureThreshold: 3

Implement two distinct health endpoints: /health/ready returns 503 until the app is ready (database connected, cache warmed), and /health/live always returns 200 as long as the process is running. The liveness probe restarts stuck pods; the readiness probe gates traffic routing.
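The two endpoints can be sketched with Node's built-in http module. This is a minimal illustration rather than a prescribed implementation: the `state` object and the `healthHandler` name are assumptions made for the example, and a real service would flip `state.ready` only after its own startup work has finished.

```javascript
const http = require('node:http')

// Assumption: startup code sets this to true once the database is connected
// and caches are warmed.
const state = { ready: false }

function healthHandler(req, res) {
  if (req.url === '/health/live') {
    // Liveness: the process is up and able to answer.
    res.writeHead(200, { 'Content-Type': 'text/plain' })
    res.end('ok')
    return
  }
  if (req.url === '/health/ready') {
    // Readiness: 503 keeps the pod out of Service endpoints until startup is done.
    res.writeHead(state.ready ? 200 : 503, { 'Content-Type': 'text/plain' })
    res.end(state.ready ? 'ready' : 'starting')
    return
  }
  res.writeHead(404)
  res.end()
}

const healthServer = http.createServer(healthHandler)
// healthServer.listen(3000)  // same port as the probe configuration above
```

In an existing Express or similar app, the same two checks would normally be registered as routes on the main server rather than on a separate one.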

Rolling Update Strategy Configuration

spec:
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1        # max extra pods above desired count during update
      maxUnavailable: 0  # never take a pod down before a new one is ready

maxUnavailable: 0 is the key setting for zero downtime — Kubernetes will not terminate an old pod until a new pod passes its readiness probe. maxSurge: 1 allows one extra pod to exist during the rollout so there is never a capacity drop.
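To see what a given pair of settings allows, the arithmetic can be worked through in a few lines. The helper below is an illustrative sketch (the `rolloutBounds` and `resolveCount` names are made up here); it follows the Kubernetes rule that percentage values round up for maxSurge and down for maxUnavailable.

```javascript
// Resolve an absolute count (number) or a percentage string such as '25%'
// against the desired replica count.
function resolveCount(value, replicas, roundUp) {
  if (typeof value === 'number') return value
  const exact = (parseFloat(value) / 100) * replicas
  return roundUp ? Math.ceil(exact) : Math.floor(exact)
}

// Upper bound on total pods and lower bound on available pods during a rollout.
function rolloutBounds(replicas, maxSurge, maxUnavailable) {
  const surge = resolveCount(maxSurge, replicas, true)
  const unavailable = resolveCount(maxUnavailable, replicas, false)
  return {
    maxTotalPods: replicas + surge,
    minAvailablePods: replicas - unavailable,
  }
}

console.log(rolloutBounds(3, 1, 0))
console.log(rolloutBounds(10, '25%', '25%'))
```

With 3 replicas and the settings above, the rollout may run up to 4 pods and never drops below 3 available. Make sure the cluster has room to schedule the surge pod, otherwise the rollout stalls.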

Graceful Shutdown

When Kubernetes terminates a pod, it sends SIGTERM to the pod's containers, which then have terminationGracePeriodSeconds (default 30s) to finish in-flight requests before they are killed with SIGKILL. If your application exits immediately on SIGTERM, active requests fail, so handle the signal and shut down gracefully.

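// Node.js graceful shutdown
// Assumption: `app` is an Express app; any framework whose listen() returns an http.Server behaves the same.
const express = require('express')
const app = express()
const server = app.listen(3000)

process.on('SIGTERM', () => {
  console.log('SIGTERM received, shutting down gracefully')

  // Stop accepting new connections; the callback runs once all open connections have ended.
  server.close(() => {
    // Close database connections, flush logs, etc.
    process.exit(0)
  })

  // Force exit after 25s (before k8s kills the pod at 30s).
  // unref() keeps this timer from holding the process open on its own.
  setTimeout(() => {
    console.error('Forced shutdown after timeout')
    process.exit(1)
  }, 25_000).unref()
})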

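Set terminationGracePeriodSeconds to at least your P99 request duration plus 10 seconds; for most APIs that works out to 30–60 seconds. If you raise it above the 30s default, raise the application's force-exit timeout (25s in the code above) as well, keeping it a few seconds below the grace period.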
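The rule of thumb translates into a small calculation. The sketch below is one way to derive both numbers from a measured P99; the `shutdownTimings` name, the 30-second floor (the Kubernetes default) and the 5-second force-exit margin (matching the 25s/30s split used earlier) are assumptions chosen for illustration.

```javascript
// Derive the pod grace period and the in-app force-exit timeout from the
// P99 request duration (seconds).
function shutdownTimings(p99Seconds, bufferSeconds = 10, forceExitMarginSeconds = 5) {
  // Never go below the Kubernetes default of 30s (assumption for this sketch).
  const terminationGracePeriodSeconds = Math.max(30, Math.ceil(p99Seconds + bufferSeconds))
  // The application must give up slightly before Kubernetes sends SIGKILL.
  const forceExitMs = (terminationGracePeriodSeconds - forceExitMarginSeconds) * 1000
  return { terminationGracePeriodSeconds, forceExitMs }
}

console.log(shutdownTimings(2))
console.log(shutdownTimings(45))
```

A fast API with a 2-second P99 stays on the defaults, while a 45-second P99 needs a 55-second grace period and a 50-second force-exit timer.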

PodDisruptionBudgets for Node Maintenance

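Rolling updates are not the only reason pods go away. Node drains for maintenance and cluster upgrades evict pods too, and so do spot instance terminations when a termination handler drains the node first. A PodDisruptionBudget limits how many pods these voluntary evictions can take down at once. It does not protect against involuntary disruptions such as a node crashing.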

apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: api-pdb
spec:
  minAvailable: 2  # or use maxUnavailable: 1
  selector:
    matchLabels:
      app: api

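With minAvailable: 2, Kubernetes refuses any eviction that would leave fewer than 2 pods available. This prevents a node drain during a cluster upgrade from taking your entire deployment offline. Run more replicas than minAvailable, though: with exactly 2 replicas no pod can ever be evicted, and node drains will block.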
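The budget check itself is simple arithmetic, sketched below. The function names are hypothetical; the logic mirrors how the budget is evaluated, with allowed disruptions equal to the number of healthy pods minus the required minimum.

```javascript
// Number of pods that may be evicted right now under a minAvailable budget.
function disruptionsAllowed(healthyPods, minAvailable) {
  return Math.max(0, healthyPods - minAvailable)
}

// An eviction request is granted only while at least one disruption is allowed.
function canEvict(healthyPods, minAvailable) {
  return disruptionsAllowed(healthyPods, minAvailable) > 0
}

console.log(canEvict(3, 2)) // 3 healthy pods, budget of 2
console.log(canEvict(2, 2)) // already at the minimum
```

The second call shows why replicas should exceed minAvailable: at exactly the minimum, every eviction is refused.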

Database Migrations: The Hard Part

The trickiest part of zero-downtime deployments is database migrations. If your migration drops a column that the old code reads, requests on old pods will fail. The safe pattern is expand-contract (also called parallel change):

Expand: Deploy migration that adds the new column. Both old and new code work with the new schema.
Migrate: Deploy new application code that uses the new column.
Contract: Deploy migration that drops the old column (now that no code references it).

Never combine schema changes that break backward compatibility with an application deployment in a single release. Split them across at least two deployments with a bake time in between.
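As a concrete illustration of the three steps, the sketch below assumes a hypothetical rename of users.name to users.full_name. The table, the column names, and the `readFullName` helper are invented for this example; the SQL is generic and may need adjusting for your database.

```javascript
// Hypothetical rename: users.name -> users.full_name, split across three releases.
const releases = [
  { step: 'expand', sql: 'ALTER TABLE users ADD COLUMN full_name TEXT' },
  { step: 'migrate', sql: null }, // application deploy only, no schema change
  { step: 'contract', sql: 'ALTER TABLE users DROP COLUMN name' },
]

// Read path for the new application code: prefer the new column and fall back
// to the old one for rows that have not been backfilled yet.
function readFullName(row) {
  return row.full_name ?? row.name ?? null
}

console.log(releases.map((r) => r.step).join(' -> '))
console.log(readFullName({ name: 'Ada' }))
```

Because the read path tolerates both schemas, old and new pods can serve traffic side by side during the rollout. The contract step should wait until every row has been backfilled and no running pod still reads the old column.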

Deployment Verification

# Watch rollout progress
kubectl rollout status deployment/api --timeout=5m

# Verify new version is running
kubectl get pods -l app=api -o jsonpath='{.items[*].spec.containers[*].image}'

# Roll back immediately if something is wrong
kubectl rollout undo deployment/api

Automate post-deployment smoke tests that run against the cluster and trigger an automatic rollback if they fail. The window between deployment and user impact is measured in seconds — manual detection is too slow.
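One way to wire this into a deploy script is sketched below. It reuses the two kubectl commands shown above; the `verifyOrRollback` name and the `smokeTest` callback (expected to throw when a check fails) are assumptions for illustration, and the `exec` parameter exists only so the logic can be exercised without a cluster.

```javascript
const { execFileSync } = require('node:child_process')

// Default command runner: throws if the command exits with a non-zero status.
function runCommand(cmd, args) {
  execFileSync(cmd, args, { stdio: 'inherit' })
}

// Wait for the rollout, run the smoke test, and undo the rollout if either fails.
function verifyOrRollback(deployment, smokeTest, exec = runCommand) {
  try {
    exec('kubectl', ['rollout', 'status', `deployment/${deployment}`, '--timeout=5m'])
    smokeTest() // hypothetical callback: throws on any failed check
    return 'ok'
  } catch (err) {
    console.error(`Verification failed: ${err.message}`)
    exec('kubectl', ['rollout', 'undo', `deployment/${deployment}`])
    return 'rolled-back'
  }
}
```

A CI job could call `verifyOrRollback('api', runSmokeChecks)` right after applying the new manifest and fail the pipeline when the result is 'rolled-back'.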
