Kubernetes Deployment Patterns
Purpose
Practical Kubernetes manifests and Helm chart patterns for deploying containerised applications and ML services. Covers Deployment, Service, Ingress, ConfigMap/Secret, HPA, and rolling update strategies. Synthesized from: Kubernetes Basics, Docker Patterns.
Examples
Complete Deployment + Service + Ingress stack:
# deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: inference-api
  labels:
    app: inference-api
spec:
  replicas: 3
  selector:
    matchLabels:
      app: inference-api
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1        # allow 1 extra pod during update
      maxUnavailable: 0  # never take a pod down before a new one is ready
  template:
    metadata:
      labels:
        app: inference-api
    spec:
      containers:
        - name: inference-api
          image: ghcr.io/myorg/inference-api:1.2.3
          ports:
            - containerPort: 8000
          env:
            - name: MODEL_VERSION
              value: "v3"
            - name: DB_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: db-secret
                  key: password
          resources:
            requests:
              cpu: "500m"
              memory: "512Mi"
            limits:
              cpu: "2"
              memory: "2Gi"
          readinessProbe:
            httpGet:
              path: /health
              port: 8000
            initialDelaySeconds: 10
            periodSeconds: 5
          livenessProbe:
            httpGet:
              path: /health
              port: 8000
            initialDelaySeconds: 30
            periodSeconds: 15
---
# service.yaml
apiVersion: v1
kind: Service
metadata:
  name: inference-api
spec:
  selector:
    app: inference-api
  ports:
    - port: 80
      targetPort: 8000
  type: ClusterIP  # internal only; use Ingress for external access
---
# ingress.yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: inference-api
  annotations:
    nginx.ingress.kubernetes.io/ssl-redirect: "true"
spec:
  ingressClassName: nginx
  rules:
    - host: api.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: inference-api
                port:
                  number: 80
  tls:
    - hosts: [api.example.com]
      secretName: api-tls-secret
ConfigMap and Secret:
apiVersion: v1
kind: ConfigMap
metadata:
  name: app-config
data:
  LOG_LEVEL: "info"
  FEATURE_X: "enabled"
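# Sketch (an assumption, not part of the Deployment above, which reads only
# db-secret): a container can load every key of app-config as environment
# variables by adding this under its container spec:
#   envFrom:
#     - configMapRef:
#         name: app-config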
---
apiVersion: v1
kind: Secret
metadata:
  name: db-secret
type: Opaque
stringData:  # plain-text input; the API server stores it base64-encoded under data
  password: "supersecret"  # placeholder; do not commit real credentials
Horizontal Pod Autoscaler (CPU-based):
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: inference-api-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: inference-api
  minReplicas: 2
  maxReplicas: 20
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
Helm chart structure:
charts/inference-api/
├── Chart.yaml # name, version, appVersion
├── values.yaml # default values
├── templates/
│ ├── deployment.yaml
│ ├── service.yaml
│ ├── ingress.yaml
│ ├── hpa.yaml
│ └── _helpers.tpl # template helpers
└── values-prod.yaml # override for production
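As a rough sketch of how templates/deployment.yaml might read its settings from values.yaml, the fragment below wires the replicaCount, image, resources, and hpa keys into the manifest. It is an illustration under assumptions rather than the chart's actual template: naming resources after .Release.Name and labelling them with .Chart.Name are choices made here, and the container port is hard-coded to 8000 to match the Deployment example.

```yaml
# templates/deployment.yaml (illustrative sketch, not the chart's real template)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: {{ .Release.Name }}
  labels:
    app: {{ .Chart.Name }}
spec:
  # Leave replicas unset when the HPA owns the replica count
  {{- if not .Values.hpa.enabled }}
  replicas: {{ .Values.replicaCount }}
  {{- end }}
  selector:
    matchLabels:
      app: {{ .Chart.Name }}
  template:
    metadata:
      labels:
        app: {{ .Chart.Name }}
    spec:
      containers:
        - name: {{ .Chart.Name }}
          image: "{{ .Values.image.repository }}:{{ .Values.image.tag }}"
          imagePullPolicy: {{ .Values.image.pullPolicy }}
          ports:
            - containerPort: 8000
          resources:
            {{- toYaml .Values.resources | nindent 12 }}
```

Running helm template against the chart directory renders the manifests locally, which is a quick way to check that the indentation produced by nindent is what the API server expects.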
values.yaml:
replicaCount: 2
image:
  repository: ghcr.io/myorg/inference-api
  tag: "1.2.3"  # pin a release; a mutable tag such as latest is not re-pulled under IfNotPresent
  pullPolicy: IfNotPresent
service:
  port: 80
ingress:
  enabled: true
  host: api.example.com
resources:
  requests:
    cpu: 500m
    memory: 512Mi
  limits:
    cpu: 2
    memory: 2Gi
hpa:
  enabled: true
  minReplicas: 2
  maxReplicas: 20
  targetCPUUtilizationPercentage: 70
Deploy with Helm:
# Install
helm install inference-api ./charts/inference-api \
  --namespace prod --create-namespace \
  -f charts/inference-api/values-prod.yaml
# Upgrade
helm upgrade inference-api ./charts/inference-api \
  --namespace prod -f charts/inference-api/values-prod.yaml
# Rollback to revision 1
helm rollback inference-api 1 --namespace prod
kubectl essentials:
# Check rollout status
kubectl rollout status deployment/inference-api
# View logs
kubectl logs -l app=inference-api --tail=100 -f
# Exec into pod
kubectl exec -it deploy/inference-api -- bash
# Port-forward for local testing
kubectl port-forward svc/inference-api 8080:80
Architecture
External traffic
│ HTTPS
▼
Ingress (nginx) → TLS termination
│
▼
Service (ClusterIP) → load balancing across ready pods
│
├── Pod 1 (inference-api)
├── Pod 2 (inference-api) ← HPA adds/removes pods based on CPU
└── Pod 3 (inference-api)
│
▼
ConfigMap (env config) + Secret (credentials)
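The HPA in the diagram scales on CPU utilization. As a worked example of the scaling rule given in the Kubernetes HPA documentation, assume 3 running pods averaging 90% CPU utilization (a made-up load figure) against the 70% target from the HPA manifest:

```latex
\text{desiredReplicas}
  = \left\lceil \text{currentReplicas} \times
      \frac{\text{currentUtilization}}{\text{targetUtilization}} \right\rceil
  = \left\lceil 3 \times \frac{90}{70} \right\rceil
  = \lceil 3.86 \rceil
  = 4
```

Utilization is measured relative to each pod's CPU request (500m here), so the 70% target corresponds to an average of roughly 350m per pod. The result is then clamped to the range minReplicas to maxReplicas (2 to 20 in this example).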
Rolling update sequence:
- helm upgrade or kubectl set image changes the pod template, which creates a new ReplicaSet
- K8s creates a new pod (up to maxSurge) and waits for its readiness probe to pass
- An old pod is terminated (maxUnavailable=0 keeps the full replica count ready throughout)
- The process repeats until all replicas run the new version
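To make the sequence concrete, the bounds the Deployment controller keeps during a rollout can be worked out from the example values (replicas: 3, maxSurge: 1, maxUnavailable: 0). The symbols r, s, and u are shorthand introduced here for those three fields:

```latex
\begin{aligned}
\text{total pods}     &\le r + s = 3 + 1 = 4 \\
\text{available pods} &\ge r - u = 3 - 0 = 3
\end{aligned}
```

Under these settings each round adds one new pod (4 in total), waits for it to become ready, then removes one old pod (back to 3), so replacing all three replicas takes three rounds. If the HPA has changed the replica count, r is the current desired count, not the value written in the manifest.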