Docker runs containers. Kubernetes manages thousands of them. Learn how K8s orchestrates your containers across machines -- scaling, healing, and deploying your apps automatically. Read the Docker page first if you haven't.
You know Docker. You can run containers. But what happens when your app gets popular? You need more copies of your containers, spread across multiple machines, restarted when they crash, and updated without downtime. Doing all of this manually with `docker run` commands would be a nightmare. Kubernetes automates all of it.
Docker is like being able to cook a dish. Kubernetes is like being the manager of a restaurant chain. You don't cook -- you tell the system "I need 10 chefs making pasta, 5 making pizza, and if anyone calls in sick, hire a replacement immediately". Kubernetes handles the logistics.
You declare what you want ("run 3 copies of my API"), and Kubernetes figures out how to make it happen -- which machine to put each container on, how to restart failures, how to route traffic.
Kubernetes is powerful but complex. If you're running a side project or a small app with a few containers, Docker Compose is probably enough. K8s shines when you have multiple services, need auto-scaling, or are deploying to production across multiple servers. Learn it because it's everywhere in industry, but don't feel pressured to use it for everything.
A Kubernetes cluster is a group of machines (physical or virtual) that run your containers together. There are two types of machines in a cluster:
This is the management layer. It makes all the decisions -- where to run containers, when to restart them, how to scale. You never run your app on the control plane. It has these components:
| Component | What It Does |
|---|---|
| API Server | The front door. Every command you run (kubectl) goes through here. It validates and processes all requests. |
| etcd | The database. Stores the entire state of the cluster -- what's running, what's desired, configurations. It's a key-value store. |
| Scheduler | Decides which node to place a new container on, based on available resources, constraints, and affinity rules. |
| Controller Manager | Watches the current state and makes it match the desired state. If you want 3 replicas and only 2 are running, it starts a third. |
These are the machines that actually run your containers. Each node has:
| Component | What It Does |
|---|---|
| kubelet | Agent on each node. Talks to the control plane, ensures containers are running as expected. |
| Container Runtime | Actually runs containers (usually containerd, which is what Docker uses under the hood). |
| kube-proxy | Handles networking -- routes traffic to the right containers. |
Everything in Kubernetes is an object defined in YAML. There are three you absolutely need to understand:
A Pod is the smallest thing you can deploy in Kubernetes. It's a wrapper around one or more containers that share the same network and storage.
A Pod is like a tiny server that runs your container. Usually it's one container per Pod. Sometimes you put two containers in the same Pod if they're tightly coupled (like your app + a logging sidecar), but that's less common.
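For illustration, a bare Pod manifest looks like the `template` section of a Deployment promoted to its own object. This is a sketch -- the image name and port are placeholders carried over from the examples in this page:

```yaml
# pod.yaml -- a standalone Pod (illustrative sketch)
apiVersion: v1
kind: Pod
metadata:
  name: my-api-pod
  labels:
    app: my-api        # Labels let Services and Deployments find this Pod
spec:
  containers:
    - name: api
      image: myapp:1.0 # Placeholder image
      ports:
        - containerPort: 3000
```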
You almost never create Pods directly. Instead, you create a Deployment that creates Pods for you.
A Deployment tells Kubernetes: "I want N copies of this Pod running at all times." It handles keeping that count (replacing crashed Pods), rolling out new versions gradually with zero downtime, and rolling back if an update goes wrong.
Pods get random IP addresses that change every time they restart. You can't hard-code those IPs. A Service gives your Pods a stable address and load-balances traffic across them.
ClusterIP (default) -- Only accessible inside the cluster. Other services can reach it, but the outside world can't. Use this for internal communication (e.g., API talking to database).
NodePort -- Exposes the service on a fixed port (30000-32767 by default) on every node. Useful for development. Traffic to NodeIP:NodePort gets routed to your Pods.
LoadBalancer -- Creates a cloud load balancer (on AWS, GCP, etc.) that routes external traffic to your Pods. This is how you expose apps to the internet in production.
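Switching between these types is usually a one-line change. As a sketch (the cloud provider and the external behavior are assumptions -- `LoadBalancer` only provisions something on a cloud that supports it), a public-facing Service for the same Pods might look like:

```yaml
# service-lb.yaml -- expose my-api to the internet (cloud provider assumed)
apiVersion: v1
kind: Service
metadata:
  name: my-api-public
spec:
  type: LoadBalancer   # Cloud provisions an external load balancer
  selector:
    app: my-api        # Same label selector as the internal Service
  ports:
    - port: 80         # External port
      targetPort: 3000 # Container port
```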
Everything in K8s is declared in YAML files. You tell Kubernetes what you want, and it makes it happen. Here are the essential configs:
```yaml
# deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-api                 # Name of this deployment
spec:
  replicas: 3                  # Run 3 copies of this Pod
  selector:
    matchLabels:
      app: my-api              # Find Pods with this label
  template:                    # Pod template
    metadata:
      labels:
        app: my-api            # Label for this Pod
    spec:
      containers:
        - name: api
          image: myapp:1.0     # Docker image to run
          ports:
            - containerPort: 3000
          env:
            - name: DATABASE_URL
              value: "postgres://user:pass@db-service:5432/mydb"
          resources:
            requests:          # Minimum resources needed
              memory: "128Mi"
              cpu: "100m"      # 100 millicores = 0.1 CPU
            limits:            # Maximum resources allowed
              memory: "256Mi"
              cpu: "500m"
```
apiVersion + kind -- What type of object this is. Deployments use apps/v1.
metadata.name -- The name you give this deployment. Used to reference it in commands.
spec.replicas: 3 -- "I want 3 Pods running at all times." If one dies, K8s starts a replacement.
selector.matchLabels -- How the Deployment finds its Pods. It looks for Pods with app: my-api label.
template -- The Pod blueprint. Every Pod created by this Deployment will look like this.
resources.requests -- The minimum resources this Pod needs. The scheduler uses this to decide which node has enough room.
resources.limits -- The maximum. If the Pod tries to use more, K8s throttles or kills it.
```yaml
# service.yaml
apiVersion: v1
kind: Service
metadata:
  name: my-api-service
spec:
  type: ClusterIP        # Internal only (default)
  selector:
    app: my-api          # Route to Pods with this label
  ports:
    - port: 80           # Port other services use
      targetPort: 3000   # Port on the container
```
Now other services in the cluster can reach your API at http://my-api-service:80. K8s handles load balancing across all 3 Pods.
```bash
# Create or update resources from a file
kubectl apply -f deployment.yaml
kubectl apply -f service.yaml

# Apply all YAML files in a directory
kubectl apply -f ./k8s/

# Delete resources
kubectl delete -f deployment.yaml
```
K8s networking follows simple rules: every Pod gets its own IP address, every Pod can reach every other Pod directly (no NAT), and Services give groups of Pods a single stable address.
Services use label selectors. When you create a Service with selector: { app: my-api }, it automatically finds all Pods with that label and routes traffic to them. If a Pod dies and a new one starts with the same label, the Service picks it up automatically.
```
# Service "my-api-service" in namespace "default" is accessible at:
my-api-service                             # Short name (same namespace)
my-api-service.default                     # With namespace
my-api-service.default.svc.cluster.local   # Fully qualified
```

The general pattern is `<service-name>.<namespace>.svc.cluster.local`.
An Ingress is a set of rules for routing external HTTP traffic to your Services. Think of it as a reverse proxy (like Nginx) that K8s manages for you.
```yaml
# ingress.yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: my-ingress
spec:
  rules:
    - host: api.myapp.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: my-api-service
                port:
                  number: 80
```
Internet → Ingress Controller (Nginx/Traefik) → Service → Pods
The Ingress Controller is the actual reverse proxy. The Ingress YAML just configures its rules.
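A common extension is TLS termination at the Ingress. A hedged sketch -- the Secret name here is hypothetical, and you'd need to create it yourself (or use a tool like cert-manager) with a certificate and key:

```yaml
# ingress-tls.yaml -- same routing rules, plus HTTPS termination
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: my-ingress
spec:
  tls:
    - hosts:
        - api.myapp.com
      secretName: myapp-tls    # Hypothetical Secret holding cert + key
  rules:
    - host: api.myapp.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: my-api-service
                port:
                  number: 80
```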
Store configuration data separately from your container image. This lets you change config without rebuilding.
```yaml
# configmap.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: app-config
data:
  APP_ENV: "production"
  LOG_LEVEL: "info"
  MAX_CONNECTIONS: "100"
```
```yaml
# secret.yaml
apiVersion: v1
kind: Secret
metadata:
  name: db-credentials
type: Opaque
stringData:
  DB_PASSWORD: "supersecret123"
  API_KEY: "sk-abc123"
```
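Note that `stringData` accepts plain text for convenience; Kubernetes stores it base64-encoded under `data`. If you write the `data` field directly, you must encode values yourself -- and base64 is an encoding, not encryption, so anyone who can read the Secret can decode it:

```bash
# Encode a value for the `data` field of a Secret
echo -n 'supersecret123' | base64
# -> c3VwZXJzZWNyZXQxMjM=

# Decode it back (proof that base64 is not encryption)
echo 'c3VwZXJzZWNyZXQxMjM=' | base64 --decode
# -> supersecret123
```

The `-n` flag matters: without it, `echo` appends a newline that gets encoded too, producing a different (and subtly broken) value.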
```yaml
# In the Pod template of your Deployment:
spec:
  containers:
    - name: api
      image: myapp:1.0
      envFrom:                 # Load all keys as env vars
        - configMapRef:
            name: app-config
        - secretRef:
            name: db-credentials
      # OR load individual keys:
      env:
        - name: DATABASE_PASSWORD
          valueFrom:
            secretKeyRef:
              name: db-credentials
              key: DB_PASSWORD
```
Just as with Docker volumes, Pods need persistent storage for stateful workloads like databases. K8s uses PersistentVolumeClaims (PVCs) to request storage:
```yaml
# pvc.yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: db-storage
spec:
  accessModes:
    - ReadWriteOnce      # One node can write at a time
  resources:
    requests:
      storage: 10Gi      # 10 GiB of storage
```
```yaml
# Use it in a Pod
spec:
  containers:
    - name: postgres
      image: postgres:16
      volumeMounts:
        - name: db-data
          mountPath: /var/lib/postgresql/data
  volumes:
    - name: db-data
      persistentVolumeClaim:
        claimName: db-storage
```
```bash
# Scale to 5 replicas
kubectl scale deployment my-api --replicas=5

# Scale to 0 (stops all pods, saves resources)
kubectl scale deployment my-api --replicas=0
```
Automatically scale based on CPU or memory usage:
```yaml
# hpa.yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: my-api-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-api
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70   # Scale up when CPU > 70%
```
K8s constantly monitors your Pods' CPU usage. If the average goes above 70%, it adds more Pods (up to 10). If usage drops, it removes Pods (down to 2). You handle traffic spikes automatically without manual intervention; paired with a cluster autoscaler, it also means you aren't paying for nodes you don't need during quiet hours.
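Under the hood, the controller uses the documented HPA formula: desiredReplicas = ceil(currentReplicas × currentUtilization / targetUtilization). A quick sanity check of one scaling decision, using shell integer arithmetic (the 90% utilization figure is just an example):

```bash
# 3 Pods averaging 90% CPU against the 70% target:
# ceil(3 * 90 / 70) = ceil(3.86) = 4 -> scale from 3 Pods to 4
current_replicas=3
current_util=90   # observed average CPU utilization (%)
target_util=70    # target from the HPA spec
# Integer ceiling trick: ceil(a/b) = (a + b - 1) / b
desired=$(( (current_replicas * current_util + target_util - 1) / target_util ))
echo "$desired"   # -> 4
```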
Kubernetes constantly watches your Pods. If one crashes, it immediately starts a replacement. This happens because the Controller Manager compares the desired state (3 replicas) with the actual state (2 running) and reconciles the difference.
You can also define health checks so K8s restarts unhealthy containers:
```yaml
spec:
  containers:
    - name: api
      image: myapp:1.0
      livenessProbe:             # Is the container alive?
        httpGet:
          path: /health
          port: 3000
        initialDelaySeconds: 10
        periodSeconds: 15
      readinessProbe:            # Is it ready to receive traffic?
        httpGet:
          path: /ready
          port: 3000
        initialDelaySeconds: 5
        periodSeconds: 5
```
Liveness -- "Is this container still working?" If it fails, K8s kills and restarts the container. Use this to detect deadlocks or stuck processes.
Readiness -- "Can this container handle traffic right now?" If it fails, K8s stops sending traffic to it but doesn't restart it. Use this for startup time (e.g., waiting for a database connection).
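HTTP checks aren't the only option. A probe can also run a command inside the container or just check that a TCP port accepts connections -- useful for services without an HTTP endpoint. A sketch (the file path and port are illustrative):

```yaml
# Alternative probe types (illustrative)
livenessProbe:
  exec:                      # Run a command; exit code 0 = healthy
    command:
      - cat
      - /tmp/healthy
  initialDelaySeconds: 10
  periodSeconds: 15
readinessProbe:
  tcpSocket:                 # Healthy if the port accepts connections
    port: 5432
  initialDelaySeconds: 5
  periodSeconds: 5
```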
```bash
# ===== VIEWING RESOURCES =====
kubectl get pods                        # List all pods
kubectl get pods -o wide                # With more detail (node, IP)
kubectl get deployments                 # List deployments
kubectl get services                    # List services
kubectl get all                         # List everything

# ===== DETAILED INFO =====
kubectl describe pod my-pod             # Detailed pod info + events
kubectl describe deployment my-api      # Deployment details

# ===== LOGS =====
kubectl logs my-pod                     # View pod logs
kubectl logs -f my-pod                  # Follow logs (real-time)
kubectl logs my-pod -c my-container     # Specific container in pod

# ===== DEBUGGING =====
kubectl exec -it my-pod -- bash         # Shell into a pod
kubectl port-forward my-pod 3000:3000   # Forward local port to pod

# ===== APPLYING & DELETING =====
kubectl apply -f deployment.yaml        # Create/update from file
kubectl delete -f deployment.yaml       # Delete from file
kubectl delete pod my-pod               # Delete a specific pod

# ===== SCALING =====
kubectl scale deployment my-api --replicas=5

# ===== ROLLOUTS =====
kubectl rollout status deployment my-api    # Watch deployment progress
kubectl rollout history deployment my-api   # See revision history
kubectl rollout undo deployment my-api      # Rollback to previous version

# ===== CONTEXT =====
kubectl config get-contexts             # See available clusters
kubectl config use-context my-cluster   # Switch clusters
kubectl get namespaces                  # List namespaces
kubectl -n production get pods          # Pods in a specific namespace
```
You don't need a cloud account to learn K8s. Minikube runs a single-node K8s cluster on your laptop.
```bash
# Install Minikube (on Linux with curl)
curl -LO https://storage.googleapis.com/minikube/releases/latest/minikube-linux-amd64
sudo install minikube-linux-amd64 /usr/local/bin/minikube

# Start a cluster
minikube start

# Check it's running
kubectl get nodes

# Build images inside Minikube (so it can find them)
eval $(minikube docker-env)
docker build -t myapp:1.0 .

# Access a service
minikube service my-api-service

# Dashboard (web UI)
minikube dashboard

# Stop the cluster
minikube stop

# Delete the cluster
minikube delete
```
kind (Kubernetes IN Docker) -- Runs K8s nodes as Docker containers. Lighter than Minikube.
k3s -- Lightweight K8s distribution. Great for Raspberry Pi or resource-constrained environments.
Docker Desktop -- Has a built-in K8s cluster you can enable in settings.
| Feature | Docker Compose | Kubernetes |
|---|---|---|
| Purpose | Local development, simple deployments | Production orchestration at scale |
| Complexity | Simple -- one YAML file | Complex -- multiple YAML files, many concepts |
| Scaling | Manual (`--scale`) | Automatic (HPA) |
| Self-healing | Basic (`restart: always`) | Advanced (health checks, auto-replacement) |
| Multi-machine | Single machine only | Distributed across many machines |
| Rolling updates | Recreate only | Zero-downtime rolling updates |
| Load balancing | None built-in | Built-in across Pods |
| When to use | Dev environments, small apps | Production, microservices, scale |
Use Docker Compose for development -- it's simpler and faster to iterate. Use Kubernetes for production when you need scaling, reliability, and zero-downtime deployments. Many teams use both: Compose locally, K8s in the cloud.
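To make the comparison concrete, here is roughly what the Deployment + Service from earlier would look like as a Compose file. A sketch -- Compose's `--scale` can run multiple copies, but there is no built-in load balancing across them the way a K8s Service provides:

```yaml
# docker-compose.yaml -- rough local equivalent of the K8s examples
services:
  api:
    image: myapp:1.0
    ports:
      - "80:3000"    # Host port 80 -> container port 3000
    environment:
      DATABASE_URL: "postgres://user:pass@db:5432/mydb"
    restart: always  # Compose's basic self-healing
```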