Docker runs containers. Kubernetes manages thousands of them. Learn how K8s orchestrates your containers across machines -- scaling, healing, and deploying your apps automatically. Read the Docker page first if you haven't.
You know Docker. You can run containers. But what happens when your app gets popular? You need more copies of your containers, spread across multiple machines, restarted when they crash, and updated without downtime. Doing all of this manually with `docker run` commands would be a nightmare. Kubernetes automates all of it.
Docker is like being able to cook a dish. Kubernetes is like being the manager of a restaurant chain. You don't cook -- you tell the system "I need 10 chefs making pasta, 5 making pizza, and if anyone calls in sick, hire a replacement immediately". Kubernetes handles the logistics.
You declare what you want ("run 3 copies of my API"), and Kubernetes figures out how to make it happen -- which machine to put each container on, how to restart failures, how to route traffic.
Kubernetes is powerful but complex. If you're running a side project or a small app with a few containers, Docker Compose is probably enough. K8s shines when you have multiple services, need auto-scaling, or are deploying to production across multiple servers. Learn it because it's everywhere in industry, but don't feel pressured to use it for everything.
A Kubernetes cluster is a group of machines (physical or virtual) that run your containers together. There are two types of machines in a cluster:
This is the management layer. It makes all the decisions -- where to run containers, when to restart them, how to scale. You never run your app on the control plane. It has these components:
| Component | What It Does |
|---|---|
| API Server | The front door. Every command you run (kubectl) goes through here. It validates and processes all requests. |
| etcd | The database. Stores the entire state of the cluster -- what's running, what's desired, configurations. It's a key-value store. |
| Scheduler | Decides which node to place a new container on, based on available resources, constraints, and affinity rules. |
| Controller Manager | Watches the current state and makes it match the desired state. If you want 3 replicas and only 2 are running, it starts a third. |
These are the machines that actually run your containers. Each node has:
| Component | What It Does |
|---|---|
| kubelet | Agent on each node. Talks to the control plane, ensures containers are running as expected. |
| Container Runtime | Actually runs containers (usually containerd, which is what Docker uses under the hood). |
| kube-proxy | Handles networking -- routes traffic to the right containers. |
Everything in Kubernetes is an object defined in YAML. There are three you absolutely need to understand:
A Pod is the smallest thing you can deploy in Kubernetes. It's a wrapper around one or more containers that share the same network and storage.
A Pod is like a tiny server that runs your container. Usually it's one container per Pod. Sometimes you put two containers in the same Pod if they're tightly coupled (like your app + a logging sidecar), but that's less common.
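For illustration, a bare Pod manifest looks like the `template` section of a Deployment promoted to its own object. This is a sketch -- the image name and port are placeholders carried over from the examples in this page:

```yaml
# pod.yaml -- a standalone Pod (illustrative sketch)
apiVersion: v1
kind: Pod
metadata:
  name: my-api-pod
  labels:
    app: my-api        # Labels let Services and Deployments find this Pod
spec:
  containers:
    - name: api
      image: myapp:1.0 # Placeholder image
      ports:
        - containerPort: 3000
```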
You almost never create Pods directly. Instead, you create a Deployment that creates Pods for you.
A Deployment tells Kubernetes: "I want N copies of this Pod running at all times." It handles keeping that count (replacing crashed Pods), rolling out new versions gradually with zero downtime, and rolling back if an update goes wrong.
Pods get random IP addresses that change every time they restart. You can't hard-code those IPs. A Service gives your Pods a stable address and load-balances traffic across them.
ClusterIP (default) -- Only accessible inside the cluster. Other services can reach it, but the outside world can't. Use this for internal communication (e.g., API talking to database).
NodePort -- Exposes the service on a fixed port (30000-32767 by default) on every node. Useful for development. Traffic to NodeIP:NodePort gets routed to your Pods.
LoadBalancer -- Creates a cloud load balancer (on AWS, GCP, etc.) that routes external traffic to your Pods. This is how you expose apps to the internet in production.
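Switching between these types is usually a one-line change. As a sketch (the cloud provider and the external behavior are assumptions -- `LoadBalancer` only provisions something on a cloud that supports it), a public-facing Service for the same Pods might look like:

```yaml
# service-lb.yaml -- expose my-api to the internet (cloud provider assumed)
apiVersion: v1
kind: Service
metadata:
  name: my-api-public
spec:
  type: LoadBalancer   # Cloud provisions an external load balancer
  selector:
    app: my-api        # Same label selector as the internal Service
  ports:
    - port: 80         # External port
      targetPort: 3000 # Container port
```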
Everything in K8s is declared in YAML files. You tell Kubernetes what you want, and it makes it happen. Here are the essential configs:
```yaml
# deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-api                 # Name of this deployment
spec:
  replicas: 3                  # Run 3 copies of this Pod
  selector:
    matchLabels:
      app: my-api              # Find Pods with this label
  template:                    # Pod template
    metadata:
      labels:
        app: my-api            # Label for this Pod
    spec:
      containers:
        - name: api
          image: myapp:1.0     # Docker image to run
          ports:
            - containerPort: 3000
          env:
            - name: DATABASE_URL
              value: "postgres://user:pass@db-service:5432/mydb"
          resources:
            requests:          # Minimum resources needed
              memory: "128Mi"
              cpu: "100m"      # 100 millicores = 0.1 CPU
            limits:            # Maximum resources allowed
              memory: "256Mi"
              cpu: "500m"
```
apiVersion + kind -- What type of object this is. Deployments use apps/v1.
metadata.name -- The name you give this deployment. Used to reference it in commands.
spec.replicas: 3 -- "I want 3 Pods running at all times." If one dies, K8s starts a replacement.
selector.matchLabels -- How the Deployment finds its Pods. It looks for Pods with app: my-api label.
template -- The Pod blueprint. Every Pod created by this Deployment will look like this.
resources.requests -- The minimum resources this Pod needs. The scheduler uses this to decide which node has enough room.
resources.limits -- The maximum. If the Pod tries to use more, K8s throttles or kills it.
```yaml
# service.yaml
apiVersion: v1
kind: Service
metadata:
  name: my-api-service
spec:
  type: ClusterIP        # Internal only (default)
  selector:
    app: my-api          # Route to Pods with this label
  ports:
    - port: 80           # Port other services use
      targetPort: 3000   # Port on the container
```
Now other services in the cluster can reach your API at http://my-api-service:80. K8s handles load balancing across all 3 Pods.
```bash
# Create or update resources from a file
kubectl apply -f deployment.yaml
kubectl apply -f service.yaml

# Apply all YAML files in a directory
kubectl apply -f ./k8s/

# Delete resources
kubectl delete -f deployment.yaml
```
K8s networking follows simple rules: every Pod gets its own IP address, every Pod can reach every other Pod directly (no NAT), and Services give groups of Pods a single stable address.
Services use label selectors. When you create a Service with selector: { app: my-api }, it automatically finds all Pods with that label and routes traffic to them. If a Pod dies and a new one starts with the same label, the Service picks it up automatically.
```
# Service "my-api-service" in namespace "default" is accessible at:
my-api-service                             # Short name (same namespace)
my-api-service.default                     # With namespace
my-api-service.default.svc.cluster.local   # Fully qualified
```

The general pattern is `<service-name>.<namespace>.svc.cluster.local`.
An Ingress is a set of rules for routing external HTTP traffic to your Services. Think of it as a reverse proxy (like Nginx) that K8s manages for you.
```yaml
# ingress.yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: my-ingress
spec:
  rules:
    - host: api.myapp.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: my-api-service
                port:
                  number: 80
```
Internet → Ingress Controller (Nginx/Traefik) → Service → Pods
The Ingress Controller is the actual reverse proxy. The Ingress YAML just configures its rules.
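A common extension is TLS termination at the Ingress. A hedged sketch -- the Secret name here is hypothetical, and you'd need to create it yourself (or use a tool like cert-manager) with a certificate and key:

```yaml
# ingress-tls.yaml -- same routing rules, plus HTTPS termination
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: my-ingress
spec:
  tls:
    - hosts:
        - api.myapp.com
      secretName: myapp-tls    # Hypothetical Secret holding cert + key
  rules:
    - host: api.myapp.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: my-api-service
                port:
                  number: 80
```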
Store configuration data separately from your container image. This lets you change config without rebuilding.
```yaml
# configmap.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: app-config
data:
  APP_ENV: "production"
  LOG_LEVEL: "info"
  MAX_CONNECTIONS: "100"
```
```yaml
# secret.yaml
apiVersion: v1
kind: Secret
metadata:
  name: db-credentials
type: Opaque
stringData:
  DB_PASSWORD: "supersecret123"
  API_KEY: "sk-abc123"
```
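Note that `stringData` accepts plain text for convenience; Kubernetes stores it base64-encoded under `data`. If you write the `data` field directly, you must encode values yourself -- and base64 is an encoding, not encryption, so anyone who can read the Secret can decode it:

```bash
# Encode a value for the `data` field of a Secret
echo -n 'supersecret123' | base64
# -> c3VwZXJzZWNyZXQxMjM=

# Decode it back (proof that base64 is not encryption)
echo 'c3VwZXJzZWNyZXQxMjM=' | base64 --decode
# -> supersecret123
```

The `-n` flag matters: without it, `echo` appends a newline that gets encoded too, producing a different (and subtly broken) value.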
```yaml
# In the Pod template of your Deployment:
spec:
  containers:
    - name: api
      image: myapp:1.0
      envFrom:                 # Load all keys as env vars
        - configMapRef:
            name: app-config
        - secretRef:
            name: db-credentials
      # OR load individual keys:
      env:
        - name: DATABASE_PASSWORD
          valueFrom:
            secretKeyRef:
              name: db-credentials
              key: DB_PASSWORD
```
Just as with Docker volumes, Pods need persistent storage for stateful workloads like databases. K8s uses PersistentVolumeClaims (PVCs) to request storage:
```yaml
# pvc.yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: db-storage
spec:
  accessModes:
    - ReadWriteOnce      # One node can write at a time
  resources:
    requests:
      storage: 10Gi      # 10 GiB of storage
```
```yaml
# Use it in a Pod
spec:
  containers:
    - name: postgres
      image: postgres:16
      volumeMounts:
        - name: db-data
          mountPath: /var/lib/postgresql/data
  volumes:
    - name: db-data
      persistentVolumeClaim:
        claimName: db-storage
```
```bash
# Scale to 5 replicas
kubectl scale deployment my-api --replicas=5

# Scale to 0 (stops all pods, saves resources)
kubectl scale deployment my-api --replicas=0
```
Automatically scale based on CPU or memory usage:
```yaml
# hpa.yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: my-api-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-api
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70   # Scale up when CPU > 70%
```
K8s constantly monitors your Pods' CPU usage. If the average goes above 70%, it adds more Pods (up to 10). If usage drops, it removes Pods (down to 2). You handle traffic spikes automatically without manual intervention; paired with a cluster autoscaler, it also means you aren't paying for nodes you don't need during quiet hours.
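Under the hood, the controller uses the documented HPA formula: desiredReplicas = ceil(currentReplicas × currentUtilization / targetUtilization). A quick sanity check of one scaling decision, using shell integer arithmetic (the 90% utilization figure is just an example):

```bash
# 3 Pods averaging 90% CPU against the 70% target:
# ceil(3 * 90 / 70) = ceil(3.86) = 4 -> scale from 3 Pods to 4
current_replicas=3
current_util=90   # observed average CPU utilization (%)
target_util=70    # target from the HPA spec
# Integer ceiling trick: ceil(a/b) = (a + b - 1) / b
desired=$(( (current_replicas * current_util + target_util - 1) / target_util ))
echo "$desired"   # -> 4
```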
Kubernetes constantly watches your Pods. If one crashes, it immediately starts a replacement. This happens because the Controller Manager compares the desired state (3 replicas) with the actual state (2 running) and reconciles the difference.
You can also define health checks so K8s restarts unhealthy containers:
```yaml
spec:
  containers:
    - name: api
      image: myapp:1.0
      livenessProbe:             # Is the container alive?
        httpGet:
          path: /health
          port: 3000
        initialDelaySeconds: 10
        periodSeconds: 15
      readinessProbe:            # Is it ready to receive traffic?
        httpGet:
          path: /ready
          port: 3000
        initialDelaySeconds: 5
        periodSeconds: 5
```
Liveness -- "Is this container still working?" If it fails, K8s kills and restarts the container. Use this to detect deadlocks or stuck processes.
Readiness -- "Can this container handle traffic right now?" If it fails, K8s stops sending traffic to it but doesn't restart it. Use this for startup time (e.g., waiting for a database connection).
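HTTP checks aren't the only option. A probe can also run a command inside the container or just check that a TCP port accepts connections -- useful for services without an HTTP endpoint. A sketch (the file path and port are illustrative):

```yaml
# Alternative probe types (illustrative)
livenessProbe:
  exec:                      # Run a command; exit code 0 = healthy
    command:
      - cat
      - /tmp/healthy
  initialDelaySeconds: 10
  periodSeconds: 15
readinessProbe:
  tcpSocket:                 # Healthy if the port accepts connections
    port: 5432
  initialDelaySeconds: 5
  periodSeconds: 5
```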
```bash
# ===== VIEWING RESOURCES =====
kubectl get pods                        # List all pods
kubectl get pods -o wide                # With more detail (node, IP)
kubectl get deployments                 # List deployments
kubectl get services                    # List services
kubectl get all                         # List everything

# ===== DETAILED INFO =====
kubectl describe pod my-pod             # Detailed pod info + events
kubectl describe deployment my-api      # Deployment details

# ===== LOGS =====
kubectl logs my-pod                     # View pod logs
kubectl logs -f my-pod                  # Follow logs (real-time)
kubectl logs my-pod -c my-container     # Specific container in pod

# ===== DEBUGGING =====
kubectl exec -it my-pod -- bash         # Shell into a pod
kubectl port-forward my-pod 3000:3000   # Forward local port to pod

# ===== APPLYING & DELETING =====
kubectl apply -f deployment.yaml        # Create/update from file
kubectl delete -f deployment.yaml       # Delete from file
kubectl delete pod my-pod               # Delete a specific pod

# ===== SCALING =====
kubectl scale deployment my-api --replicas=5

# ===== ROLLOUTS =====
kubectl rollout status deployment my-api    # Watch deployment progress
kubectl rollout history deployment my-api   # See revision history
kubectl rollout undo deployment my-api      # Rollback to previous version

# ===== CONTEXT =====
kubectl config get-contexts             # See available clusters
kubectl config use-context my-cluster   # Switch clusters
kubectl get namespaces                  # List namespaces
kubectl -n production get pods          # Pods in a specific namespace
```
You don't need a cloud account to learn K8s. Minikube runs a single-node K8s cluster on your laptop.
```bash
# Install Minikube (on Linux with curl)
curl -LO https://storage.googleapis.com/minikube/releases/latest/minikube-linux-amd64
sudo install minikube-linux-amd64 /usr/local/bin/minikube

# Start a cluster
minikube start

# Check it's running
kubectl get nodes

# Build images inside Minikube (so it can find them)
eval $(minikube docker-env)
docker build -t myapp:1.0 .

# Access a service
minikube service my-api-service

# Dashboard (web UI)
minikube dashboard

# Stop the cluster
minikube stop

# Delete the cluster
minikube delete
```
kind (Kubernetes IN Docker) -- Runs K8s nodes as Docker containers. Lighter than Minikube.
k3s -- Lightweight K8s distribution. Great for Raspberry Pi or resource-constrained environments.
Docker Desktop -- Has a built-in K8s cluster you can enable in settings.
| Feature | Docker Compose | Kubernetes |
|---|---|---|
| Purpose | Local development, simple deployments | Production orchestration at scale |
| Complexity | Simple -- one YAML file | Complex -- multiple YAML files, many concepts |
| Scaling | Manual (`--scale`) | Automatic (HPA) |
| Self-healing | Basic (`restart: always`) | Advanced (health checks, auto-replacement) |
| Multi-machine | Single machine only | Distributed across many machines |
| Rolling updates | Recreate only | Zero-downtime rolling updates |
| Load balancing | None built-in | Built-in across Pods |
| When to use | Dev environments, small apps | Production, microservices, scale |
Use Docker Compose for development -- it's simpler and faster to iterate. Use Kubernetes for production when you need scaling, reliability, and zero-downtime deployments. Many teams use both: Compose locally, K8s in the cloud.
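To make the comparison concrete, here is roughly what the Deployment + Service from earlier would look like as a Compose file. A sketch -- Compose's `--scale` can run multiple copies, but there is no built-in load balancing across them the way a K8s Service provides:

```yaml
# docker-compose.yaml -- rough local equivalent of the K8s examples
services:
  api:
    image: myapp:1.0
    ports:
      - "80:3000"    # Host port 80 -> container port 3000
    environment:
      DATABASE_URL: "postgres://user:pass@db:5432/mydb"
    restart: always  # Compose's basic self-healing
```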