Table of Contents

 1. How the Internet Actually Works
 2. The OSI Model
 3. The TCP/IP Model
 4. IP Addresses & Subnets
 5. DNS -- The Internet's Phone Book
 6. TCP vs UDP
 7. HTTP/HTTPS
 8. TLS/SSL -- How Encryption Works
 9. WebSockets & Real-Time Communication
10. REST vs GraphQL vs gRPC
11. Common Ports
12. Network Debugging Tools
13. SSH, SCP & Secure File Transfer
14. Socket Programming
15. Network Programming Patterns
16. NAT (Network Address Translation)
17. Firewalls & iptables
18. VPNs & Tunneling

1. How the Internet Actually Works

The internet is not a cloud. It is a physical network of cables, routers, and switches that spans the globe. When you visit a website, your data travels through copper wires, fiber optic cables (including ones laid across the ocean floor), and radio waves -- bouncing between dozens of machines before it reaches the server and comes back.

The Journey of a Single Request
You type "example.com" in your browser. Here is what actually happens:

1. YOUR DEVICE
   Browser asks the OS: "What's the IP address for example.com?"
   OS checks its local DNS cache. Cache miss.

2. YOUR ROUTER
   Request goes to your home router via WiFi or Ethernet.
   Router checks its DNS cache. Cache miss.
   Router forwards to your ISP's DNS resolver.

3. YOUR ISP (Internet Service Provider)
   ISP's DNS resolver asks the root DNS servers,
   then the .com TLD servers, then example.com's
   authoritative nameserver. Gets back: 93.184.216.34

4. THE ROUTE TO THE SERVER
   Your device now sends a TCP packet to 93.184.216.34.
   The packet travels through:
     - Your router
     - ISP's local network
     - ISP's backbone (high-speed fiber)
     - Internet Exchange Point (IXP) -- where ISPs exchange traffic
     - Possibly undersea cables (if the server is on another continent)
     - The server's ISP / cloud provider network
     - The server's data center
     - The actual physical server

5. THE SERVER RESPONDS
   The server processes your request and sends back HTML.
   The response takes the same journey in reverse.
   Total round-trip time: 20-300ms depending on distance.

6. YOUR BROWSER RENDERS
   Browser parses HTML, discovers CSS/JS/images, and sends
   more requests for each one. Each follows the same path.
Key Physical Infrastructure
  • Undersea cables: Over 550 cables on the ocean floor carry 99% of intercontinental internet traffic. Not satellites -- cables. A break in a major cable can disrupt internet for entire countries.
  • Internet Exchange Points (IXPs): Physical buildings where ISPs connect their networks to exchange traffic directly, instead of routing through a third party. Major IXPs handle terabits per second.
  • Data centers: Warehouses full of servers. Major cloud providers (AWS, Google, Azure) have data centers on every continent except Antarctica.
  • The "last mile": The connection from the ISP to your home. This is usually the slowest part -- fiber optic, cable (coax), DSL, or cellular.
Why This Matters for Developers

Understanding the physical layer explains latency. Light in fiber travels at about 200,000 km/s. New York to London is 5,500 km. The absolute minimum round trip time is ~55ms -- and real-world routing adds more. This is why CDNs exist: you cannot beat the speed of light, so you put servers closer to users.
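The numbers above are worth sanity-checking yourself. A quick sketch of the lower bound, using the same ~200,000 km/s figure for light in fiber:

```python
# Theoretical minimum round-trip time over fiber: distance and physics only,
# ignoring routing, queuing, and server processing delays.
def min_rtt_ms(distance_km: float, speed_km_per_s: float = 200_000) -> float:
    return 2 * distance_km / speed_km_per_s * 1000  # there and back, in ms

min_rtt_ms(5_500)    # New York -> London: ~55 ms floor
min_rtt_ms(100)      # nearby CDN edge: ~1 ms floor
```

Measured RTTs run well above this floor; the gap is routing and queuing, which CDNs attack by shrinking the only term you control: distance.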

2. The OSI Model

The OSI (Open Systems Interconnection) model is a conceptual framework that breaks network communication into 7 layers. Each layer has a specific job and talks only to the layers directly above and below it. Think of it like sending a letter: you write the letter (application), put it in an envelope (presentation), address it (session), hand it to the post office (transport), they route it (network), load it on a truck (data link), and the truck drives on a road (physical).

The 7 Layers -- From Top to Bottom
Layer 7: APPLICATION    -- What the user interacts with
Layer 6: PRESENTATION   -- Data format and encryption
Layer 5: SESSION        -- Managing connections
Layer 4: TRANSPORT      -- Reliable delivery (TCP/UDP)
Layer 3: NETWORK        -- Routing across networks (IP)
Layer 2: DATA LINK      -- Communication on a local network (Ethernet, WiFi)
Layer 1: PHYSICAL       -- Actual electrical signals, light, radio waves

Mnemonic (top-down): "All People Seem To Need Data Processing"
Mnemonic (bottom-up): "Please Do Not Throw Sausage Pizza Away"
Each Layer Explained with Real Examples
LAYER 7: APPLICATION
  What it does:  The protocols your applications speak
  Protocols:     HTTP, HTTPS, FTP, SMTP, DNS, SSH, MQTT
  Example:       Your browser sends "GET /index.html HTTP/1.1"
  Developer note: This is where you spend most of your time

LAYER 6: PRESENTATION
  What it does:  Data translation, encryption, compression
  Handles:       SSL/TLS encryption, character encoding (UTF-8),
                 data serialization (JSON, XML, Protocol Buffers)
  Example:       TLS encrypts your HTTP request before sending
  Developer note: Often merged with Layer 7 in practice

LAYER 5: SESSION
  What it does:  Establishes, manages, and terminates connections
  Handles:       Session tokens, authentication state, socket management
  Example:       A WebSocket session stays open while you chat
  Developer note: Also often merged with Layer 7 in practice

LAYER 4: TRANSPORT
  What it does:  End-to-end communication between processes
  Protocols:     TCP (reliable, ordered) and UDP (fast, unreliable)
  Key concept:   PORTS -- TCP port 443 for HTTPS, port 80 for HTTP
  Example:       TCP breaks your data into segments, numbers them,
                 and ensures they arrive in order with nothing missing

LAYER 3: NETWORK
  What it does:  Routes packets across different networks
  Protocols:     IP (IPv4, IPv6), ICMP (ping), routing protocols
  Key concept:   IP ADDRESSES -- every device gets a unique address
  Example:       Your packet to 93.184.216.34 gets routed through
                 10+ routers to reach the destination

LAYER 2: DATA LINK
  What it does:  Communication between devices on the same local network
  Protocols:     Ethernet (802.3), WiFi (802.11), ARP
  Key concept:   MAC ADDRESSES -- hardware address burned into your NIC
  Example:       Your laptop's WiFi card sends a frame to your router.
                 The frame has your MAC address and the router's MAC.

LAYER 1: PHYSICAL
  What it does:  Transmits raw bits (0s and 1s) over a physical medium
  Media:         Copper wire, fiber optic, radio waves (WiFi, 5G)
  Example:       Electrical signal changes on an Ethernet cable
                 represent individual bits (real links use line codes
                 like Manchester or PAM rather than a simple high/low).
How Data Travels Down and Up the Stack
SENDING (encapsulation -- each layer wraps the data from above):

  Application layer:  [HTTP request data]
  Transport layer:    [TCP header | HTTP request data]
  Network layer:      [IP header | TCP header | HTTP request data]
  Data link layer:    [Ethernet header | IP header | TCP | HTTP data | Ethernet trailer]
  Physical layer:     01001010110010110... (raw bits on the wire)

Each layer adds its own header (and sometimes trailer) around the data.
This is called ENCAPSULATION.

RECEIVING (decapsulation -- each layer strips its header):

  Physical layer:     Raw bits arrive
  Data link layer:    Strips Ethernet header, passes IP packet up
  Network layer:      Strips IP header, passes TCP segment up
  Transport layer:    Strips TCP header, passes HTTP data up
  Application layer:  Your browser receives the HTTP response

The beauty: each layer only cares about its own headers.
TCP doesn't know or care that it's carrying HTTP inside.
IP doesn't know or care that it's carrying TCP inside.
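The wrapping can be sketched in a few lines. This is a toy model with made-up two-field headers, not the real TCP/IP layouts; it only shows that each layer prepends its own bytes and treats everything inside as opaque payload:

```python
import struct

# Toy encapsulation: each layer prepends its header and never looks inside.
def tcp_wrap(payload: bytes, src_port: int, dst_port: int) -> bytes:
    return struct.pack("!HH", src_port, dst_port) + payload   # transport header

def ip_wrap(segment: bytes, src_ip: bytes, dst_ip: bytes) -> bytes:
    return src_ip + dst_ip + segment                          # network header

http_data = b"GET / HTTP/1.1\r\nHost: example.com\r\n\r\n"
segment = tcp_wrap(http_data, 54321, 80)
packet = ip_wrap(segment,
                 bytes([192, 168, 1, 100]),    # src: 192.168.1.100
                 bytes([93, 184, 216, 34]))    # dst: 93.184.216.34

# Decapsulation is the mirror image: strip 8 bytes of "IP" header,
# then 4 bytes of "TCP" header, and the HTTP data is back.
assert packet[8 + 4:] == http_data
```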
Why the OSI Model Matters for Developers
  • Debugging: "Connection refused" is Layer 4 (TCP). "404 Not Found" is Layer 7 (HTTP). "No route to host" is Layer 3 (IP). Knowing the layer tells you where to look.
  • Load balancers: Layer 4 load balancers route by IP/port. Layer 7 load balancers can inspect HTTP headers and route by URL path.
  • Firewalls: Can operate at Layer 3 (block IP ranges), Layer 4 (block ports), or Layer 7 (block specific HTTP methods).
  • Interviews: "Explain the OSI model" is one of the most common networking interview questions. Know it cold.

3. The TCP/IP Model

The OSI model is a teaching tool. The TCP/IP model is what the internet actually uses. It has 4 layers instead of 7, and it's the practical model that every network engineer and developer works with daily.

TCP/IP vs OSI -- Side by Side
OSI Model (7 layers)          TCP/IP Model (4 layers)
====================          =======================
7. Application    ─┐
6. Presentation    ├──────►   4. Application (HTTP, DNS, FTP, SSH)
5. Session        ─┘
4. Transport      ──────────► 3. Transport (TCP, UDP)
3. Network        ──────────► 2. Internet (IP, ICMP, ARP)
2. Data Link      ─┐
1. Physical       ─┴────────► 1. Network Access (Ethernet, WiFi)

The TCP/IP model merges Layers 5-7 into "Application" because
in practice, the distinctions between them are blurry.
It also merges Layers 1-2 into "Network Access."

When people say "TCP/IP," they mean the entire protocol suite --
not just TCP and IP. It includes HTTP, DNS, UDP, ICMP, ARP,
and hundreds of other protocols that make the internet work.
TCP/IP Layer     Protocols                         What It Does                              Data Unit
Application      HTTP, HTTPS, DNS, FTP, SSH, SMTP  Application-specific communication        Message / Data
Transport        TCP, UDP                          Process-to-process delivery, reliability  Segment (TCP) / Datagram (UDP)
Internet         IP, ICMP, ARP                     Addressing and routing across networks    Packet
Network Access   Ethernet, WiFi, PPP               Physical transmission on local network    Frame
Which Model Should You Use?
  • Use the OSI model when discussing concepts, teaching, or in interviews. It is more granular and widely referenced.
  • Use the TCP/IP model when building or debugging real systems. It maps directly to how software is actually implemented.
  • In practice: Most developers think in terms of "application layer" (HTTP), "transport layer" (TCP/UDP), and "network layer" (IP). The other layers rarely come up unless you are doing low-level network programming.

4. IP Addresses & Subnets

Every device on a network needs a unique address, just like every house needs a street address for mail delivery. IP (Internet Protocol) addresses are those addresses. Understanding them is essential for configuring servers, debugging connectivity, and designing network architectures.

IPv4

IPv4 addresses are 32 bits, written as four numbers (0-255) separated by dots. Each number is one byte (8 bits). There are 2^32 = about 4.3 billion possible addresses. That sounds like a lot, but it is not enough -- the world has more devices than IPv4 addresses.

IPv4 Address Anatomy
IP Address:  192.168.1.100

Binary:      11000000.10101000.00000001.01100100
             ^^^^^^^^ ^^^^^^^^ ^^^^^^^^ ^^^^^^^^
             192      168      1        100

Each "octet" is 8 bits (1 byte), range: 0-255
Total: 32 bits = 4 bytes

Special addresses:
  0.0.0.0        -- "This network" (used when a device doesn't know its IP yet)
  127.0.0.1      -- Localhost (loopback -- your own machine)
  255.255.255.255 -- Broadcast (send to everyone on the local network)
  169.254.x.x    -- Link-local (auto-assigned when DHCP fails)

Private vs Public IP Addresses

Reserved Private Ranges
PRIVATE IP RANGES (cannot be routed on the public internet):

  10.0.0.0    -- 10.255.255.255     (10.0.0.0/8)       16.7 million addresses
  172.16.0.0  -- 172.31.255.255     (172.16.0.0/12)     1 million addresses
  192.168.0.0 -- 192.168.255.255    (192.168.0.0/16)    65,536 addresses

Your home network uses private IPs (usually 192.168.x.x).
Your router has ONE public IP from your ISP.
All your devices share that public IP using NAT.

PUBLIC IPs are globally unique and routable on the internet.
Your web server at DigitalOcean or AWS has a public IP
that anyone in the world can reach.

How NAT (Network Address Translation) works:

  Inside your home:
    Laptop: 192.168.1.10 ─┐
    Phone:  192.168.1.11  ├─► Router (192.168.1.1) ─► ISP ─► Internet
    TV:     192.168.1.12 ─┘   Public IP: 73.45.123.89

  When your laptop visits example.com:
    Outgoing: src=192.168.1.10 ──► Router rewrites to src=73.45.123.89
    Incoming: dst=73.45.123.89 ──► Router rewrites to dst=192.168.1.10

  The router keeps a NAT table mapping internal IP:port to external IP:port.
  This is why all your devices can share one public IP.
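The NAT table is conceptually just a pair of dictionaries plus a pool of external ports. A minimal sketch (class and method names are hypothetical; real NAT also tracks protocol, timeouts, and connection state):

```python
import itertools

class NatTable:
    def __init__(self, public_ip: str):
        self.public_ip = public_ip
        self._ports = itertools.count(40_000)     # external port pool
        self._out = {}    # (internal_ip, internal_port) -> external_port
        self._back = {}   # external_port -> (internal_ip, internal_port)

    def outgoing(self, src_ip: str, src_port: int):
        key = (src_ip, src_port)
        if key not in self._out:                  # first packet: allocate a port
            port = next(self._ports)
            self._out[key] = port
            self._back[port] = key
        return self.public_ip, self._out[key]     # rewritten source

    def incoming(self, dst_port: int):
        return self._back[dst_port]               # rewritten destination

nat = NatTable("73.45.123.89")
ext = nat.outgoing("192.168.1.10", 50123)   # -> ("73.45.123.89", 40000)
```

Replies to external port 40000 map back to the laptop; a second device gets the next port from the pool, which is how many internal hosts share one public IP.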

Subnet Masks and CIDR Notation

How Subnets Divide Networks
A subnet mask tells you which part of an IP is the NETWORK
and which part is the HOST.

IP:          192.168.1.100
Subnet mask: 255.255.255.0

Binary:
  IP:   11000000.10101000.00000001.01100100
  Mask: 11111111.11111111.11111111.00000000
        ├── NETWORK (first 24 bits) ──┤├HOST┤

Network address: 192.168.1.0   (all host bits = 0)
Broadcast:       192.168.1.255 (all host bits = 1)
Usable hosts:    192.168.1.1 -- 192.168.1.254 (254 devices)

CIDR notation is shorthand: 192.168.1.0/24
The "/24" means "the first 24 bits are the network part."

Common CIDR blocks:
  /8   = 255.0.0.0       = 16,777,214 hosts  (huge, like 10.0.0.0/8)
  /16  = 255.255.0.0     = 65,534 hosts       (medium)
  /24  = 255.255.255.0   = 254 hosts           (typical LAN)
  /28  = 255.255.255.240 = 14 hosts            (small subnet)
  /32  = 255.255.255.255 = 1 host              (single IP, used in routing)

Why subnetting matters for developers:
  - AWS VPCs use CIDR blocks: "Give me a /16 network (65K IPs)"
  - Docker networks: containers get IPs from a subnet (172.17.0.0/16)
  - Firewall rules: "Allow traffic from 10.0.0.0/8" means any 10.x.x.x
  - Kubernetes: pods get IPs from a cluster CIDR range
CIDR Subnet Formula:

For a /prefix network:
• Number of addresses = 2^(32 - prefix)
• Usable hosts = 2^(32 - prefix) - 2 (subtract network and broadcast addresses)

Example: /24 → 2^8 = 256 addresses, 254 usable hosts
Example: /16 → 2^16 = 65,536 addresses, 65,534 usable hosts
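Python's standard ipaddress module implements exactly this arithmetic, which makes it handy for double-checking subnet math and firewall rules:

```python
import ipaddress

net = ipaddress.ip_network("192.168.1.0/24")
net.num_addresses                               # 256 = 2 ** (32 - 24)
net.num_addresses - 2                           # 254 usable hosts
str(net.network_address)                        # '192.168.1.0'
str(net.broadcast_address)                      # '192.168.1.255'
ipaddress.ip_address("192.168.1.100") in net    # True

# Firewall-style check: is this source inside 10.0.0.0/8?
ipaddress.ip_address("10.42.7.9") in ipaddress.ip_network("10.0.0.0/8")  # True
```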

IPv6

The Future (and Present) of IP Addressing
IPv4 has 4.3 billion addresses. The world has 15+ billion connected devices.
We ran out of IPv4 addresses years ago. IPv6 fixes this.

IPv6: 128 bits = 340 undecillion addresses (3.4 x 10^38)
That's enough to give every grain of sand on Earth its own IP.

Format: eight groups of four hex digits, separated by colons
  Full:    2001:0db8:85a3:0000:0000:8a2e:0370:7334
  Short:   2001:db8:85a3::8a2e:370:7334
           (leading zeros dropped, consecutive zero groups = ::)

Special addresses:
  ::1           -- Loopback (like 127.0.0.1 in IPv4)
  fe80::/10     -- Link-local (like 169.254.x.x)
  ::ffff:0:0/96 -- IPv4-mapped IPv6 (e.g., ::ffff:192.168.1.1)

Why adoption is slow:
  - NAT made IPv4 livable (one public IP for a whole network)
  - Upgrading every router, firewall, and app is expensive
  - IPv4 and IPv6 are NOT compatible -- you need dual-stack or tunneling
  - Most cloud providers support both, but many apps are still IPv4-only

As a developer: ensure your apps work with both.
  - Don't hardcode IPv4 formats in validators
  - Use libraries that handle both (Node's net module, Go's net package)
  - Configure your servers to listen on both (:: for IPv6, 0.0.0.0 for IPv4)
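A dual-stack listener looks like this sketch: bind one IPv6 socket with IPV6_V6ONLY switched off, so IPv4 clients arrive as ::ffff:a.b.c.d mapped addresses. Whether the option can be flipped (and its default) varies by OS:

```python
import socket

# One IPv6 socket that also accepts IPv4 clients (as mapped addresses).
sock = socket.socket(socket.AF_INET6, socket.SOCK_STREAM)
sock.setsockopt(socket.IPPROTO_IPV6, socket.IPV6_V6ONLY, 0)  # 0 = dual-stack
sock.bind(("::", 0))        # "::" is the IPv6 wildcard; port 0 = any free port
sock.listen()
host, port, *_ = sock.getsockname()   # IPv6 sockaddr is a 4-tuple
sock.close()
```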

5. DNS -- The Internet's Phone Book

Humans remember names (google.com). Computers use numbers (142.250.80.46). DNS (Domain Name System) translates between the two. It is one of the most critical pieces of internet infrastructure -- if DNS is down, the internet feels "broken" even though the servers are fine.

DNS Resolution Step by Step
You type "www.example.com" in your browser. Here is the full DNS lookup:

1. BROWSER CACHE
   Browser checks its own DNS cache.
   "Have I looked up www.example.com recently?" No.

2. OS CACHE
   Browser asks the operating system.
   OS checks /etc/hosts file and its DNS cache. Not there.

3. RECURSIVE RESOLVER (your ISP's DNS server, or 8.8.8.8, or 1.1.1.1)
   OS sends a DNS query to the configured resolver.
   The resolver does the heavy lifting:

4. ROOT NAMESERVER (13 root server clusters worldwide)
   Resolver asks: "Where is www.example.com?"
   Root says: "I don't know, but .com is handled by these TLD servers."

5. TLD NAMESERVER (.com, .org, .io, etc.)
   Resolver asks the .com TLD server: "Where is example.com?"
   TLD says: "example.com's nameservers are ns1.example.com (93.184.216.34)"

6. AUTHORITATIVE NAMESERVER (the domain owner's DNS server)
   Resolver asks example.com's nameserver: "What is www.example.com?"
   Authoritative NS responds: "It's 93.184.216.34, TTL=3600"

7. RESULT CACHED AND RETURNED
   Resolver caches the answer for 3600 seconds (1 hour).
   Returns the IP to your OS, which caches it too.
   Browser connects to 93.184.216.34.

Total time: 20-100ms (first lookup). Subsequent lookups: ~0ms (cached).

   Browser ──► OS ──► Recursive Resolver ──► Root NS
                                          ──► .com TLD NS
                                          ──► example.com Auth NS
                                          ◄── 93.184.216.34 (answer)
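From application code, the whole chain above collapses into one call; the OS and the recursive resolver do the walking and the caching. A sketch with Python's standard library, resolving localhost so no network is needed:

```python
import socket

# getaddrinfo consults /etc/hosts, the OS cache, and the configured
# recursive resolver -- the same path steps 1-7 describe.
def resolve(hostname: str) -> set[str]:
    infos = socket.getaddrinfo(hostname, None, type=socket.SOCK_STREAM)
    return {info[4][0] for info in infos}     # sockaddr tuples -> bare IPs

addrs = resolve("localhost")   # answered from /etc/hosts, no network round trip
```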

DNS Record Types

Record  Purpose                                   Example
A       Maps domain to IPv4 address               example.com. A 93.184.216.34
AAAA    Maps domain to IPv6 address               example.com. AAAA 2606:2800:220:1:...
CNAME   Alias -- points one domain to another     www.example.com. CNAME example.com.
MX      Mail server for the domain                example.com. MX 10 mail.example.com.
TXT     Arbitrary text (SPF, DKIM, verification)  example.com. TXT "v=spf1 include:_spf.google.com ~all"
NS      Nameserver for the domain                 example.com. NS ns1.example.com.
SRV     Service location (port and host)          _sip._tcp.example.com. SRV 10 60 5060 sip.example.com.
PTR     Reverse lookup (IP to domain)             34.216.184.93.in-addr.arpa. PTR example.com.
TTL and DNS Caching
TTL (Time To Live) tells resolvers how long to cache a record.

example.com.  3600  IN  A  93.184.216.34
              ^^^^
              TTL = 3600 seconds (1 hour)

After 1 hour, resolvers must re-query for a fresh answer.

LOW TTL (60-300 seconds):
  - Records update quickly (good for failover, blue-green deploys)
  - More DNS queries (slightly slower for users)
  - Use when you expect to change the IP frequently

HIGH TTL (3600-86400 seconds):
  - Fewer DNS queries (faster for users, less load on nameservers)
  - Changes propagate slowly (could take hours)
  - Use for stable infrastructure

COMMON MISTAKE: Setting TTL to 86400 (24 hours), then doing a
server migration and wondering why users still hit the old IP.
Lower your TTL to 60 seconds BEFORE a migration, wait 24 hours
for old caches to expire, do the migration, then raise TTL back.
DNS Security Concerns

Traditional DNS is unencrypted -- your ISP (and anyone on your network) can see every domain you visit. Modern alternatives:

  • DNS over HTTPS (DoH): DNS queries sent over HTTPS. Used by Firefox and Chrome. Hides queries from ISPs.
  • DNS over TLS (DoT): DNS queries encrypted with TLS on port 853. Used by Android and systemd-resolved.
  • DNSSEC: Cryptographically signs DNS records to prevent spoofing. Verifies the response came from the real nameserver.

6. TCP vs UDP

TCP and UDP are the two main transport layer protocols. They solve fundamentally different problems: TCP guarantees delivery at the cost of speed. UDP prioritizes speed at the cost of reliability. Every application makes this tradeoff.

TCP -- Transmission Control Protocol

The Three-Way Handshake
Before any data is sent, TCP establishes a connection:

  Client                         Server
    |                               |
    |──── SYN (seq=100) ──────────►|   1. "I want to connect"
    |                               |
    |◄─── SYN-ACK (seq=300,ack=101)|   2. "OK, I acknowledge your SYN"
    |                               |
    |──── ACK (ack=301) ──────────►|   3. "I acknowledge your SYN-ACK"
    |                               |
    |   Connection established.     |
    |   Now data can flow.          |

Why three steps?
  - Both sides must agree on initial sequence numbers
  - Prevents old/duplicate connection attempts from being accepted
  - Each side confirms it can send AND receive

After data transfer, the connection is closed with a four-way FIN handshake:
  Client ── FIN ──► Server    Server ── ACK ──► Client
  Server ── FIN ──► Client    Client ── ACK ──► Server
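Application code never sees any of this: connect() returns once SYN / SYN-ACK / ACK has completed, and close() starts the FIN exchange. A self-contained sketch against a throwaway local server:

```python
import socket
import threading

server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))          # port 0: let the OS pick a free port
server.listen(1)
port = server.getsockname()[1]

def accept_once():
    conn, _ = server.accept()          # returns after the 3-way handshake
    conn.sendall(b"hello")
    conn.close()                       # kicks off the FIN teardown

t = threading.Thread(target=accept_once)
t.start()

client = socket.create_connection(("127.0.0.1", port))  # handshake happens here
data = client.recv(1024)
client.close()
t.join()
server.close()
```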
How TCP Guarantees Reliability
TCP provides several guarantees that UDP does not:

1. ORDERING
   Data is split into segments, each numbered with a sequence number.
   If segment 3 arrives before segment 2, TCP holds it and waits.
   Your application always receives data in the correct order.

2. ACKNOWLEDGMENT
   Every segment must be acknowledged. If the sender doesn't get
   an ACK within a timeout, it retransmits the segment.

   Client: [Data seq=1] ──►  Server: [ACK=2] ──►
   Client: [Data seq=2] ──►  (lost!)
   Client: (timeout, retransmit) [Data seq=2] ──►  Server: [ACK=3] ──►

3. FLOW CONTROL (Window Size)
   Receiver tells sender: "I can buffer 64KB right now."
   If the receiver is slow, it shrinks the window. Sender slows down.
   Prevents the sender from overwhelming the receiver.

4. CONGESTION CONTROL
   TCP starts slow (small window), then ramps up.
   If packets are lost, TCP assumes network congestion and backs off.
   Algorithms: Slow Start, Congestion Avoidance, Fast Retransmit.

5. ERROR DETECTION
   Each segment has a checksum. If bits got corrupted in transit,
   the receiver detects it and drops the segment (sender retransmits).
TCP 3-Way Handshake (Exact Sequence):

1. Client → Server: SYN (seq=x)
2. Server → Client: SYN-ACK (seq=y, ack=x+1)
3. Client → Server: ACK (seq=x+1, ack=y+1)

Connection established. Both sides know the other's initial sequence number.

UDP -- User Datagram Protocol

UDP: Fire and Forget
UDP is dead simple:
  - No connection setup (no handshake)
  - No guaranteed delivery (packets can be lost)
  - No ordering (packets can arrive out of order)
  - No congestion control (sender blasts at full speed)
  - Minimal overhead (8-byte header vs TCP's 20-byte header)

UDP packet structure:
  ┌─────────────┬─────────────┐
  │ Source Port │  Dest Port  │  4 bytes
  ├─────────────┼─────────────┤
  │   Length    │  Checksum   │  4 bytes
  ├─────────────┴─────────────┤
  │          Payload          │  Variable
  └───────────────────────────┘

Total header: 8 bytes. That's it.
TCP header: 20-60 bytes. Plus handshake. Plus acknowledgments.

UDP is fast because it does almost nothing.
If you need reliability with UDP, YOU implement it in your application.
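The fire-and-forget behavior in code, using a loopback datagram so the example is self-contained (on a real network the sendto below could simply vanish, and nothing would tell the sender):

```python
import socket

# No handshake, no connection object -- just datagrams.
recv_sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
recv_sock.bind(("127.0.0.1", 0))                 # port 0: OS picks a free port
port = recv_sock.getsockname()[1]

send_sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
send_sock.sendto(b"ping", ("127.0.0.1", port))   # fire and forget
payload, addr = recv_sock.recvfrom(1024)         # whole datagram or nothing

send_sock.close()
recv_sock.close()
```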
Feature      TCP                                       UDP
Connection   Connection-oriented (handshake required)  Connectionless (just send)
Reliability  Guaranteed delivery, retransmission       Best-effort, packets may be lost
Ordering     Data arrives in order                     No ordering guarantee
Speed        Slower (overhead from reliability)        Faster (minimal overhead)
Header size  20-60 bytes                               8 bytes
Use cases    HTTP, email, file transfer, SSH           Video streaming, gaming, DNS, VoIP
When to Use Each
  • Use TCP when you cannot afford to lose data: web pages, API calls, file downloads, email, database connections.
  • Use UDP when speed matters more than completeness: live video, voice calls (VoIP), online games, DNS lookups, IoT sensor data.
  • Modern twist: QUIC (used by HTTP/3) is built on UDP but adds reliability at the application layer. It gets TCP-like guarantees with UDP-like performance by avoiding TCP's head-of-line blocking.

7. HTTP/HTTPS

HTTP (HyperText Transfer Protocol) is the protocol your browser and APIs speak. It is a request-response protocol: the client sends a request, the server sends a response. Every web developer uses HTTP every day, but few understand what is actually on the wire.

Anatomy of an HTTP Request
GET /api/users?page=2 HTTP/1.1
Host: api.example.com
Accept: application/json
Authorization: Bearer eyJhbGciOiJIUzI1NiIs...
User-Agent: Mozilla/5.0 (X11; Linux x86_64)
Accept-Encoding: gzip, deflate, br
Connection: keep-alive

[Request line]     GET /api/users?page=2 HTTP/1.1
  - Method:        GET
  - Path:          /api/users
  - Query string:  ?page=2
  - HTTP version:  HTTP/1.1

[Headers]          Key: Value pairs with metadata
  - Host:          Which server (required in HTTP/1.1)
  - Accept:        What response format the client wants
  - Authorization: Credentials (bearer token, API key)
  - User-Agent:    What client is making the request

[Body]             (empty for GET, present for POST/PUT/PATCH)


Anatomy of an HTTP Response:

HTTP/1.1 200 OK
Content-Type: application/json
Content-Length: 245
Cache-Control: max-age=60
Set-Cookie: session=abc123; HttpOnly; Secure

{"users": [{"id": 1, "name": "Sean"}, {"id": 2, "name": "Alex"}]}

[Status line]      HTTP/1.1 200 OK
  - HTTP version:  HTTP/1.1
  - Status code:   200
  - Reason phrase: OK

[Headers]          Response metadata
[Body]             The actual data
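An HTTP/1.1 request is literally that text on the wire, CRLF line endings and all. Building one by hand makes the structure concrete (host and path are the example values from above):

```python
# Request line, then headers, then a blank line. A body would follow the
# blank line for POST/PUT/PATCH.
request = (
    "GET /api/users?page=2 HTTP/1.1\r\n"
    "Host: api.example.com\r\n"
    "Accept: application/json\r\n"
    "\r\n"                          # empty line: end of headers
).encode("ascii")

request_line = request.split(b"\r\n", 1)[0]
method, path, version = request_line.split(b" ")
```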

HTTP Methods

Method   Purpose                                    Has Body  Idempotent  Safe
GET      Read / retrieve a resource                 No        Yes         Yes
POST     Create a new resource                      Yes       No          No
PUT      Replace an entire resource                 Yes       Yes         No
PATCH    Partially update a resource                Yes       No          No
DELETE   Remove a resource                          Optional  Yes         No
HEAD     GET without body (check headers only)      No        Yes         Yes
OPTIONS  What methods are allowed (CORS preflight)  No        Yes         Yes
Idempotent vs Safe -- Why It Matters
SAFE: Calling it doesn't change anything on the server.
  GET /users -- safe (just reading)
  DELETE /users/5 -- NOT safe (deletes data)

IDEMPOTENT: Calling it once or 100 times has the same result.
  GET /users -- idempotent (always returns same data)
  PUT /users/5 {name:"Sean"} -- idempotent (result is the same)
  DELETE /users/5 -- idempotent (user is deleted, calling again changes nothing)
  POST /users {name:"Sean"} -- NOT idempotent (creates a new user each time!)

Why this matters:
  - Network timeout on a POST? DON'T automatically retry (might create duplicates)
  - Network timeout on a PUT? Safe to retry (same result)
  - This is why payment APIs use idempotency keys:
    POST /payments {amount: 50, idempotency_key: "abc123"}
    If you retry with the same key, the server returns the existing payment
    instead of charging twice.
HTTP Method Properties (Precise Rules):

Method  Safe?  Idempotent?  Body?
GET     Yes    Yes          No
POST    No     No           Yes
PUT     No     Yes          Yes
PATCH   No     No           Yes
DELETE  No     Yes          Optional

Safe: Does not modify server state. Idempotent: Multiple identical requests have same effect as one.

HTTP Status Codes

Every Status Code You Need to Know
1xx INFORMATIONAL (rare, mostly protocol-level)
  100 Continue          -- "I got your headers, send the body"
  101 Switching Protocols -- "Upgrading to WebSocket"

2xx SUCCESS
  200 OK                -- Standard success response
  201 Created           -- Resource created (POST response)
  204 No Content        -- Success, but no body (DELETE response)

3xx REDIRECTION
  301 Moved Permanently -- URL changed forever (cached by browsers)
  302 Found             -- Temporary redirect (not cached)
  304 Not Modified      -- Use your cached version (saves bandwidth)
  307 Temporary Redirect -- Like 302 but preserves HTTP method
  308 Permanent Redirect -- Like 301 but preserves HTTP method

4xx CLIENT ERROR (the request was wrong)
  400 Bad Request       -- Malformed request (missing field, bad JSON)
  401 Unauthorized      -- Not authenticated (no credentials or expired)
  403 Forbidden         -- Authenticated but not authorized (no permission)
  404 Not Found         -- Resource doesn't exist
  405 Method Not Allowed -- Wrong HTTP method (POST to a GET-only endpoint)
  409 Conflict          -- Conflict with current state (duplicate username)
  413 Payload Too Large -- Request body exceeds server's limit
  422 Unprocessable Entity -- Valid syntax but invalid data (validation errors)
  429 Too Many Requests -- Rate limited

5xx SERVER ERROR (the server failed)
  500 Internal Server Error -- Generic server crash
  502 Bad Gateway       -- Proxy/load balancer got bad response from upstream
  503 Service Unavailable -- Server overloaded or in maintenance
  504 Gateway Timeout   -- Proxy didn't get a response from upstream in time

HTTP/1.1 vs HTTP/2 vs HTTP/3

The Evolution of HTTP
HTTP/1.1 (1997 -- still widely used):
  - One request per TCP connection at a time
  - Workaround: browsers open 6 parallel connections per domain
  - Text-based protocol (human-readable headers)
  - Head-of-line blocking: slow response blocks all others on that connection
  - Keep-Alive: reuse connection for multiple requests (sequential)

HTTP/2 (2015 -- widely adopted):
  - MULTIPLEXING: multiple requests on ONE TCP connection simultaneously
  - Binary protocol (faster to parse, not human-readable)
  - Header compression (HPACK) -- headers often repeat, compress them
  - Server Push: server sends resources before client requests them
  - Stream prioritization: important resources first
  - Still TCP, so TCP-level head-of-line blocking remains

HTTP/3 (2022 -- growing adoption):
  - Built on QUIC (UDP-based transport)
  - NO head-of-line blocking (lost packet affects only its stream)
  - Faster connection setup (0-RTT or 1-RTT vs TCP's 1-3 RTT)
  - Built-in encryption (TLS 1.3 is mandatory)
  - Better for mobile (connection survives IP changes)
  - Used by: Google, Facebook, Cloudflare

Performance comparison (loading a web page with 50 resources):
  HTTP/1.1: ~6 TCP connections, waterfall of requests       SLOW
  HTTP/2:   1 TCP connection, all 50 requests in parallel   FAST
  HTTP/3:   1 QUIC connection, parallel + no HOL blocking   FASTEST
Cookies, Sessions, and Headers Developers Should Know
  • Set-Cookie / Cookie: Server sends Set-Cookie: session=abc; HttpOnly; Secure; SameSite=Strict. Browser sends Cookie: session=abc on every subsequent request.
  • Cache-Control: Controls caching. no-store = never cache. max-age=3600 = cache for 1 hour. public = CDN can cache. private = only browser can cache.
  • Content-Type: Tells the receiver the format. application/json, text/html, multipart/form-data.
  • CORS headers: Access-Control-Allow-Origin controls which domains can call your API from a browser.
  • ETag / If-None-Match: Efficient caching. Server sends a hash of the response. Client sends it back on next request. If unchanged, server returns 304 (no body, save bandwidth).
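The ETag exchange is easy to model. A sketch of the server side, assuming a hash-based tag (real servers may also use modification times or other validators):

```python
import hashlib

def make_etag(body: bytes) -> str:
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'

def respond(body: bytes, if_none_match=None):
    etag = make_etag(body)
    if if_none_match == etag:
        return 304, b"", etag      # client's cached copy is current: no body
    return 200, body, etag         # full response plus the tag to cache

status, body, etag = respond(b"<html>hi</html>", None)   # first request
status2, body2, _ = respond(b"<html>hi</html>", etag)    # revalidation: 304
```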

8. TLS/SSL -- How Encryption Works

TLS (Transport Layer Security) is what puts the "S" in HTTPS. It encrypts the connection between client and server so that no one in the middle -- your ISP, a hacker on the WiFi, a government -- can read or modify the data. SSL is the old name (deprecated). TLS is the current standard (TLS 1.3 as of 2018).

The TLS 1.3 Handshake
Before encrypted data flows, client and server must agree on
encryption keys. This is the TLS handshake:

  Client                              Server
    |                                    |
    |── ClientHello ───────────────────►|
    |   (supported ciphers, random,     |
    |    key share, SNI)                |
    |                                    |
    |◄────────────────── ServerHello ───|
    |   (chosen cipher, random,         |
    |    key share, certificate,        |
    |    finished)                       |
    |                                    |
    |── Finished ──────────────────────►|
    |                                    |
    |◄═══════ Encrypted data ══════════►|

TLS 1.3 completes in 1 round-trip (1-RTT).
TLS 1.2 took 2 round-trips (2-RTT).
TLS 1.3 with 0-RTT resumption: 0 round-trips for repeat connections!

What happens in each step:

1. CLIENT HELLO: "I support these ciphers: AES-256-GCM, ChaCha20.
   Here is my key share (Diffie-Hellman). I want to connect to
   api.example.com (SNI = Server Name Indication)."

2. SERVER HELLO: "I chose AES-256-GCM. Here is my certificate
   proving I'm really api.example.com. Here is my key share.
   Together our key shares create a shared secret that only we know."

3. FINISHED: Both sides derive encryption keys from the shared secret.
   All subsequent data is encrypted with AES-256-GCM.
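In application code, all of this is delegated to a TLS library; you supply policy, it runs the handshake. A sketch with Python's ssl module, whose defaults already enforce certificate-chain and hostname verification (the connection lines are commented out so the sketch needs no network):

```python
import ssl

ctx = ssl.create_default_context()            # trusts the OS root CA store
ctx.minimum_version = ssl.TLSVersion.TLSv1_2  # refuse legacy protocol versions

# Wrapping a socket would then look like:
# import socket
# with socket.create_connection(("api.example.com", 443)) as raw:
#     with ctx.wrap_socket(raw, server_hostname="api.example.com") as tls:
#         ...   # server_hostname drives SNI and hostname verification
```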
Certificates and Certificate Authorities
How does the client know the server is legit and not an impersonator?
CERTIFICATES.

A TLS certificate contains:
  - Domain name (e.g., example.com)
  - Public key (for key exchange)
  - Issuer (who signed this certificate)
  - Validity period (not before / not after)
  - Digital signature from the CA

CERTIFICATE CHAIN:
  Your browser trusts ~150 ROOT Certificate Authorities (CAs).
  CAs sign intermediate certificates. Intermediates sign your cert.

  [Root CA: DigiCert]           -- pre-installed in your browser/OS
       |
  [Intermediate CA: DigiCert G2] -- signed by root
       |
  [Your cert: example.com]       -- signed by intermediate

  Browser verifies: "example.com cert signed by DigiCert G2,
  which is signed by DigiCert root, which I trust. Chain valid."

Getting a certificate:
  1. Let's Encrypt (free, automated, 90-day certificates)
     $ certbot --nginx -d example.com

  2. Paid CAs (DigiCert, Comodo) for extended validation (EV)
     EV certs require identity verification (company name in browser)

  3. Self-signed (for development only -- browsers will show warnings)
     $ openssl req -x509 -newkey rsa:4096 -keyout key.pem -out cert.pem
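To see the fields a certificate contains (subject, issuer, validity window), you can generate a throwaway self-signed cert like the dev command above and read it back with openssl; the file names here are arbitrary:

```shell
# Generate a throwaway self-signed cert non-interactively (dev only)
openssl req -x509 -newkey rsa:2048 -nodes -days 30 \
  -subj "/CN=example.com" -keyout key.pem -out cert.pem

# Print the subject, issuer, and validity window
openssl x509 -in cert.pem -noout -subject -issuer -dates

# For a live site, fetch and inspect its presented chain:
# openssl s_client -connect example.com:443 -showcerts </dev/null
```

For a self-signed cert the subject and issuer are identical, which is exactly why browsers refuse to trust it: no CA in the chain vouches for it.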

Why HTTPS Everywhere

Without HTTPS, anyone between you and the server can:

  • Read everything: Passwords, API tokens, personal data -- all in plain text on the wire.
  • Modify responses: ISPs have injected ads into HTTP pages. Attackers can inject malicious JavaScript.
  • Impersonate servers: A rogue WiFi hotspot can pretend to be your bank and steal credentials.

Modern browsers mark HTTP sites as "Not Secure." HTTP/2 and HTTP/3 require HTTPS. There is no good reason to serve anything over plain HTTP in production.

9. WebSockets & Real-Time Communication

HTTP is request-response: the client asks, the server answers, and the connection is done. But what about chat apps, live dashboards, multiplayer games, and real-time notifications? You need the server to push data to the client without waiting for a request. That is where WebSockets come in.

HTTP vs WebSocket -- The Core Difference
HTTP (half-duplex, request-response):
  Client: "Any new messages?" ──► Server: "No."
  Client: "Any new messages?" ──► Server: "No."
  Client: "Any new messages?" ──► Server: "Yes, here's one."
  Client: "Any new messages?" ──► Server: "No."
  (client must keep asking -- polling)

WebSocket (full-duplex, persistent connection):
  Client: "Let's upgrade to WebSocket" ──► Server: "OK, upgraded"
  [Connection stays open]
  Server: "New message: Hello!" ──► Client
  Client: "I'm typing..." ──► Server
  Server: "New message: How are you?" ──► Client
  (either side can send at any time)

WebSocket upgrade handshake (starts as HTTP, then upgrades):

  GET /chat HTTP/1.1
  Host: example.com
  Upgrade: websocket
  Connection: Upgrade
  Sec-WebSocket-Key: dGhlIHNhbXBsZSBub25jZQ==
  Sec-WebSocket-Version: 13

  HTTP/1.1 101 Switching Protocols
  Upgrade: websocket
  Connection: Upgrade
  Sec-WebSocket-Accept: s3pPLMBiTxaQ9kYGzzhZRbK+xOo=

After the 101 response, the connection is no longer HTTP.
It is a persistent, bidirectional WebSocket connection.

WebSocket in Practice (Node.js)
// Server (using the 'ws' library)
const WebSocket = require("ws");
const wss = new WebSocket.Server({ port: 8080 });

wss.on("connection", (ws) => {
  console.log("Client connected");

  ws.on("message", (data) => {
    const message = JSON.parse(data);
    console.log("Received:", message);

    // Broadcast to all connected clients
    wss.clients.forEach((client) => {
      if (client.readyState === WebSocket.OPEN) {
        client.send(JSON.stringify({
          user: message.user,
          text: message.text,
          timestamp: Date.now(),
        }));
      }
    });
  });

  ws.on("close", () => console.log("Client disconnected"));
});

// Client (browser)
function connect() {
  const ws = new WebSocket("wss://example.com/chat");

  ws.onopen = () => {
    ws.send(JSON.stringify({ user: "Sean", text: "Hello!" }));
  };

  ws.onmessage = (event) => {
    const message = JSON.parse(event.data);
    displayMessage(message);
  };

  ws.onclose = () => {
    console.log("Disconnected. Reconnecting...");
    setTimeout(connect, 3000);  // Auto-reconnect
  };
}

connect();

Alternatives to WebSockets

  • Long Polling -- server to client. Use for: simple notifications, legacy support.
    Pros: works everywhere, simple. Cons: high latency, wastes connections.
  • SSE (Server-Sent Events) -- server to client only. Use for: live feeds, dashboards, notifications.
    Pros: simple API, auto-reconnect, HTTP-based. Cons: one-directional, limited to text.
  • WebSocket -- bidirectional. Use for: chat, gaming, collaborative editing.
    Pros: low latency, full-duplex, binary support. Cons: more complex, doesn't work through some proxies.
  • WebTransport -- bidirectional. Use for: gaming, streaming (emerging).
    Pros: HTTP/3 based, unreliable + reliable streams. Cons: new, limited browser support.

Server-Sent Events -- The Simpler Alternative
// SSE is perfect when you only need server-to-client streaming.
// It's simpler than WebSockets and works over regular HTTP.

// Server (Express)
app.get("/events", (req, res) => {
  res.writeHead(200, {
    "Content-Type": "text/event-stream",
    "Cache-Control": "no-cache",
    "Connection": "keep-alive",
  });

  // Send event every 2 seconds
  const interval = setInterval(() => {
    res.write(`data: ${JSON.stringify({ time: Date.now() })}\n\n`);
  }, 2000);

  req.on("close", () => clearInterval(interval));
});

// Client (browser -- built-in API, no library needed)
const source = new EventSource("/events");
source.onmessage = (event) => {
  const data = JSON.parse(event.data);
  console.log("Server says:", data);
};

// SSE automatically reconnects if the connection drops.
// Use SSE for: live scores, stock tickers, notification feeds.
// Use WebSockets for: chat, games, anything needing client-to-server.

Choosing the Right Real-Time Approach
  • Notifications / live feeds: SSE. Simple, reliable, auto-reconnect.
  • Chat / messaging: WebSocket. Need bidirectional communication.
  • Collaborative editing (Google Docs): WebSocket + CRDTs or OT (operational transforms).
  • Online gaming: WebSocket or WebTransport. Need low latency and binary data.
  • Simple "is there new data?": Polling (check every 30 seconds). Don't over-engineer.

10. REST vs GraphQL vs gRPC

These are three different approaches to designing APIs. Each makes different tradeoffs between simplicity, flexibility, and performance. There is no universally "best" choice -- it depends on your use case.

The Same Operation in Three Styles
// Get a user's name and their last 3 orders

// ─── REST ───
GET /api/users/42
GET /api/users/42/orders?limit=3
// Two requests. Server decides what fields to return.
// You get ALL fields whether you need them or not (over-fetching).

// ─── GraphQL ───
POST /graphql
// Body is JSON: {"query": "..."}. The query string itself looks like:
{
  user(id: 42) {
    name
    orders(last: 3) {
      id
      total
      status
    }
  }
}
// ONE request. Client specifies exactly which fields it wants.
// No over-fetching, no under-fetching.

// ─── gRPC ───
// Defined in a .proto file:
service UserService {
  rpc GetUser (GetUserRequest) returns (UserResponse);
  rpc GetOrders (GetOrdersRequest) returns (OrdersResponse);
}

message GetUserRequest { int32 id = 1; }
message UserResponse { string name = 1; }
// Uses Protocol Buffers (binary serialization). Extremely fast.
// Strongly typed. Code is auto-generated from .proto files.

Feature            REST                                GraphQL                            gRPC
Transport          HTTP (any method)                   HTTP (usually POST)                HTTP/2 (binary frames)
Data format        JSON (usually)                      JSON                               Protocol Buffers (binary)
Schema / contract  Optional (OpenAPI/Swagger)          Required (GraphQL schema)          Required (.proto files)
Fetching           Fixed response shape per endpoint   Client specifies exact fields      Fixed response per RPC method
Performance        Good                                Good (but parsing overhead)        Excellent (binary, streaming)
Streaming          Limited (SSE, chunked)              Subscriptions (WebSocket)          Built-in bidirectional streaming
Browser support    Native                              Native (it's just HTTP)            Requires grpc-web proxy
Learning curve     Low                                 Medium                             Medium-High
Best for           Public APIs, CRUD apps,             Mobile apps, complex UIs,          Microservices, internal APIs,
                   simple services                     multiple clients                   high performance

When to Use Each
  • REST: Your default choice. Simple, well-understood, great tooling. Use for public APIs, CRUD applications, and when your team knows REST well.
  • GraphQL: When you have multiple clients (web, mobile, smartwatch) that need different data shapes. When over-fetching or under-fetching is a real problem. When you want a single endpoint instead of dozens.
  • gRPC: For service-to-service communication in microservices. When you need maximum performance (binary serialization is 5-10x faster than JSON). When you need streaming (real-time data feeds between services).
  • Many companies use all three: gRPC between backend services, GraphQL as a gateway for frontend clients, REST for public/partner APIs.

11. Common Ports

A port is a 16-bit number (0-65535) that identifies a specific process on a machine. IP addresses identify the machine; ports identify which application on that machine should receive the traffic. Think of the IP as the building's street address and the port as the apartment number.

Port        Protocol             What It's For
20, 21      FTP                  File Transfer Protocol (20=data, 21=control)
22          SSH                  Secure Shell (remote login, SCP, SFTP)
23          Telnet               Unencrypted remote login (don't use this)
25          SMTP                 Sending email (server to server)
53          DNS                  Domain name resolution (UDP and TCP)
80          HTTP                 Unencrypted web traffic
110         POP3                 Retrieving email (download and delete)
143         IMAP                 Retrieving email (sync, keep on server)
443         HTTPS                Encrypted web traffic (TLS)
465 / 587   SMTPS / Submission   Encrypted email sending
993         IMAPS                Encrypted IMAP
995         POP3S                Encrypted POP3
3000        Dev servers          Common default for Node.js, React dev server
3306        MySQL                MySQL / MariaDB database
5432        PostgreSQL           PostgreSQL database
5672        AMQP                 RabbitMQ message broker
6379        Redis                Redis in-memory data store
8080        HTTP (alt)           Common alternative HTTP port for dev/proxies
8443        HTTPS (alt)          Alternative HTTPS port
9092        Kafka                Apache Kafka message broker
27017       MongoDB              MongoDB database

Port Security
  • Ports 0-1023 are "well-known" and require root/admin to bind on Unix systems.
  • Never expose database ports (3306, 5432, 27017, 6379) to the public internet. Bind them to localhost or a private network.
  • Use firewalls (iptables, ufw, security groups) to allow only the ports your application needs. A common setup: allow 22 (SSH), 80 (HTTP), 443 (HTTPS) -- block everything else.
  • Don't rely on non-standard ports for security ("security through obscurity"). Running SSH on port 2222 reduces noise but doesn't stop a determined attacker.

12. Network Debugging Tools

When something goes wrong with your network -- and it will -- you need the right tools to diagnose the problem. These tools are the stethoscope, thermometer, and X-ray machine of network debugging. Learn them before you need them.

ping -- Is the host reachable?
# Basic connectivity test (uses ICMP echo)
$ ping example.com
PING example.com (93.184.216.34): 56 data bytes
64 bytes from 93.184.216.34: icmp_seq=0 ttl=56 time=11.632 ms
64 bytes from 93.184.216.34: icmp_seq=1 ttl=56 time=11.726 ms

# What it tells you:
#   - Host is reachable (you got a response)
#   - Round-trip time: ~11ms (latency)
#   - TTL=56: packet went through ~8 hops (started at 64)

# Ping a specific number of times
$ ping -c 4 example.com

# Common issues:
#   - "Request timeout": host is down OR firewall blocks ICMP
#   - High/variable latency: network congestion or routing issues
#   - "Unknown host": DNS resolution failed

traceroute -- What path do packets take?
# Shows every router (hop) between you and the destination
$ traceroute example.com
 1  192.168.1.1 (192.168.1.1)      1.234 ms  -- your router
 2  10.0.0.1 (10.0.0.1)            8.456 ms  -- ISP's first router
 3  72.14.213.105                   10.234 ms -- ISP backbone
 4  108.170.248.1                   11.567 ms -- Google peering
 5  93.184.216.34                   12.890 ms -- destination

# Use traceroute to identify:
#   - Where latency spikes (hop 2 to 3 = ISP problem)
#   - Where packets are being dropped (*** = no response)
#   - Whether traffic is taking an unexpected route

# On Linux and macOS, use traceroute. On Windows, use tracert.
# mtr combines ping + traceroute in real-time:
$ mtr example.com

nslookup and dig -- DNS debugging
# nslookup: quick DNS lookup
$ nslookup example.com
Server:    1.1.1.1
Address:   1.1.1.1#53

Non-authoritative answer:
Name:      example.com
Address:   93.184.216.34

# dig: detailed DNS lookup (more info than nslookup)
$ dig example.com

;; ANSWER SECTION:
example.com.    3600    IN    A    93.184.216.34

;; Query time: 23 msec
;; SERVER: 1.1.1.1#53

# Look up specific record types
$ dig example.com MX          # mail servers
$ dig example.com AAAA        # IPv6 address
$ dig example.com NS          # nameservers
$ dig example.com TXT         # TXT records (SPF, DKIM)

# Trace the full DNS resolution path
$ dig +trace example.com

# Query a specific DNS server
$ dig @8.8.8.8 example.com   # ask Google's DNS
$ dig @1.1.1.1 example.com   # ask Cloudflare's DNS

# Check if DNS propagation is complete after a change:
# Query multiple resolvers and compare answers

curl -- The Swiss Army knife of HTTP
# Simple GET request
$ curl https://api.example.com/users

# Show response headers
$ curl -I https://example.com
HTTP/2 200
content-type: text/html
content-length: 1256
cache-control: max-age=604800

# Verbose mode (see full request/response including TLS handshake)
$ curl -v https://example.com

# POST with JSON body
$ curl -X POST https://api.example.com/users \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer your-token" \
  -d '{"name": "Sean", "email": "sean@example.com"}'

# Follow redirects
$ curl -L http://example.com    # follows 301/302 redirects

# Download a file
$ curl -O https://example.com/file.zip

# Measure timing (DNS, connect, TLS, first byte, total)
$ curl -w "\nDNS: %{time_namelookup}s\nConnect: %{time_connect}s\n\
TLS: %{time_appconnect}s\nFirst byte: %{time_starttransfer}s\n\
Total: %{time_total}s\n" -o /dev/null -s https://example.com

# Output:
# DNS: 0.012s
# Connect: 0.045s
# TLS: 0.089s
# First byte: 0.134s
# Total: 0.156s

netstat and ss -- What's listening on my machine?
# ss is the modern replacement for netstat (faster, more info)

# Show all listening TCP ports
$ ss -tlnp
State   Recv-Q  Send-Q  Local Address:Port   Peer Address:Port  Process
LISTEN  0       128     0.0.0.0:22            0.0.0.0:*          sshd
LISTEN  0       511     0.0.0.0:80            0.0.0.0:*          nginx
LISTEN  0       511     127.0.0.1:3000        0.0.0.0:*          node
LISTEN  0       128     127.0.0.1:5432        0.0.0.0:*          postgres

# Flags: -t=TCP, -l=listening, -n=numeric (no DNS), -p=process name

# Show all established connections
$ ss -tnp

# Show what's using a specific port
$ ss -tlnp | grep :3000
# or
$ lsof -i :3000

# netstat equivalent (if ss is not available)
$ netstat -tlnp

# Common debugging scenario:
# "My app won't start -- port 3000 already in use"
$ ss -tlnp | grep :3000
# Find the PID and kill it, or use a different port

tcpdump -- Capture packets on the wire
# tcpdump captures raw network traffic. Powerful but noisy.

# Capture all traffic on port 80
$ sudo tcpdump -i any port 80

# Capture traffic to/from a specific host
$ sudo tcpdump -i any host 93.184.216.34

# Capture and save to a file (open in Wireshark later)
$ sudo tcpdump -i any -w capture.pcap port 443

# Show packet contents in ASCII
$ sudo tcpdump -i any -A port 80

# Capture DNS traffic
$ sudo tcpdump -i any port 53

# Capture only SYN packets (new TCP connections)
$ sudo tcpdump -i any 'tcp[tcpflags] & tcp-syn != 0'

# When to use tcpdump:
#   - Debugging mysterious connection issues
#   - Verifying traffic is actually encrypted (HTTPS)
#   - Checking if packets are reaching your server at all
#   - Analyzing protocol behavior (TCP retransmissions, etc.)

# Wireshark: GUI version of tcpdump. Same pcap files.
# Much easier to use for complex analysis (filtering, following streams,
# decoding protocols). Install it on your development machine.

Network Debugging Cheat Sheet
  • "Is the server up?" -- ping server.com
  • "Where is the latency?" -- traceroute server.com or mtr server.com
  • "Is DNS working?" -- dig server.com or nslookup server.com
  • "Is the HTTP endpoint working?" -- curl -v https://server.com/api/health
  • "What's listening on my machine?" -- ss -tlnp
  • "What port is my app using?" -- ss -tlnp | grep :3000 or lsof -i :3000
  • "Are packets actually being sent?" -- sudo tcpdump -i any port 443
  • "How long does each network phase take?" -- curl -w "..." -o /dev/null -s (timing)
  • "Is my firewall blocking traffic?" -- sudo iptables -L -n or check cloud security groups

The Debugging Thought Process

When something network-related breaks, work up from the bottom of the stack:

  1. Layers 1-3: Can you reach the machine? ping the IP address (not the domain). If this fails, it's a connectivity or firewall issue.
  2. DNS: Is name resolution working? dig the domain. If the IP is wrong or missing, it's a DNS problem. (DNS is itself an application-layer service, but check it early -- everything else depends on it.)
  3. Layer 4: Is the port open? ss -tlnp on the server; telnet host port or nc -vz host port from the client. If the connection is refused, the service isn't running or a firewall is blocking it.
  4. Layer 7: Is the application responding correctly? curl -v to see the full HTTP exchange. Check status codes, headers, and response body.

This bottom-up approach prevents you from spending an hour debugging your application code when the real problem is a misconfigured firewall.

13. SSH, SCP & Secure File Transfer

SSH (Secure Shell) is how you securely connect to remote machines and transfer files. If you deploy code to a server, manage a VPS, or push to GitHub over SSH -- you are using this protocol. It replaced Telnet (unencrypted, port 23) and FTP (passwords sent in plain text) by encrypting everything: your commands, your passwords, and your files.

What SSH Actually Is

SSH Protocol Architecture
SSH (Secure Shell) is an APPLICATION LAYER protocol that runs over TCP on port 22.

Protocol stack when you run "ssh user@server":

  ┌─────────────────────────────┐
  │  Your terminal / commands   │  ← What you interact with
  ├─────────────────────────────┤
  │  SSH (Application Layer)    │  ← Encrypts everything, authenticates
  ├─────────────────────────────┤
  │  TCP (Transport Layer)      │  ← Reliable delivery, port 22
  ├─────────────────────────────┤
  │  IP (Network Layer)         │  ← Routes packets to the server
  ├─────────────────────────────┤
  │  Ethernet/WiFi (Link Layer) │  ← Physical transmission
  └─────────────────────────────┘

SSH is NOT just "remote terminal." It is actually three protocols in one:

  1. SSH Transport Layer Protocol (RFC 4253)
     - Server authentication (is this really my server?)
     - Key exchange (Diffie-Hellman)
     - Encryption setup (AES-256, ChaCha20)
     - Integrity checking (HMAC)

  2. SSH User Authentication Protocol (RFC 4252)
     - Password authentication
     - Public key authentication (the preferred method)
     - Keyboard-interactive (for 2FA)

  3. SSH Connection Protocol (RFC 4254)
     - Interactive shell sessions
     - Command execution
     - Port forwarding / tunneling
     - File transfer (SCP, SFTP)

How SSH Establishes a Connection

The SSH Handshake -- Step by Step
When you type: ssh sean@192.168.1.50

1. TCP CONNECTION
   Your machine opens a TCP connection to 192.168.1.50:22.
   Standard TCP three-way handshake: SYN → SYN-ACK → ACK.

2. PROTOCOL VERSION EXCHANGE
   Client: "SSH-2.0-OpenSSH_9.6"
   Server: "SSH-2.0-OpenSSH_9.3"
   Both agree on SSH version 2.0 (SSH-1 is deprecated and insecure).

3. KEY EXCHANGE (Diffie-Hellman)
   This is the critical step. Both sides generate a shared secret
   WITHOUT ever sending the secret over the wire.

   Client picks random a, computes A = g^a mod p, sends A to server
   Server picks random b, computes B = g^b mod p, sends B to client
   Both compute: shared_secret = (other's value)^(my random) mod p
     Client: K = B^a mod p
     Server: K = A^b mod p
   Both get the SAME value K, but an eavesdropper who saw A and B
   cannot compute K (this is the Discrete Logarithm Problem).

   Modern SSH uses Curve25519 (elliptic curve Diffie-Hellman)
   instead of classic DH -- same idea, much faster, shorter keys.

4. SERVER AUTHENTICATION
   Server proves its identity by signing data with its HOST KEY.
   First time you connect, you see:
     "The authenticity of host '192.168.1.50' can't be established.
      ED25519 key fingerprint is SHA256:abc123...
      Are you sure you want to continue connecting?"

   You accept → fingerprint is saved to ~/.ssh/known_hosts.
   Next time, SSH checks the fingerprint automatically.
   If it CHANGES, SSH refuses to connect (possible MITM attack).

5. ENCRYPTION ACTIVATED
   Both sides derive session keys from the shared secret K.
   All traffic from here on is encrypted with AES-256-GCM
   or ChaCha20-Poly1305 (symmetric encryption -- fast).

6. USER AUTHENTICATION
   Now the encrypted channel is up. Client proves WHO it is:
     Option A: Password (sent encrypted through the SSH tunnel)
     Option B: Public key (client proves it has the private key)
   Public key auth is strongly preferred (see below).

7. SESSION ESTABLISHED
   You get a shell. Every keystroke you type and every byte of
   output is encrypted end-to-end.
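The Diffie-Hellman arithmetic in step 3 can be checked with deliberately tiny numbers. Real deployments use 2048-bit-plus primes or Curve25519; g=5, p=23, and the secrets below are toy values chosen only so the math is easy to follow:

```javascript
// Toy Diffie-Hellman -- NEVER use numbers this small for real
const g = 5n, p = 23n;

// Square-and-multiply modular exponentiation with BigInt
const modPow = (base, exp, mod) => {
  let result = 1n;
  base %= mod;
  while (exp > 0n) {
    if (exp & 1n) result = (result * base) % mod;
    base = (base * base) % mod;
    exp >>= 1n;
  }
  return result;
};

const a = 6n;                      // client's secret (never sent)
const b = 15n;                     // server's secret (never sent)
const A = modPow(g, a, p);         // client sends A = 8
const B = modPow(g, b, p);         // server sends B = 19
const clientK = modPow(B, a, p);   // client computes K = B^a mod p
const serverK = modPow(A, b, p);   // server computes K = A^b mod p

console.log(clientK === serverK, clientK); // true 2n
```

An eavesdropper sees g, p, A, and B, but recovering a or b from them is the discrete logarithm problem, which is infeasible at real key sizes.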
SSH Connection Summary:

TCP port: 22 (default)
Transport: TCP (reliable, ordered delivery required)
Key exchange: Curve25519 or Diffie-Hellman (asymmetric)
Session encryption: AES-256-GCM or ChaCha20-Poly1305 (symmetric)
Integrity: HMAC-SHA2 or AEAD built into cipher
Authentication: Public key (preferred) or password

SSH uses asymmetric cryptography for key exchange and authentication,
then symmetric cryptography for the actual data transfer (because it is much faster).

SSH Key Authentication

Why Public Key Auth is Better Than Passwords
Password authentication:
  - Password travels over the network (encrypted, but still)
  - Vulnerable to brute-force attacks
  - You have to type it every time
  - If the server is compromised, attacker gets your password

Public key authentication:
  - Private key NEVER leaves your machine
  - Server only has the public key (useless without the private key)
  - Cannot be brute-forced (4096-bit RSA or Ed25519)
  - No password to type (optional passphrase on the key itself)

How it works:

  1. You generate a key pair:
     $ ssh-keygen -t ed25519 -C "sean@laptop"

     This creates:
       ~/.ssh/id_ed25519       ← PRIVATE key (guard this with your life)
       ~/.ssh/id_ed25519.pub   ← PUBLIC key (safe to share with anyone)

  2. You put the public key on the server:
     $ ssh-copy-id sean@192.168.1.50

     This appends your public key to the server's
     ~/.ssh/authorized_keys file.

  3. Authentication (what happens behind the scenes):
     Server: sends a random challenge
     Client: signs the challenge with the PRIVATE key
     Server: verifies the signature using the PUBLIC key
     If it matches → you're in. Private key never transmitted.

  4. Now you connect without a password:
     $ ssh sean@192.168.1.50
     Welcome to Ubuntu 22.04!  ← straight in
SSH Key Security Best Practices
  • Use Ed25519 keys, not RSA. Ed25519 is faster, shorter, and more secure: ssh-keygen -t ed25519
  • Add a passphrase to your private key. If someone steals the file, they still need the passphrase. Use ssh-agent so you only type it once per session.
  • Never share your private key. The public key is the one you put on servers, GitHub, etc. The private key stays on your machine.
  • Set correct file permissions: chmod 700 ~/.ssh and chmod 600 ~/.ssh/id_ed25519. SSH refuses to use keys with loose permissions.
  • Disable password auth on servers once key auth works: set PasswordAuthentication no in /etc/ssh/sshd_config. This stops brute-force attacks entirely.

SCP -- Secure Copy Protocol

Transferring Files Over SSH
SCP (Secure Copy) uses the SSH protocol to transfer files.
It establishes an SSH connection, then streams file data
through the encrypted tunnel. Same port, same encryption,
same authentication -- it is just SSH with file transfer.

Protocol stack for SCP:

  ┌──────────────────────────────┐
  │  SCP (file copy commands)    │  ← Tells the remote side what to read/write
  ├──────────────────────────────┤
  │  SSH (encrypted channel)     │  ← All data encrypted (AES-256 / ChaCha20)
  ├──────────────────────────────┤
  │  TCP port 22                 │  ← Reliable delivery
  ├──────────────────────────────┤
  │  IP → routing → destination  │
  └──────────────────────────────┘

# Copy a file FROM your machine TO a remote server
$ scp report.pdf sean@192.168.1.50:/home/sean/documents/
  ─── local file ── user@host:remote-path ───────────────

# Copy a file FROM a remote server TO your machine
$ scp sean@192.168.1.50:/var/log/app.log ./local-copy.log
  ─── user@host:remote-path ──────────── local-path ─────

# Copy an entire directory (recursive)
$ scp -r ./my-project sean@192.168.1.50:/home/sean/projects/

# Copy between two remote servers (-3 routes the data through your machine;
# without it, server1 connects to server2 directly)
$ scp -3 sean@server1:/data/backup.sql sean@server2:/data/backup.sql

# Use a specific SSH key
$ scp -i ~/.ssh/deploy_key build.zip deploy@prod:/var/www/

# Use a non-standard SSH port
$ scp -P 2222 file.txt sean@server:/home/sean/

# scp shows a progress meter by default; -v adds SSH debug output
$ scp -v large-file.tar.gz sean@server:/backups/

SFTP -- A Better Alternative to SCP

SFTP vs SCP -- What to Use in 2024+
SFTP (SSH File Transfer Protocol) is the modern replacement for SCP.
Both use SSH for encryption, but SFTP is more capable:

SCP:
  - Simple copy (one direction, one operation)
  - Cannot resume interrupted transfers
  - Cannot list directories or delete remote files
  - OpenSSH has deprecated SCP's protocol internals
    (it now uses SFTP under the hood by default)

SFTP:
  - Full file system operations (ls, cd, mkdir, rm, rename)
  - Can resume interrupted transfers
  - Interactive mode (browse the remote filesystem)
  - Supported by GUI tools (FileZilla, WinSCP, Cyberduck)
  - The standard going forward

# Interactive SFTP session
$ sftp sean@192.168.1.50
Connected to 192.168.1.50.
sftp> ls
documents/  projects/  backups/
sftp> cd documents
sftp> put report.pdf              # Upload file
sftp> get presentation.pptx       # Download file
sftp> mkdir new-folder
sftp> exit

# One-liner (non-interactive)
$ sftp sean@server:/path/to/file.pdf ./local/

# Both SCP and SFTP use the SAME:
#   - Port: TCP 22
#   - Encryption: AES-256 / ChaCha20 (via SSH)
#   - Authentication: SSH keys or password
#   - Protocol: SSH (they're subsystems of SSH)

The key point: SFTP is NOT FTP over SSL.
  - FTP (port 21) + TLS = FTPS  ← old, complex, separate protocol
  - SSH (port 22) + file ops = SFTP  ← what you should use

How PDFs and Binary Files Travel Over SSH

Sending a PDF Over SCP -- What Actually Happens
You run: scp thesis.pdf sean@server:/home/sean/

Here is EXACTLY what happens at every layer:

1. FILE READING (your machine)
   Your OS reads thesis.pdf from disk as RAW BYTES.
   A PDF is binary data -- headers, fonts, images, text streams,
   all encoded as bytes. SCP does not care about the file format.
   It is just bytes: 25 50 44 46 2D 31 2E 34 0A ... (%PDF-1.4...)

2. SSH CONNECTION (already established or created now)
   SCP opens an SSH channel to the server on TCP port 22.
   Key exchange, authentication -- same as any SSH connection.

3. SCP PROTOCOL EXCHANGE
   Client tells server: "I'm sending a file called thesis.pdf,
   size 2,457,600 bytes, permissions 0644."
   Server responds: "Ready to receive."

4. CHUNKING AND ENCRYPTION
   The file bytes are split into chunks (typically 32-64 KB).
   Each chunk is:
     a) Encrypted with AES-256-GCM (or ChaCha20-Poly1305)
     b) Given an HMAC for integrity verification
     c) Wrapped in an SSH packet with sequence number

   Raw PDF bytes:     [25 50 44 46 2D 31 2E 34 ...]
   After encryption:  [A7 3F 8B 2C 91 D4 E8 0F ...]  (unreadable)
   + HMAC tag:        [integrity check appended]

5. TCP SEGMENTATION
   Encrypted SSH packets are handed to TCP.
   TCP splits them into segments (~1460 bytes each for Ethernet).
   Each segment gets a sequence number for ordered delivery.
   TCP ensures every segment arrives and none are lost.

6. IP ROUTING
   TCP segments are wrapped in IP packets.
   Routed across the internet: your router → ISP → backbone →
   destination ISP → server's network → server.

7. REASSEMBLY ON THE SERVER
   TCP reassembles segments in order.
   SSH decrypts each chunk and verifies the HMAC.
   SCP writes the decrypted bytes to /home/sean/thesis.pdf.
   The file on the server is byte-for-byte identical to the original.

KEY INSIGHT: SSH does not care what the file contains.
  PDF, JPEG, MP4, .zip, binary executable -- it is all just bytes.
  SSH encrypts the byte stream. The receiving end gets back the
  exact same bytes. File format is irrelevant to the transport.

  Text file?   → bytes → encrypted → TCP → decrypted → bytes → text file
  PDF?         → bytes → encrypted → TCP → decrypted → bytes → PDF
  Docker image? → bytes → encrypted → TCP → decrypted → bytes → Docker image
  It is all the same process.
File Transfer Protocol Comparison:

Protocol     Port    Transport   Encrypted?                  Status
FTP          20/21   TCP         No (plaintext passwords!)   Avoid
FTPS         990     TCP + TLS   Yes (TLS layer)             Legacy
SCP          22      TCP + SSH   Yes (SSH encryption)        Deprecated
SFTP         22      TCP + SSH   Yes (SSH encryption)        Recommended
rsync+SSH    22      TCP + SSH   Yes (SSH encryption)        Best for sync

SSH Tunneling and Port Forwarding

Using SSH as an Encrypted Tunnel
SSH can do more than remote shells and file copies.
It can tunnel ANY TCP traffic through the encrypted connection.

LOCAL PORT FORWARDING (-L)
  "Make a remote service available on my local machine"

  # Access a database on a remote server that only listens on localhost
  $ ssh -L 5432:localhost:5432 sean@production-server

  Now connect to localhost:5432 on your machine →
  traffic is tunneled through SSH →
  arrives at production-server:5432 (the database)

  Your app:   localhost:5432
       ↓ (encrypted SSH tunnel)
  Server:     localhost:5432 (PostgreSQL)

  Use case: Access a remote database without exposing it to the internet.

REMOTE PORT FORWARDING (-R)
  "Make my local service available on the remote machine"

  # Expose your local dev server through a remote server
  $ ssh -R 8080:localhost:3000 sean@public-server

  Now anyone can hit public-server:8080 →
  traffic tunnels back to your laptop:3000

  Use case: Demo a local app without deploying it.

DYNAMIC PORT FORWARDING (-D) -- SOCKS Proxy
  "Route ALL my traffic through the SSH server"

  $ ssh -D 1080 sean@trusted-server

  Configure your browser to use SOCKS5 proxy at localhost:1080.
  All web traffic goes through the SSH tunnel to trusted-server,
  then out to the internet. Like a simple VPN.

  Use case: Secure browsing on untrusted WiFi.

SSH Quick Reference
  • Connect: ssh user@host (port 22) or ssh -p 2222 user@host (custom port)
  • Copy file to server: scp file.pdf user@host:/path/ or sftp user@host
  • Copy directory: scp -r folder/ user@host:/path/
  • Generate key: ssh-keygen -t ed25519
  • Copy key to server: ssh-copy-id user@host
  • Run a single command: ssh user@host "ls -la /var/log"
  • SSH config file: Put frequent connections in ~/.ssh/config for shorthand aliases
  • Sync files efficiently: rsync -avz -e ssh ./local/ user@host:/remote/ (only transfers changes)

~/.ssh/config -- Stop Typing Long Commands
# Instead of: ssh -i ~/.ssh/deploy_key -p 2222 sean@192.168.1.50
# Just type:  ssh myserver

# ~/.ssh/config
Host myserver
    HostName 192.168.1.50
    User sean
    Port 2222
    IdentityFile ~/.ssh/deploy_key

Host github
    HostName github.com
    User git
    IdentityFile ~/.ssh/github_ed25519

Host prod
    HostName prod.example.com
    User deploy
    IdentityFile ~/.ssh/deploy_key
    ForwardAgent yes

# Now you can:
$ ssh myserver              # connects with all the right settings
$ scp file.txt myserver:~/  # scp uses the config too
$ sftp myserver             # sftp uses the config too
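If you are curious how a Host alias turns into settings, here is a deliberately minimal Python sketch of the lookup. It is a hypothetical parser: the real OpenSSH client also handles wildcards, Match blocks, Include directives, and first-match-wins merging.

```python
def parse_ssh_config(text):
    # Map each "Host" alias to the option lines indented beneath it.
    hosts, current = {}, None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition(" ")
        if key == "Host":
            current = value
            hosts[current] = {}
        elif current is not None:
            hosts[current][key] = value
    return hosts

config = """\
Host myserver
    HostName 192.168.1.50
    User sean
    Port 2222
    IdentityFile ~/.ssh/deploy_key
"""

settings = parse_ssh_config(config)["myserver"]
print(settings["HostName"], settings["Port"])   # 192.168.1.50 2222
```

This is why `ssh myserver`, `scp file.txt myserver:~/`, and `sftp myserver` all behave the same: they resolve the alias through the same file.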

Setting Up an SSH Server (sshd)

So far we have talked about the SSH client -- the program that connects TO a server. But what about the other side? The SSH server (called sshd, the SSH daemon) is what listens on port 22 and accepts incoming connections. If you have a VPS, a Raspberry Pi, or any Linux machine you want to access remotely, you need to set this up.

Installing and Starting OpenSSH Server (Ubuntu/Debian)
# Install the SSH server package
$ sudo apt update
$ sudo apt install openssh-server

# The service starts automatically after install. Check its status:
$ sudo systemctl status sshd
● sshd.service - OpenBSD Secure Shell server
     Active: active (running) since Sat 2026-03-14 10:00:00 UTC
     ...

# If it is not running, start and enable it (so it survives reboots):
$ sudo systemctl start sshd
$ sudo systemctl enable sshd

# Verify it is actually listening on port 22:
$ sudo ss -tlnp | grep 22
LISTEN  0  128  0.0.0.0:22  0.0.0.0:*  users:(("sshd",pid=1234,fd=3))

# Find your machine's IP address so others can connect:
$ ip addr show | grep "inet "
    inet 127.0.0.1/8 scope host lo
    inet 192.168.1.50/24 brd 192.168.1.255 scope global eth0

# Now anyone on your network can: ssh youruser@192.168.1.50
Understanding /etc/ssh/sshd_config
# The SSH server configuration file. Edit with:
$ sudo nano /etc/ssh/sshd_config

# After ANY change, restart the service:
$ sudo systemctl restart sshd

# ──────────────────────────────────────────────────
# KEY SETTINGS AND WHAT THEY MEAN
# ──────────────────────────────────────────────────

Port 22
  # Which port sshd listens on. Change to something like 2222 to
  # reduce brute-force noise (not real security, but fewer bot hits).

PermitRootLogin no
  # Can someone SSH in as root? Almost always set this to "no."
  # Use a normal user and sudo instead. Options:
  #   yes              → root can log in (dangerous)
  #   no               → root cannot log in at all (recommended)
  #   prohibit-password → root can log in with key only, not password

PasswordAuthentication no
  # Allow password-based logins? Set to "no" once your SSH keys work.
  # This is the SINGLE MOST IMPORTANT security setting.
  # With this off, there is no password to brute-force.

PubkeyAuthentication yes
  # Allow public key logins? Yes, always. This is the secure method.

AuthorizedKeysFile .ssh/authorized_keys
  # Where sshd looks for allowed public keys (relative to user's home).
  # When you run ssh-copy-id, your key goes into this file.

MaxAuthTries 3
  # How many authentication attempts per connection before disconnect.

AllowUsers sean deploy
  # Only these usernames can SSH in. Everyone else is rejected.
  # Very useful for multi-user servers.

X11Forwarding no
  # Forward graphical applications? Usually "no" for servers.

ClientAliveInterval 300
ClientAliveCountMax 2
  # Send a keepalive every 300 seconds. After 2 missed responses,
  # disconnect. Prevents zombie sessions (300 × 2 = 600s timeout).

Banner /etc/ssh/banner.txt
  # Display a message before login (legal warning, etc.).
Recommended sshd_config for a Production Server
# /etc/ssh/sshd_config -- hardened settings
Port 2222
PermitRootLogin no
PasswordAuthentication no
PubkeyAuthentication yes
MaxAuthTries 3
AllowUsers deploy
X11Forwarding no
ClientAliveInterval 300
ClientAliveCountMax 2
# (No "Protocol" line needed: modern OpenSSH speaks only protocol 2,
#  and the old "Protocol" option is deprecated)

After editing, always test your config before restarting to avoid locking yourself out:

$ sudo sshd -t            # Test config for syntax errors
$ sudo systemctl restart sshd  # Apply changes
Do Not Lock Yourself Out
  • Before disabling password auth: Make sure your SSH key login actually works. Open a SECOND terminal, SSH in with your key. If it works, then disable passwords. If you disable passwords and your key does not work, you are locked out permanently (unless you have physical/console access).
  • Before changing the port: Make sure your firewall allows the new port. sudo ufw allow 2222/tcp BEFORE restarting sshd.
  • Keep a session open: When editing sshd_config, keep your current SSH session open. Test the new config in a new terminal. If the new config is broken, your old session still works.

Connecting From Phones & Other Devices

SSH is not just for laptops. You can connect from your phone, your tablet, or any device with a network connection. This is incredibly useful for managing servers on the go, checking logs, or restarting a service while away from your desk.

Android -- Termux (Full Terminal)
Termux gives you a real Linux terminal on your Android phone.
It has the actual OpenSSH client built in, not some watered-down version.

1. Install Termux from F-Droid (NOT the Play Store, that version is outdated)
   https://f-droid.org/en/packages/com.termux/

2. Open Termux and install OpenSSH:
   $ pkg update && pkg install openssh

3. Now you have the full ssh command:
   $ ssh sean@192.168.1.50

4. Generate a key pair on your phone:
   $ ssh-keygen -t ed25519 -C "my-android-phone"
   # Keys saved to: ~/.ssh/id_ed25519 (private) and ~/.ssh/id_ed25519.pub (public)

5. Copy your public key to the server:
   $ ssh-copy-id sean@192.168.1.50
   # Or manually: cat ~/.ssh/id_ed25519.pub and paste it into the server's
   #              ~/.ssh/authorized_keys file

6. Create a config file for shortcuts:
   $ nano ~/.ssh/config

   Host myserver
       HostName 192.168.1.50
       User sean
       IdentityFile ~/.ssh/id_ed25519

   $ ssh myserver   # Done. Works exactly like on a laptop.

Bonus: Termux also supports scp, sftp, rsync, and SSH tunneling.
       It is a full Linux environment in your pocket.
Android -- JuiceSSH (GUI App)
JuiceSSH is a graphical SSH client for Android. Good for people
who prefer tapping over typing commands.

1. Install from the Google Play Store
2. Tap "Connections" → "+" to add a new connection
3. Fill in:
   - Nickname:  myserver
   - Type:      SSH
   - Address:   192.168.1.50
   - Identity:  (create one with your username and key/password)
4. Tap the connection to connect

Key management in JuiceSSH:
  - Go to "Identities" → create a new identity
  - You can generate a key pair inside the app
  - Export the PUBLIC key and add it to your server's authorized_keys
  - The private key stays on the phone

JuiceSSH also supports:
  - Port forwarding (tunnels)
  - Multiple simultaneous sessions
  - Snippets (saved commands you can tap to run)
  - Team sharing (share connections without sharing passwords)
iOS / iPad -- Blink Shell & Termius
BLINK SHELL (iOS/iPad -- paid, but the best)
──────────────────────────────────────────────
- Full terminal emulator with mosh + ssh support
- Mosh = "Mobile Shell" -- stays connected even when you switch
  WiFi networks or lose signal temporarily. SSH drops, mosh reconnects.
- Supports Ed25519 keys, ssh-agent, and config files
- Has a built-in key generator
- Supports ProxyJump for jump hosts
- Great keyboard support on iPad with external keyboards

Setup:
  1. Install from App Store
  2. Go to Settings → Keys → Generate a new Ed25519 key
  3. Copy the public key to your server's authorized_keys
  4. Add a host: Settings → Hosts → Add
  5. Type: ssh myserver

TERMIUS (iOS/Android/Desktop -- free tier available)
──────────────────────────────────────────────
- Cross-platform SSH client (phone, tablet, desktop)
- Syncs your connections and keys across devices (with account)
- Supports SFTP (file transfer with a GUI file browser)
- Built-in snippet library for common commands
- Port forwarding support
- Free tier: basic SSH. Paid: SFTP, sync, vaults.

Setup:
  1. Install from App Store / Play Store
  2. Add a new host with your server details
  3. Generate or import an SSH key in the app
  4. Connect and manage your servers
From Other Computers (Windows / Mac / Linux)
WINDOWS (modern -- Windows 10/11)
──────────────────────────────────
OpenSSH is built into Windows now. Open PowerShell or Command Prompt:

  > ssh sean@192.168.1.50
  > ssh-keygen -t ed25519
  > scp file.txt sean@192.168.1.50:/home/sean/

Keys are stored in: C:\Users\YourName\.ssh\
Config file:        C:\Users\YourName\.ssh\config

If OpenSSH is not installed:
  Settings → Apps → Optional Features → Add "OpenSSH Client"

WINDOWS (older / GUI preference) -- PuTTY
──────────────────────────────────────────
  1. Download PuTTY from putty.org
  2. Enter hostname and port 22, click "Open"
  3. For key auth, use PuTTYgen to generate keys
     - PuTTY uses .ppk format (not standard OpenSSH format)
     - PuTTYgen can convert between formats
  4. Load your .ppk key in Connection → SSH → Auth → Private key file

MAC / LINUX
───────────
SSH is pre-installed. Open Terminal:

  $ ssh sean@192.168.1.50
  $ ssh-keygen -t ed25519

Everything works out of the box. Keys in ~/.ssh/, config in ~/.ssh/config.
macOS also has Keychain integration so you do not have to type your
key passphrase every time:

  $ ssh-add --apple-use-keychain ~/.ssh/id_ed25519

Complete Walkthrough: SSH Into Your Laptop From Your Android Phone

This is the full, end-to-end process. No skipping steps. By the end, you will be sitting on your couch with your phone, running commands on your laptop across the room. We will cover the normal Linux case first, then handle WSL2 quirks separately at the end.

Step 1: Prepare Your Laptop (Linux / WSL)
# ──────────────────────────────────────────────────
# 1A. Install OpenSSH server
# ──────────────────────────────────────────────────
$ sudo apt update
$ sudo apt install openssh-server

# ──────────────────────────────────────────────────
# 1B. Start sshd and enable it on boot
# ──────────────────────────────────────────────────
$ sudo systemctl start sshd
$ sudo systemctl enable sshd

# Verify it is running:
$ sudo systemctl status sshd
● sshd.service - OpenBSD Secure Shell server
     Active: active (running)

# ──────────────────────────────────────────────────
# 1C. Find your laptop's IP address
# ──────────────────────────────────────────────────
$ ip addr show | grep "inet " | grep -v 127.0.0.1
    inet 192.168.1.50/24 brd 192.168.1.255 scope global wlan0

# Write down the IP. In this example: 192.168.1.50
# "wlan0" means WiFi. "eth0" would mean ethernet cable.

# ──────────────────────────────────────────────────
# 1D. Open the firewall (if ufw is active)
# ──────────────────────────────────────────────────
$ sudo ufw status
# If it says "active", allow SSH:
$ sudo ufw allow 22/tcp
$ sudo ufw reload

# If ufw is inactive, you can skip this -- nothing is blocking port 22.

# ──────────────────────────────────────────────────
# 1E. Make sure PasswordAuthentication is ON (for now)
# ──────────────────────────────────────────────────
# We will use password auth for the initial setup, then switch to keys.
$ sudo nano /etc/ssh/sshd_config
# Find "PasswordAuthentication" and set it to yes:
#   PasswordAuthentication yes
$ sudo systemctl restart sshd
WSL2 Users: Read This Before Continuing

If your laptop runs Windows and you are using WSL2, there is a networking catch. WSL2 runs in its own virtual machine with its own internal IP address (usually something like 172.x.x.x). Your phone cannot reach that IP directly -- it can only see the Windows host IP (192.168.x.x).

You have two options:

  • Option A: Set up port forwarding from Windows to WSL2 (covered in the WSL2-specific section below).
  • Option B: Use Tailscale on both WSL2 and your phone (it creates a direct tunnel, bypassing all the NAT issues).

If you are on native Linux (not WSL), ignore this and continue normally.

Step 2: Make Sure Both Devices Are on the Same WiFi
This sounds obvious, but it trips people up constantly.

WHY IT MATTERS
──────────────
SSH over a local network (LAN) means your phone and laptop talk
directly through your WiFi router. No internet needed. No port
forwarding. No DNS. Just two devices on the same network.

  Phone (192.168.1.71)  <──WiFi──>  Router  <──WiFi──>  Laptop (192.168.1.50)

Both IPs start with "192.168.1." -- that means same subnet, same network.

HOW TO VERIFY
─────────────
On your laptop:
  $ ip addr show | grep "inet " | grep -v 127.0.0.1
      inet 192.168.1.50/24 ...

On your phone (Termux, or Settings → WiFi → tap your network):
  $ ifconfig wlan0
      inet addr:192.168.1.71

If both start with the same prefix (e.g., 192.168.1.x), you are good.
If they differ (e.g., one is 192.168.1.x and the other is 10.0.0.x),
they are on different networks and SSH will NOT connect.
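The same-subnet test described above can also be done programmatically. A small check using Python's standard ipaddress module, with the addresses from this walkthrough:

```python
import ipaddress

laptop = ipaddress.ip_interface("192.168.1.50/24")   # IP + prefix from `ip addr show`
phone  = ipaddress.ip_address("192.168.1.71")
other  = ipaddress.ip_address("10.0.0.23")

print(phone in laptop.network)   # True  -- same subnet, SSH can reach it
print(other in laptop.network)   # False -- different network, SSH will not connect
```

Note the check uses the laptop's prefix (/24): "same subnet" means the other address falls inside that network range, not merely that the first numbers look similar.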

QUICK TEST -- PING FROM YOUR PHONE
───────────────────────────────────
From Termux on your phone:
  $ ping -c 3 192.168.1.50
  PING 192.168.1.50: 3 packets transmitted, 3 received, 0% packet loss

If you see "0% packet loss" → your phone can reach your laptop. Move on.
If you see "100% packet loss" → check the troubleshooting below.

TROUBLESHOOTING "CANNOT REACH LAPTOP"
──────────────────────────────────────
1. AP Isolation / Client Isolation
   Some routers block devices from talking to each other on WiFi.
   Check your router admin panel (usually 192.168.1.1 in a browser):
   → Wireless Settings → AP Isolation → Disable it

2. Guest Network
   If your phone is on a "Guest" WiFi network, it is deliberately
   isolated. Connect both devices to the same non-guest network.

3. Firewall on laptop
   Make sure you ran: sudo ufw allow 22/tcp

4. Wrong IP
   Double-check the laptop's IP. It can change if your router
   uses DHCP. Re-run: ip addr show
Step 3: Set Up Your Android Phone (Termux)
# ──────────────────────────────────────────────────
# 3A. Install Termux
# ──────────────────────────────────────────────────
# Download Termux from F-Droid (NOT the Play Store -- that version is abandoned):
#   https://f-droid.org/en/packages/com.termux/
#
# Install it and open it. You now have a real Linux terminal on your phone.

# ──────────────────────────────────────────────────
# 3B. Install OpenSSH inside Termux
# ──────────────────────────────────────────────────
$ pkg update && pkg upgrade
$ pkg install openssh

# ──────────────────────────────────────────────────
# 3C. Test the connection with a password first
# ──────────────────────────────────────────────────
$ ssh sean@192.168.1.50
# Replace "sean" with your laptop username and "192.168.1.50" with your laptop IP.
# Type "yes" when it asks about the fingerprint (first time only).
# Enter your laptop password.
#
# If you see your laptop's shell prompt → IT WORKS. Type "exit" to disconnect.

# ──────────────────────────────────────────────────
# 3D. Generate an SSH key pair on your phone
# ──────────────────────────────────────────────────
$ ssh-keygen -t ed25519 -C "android-phone"
# Press Enter for default file location (~/.ssh/id_ed25519)
# Enter a passphrase (optional but recommended)
#
# Two files created:
#   ~/.ssh/id_ed25519       ← PRIVATE key (never share this)
#   ~/.ssh/id_ed25519.pub   ← PUBLIC key (this goes on the laptop)

# ──────────────────────────────────────────────────
# 3E. Copy your public key to the laptop
# ──────────────────────────────────────────────────
$ ssh-copy-id sean@192.168.1.50
# Enter your laptop password one last time.
# This appends your public key to ~/.ssh/authorized_keys on the laptop.

# ──────────────────────────────────────────────────
# 3F. Test key-based login (no password prompt!)
# ──────────────────────────────────────────────────
$ ssh sean@192.168.1.50
# If it logs you in without asking for a password → keys are working.

# ──────────────────────────────────────────────────
# 3G. (Optional) Create an SSH config for convenience
# ──────────────────────────────────────────────────
$ mkdir -p ~/.ssh
$ nano ~/.ssh/config

# Paste this:
Host laptop
    HostName 192.168.1.50
    User sean
    IdentityFile ~/.ssh/id_ed25519

# Now you can just type:
$ ssh laptop
Now Disable Password Auth on Your Laptop

Key-based login is working. Time to lock the door behind you.

# On your laptop:
$ sudo nano /etc/ssh/sshd_config

# Change these lines:
PasswordAuthentication no
PubkeyAuthentication yes

# Restart sshd:
$ sudo systemctl restart sshd

# Test from your phone again -- it should still connect using your key.
# If it asks for a password, something went wrong. DO NOT close your
# current session. Debug with a second terminal.
Step 4: Access From Outside Your Home Network (Optional)
Everything above works on your local WiFi. But what if you want to
SSH into your laptop from a coffee shop, your office, or over
mobile data? Now your phone and laptop are NOT on the same network.

You have several options, ranked from easiest to hardest:

──────────────────────────────────────────────────
OPTION 1: Tailscale (Easiest -- Recommended)
──────────────────────────────────────────────────
Tailscale creates a private VPN mesh. Your devices get stable IPs
(like 100.x.x.x) that work from ANYWHERE -- no port forwarding,
no dynamic DNS, no router changes.

On your laptop:
  $ curl -fsSL https://tailscale.com/install.sh | sh
  $ sudo tailscale up
  # Log in with Google/GitHub/etc. Note the Tailscale IP (e.g., 100.64.0.1)

On your phone:
  Install "Tailscale" from the Play Store.
  Log in with the same account.
  Now from Termux:
  $ ssh sean@100.64.0.1    # Works from anywhere in the world

Why Tailscale is great:
  - No firewall/router config needed
  - Encrypted WireGuard tunnel
  - Works behind NAT, cellular, hotel WiFi -- everything
  - Free for personal use (up to 100 devices)

──────────────────────────────────────────────────
OPTION 2: WireGuard (More Control, More Setup)
──────────────────────────────────────────────────
WireGuard is the VPN protocol that Tailscale is built on. If you
want to self-host your VPN (no third-party account), set up
WireGuard manually. This requires a server with a public IP
(e.g., a cheap VPS) to act as the relay.

──────────────────────────────────────────────────
OPTION 3: Router Port Forwarding + Dynamic DNS
──────────────────────────────────────────────────
This is the traditional approach. It works but has more moving parts.

Step A: Forward port 22 on your router
  1. Log into your router (usually http://192.168.1.1)
  2. Find "Port Forwarding" (sometimes under NAT or Firewall)
  3. Add a rule:
     - External port: 22 (or a non-standard port like 2222)
     - Internal IP: 192.168.1.50 (your laptop)
     - Internal port: 22
     - Protocol: TCP
  4. Save

Step B: Find your public IP
  $ curl ifconfig.me
  203.0.113.42

Step C: Test from outside your network
  $ ssh sean@203.0.113.42

The problem: your public IP can change (most ISPs use dynamic IPs).

Step D: Set up Dynamic DNS so you have a stable hostname
  - DuckDNS (free):   https://www.duckdns.org
    Gives you: mylaptop.duckdns.org
  - No-IP (free tier): https://www.noip.com
    Gives you: mylaptop.ddns.net

  Install the DuckDNS update script on your laptop to keep the
  DNS record pointing to your current IP.

  Then connect with:
  $ ssh sean@mylaptop.duckdns.org
Security Warnings: Exposing SSH to the Internet
  • Bots will find you. Within minutes of opening port 22 to the internet, automated scanners will start trying to log in. This is not hypothetical -- it happens to every public SSH server.
  • Disable password auth. This is non-negotiable. Key-only auth makes password brute-forcing useless.
  • Use a non-standard port. Change sshd to listen on 2222 or similar. It does not stop a determined attacker, but it drops 99% of bot traffic.
  • Install fail2ban. It bans IPs after a few failed login attempts: sudo apt install fail2ban.
  • Consider Tailscale instead. If you just need personal access, Tailscale avoids exposing any port to the public internet. It is safer by design.
Step 5: WSL2-Specific Setup (Windows Laptop)
WSL2 runs inside a lightweight VM. It has its own virtual network
adapter with its own IP address. This means:

  Your phone  →  sees Windows IP (192.168.1.50)
  WSL2        →  has internal IP (172.28.123.45)
  Phone CANNOT reach 172.28.x.x directly

You need to tell Windows: "when something connects to port 22 on
my Windows IP, forward it to port 22 on the WSL2 internal IP."

# ──────────────────────────────────────────────────
# 5A. Find your WSL2 IP address (run inside WSL2)
# ──────────────────────────────────────────────────
$ ip addr show eth0 | grep "inet "
    inet 172.28.123.45/20 brd 172.28.127.255 scope global eth0

# Note: This IP changes every time WSL2 restarts.

# ──────────────────────────────────────────────────
# 5B. Start sshd inside WSL2
# ──────────────────────────────────────────────────
# Older WSL2 installs do not run systemd, so start sshd manually:
$ sudo service ssh start

# On newer WSL2 with systemd enabled:
$ sudo systemctl start sshd

# Verify:
$ sudo ss -tlnp | grep 22
LISTEN  0  128  0.0.0.0:22  0.0.0.0:*  users:(("sshd",...))

# ──────────────────────────────────────────────────
# 5C. Set up port forwarding (run in Windows PowerShell as Admin)
# ──────────────────────────────────────────────────
# Open PowerShell as Administrator and run:

# First, get the WSL2 IP from Windows:
> wsl hostname -I
172.28.123.45

# Set up the port proxy:
> netsh interface portproxy add v4tov4 `
    listenport=22 `
    listenaddress=0.0.0.0 `
    connectport=22 `
    connectaddress=172.28.123.45

# Verify the rule was created:
> netsh interface portproxy show v4tov4

Listen on ipv4:             Connect to ipv4:
Address         Port        Address         Port
--------------- ----------  --------------- ----------
0.0.0.0         22          172.28.123.45   22

# ──────────────────────────────────────────────────
# 5D. Open Windows Firewall for port 22
# ──────────────────────────────────────────────────
# Still in admin PowerShell:
> New-NetFirewallRule -DisplayName "SSH" `
    -Direction Inbound `
    -Action Allow `
    -Protocol TCP `
    -LocalPort 22

# ──────────────────────────────────────────────────
# 5E. Now connect from your phone
# ──────────────────────────────────────────────────
# Use the WINDOWS IP (not the WSL2 internal IP):
$ ssh sean@192.168.1.50
# This hits Windows port 22 → forwarded to WSL2 port 22 → you are in WSL2.

# ──────────────────────────────────────────────────
# 5F. The IP-changes-on-reboot problem
# ──────────────────────────────────────────────────
# WSL2's internal IP changes every time you restart it.
# You need to update the port proxy rule each time.
#
# Automate it with a script. Save this as wsl-ssh-forward.ps1:
#
#   $wslIp = (wsl hostname -I).Trim()
#   netsh interface portproxy delete v4tov4 listenport=22 listenaddress=0.0.0.0
#   netsh interface portproxy add v4tov4 `
#       listenport=22 listenaddress=0.0.0.0 `
#       connectport=22 connectaddress=$wslIp
#   Write-Host "SSH forwarding to WSL2 at $wslIp"
#
# Run it after each WSL2 restart, or add it to Task Scheduler.
WSL2 Shortcut: Just Use Tailscale

All the port forwarding above is fragile. The IP changes, the proxy rules break, the firewall gets in the way. If you install Tailscale inside WSL2, it gets its own stable 100.x.x.x IP that works from anywhere -- bypassing all the Windows-to-WSL2 networking issues entirely.

# Inside WSL2:
$ curl -fsSL https://tailscale.com/install.sh | sh
$ sudo tailscale up

# On your phone (Termux):
$ ssh sean@100.64.0.2   # Tailscale IP -- works anywhere, no port forwarding
Transferring SSH Keys to Mobile Devices
  • Best approach: Generate the key ON the device (Termux, Blink, Termius all support this). Then copy only the public key to your server. The private key never leaves the device.
  • If you must transfer an existing key: Use a secure method -- AirDrop (iOS/Mac), a password manager's secure notes, or a QR code. NEVER email or text a private key.
  • One key per device: Generate a separate key pair for each device. If your phone is lost/stolen, you revoke only that key from your server's authorized_keys without affecting your laptop.

How Password Authentication Works (Deep Dive)

Even though key-based auth is recommended, understanding password authentication teaches you why it is less secure and how SSH protects (and fails to fully protect) passwords.

SSH Password Authentication -- The Full Flow
When you type: ssh sean@server.com and then enter your password:

  ┌──────────┐                              ┌──────────┐
  │  CLIENT  │                              │  SERVER  │
  └────┬─────┘                              └────┬─────┘
       │                                          │
       │  1. TCP handshake (SYN, SYN-ACK, ACK)    │
       │─────────────────────────────────────────►│
       │                                          │
       │  2. SSH version exchange                  │
       │◄────────────────────────────────────────►│
       │                                          │
       │  3. Diffie-Hellman key exchange           │
       │    Both sides compute shared secret K     │
       │    Encryption is now ACTIVE (AES-256)     │
       │◄────────────────────────────────────────►│
       │                                          │
       │  ═══════ ENCRYPTED TUNNEL READY ═══════  │
       │                                          │
       │  4. Client sends: "I want to auth with   │
       │     method: password, user: sean,         │
       │     password: MyP@ssw0rd123"              │
       │─────────────────────────────────────────►│
       │  (encrypted -- eavesdroppers see garbage) │
       │                                          │
       │                    5. Server checks:      │
       │                    - Find user "sean" in  │
       │                      /etc/passwd          │
       │                    - Read password hash   │
       │                      from /etc/shadow     │
       │                    - Hash the received    │
       │                      password with same   │
       │                      salt and algorithm   │
       │                    - Compare hashes       │
       │                                          │
       │  6. "Authentication successful"           │
       │◄─────────────────────────────────────────│
       │                                          │
       │  7. Shell session starts                  │
       │◄────────────────────────────────────────►│
How the Server Checks Your Password (/etc/shadow)
# The server does NOT store your actual password. It stores a HASH.
# The hash lives in /etc/shadow (readable only by root):

$ sudo cat /etc/shadow | grep sean
sean:$6$rK3G.x7z$Wv8q...(long hash)...:19500:0:99999:7:::

# Breaking down that hash string:
#   $6$rK3G.x7z$Wv8q...
#    │  │         │
#    │  │         └─ The actual hash output
#    │  └─ The salt (random string, unique per user)
#    └─ The algorithm: $6$ = SHA-512 (most common on modern Linux)
#                      $5$ = SHA-256
#                      $y$ = yescrypt (newer, on Ubuntu 22.04+)

# When you send your password, the server does:
#   1. Extract the salt from the stored hash
#   2. Compute: hash(your_password + salt)
#   3. Compare result with stored hash
#   4. If they match → you are in. If not → rejected.

# The salt prevents rainbow table attacks (precomputed hash lookups).
# Even if two users have the same password, their hashes differ
# because they have different salts.
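The salt-and-compare procedure can be sketched in a few lines of Python. This is conceptual only: real /etc/shadow entries use sha512-crypt or yescrypt with thousands of rounds and a special encoding, not the single plain SHA-512 round shown here.

```python
import hashlib

def hash_password(password, salt):
    # Conceptual stand-in for the crypt() hash: salt mixed into the input.
    return hashlib.sha512((salt + password).encode()).hexdigest()

# What the server "stores": the salt and the hash, never the password.
salt = "rK3G.x7z"
stored = hash_password("MyP@ssw0rd123", salt)

def check(attempt):
    # 1. take the stored salt  2. hash the attempt with it  3. compare
    return hash_password(attempt, salt) == stored

print(check("MyP@ssw0rd123"))   # True  -- hashes match, you are in
print(check("admin123"))        # False -- rejected

# Same password, different salt -> different hash (defeats rainbow tables):
print(hash_password("MyP@ssw0rd123", "abcdefgh") == stored)   # False
```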
Why Password Auth Is Vulnerable (Even Though Encrypted)
  • Brute force attacks: An attacker cannot read your password on the wire (it is encrypted). But they CAN try thousands of passwords by opening new SSH connections. Bots do this 24/7 on every public server. Common passwords like "admin123" are tried within seconds.
  • Credential stuffing: If your password was leaked in a data breach from another site, attackers will try it against your SSH server automatically.
  • No proof of identity: A password proves you KNOW something, not that you ARE someone. Anyone who learns the password can log in from anywhere.
  • Phishing risk: If you accidentally SSH into an attacker's server (typo in hostname), you just gave them your password.
Fail2Ban -- Stopping Brute Force Attacks
# Fail2Ban monitors your SSH logs and bans IPs that fail too many times.

# Install:
$ sudo apt install fail2ban

# Create a local config (do not edit the main file directly):
$ sudo cp /etc/fail2ban/jail.conf /etc/fail2ban/jail.local
$ sudo nano /etc/fail2ban/jail.local

# Find the [sshd] section and configure:
[sshd]
enabled  = true
port     = ssh        # or 2222 if you changed it
filter   = sshd
logpath  = /var/log/auth.log
maxretry = 3          # ban after 3 failed attempts
bantime  = 3600       # ban for 1 hour (in seconds)
findtime = 600        # within a 10-minute window

# Start and enable:
$ sudo systemctl start fail2ban
$ sudo systemctl enable fail2ban

# Check banned IPs:
$ sudo fail2ban-client status sshd
Status for the jail: sshd
|- Filter
|  |- Currently failed: 2
|  |- Total failed:     847
|  `- File list:        /var/log/auth.log
`- Actions
   |- Currently banned: 14
   |- Total banned:     203
   `- Banned IP list:   103.x.x.x 185.x.x.x ...

# Unban a specific IP (if you accidentally locked yourself out):
$ sudo fail2ban-client set sshd unbanip 192.168.1.100
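Under the hood, fail2ban is essentially log pattern-matching plus a counter. A miniature Python sketch of the idea (the log lines are made-up samples; real fail2ban also enforces the findtime window and actually installs firewall rules):

```python
import re
from collections import Counter

# Matches the standard sshd "Failed password" log line and captures the IP.
FAILED = re.compile(r"Failed password for .* from (\d+\.\d+\.\d+\.\d+)")
MAXRETRY = 3

def ips_to_ban(log_lines):
    hits = Counter()
    for line in log_lines:
        m = FAILED.search(line)
        if m:
            hits[m.group(1)] += 1
    return sorted(ip for ip, n in hits.items() if n >= MAXRETRY)

log = [
    "Mar 14 10:00:01 host sshd[999]: Failed password for root from 203.0.113.9 port 4242 ssh2",
    "Mar 14 10:00:02 host sshd[999]: Failed password for root from 203.0.113.9 port 4243 ssh2",
    "Mar 14 10:00:03 host sshd[999]: Failed password for invalid user admin from 203.0.113.9 port 4244 ssh2",
    "Mar 14 10:00:04 host sshd[999]: Accepted publickey for sean from 192.168.1.71 port 5000 ssh2",
]
print(ips_to_ban(log))   # ['203.0.113.9']
```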

How Public Key Authentication Works (Deep Dive)

Public key authentication is the gold standard for SSH. It is more secure than passwords, cannot be brute-forced, and (once set up) more convenient. Here is exactly what happens at the cryptographic level.

What Is Inside Your Key Files
When you run: ssh-keygen -t ed25519

You get TWO files:

~/.ssh/id_ed25519       ← PRIVATE KEY (never share this)
~/.ssh/id_ed25519.pub   ← PUBLIC KEY (safe to share with anyone)

PRIVATE KEY (id_ed25519):
─────────────────────────
-----BEGIN OPENSSH PRIVATE KEY-----
b3BlbnNzaC1rZXktdjEAAAAABG5vbmUAAAAEbm9uZQAAAAAAAAABAAAAMwAAAAtzc2gtZW
QyNTUxOQAAACB0h7k... (base64-encoded binary data) ...
-----END OPENSSH PRIVATE KEY-----

This contains:
  - The private key (32 bytes for Ed25519)
  - The public key (embedded as well)
  - Metadata (key type, comment, encryption info if passphrase-protected)
  - If you set a passphrase, this file is encrypted with AES-256
    using a key derived from your passphrase. Without the passphrase,
    the file is useless.

PUBLIC KEY (id_ed25519.pub):
────────────────────────────
ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIHSHuT... sean@laptop

This is a single line with three parts:
  [key-type] [base64-encoded-public-key] [comment]

The server stores this line in ~/.ssh/authorized_keys.
You can have multiple public keys in authorized_keys (one per line),
allowing multiple devices to log in.
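You can poke at that one-line format yourself. The base64 blob is a simple length-prefixed wire encoding, and the key type string is embedded inside it as well. The Python sketch below builds a fake Ed25519-shaped key (not a real key) and parses it back the way sshd reads authorized_keys:

```python
import base64
import struct

def ssh_string(b):
    # OpenSSH wire format: 4-byte big-endian length, then the bytes.
    return struct.pack(">I", len(b)) + b

key_type = b"ssh-ed25519"
fake_key = bytes(range(32))                   # a real key is 32 random bytes
blob = ssh_string(key_type) + ssh_string(fake_key)

line = "ssh-ed25519 " + base64.b64encode(blob).decode() + " sean@laptop"

# Parse the three space-separated parts, then decode the blob:
ktype, b64, comment = line.split(" ", 2)
decoded = base64.b64decode(b64)
inner_len = struct.unpack(">I", decoded[:4])[0]
inner_type = decoded[4:4 + inner_len].decode()

print(ktype)        # ssh-ed25519
print(inner_type)   # ssh-ed25519  (embedded inside the blob itself)
print(comment)      # sean@laptop
```

The duplicated key type (outside and inside the blob) lets sshd sanity-check that the line was not corrupted or mislabeled.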
The Challenge-Response Flow -- Step by Step
When you SSH with key auth, the server NEVER sees your private key.
Instead, it uses a challenge-response protocol:

  ┌──────────┐                              ┌──────────┐
  │  CLIENT  │                              │  SERVER  │
  └────┬─────┘                              └────┬─────┘
       │                                          │
       │  1. TCP + SSH handshake (same as before)  │
       │    Encrypted channel established          │
       │◄────────────────────────────────────────►│
       │                                          │
       │  2. "I want to authenticate as sean       │
       │     using public key: ssh-ed25519 AAAA.." │
       │─────────────────────────────────────────►│
       │                                          │
       │              3. Server checks:            │
       │              Does /home/sean/.ssh/        │
       │              authorized_keys contain      │
       │              this public key?             │
       │                                          │
       │              YES → continue               │
       │              NO  → reject                 │
       │                                          │
       │  4. Server creates a CHALLENGE:           │
       │     A random session ID + data, encrypted │
       │     or hashed in a specific way.          │
       │     "Prove you own this key by signing    │
       │      this data."                          │
       │◄─────────────────────────────────────────│
       │                                          │
       │  5. Client uses PRIVATE KEY to create     │
       │     a digital signature of the challenge. │
       │     Private key never leaves the client.  │
       │                                          │
       │  6. Client sends the SIGNATURE (not the   │
       │     private key!) back to the server.     │
       │─────────────────────────────────────────►│
       │                                          │
       │              7. Server uses the PUBLIC    │
       │              KEY (from authorized_keys)   │
       │              to VERIFY the signature.     │
       │                                          │
       │              Only the matching private    │
       │              key could have produced      │
       │              this signature. Math proves  │
       │              the client has the key.      │
       │                                          │
       │  8. "Authentication successful"           │
       │◄─────────────────────────────────────────│

KEY INSIGHT: The private key NEVER leaves your machine. The server
only sees the public key and a signature. Even if someone intercepts
everything, they cannot log in because they cannot produce valid
signatures without the private key.
Why This Is Secure:

Private key = can sign data (prove identity)
Public key = can verify signatures (check identity)

You cannot derive the private key from the public key.
You cannot forge a signature without the private key.
Each signature is unique to the challenge, so replaying it is useless.
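The sign-with-private, verify-with-public asymmetry can be sketched with toy textbook RSA in pure Python. Real SSH uses Ed25519 or properly padded RSA, and this toy version (tiny unpadded keys) is nowhere near secure -- treat it strictly as an illustration of the math:

```python
import hashlib
import os

# Toy textbook-RSA keypair. These Mersenne primes are far too small for
# real use and unpadded signing is insecure -- illustration only.
p, q = 2**127 - 1, 2**61 - 1
n, e = p * q, 17                          # public key: (n, e)
d = pow(e, -1, (p - 1) * (q - 1))         # private key: d (Python 3.8+)

def sign(challenge: bytes) -> int:
    """Only the holder of the private exponent d can compute this."""
    h = int.from_bytes(hashlib.sha256(challenge).digest(), 'big') % n
    return pow(h, d, n)

def verify(challenge: bytes, signature: int) -> bool:
    """Anyone with the public key (n, e) can check the signature."""
    h = int.from_bytes(hashlib.sha256(challenge).digest(), 'big') % n
    return pow(signature, e, n) == h

challenge = os.urandom(32)                # the server's fresh challenge
sig = sign(challenge)
print(verify(challenge, sig))             # True: signature checks out
print(verify(os.urandom(32), sig))        # False: bound to this challenge
```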
Ed25519 vs RSA vs ECDSA -- Which Key Type?
┌──────────────┬────────────┬─────────────┬──────────────────────────┐
│ Algorithm    │ Key Size   │ Performance │ Recommendation           │
├──────────────┼────────────┼─────────────┼──────────────────────────┤
│ Ed25519      │ 256 bits   │ Fastest     │ USE THIS. Modern,        │
│              │            │             │ fast, small keys, secure │
├──────────────┼────────────┼─────────────┼──────────────────────────┤
│ RSA          │ 3072-4096  │ Slower      │ OK for compatibility.    │
│              │ bits       │             │ Use 4096 bits minimum.   │
│              │            │             │ Larger keys, slower.     │
├──────────────┼────────────┼─────────────┼──────────────────────────┤
│ ECDSA        │ 256-521    │ Fast        │ Avoid. Fragile --        │
│              │ bits       │             │ implementation bugs can  │
│              │            │             │ leak your private key.   │
│              │            │             │ (Needs perfect random    │
│              │            │             │ numbers for each sig.)   │
├──────────────┼────────────┼─────────────┼──────────────────────────┤
│ DSA          │ 1024 bits  │ --          │ DEPRECATED. Do not use.  │
│              │            │             │ Disabled in OpenSSH 7.0+ │
└──────────────┴────────────┴─────────────┴──────────────────────────┘

Generate the right key:
  $ ssh-keygen -t ed25519 -C "sean@laptop-2026"       # Best choice
  $ ssh-keygen -t rsa -b 4096 -C "sean@old-system"    # Compatibility
ssh-agent -- Stop Typing Your Passphrase
# If your private key has a passphrase (it should), you have to type
# it every time you SSH. ssh-agent caches the decrypted key in memory
# so you only type the passphrase ONCE per session.

# Start the agent (usually already running on most systems):
$ eval "$(ssh-agent -s)"
Agent pid 12345

# Add your key to the agent:
$ ssh-add ~/.ssh/id_ed25519
Enter passphrase for /home/sean/.ssh/id_ed25519: ********
Identity added: /home/sean/.ssh/id_ed25519 (sean@laptop)

# Now SSH connections use the cached key -- no passphrase prompt:
$ ssh myserver    # just works, no passphrase asked

# List keys currently in the agent:
$ ssh-add -l
256 SHA256:abc123... sean@laptop (ED25519)

# Remove all keys from agent (e.g., when leaving your desk):
$ ssh-add -D

# macOS: persist across reboots with Keychain:
$ ssh-add --apple-use-keychain ~/.ssh/id_ed25519

# Linux: add to your ~/.bashrc to auto-start the agent:
if [ -z "$SSH_AUTH_SOCK" ]; then
  eval "$(ssh-agent -s)" > /dev/null
  ssh-add ~/.ssh/id_ed25519 2> /dev/null
fi
SSH Agent Forwarding -- Use Your Keys on Remote Servers
# Problem: You SSH into ServerA, then from ServerA you want to SSH
# into ServerB (or git pull from GitHub). ServerA does not have your
# private key, and you should NOT copy it there.

# Solution: Agent forwarding. ServerA asks your LOCAL agent to sign
# the challenge, through the forwarded connection.

  ┌────────────┐         ┌────────────┐         ┌────────────┐
  │  Laptop    │   SSH   │  ServerA   │   SSH   │  ServerB   │
  │            │────────►│            │────────►│            │
  │ [ssh-agent]│         │ (no keys!) │         │            │
  │ [keys here]│◄ ─ ─ ─ ─│ forwards   │         │            │
  │            │ signing │ auth back  │         │            │
  └────────────┘ request └────────────┘         └────────────┘

# Enable for a single connection:
$ ssh -A sean@serverA

# Or in ~/.ssh/config:
Host serverA
    HostName 10.0.1.5
    ForwardAgent yes

# Then on serverA:
serverA$ ssh serverB    # works! Uses your laptop's key via forwarding
serverA$ git pull       # works for GitHub too, if your key is on GitHub
Agent Forwarding Security Risk
  • Only forward to servers you trust. A root user on the remote server can use your forwarded agent to authenticate as you to other servers while your session is active.
  • Use ProxyJump instead when possible. It is safer because the intermediate server never gets access to your agent. See the SSH Config Deep Dive below.
Multiple Keys for Different Servers
# You should use DIFFERENT keys for different purposes:
# - One for GitHub
# - One for your production servers
# - One for your homelab
# - One per device (laptop key, phone key, work computer key)

# Generate separate keys:
$ ssh-keygen -t ed25519 -f ~/.ssh/github_key -C "github"
$ ssh-keygen -t ed25519 -f ~/.ssh/prod_key -C "production-servers"
$ ssh-keygen -t ed25519 -f ~/.ssh/homelab_key -C "homelab"

# Tell SSH which key to use for which server in ~/.ssh/config:
Host github.com
    IdentityFile ~/.ssh/github_key

Host prod-*
    IdentityFile ~/.ssh/prod_key

Host homelab pi raspberry
    IdentityFile ~/.ssh/homelab_key

# Why separate keys?
# 1. If one key is compromised, only those servers are affected
# 2. You can revoke a single key without disrupting everything
# 3. You can see in server logs WHICH key was used to log in
# 4. Different keys can have different passphrases (or none for automation)

SSH Config File Deep Dive

The SSH config file (~/.ssh/config) is one of the most powerful and underused features of SSH. It lets you create aliases, set defaults, configure jump hosts, and manage dozens of servers without remembering any connection details.

Full ~/.ssh/config -- Managing Multiple Servers
# ──────────────────────────────────────────────────
# ~/.ssh/config -- Your SSH address book
# ──────────────────────────────────────────────────
# Each "Host" block defines a shortcut. When you type "ssh prod",
# SSH looks up the "prod" block and uses those settings.

# ── DEFAULTS FOR ALL CONNECTIONS ──
Host *
    # Keep connections alive (prevent "broken pipe" disconnects)
    ServerAliveInterval 60
    ServerAliveCountMax 3
    # Automatically add new hosts to known_hosts
    StrictHostKeyChecking accept-new
    # Reuse connections for speed (multiplexing)
    ControlMaster auto
    ControlPath ~/.ssh/sockets/%r@%h-%p
    ControlPersist 600
    # Only offer the identity configured for each host, not every key
    IdentitiesOnly yes

# ── GITHUB ──
Host github.com
    HostName github.com
    User git
    IdentityFile ~/.ssh/github_ed25519

# ── PRODUCTION SERVERS ──
Host prod
    HostName prod.example.com
    User deploy
    Port 2222
    IdentityFile ~/.ssh/prod_ed25519
    ForwardAgent no

Host staging
    HostName staging.example.com
    User deploy
    Port 2222
    IdentityFile ~/.ssh/prod_ed25519

# ── HOME LAB ──
Host pi
    HostName 192.168.1.100
    User pi
    IdentityFile ~/.ssh/homelab_ed25519

Host nas
    HostName 192.168.1.200
    User admin
    IdentityFile ~/.ssh/homelab_ed25519

# ── WORK (behind corporate VPN) ──
Host work
    HostName 10.0.5.25
    User sean.dev
    IdentityFile ~/.ssh/work_ed25519
    ProxyJump bastion

# ── CLOUD VPS ──
Host vps
    HostName 203.0.113.42
    User root
    IdentityFile ~/.ssh/vps_ed25519
Jump Hosts and ProxyJump -- Reaching Servers Behind Firewalls
# Many servers are NOT directly accessible from the internet.
# They sit behind a "bastion" or "jump" host:

  ┌──────────┐        ┌──────────────┐        ┌──────────────┐
  │  You     │──SSH──►│  Bastion      │──SSH──►│  Internal    │
  │ (laptop) │        │  (public IP)  │        │  Server      │
  └──────────┘        │  jump.example │        │  (private IP)│
   Internet           │  .com         │        │  10.0.5.25   │
                      └──────────────┘        └──────────────┘

# OLD WAY (two separate SSH commands):
$ ssh sean@jump.example.com
bastion$ ssh sean@10.0.5.25     # now you are on the internal server

# BETTER WAY -- ProxyJump (one command, one step):
$ ssh -J sean@jump.example.com sean@10.0.5.25

# BEST WAY -- put it in your config:
Host bastion
    HostName jump.example.com
    User sean
    IdentityFile ~/.ssh/work_ed25519

Host internal-server
    HostName 10.0.5.25
    User sean
    IdentityFile ~/.ssh/work_ed25519
    ProxyJump bastion

# Now just:
$ ssh internal-server
# SSH automatically connects through the bastion. One command.
# SCP and SFTP work through ProxyJump too:
$ scp report.pdf internal-server:~/

# CHAINING -- multiple jump hosts:
Host deeply-buried-server
    HostName 172.16.0.99
    ProxyJump bastion,middle-server
    # Goes: you → bastion → middle-server → 172.16.0.99
Wildcard Hosts and Pattern Matching
# Apply settings to GROUPS of hosts using wildcards:

# All servers at example.com get the same user and key:
Host *.example.com
    User deploy
    IdentityFile ~/.ssh/deploy_ed25519
    Port 2222

# All hosts starting with "dev-" use the dev key:
Host dev-*
    User developer
    IdentityFile ~/.ssh/dev_ed25519
    ForwardAgent yes

# All hosts starting with "prod-" use strict settings:
Host prod-*
    User deploy
    IdentityFile ~/.ssh/prod_ed25519
    ForwardAgent no
    LogLevel ERROR

# Now these all work automatically:
$ ssh web.example.com     # uses deploy user, deploy key, port 2222
$ ssh api.example.com     # same settings
$ ssh dev-backend         # uses developer user, dev key
$ ssh prod-database       # uses deploy user, prod key, no forwarding

# Negation patterns -- match everything EXCEPT:
Host * !github.com
    # These settings apply to all hosts except github.com
    ServerAliveInterval 60
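A rough model of this matching in Python, using fnmatch as a stand-in (an assumption for illustration -- OpenSSH has its own pattern matcher, but * wildcards and ! negation behave as shown):

```python
import fnmatch

def host_matches(patterns: str, host: str) -> bool:
    # Patterns on a Host line are whitespace-separated; a matching
    # negated pattern (!) overrides any positive match.
    matched = False
    for pattern in patterns.split():
        if pattern.startswith('!'):
            if fnmatch.fnmatchcase(host, pattern[1:]):
                return False
        elif fnmatch.fnmatchcase(host, pattern):
            matched = True
    return matched

print(host_matches('*.example.com', 'web.example.com'))  # True
print(host_matches('prod-*', 'prod-database'))           # True
print(host_matches('* !github.com', 'github.com'))       # False
print(host_matches('* !github.com', 'gitlab.com'))       # True
```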
SSH Connection Multiplexing -- Reuse Connections
# Opening an SSH connection takes time (TCP handshake, key exchange,
# authentication). Multiplexing lets subsequent connections reuse
# an existing one, making them nearly instant.

# In ~/.ssh/config:
Host *
    ControlMaster auto
    ControlPath ~/.ssh/sockets/%r@%h-%p
    ControlPersist 600

# ControlMaster auto   → first connection becomes the "master"
# ControlPath          → where to store the socket file
#                        %r = remote user, %h = host, %p = port
# ControlPersist 600   → keep master alive 600s after last session closes

# Create the sockets directory:
$ mkdir -p ~/.ssh/sockets

# Now:
$ ssh myserver            # first connection: normal speed (2-3 seconds)
$ ssh myserver            # second connection: instant (under 0.1 seconds)
$ scp file.txt myserver:~ # also instant, reuses the connection

# This is especially useful with ProxyJump (jump hosts), where each
# connection would otherwise require TWO handshakes.
SSH Config Best Practices
  • Use IdentitiesOnly yes in Host *: Without this, SSH tries ALL your keys against every server, which can trigger lockouts if you have many keys.
  • Set permissions: chmod 600 ~/.ssh/config -- SSH may refuse to use the config if it is world-readable.
  • Use ProxyJump over ForwardAgent: ProxyJump is safer because the jump host never gets access to your SSH agent.
  • Comment your config: Future you will forget what Host x7b-prod-02 is. Add comments with #.
  • Use StrictHostKeyChecking accept-new: Automatically accepts keys for NEW hosts but still warns if a KNOWN host's key changes (possible MITM).
Common SSH Mistakes
  • "Permission denied (publickey)": Your key is not in the server's ~/.ssh/authorized_keys, or file permissions are wrong. Run ssh-copy-id or check chmod 700 ~/.ssh && chmod 600 ~/.ssh/authorized_keys on the server.
  • "WARNING: REMOTE HOST IDENTIFICATION HAS CHANGED": The server's host key is different from what is in your ~/.ssh/known_hosts. This could be a MITM attack, or the server was reinstalled. Verify with the server admin before accepting.
  • Using FTP instead of SFTP: FTP sends passwords in plain text. Always use SFTP (port 22, SSH-based) instead of FTP (port 21). They are completely different protocols despite the similar names.
  • Leaving password auth enabled: Once SSH keys work, disable password authentication in /etc/ssh/sshd_config. Bots constantly try to brute-force SSH passwords on every public server.

14. Socket Programming

A socket is an endpoint for communication between two machines over a network. It is the lowest-level networking API most developers will ever use. Every HTTP request, every database connection, every WebSocket -- they all use sockets under the hood. Understanding sockets means understanding how network communication actually works at the code level.

What Is a Socket?
A socket is identified by 5 things (the "5-tuple"):

  1. Protocol     (TCP or UDP)
  2. Source IP    (your machine's address)
  3. Source Port  (assigned by your OS, e.g., 54321)
  4. Dest IP      (the server's address)
  5. Dest Port    (the service port, e.g., 80 for HTTP)

Think of it like a phone call:
  - IP address = phone number (which building)
  - Port = extension number (which desk in the building)
  - Socket = the active phone line between two extensions

  ┌──────────────────┐            ┌──────────────────┐
  │  Your Machine     │            │  Server           │
  │                   │            │                   │
  │  Socket:          │   TCP/UDP  │  Socket:          │
  │  192.168.1.5:54321│◄──────────►│  93.184.216.34:80 │
  │                   │            │                   │
  └──────────────────┘            └──────────────────┘

The OS manages sockets through file descriptors (on Linux/Mac)
or handles (on Windows). When you open a socket, the OS gives
you a number (like 3, 4, 5...) and you read/write to it like a file.
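You can watch the OS hand out the addresses, ports, and file descriptor by wiring up a real TCP connection over loopback:

```python
import socket

# Build a real TCP connection to ourselves and inspect both endpoints.
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(('127.0.0.1', 0))      # port 0 = let the OS pick a free port
server.listen(1)

client = socket.create_connection(server.getsockname())
conn, _ = server.accept()          # accept() returns a NEW per-client socket

local, remote = client.getsockname(), client.getpeername()
print('client side:', local, '->', remote)
print('server side:', conn.getsockname(), '->', conn.getpeername())
print('file descriptor:', client.fileno())  # just a number, like an open file

for s in (client, conn, server):
    s.close()
```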

TCP Sockets -- Reliable Communication

TCP sockets provide a reliable, ordered, bidirectional byte stream. Data arrives in order, nothing is lost, and you get an error if the connection drops. This is what HTTP, SSH, databases, and most applications use.

TCP Socket Lifecycle
SERVER SIDE:                          CLIENT SIDE:

1. socket()    ← create socket        1. socket()    ← create socket
2. bind()      ← assign address:port
3. listen()    ← start accepting
4. accept()    ← wait for client ──── 2. connect()   ← connect to server
   │                                      │
   ▼                                      ▼
5. recv()/send() ◄───────────────────► 3. send()/recv()
   │                                      │
   ▼                                      ▼
6. close()     ← disconnect           4. close()     ← disconnect

Key concepts:
  bind()    → "I want to listen on this specific address:port"
  listen()  → "Start queuing incoming connections" (backlog size)
  accept()  → BLOCKS until a client connects, then returns a NEW socket
              for that specific client (the original socket keeps listening)
  connect() → Initiates the TCP 3-way handshake (SYN → SYN-ACK → ACK)

TCP Server and Client

Python
import socket

# ─── TCP SERVER ───
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)  # AF_INET = IPv4, SOCK_STREAM = TCP
server.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)  # reuse port after restart
server.bind(('0.0.0.0', 8080))  # listen on all interfaces, port 8080
server.listen(5)                 # queue up to 5 pending connections
print("Server listening on port 8080...")

while True:
    client_socket, address = server.accept()  # blocks until client connects
    print(f"Connection from {address}")

    data = client_socket.recv(1024)  # receive up to 1024 bytes
    print(f"Received: {data.decode()}")

    client_socket.sendall(b"Hello from server!")  # sendall retries until every byte is sent
    client_socket.close()  # done with this client
Python
import socket

# ─── TCP CLIENT ───
client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
client.connect(('127.0.0.1', 8080))  # connect to server

client.sendall(b"Hello from client!")  # send data (sendall ensures nothing is dropped)

response = client.recv(1024)          # receive response
print(f"Server replied: {response.decode()}")

client.close()  # close connection
JavaScript
// ─── TCP SERVER (Node.js) ───
const net = require('net');

const server = net.createServer((socket) => {
    console.log(`Client connected: ${socket.remoteAddress}:${socket.remotePort}`);

    socket.on('data', (data) => {
        console.log(`Received: ${data.toString()}`);
        socket.write('Hello from server!');  // send response
    });

    socket.on('end', () => console.log('Client disconnected'));
    socket.on('error', (err) => console.error('Socket error:', err.message));
});

server.listen(8080, () => console.log('Server listening on port 8080'));
JavaScript
// ─── TCP CLIENT (Node.js) ───
const net = require('net');

const client = net.createConnection({ port: 8080, host: '127.0.0.1' }, () => {
    console.log('Connected to server');
    client.write('Hello from client!');  // send data
});

client.on('data', (data) => {
    console.log(`Server replied: ${data.toString()}`);
    client.end();  // close connection
});

client.on('end', () => console.log('Disconnected'));

UDP Sockets -- Fast, Connectionless

UDP sockets send individual messages (datagrams) with no connection, no ordering, and no delivery guarantee. They are faster and simpler than TCP because there is no handshake, no acknowledgment, and no retransmission. Use UDP for real-time data where speed matters more than reliability: gaming, video streaming, DNS lookups, voice calls.

Python
import socket

# ─── UDP SERVER ───
server = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)  # SOCK_DGRAM = UDP
server.bind(('0.0.0.0', 9090))
print("UDP server listening on port 9090...")

while True:
    data, addr = server.recvfrom(1024)  # receive data and sender address
    print(f"From {addr}: {data.decode()}")
    server.sendto(b"ACK", addr)          # reply to sender (optional)
Python
import socket

# ─── UDP CLIENT ───
client = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
client.sendto(b"Hello UDP!", ('127.0.0.1', 9090))  # no connect() needed

data, addr = client.recvfrom(1024)
print(f"Reply: {data.decode()}")
client.close()
JavaScript
// ─── UDP SERVER (Node.js) ───
const dgram = require('dgram');
const server = dgram.createSocket('udp4');

server.on('message', (msg, rinfo) => {
    console.log(`From ${rinfo.address}:${rinfo.port}: ${msg.toString()}`);
    server.send('ACK', rinfo.port, rinfo.address);  // reply
});

server.bind(9090, () => console.log('UDP server listening on port 9090'));
JavaScript
// ─── UDP CLIENT (Node.js) ───
const dgram = require('dgram');
const client = dgram.createSocket('udp4');

client.send('Hello UDP!', 9090, '127.0.0.1', (err) => {
    if (err) console.error(err);
});

client.on('message', (msg) => {
    console.log(`Reply: ${msg.toString()}`);
    client.close();
});
TCP vs UDP Sockets -- Side by Side
                    TCP (SOCK_STREAM)              UDP (SOCK_DGRAM)
────────────────────────────────────────────────────────────────────
Connection:         connect() + accept()           No connection needed
Sending:            send() / recv()                sendto() / recvfrom()
Delivery:           Guaranteed, in-order           Best effort, may be lost
Overhead:           Higher (handshake, ACKs)       Lower (just send)
Message boundary:   No (byte stream)               Yes (each sendto = 1 msg)
Use cases:          HTTP, SSH, databases           DNS, gaming, streaming

TCP is a byte stream -- if you send "Hello" and "World" separately,
you might recv() "HelloWorld" as one chunk. You must handle framing.

UDP preserves message boundaries -- each sendto() is one datagram.
If you send "Hello" and "World", you recvfrom() them as two separate messages.
But either one might get lost or arrive out of order.
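The boundary difference is easy to observe over loopback (a sketch -- local delivery is effectively reliable and ordered, which real networks are not):

```python
import socket

receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
receiver.bind(('127.0.0.1', 0))            # OS picks a free port
addr = receiver.getsockname()

sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sender.sendto(b'Hello', addr)              # one datagram
sender.sendto(b'World', addr)              # another datagram

receiver.settimeout(2)
first, _ = receiver.recvfrom(1024)
second, _ = receiver.recvfrom(1024)
print(first, second)   # b'Hello' b'World' -- two messages, boundaries kept
sender.close()
receiver.close()
```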

Handling Multiple Clients

A basic socket server handles one client at a time. In production, you need to handle many clients simultaneously. There are three main approaches:

Python
# Approach 1: Threading -- one thread per client
import socket
import threading

def handle_client(client_socket, address):
    print(f"[Thread] Handling {address}")
    while True:
        data = client_socket.recv(1024)
        if not data:
            break  # client disconnected
        client_socket.sendall(data)  # echo back
    client_socket.close()
    print(f"[Thread] {address} disconnected")

server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
server.bind(('0.0.0.0', 8080))
server.listen(5)

while True:
    client_socket, address = server.accept()
    thread = threading.Thread(target=handle_client, args=(client_socket, address))
    thread.start()  # each client gets its own thread
Python
# Approach 2: select() -- single-threaded multiplexing
import socket
import select

server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
server.bind(('0.0.0.0', 8080))
server.listen(5)
server.setblocking(False)

sockets_list = [server]  # track all sockets to monitor

while True:
    # select() blocks until at least one socket is ready
    readable, _, exceptional = select.select(sockets_list, [], sockets_list)

    for sock in readable:
        if sock is server:
            # New client connecting
            client_socket, address = server.accept()
            client_socket.setblocking(False)
            sockets_list.append(client_socket)
        else:
            # Existing client sent data
            data = sock.recv(1024)
            if data:
                sock.send(data)  # echo back
            else:
                sockets_list.remove(sock)
                sock.close()
Python
# Approach 3: asyncio -- modern async sockets
import asyncio

async def handle_client(reader, writer):
    addr = writer.get_extra_info('peername')
    print(f"Client connected: {addr}")

    while True:
        data = await reader.read(1024)
        if not data:
            break
        writer.write(data)  # echo back
        await writer.drain()

    writer.close()
    print(f"Client disconnected: {addr}")

async def main():
    server = await asyncio.start_server(handle_client, '0.0.0.0', 8080)
    print("Async server listening on port 8080")
    async with server:
        await server.serve_forever()

asyncio.run(main())
Multi-Client Approaches Compared
Approach        How It Works              Pros                    Cons
────────────────────────────────────────────────────────────────────────
Threading       One thread per client     Simple, intuitive       Memory heavy (1MB/thread),
                                                                  GIL limits Python CPU perf

select/poll     OS monitors many sockets  Low memory, no threads  Complex code, fd limits
                Single thread handles all                         (select: 1024 on some OS)

epoll/kqueue    Like select but O(1)      Scales to millions      Linux/BSD specific
                (used by nginx, Node.js)  of connections

asyncio/        Coroutines + event loop   Clean code, scalable    Requires async everywhere,
async-await     Single-threaded                                   learning curve

In production, most frameworks use epoll/kqueue under the hood:
  - Node.js uses libuv (epoll on Linux, kqueue on Mac)
  - Python asyncio uses selectors (auto-picks best: epoll/kqueue/select)
  - Go uses its own goroutine scheduler + epoll
  - Nginx uses epoll for handling 10K+ connections on one thread

Socket Options and Common Settings

Important Socket Options
# Python socket options you should know about

# SO_REUSEADDR -- Allow reusing a port immediately after the server stops
# Without this, you get "Address already in use" for ~60 seconds after restart
sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)

# TCP_NODELAY -- Disable Nagle's algorithm (send data immediately)
# Nagle batches small writes for efficiency, but adds latency
# Disable for real-time apps (gaming, interactive terminals)
sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)

# SO_KEEPALIVE -- Detect dead connections
# OS sends periodic probes on idle connections
# If peer doesn't respond, connection is closed automatically
sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)

# SO_RCVBUF / SO_SNDBUF -- Set receive/send buffer sizes
# Larger buffers can improve throughput for bulk transfers
sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, 65536)   # 64KB receive buffer
sock.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, 65536)   # 64KB send buffer

# Timeout -- Don't block forever on recv()
sock.settimeout(10.0)  # 10 second timeout, raises socket.timeout

Message Framing -- The #1 TCP Socket Mistake

TCP Does NOT Preserve Message Boundaries

The most common socket programming bug: assuming recv() returns exactly one "message." TCP is a byte stream, not a message stream. If you send "Hello" and "World" in two separate send() calls, the receiver might get "HelloWorld" in one recv(), or "Hel" and "loWorld" in two -- TCP makes no guarantees about how bytes are grouped.

Python
# WRONG -- will break with large messages or fast senders
data = sock.recv(1024)  # might get partial message or two messages merged

# RIGHT -- length-prefixed framing
import struct

def send_message(sock, message):
    data = message.encode()
    length = struct.pack('!I', len(data))  # 4-byte big-endian length prefix
    sock.sendall(length + data)             # send length + actual data

def recv_message(sock):
    # First, read exactly 4 bytes (the length prefix)
    raw_length = recv_exact(sock, 4)
    if not raw_length:
        return None
    length = struct.unpack('!I', raw_length)[0]
    # Then, read exactly 'length' bytes (the message)
    payload = recv_exact(sock, length)
    return payload.decode() if payload is not None else None

def recv_exact(sock, num_bytes):
    """Keep reading until we have exactly num_bytes."""
    data = b''
    while len(data) < num_bytes:
        chunk = sock.recv(num_bytes - len(data))
        if not chunk:
            return None  # connection closed
        data += chunk
    return data
JavaScript
// Node.js -- handling TCP framing with length prefix
const net = require('net');

function sendMessage(socket, message) {
    const data = Buffer.from(message, 'utf8');
    const header = Buffer.alloc(4);
    header.writeUInt32BE(data.length);  // 4-byte length prefix
    socket.write(Buffer.concat([header, data]));
}

// On the receiving side, collect chunks until you have a full message
function createParser(onMessage) {
    let buffer = Buffer.alloc(0);

    return (chunk) => {
        buffer = Buffer.concat([buffer, chunk]);

        while (buffer.length >= 4) {
            const msgLen = buffer.readUInt32BE(0);
            if (buffer.length < 4 + msgLen) break;  // incomplete message

            const message = buffer.slice(4, 4 + msgLen).toString('utf8');
            buffer = buffer.slice(4 + msgLen);  // consume the message
            onMessage(message);
        }
    };
}

// Usage
const server = net.createServer((socket) => {
    const parse = createParser((msg) => {
        console.log('Received complete message:', msg);
    });
    socket.on('data', parse);
});
Common Framing Strategies
Strategy              How It Works                         Used By
──────────────────────────────────────────────────────────────────────
Length prefix         4 bytes (uint32) + payload            Most binary protocols
Delimiter             Messages end with \n or \r\n          Redis, SMTP, FTP
Fixed size            Every message is exactly N bytes      Some game protocols
HTTP-style            Headers + Content-Length + body        HTTP/1.1
Type-Length-Value     1 byte type + 4 byte length + data    TLV-based protocols

The length-prefix approach is the most common and reliable.
Delimiter-based works for text protocols but fails if the
delimiter appears in the data (unless you escape it).
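A minimal sketch of the delimiter strategy, buffering partial lines across recv() calls the same way the length-prefix parser above does:

```python
class LineParser:
    """Split a TCP byte stream into newline-delimited messages."""
    def __init__(self):
        self.buffer = b''

    def feed(self, chunk: bytes):
        # Append the new bytes, then yield every complete line we now hold.
        self.buffer += chunk
        while b'\n' in self.buffer:
            line, self.buffer = self.buffer.split(b'\n', 1)
            yield line.decode()

parser = LineParser()
print(list(parser.feed(b'hel')))       # [] -- incomplete, stays buffered
print(list(parser.feed(b'lo\nwor')))   # ['hello']
print(list(parser.feed(b'ld\n')))      # ['world']
```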

Building a Simple Chat Server

Putting it all together -- a multi-client TCP chat server that broadcasts messages to all connected clients.

Python
import socket
import threading

clients = []  # list of connected client sockets
lock = threading.Lock()

def broadcast(message, sender):
    """Send a message to all clients except the sender."""
    with lock:
        for client in list(clients):  # iterate over a copy: we may remove entries
            if client != sender:
                try:
                    client.sendall(message)
                except OSError:
                    clients.remove(client)

def handle_client(client_socket, address):
    print(f"{address} connected")
    with lock:
        clients.append(client_socket)

    try:
        while True:
            data = client_socket.recv(1024)
            if not data:
                break
            message = f"{address}: {data.decode()}"
            print(message)
            broadcast(message.encode(), client_socket)
    finally:
        with lock:
            clients.remove(client_socket)
        client_socket.close()
        print(f"{address} disconnected")

server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
server.bind(('0.0.0.0', 8080))
server.listen(5)
print("Chat server running on port 8080")

while True:
    client_socket, address = server.accept()
    threading.Thread(target=handle_client, args=(client_socket, address)).start()
JavaScript
// Chat server in Node.js
const net = require('net');
const clients = new Set();

const server = net.createServer((socket) => {
    clients.add(socket);
    const addr = `${socket.remoteAddress}:${socket.remotePort}`;
    console.log(`${addr} connected`);

    socket.on('data', (data) => {
        const message = `${addr}: ${data.toString().trim()}`;
        console.log(message);
        // Broadcast to all OTHER clients
        for (const client of clients) {
            if (client !== socket && !client.destroyed) {
                client.write(message + '\n');
            }
        }
    });

    socket.on('end', () => {
        clients.delete(socket);
        console.log(`${addr} disconnected`);
    });

    socket.on('error', () => clients.delete(socket));
});

server.listen(8080, () => console.log('Chat server on port 8080'));
Socket Programming Key Takeaways
  • TCP = reliable stream. Use for most applications. Always implement message framing (length-prefix or delimiter).
  • UDP = fast datagrams. Use for real-time data where occasional loss is acceptable. Each sendto() is one message.
  • Always set SO_REUSEADDR on server sockets to avoid "Address already in use" errors during development.
  • Handle partial reads. recv(1024) can return anywhere from 1 to 1024 bytes. Never assume you got a complete message.
  • Use sendall() not send() in Python. send() may not send all bytes; sendall() keeps sending until everything is sent.
  • For production, use asyncio or an event loop, not threading. Threads work for learning but do not scale to thousands of connections.
  • Close sockets in a finally block to avoid leaking file descriptors.
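
The framing and partial-read rules above can be sketched in a few lines. This is a minimal length-prefix scheme, with helper names of my own choosing: every message is a 4-byte big-endian length followed by the payload, and the reader loops until it has every byte it asked for.

```python
import socket
import struct

def send_msg(sock: socket.socket, payload: bytes) -> None:
    # sendall() keeps sending until the whole frame is written
    sock.sendall(struct.pack("!I", len(payload)) + payload)

def recv_exactly(sock: socket.socket, n: int) -> bytes:
    # recv(n) may return anywhere from 1 to n bytes -- keep reading
    buf = b""
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("peer closed mid-message")
        buf += chunk
    return buf

def recv_msg(sock: socket.socket) -> bytes:
    (length,) = struct.unpack("!I", recv_exactly(sock, 4))
    return recv_exactly(sock, length)
```

With this in place, two back-to-back send_msg() calls arrive as two distinct messages, even though TCP itself delivers one undifferentiated byte stream.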

15. Network Programming Patterns

Understanding how servers handle thousands (or millions) of simultaneous connections is one of the most important concepts in modern backend engineering. This section covers the C10K problem, I/O models, and tracing a web request end-to-end.

The C10K Problem

In 1999, Dan Kegel asked: how do you handle 10,000 concurrent connections on a single server? At the time, most servers spawned one thread (or process) per connection. With 10K connections, that meant 10K threads -- each consuming memory for its stack, plus the overhead of context switching between them. Servers would grind to a halt.

Why One-Thread-Per-Connection Fails
One thread per connection at 10K connections:

  Thread stack size:   ~1 MB each in practice (Linux reserves up to
                       8 MB of virtual memory per thread by default)
  10,000 threads  =    ~10 GB of RAM just for stacks
  Context switches =   OS spends more time switching than doing work

  ┌────────────────────────────────────────────────┐
  │  CPU with one-thread-per-connection             │
  │                                                 │
  │  Time:  [T1][T2][T3]...[T9999][T10000][T1]...  │
  │          ▲                              ▲       │
  │          │                              │       │
  │       Thread 1 runs               Back to T1    │
  │       for ~10μs                   after 100ms!  │
  │                                                 │
  │  Each thread gets a tiny slice. Most time is    │
  │  spent SWITCHING, not doing useful work.        │
  └────────────────────────────────────────────────┘

Modern solutions all share one idea: use a small number of threads to handle many connections, by not blocking on any single one.

Modern Approaches to C10K+
┌─────────────────────────────────────────────────────────────────────┐
│  APPROACH 1: Event-Driven I/O (Node.js, Nginx)                      │
│                                                                     │
│  Single thread + event loop. Register interest in sockets,          │
│  OS notifies when data is ready. No blocking, no thread overhead.   │
│                                                                     │
│  1 thread handles 10,000+ connections:                              │
│                                                                     │
│  Event Loop: ──►[socket 1 ready]──►[process]──►[socket 47 ready]──► │
│                                                                     │
│  System calls: epoll (Linux), kqueue (macOS/BSD), IOCP (Windows)    │
├─────────────────────────────────────────────────────────────────────┤
│  APPROACH 2: Lightweight Green Threads (Go, Erlang)                  │
│                                                                     │
│  Go spawns goroutines (~2-8 KB stack each, not 1 MB).               │
│  M:N scheduling: M goroutines on N OS threads (N = num CPU cores).  │
│  The Go runtime multiplexes goroutines onto threads.                │
│                                                                     │
│  10,000 goroutines = ~80 MB (vs ~10 GB with OS threads)             │
│                                                                     │
│  OS Threads (N=8):  [T1][T2][T3][T4][T5][T6][T7][T8]               │
│  Goroutines (M=10K): g1,g2,...,g10000 distributed across T1-T8     │
├─────────────────────────────────────────────────────────────────────┤
│  APPROACH 3: io_uring (Linux 5.1+)                                   │
│                                                                     │
│  Shared ring buffer between userspace and kernel.                   │
│  Submit I/O requests to submission queue (SQ).                      │
│  Kernel completes them and puts results in completion queue (CQ).   │
│  Zero syscall overhead for batched I/O. The future of Linux I/O.   │
└─────────────────────────────────────────────────────────────────────┘
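
Approach 1 can be sketched with Python's stdlib selectors module, which picks epoll on Linux and kqueue on macOS/BSD automatically. This is a toy single-threaded echo server, not production code:

```python
import selectors
import socket

sel = selectors.DefaultSelector()

def accept(server_sock: socket.socket) -> None:
    conn, _addr = server_sock.accept()
    conn.setblocking(False)
    sel.register(conn, selectors.EVENT_READ, read)   # watch for data

def read(conn: socket.socket) -> None:
    data = conn.recv(1024)
    if data:
        conn.sendall(data)        # echo back (fine for small replies)
    else:                         # empty read means the peer closed
        sel.unregister(conn)
        conn.close()

def serve(server_sock: socket.socket) -> None:
    server_sock.setblocking(False)
    sel.register(server_sock, selectors.EVENT_READ, accept)
    while True:
        # The ONLY place this thread blocks: "is any socket ready?"
        for key, _mask in sel.select(timeout=1):
            key.data(key.fileobj)  # dispatch to accept() or read()
```

One thread, one loop, and every connection shares it. Compare with the threaded chat server in section 14, where each connection costs a whole thread.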
Nginx vs Apache -- Connection Handling
APACHE (prefork MPM):                   NGINX:
┌────────────────────────┐              ┌──────────────────────────┐
│  Master Process        │              │  Master Process          │
│  ├── Worker Process 1  │              │  ├── Worker 1 (1 thread) │
│  │   └── 1 connection  │              │  │   └── 10K connections │
│  ├── Worker Process 2  │              │  ├── Worker 2 (1 thread) │
│  │   └── 1 connection  │              │  │   └── 10K connections │
│  ├── Worker Process 3  │              │  ├── Worker 3 (1 thread) │
│  │   └── 1 connection  │              │  │   └── 10K connections │
│  ...                   │              │  ...                     │
│  └── Worker Process N  │              │  └── Worker N (1 thread) │
│      └── 1 connection  │              │      └── 10K connections │
│                        │              │                          │
│  10K connections =     │              │  10K connections =       │
│  10K processes         │              │  1-4 workers             │
│  ~10 GB RAM            │              │  ~50 MB RAM              │
└────────────────────────┘              └──────────────────────────┘

Apache forks a process per connection (or thread with worker MPM).
Nginx uses event-driven I/O -- each worker handles thousands of
connections using epoll/kqueue. That is why Nginx dominates.

Blocking vs Non-Blocking I/O

Every network operation involves waiting for data. How your program waits determines how many connections it can handle.

The Four I/O Models
MODEL 1: BLOCKING I/O
─────────────────────
Application          Kernel
    │                  │
    │── read() ───────►│
    │   (thread blocks) │── wait for data ──►
    │   (sleeping...)   │◄── data arrives ───
    │◄── data ─────────│
    │                  │
Thread is STUCK until data arrives. Simple code, but one thread
per connection. Cannot scale past a few thousand connections.

MODEL 2: NON-BLOCKING I/O
──────────────────────────
Application          Kernel
    │                  │
    │── read() ───────►│
    │◄── EWOULDBLOCK ──│  (no data yet)
    │── read() ───────►│
    │◄── EWOULDBLOCK ──│  (still no data)
    │── read() ───────►│
    │◄── EWOULDBLOCK ──│  (nope)
    │── read() ───────►│
    │◄── data ─────────│  (finally!)
    │                  │
Thread is NOT stuck, but wastes CPU spinning (busy-wait).
Rarely used alone -- usually combined with I/O multiplexing.

MODEL 3: I/O MULTIPLEXING (select / poll / epoll)
──────────────────────────────────────────────────
Application          Kernel
    │                  │
    │── epoll_wait() ──►│  "tell me when ANY of these
    │   (blocks here)   │   1000 sockets have data"
    │                  │
    │                  │── socket 47 has data!
    │◄── socket 47 ────│
    │── read(47) ──────►│
    │◄── data ─────────│
    │                  │
ONE thread watches MANY sockets. Only blocks on epoll_wait(),
then processes whichever sockets are ready. This is what
Node.js, Nginx, and Redis use.

MODEL 4: ASYNC I/O (io_uring, IOCP)
────────────────────────────────────
Application          Kernel
    │                  │
    │── submit read ───►│  "read socket 47 into this buffer"
    │   (returns        │
    │    immediately)   │── kernel does the read ──►
    │                  │◄── done ──────────────────
    │◄── completion ───│  "buffer is filled, here you go"
    │                  │
Application does OTHER WORK while kernel handles I/O.
True async -- no blocking, no polling. Most efficient model.
io_uring on Linux, IOCP on Windows.
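
Models 2 and 3 are easy to see in miniature from Python: a non-blocking recv() fails fast with EWOULDBLOCK (surfaced as BlockingIOError), and select() is the "wake me when something is readable" call that replaces the busy-wait:

```python
import select
import socket

a, b = socket.socketpair()
a.setblocking(False)

try:
    a.recv(1024)                   # Model 2: nothing sent yet
except BlockingIOError:
    print("EWOULDBLOCK: no data yet, but the thread is not stuck")

b.sendall(b"ping")
readable, _, _ = select.select([a], [], [])  # Model 3: block until ready
assert readable == [a]
print(a.recv(1024))                # guaranteed not to block now  → b'ping'
```

epoll follows the same readiness pattern as select() here, just with O(1) dispatch instead of scanning every descriptor.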
I/O Model Comparison
┌──────────────────┬──────────────────┬─────────────┬───────────────┐
│ Model            │ Threads Needed   │ Complexity  │ Throughput    │
├──────────────────┼──────────────────┼─────────────┼───────────────┤
│ Blocking         │ 1 per connection │ Very Low    │ Low (~1K conn)│
│ Non-blocking     │ 1 (busy-wait)    │ Medium      │ Medium        │
│ I/O Multiplexing │ 1 (or few)       │ Medium-High │ High (~100K)  │
│ Async (io_uring) │ 1 (or few)       │ High        │ Highest (1M+) │
└──────────────────┴──────────────────┴─────────────┴───────────────┘
select() vs poll() vs epoll()
select():
  - Oldest (POSIX). Works everywhere.
  - Limited to 1024 file descriptors (FD_SETSIZE).
  - O(n) -- kernel checks ALL fds every time.
  - Must rebuild the fd set after every call.

poll():
  - No fd limit (uses an array, not a bitmask).
  - Still O(n) -- kernel checks all fds.
  - Slightly better API than select().

epoll() (Linux only):
  - O(1) for ready events. Kernel tracks state internally.
  - No fd limit. Handles millions of connections.
  - Two modes: level-triggered (default) and edge-triggered.
  - Edge-triggered: only notifies on STATE CHANGES (more efficient
    but you must read ALL available data or you miss events).

kqueue (macOS/BSD):
  - Similar to epoll. Single syscall for both registering and waiting.
  - Handles various event types (sockets, files, signals, timers).
Which I/O Model Should You Use?
  • Learning/prototyping: Blocking I/O with threads. Simple and understandable.
  • Production web servers: I/O multiplexing (epoll). This is what Node.js and Nginx use under the hood.
  • High-performance systems: io_uring for Linux. Still newer, but rapidly being adopted (used in TigerBeetle, Bun, etc.).
  • Go programs: Just use goroutines. Go's runtime does I/O multiplexing for you automatically via the netpoller.

How a Web Request Actually Works (End-to-End)

When you type a URL and hit Enter, here is everything that happens, step by step, with approximate timings.

Complete Journey of a Web Request
You type: https://example.com/page

STEP 1: URL PARSING (~0ms)
  Browser extracts: protocol=https, host=example.com, path=/page

STEP 2: DNS RESOLUTION (~20-120ms, or 0ms if cached)
  Browser cache → OS cache → Router cache → ISP DNS → Root DNS
  Result: example.com → 93.184.216.34

  Your Machine            DNS Resolver         Root DNS    .com TLD    Auth DNS
      │──── A example.com? ──►│                    │           │           │
      │                       │── where is .com? ──►│           │           │
      │                       │◄── try 192.5.6.30 ──│           │           │
      │                       │── where is          │           │           │
      │                       │   example.com? ─────────────────►│           │
      │                       │◄── try 199.43.135.53 ───────────│           │
      │                       │── A example.com? ───────────────────────────►│
      │                       │◄── 93.184.216.34 ───────────────────────────│
      │◄── 93.184.216.34 ─────│

STEP 3: TCP HANDSHAKE (~20-100ms)
  Client                    Server (93.184.216.34:443)
      │── SYN ──────────────────►│
      │◄── SYN-ACK ─────────────│
      │── ACK ──────────────────►│
  Connection established. 1.5 round trips.

STEP 4: TLS HANDSHAKE (~30-150ms, adds 1-2 more round trips)
  Client                    Server
      │── ClientHello ──────────►│  (supported ciphers, TLS version)
      │◄── ServerHello ─────────│  (chosen cipher, certificate)
      │   (verify certificate    │
      │    against CA bundle)    │
      │── Key Exchange ──────────►│  (Diffie-Hellman params)
      │◄── Finished ────────────│
      │── Finished ──────────────►│
  Symmetric encryption key established. All data encrypted from here.

STEP 5: HTTP REQUEST (~1ms to send)
  GET /page HTTP/1.1
  Host: example.com
  User-Agent: Mozilla/5.0 ...
  Accept: text/html
  Accept-Encoding: gzip, br
  Connection: keep-alive

STEP 6: SERVER PROCESSING (~5-500ms)
  Web server (Nginx) receives request
  → Reverse proxy to application server (Node/Django/Go)
  → Application queries database (~5-50ms)
  → Application renders HTML template (~1-10ms)
  → Response sent back through Nginx

STEP 7: HTTP RESPONSE (~5-50ms transfer)
  HTTP/1.1 200 OK
  Content-Type: text/html; charset=UTF-8
  Content-Encoding: gzip
  Content-Length: 3842
  Cache-Control: max-age=3600

  <!DOCTYPE html><html>...</html>

STEP 8: BROWSER RENDERING (~50-500ms)
  Parse HTML → Build DOM tree
  Parse CSS → Build CSSOM
  Combine → Render tree
  Layout → Paint → Composite
  Execute JavaScript

TOTAL: ~150ms - 1500ms for a typical page load
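
Step 2, seen from application code, is a single resolver call -- the whole cache-and-nameserver chain above sits behind it. The sketch below uses "localhost" so it runs without network access; substitute any hostname:

```python
import socket

# getaddrinfo() is the call browsers and curl make under the hood.
# It consults the OS resolver (caches, /etc/hosts, configured DNS).
infos = socket.getaddrinfo("localhost", 443, type=socket.SOCK_STREAM)
for family, _type, _proto, _canon, sockaddr in infos:
    print(family.name, "->", sockaddr[0])   # e.g. AF_INET -> 127.0.0.1
```

A hostname can resolve to several addresses (IPv4 and IPv6); clients typically try them in order, which is what curl's "Trying ..." lines show.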

What curl -v Shows You at Each Stage

The curl -v (verbose) flag shows you every step of the request in real time. Here is real output annotated with what each line means.

Annotated curl -v Output
$ curl -v https://example.com

# ── DNS Resolution ──
* Host example.com:443 was resolved.
* IPv6: 2606:2800:21f:cb07:6820:80da:af6b:8b2c
* IPv4: 93.184.216.34
*   Trying 93.184.216.34:443...          ← connecting to resolved IP

# ── TCP Handshake ──
* Connected to example.com (93.184.216.34) port 443
                                          ← TCP 3-way handshake done

# ── TLS Handshake ──
* ALPN: curl offers h2,http/1.1          ← client supports HTTP/2
* TLSv1.3 (OUT), TLS handshake, Client hello
* TLSv1.3 (IN), TLS handshake, Server hello
* TLSv1.3 (IN), TLS handshake, Certificate
* TLSv1.3 (IN), TLS handshake, CERT verify
* TLSv1.3 (OUT), TLS handshake, Finished
* SSL connection using TLSv1.3 / TLS_AES_256_GCM_SHA384
                                          ← TLS established, cipher chosen
* Server certificate:
*  subject: CN=www.example.org            ← who the cert is for
*  issuer: C=US; O=DigiCert Inc           ← who signed the cert (CA)
*  expire date: Mar 14 23:59:59 2026 GMT  ← cert expiry

# ── HTTP Request ──
> GET / HTTP/2                            ← request method and path
> Host: example.com                       ← which domain (for shared IPs)
> User-Agent: curl/8.5.0                  ← who is making the request
> Accept: */*                             ← accepted content types

# ── HTTP Response ──
< HTTP/2 200                              ← status code (200 = OK)
< content-type: text/html; charset=UTF-8  ← response format
< content-length: 1256                    ← body size in bytes
< cache-control: max-age=604800           ← cache for 7 days
< age: 337133                             ← seconds since cached by CDN

# ── Response Body ──
<!doctype html>
<html>...</html>
Useful curl Flags for Network Debugging
  • curl -v -- verbose output showing every step
  • curl -w "\n%{time_namelookup} %{time_connect} %{time_appconnect} %{time_total}\n" -- timing breakdown for DNS, TCP, TLS, total
  • curl -I -- fetch only headers (HEAD request)
  • curl --resolve example.com:443:1.2.3.4 -- bypass DNS and connect to a specific IP
  • curl -k -- skip TLS certificate verification (dev only!)

16. NAT (Network Address Translation)

IPv4 only provides about 4.3 billion addresses, but there are far more devices on the internet. NAT solves this by letting multiple devices on a local network share a single public IP address. Your home router is a NAT device -- every phone, laptop, and smart device in your house appears to the outside world as one IP.

How NAT Works
Your home network:                           The Internet:

┌──────────────────────────────────────┐     ┌──────────────────┐
│  Phone:     192.168.1.10             │     │  example.com     │
│  Laptop:    192.168.1.20             │     │  93.184.216.34   │
│  Desktop:   192.168.1.30             │     └──────────────────┘
│                                      │              ▲
│          ┌───────────────┐           │              │
│          │ Router (NAT)  │           │  Public IP:  │
│          │               │           │ 73.45.123.89 │
│          │ Internal:     │           │              │
│          │ 192.168.1.1   │───────────┼──────────────┘
│          │ External:     │           │
│          │ 73.45.123.89  │           │
│          └───────────────┘           │
└──────────────────────────────────────┘

All three devices share ONE public IP: 73.45.123.89
NAT Translation Table
When your laptop (192.168.1.20) visits example.com:

OUTGOING packet (laptop → internet):
  Original:    src=192.168.1.20:54321  dst=93.184.216.34:443
  After NAT:   src=73.45.123.89:10001  dst=93.184.216.34:443
                ▲                ▲
                public IP        NAT-assigned port

Router's NAT table:
┌──────────────────────────┬───────────────────────────┐
│ Internal (LAN)           │ External (WAN)            │
├──────────────────────────┼───────────────────────────┤
│ 192.168.1.20:54321       │ 73.45.123.89:10001        │
│ 192.168.1.10:48000       │ 73.45.123.89:10002        │
│ 192.168.1.30:60123       │ 73.45.123.89:10003        │
└──────────────────────────┴───────────────────────────┘

INCOMING reply (internet → laptop):
  Arrives at:  dst=73.45.123.89:10001
  Router looks up port 10001 → 192.168.1.20:54321
  Forwards to: dst=192.168.1.20:54321

The external server never sees your internal 192.168.x.x address.
It only sees 73.45.123.89.

Why NAT Breaks Peer-to-Peer

If both Alice and Bob are behind NAT, neither can directly connect to the other. Alice's router will drop incoming connections from Bob because there is no NAT table entry for them -- Bob never initiated a connection from inside the network.

NAT Traversal: STUN, TURN, ICE
The P2P problem:

  Alice (behind NAT)              Bob (behind NAT)
  Internal: 192.168.1.5           Internal: 10.0.0.8
  Public:   73.45.123.89          Public:   98.76.54.32

  Alice → Bob?  Bob's router drops it (no NAT entry)
  Bob → Alice?  Alice's router drops it (no NAT entry)

SOLUTION 1: STUN (Session Traversal Utilities for NAT)
  - Alice asks a STUN server: "What is my public IP:port?"
  - STUN replies: "You are 73.45.123.89:10001"
  - Alice tells Bob (via a signaling server) her public address
  - Both sides send packets simultaneously → NAT creates entries
  - Works ~85% of the time (fails with symmetric NAT)

SOLUTION 2: TURN (Traversal Using Relays around NAT)
  - A relay server in the middle forwards all traffic
  - Alice → TURN server → Bob
  - Always works but adds latency and costs bandwidth
  - Used as fallback when STUN fails

SOLUTION 3: ICE (Interactive Connectivity Establishment)
  - Used by WebRTC and VoIP
  - Tries multiple strategies in order:
    1. Direct connection (if on same network)
    2. STUN (punch through NAT)
    3. TURN (relay as last resort)
  - Picks the best working path automatically

Port Forwarding

If you want to run a server behind NAT (e.g., a game server, a self-hosted website), you must manually tell your router to forward traffic on a specific external port to your internal machine.

Port Forwarding Configuration
Router port forwarding rule:

  External port 8080 → Internal 192.168.1.30:8080

  Someone on the internet connects to 73.45.123.89:8080
  → Router forwards to 192.168.1.30:8080
  → Your server receives the connection

Common port forwarding use cases:
  - Minecraft server: forward port 25565
  - Web server: forward port 80 and 443
  - SSH access: forward port 22 (or a custom port like 2222)
  - Security cameras: forward camera's web port
NAT is Not a Firewall

NAT hides your internal IPs, but that is a side effect, not a security feature. NAT was designed to solve address exhaustion, not to protect you. A proper firewall filters traffic based on rules. Do not rely on NAT alone for security.

17. Firewalls & iptables

A firewall inspects network packets and decides whether to allow, drop, or reject them based on a set of rules. On Linux, the built-in firewall is implemented via iptables (or its modern replacement, nftables). For simpler configuration, ufw (Uncomplicated Firewall) provides a friendly frontend.

iptables Chains
Every packet passes through one or more CHAINS of rules:

                     ┌──────────────┐
  Incoming packet ──►│   INPUT      │──► Local process (your app)
                     └──────────────┘

                     ┌──────────────┐
  Local process ────►│   OUTPUT     │──► Outgoing packet
                     └──────────────┘

                     ┌──────────────┐
  Forwarded packet ─►│   FORWARD    │──► Another network interface
  (router/gateway)   └──────────────┘

Each chain has an ordered list of rules. For each packet:
  1. Check rule 1 → match? → ACCEPT / DROP / REJECT
  2. Check rule 2 → match? → ACCEPT / DROP / REJECT
  3. ...
  N. No rules matched → apply DEFAULT POLICY (usually DROP)
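
The first-match-wins evaluation above can be sketched as a toy rule engine. The rule format here is my own simplification, not iptables syntax:

```python
# An INPUT-style chain: checked top to bottom, first match wins.
RULES = [
    {"proto": "tcp", "dport": 22,  "action": "ACCEPT"},
    {"proto": "tcp", "dport": 443, "action": "ACCEPT"},
    {"proto": "icmp",              "action": "ACCEPT"},
]
DEFAULT_POLICY = "DROP"

def verdict(packet: dict) -> str:
    for rule in RULES:
        # a rule matches if every field it specifies equals the packet's
        if all(packet.get(k) == v for k, v in rule.items() if k != "action"):
            return rule["action"]          # first match wins, stop here
    return DEFAULT_POLICY                  # no rule matched

print(verdict({"proto": "tcp", "dport": 22}))    # -> ACCEPT
print(verdict({"proto": "tcp", "dport": 3306}))  # -> DROP
print(verdict({"proto": "icmp"}))                # -> ACCEPT
```

Rule order matters for exactly this reason: a broad DROP placed before a narrow ACCEPT shadows it completely.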

Stateful vs Stateless Firewalls

Stateful vs Stateless
STATELESS FIREWALL:
  - Examines each packet independently.
  - No memory of previous packets.
  - You must create rules for BOTH directions:
    "Allow outgoing to port 443" AND "Allow incoming from port 443"
  - Simpler but harder to configure correctly.

STATEFUL FIREWALL (iptables with conntrack):
  - Tracks connection state (NEW, ESTABLISHED, RELATED).
  - If you allow an outgoing connection, replies are automatically allowed.
  - Much easier to configure:
    "Allow outgoing to port 443" → replies come back automatically.
  - iptables is stateful by default when using -m conntrack.

  Example of connection tracking:
  ┌─────────────┐                    ┌────────────┐
  │ Your Server │── SYN (NEW) ──────►│ Remote     │
  │             │◄── SYN-ACK ────────│            │
  │             │   (ESTABLISHED)    │            │
  │             │── ACK ────────────►│            │
  │             │   (ESTABLISHED)    │            │
  └─────────────┘                    └────────────┘

  conntrack table: src=10.0.0.1 dst=93.184.216.34 sport=54321
  dport=443 state=ESTABLISHED
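
The conntrack idea boils down to one lookup: remember flows you initiated, and admit an incoming packet only if it is the reply to one of them. A toy sketch (real conntrack also tracks protocol, state transitions, and timeouts):

```python
conntrack: set = set()

def outgoing(src: str, sport: int, dst: str, dport: int) -> None:
    # record the flow -- roughly, it becomes ESTABLISHED
    conntrack.add((src, sport, dst, dport))

def incoming_allowed(src: str, sport: int, dst: str, dport: int) -> bool:
    # a reply swaps the endpoints: remote (src) -> our socket (dst)
    return (dst, dport, src, sport) in conntrack

outgoing("10.0.0.1", 54321, "93.184.216.34", 443)
print(incoming_allowed("93.184.216.34", 443, "10.0.0.1", 54321))  # -> True
print(incoming_allowed("203.0.113.50", 443, "10.0.0.1", 54321))   # -> False
```

This is what the -m conntrack --ctstate ESTABLISHED,RELATED rule in the next block implements: one rule, and every reply to your outgoing traffic is admitted automatically.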

Common iptables Rules

Practical iptables Commands
# View current rules
sudo iptables -L -n -v

# Set default policies: drop everything, then whitelist
sudo iptables -P INPUT DROP
sudo iptables -P FORWARD DROP
sudo iptables -P OUTPUT ACCEPT      # allow all outgoing

# Allow loopback (localhost) traffic
sudo iptables -A INPUT -i lo -j ACCEPT

# Allow established and related connections (stateful)
sudo iptables -A INPUT -m conntrack --ctstate ESTABLISHED,RELATED -j ACCEPT

# Allow SSH (port 22)
sudo iptables -A INPUT -p tcp --dport 22 -j ACCEPT

# Allow HTTP and HTTPS (ports 80, 443)
sudo iptables -A INPUT -p tcp --dport 80 -j ACCEPT
sudo iptables -A INPUT -p tcp --dport 443 -j ACCEPT

# Allow ping (ICMP)
sudo iptables -A INPUT -p icmp --icmp-type echo-request -j ACCEPT

# Block a specific IP
sudo iptables -A INPUT -s 203.0.113.50 -j DROP

# Rate-limit SSH to prevent brute force (max 3 new connections per minute)
sudo iptables -A INPUT -p tcp --dport 22 -m conntrack --ctstate NEW \
  -m limit --limit 3/min --limit-burst 3 -j ACCEPT

# Log dropped packets (for debugging)
sudo iptables -A INPUT -j LOG --log-prefix "IPT-DROP: " --log-level 4

# Save rules (Debian/Ubuntu)
sudo iptables-save > /etc/iptables/rules.v4

# Delete a rule (by line number -- use iptables -L --line-numbers)
sudo iptables -D INPUT 5

ufw -- The Simpler Frontend

Practical ufw Commands
# Enable ufw (careful -- may lock you out of SSH!)
sudo ufw enable

# Default: deny incoming, allow outgoing
sudo ufw default deny incoming
sudo ufw default allow outgoing

# Allow SSH (ALWAYS do this before enabling ufw on a remote server!)
sudo ufw allow ssh         # or: sudo ufw allow 22/tcp

# Allow HTTP and HTTPS
sudo ufw allow 80/tcp
sudo ufw allow 443/tcp

# Allow a port range
sudo ufw allow 8000:8100/tcp

# Allow from a specific IP only
sudo ufw allow from 10.0.0.5 to any port 22

# Allow from a subnet
sudo ufw allow from 192.168.1.0/24

# Deny a specific port
sudo ufw deny 3306/tcp     # block MySQL from outside

# Check status and rules
sudo ufw status verbose

# Delete a rule
sudo ufw delete allow 80/tcp

# Reset all rules
sudo ufw reset
Do Not Lock Yourself Out

If you are configuring a firewall on a remote server (VPS), always allow SSH first before enabling the firewall. If you block port 22 and enable the firewall, you will lose access to your server. Many hosting providers offer an emergency console, but prevention is better.

Firewall Key Takeaways
  • Default deny is the safest approach. Block everything, then whitelist only what you need.
  • Use ufw unless you need advanced features. It generates iptables rules under the hood.
  • Always allow ESTABLISHED,RELATED connections so replies to your outgoing connections are not blocked.
  • Rate-limit SSH to prevent brute-force attacks. Combine with fail2ban for even better protection.
  • Save your rules -- iptables rules are lost on reboot unless you persist them.

18. VPNs & Tunneling

A VPN (Virtual Private Network) creates an encrypted tunnel between your device and a remote server. All your traffic flows through this tunnel, hiding it from your ISP, Wi-Fi snoopers, and anyone else on the network path. VPNs are also used to connect offices, access internal resources, and build secure overlay networks.

How a VPN Works
WITHOUT VPN:
  Your Device ──► ISP ──► Internet ──► Destination
                  ▲
                  │ ISP can see ALL your traffic
                  │ (which sites, what data if not HTTPS)

WITH VPN:
  Your Device ══► ISP ══► VPN Server ──► Internet ──► Destination
       │          │           │
       │    Encrypted tunnel  │
       │    ISP sees only     │
       │    "gibberish" going │
       │    to VPN server IP  │

  ══► = encrypted tunnel
  ──► = normal traffic

Your real IP is hidden. Destination sees VPN server's IP.
ISP sees encrypted traffic to a single IP (the VPN server).

WireGuard vs OpenVPN

VPN Protocol Comparison
┌──────────────────────┬────────────────────────┬──────────────────────┐
│                      │ WireGuard              │ OpenVPN              │
├──────────────────────┼────────────────────────┼──────────────────────┤
│ Code size            │ ~4,000 lines           │ ~100,000+ lines      │
│ Speed                │ Very fast              │ Slower               │
│ Crypto               │ Modern (ChaCha20,      │ Configurable (can    │
│                      │ Curve25519, BLAKE2s)   │ use outdated ciphers)│
│ Connection           │ Instant (1 RTT)        │ Slow (multi-step     │
│                      │                        │ TLS handshake)       │
│ Runs as              │ Kernel module          │ Userspace daemon     │
│ Configuration        │ Simple config file     │ Complex config file  │
│ Roaming              │ Handles IP changes     │ Reconnects on change │
│ Audit surface        │ Small, audited         │ Large, complex       │
│ Platform             │ Linux, macOS, Windows, │ Everything           │
│                      │ iOS, Android           │                      │
├──────────────────────┼────────────────────────┼──────────────────────┤
│ When to use          │ Default choice for     │ Legacy systems,      │
│                      │ new setups             │ or need TCP mode     │
└──────────────────────┴────────────────────────┴──────────────────────┘

Mesh VPNs: Tailscale & ZeroTier

Traditional VPN vs Mesh VPN
TRADITIONAL VPN (hub-and-spoke):
  All traffic goes through a central server.

  Device A ──► VPN Server ──► Device B
  Device C ──► VPN Server ──► Device D

  Problem: VPN server is a bottleneck and single point of failure.
  All traffic takes a detour through the server, even if devices
  are on the same network.

MESH VPN (Tailscale, ZeroTier):
  Devices connect DIRECTLY to each other (peer-to-peer).

  Device A ◄───────────────► Device B
      ▲                         ▲
      │                         │
      └────── Device C ─────────┘
              ▲
              │
           Device D

  A coordination server handles key exchange and NAT traversal,
  but actual data flows directly between devices.

  Tailscale: Built on WireGuard. Uses DERP relay servers as fallback.
  ZeroTier:  Custom protocol. Creates virtual Layer 2 network.

  Use cases:
  - Access your home server from anywhere
  - Connect dev machines across locations
  - Replace traditional VPNs for remote teams
  - Self-hosted alternative: Headscale (open-source Tailscale control server)

SSH Tunneling

SSH is not just for remote shells. It can create encrypted tunnels to forward traffic between ports. This is incredibly useful for accessing services behind firewalls or encrypting insecure protocols.

SSH Local Port Forwarding
# LOCAL PORT FORWARDING (-L)
# "Make a remote service available on my local machine"
#
# Syntax: ssh -L local_port:remote_host:remote_port user@ssh_server

# Example: Access a database on a remote server that only allows
# local connections (port 5432 is not exposed to the internet)

ssh -L 5432:localhost:5432 user@myserver.com

# Now connect to localhost:5432 on YOUR machine
# → traffic is tunneled through SSH to myserver.com
# → myserver.com connects to its own localhost:5432 (PostgreSQL)

# Diagram:
#   Your Machine              SSH Server (myserver.com)
#   localhost:5432 ═══════════► localhost:5432 (PostgreSQL)
#        ▲         encrypted          │
#        │         SSH tunnel          │
#   Your app connects                 Database lives here
#   to localhost:5432                  (not exposed to internet)
SSH Remote Port Forwarding
# REMOTE PORT FORWARDING (-R)
# "Make my local service available on the remote server"
#
# Syntax: ssh -R remote_port:local_host:local_port user@ssh_server

# Example: Expose your local dev server (port 3000) to the internet
# through a VPS with a public IP

ssh -R 8080:localhost:3000 user@myvps.com

# Now anyone can visit http://myvps.com:8080
# → traffic is tunneled back to YOUR machine's port 3000

# Diagram:
#   Your Machine              VPS (myvps.com)
#   localhost:3000 ◄═══════════ 0.0.0.0:8080
#        │         encrypted          ▲
#        │         SSH tunnel          │
#   Your dev server               Internet users
#   (React, Flask, etc.)          visit myvps.com:8080
SSH Dynamic Port Forwarding (SOCKS Proxy)
# DYNAMIC PORT FORWARDING (-D)
# "Route ALL my traffic through the SSH server (like a VPN)"
#
# Syntax: ssh -D local_port user@ssh_server

ssh -D 1080 user@myserver.com

# Configure your browser to use SOCKS5 proxy: localhost:1080
# All browser traffic is now tunneled through myserver.com
# Useful for bypassing network restrictions or accessing
# geo-restricted content

# Tip: combine flags for background tunnels:
ssh -f -N -D 1080 user@myserver.com
#  -f = go to background after connecting
#  -N = no remote command (tunnel only)
#  -D = dynamic SOCKS proxy

When to Use Which

VPN & Tunnel Decision Guide
┌──────────────────────────┬───────────────────────────────────────────┐
│ Scenario                 │ Best Tool                                 │
├──────────────────────────┼───────────────────────────────────────────┤
│ Privacy from ISP         │ WireGuard VPN (or commercial VPN)         │
│ Access home network      │ Tailscale or WireGuard                    │
│   from anywhere          │                                           │
│ Connect office networks  │ WireGuard site-to-site or Tailscale       │
│ Quick access to a remote │ SSH local port forwarding (-L)            │
│   database or service    │                                           │
│ Expose local dev server  │ SSH remote port forwarding (-R)           │
│   temporarily            │   or ngrok / Cloudflare Tunnel            │
│ Route browser traffic    │ SSH dynamic proxy (-D) or VPN             │
│   through another server │                                           │
│ Team remote access       │ Tailscale / ZeroTier (easiest)            │
│   (replace corporate VPN)│   or WireGuard (more control)             │
│ Air-gapped / high-       │ WireGuard with strict firewall rules      │
│   security environments  │                                           │
└──────────────────────────┴───────────────────────────────────────────┘
VPN & Tunneling Key Takeaways
  • WireGuard is the modern default. Faster, simpler, and more secure than OpenVPN. Use it unless you have a specific reason not to.
  • Tailscale makes WireGuard easy. No port forwarding, no manual key exchange. Install and go. Free for personal use.
  • SSH tunnels are underrated. You already have SSH access to your servers -- use -L and -R for quick, secure access without setting up a full VPN.
  • A VPN does not make you anonymous. The VPN provider can still see your traffic. For true anonymity, you need Tor (which is much slower).
  • Always use a kill switch with privacy VPNs. If the VPN drops, your traffic should stop -- not fall back to your real IP.