# 04 — Firewall

## Summary

Every node runs UFW (Uncomplicated Firewall, a frontend for iptables) with a default-deny-incoming policy. Specific ports are allowed from specific sources only. This chapter lists every rule on every node, why each rule exists, and what breaks without it. It also traces what happens to an inbound packet as it goes through iptables, UFW, and the kernel.

## Policy

All three nodes have the same UFW config. The policy:

| Direction | Default |
|---|---|
| **Incoming** | **deny** |
| Outgoing | allow |
| Routed | disabled (we don't NAT) |

Default deny is an allowlist model: unless a rule explicitly allows a packet, it's dropped. This is more secure than default-allow but requires that every legitimate port be enumerated in a rule.

## Current ruleset per node

Run `sudo ufw status verbose` on any node to see the live ruleset. The canonical ruleset is below, grouped by purpose.

### Public-facing

| Port | Protocol | From | Purpose |
|---|---|---|---|
| 22 | TCP | Anywhere | SSH (key-only) |
| 443 | TCP | Cloudflare ranges (15 IPv4 + 7 IPv6) | HTTPS (CF → Traefik, TLS-terminated at Traefik) |

**Port :80 is closed** on all three nodes. CF is in Full (strict) mode and initiates every request on :443 to the origin. Cloudflare's "Always Use HTTPS" turns any plaintext client request into HTTPS at the edge, so the origin never needs to accept :80.

**Port :443 is restricted to Cloudflare** via 22 UFW allow rules per node (one per CF CIDR). Direct-connect from any non-CF IP is dropped at the kernel. This closes the "node IP leak = bypass CF WAF/DDoS" hole entirely. See [Chapter 13](./13-cloudflare.md#cloudflare-ip-ranges-used-in-traefik-trustedips) for the exact ranges and UFW rule format.

**Refresh cadence**: CF updates its IP ranges rarely. A monthly `curl https://www.cloudflare.com/ips-v4` diff and UFW re-apply is enough. Automation TODO (Chapter 20).

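The monthly refresh can be sketched as a small helper that turns Cloudflare's published CIDR list into the matching UFW commands. This is a minimal sketch, not the automation planned for Chapter 20: the function name `cf_ufw_rules` and the `cloudflare` comment text are assumptions, and the helper only prints commands so they can be reviewed before anything is applied.

```bash
#!/usr/bin/env bash
# cf_ufw_rules: read one CIDR per line on stdin and print the UFW allow
# command for each. Nothing is executed; the output is meant for review.
# The comment text 'cloudflare' is an assumption, not the label used on the nodes.
cf_ufw_rules() {
  local cidr
  # `|| [ -n "$cidr" ]` keeps the last line even if it lacks a trailing newline.
  while IFS= read -r cidr || [ -n "$cidr" ]; do
    # Skip blank lines.
    [ -z "$cidr" ] && continue
    printf "ufw allow from %s to any port 443 proto tcp comment 'cloudflare'\n" "$cidr"
  done
}

# Usage (not run here):
#   curl -fsS https://www.cloudflare.com/ips-v4 | cf_ufw_rules > cf-rules-new.txt
#   curl -fsS https://www.cloudflare.com/ips-v6 | cf_ufw_rules >> cf-rules-new.txt
#   diff cf-rules-old.txt cf-rules-new.txt
```

Keeping each month's output in a file and diffing it against the previous one shows exactly which rules need to be added or deleted.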
### SSH (operator access)

| Port | Protocol | From | Purpose |
|---|---|---|---|
| 22 | TCP | Anywhere | SSH login (key-only) |

SSH is open to the internet but hardened: key-only auth, no root login, and `AllowUsers deploy` (the stock distribution still allows root; bootstrap hardens this). See [Chapter 5](./05-security.md) for the full SSH config.

**TODO** (Chapter 20): Move SSH off :22 to :2222 or similar, and tighten it to the operator's current IP. The current state is acceptable given key-only auth plus fail2ban defaults.

### Kubernetes API (kubectl from operator)

| Port | Protocol | From | Purpose |
|---|---|---|---|
| 6443 | TCP | 47.185.183.191 (operator IP) | kubectl to kube-apiserver |

When the operator's public IP changes (moves, new ISP), this rule needs updating on all 3 nodes. Ugly but necessary. A better long-term fix is **Cloudflare Access** or **Tailscale** to avoid pinning operator IPs.

### Inter-node cluster traffic

These rules allow the three nodes to talk to each other for cluster state. Each node has an allow rule for each of the **three node IPs** (including its own; the "allow from self" rule exists so local flows are explicit).

| Port | Protocol | From | Purpose |
|---|---|---|---|
| 6443 | TCP | other nodes | kube-apiserver (server nodes talk to each other) |
| 2379 | TCP | other nodes | etcd client API (state reads and writes) |
| 2380 | TCP | other nodes | etcd peer (Raft replication between server nodes) |
| 10250 | TCP | other nodes | kubelet (metrics, exec, logs from API server) |
| 8472 | UDP | other nodes | Flannel VXLAN overlay |

### Application-specific (legacy, mostly superfluous on k3s)

These rules were added during the Swarm era and still exist on the nodes. None of them hurt anything; most are unused on k3s.

| Port | Protocol | From | Purpose (original) | Status on k3s |
|---|---|---|---|---|
| 2377 | TCP | node IPs | Swarm cluster management | unused (Swarm gone) |
| 7946 | TCP + UDP | node IPs | Swarm gossip | unused |
| 4789 | UDP | node IPs | Swarm VXLAN | unused (k3s uses 8472) |
| (ESP, proto 50) | — | node IPs | IPSec encrypted overlay | unused |
| 500 | UDP | node IPs | IKE key exchange | unused |
| 3000 | TCP | node IPs | admin Next.js, when we tried node-IP hardcoding | unused |

These can be removed in a cleanup pass. They don't affect security because they only admit node IPs and no process listens on those ports anymore.

## Why each required rule exists

### Port 22 — SSH (public)

Needed for operator access. Without it, the only way to reach the nodes is the Hetzner console's "rescue" mode, which is an emergency fallback.

### Port 80 — HTTP (closed)

No allow rule exists. Under the earlier SSL=Flexible setup, Cloudflare talked HTTP to the origin on port 80, and blocking it made Cloudflare return 521 to users. In Full (strict) mode Cloudflare connects on :443 only, so :80 stays closed.

### Port 443 — HTTPS (Cloudflare only)

Cloudflare runs in Full (strict) mode and opens every origin connection on :443, where Traefik terminates TLS. The allow rules cover Cloudflare's published ranges only, so a direct connection from any other IP is dropped. Without these rules, Cloudflare cannot reach the origin and users get a Cloudflare 52x error.

### Port 6443 — kube-apiserver (operator + inter-node)

**From operator IP**: so `kubectl` works. Without this, `kubectl get pods` times out.

**From other nodes**: server nodes call each other's apiservers for cross-node controller operations (Raft elections themselves run over the etcd peer port, 2380). Without this, nodes can still run pods but can't participate in cluster state changes.

### Ports 2379/2380 — embedded etcd (inter-node)

K3s runs etcd as an embedded library inside the server binary. The peer port (2380) carries Raft messages between the three servers; the client port (2379) serves etcd API requests. **Without these rules, Raft cannot replicate state and the cluster loses quorum.** This bit us during the k3s install: the joins initially failed because 2379/2380 were blocked.

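A join failure like the one described above can be caught early by probing the peer ports from another node. The following is a minimal sketch under stated assumptions: `probe_tcp` is a hypothetical helper name, it relies on bash's `/dev/tcp` pseudo-device and the coreutils `timeout` command, and it only covers TCP (UDP 8472 cannot be checked this way).

```bash
#!/usr/bin/env bash
# probe_tcp HOST PORT: report whether a TCP connection can be opened.
# `timeout` bounds the wait, because a SYN dropped by UFW hangs
# instead of being refused.
probe_tcp() {
  local host="$1" port="$2"
  if timeout 3 bash -c ": </dev/tcp/$host/$port" 2>/dev/null; then
    echo "$host:$port open"
  else
    echo "$host:$port unreachable"
  fi
}

# Usage on hetzner2 (not run here; the IP is hetzner1's address from this chapter):
#   for port in 2379 2380 6443 10250; do probe_tcp 178.104.247.152 "$port"; done
```

An "unreachable" result does not distinguish a firewall drop from a service that is not running, so it is a prompt to check `sudo ufw status` on the target node rather than a diagnosis on its own.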
### Port 10250 — kubelet (inter-node)

The kubelet on each node exposes an authenticated HTTPS API that the kube-apiserver calls for `kubectl logs`, `kubectl exec`, and kubelet metrics scraping. Without this rule, operator commands like `kubectl logs -n honeydue deploy/api` fail with "Error from server: unable to upgrade connection".

### Port 8472 UDP — Flannel VXLAN (inter-node)

Pod-to-pod traffic between nodes flows through VXLAN tunnels on UDP 8472. **Without this rule, cross-node pod communication silently fails**, which looks like "admin can't reach api" or "worker can't reach Redis" depending on where pods land.

This rule is load-bearing. It is the single most important inter-node rule.

## Inbound packet's journey through UFW/iptables

When a packet from Cloudflare arrives at hetzner1's network interface on port 443:

```mermaid
sequenceDiagram
    participant NIC as hetzner1 NIC
    participant PRE as iptables<br/>raw + mangle + nat PREROUTING
    participant FIL as iptables filter INPUT<br/>(UFW lives here)
    participant SOCK as Traefik pod socket<br/>(host network)
    NIC->>PRE: Packet: SYN :443 from CF
    PRE->>PRE: conntrack state: NEW
    PRE->>FIL: handoff to INPUT chain
    FIL->>FIL: UFW rules evaluated
    Note over FIL: Rule: allow 443/tcp from CF range<br/>→ ACCEPT
    FIL->>SOCK: delivered to listening socket
    SOCK->>SOCK: Traefik accepts connection
```

UFW is really a set of wrapper chains on top of iptables. `sudo iptables -L INPUT -n --line-numbers` on any node shows INPUT jumping into the `ufw-*` chains, and `sudo iptables -S ufw-user-input` lists the rules we added; UFW just makes editing them easier.

## Rule syntax we used

UFW commands we ran during setup (for reference). The public :80 and :443 rules shown here were later replaced by the Cloudflare-only :443 rules described under Public-facing.

```bash
# Reset to default
sudo ufw --force reset

# Default deny incoming
sudo ufw default deny incoming
sudo ufw default allow outgoing

# SSH + web (public)
sudo ufw allow 22/tcp comment 'SSH'
sudo ufw allow 80/tcp comment 'HTTP'
sudo ufw allow 443/tcp comment 'HTTPS'

# Kubernetes inter-node (repeat for each peer IP)
for ip in 178.104.247.152 178.105.32.198 178.104.249.189; do
  sudo ufw allow from "$ip" to any port 6443 proto tcp comment "k3s-api $ip"
  sudo ufw allow from "$ip" to any port 2379 proto tcp comment "k3s-etcd-client $ip"
  sudo ufw allow from "$ip" to any port 2380 proto tcp comment "k3s-etcd-peer $ip"
  sudo ufw allow from "$ip" to any port 10250 proto tcp comment "k3s-kubelet $ip"
  sudo ufw allow from "$ip" to any port 8472 proto udp comment "k3s-flannel-vxlan $ip"
done

# Kubectl from operator
sudo ufw allow from 47.185.183.191 to any port 6443 proto tcp comment 'kubectl from dev'

# Enable
sudo ufw --force enable
```

Rules persist across reboots via `/etc/ufw/user.rules` (and `/etc/ufw/user6.rules` for IPv6).

## What if we used Hetzner Cloud Firewall instead?

Hetzner Cloud has a provider-level firewall feature. It is rule-for-rule equivalent but configured in the Hetzner console (or via API), not on the nodes.

Tradeoffs:

| | Hetzner Cloud Firewall | UFW (current) |
|---|---|---|
| Cost | Free | Free |
| Config location | Hetzner console / API | Per-node `/etc/ufw/` |
| Applies to | All traffic to NIC | All traffic to kernel |
| Failure mode | Provider-side issue = rules gone | Node-side issue = rules gone |
| Inter-node traffic | Same rules for all nodes | Same rules on each node |
| Visible to attacker | Yes (provider fingerprints) | Yes (iptables probe) |
| Rule ordering | UI-based | `iptables -L` |

Either works. A future improvement: move the stable rules to Hetzner Cloud Firewall (one source of truth) and leave only the dynamic rules (operator IP, ad-hoc debug) on the nodes.

## Why we don't use iptables directly

UFW is a frontend. `iptables` works, but the rules are harder to read and edit. `sudo ufw allow from X to any port Y proto Z comment 'Z-rule'` is clearer than writing the equivalent `-A INPUT ...` rule directly.

Also, UFW's `comment` field lets us explain each rule, which becomes critical when the ruleset grows past ~10 rules.

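To make the readability comparison concrete, the sketch below prints a hand-written `iptables` rule that is roughly equivalent to one UFW allow rule. It is an illustration only: `iptables_equiv` is a hypothetical helper, the rule targets `INPUT` directly, and it is not what UFW literally generates (UFW places user rules in its own `ufw-user-input` chain).

```bash
#!/usr/bin/env bash
# iptables_equiv SRC PORT PROTO COMMENT: print a raw iptables rule that
# accepts the same traffic as
#   ufw allow from SRC to any port PORT proto PROTO comment 'COMMENT'
# Nothing is executed; the rule is printed for comparison only.
iptables_equiv() {
  local src="$1" port="$2" proto="$3" comment="$4"
  printf 'iptables -A INPUT -s %s -p %s --dport %s -m comment --comment "%s" -j ACCEPT\n' \
    "$src" "$proto" "$port" "$comment"
}

# Example: the operator kubectl rule from this chapter.
iptables_equiv 47.185.183.191 6443 tcp 'kubectl from dev'
```

The raw form also depends on rule position within the chain, which is one more detail UFW manages on our behalf.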
## Testing the firewall

From the operator workstation (47.185.183.191):

```bash
# Should work (22/tcp open)
ssh deploy@hetzner1 exit

# Should time out (80/tcp closed)
curl --max-time 5 -I -H "Host: api.myhoneydue.com" http://hetzner1/api/health/

# Should time out (443/tcp allowed from Cloudflare ranges only)
curl --max-time 5 -kI https://178.104.247.152/

# Should work (6443 allowed from operator IP)
export KUBECONFIG=~/.kube/honeydue-k3s.yaml
kubectl get nodes

# Should time out (default-deny from arbitrary ports)
curl --max-time 5 http://178.104.247.152:3000/  # not open to operator
curl --max-time 5 http://178.104.247.152:6379/  # Redis not exposed publicly
```

From another peer node (hetzner2 trying to reach hetzner1):

```bash
# Should work (k3s API allowed from peer node IPs)
curl -k https://178.104.247.152:6443/healthz

# Should work (etcd client from peer)
nc -zv 178.104.247.152 2379
```

## The hidden dependency: kubelet/containerd also need ports

Beyond the ports covered by UFW rules, k3s components also listen on:

- **10255/tcp**: kubelet read-only port (no auth, deprecated; disabled by default in k3s)
- **10256/tcp**: kube-proxy health
- **10257/tcp**: kube-controller-manager health
- **10259/tcp**: kube-scheduler health

These are bound to `localhost` only, so they don't need UFW rules. But they're important to know about when debugging: if one of these health endpoints isn't responding, the relevant component is broken.

## Legacy rules to clean up

The following rules are on the nodes from the Swarm era and can be removed in a future cleanup pass:

```bash
# On each node, list Swarm-era rules (-w matches whole numbers, -i catches ESP/esp)
sudo ufw status numbered | grep -Ewi "2377|7946|4789|500|3000|esp"

# Remove by number (highest-to-lowest to avoid renumbering)
# Example:
sudo ufw --force delete 15
sudo ufw --force delete 14
# ... etc.
```

We left them in because they don't affect security (no process listens on those ports), and removing them requires careful testing that nothing in k3s secretly relies on 4789/udp or similar.

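Part of that testing can be a quick check that nothing is bound to the legacy ports before their rules are deleted. The sketch below is one way to do it, assuming the iproute2 `ss` tool is available; `legacy_listeners` is a hypothetical helper name, and ESP (protocol 50) has no port, so it is outside what this check can see.

```bash
#!/usr/bin/env bash
# legacy_listeners: read `ss -H -tuln` output on stdin and print each
# Swarm-era port (2377 7946 4789 500 3000) that has a bound socket.
# Empty output means nothing is bound to any of those ports.
legacy_listeners() {
  awk -v ports="2377 7946 4789 500 3000" '
    BEGIN { n = split(ports, p, " "); for (i = 1; i <= n; i++) want[p[i]] = 1 }
    {
      # Column 5 is "Local Address:Port"; the port follows the last colon,
      # which also handles IPv6 addresses such as [::]:3000.
      m = split($5, parts, ":")
      if (parts[m] in want) print parts[m]
    }' | sort -un
}

# Usage on a node (not run here):
#   sudo ss -H -tuln | legacy_listeners
```

If a port is printed, `sudo ss -tulnp` shows which process owns it, and that rule should stay until the owner is understood.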
## Operator cheat sheet

```bash
# Show the ruleset, with comments, numbered
sudo ufw status numbered

# Show the ruleset with default policies and logging level
sudo ufw status verbose

# Add a new rule
sudo ufw allow from <source-ip> to any port <port> proto <tcp|udp> comment '<why this rule exists>'

# Remove a rule by number
sudo ufw status numbered
sudo ufw --force delete <number>

# Temporarily disable all rules (emergency)
sudo ufw disable

# Re-enable
sudo ufw enable

# Reload after editing /etc/ufw/ files directly
sudo ufw reload
```

## What to do if the firewall locks you out

Worst case: you apply a rule that blocks your own SSH, UFW applies it immediately, and you can't log back in.

Recovery:

1. Hetzner Cloud Console → Server → Rescue mode
2. Boot into rescue, mount the disk
3. Edit `/etc/ufw/user.rules` on the mounted disk to remove the bad rule
4. Reboot back into normal mode

This has never happened to us, but it's the escape hatch. The Console is always a TLS login away.

## References

- [UFW man page][ufw-man]
- [K3s networking requirements][k3s-reqs]
- [Kubernetes ports and protocols][k8s-ports]
- [Cloudflare IP ranges][cf-ips]

[ufw-man]: https://manpages.ubuntu.com/manpages/noble/en/man8/ufw.8.html
[k3s-reqs]: https://docs.k3s.io/installation/requirements#networking
[k8s-ports]: https://kubernetes.io/docs/reference/networking/ports-and-protocols/
[cf-ips]: https://www.cloudflare.com/ips/