# 06 — Traefik Ingress
> **Updated 2026-05-15 (security remediation):** the Traefik middleware set
> changed — `cloudflare-only` + `admin-auth` are now attached to the admin
> ingress, a strict `auth-rate-limit` middleware fronts the auth endpoints
> (via a dedicated `honeydue-api-auth` Ingress), and `security-headers`
> gained COOP/CORP + a 2-year preload HSTS and dropped the deprecated
> `X-XSS-Protection`. `deploy-k3s/SECURITY.md` is the authoritative
> current-state record.
## Summary
Traefik is the reverse proxy that routes external HTTP requests to the
right application pod based on the `Host:` header. We run Traefik v3 as a
Kubernetes DaemonSet with `hostNetwork: true` — each of the three nodes
has its own Traefik pod listening directly on the node's `:80`/`:443`.
Cloudflare round-robins DNS across the three node IPs, so any node can
serve any request. No external load balancer.
## Why Traefik
K3s bundles Traefik by default. The alternatives:
| Option | Pros | Cons |
|---|---|---|
| **Traefik v3 (bundled)** | Zero install, excellent k8s integration, middleware system, active development | Helm-driven config is indirect |
| NGINX Ingress | Most popular, battle-tested | Another thing to install, more config surface |
| HAProxy Ingress | Extremely performant | More hands-on, older docs |
| Caddy | Simple config, auto-HTTPS | `caddy-docker-proxy` / Ingress integration is less mature |
| Envoy / Istio | Most featureful | Massive overkill at our scale |
Traefik came "free" with K3s, does the job, and its
[Swarm provider][traefik-swarm] is what we would have used if we'd
fixed our Swarm architecture. Using it on k3s keeps the mental model
consistent.
## Deployment model
```mermaid
flowchart TB
subgraph CF[Cloudflare edge]
DNS[DNS A records:<br/>api.myhoneydue.com → 3 node IPs<br/>admin.myhoneydue.com → 3 node IPs]
end
subgraph N1[hetzner1]
T1[Traefik pod<br/>hostNetwork:true<br/>:80/:443]
kernel1[Linux kernel<br/>net.ipv4.ip_unprivileged_port_start=0]
end
subgraph N2[hetzner2]
T2[Traefik pod<br/>hostNetwork:true<br/>:80/:443]
kernel2[Linux kernel]
end
subgraph N3[hetzner3]
T3[Traefik pod<br/>hostNetwork:true<br/>:80/:443]
kernel3[Linux kernel]
end
subgraph Cluster[k3s cluster services]
APISvc[api Service :8000]
AdminSvc[admin Service :3000]
end
DNS -. HTTP :80 .-> T1 & T2 & T3
T1 & T2 & T3 -- reverse_proxy --> APISvc & AdminSvc
```
### ASCII fallback
```
                  Cloudflare DNS
               ┌───────────────────┐
               │   api  → 3 IPs    │
               │   admin→ 3 IPs    │
               └─────────┬─────────┘
                         │ HTTP :80
     ┌───────────────────┼───────────────────┐
     ▼                   ▼                   ▼
┌──────────┐        ┌──────────┐        ┌──────────┐
│ hetzner1 │        │ hetzner2 │        │ hetzner3 │
│ Traefik  │        │ Traefik  │        │ Traefik  │
│ :80/443  │        │ :80/443  │        │ :80/443  │
│(hostNet) │        │(hostNet) │        │(hostNet) │
└────┬─────┘        └────┬─────┘        └────┬─────┘
     │                   │                   │
     └── ClusterIP ──────┼── ClusterIP ──────┘
             ┌────────────────────────┐
             │ api Service   :8000    │
             │ admin Service :3000    │
             └────────────────────────┘
```
## Why DaemonSet + hostNetwork
**What we're trying to achieve**: Any public-facing node should answer
:80/:443. Cloudflare round-robins DNS; whichever node it picks, that
node must serve.
**The default k3s Traefik deployment** is a single-replica Deployment
exposed via a LoadBalancer Service. That requires either:
- Hetzner Load Balancer (+ $8.49/mo, another thing to manage), **or**
- K3s' built-in `servicelb` (klipper-lb) which binds node ports
dynamically to proxy to the Service
Neither was quite what we wanted. With three replicas of the stock Traefik
behind klipper-lb, each Traefik pod is reachable but there's an extra hop
through klipper's proxy daemon.
**DaemonSet + hostNetwork** is cleaner: each Traefik pod *is* the host's
:80/:443. No proxy daemon, no LB Service, no VIP. Cloudflare DNS →
node IP → kernel → Traefik, one hop.
### Trade-offs of hostNetwork
**Pro:**
- One fewer layer of indirection; lower latency
- No Service needed; no kube-proxy in the ingress path
- Standard Cloudflare round-robin DNS is the failover mechanism
**Con:**
- Traefik is in the host netns; it sees the node's interfaces, not
the cluster overlay
- A `hostNetwork` pod left on the default DNS policy falls back to the
node's resolver; resolving Service names like `api` through cluster DNS
needs `dnsPolicy: ClusterFirstWithHostNet`
- Port conflicts possible if anything else wants :80/:443 on the node
(nothing else does in our setup)
### Trade-offs of DaemonSet
**Pro:**
- One Traefik per node; matches our Cloudflare 3-IP round-robin
exactly
- Any node down = Cloudflare's origin health checks route around it
**Con:**
- Updates require `maxUnavailable > 0` (host ports conflict during
surge) → brief moment where one node is down during rollout
- 3× the memory usage vs. 1-replica Deployment (but Traefik is tiny
— ~128 MB total across all three)
## Our Traefik configuration
We reconfigure the bundled K3s Traefik via a `HelmChartConfig`. K3s
uses the `helm-controller` to manage bundled addons; `HelmChartConfig`
lets us override values without disabling-and-replacing the chart.
Full config at
`deploy-k3s/manifests/traefik-helmchartconfig.yaml`. Key settings:
```yaml
apiVersion: helm.cattle.io/v1
kind: HelmChartConfig
metadata:
  name: traefik
  namespace: kube-system
spec:
  valuesContent: |-
    deployment:
      kind: DaemonSet          # was Deployment
    hostNetwork: true
    service:
      enabled: false           # no LoadBalancer Service
    ports:
      web:
        port: 80
        hostPort: 80
      websecure:
        port: 443
        hostPort: 443
    updateStrategy:
      type: RollingUpdate
      rollingUpdate:
        maxUnavailable: 1
        maxSurge: 0
    securityContext:
      capabilities:
        drop: [ALL]
        add: [NET_BIND_SERVICE]
      readOnlyRootFilesystem: true
      runAsGroup: 65532
      runAsNonRoot: true
      runAsUser: 65532
    additionalArguments:
      - "--entrypoints.web.forwardedHeaders.trustedIPs=<CF ranges>"
```
### Why each setting
- **`kind: DaemonSet`** — one Traefik per node. Default is a Deployment
with 1 replica.
- **`hostNetwork: true`** — Traefik runs in the host's network namespace
so it can bind real :80/:443 on the node.
- **`service.enabled: false`** — no LoadBalancer Service is created.
With `hostNetwork`, we don't need one.
- **`ports.*.hostPort`** — explicit host port binding. Matches the
container port (DaemonSet semantics with `hostPort: 80` ensure the
kubelet schedules at most one Traefik per node).
- **`updateStrategy.maxUnavailable: 1, maxSurge: 0`** — we accept one
node being down during a Traefik update (host port can't be shared).
The Traefik Helm chart rejects `hostNetwork` combined with
`maxSurge > 0`, which is why this is our second config iteration.
- **Security context** — non-root (UID 65532), read-only root filesystem,
only `NET_BIND_SERVICE` capability. See Chapter 5.
- **`forwardedHeaders.trustedIPs`** — Cloudflare's IP ranges. Traefik
trusts `X-Forwarded-Proto` et al. only from these ranges, so a client
that bypasses Cloudflare and hits the origin directly can't spoof the
proto header.
### Forwarded-headers trustedIPs
The full list of trusted CF ranges is in our `additionalArguments`. It's
the union of CF's published IPv4 and IPv6 ranges. When Cloudflare passes
a request to origin, it adds `X-Forwarded-For` and `X-Forwarded-Proto`
headers; Traefik only honors these if the request came from one of these
IPs. Every other client's headers are ignored.
If CF publishes new IP ranges (rare but possible), the
`trustedIPs` list needs updating. It's a raw string in our
HelmChartConfig — we'd need to edit, apply, and bump the helm job.
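
For illustration, the `trustedIPs` value is a single comma-separated list of CIDRs. The sketch below shows its shape inside `valuesContent` using only two of Cloudflare's published ranges. It is an abbreviated excerpt, not our full list, and the `websecure` line is an assumption added to show that the option is set per entrypoint.

```yaml
# Abbreviated sketch: two ranges shown for shape only. The full list is
# published at https://www.cloudflare.com/ips/.
spec:
  valuesContent: |-
    additionalArguments:
      # Comma-separated CIDRs; IPv4 and IPv6 share one string.
      - "--entrypoints.web.forwardedHeaders.trustedIPs=173.245.48.0/20,2400:cb00::/32"
      # Assumption: the same list on the HTTPS entrypoint, if it is in use.
      - "--entrypoints.websecure.forwardedHeaders.trustedIPs=173.245.48.0/20,2400:cb00::/32"
```

Because the option is per entrypoint, a list applied only to `web` would not cover traffic arriving on `websecure`.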
## Traefik v3 vs v2
K3s ships Traefik v3 (currently `3.6.10`). The v2 → v3 migration
changed a few things:
- `swarmMode` removed (replaced by a `swarm` provider, but we don't
use Swarm anyway)
- Encoded-character handling changed (v3 warns about RFC 3986 handling;
we ignore the warning)
- Middleware CRD group is `traefik.io/v1alpha1` (was `traefik.containo.us/v1alpha1`)
Our deployment handles all of this automatically via the bundled
chart.
## Ingress resources
We define two standard k8s `Ingress` resources in
`deploy-k3s/manifests/ingress/ingress-simple.yaml`:
```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: honeydue-api
  namespace: honeydue
spec:
  ingressClassName: traefik
  rules:
    - host: api.myhoneydue.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service: {name: api, port: {number: 8000}}
    - host: myhoneydue.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service: {name: api, port: {number: 8000}}
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: honeydue-admin
  namespace: honeydue
spec:
  ingressClassName: traefik
  rules:
    - host: admin.myhoneydue.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service: {name: admin, port: {number: 3000}}
```
Traefik watches for Ingress resources with `ingressClassName: traefik`
and programs its router table accordingly. Changes are applied within
seconds — no restart needed.
### What pathType: Prefix means
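Every request path starting with `/` matches, which is everything. The
alternatives are `Exact` (matches only the literal path) and
`ImplementationSpecific` (matching is left to the controller). `pathType`
is a required field in `networking.k8s.io/v1`, so there is no default;
`Prefix` is the usual choice and matches how users think about URL routing.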
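
As an illustration of how a narrower prefix sits alongside the catch-all `/`, the sketch below shows what a dedicated Ingress for the auth endpoints (the `honeydue-api-auth` Ingress named in the update note at the top of this chapter) might look like. The `/api/auth` path is an assumption made for the example; the real prefix lives in the manifests. The middleware reference assumes `auth-rate-limit` is defined in the `honeydue` namespace.

```yaml
# Hypothetical sketch: the /api/auth prefix is assumed, not taken from the repo.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: honeydue-api-auth
  namespace: honeydue
  annotations:
    # Middleware reference format: <namespace>-<name>@kubernetescrd
    traefik.ingress.kubernetes.io/router.middlewares: honeydue-auth-rate-limit@kubernetescrd
spec:
  ingressClassName: traefik
  rules:
    - host: api.myhoneydue.com
      http:
        paths:
          - path: /api/auth          # assumed prefix
            pathType: Prefix
            backend:
              service: {name: api, port: {number: 8000}}
```

When two routers match the same host, Traefik's default priority favors the longer rule, so requests under the narrower prefix would take this route and everything else would fall through to `/`.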
## How requests flow
1. **Cloudflare DNS** resolves `api.myhoneydue.com` to a CF edge IP
(client never sees the three origin IPs — CF proxies).
2. **Cloudflare edge** terminates TLS from the browser, then opens a
fresh TCP to one of the origin IPs on `:443` (SSL=Full (strict)).
Say it picks `178.105.32.198` (hetzner2).
3. **UFW on hetzner2** accepts the SYN — the source IP is in one of
the 15 CF IPv4 CIDRs allowed on `:443`. (Any non-CF source IP is
dropped at the kernel.)
4. **Linux kernel** sees a listener on `0.0.0.0:443` (the Traefik pod,
hostNetwork). Hands off the SYN.
5. **Traefik accepts** the connection, completes the TLS handshake
using the `cloudflare-origin-cert` secret (CF Origin CA — CF
verifies this chain on its side). Reads the plaintext HTTP request.
6. **Traefik matches** the `Host:` header against its router table.
`Host: api.myhoneydue.com` → `honeydue-api` Ingress → `api` Service.
Attached middlewares (`security-headers`, `rate-limit`) run here.
7. **Traefik dials** `10.43.167.83:8000` (api Service ClusterIP). This
goes through the cluster DNS (CoreDNS) and kube-proxy (IPVS).
8. **kube-proxy IPVS** rewrites the destination to a live api pod endpoint
— say `10.42.2.6:8000` (api pod on hetzner3).
9. **Flannel VXLAN** encapsulates the packet and sends to hetzner3
(UDP :8472 between node IPs).
10. **hetzner3's kernel** decapsulates, delivers to the api pod.
11. **api pod** processes, returns response.
12. **Response flows back** the reverse path.
Cloudflare may also answer from its edge cache. By default it caches
only static file types; HTML and JSON responses are not cached unless we
set `Cache-Control` headers or a cache rule. For a cacheable URL, the
second request might not reach the origin at all.
## Middleware
Traefik supports middleware — small functions run before/after the proxy.
The `deploy-k3s/manifests/ingress/middleware.yaml` scaffold defines:
- **`rate-limit`** — 100 req/min average, 200 burst
- **`security-headers`** — HSTS, X-Frame-Options, CSP, etc.
- **`cloudflare-only`** — IP allowlist restricting origin to CF ranges
- **`admin-auth`** — HTTP basic auth for admin panel
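**When this chapter was first written, none of these were attached to
our Ingresses**; we left them off to minimize surface area for the first
week of the new cluster. Per the 2026-05-15 update at the top of this
chapter, that has changed: `cloudflare-only` and `admin-auth` are
attached to the admin ingress, and a strict `auth-rate-limit` middleware
fronts the auth endpoints. `deploy-k3s/SECURITY.md` records the current
state. A middleware is attached by adding the
`traefik.ingress.kubernetes.io/router.middlewares` annotation to the
Ingress, with each entry written as `<namespace>-<name>@kubernetescrd`:
```yaml
metadata:
  annotations:
    traefik.ingress.kubernetes.io/router.middlewares: honeydue-security-headers@kubernetescrd,honeydue-rate-limit@kubernetescrd
```
The original enablement TODO is in Chapter 20.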
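
For reference, the `rate-limit` and `cloudflare-only` entries listed above could be expressed as Traefik `Middleware` resources roughly as follows. This is a sketch, not a copy of `middleware.yaml`: the rate-limit numbers come from the list above, the `honeydue` namespace is inferred from the annotation format, and the single `sourceRange` entry is an abbreviated excerpt of Cloudflare's published ranges.

```yaml
# Sketch only: not copied from deploy-k3s/manifests/ingress/middleware.yaml.
apiVersion: traefik.io/v1alpha1
kind: Middleware
metadata:
  name: rate-limit
  namespace: honeydue
spec:
  rateLimit:
    average: 100   # requests allowed per period
    period: 1m     # so 100 req/min on average
    burst: 200     # short bursts above the average
---
apiVersion: traefik.io/v1alpha1
kind: Middleware
metadata:
  name: cloudflare-only
  namespace: honeydue
spec:
  ipAllowList:
    sourceRange:
      - 173.245.48.0/20   # excerpt; full list at https://www.cloudflare.com/ips/
```

The resource name and namespace are what the annotation refers to: `rate-limit` in `honeydue` becomes `honeydue-rate-limit@kubernetescrd`.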
## Traefik dashboard
Disabled. The Traefik dashboard (`/dashboard/` and `/api/`) exposes
runtime state and can leak information. The bundled k3s
Traefik disables it by default, and we haven't re-enabled it.
If needed for debugging:
```bash
# Port-forward to a Traefik pod
kubectl port-forward -n kube-system daemonset/traefik 9000:9000
# (the chart exposes the dashboard on :9000 when enabled)
# Then visit http://localhost:9000/dashboard/
```
This requires kubectl access and isn't exposed publicly.
## Version pinning
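We take whatever Traefik version is bundled with K3s (currently 3.6.10).
Each K3s release bundles a specific chart version, listed in its release
notes, so the Traefik version can change when we upgrade K3s. If that
ever breaks something, we can pin a specific chart version with a
`version` field. Note that `version` belongs to the `HelmChart` resource;
as far as the K3s docs describe it, `HelmChartConfig` only overrides
values, so pinning likely means disabling the bundled addon and
deploying our own `HelmChart`:
```yaml
spec:
  version: 39.0.501+up39.0.5   # specific chart version
```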
## Limitations we accept
- **No sticky sessions.** Every request to `api.myhoneydue.com` can go
to a different pod. Our Go API is stateless — this is fine.
- **No canary deployments** (yet). Traefik supports weighted routing
via its CRDs (`TraefikService`) but we don't use them. TODO if/when
we do gradual rollouts.
- **No mTLS.** Traefik supports mutual TLS client auth for sensitive
endpoints. We don't use it.
- **Single ingress class.** Everything goes through the same Traefik.
For multi-tenant setups we'd want separate ingress classes with
separate policies.
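
If we do adopt gradual rollouts, the weighted routing mentioned above could be sketched roughly as follows. This is an illustration, not part of our manifests: the `api-canary` Service and the 90/10 split are hypothetical. To our knowledge a `TraefikService` is referenced from an `IngressRoute` rather than from a standard `Ingress`, so adopting it would also mean moving that route to Traefik's CRDs.

```yaml
# Hypothetical canary sketch: api-canary does not exist in our cluster.
apiVersion: traefik.io/v1alpha1
kind: TraefikService
metadata:
  name: api-weighted
  namespace: honeydue
spec:
  weighted:
    services:
      - name: api          # current stable Service
        port: 8000
        weight: 9          # roughly 90% of requests
      - name: api-canary   # assumed Service fronting the new version
        port: 8000
        weight: 1          # roughly 10% of requests
```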
## Troubleshooting
| Symptom | Likely cause | Fix |
|---|---|---|
| 404 from Traefik | Ingress doesn't match `Host:` | Check Ingress host field, DNS |
| 502 from Traefik | Backend refused or reset the connection (pod crashing, wrong port) | Check pod logs, Service `targetPort` |
| 503 from Traefik | Backend Service has no ready endpoints, or circuit breaker open | `kubectl get endpoints -n honeydue`; check readiness probe |
| 504 from Traefik | Backend slow | Check pod CPU/memory, DB connections |
| Connection refused at 80 | Traefik pod not running or kernel not listening | `kubectl get pods -n kube-system -l app.kubernetes.io/name=traefik`; `ssh deploy@node 'ss -lntp \| grep :80'` |
| Mixed content error in browser | `X-Forwarded-Proto` not honored by app | Check `trustedIPs` includes CF; check app reads the header |
## Operator cheat sheet
```bash
# Traefik pods per node
kubectl get pods -n kube-system -l app.kubernetes.io/name=traefik -o wide
# Traefik logs (all pods)
kubectl logs -n kube-system -l app.kubernetes.io/name=traefik --tail=50 --prefix
# Ingress status
kubectl get ingress -n honeydue
# Health check via the ping endpoint (listing routers needs the dashboard or API)
kubectl exec -n kube-system daemonset/traefik -- traefik healthcheck
# Re-apply config
kubectl apply -f deploy-k3s/manifests/traefik-helmchartconfig.yaml
kubectl delete job -n kube-system helm-install-traefik # triggers reinstall
# Restart all Traefik pods
kubectl rollout restart daemonset/traefik -n kube-system
```
## References
- [Traefik v3 docs][traefik]
- [Traefik Swarm provider][traefik-swarm]
- [K3s Traefik customization][k3s-traefik]
- [HelmChartConfig docs][k3s-helm]
- [Cloudflare IP ranges][cf-ips]
[traefik]: https://doc.traefik.io/traefik/v3.6/
[traefik-swarm]: https://doc.traefik.io/traefik/providers/swarm/
[k3s-traefik]: https://docs.k3s.io/networking/networking-services#traefik-ingress-controller
[k3s-helm]: https://docs.k3s.io/helm#customizing-packaged-components-with-helmchartconfig
[cf-ips]: https://www.cloudflare.com/ips/