Trey t 6f303dbbaa
Migrate prod deploy from Swarm to K3s; add full deployment book
Infrastructure:
- Stack now runs on K3s v1.34.6 HA (3 Hetzner CX33 nodes as managers)
- Traefik DaemonSet + hostNetwork replaces Caddy + ingress mesh
- All manifests in deploy-k3s/manifests/; Swarm config (deploy/) kept
  temporarily for reference

Bug fixes surfaced during migration:
- Dockerfile: golang:1.24-alpine -> 1.25-alpine (go.mod requires 1.25)
- cache_service.go: remove sync.Once reassignment from inside Do()
  callback (was causing 'unlock of unlocked mutex' fatal after
  Redis Ping failure)
- router.go: relax CSP from 'default-src none' to 'default-src self'
  + allowlist fonts.googleapis.com so the marketing landing page CSS
  actually loads in browsers
- deploy/scripts/deploy_prod.sh: use docker buildx with
  --platform linux/amd64 so arm64 (Apple Silicon) dev machines produce
  images runnable on x86_64 Hetzner nodes; fix array expansion under
  set -u
- deploy/swarm-stack.prod.yml: fix secret source references to use
  top-level aliases (the '${X_SECRET}' form never actually resolved);
  dozzle ports: long-form host_ip is rejected by Swarm, switched to
  short-form (bound to 0.0.0.0 with UFW-based loopback restriction);
  worker replicas 2 -> 1 (Asynq scheduler singleton)
- deploy-k3s/manifests/admin/deployment.yaml: probe path '/admin/' -> '/'
  (Next.js serves at root; /admin/ returned 404 and killed pods);
  startupProbe failureThreshold 12 -> 24
- deploy-k3s/manifests/pod-disruption-budgets.yaml: worker minAvailable
  1 -> 0 (singleton)
- deploy-k3s/manifests/api/deployment.yaml: startupProbe failureThreshold
  12 -> 48 (MigrateWithLock serializes across 3 replicas on first-boot;
  real startup takes up to 240s)
- .gitignore: tighten 'api' -> '/api' (was matching deploy-k3s/manifests/api/
  and admin/src/app/api/*, hiding legitimate files)

New files:
- deploy-k3s/manifests/traefik-helmchartconfig.yaml: DaemonSet +
  hostNetwork override for k3s-bundled Traefik
- deploy-k3s/manifests/ingress/ingress-simple.yaml: plain Ingress
  without TLS (CF Flexible SSL) and without middleware
- deploy-k3s/MIGRATION_NOTES.md: operator-facing migration log

Documentation:
- docs/deployment/ — full deployment book, 26 files, ~42k words:
  - Part I Overview, infrastructure, orchestrator choice (Ch 0-2)
  - Part II Networking, firewall, Cloudflare (Ch 3-4, 13)
  - Part III Security, Traefik ingress (Ch 5-6)
  - Part IV Services, DB, storage, secrets, registry (Ch 7-11)
  - Part V Data flow, deploy process, observability, failures, runbook
    (Ch 12, 14-17)
  - Part VI Cost, Swarm postmortem, roadmap (Ch 18-20)
  - Appendices: glossary, kubectl cheat sheet, file locations,
    consolidated citations
- README.md: Production Deployment section replaced with pointer to
  the book; Go version bumped to 1.25

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-24 07:20:54 -05:00

# K3s Migration Notes — 2026-04-24
honeyDue is running on a 3-node K3s HA cluster on the existing Hetzner nodes
(hetzner1/2/3), replacing the previous Docker Swarm deployment.
## Why we migrated
Docker Swarm's libnetwork has a known stale-DNS bug on Docker 29.x
([moby/moby#52265](https://github.com/moby/moby/issues/52265)) that leaves
ghost A-records when tasks migrate between nodes. Single-replica services
(like the admin panel) resolved to a ghost IP ~50% of the time → connection
refused → 502. A full stack recreate cleared it, but the bug recurs on every
node-to-node task migration.
K3s uses CoreDNS + containerd with no libnetwork history → the bug class
doesn't exist there. See `docs/SWARM_POSTMORTEM.md` if it exists, or the Swarm
postmortem chapter (Part VI) of the deployment book in `docs/deployment/`.
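
To make the ghost-record symptom concrete, here is a minimal sketch that flags resolved addresses with no live task behind them. The `ghost_records` helper and the 10.0.1.x addresses are made up for illustration; how you collect the two lists on a real Swarm is left open.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print every resolved IP that is not in the live list.
# $1 = newline-separated IPs returned by DNS
# $2 = newline-separated IPs of tasks that are actually running
ghost_records() {
  # -F fixed strings, -x whole-line match, -v invert: keep lines of $1 absent from $2.
  # grep exits 1 when nothing is printed, so "|| true" keeps the helper quiet.
  grep -Fxv -f <(printf '%s\n' "$2") <(printf '%s\n' "$1") || true
}

resolved=$'10.0.1.5\n10.0.1.9'   # example DNS answer (made-up addresses)
live=$'10.0.1.9'                 # example live task IP (made-up address)
ghost_records "$resolved" "$live"   # prints 10.0.1.5
```

In this made-up case one of the two records is a ghost, which is the kind of split that would produce the roughly 50% failure rate described above.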
## Differences from the original `deploy-k3s/` scaffold
The original scaffold assumes a greenfield provision via `hetzner-k3s`,
GHCR for images, Cloudflare origin certs, and a Hetzner Load Balancer.
We reused existing nodes and kept Cloudflare Flexible SSL:
| Setting | Scaffold default | What we did |
|---|---|---|
| Provisioning | `hetzner-k3s` tool creates boxes | Manual k3s install on existing Hetzner boxes |
| Registry | GHCR (`ghcr-credentials`) | Gitea (`gitea-credentials`) via `kubectl create secret docker-registry` |
| Ingress TLS | `cloudflare-origin-cert` Secret | No TLS at origin (CF Flexible) |
| Load balancer | Hetzner LB → nodes | Cloudflare round-robin across 3 node IPs |
| Admin basic auth | `admin-auth` Traefik middleware | Not applied — in-app auth only |
| CF-only IP allowlist | `cloudflare-only` middleware | Not applied — UFW restricts some ports, 80/443 open to anyone who knows node IPs |
| Traefik | LoadBalancer via servicelb | DaemonSet w/ hostNetwork (servicelb disabled); see `traefik-config.yaml` below |
| Worker replicas | 2 | 1 (Asynq scheduler is singleton) |
| API startup grace (`startupProbe`) | 12×5s = 60s | 48×5s = 240s (covers migrate + lock queue on first boot) |
| Admin probe path | `/admin/` | `/` (Next.js serves at root) |
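
The API startup figures in the table are `failureThreshold × periodSeconds`. A minimal sketch of that arithmetic, assuming the 5s probe period shown in the table and ignoring any `initialDelaySeconds` (the `probe_budget` helper name is hypothetical):

```bash
#!/usr/bin/env bash
# Hypothetical helper: worst-case seconds a startupProbe tolerates before
# the kubelet gives up and restarts the container.
probe_budget() {
  local failure_threshold=$1 period_seconds=$2
  echo $(( failure_threshold * period_seconds ))
}

probe_budget 12 5   # scaffold default, prints 60
probe_budget 48 5   # API after the migration, prints 240
```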
## Manifest fixes applied in-repo (already committed)
- `manifests/api/deployment.yaml`: `startupProbe.failureThreshold: 12 → 48`
- `manifests/admin/deployment.yaml`: probe path `/admin/ → /`, threshold `12 → 24`
- `manifests/worker/deployment.yaml`: `replicas: 2 → 1`
- `manifests/pod-disruption-budgets.yaml`: worker `minAvailable: 1 → 0`
## Traefik override (applied as HelmChartConfig)
K3s ships Traefik as a single-replica Deployment with a LoadBalancer service.
With servicelb disabled (to avoid binding a random port), we reconfigure it
to a DaemonSet binding directly on each node's public :80/:443 via
`hostNetwork: true`. The HelmChartConfig:
```yaml
apiVersion: helm.cattle.io/v1
kind: HelmChartConfig
metadata:
  name: traefik
  namespace: kube-system
spec:
  valuesContent: |-
    deployment:
      kind: DaemonSet
    hostNetwork: true
    service:
      enabled: false
    ports:
      web:
        port: 80
        hostPort: 80
      websecure:
        port: 443
        hostPort: 443
    updateStrategy:
      type: RollingUpdate
      rollingUpdate:
        maxUnavailable: 1
        maxSurge: 0
    securityContext:
      capabilities:
        drop: [ALL]
        add: [NET_BIND_SERVICE]
      readOnlyRootFilesystem: true
      runAsGroup: 65532
      runAsNonRoot: true
      runAsUser: 65532
    additionalArguments:
      - "--entrypoints.web.forwardedHeaders.trustedIPs=173.245.48.0/20,103.21.244.0/22,103.22.200.0/22,103.31.4.0/22,141.101.64.0/18,108.162.192.0/18,190.93.240.0/20,188.114.96.0/20,197.234.240.0/22,198.41.128.0/17,162.158.0.0/15,104.16.0.0/13,104.24.0.0/14,172.64.0.0/13,131.0.72.0/22"
```
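Apply with `kubectl apply -f traefik-config.yaml` (the override is committed
in-repo as `deploy-k3s/manifests/traefik-helmchartconfig.yaml`), then delete the
Helm install job (`kubectl delete job -n kube-system helm-install-traefik`) to
trigger a reinstall.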
## Required node-level sysctl
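Pods running with `hostNetwork: true` don't effectively get
`CAP_NET_BIND_SERVICE` in the host netns on modern containerd, even when the
capability is added in `securityContext`, so the non-root Traefik container
cannot bind :80/:443. Allow unprivileged binds to low ports on each node: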
```bash
echo 'net.ipv4.ip_unprivileged_port_start=0' | sudo tee /etc/sysctl.d/99-unprivileged-ports.conf
sudo sysctl --system
```
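The kernel treats every port at or above `net.ipv4.ip_unprivileged_port_start` as bindable without privileges, so a floor of 0 covers :80 and :443. A small sketch for checking a node after applying the setting; the `can_bind_unprivileged` helper is hypothetical, and the fallback of 1024 is the kernel default, used only where `/proc` is unavailable:

```bash
#!/usr/bin/env bash
# Hypothetical helper: succeeds if an unprivileged process may bind PORT
# given the value FLOOR of net.ipv4.ip_unprivileged_port_start.
can_bind_unprivileged() {
  local port=$1 floor=$2
  (( port >= floor ))
}

# Read the live value on Linux; otherwise assume the kernel default of 1024.
floor_file=/proc/sys/net/ipv4/ip_unprivileged_port_start
if [[ -r $floor_file ]]; then
  floor=$(<"$floor_file")
else
  floor=1024
fi

for port in 80 443; do
  if can_bind_unprivileged "$port" "$floor"; then
    echo "port $port: bindable without privileges (floor=$floor)"
  else
    echo "port $port: needs privileges (floor=$floor)"
  fi
done
```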
## UFW rules added for k3s (per node)
Allowed only between the three node IPs (178.104.247.152, 178.105.32.198, 178.104.249.189):
- `6443/tcp` — kube API
- `2379/tcp`, `2380/tcp` — embedded etcd client + peer
- `10250/tcp` — kubelet
- `8472/udp` — flannel VXLAN overlay
Plus from your workstation IP to each node's `6443/tcp` for `kubectl`.
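
The per-node rules above can be sketched as a dry-run generator that prints `ufw` commands rather than running them. This is one way to express the list under stated assumptions, not a record of the exact commands used: the `peer_rules` helper is hypothetical, and it assumes each node only needs to allow its two peers.

```bash
#!/usr/bin/env bash
# Dry-run sketch: prints ufw commands; review them before running with sudo.
NODES=(178.104.247.152 178.105.32.198 178.104.249.189)
RULES=(6443/tcp 2379/tcp 2380/tcp 10250/tcp 8472/udp)

# Hypothetical helper: $1 = this node's own IP; emits one rule per peer and port.
peer_rules() {
  local self=$1 peer rule
  for peer in "${NODES[@]}"; do
    [[ $peer == "$self" ]] && continue   # a node does not need a rule for itself
    for rule in "${RULES[@]}"; do
      # ${rule%/*} is the port, ${rule#*/} is the protocol
      echo "ufw allow from $peer to any port ${rule%/*} proto ${rule#*/}"
    done
  done
}

peer_rules 178.104.247.152
```

Run it once per node with that node's IP as the argument; each run prints ten rules (two peers, five ports).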
## Ingress
Minimal hostname-only routing (`/tmp/honeydue-ingress.yaml` at deploy time
— move it into `deploy-k3s/manifests/ingress/` in a follow-up):
```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: honeydue-api
  namespace: honeydue
spec:
  ingressClassName: traefik
  rules:
    - host: api.myhoneydue.com
      http:
        paths:
          - {path: /, pathType: Prefix, backend: {service: {name: api, port: {number: 8000}}}}
    - host: myhoneydue.com
      http:
        paths:
          - {path: /, pathType: Prefix, backend: {service: {name: api, port: {number: 8000}}}}
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: honeydue-admin
  namespace: honeydue
spec:
  ingressClassName: traefik
  rules:
    - host: admin.myhoneydue.com
      http:
        paths:
          - {path: /, pathType: Prefix, backend: {service: {name: admin, port: {number: 3000}}}}
```
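Because Cloudflare round-robins across the three node IPs and the origin speaks plain HTTP, each hostname should route correctly on every node. A dry-run sketch that prints one `curl` check per node and host, bypassing Cloudflare by setting the `Host` header directly; the `origin_checks` helper is hypothetical, and the expected status codes depend on the apps, so none are asserted here:

```bash
#!/usr/bin/env bash
# Dry-run sketch: prints curl commands; paste or pipe to a shell to run them.
NODES=(178.104.247.152 178.105.32.198 178.104.249.189)
HOSTS=(api.myhoneydue.com myhoneydue.com admin.myhoneydue.com)

# Hypothetical helper: one check per (node, host) pair against port 80.
origin_checks() {
  local node host
  for node in "${NODES[@]}"; do
    for host in "${HOSTS[@]}"; do
      # -o /dev/null discards the body; -w prints only the HTTP status code
      printf '%s\n' "curl -sS -o /dev/null -w '%{http_code}\\n' -H 'Host: $host' http://$node/"
    done
  done
}

origin_checks
```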
## Operator access
Kubeconfig lives at `~/.kube/honeydue-k3s.yaml`.
```bash
export KUBECONFIG=~/.kube/honeydue-k3s.yaml
kubectl get pods -n honeydue
```
## Remaining TODOs (not blocking)
- Apply `manifests/ingress/middleware.yaml` for security headers + rate limiting
(CF-only allowlist + basic auth deliberately skipped until you want them)
- Apply `manifests/network-policies.yaml` for default-deny + explicit allows
- Apply `manifests/api/hpa.yaml` if you want autoscaling (metrics-server is
already running, so just `kubectl apply` it)
- Upgrade to CF Full (strict) SSL: generate origin cert, create
`cloudflare-origin-cert` Secret, add `tls:` block back to Ingress
- Set up a proper migration Job so `api` replicas don't each run `MigrateWithLock`
on startup — lets you drop the 240s startupProbe grace
- Remove `deploy/` (the Swarm-era config) once you're confident in k3s