Migrate prod deploy from Swarm to K3s; add full deployment book

Infrastructure:
- Stack now runs on K3s v1.34.6 HA (3 Hetzner CX33 nodes as managers)
- Traefik DaemonSet + hostNetwork replaces Caddy + ingress mesh
- All manifests in deploy-k3s/manifests/; Swarm config (deploy/) kept temporarily for reference

Bug fixes surfaced during migration:
- Dockerfile: golang:1.24-alpine -> 1.25-alpine (go.mod requires 1.25)
- cache_service.go: remove sync.Once reassignment from inside Do() callback (was causing 'unlock of unlocked mutex' fatal after Redis Ping failure)
- router.go: relax CSP from 'default-src none' to 'default-src self' + allowlist fonts.googleapis.com so the marketing landing page CSS actually loads in browsers
- deploy/scripts/deploy_prod.sh: use docker buildx with --platform linux/amd64 so arm64 (Apple Silicon) dev machines produce images runnable on x86_64 Hetzner nodes; fix array expansion under set -u
- deploy/swarm-stack.prod.yml: fix secret source references to use top-level aliases (the '\${X_SECRET}' form never actually resolved); dozzle ports: long-form host_ip is rejected by Swarm, switched to short-form (bound to 0.0.0.0 with UFW-based loopback restriction); worker replicas 2 -> 1 (Asynq scheduler singleton)
- deploy-k3s/manifests/admin/deployment.yaml: probe path '/admin/' -> '/' (Next.js serves at root; /admin/ returned 404 and killed pods); startupProbe failureThreshold 12 -> 24
- deploy-k3s/manifests/pod-disruption-budgets.yaml: worker minAvailable 1 -> 0 (singleton)
- deploy-k3s/manifests/api/deployment.yaml: startupProbe failureThreshold 12 -> 48 (MigrateWithLock serializes across 3 replicas on first-boot; real startup takes up to 240s)
- .gitignore: tighten 'api' -> '/api' (was matching deploy-k3s/manifests/api/ and admin/src/app/api/*, hiding legitimate files)

New files:
- deploy-k3s/manifests/traefik-helmchartconfig.yaml: DaemonSet + hostNetwork override for k3s-bundled Traefik
- deploy-k3s/manifests/ingress/ingress-simple.yaml: plain Ingress without TLS (CF Flexible SSL) and without middleware
- deploy-k3s/MIGRATION_NOTES.md: operator-facing migration log

Documentation:
- docs/deployment/ - full deployment book, 26 files, ~42k words:
  - Part I Overview, infrastructure, orchestrator choice (Ch 0-2)
  - Part II Networking, firewall, Cloudflare (Ch 3-4, 13)
  - Part III Security, Traefik ingress (Ch 5-6)
  - Part IV Services, DB, storage, secrets, registry (Ch 7-11)
  - Part V Data flow, deploy process, observability, failures, runbook (Ch 12, 14-17)
  - Part VI Cost, Swarm postmortem, roadmap (Ch 18-20)
  - Appendices: glossary, kubectl cheat sheet, file locations, consolidated citations
- README.md: Production Deployment section replaced with pointer to the book; Go version bumped to 1.25

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-24 07:20:21 -05:00
parent 4ec4bbbfe8
commit 6f303dbbaa
46 changed files with 9785 additions and 93 deletions
@@ -0,0 +1,305 @@
+# 15 — Observability
+
+## Summary
+
+We have minimal observability today: `kubectl logs`, `kubectl top`,
+Cloudflare Analytics, and the Neon dashboard. No Prometheus, no Grafana,
+no centralized log aggregator, no APM. This is adequate for the
+current traffic volume (low) but is a known gap. This chapter documents
+what we *have* and what we'd add as traffic grows.
+
+## What we have
+
+### 1. `kubectl logs`
+
+Every container's stdout/stderr is captured by containerd and readable
+via kubectl:
+
+```bash
+# Live tail from all api pods
+kubectl logs -n honeydue -l app.kubernetes.io/name=api -f --prefix
+
+# Last 100 lines
+kubectl logs -n honeydue -l app.kubernetes.io/name=api --tail=100
+
+# Logs from the previous container instance (before the most recent restart)
+kubectl logs -n honeydue <pod-name> --previous
+
+# Events (not logs — k8s-level state changes)
+kubectl get events -n honeydue --sort-by=.lastTimestamp
+```
+
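+**Retention**: the kubelet (not containerd itself) rotates a container's
+log once it exceeds 10 MiB (the `containerLogMaxSize` default) and keeps
+at most 5 files per container (the `containerLogMaxFiles` default), so
+roughly 50 MiB per container stays on the node's disk unless K3s was
+started with different kubelet arguments. `kubectl logs` generally serves
+only the newest of those files. Once a pod is deleted, its logs are gone.
+
+For persistent log access we'd need aggregation (see "No log
+aggregation" below).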
+
+### 2. `kubectl top`
+
+Pod and node resource usage via metrics-server:
+
+```bash
+kubectl top nodes
+# NAME                CPU(cores)   CPU(%)   MEMORY(bytes)   MEMORY(%)
+# ubuntu-8gb-nbg1-1   169m         4%       748Mi           9%
+# ubuntu-8gb-nbg1-2   229m         5%       1043Mi          13%
+# ubuntu-8gb-nbg1-3   124m         3%       770Mi           9%
+
+kubectl top pods -n honeydue
+```
+
+**Retention**: In-memory only. Last few minutes of data. No
+historical view.
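+
+A minimal sketch of a stopgap for the missing history, assuming it is
+acceptable to sample from an operator machine: append timestamped
+`kubectl top` snapshots to a local file. `stamp` and `sample_top` are
+hypothetical helper names, and the output path is only an example.
+
+```bash
+# stamp [TS]: prefix every stdin line with TS (default: current UTC time).
+stamp() {
+  local ts="${1:-$(date -u +%FT%TZ)}"
+  awk -v ts="$ts" '{ print ts, $0 }'
+}
+
+# sample_top FILE: append one timestamped snapshot of pod usage to FILE.
+sample_top() {
+  kubectl top pods -n honeydue --no-headers | stamp >> "$1"
+}
+
+# Example: one sample per minute until interrupted.
+# while true; do sample_top /tmp/honeydue-top.log; sleep 60; done
+```
+
+This only records while the loop is running, so it is a debugging aid
+rather than a replacement for a real time-series store.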
+
+### 3. Cloudflare Analytics
+
+CF Dashboard → Analytics & Logs. Per-zone stats:
+- Requests per second
+- Bandwidth
+- Cache hit ratio
+- Top HTTP status codes
+- Top request paths
+- Bot traffic score
+
+All aggregated, no individual request traces. Good for spotting macro
+trends ("suddenly 10× more 502s today"), poor for debugging specific
+issues.
+
+Free tier retention: 7 days of aggregate stats. Pro extends this.
+
+### 4. Neon dashboard
+
+Neon console → project → Monitoring:
+- Compute utilization (CU-hours consumed)
+- Query performance (slow queries)
+- Active connections
+- Storage usage
+
+Good for "is the DB busy?" and "am I close to my free tier limit?"
+Not real-time.
+
+### 5. Kubernetes events
+
+`kubectl get events` shows cluster-level state changes: pod scheduling,
+failures, image pulls, probe failures. Useful for post-mortem on
+deploys.
+
+Retention: events are stored in etcd and expire after 1 hour by default
+(the API server's `--event-ttl`).
+
+## What we don't have (the gap)
+
+### No log aggregation
+
+Individual pod logs are on the node. For multi-pod debugging ("show me
+all api pod logs for user X") we have to:
+
+```bash
+# Query all at once with stern (if installed)
+stern -n honeydue api
+
+# Or for a specific pod (zerolog writes compact JSON, so match the field as emitted)
+kubectl logs -n honeydue <pod> | grep '"user_id":12345'
+```
+
+This works but doesn't scale. Grep across 3 pods for a specific
+user_id is OK. Across 30 pods, intractable.
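+
+A minimal sketch of that multi-pod search without extra tooling,
+assuming the api pods emit zerolog's compact one-object-per-line JSON
+(so the field appears as `"user_id":123` with no space). `filter_user`
+is a hypothetical helper, not something in the repo.
+
+```bash
+# filter_user ID: read JSON log lines on stdin and print only those whose
+# user_id field equals ID exactly (123 must not match 1234).
+filter_user() {
+  case "$1" in
+    ''|*[!0-9]*) echo "usage: filter_user <numeric-id>" >&2; return 2 ;;
+  esac
+  grep -E "\"user_id\":$1[,}]"
+}
+
+# Example: search every api pod; --tail=-1 lifts the 10-line default
+# that kubectl applies when a label selector is used.
+# kubectl logs -n honeydue -l app.kubernetes.io/name=api --tail=-1 --prefix \
+#   | filter_user 12345
+```
+
+Anchoring the match on the following `,` or `}` is what keeps a short ID
+from matching longer ones that share its prefix.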
+
+**What we'd add**: [Loki](https://grafana.com/oss/loki/), a lightweight
+log aggregator designed for k8s. Free to self-host (it costs only
+cluster resources) and integrates with Grafana for queries. Or
+[Better Stack](https://betterstack.com/logs) ($10/mo, hosted).
+
+### No metrics/dashboards
+
+`kubectl top` tells us "is this pod hot right now?" but not "has CPU
+been climbing over the past hour?" We'd need:
+
+- **Prometheus** — scrapes metrics from kubelet and pods' `/metrics`
+  endpoints, stores time series
+- **Grafana** — queries Prometheus, renders dashboards
+
+Both can be installed on K3s via Helm in ~10 minutes. They add ~500MB of
+RAM usage to the cluster. Stability impact and operational load: moderate.
+
+**Alternative**: [Kubernetes Dashboard](https://github.com/kubernetes/dashboard),
+which K3s does not bundle and would have to be installed separately.
+Minimal UI over the existing metrics API. Cheaper than Prometheus but
+less queryable.
+
+### No distributed tracing
+
+"This request took 800ms — which hop was slow?" is currently unanswerable
+beyond "the DB query, probably." A real trace would show:
+- TLS handshake time
+- Traefik routing time
+- Go handler time
+- Postgres query time
+- Redis call time
+- Each B2 request time
+
+We'd add OpenTelemetry to the Go app and export to Jaeger/Tempo. Work
+is moderate; value kicks in when we have complex request flows.
+
+### No alerting
+
+No PagerDuty, no Slack webhooks, no email on "api is returning 500s."
+The operator finds out when users complain.
+
+Cheapest fix: [Uptime Kuma](https://github.com/louislam/uptime-kuma)
+(self-hosted) or Better Stack Uptime (free for small teams). Ping
+`https://api.myhoneydue.com/api/health/` every minute; alert if it fails.
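+
+Until one of those is in place, the same check can be sketched in a few
+lines of shell. This is an illustration under assumptions, not a
+deployed script: `is_healthy`, `probe`, and `watch_health` are
+hypothetical names, and the `echo` stands in for a real notifier.
+
+```bash
+# is_healthy STATUS: succeed only for HTTP 200, the code the health
+# endpoint returns when the process is up.
+is_healthy() {
+  [ "$1" = "200" ]
+}
+
+# probe URL: print the HTTP status code for URL; curl prints 000 when it
+# gets no response at all (DNS failure, timeout, connection refused).
+probe() {
+  curl -s -o /dev/null -m 10 -w '%{http_code}' "$1" || true
+}
+
+# watch_health URL: probe once a minute and print a line on each failure.
+watch_health() {
+  local code
+  while true; do
+    code=$(probe "$1")
+    is_healthy "$code" || echo "$(date -u +%FT%TZ) ALERT: $1 returned $code"
+    sleep 60
+  done
+}
+
+# Example:
+# watch_health https://api.myhoneydue.com/api/health/
+```
+
+A loop like this only alerts while it is running on some machine, which
+is why a hosted or separately deployed checker is the better long-term
+answer.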
+
+### No APM (Application Performance Monitoring)
+
+No request-level profiling. We can't see "which endpoint has the highest
+p99 latency?" or "which SQL query is hot this week?"
+
+Options: Datadog, New Relic, Honeycomb, self-hosted Tempo+Grafana.
+All take meaningful work to set up, and the hosted ones are expensive.
+
+## The app's logging conventions
+
+The Go app uses zerolog and emits structured JSON:
+
+```json
+{
+  "level": "info",
+  "time": "2026-04-24T05:29:40Z",
+  "caller": "/app/cmd/api/main.go:189",
+  "addr": ":8000",
+  "message": "HTTP server listening"
+}
+```
+
+Log levels: `debug`, `info`, `warn`, `error`, `fatal`. Controlled by
+`DEBUG=true|false` in ConfigMap (true sets level to debug, false sets
+level to info).
+
+Every request is logged with:
+- Method, path, status code
+- Request ID (for correlating logs across pods)
+- User ID (if authenticated)
+- Latency
+
+```json
+{
+  "level": "info",
+  "method": "GET",
+  "path": "/api/tasks/",
+  "status": 200,
+  "latency_ms": 42,
+  "user_id": 123,
+  "request_id": "a6b5db35-..."
+}
+```
+
+This is queryable by grep. Better with log aggregation.
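+
+As a rough illustration of what grep-level querying can still answer,
+the sketch below lists slow requests from the request log. It assumes
+the compact JSON form of the fields shown above (`latency_ms`, `path`);
+`slow_requests` is a hypothetical helper name.
+
+```bash
+# slow_requests MS: read JSON request-log lines on stdin and print
+# "<latency_ms> <path>" for every request slower than MS, slowest first.
+slow_requests() {
+  awk -v min="$1" '
+    match($0, /"latency_ms":[0-9]+/) {
+      # Skip the 13 characters of the key and colon to get the number.
+      ms = substr($0, RSTART + 13, RLENGTH - 13) + 0
+      path = "-"
+      if (match($0, /"path":"[^"]*"/))
+        path = substr($0, RSTART + 8, RLENGTH - 9)
+      if (ms > min) print ms, path
+    }
+  ' | sort -rn
+}
+
+# Example: requests slower than 500 ms across all api pods.
+# kubectl logs -n honeydue -l app.kubernetes.io/name=api --tail=-1 \
+#   | slow_requests 500
+```
+
+Lines without a `latency_ms` field (startup messages, for example) are
+skipped, so the helper can be fed the raw log stream.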
+
+## Health endpoints
+
+Each service exposes a health endpoint:
+
+| Service | Endpoint | What it checks |
+|---|---|---|
+| api | `/api/health/` | Process alive (doesn't verify DB) |
+| admin | `/` | Next.js is up |
+| worker | (none public) | Internal Asynq status |
+
+Health endpoints are **shallow** — they return 200 if the process is
+running and listening. They don't try to reach Postgres/Redis/etc.
+Rationale: if Postgres is briefly down, we don't want all api pods to
+start failing liveness and cascade-restart.
+
+## Dozzle (deprecated)
+
+The Swarm era had [Dozzle](https://github.com/amir20/dozzle) — a
+lightweight web UI for Docker logs. Accessible via SSH tunnel to the
+manager node. Not deployed on k3s; `kubectl logs` and `stern` fill the
+niche.
+
+## Kubernetes metrics the k8s API exposes
+
+Even without Prometheus, these are queryable:
+
+```bash
+# Resource metrics (via metrics-server)
+kubectl get --raw /apis/metrics.k8s.io/v1beta1/nodes
+kubectl get --raw /apis/metrics.k8s.io/v1beta1/namespaces/honeydue/pods
+
+# Core API (k8s state)
+kubectl get --raw /api/v1/namespaces/honeydue/pods/<name>
+
+# Kubelet metrics (per-node; proxied through the API server, so no direct node access is needed)
+kubectl get --raw /api/v1/nodes/<node>/proxy/metrics
+```
+
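+If we ever spin up Prometheus, it would typically scrape the kubelet
+endpoint (and its `/metrics/cadvisor` sibling) rather than the
+metrics-server API, which only serves the latest sample.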
+
+## Future: what to add and when
+
+| Trigger | Add |
+|---|---|
+| 10k+ daily users | Loki + Grafana for logs |
+| 100+ req/s sustained | Prometheus + Grafana for metrics |
+| Performance incidents | OpenTelemetry tracing |
+| Revenue > $5k/mo | Paid monitoring (Datadog or similar) |
+| First production outage | Alerting to phone/Slack |
+
+The overall philosophy: observability is an investment that compounds.
+Add it before you need it, not after. But also don't over-invest at
+idle.
+
+**Next quarter**: set up Uptime Kuma + Loki at minimum.
+
+## Checking what's installed
+
+```bash
+# In kube-system namespace
+kubectl get pods -n kube-system
+# Should see: coredns, metrics-server, traefik, local-path-provisioner,
+# and some k3s-related helm install jobs
+
+# In honeydue namespace
+kubectl get pods -n honeydue
+# api, admin, worker, redis
+
+# No monitoring namespace (yet)
+kubectl get namespaces
+# default, honeydue, kube-node-lease, kube-public, kube-system
+```
+
+## Operator cheat sheet
+
+```bash
+# Tail all logs in the namespace
+kubectl logs -n honeydue --all-containers=true --tail=50 -l app.kubernetes.io/part-of=honeydue
+
+# With stern (if installed: brew install stern)
+stern -n honeydue .
+
+# Follow a specific pod (current container only; for the last run of a
+# restarted container, use --previous instead of -f)
+kubectl logs -n honeydue <pod> -f
+
+# Pod resource usage
+kubectl top pods -n honeydue --sort-by=memory
+kubectl top pods -n honeydue --sort-by=cpu
+
+# Events (cluster-wide)
+kubectl get events -A --sort-by=.lastTimestamp | tail -20
+
+# Full state dump for a pod (debugging)
+kubectl describe pod -n honeydue <pod> > /tmp/pod-dump.txt
+kubectl logs -n honeydue <pod> > /tmp/pod-logs.txt
+```
+
+## References
+
+- [Kubernetes metrics-server][ms]
+- [K3s metrics][k3s-metrics]
+- [Loki][loki]
+- [Stern (multi-pod log tail)][stern]
+
+[ms]: https://github.com/kubernetes-sigs/metrics-server
+[k3s-metrics]: https://docs.k3s.io/advanced#enabling-metrics-server
+[loki]: https://grafana.com/oss/loki/
+[stern]: https://github.com/stern/stern