Migrate prod deploy from Swarm to K3s; add full deployment book

Infrastructure:
- Stack now runs on K3s v1.34.6 HA (3 Hetzner CX33 nodes as managers)
- Traefik DaemonSet + hostNetwork replaces Caddy + ingress mesh
- All manifests in deploy-k3s/manifests/; Swarm config (deploy/) kept temporarily for reference

Bug fixes surfaced during migration:
- Dockerfile: golang:1.24-alpine -> 1.25-alpine (go.mod requires 1.25)
- cache_service.go: remove sync.Once reassignment from inside Do() callback (was causing 'unlock of unlocked mutex' fatal after Redis Ping failure)
- router.go: relax CSP from 'default-src none' to 'default-src self' + allowlist fonts.googleapis.com so the marketing landing page CSS actually loads in browsers
- deploy/scripts/deploy_prod.sh: use docker buildx with --platform linux/amd64 so arm64 (Apple Silicon) dev machines produce images runnable on x86_64 Hetzner nodes; fix array expansion under set -u
- deploy/swarm-stack.prod.yml: fix secret source references to use top-level aliases (the '\${X_SECRET}' form never actually resolved); dozzle ports: long-form host_ip is rejected by Swarm, switched to short-form (bound to 0.0.0.0 with UFW-based loopback restriction); worker replicas 2 -> 1 (Asynq scheduler singleton)
- deploy-k3s/manifests/admin/deployment.yaml: probe path '/admin/' -> '/' (Next.js serves at root; /admin/ returned 404 and killed pods); startupProbe failureThreshold 12 -> 24
- deploy-k3s/manifests/pod-disruption-budgets.yaml: worker minAvailable 1 -> 0 (singleton)
- deploy-k3s/manifests/api/deployment.yaml: startupProbe failureThreshold 12 -> 48 (MigrateWithLock serializes across 3 replicas on first-boot; real startup takes up to 240s)
- .gitignore: tighten 'api' -> '/api' (was matching deploy-k3s/manifests/api/ and admin/src/app/api/*, hiding legitimate files)

New files:
- deploy-k3s/manifests/traefik-helmchartconfig.yaml: DaemonSet + hostNetwork override for k3s-bundled Traefik
- deploy-k3s/manifests/ingress/ingress-simple.yaml: plain Ingress without TLS (CF Flexible SSL) and without middleware
- deploy-k3s/MIGRATION_NOTES.md: operator-facing migration log

Documentation:
- docs/deployment/: full deployment book (26 files, ~42k words):
  - Part I Overview, infrastructure, orchestrator choice (Ch 0-2)
  - Part II Networking, firewall, Cloudflare (Ch 3-4, 13)
  - Part III Security, Traefik ingress (Ch 5-6)
  - Part IV Services, DB, storage, secrets, registry (Ch 7-11)
  - Part V Data flow, deploy process, observability, failures, runbook (Ch 12, 14-17)
  - Part VI Cost, Swarm postmortem, roadmap (Ch 18-20)
  - Appendices: glossary, kubectl cheat sheet, file locations, consolidated citations
- README.md: Production Deployment section replaced with pointer to the book; Go version bumped to 1.25

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-24 07:20:21 -05:00
parent 4ec4bbbfe8
commit 6f303dbbaa
46 changed files with 9785 additions and 93 deletions
@@ -0,0 +1,298 @@
+# 08 — Database (Neon Postgres)
+
+## Summary
+
+Authoritative user data lives in a Neon-managed Postgres database in AWS
+us-east-1. Connections use TLS (`DB_SSLMODE=require`). Schema is managed
+via GORM AutoMigrate inside the api binary, coordinated across replicas
+by a Postgres advisory lock to prevent concurrent migration attempts.
+
+## Why Neon
+
+### Decision matrix
+
+At deploy time we considered:
+
+| Option | Setup effort | Monthly cost | Backup/PITR | Scale ceiling | Notes |
+|---|---|---|---|---|---|
+| **Neon Launch** | Zero (managed) | $5-15 | Included | Large | **Picked** |
+| Postgres on a Hetzner VPS | High | $8 (VPS) | Manual | Medium | More ops |
+| AWS RDS | Medium | $30+ | Included | Huge | Overkill, expensive |
+| Supabase Free | Zero | $0 | Limited | Small | Free tier has quota limits |
+| CNPG on our k3s | High (Helm) | $0 (using cluster) | Self-rolled | Medium | Operational burden |
+
+Neon Launch won on:
+- **Serverless**: scales compute to zero when idle (cheap)
+- **Branch databases**: we can create dev/staging branches from prod in seconds
+- **Connection pooling built-in**: PgBouncer on the hostname suffix `-pooler`
+- **Point-in-time recovery** included (paid tier)
+- **Pay-as-you-go** with a $5 minimum — fits a bootstrapped app
+
+### Connection details
+
+| Field | Value |
+|---|---|
+| Hostname | `ep-floral-truth-amttbc5a.c-5.us-east-1.aws.neon.tech` |
+| Port | 5432 |
+| Username | `neondb_owner` |
+| Database | `honeyDue` (case-sensitive!) |
+| TLS mode | `require` (enforced by Neon; encrypts the connection, but at this mode the driver does not verify the server certificate, which needs `verify-full`) |
+| Branch | production (Neon's concept — isolated DB within the project) |
+
+### The database name is case-sensitive
+
+Postgres folds unquoted identifiers to lowercase. Neon's UI created the
+database as `"honeyDue"` (quoted, camelCase preserved), and the database
+name in a connection string is matched exactly, without folding. In
+`prod.env` / ConfigMap we must therefore use exactly
+`POSTGRES_DB=honeyDue`; lowercase `honeydue` fails with
+`database "honeydue" does not exist`. This bit
+us during the initial Swarm deploy (Chapter 19 §Neon DB name).
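
As an illustration of where the case-sensitive name ends up, the sketch below assembles a connection URL from the settings in this chapter. `buildDSN` is a hypothetical helper, not the app's actual code, and the password is a placeholder; the point is only that the database name travels in the URL path verbatim.

```go
package main

import (
	"fmt"
	"net"
	"net/url"
)

// buildDSN assembles a Postgres connection URL from individual settings.
// The database name is placed in the URL path as-is, so its case survives.
// Hypothetical helper for illustration only.
func buildDSN(host, port, user, password, dbName, sslMode string) string {
	q := url.Values{}
	q.Set("sslmode", sslMode)
	u := url.URL{
		Scheme:   "postgres",
		User:     url.UserPassword(user, password),
		Host:     net.JoinHostPort(host, port),
		Path:     "/" + dbName,
		RawQuery: q.Encode(),
	}
	return u.String()
}

func main() {
	dsn := buildDSN(
		"ep-floral-truth-amttbc5a.c-5.us-east-1.aws.neon.tech",
		"5432",
		"neondb_owner",
		"example-password", // placeholder, not a real credential
		"honeyDue",         // must keep the capital D
		"require",
	)
	fmt.Println(dsn)
}
```

Building the URL with `net/url` rather than string concatenation also escapes any special characters in the password.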
+
+## Connection pooling
+
+### Why it matters
+
+Postgres is memory-hungry per connection (~5-10 MB each). 3 api replicas
+× `DB_MAX_OPEN_CONNS=25` = up to 75 direct Postgres connections. Add
+the worker's 25. Neon's free tier caps at 100 concurrent connections;
+paid tiers allow many more.
+
+### PgBouncer on Neon
+
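+Neon exposes its built-in PgBouncer on a separate hostname, formed by
+adding `-pooler` to the endpoint ID (the first label of the hostname).
+Connections go through PgBouncer only when `DB_HOST` uses that pooled
+hostname. The hostname listed under Connection details has no `-pooler`
+suffix, so that exact hostname is the direct (unpooled) endpoint;
+`DB_HOST` must carry the suffix for the pooling described here to apply.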
+
+Modes PgBouncer supports:
+- **session** — one server connection held per client session (transparent)
+- **transaction** — server connection released after each transaction (high-throughput)
+- **statement** — per-statement (most aggressive; breaks many features)
+
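+Neon's pooler runs in **transaction mode**. GORM works in this mode as
+long as the app avoids session-level state such as SQL-level `PREPARE`,
+`SET` session variables, and session-level advisory locks. The advisory
+lock taken by `MigrateWithLock` (see Schema management) is session-level,
+so that connection needs the direct, unpooled endpoint.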
+
+### Connection pool settings
+
+In `prod.env`:
+
+```
+DB_MAX_OPEN_CONNS=25
+DB_MAX_IDLE_CONNS=10
+DB_MAX_LIFETIME=600s
+```
+
+These are the Go `database/sql` pool settings (GORM uses `database/sql`
+underneath):
+
+- **MaxOpenConns: 25** — at most 25 concurrent connections per replica
+- **MaxIdleConns: 10** — keep up to 10 warm connections ready to reuse
+- **MaxLifetime: 600s** — recycle connections after 10 min (prevents
+  stale state in long-lived connections, good for Neon's idle timeout)
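
A minimal sketch of how these three variables could map onto the `database/sql` pool follows. The `PoolSettings` type and `parsePoolSettings` function are hypothetical names for illustration, not the app's code; the `SetMaxOpenConns`, `SetMaxIdleConns`, and `SetConnMaxLifetime` calls are the standard-library methods the settings correspond to.

```go
package main

import (
	"database/sql"
	"fmt"
	"strconv"
	"time"
)

// PoolSettings mirrors the three DB_* variables from prod.env.
// Hypothetical type for illustration only.
type PoolSettings struct {
	MaxOpenConns int
	MaxIdleConns int
	MaxLifetime  time.Duration
}

// parsePoolSettings reads the settings through getenv (os.Getenv in real use).
func parsePoolSettings(getenv func(string) string) (PoolSettings, error) {
	var s PoolSettings
	var err error
	if s.MaxOpenConns, err = strconv.Atoi(getenv("DB_MAX_OPEN_CONNS")); err != nil {
		return s, fmt.Errorf("DB_MAX_OPEN_CONNS: %w", err)
	}
	if s.MaxIdleConns, err = strconv.Atoi(getenv("DB_MAX_IDLE_CONNS")); err != nil {
		return s, fmt.Errorf("DB_MAX_IDLE_CONNS: %w", err)
	}
	if s.MaxLifetime, err = time.ParseDuration(getenv("DB_MAX_LIFETIME")); err != nil {
		return s, fmt.Errorf("DB_MAX_LIFETIME: %w", err)
	}
	return s, nil
}

// apply configures the database/sql pool that GORM sits on top of.
func (s PoolSettings) apply(db *sql.DB) {
	db.SetMaxOpenConns(s.MaxOpenConns)
	db.SetMaxIdleConns(s.MaxIdleConns)
	db.SetConnMaxLifetime(s.MaxLifetime)
}

func main() {
	env := map[string]string{
		"DB_MAX_OPEN_CONNS": "25",
		"DB_MAX_IDLE_CONNS": "10",
		"DB_MAX_LIFETIME":   "600s",
	}
	s, err := parsePoolSettings(func(k string) string { return env[k] })
	if err != nil {
		panic(err)
	}
	fmt.Printf("%+v\n", s)
}
```

With GORM, the underlying `*sql.DB` to pass to `apply` is available from `gormDB.DB()`.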
+
+### Worst-case connection count
+
+3 api + 1 worker replicas × 25 conns = 100 peak. Right at Neon free
+tier's ceiling, with zero margin. **This is a real risk** — a spike that
+saturates the pool on all replicas simultaneously would exhaust Neon's
+limit.
+
+Mitigations to consider:
+- Drop `DB_MAX_OPEN_CONNS` to 15 → 60 peak. Safe on free tier.
+- Upgrade to Neon Scale plan (1000+ connections).
+- Rely on Neon's PgBouncer to multiplex — the raw backend connections
+  to Postgres-proper are pooled, not our TCP connections to Neon.
+
+Currently we trust Neon's pooler to handle the multiplexing and run with
+the default 25/10. If we hit connection errors in prod, adjust.
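
The arithmetic above can be checked with a few lines of Go. This is only a sketch of the calculation; `peakConnections` is a hypothetical helper, and the limit of 100 is the ceiling quoted in this chapter.

```go
package main

import "fmt"

// peakConnections returns the worst-case number of client connections when
// every replica fills its pool at the same time.
func peakConnections(apiReplicas, workerReplicas, maxOpenPerReplica int) int {
	return (apiReplicas + workerReplicas) * maxOpenPerReplica
}

func main() {
	const limit = 100 // connection ceiling quoted in this chapter
	for _, maxOpen := range []int{25, 15} {
		peak := peakConnections(3, 1, maxOpen)
		fmt.Printf("DB_MAX_OPEN_CONNS=%d -> peak %d, headroom %d\n", maxOpen, peak, limit-peak)
	}
	// DB_MAX_OPEN_CONNS=25 -> peak 100, headroom 0
	// DB_MAX_OPEN_CONNS=15 -> peak 60, headroom 40
}
```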
+
+## Schema management
+
+### GORM AutoMigrate
+
+On startup, the Go API's `cmd/api/main.go` calls
+`database.MigrateWithLock()` which:
+
+1. Opens a dedicated Postgres connection
+2. `SELECT pg_advisory_lock(1751412071)` — acquires a session-level
+   advisory lock on a hardcoded key
+3. Calls `db.AutoMigrate(&models.*{})` for every GORM model
+4. `SELECT pg_advisory_unlock(...)` via deferred function
+5. Closes the connection
+
+The advisory lock serializes migrations across replicas: when 3 api
+pods start simultaneously, one acquires the lock and migrates; the
+others block on the lock. Once the first finishes (≤2s for already-
+migrated schema, up to 90s on first cold boot), the next acquires and
+sees the schema is current (no-op migrate).
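
The lock, migrate, unlock sequence can be sketched as below. This is not the actual `MigrateWithLock` source: `withAdvisoryLock`, the `execer` interface, and `recordingConn` are hypothetical names, and `recordingConn` only stands in for a real connection so the sketch runs without a database. In real use the connection would be a `*sql.Conn` obtained from `db.Conn(ctx)`, which pins one session so the lock and unlock hit the same backend.

```go
package main

import (
	"context"
	"database/sql"
	"errors"
	"fmt"
)

// execer is the subset of *sql.Conn this sketch needs.
type execer interface {
	ExecContext(ctx context.Context, query string, args ...any) (sql.Result, error)
}

// withAdvisoryLock takes a session-level advisory lock on conn, runs migrate,
// and releases the lock on the same session before returning.
func withAdvisoryLock(ctx context.Context, conn execer, key int64, migrate func() error) (err error) {
	if _, err = conn.ExecContext(ctx, "SELECT pg_advisory_lock($1)", key); err != nil {
		return fmt.Errorf("acquire advisory lock: %w", err)
	}
	// The deferred unlock runs even when migrate fails.
	defer func() {
		_, unlockErr := conn.ExecContext(ctx, "SELECT pg_advisory_unlock($1)", key)
		err = errors.Join(err, unlockErr)
	}()
	return migrate()
}

// recordingConn stands in for *sql.Conn so the sketch runs without a database.
type recordingConn struct{ log []string }

func (c *recordingConn) ExecContext(_ context.Context, query string, _ ...any) (sql.Result, error) {
	c.log = append(c.log, query)
	return nil, nil
}

func main() {
	conn := &recordingConn{}
	err := withAdvisoryLock(context.Background(), conn, 1751412071, func() error {
		conn.log = append(conn.log, "AutoMigrate") // placeholder for db.AutoMigrate(...)
		return nil
	})
	fmt.Println(conn.log, err)
}
```

`pg_advisory_lock` blocks until the lock is free, which is what makes the second and third replicas wait rather than fail.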
+
+### Why an advisory lock
+
+Without it, concurrent `CREATE TABLE IF NOT EXISTS ...` statements from
+multiple replicas would race — Postgres usually handles it, but GORM's
+AutoMigrate also alters tables (adds columns, indexes) which can deadlock
+under concurrency.
+
+The advisory lock pattern (also used by Rails migrations and
+golang-migrate's Postgres driver) is a well-established solution.
+
+### The lock key
+
+`1751412071` is a hardcoded integer in `internal/database/database.go`.
+The value is arbitrary but must be unique: advisory lock keys are scoped
+to a database, so as long as nothing else in `honeyDue` uses the same
+key, there are no conflicts.
+
+### First-boot behavior
+
+On a **fresh database** (new Neon project), the first api pod runs
+through every model's `CREATE TABLE` statement. This is ~50 tables for
+honeyDue and takes ~90 seconds.
+
+On a **warm database** (tables already exist), AutoMigrate is fast —
+typically under 2 seconds. It still runs (GORM checks every model
+against the schema) but finds no work to do.
+
+### Where this bit us
+
+With 3 api pods starting simultaneously and migrations taking 90s first
+time, the lock queue for the last replica is ~180s. We needed a
+startupProbe grace of 240s to cover this without false restart loops.
+See Chapter 7 §startupProbe and Chapter 19 §MigrateWithLock.
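
A startupProbe's tolerance is roughly `failureThreshold × periodSeconds` (ignoring any initial delay). The sketch below compares the old and new thresholds against the ~180s lock queue. The 5-second period is an assumption inferred from 48 failures mapping to 240s; it is not stated in this chapter, and `probeBudget` is a hypothetical helper.

```go
package main

import (
	"fmt"
	"time"
)

// probeBudget is how long a startupProbe tolerates failures before the
// kubelet restarts the container: failureThreshold * periodSeconds.
func probeBudget(failureThreshold, periodSeconds int) time.Duration {
	return time.Duration(failureThreshold*periodSeconds) * time.Second
}

func main() {
	const periodSeconds = 5        // assumption: inferred from 48 failures = 240s
	lockQueue := 180 * time.Second // wait quoted above for the last replica
	for _, threshold := range []int{12, 48} {
		budget := probeBudget(threshold, periodSeconds)
		fmt.Printf("failureThreshold=%d -> budget %v, covers lock queue: %t\n",
			threshold, budget, budget >= lockQueue)
	}
	// failureThreshold=12 -> budget 1m0s, covers lock queue: false
	// failureThreshold=48 -> budget 4m0s, covers lock queue: true
}
```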
+
+### Downside: no schema versioning
+
+AutoMigrate is effectively additive: it creates missing tables, columns,
+constraints, and indexes, and it can alter an existing column's type
+when its size, precision, or nullability changes. It won't drop unused
+columns or rename them, and it keeps no version history of applied
+changes. For those we'd need raw SQL migrations (a tool like
+`golang-migrate` or `dbmate`).
+
+Today we accept that schema changes are additive-only. When we need
+destructive changes, we will hand-write them.
+
+## What's in the database
+
+Major tables (see `honeyDueAPI-go/internal/models/`):
+
+| Table | Purpose |
+|---|---|
+| `auth_user` | Users (Django legacy name kept for compatibility) |
+| `user_userprofile` | Profile data |
+| `authtoken_token` | API auth tokens |
+| `residence_residence` | Properties users manage |
+| `task_task` | Maintenance tasks |
+| `task_taskcompletion` | Task completion history |
+| `contractor_contractor` | Contractor contacts |
+| `documents_document` | Document records (files in B2) |
+| `notification_notification` | In-app notifications |
+| `subscription_usersubscription` | IAP subscriptions |
+| `admin_users` | Next.js admin panel users |
+
+See `honeyDueAPI-go/docs/TASK_LOGIC_ARCHITECTURE.md` for the task logic
+model details.
+
+## Backup and recovery
+
+### Neon's built-in
+
+Neon Launch includes **point-in-time recovery** within the last 24h
+(longer on Scale plan). To restore:
+
+1. Go to Neon console → project → Backups
+2. Create a branch from a timestamp
+3. Point the app at the new branch (change `DB_HOST` in our ConfigMap)
+
+Done. No tape-wrangling.
+
+### What we don't have
+
+- Off-site backup (if Neon itself is compromised or lost, we have no
+  independent copy). A nightly `pg_dump` to Backblaze B2 would close
+  this gap. **TODO** (Chapter 20).
+- Tested DR drills. We've never actually restored from a Neon backup
+  into a new branch and pointed the app at it. Should be routine; hasn't
+  been exercised.
+
+## Migrations from old MyCrib/Casera data
+
+honeyDue originally ran on a Django codebase (MyCrib / Casera-era). The
+schema inherits Django's naming (`app_model` table names, `_id` suffix
+foreign keys). The Go app's GORM models have `TableName()` methods that
+preserve this:
+
+```go
+func (Task) TableName() string { return "task_task" }
+```
+
+This isn't ideal (GORM's default `tasks` would be cleaner), but changing
+would require a migration that renames every table — more risk than
+value.
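
For reference, Django's default table name is the app label, an underscore, and the lowercased model class name. The sketch below reproduces that rule; `djangoTableName` is a hypothetical helper, and the app labels and model names passed to it are inferred from the table names listed earlier, not taken from the original Django code.

```go
package main

import (
	"fmt"
	"strings"
)

// djangoTableName reproduces Django's default table naming:
// the app label, an underscore, then the lowercased model class name.
func djangoTableName(appLabel, modelName string) string {
	return appLabel + "_" + strings.ToLower(modelName)
}

func main() {
	fmt.Println(djangoTableName("task", "Task"))           // task_task
	fmt.Println(djangoTableName("task", "TaskCompletion")) // task_taskcompletion
	fmt.Println(djangoTableName("user", "UserProfile"))    // user_userprofile
}
```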
+
+## Neon regions
+
+Neon's default region for new projects is `aws-us-east-1` (Virginia).
+Our DB is there. Latency from Nuremberg to us-east-1 is **~90-120ms
+round trip**.
+
+This is the slowest hop in our data flow. Every api request that needs
+a DB query (most of them) pays this latency at least once.
+
+**When this matters**: When complex endpoints start showing response
+times of ~200ms or more, DB latency is likely the dominant cost. Options:
+- Migrate Neon to `aws-eu-central-1` (Frankfurt) — shaves ~90ms off
+- Add Redis caching for hot reads (Chapter 7)
+- Read replicas (Neon supports them on paid tiers)
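
To make the latency cost concrete, the sketch below multiplies a round-trip time by the number of queries an endpoint issues one after another. The 100ms figure is picked from the 90-120ms range above, and the 10ms figure is a hypothetical same-region value (100ms minus the ~90ms saving); neither is a measurement, and `latencyFloor` is a hypothetical helper.

```go
package main

import (
	"fmt"
	"time"
)

// latencyFloor is the minimum time an endpoint spends waiting on the network
// when it issues its queries one after another.
func latencyFloor(sequentialQueries int, roundTrip time.Duration) time.Duration {
	return time.Duration(sequentialQueries) * roundTrip
}

func main() {
	rtts := []time.Duration{
		100 * time.Millisecond, // Nuremberg to us-east-1, from the range above
		10 * time.Millisecond,  // hypothetical same-region round trip
	}
	for _, rtt := range rtts {
		fmt.Printf("rtt=%v: 1 query %v, 3 queries %v\n",
			rtt, latencyFloor(1, rtt), latencyFloor(3, rtt))
	}
	// rtt=100ms: 1 query 100ms, 3 queries 300ms
	// rtt=10ms: 1 query 10ms, 3 queries 30ms
}
```

An endpoint that runs three sequential queries already spends about 300ms on round trips alone at the current distance, which is why moving the region or caching hot reads has the largest effect.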
+
+## Environment variables the app reads
+
+From ConfigMap:
+
+| Var | Purpose |
+|---|---|
+| `DB_HOST` | Neon pooler hostname |
+| `DB_PORT` | 5432 |
+| `POSTGRES_USER` | `neondb_owner` |
+| `POSTGRES_DB` | `honeyDue` |
+| `DB_SSLMODE` | `require` |
+| `DB_MAX_OPEN_CONNS` | 25 |
+| `DB_MAX_IDLE_CONNS` | 10 |
+| `DB_MAX_LIFETIME` | `600s` |
+
+From Secret (`honeydue-secrets`):
+
+| Var | Purpose |
+|---|---|
+| `POSTGRES_PASSWORD` | Neon DB password |
+
+## Operator cheat sheet
+
+```bash
+# Connect to Neon from workstation (requires psql + the password)
+PGPASSWORD="<pw>" psql -h ep-floral-truth-amttbc5a.c-5.us-east-1.aws.neon.tech \
+  -U neondb_owner -d honeyDue
+
+# From a pod (lets you debug against the actual in-cluster network path)
+kubectl exec -n honeydue -it deploy/api -- sh
+# inside the pod (no psql by default, but wget + JSON API works)
+wget -qO- http://127.0.0.1:8000/api/health/
+
+# See current migration state (no direct CLI, but the api logs show it)
+kubectl logs -n honeydue deploy/api | grep -i migration
+
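+# See active connections (runs the query against Neon via psql)
+PGPASSWORD="<pw>" psql -h ep-floral-truth-amttbc5a.c-5.us-east-1.aws.neon.tech \
+  -U neondb_owner -d honeyDue -c \
+  "SELECT count(*), usename, state, application_name
+   FROM pg_stat_activity
+   GROUP BY usename, state, application_name;"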
+```
+
+## References
+
+- [Neon docs][neon-docs]
+- [Neon pricing][neon-pricing]
+- [Postgres advisory locks][pg-locks]
+- [GORM AutoMigrate][gorm-automigrate]
+- [honeyDue task architecture][task-arch] (repo-local)
+
+[neon-docs]: https://neon.com/docs/introduction
+[neon-pricing]: https://neon.com/pricing
+[pg-locks]: https://www.postgresql.org/docs/current/explicit-locking.html#ADVISORY-LOCKS
+[gorm-automigrate]: https://gorm.io/docs/migration.html
+[task-arch]: ../../docs/TASK_LOGIC_ARCHITECTURE.md