# Deploy Folder

This folder is the full production deploy toolkit for `honeyDueAPI-go`.

**Recommended flow — always dry-run first:**

```bash
DRY_RUN=1 ./.deploy_prod   # validates everything, prints the plan, no changes
./.deploy_prod             # then the real deploy
```

The script refuses to run until all required values are set.

- Step-by-step walkthrough for a real deploy: [`DEPLOYING.md`](./DEPLOYING.md)
- Manual prerequisites the script cannot automate (Swarm init, firewall,
  Cloudflare, Neon, APNS, etc.): [`shit_deploy_cant_do.md`](./shit_deploy_cant_do.md)

## First-Time Prerequisite: Create The Swarm Cluster

You must do this once before `./.deploy_prod` can work.

1. SSH to manager #1 and initialize Swarm:

```bash
docker swarm init --advertise-addr <manager1-private-ip>
```

2. On manager #1, get the join commands:

```bash
docker swarm join-token manager
docker swarm join-token worker
```

3. SSH to each additional node and run the appropriate `docker swarm join ...` command.

4. Verify from manager #1:

```bash
docker node ls
```

## Security Requirements Before Public Launch

Use this as a mandatory checklist before you route production traffic.

### 1) Firewall Rules (Node-Level)

Apply these firewall rules on every Swarm node:

- SSH port (for example `2222/tcp`): your IP only
- `80/tcp`, `443/tcp`: Hetzner LB only (or Cloudflare IP ranges only if no LB)
- `2377/tcp`: Swarm nodes only
- `7946/tcp,udp`: Swarm nodes only
- `4789/udp`: Swarm nodes only
- Everything else: blocked
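
As a sketch, the rules above can be turned into `ufw` commands. Every value
here (admin IP, node IPs, SSH port) is a placeholder, and the function prints
the commands rather than running them so you can review the plan before
applying it as root on each node:

```bash
# Sketch: generate the ufw commands for the rules above so they can be
# reviewed before running. admin_ip, node_ips, and ssh_port are placeholders.
fw_plan() {
  local admin_ip="203.0.113.10"            # your workstation IP
  local node_ips=("10.0.0.2" "10.0.0.3")   # private IPs of all Swarm nodes
  local ssh_port=2222

  echo "ufw default deny incoming"
  echo "ufw allow from ${admin_ip} to any port ${ssh_port} proto tcp"
  local ip
  for ip in "${node_ips[@]}"; do
    echo "ufw allow from ${ip} to any port 2377 proto tcp"  # Swarm management
    echo "ufw allow from ${ip} to any port 7946"            # gossip (tcp+udp)
    echo "ufw allow from ${ip} to any port 4789 proto udp"  # VXLAN overlay
  done
  # 80/443 intentionally omitted: open them only to your LB / Cloudflare ranges.
}

fw_plan
```

Pipe the output into `sudo sh` only after reading it.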

### 2) SSH Hardening

On each node, harden `/etc/ssh/sshd_config`:

```text
Port 2222
PermitRootLogin no
PasswordAuthentication no
PubkeyAuthentication yes
AllowUsers deploy
```

### 3) Cloudflare Origin Lockdown

- Keep public DNS records proxied (orange cloud on).
- Point Cloudflare at the LB, not at node IPs.
- Do not publish Swarm node IPs in DNS.
- Enforce firewall source restrictions so public traffic cannot bypass Cloudflare/LB.

### 4) Secrets Policy

- Keep runtime secrets in Docker Swarm secrets only.
- Do not put production secrets in git or plain `.env` files.
- `./.deploy_prod` already creates versioned Swarm secrets from files in `deploy/secrets/`.
- Rotate secrets after incidents or credential exposure.

### 5) Data Path Security

- Neon/Postgres: `DB_SSLMODE=require`, strong DB password, Neon IP allowlist limited to node IPs.
- Backblaze B2: HTTPS only, scoped app keys (not the master key), least-privilege bucket access.
- Swarm overlay: encrypted network enabled in the stack (`driver_opts.encrypted: "true"`).

### 6) Dozzle Hardening

Dozzle exposes the full Docker log stream with no built-in auth — logs contain
secrets, tokens, and user data. The stack binds Dozzle to `127.0.0.1` on the
manager node only (`mode: host`, `host_ip: 127.0.0.1`), so it is **not
reachable from the public internet or from other Swarm nodes**.

To view logs, open an SSH tunnel from your workstation:

```bash
ssh -p "${DEPLOY_MANAGER_SSH_PORT}" \
  -L "${DOZZLE_PORT}:127.0.0.1:${DOZZLE_PORT}" \
  "${DEPLOY_MANAGER_USER}@${DEPLOY_MANAGER_HOST}"
# Then browse http://localhost:${DOZZLE_PORT}
```

Additional hardening if you ever need to expose Dozzle over a network:

- Put auth/SSO in front (Cloudflare Access or equivalent).
- Replace the raw `/var/run/docker.sock` mount with a Docker socket proxy
  limited to read-only log endpoints.
- Prefer a persistent log aggregator (Loki, Datadog, CloudWatch) for prod —
  Dozzle is ephemeral and not a substitute for audit trails.

### 7) Backup + Restore Readiness

Treat this as a pre-launch checklist. Nothing below is automated by
`./.deploy_prod`.

- [ ] Postgres PITR path tested in staging (restore a real dump, validate the app boots).
- [x] Redis AOF persistence enabled (`appendonly yes --appendfsync everysec` in the stack).
- [ ] Redis restore path tested (verify the AOF replays on a fresh node).
- [ ] Written runbook for restore + secret rotation (see §4 and `shit_deploy_cant_do.md`).
- [ ] Named owner for incident response.
- [ ] Uploads bucket (Backblaze B2) lifecycle / versioning reviewed — deletes are
  handled by the app, not by retention rules.

### 8) Storage Backend (Uploads)

The stack supports two storage backends. The choice is **runtime-only** — the
same image runs in both modes, selected by env vars in `prod.env`:

| Mode | When to use | Config |
|---|---|---|
| **Local volume** | Dev / single-node prod | Leave all `B2_*` empty. Files land on `/app/uploads` via the named volume. |
| **S3-compatible** (B2, MinIO) | Multi-replica prod | Set all four of `B2_ENDPOINT`, `B2_KEY_ID`, `B2_APP_KEY`, `B2_BUCKET_NAME`. |

The deploy script enforces **all-or-none** for the B2 vars — a partial config
fails fast rather than silently falling back to the local volume.

**Why this matters:** Docker Swarm named volumes are **per-node**. With 3 API
replicas spread across nodes, an upload written on node A is invisible to the
replicas on nodes B and C (the client sees a seemingly random 404 two-thirds
of the time). In multi-replica prod you **must** use S3-compatible storage.

The `uploads:` volume is still declared as a harmless fallback: when B2 is
configured, nothing writes to it. `./.deploy_prod` prints the selected
backend at the start of each run.
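
The all-or-none rule can be sketched in plain bash. This is an illustrative
stand-in, not the deploy script's actual code:

```bash
# Illustrative sketch of the all-or-none rule for the B2 vars (not the
# script's actual code): either all four are set, or all four are empty.
validate_b2() {
  local set_count=0 var
  for var in B2_ENDPOINT B2_KEY_ID B2_APP_KEY B2_BUCKET_NAME; do
    if [ -n "${!var:-}" ]; then
      set_count=$((set_count + 1))
    fi
  done
  case "$set_count" in
    0) echo "storage backend: local volume" ;;
    4) echo "storage backend: S3 (B2)" ;;
    *) echo "error: partial B2 config (${set_count} of 4 vars set)" >&2
       return 1 ;;
  esac
}
```

With, say, only `B2_ENDPOINT` set, this returns non-zero instead of silently
using the local volume — which is exactly the failure mode the real check
exists to prevent.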

### 9) Worker Replicas & Scheduler

Keep `WORKER_REPLICAS=1` in `cluster.env` until Asynq `PeriodicTaskManager`
is wired up. The current `asynq.Scheduler` in `cmd/worker/main.go` has no
Redis-based leader election, so each replica independently enqueues the
same cron task — users see duplicate daily digests / onboarding emails.

Asynq workers (task consumers) are already safe to scale horizontally; it's
only the scheduler singleton that is constrained. Future work: migrate to
`asynq.NewPeriodicTaskManager(...)` with a `PeriodicTaskConfigProvider` so
multiple scheduler replicas coordinate via Redis.

### 10) Database Migrations

`cmd/api/main.go` runs `database.MigrateWithLock()` on startup, which takes a
Postgres session-level `pg_advisory_lock` on a dedicated connection before
calling `AutoMigrate`. This serialises boot-time migrations across all API
replicas — the first replica migrates, the rest wait, then each sees an
already-current schema and `AutoMigrate` is a no-op.

The lock is released on connection close, so a crashed replica can't leave
a stale lock behind.

For very large schema changes, run migrations as a separate pre-deploy
step (there is no dedicated `cmd/migrate` binary today — this is a future
improvement).

### 11) Redis Redundancy

Redis runs as a **single replica** with an AOF-persisted named volume. If
the node running Redis dies, Swarm reschedules the container, but the named
volume is per-node — the new Redis boots **empty**.

Impact:

- **Cache** (ETag lookups, static data): regenerates on first request.
- **Asynq queue**: in-flight jobs at the moment of the crash are lost; Asynq's
  retry semantics cover most re-enqueues. Scheduled-but-not-yet-fired cron
  events are re-triggered on the next cron tick.
- **Sessions / auth tokens**: not stored in Redis, so unaffected.

This is an accepted limitation today. Options to harden later: Redis
Sentinel, a managed Redis (Upstash, Dragonfly Cloud), or restoring from the
AOF on a pinned node.

### 12) Multi-Arch Builds

`./.deploy_prod` builds images for the **host** architecture of the machine
running the script. If your Swarm nodes are a different arch (e.g. ARM64
Ampere VMs), use `docker buildx` explicitly:

```bash
docker buildx create --use
docker buildx build --platform linux/arm64 --target api -t <image> --push .
# repeat for worker, admin
SKIP_BUILD=1 ./.deploy_prod   # then deploy the already-pushed images
```

The Go stages cross-compile cleanly (`TARGETARCH` is already honoured).
The Node/admin stages require QEMU emulation (`docker run --privileged --rm
tonistiigi/binfmt --install all` on the build host), since native deps may
need to be rebuilt for the target arch.

### 13) Connection Pool & TLS Tuning

Because Postgres is external (Neon/RDS), each replica opens its own pool.
Sizing matters: total open connections across the cluster must stay under
the database's configured limit. Defaults in `prod.env.example`:

| Setting | Default | Notes |
|---|---|---|
| `DB_SSLMODE` | `require` | Never set to `disable` in prod. For Neon use `require`. |
| `DB_MAX_OPEN_CONNS` | `25` | Per-replica cap. Worst case: 25 × (API + worker replicas). |
| `DB_MAX_IDLE_CONNS` | `10` | Keep warm connections ready without exhausting the pool. |
| `DB_MAX_LIFETIME` | `600s` | Recycle connections before the provider drops them. Neon's idle disconnect is typically 5 min, so lower this to `300s` on Neon. |

Worked example with the default replicas (3 API + 1 worker — see §9 for why
the worker is pinned to 1):

```
3 × 25 + 1 × 25 = 100 peak open connections
```

That lands exactly on Neon's free-tier ceiling (100 concurrent connections),
which is risky with even one transient spike. For the Neon free tier, drop
`DB_MAX_OPEN_CONNS` to `15` (→ 60 peak). Paid tiers (Neon Scale, 1000+
connections) can keep the default or raise it.

Operational checklist:

- Confirm the Neon IP allowlist includes every Swarm node IP.
- After changing pool sizes, redeploy and watch `pg_stat_activity` /
  Neon metrics for saturation.
- Keep `DB_MAX_LIFETIME` ≤ Neon's idle timeout to avoid "terminating
  connection due to administrator command" errors in the API logs.
- For read-heavy workloads, consider a Neon read replica and split
  query traffic at the application layer.
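
The worked example generalises to a quick sanity check you can run when
changing replica counts. The numbers below mirror the defaults in the table
above; adjust them to your own tier and replica counts:

```bash
# Sanity check: peak DB connections across the cluster vs. the provider limit.
# Replica counts and caps mirror the defaults above -- adjust to your setup.
peak_db_conns() {
  local api_replicas=$1 worker_replicas=$2 max_open_per_replica=$3
  echo $(( (api_replicas + worker_replicas) * max_open_per_replica ))
}

peak=$(peak_db_conns 3 1 25)   # 3 API + 1 worker at 25 each
limit=100                      # Neon free-tier concurrent-connection ceiling
if [ "$peak" -ge "$limit" ]; then
  echo "WARN: peak ${peak} >= limit ${limit}; lower DB_MAX_OPEN_CONNS" >&2
fi
```

Leave yourself headroom: aim for peak well under the limit, not at it.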

## Files You Fill In

Paste your values into these files:

- `deploy/cluster.env`
- `deploy/registry.env`
- `deploy/prod.env`
- `deploy/secrets/postgres_password.txt`
- `deploy/secrets/secret_key.txt`
- `deploy/secrets/email_host_password.txt`
- `deploy/secrets/fcm_server_key.txt`
- `deploy/secrets/apns_auth_key.p8`

If one is missing, the deploy script auto-copies it from its `.example` template and exits so you can fill it in.
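
The auto-copy behaviour can be sketched like this. It is illustrative, not
the script's actual code, and `seed_missing` is a hypothetical helper name
(the sketch returns non-zero where the real script exits):

```bash
# Illustrative sketch of the template auto-copy (not the script's actual
# code): seed each missing file from its .example template and flag the gap.
seed_missing() {
  local missing=0 f
  for f in "$@"; do
    if [ ! -f "$f" ]; then
      cp "${f}.example" "$f"
      echo "created ${f} from its template -- fill it in" >&2
      missing=1
    fi
  done
  return "$missing"   # non-zero means at least one file still needs values
}
```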

## What `./.deploy_prod` Does

1. Validates all required config files and credentials.
2. Validates the storage-backend toggle (all-or-none for `B2_*`) and prints
   the selected backend (S3 or local volume) before continuing.
3. Builds and pushes the `api`, `worker`, and `admin` images (skip with
   `SKIP_BUILD=1`).
4. Uploads the deploy bundle to your Swarm manager over SSH.
5. Creates versioned Docker secrets on the manager.
6. Deploys the stack with `docker stack deploy --with-registry-auth`.
7. Waits until service replicas converge.
8. Prunes old secret versions, keeping the last `SECRET_KEEP_VERSIONS`
   (default 3).
9. Runs an HTTP health check (if `DEPLOY_HEALTHCHECK_URL` is set). **On
   failure, it automatically runs `docker service rollback` for every service
   in the stack and exits non-zero.**
10. Logs out of the registry on both the dev host and the manager so the
    token doesn't linger in `~/.docker/config.json`.
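
Step 9 follows a common poll-then-rollback pattern. The sketch below is not
the script's actual code — the retry counts, `curl` flags, and function name
are placeholders:

```bash
# Minimal sketch of step 9 (not the script's actual code): poll a health URL
# with retries, and roll back every service if it never comes up healthy.
check_then_rollback() {
  local url=$1 stack=$2 attempts=${3:-10} delay=${4:-5}
  local i
  for ((i = 1; i <= attempts; i++)); do
    if curl -fsS --max-time 5 "$url" >/dev/null 2>&1; then
      echo "healthy after ${i} attempt(s)"
      return 0
    fi
    sleep "$delay"
  done
  echo "health check failed; rolling back stack '${stack}'" >&2
  local svc
  for svc in $(docker stack services --format '{{.Name}}' "$stack"); do
    docker service rollback "$svc"
  done
  return 1
}
```

`docker service rollback` reverts each service to its previous spec, which is
why the script can recover without a separate "known-good" manifest.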

## Useful Flags

Environment flags:

- `DRY_RUN=1 ./.deploy_prod` — validate config and print the deploy plan
  without building, pushing, or touching the cluster. Use this before every
  production deploy to review images, replicas, and secret names.
- `SKIP_BUILD=1 ./.deploy_prod` — deploy already-pushed images.
- `SKIP_HEALTHCHECK=1 ./.deploy_prod` — skip the final URL check.
- `DEPLOY_TAG=<tag> ./.deploy_prod` — deploy a specific image tag.
- `PUSH_LATEST_TAG=true ./.deploy_prod` — also push `:latest` to the registry
  (default is `false` so prod pins to the SHA tag and stays reproducible).
- `SECRET_KEEP_VERSIONS=<n> ./.deploy_prod` — how many versions of each
  Swarm secret to retain after deploy (default: 3). Older unused versions
  are pruned automatically once the stack converges.

## Secret Versioning & Pruning

Each deploy creates a fresh set of Swarm secrets named
`<stack>_<secret>_<deploy_id>` (for example
`honeydue_secret_key_abc1234_20260413120000`). The stack file references the
current names via `${POSTGRES_PASSWORD_SECRET}` etc., so rolling updates never
reuse a secret that a running task still holds open.

After the new stack converges, `./.deploy_prod` SSHes to the manager and
prunes old versions per base name, keeping the most recent
`SECRET_KEEP_VERSIONS` (default 3). Anything still referenced by a running
task is left alone (Docker refuses to delete in-use secrets) and will be
pruned on the next deploy.
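
The keep-last-N selection can be sketched as follows. It is an illustrative
stand-in for the script's pruning logic and assumes the version names sort
oldest-to-newest — which real deploy IDs may not guarantee, since the SHA
precedes the timestamp in the name:

```bash
# Illustrative sketch of keep-last-N pruning (not the script's actual code).
# Assumes names sort chronologically; prints the versions that would be
# passed to `docker secret rm`.
prune_candidates() {
  local keep=$1; shift
  printf '%s\n' "$@" | sort -r | tail -n +"$((keep + 1))"
}

# Five versions of one base name, keep the newest 3:
prune_candidates 3 \
  honeydue_secret_key_aaa_20260101000000 \
  honeydue_secret_key_bbb_20260201000000 \
  honeydue_secret_key_ccc_20260301000000 \
  honeydue_secret_key_ddd_20260401000000 \
  honeydue_secret_key_eee_20260501000000
# -> prints the two oldest (the bbb and aaa versions)
```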

## Important

- `deploy/shit_deploy_cant_do.md` lists the manual tasks this script cannot automate.
- Keep real credentials and secret files out of git.