# Runbook: Secret Rotation

Closes audit finding `K3S-F12` (secrets unrotated since cluster bootstrap, no rotation cadence). See `deploy-k3s/SECURITY.md` Stage 2.

**Cadence:** rotate every secret at least **annually**. Rotate **immediately** on suspected exposure, on loss of an operator device, or when anyone who has seen a secret leaves the project.

**Record keeping:** after each rotation, annotate the secret so the age is visible:

```bash
kubectl -n honeydue annotate secret <secret-name> \
  honeydue.dev/last-rotated="$(date -u +%Y-%m-%d)" --overwrite
```

---

## How rotation works

Every secret has a **source of truth** on the operator workstation. The deploy scripts read those sources and (re)create the Kubernetes Secrets. Rotation is always:

**update the source → re-run `02-setup-secrets.sh` → restart the pods that consume it → revoke the old credential at its provider.**

`02-setup-secrets.sh` uses `kubectl apply` (via `--dry-run=client -o yaml`), so re-running it is idempotent and only changes what you changed.

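
The render-then-apply pattern described above can be sketched as follows. This is a minimal illustration, not the contents of `02-setup-secrets.sh`: the function name `apply_secret_from_file` is hypothetical, and it assumes the script renders each Secret client-side and pipes the manifest to `kubectl apply`.

```bash
# Hypothetical helper illustrating the render-then-apply pattern
# (an assumption about how 02-setup-secrets.sh works, not a copy of it).
# Usage: apply_secret_from_file NAMESPACE SECRET_NAME KEY FILE
apply_secret_from_file() {
  local ns="$1" name="$2" key="$3" file="$4"

  # Render the Secret manifest locally (nothing is sent to the cluster yet),
  # then hand it to `kubectl apply`, which creates the Secret if it is
  # missing and updates it otherwise.
  kubectl -n "$ns" create secret generic "$name" \
    --from-file="$key=$file" \
    --dry-run=client -o yaml |
    kubectl -n "$ns" apply -f -
}

# Example (the single-file APNs secret listed in this runbook):
# apply_secret_from_file honeydue honeydue-apns-key apns_auth_key.p8 \
#   deploy-k3s/secrets/apns_auth_key.p8
```

A plain `kubectl create` fails when the Secret already exists, so rendering with `--dry-run=client` and applying the result is what makes a second run safe.
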
| Kubernetes Secret | Source of truth | Consumed by |
|---|---|---|
| `honeydue-secrets` → `POSTGRES_PASSWORD` | `deploy-k3s/secrets/postgres_password.txt` | api, worker |
| `honeydue-secrets` → `SECRET_KEY` | `deploy-k3s/secrets/secret_key.txt` | api, worker |
| `honeydue-secrets` → `EMAIL_HOST_PASSWORD` | `deploy-k3s/secrets/email_host_password.txt` | api, worker |
| `honeydue-secrets` → `FCM_SERVER_KEY` | `deploy-k3s/secrets/fcm_server_key.txt` | api, worker |
| `honeydue-secrets` → `REDIS_PASSWORD` | `config.yaml` key `redis.password` | api, worker, redis |
| `honeydue-secrets` → `OBS_INGEST_TOKEN` | `deploy/prod.env` | api, worker |
| `honeydue-apns-key` → `apns_auth_key.p8` | `deploy-k3s/secrets/apns_auth_key.p8` | api, worker |
| `cloudflare-origin-cert` | `deploy-k3s/secrets/cloudflare-origin.{crt,key}` | Traefik ingress |
| `ghcr-credentials` | `config.yaml` block `registry.*` | image pulls (all pods) |
| `admin-basic-auth` | `config.yaml` keys `admin.basic_auth_user` / `..._password` | Traefik `admin-auth` middleware |

The `deploy-k3s/secrets/` directory and `config.yaml` are **gitignored**: never commit them.

---

## Standard rotation procedure

```bash
cd honeyDueAPI-go
export KUBECONFIG="$(pwd)/deploy-k3s/kubeconfig"

# 1. Update the source (file under deploy-k3s/secrets/ or a config.yaml key)

# 2. Recreate the Kubernetes Secrets from sources
./deploy-k3s/scripts/02-setup-secrets.sh

# 3. Restart the consumers (see per-secret notes below for which)
kubectl -n honeydue rollout restart deploy/api deploy/worker

# 4. Confirm health
kubectl -n honeydue rollout status deploy/api
kubectl -n honeydue rollout status deploy/worker

# 5. Revoke the OLD credential at its provider (see per-secret notes)

# 6. Annotate the rotated secret with today's date
```

---

## Per-secret notes

### `POSTGRES_PASSWORD`

1. Rotate the role password in the Neon dashboard.
2. Write the new value to `deploy-k3s/secrets/postgres_password.txt`.
3. `02-setup-secrets.sh`, then `rollout restart deploy/api deploy/worker`.
4. Watch logs for connection errors; the old password stops working the moment Neon applies the change, so do steps 2–3 promptly.

### `SECRET_KEY` ⚠️ user-visible

This signs auth tokens. **Rotating it logs every user out**: all existing tokens become invalid and every client must re-authenticate.

1. Generate: `openssl rand -hex 32`.
2. Write to `deploy-k3s/secrets/secret_key.txt` (must be ≥32 chars; the script enforces this, and the app refuses to start in production without it).
3. `02-setup-secrets.sh`, then `rollout restart deploy/api deploy/worker`.

- Only rotate on a schedule or on suspected compromise, not casually.
- A future improvement (overlap window via a key-id header) would let old tokens validate during the transition; not implemented today.

### `EMAIL_HOST_PASSWORD`

1. Generate a new app password in Fastmail; keep the old one alive briefly.
2. Write to `deploy-k3s/secrets/email_host_password.txt`.
3. `02-setup-secrets.sh`, `rollout restart deploy/api deploy/worker`.
4. Delete the old Fastmail app password.

### `FCM_SERVER_KEY`

1. Rotate the key in the Firebase console.
2. Write to `deploy-k3s/secrets/fcm_server_key.txt`.
3. `02-setup-secrets.sh`, `rollout restart deploy/api deploy/worker`.

### `REDIS_PASSWORD`

Source is `config.yaml` key `redis.password` (hex only: it is embedded in the `REDIS_URL`, so non-hex characters would break URL parsing).

1. Generate: `openssl rand -hex 32`.
2. Set `redis.password` in `config.yaml`.
3. `02-setup-secrets.sh`.
4. Restart **redis as well as** api/worker so the new `--requirepass` and the new `REDIS_URL` land together: `kubectl -n honeydue rollout restart deploy/redis deploy/api deploy/worker`. Expect a few seconds where api/worker reconnect.

### `apns_auth_key.p8`

1. Revoke the key in the Apple Developer console, generate a new `.p8`.
2. Replace `deploy-k3s/secrets/apns_auth_key.p8`.
3. `02-setup-secrets.sh`, `rollout restart deploy/api deploy/worker`.
4. If the Key ID changed, update `push.apns_key_id` in `config.yaml` too.

### `cloudflare-origin-cert`

1. Generate a new Origin CA certificate in the Cloudflare dashboard.
2. Replace `deploy-k3s/secrets/cloudflare-origin.crt` and `.key`.
3. `02-setup-secrets.sh`. Traefik picks up the new TLS secret; no app restart needed. Verify the served cert with `openssl s_client`.

### `ghcr-credentials` (Gitea registry)

1. Generate a new PAT in Gitea (scope: `read:packages`).
2. Update the `registry.token` value in `config.yaml`.
3. `02-setup-secrets.sh`. No restart needed unless a pull is pending.
4. Revoke the old PAT in Gitea.

### `admin-basic-auth`

Source is `config.yaml` keys `admin.basic_auth_user` / `basic_auth_password`.

1. Set a new password (e.g. `openssl rand -hex 24`).
2. `02-setup-secrets.sh` regenerates the bcrypt htpasswd secret.
3. No app restart needed: Traefik reloads the `admin-auth` middleware.
4. Distribute the new credential to whoever uses the admin panel.

---

## After any rotation

- Run `./deploy-k3s/scripts/04-verify.sh` and confirm no `✗` lines.
- Annotate the rotated secret (see "Record keeping" above).
- If the rotation was due to a compromise, also follow the relevant playbook in `deploy-k3s/SECURITY.md` → Appendix (Incident response).
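
To check the annual cadence against the `honeydue.dev/last-rotated` annotation, a helper along these lines could flag stale secrets. It is a sketch under assumptions: both function names are hypothetical, the annotation is assumed to hold a `YYYY-MM-DD` date as written by the record-keeping command, every listed Secret is assumed to live in the `honeydue` namespace, and "annual" is approximated as the same calendar day one year earlier.

```bash
# Hypothetical helper: succeeds (exit 0) when LAST_ROTATED is more than a
# year before TODAY. Both are YYYY-MM-DD, so string order equals date order.
# Usage: is_rotation_overdue LAST_ROTATED [TODAY]
is_rotation_overdue() {
  local last="$1"
  local today="${2:-$(date -u +%Y-%m-%d)}"
  # Cutoff is the same month and day in the previous year.
  local cutoff="$(( ${today%%-*} - 1 ))-${today#*-}"
  expr "$last" \< "$cutoff" >/dev/null
}

# Hypothetical report over the Secrets named in this runbook
# (assumes all of them are in the honeydue namespace).
report_overdue_secrets() {
  local name last
  for name in honeydue-secrets honeydue-apns-key cloudflare-origin-cert \
              ghcr-credentials admin-basic-auth; do
    # Dots inside the annotation key must be escaped in kubectl's JSONPath.
    last="$(kubectl -n honeydue get secret "$name" \
      -o jsonpath='{.metadata.annotations.honeydue\.dev/last-rotated}')"
    if [ -z "$last" ]; then
      echo "$name: no last-rotated annotation"
    elif is_rotation_overdue "$last"; then
      echo "$name: OVERDUE (last rotated $last)"
    else
      echo "$name: ok (last rotated $last)"
    fi
  done
}
```

Calling `report_overdue_secrets` with `KUBECONFIG` exported as in the standard procedure prints one line per Secret. A Secret rotated exactly one year ago is reported as ok, and one with no annotation is called out separately rather than treated as fresh.
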