docs(deployment): rewrite migration prose for goose adoption
Update the deployment book and glossary to reflect the goose-based schema migration flow shipped in 12b2f9d/0f7450a: - ch07: clarify startup probe assumes migrations ran out-of-band - ch08: drop AutoMigrate-with-advisory-lock prose; describe goose Job - ch12: pod startup checks goose_db_version, no longer runs migrations - ch14: document the Job→wait→roll deploy gate and how to debug failures - ch16: add "Migrate Job fails during deploy" + "Schema precondition failed" failure modes - ch17: new runbook entries §26 (run migrations manually), §27 (recover from failed/dirty migration), §28 (bootstrap goose on fresh clone) - ch19: postscript on §13 noting MigrateWithLock approach is superseded - ch20: mark "Migration Job for schema changes" task done - glossary: add `goose` and `goose_db_version`; flag AutoMigrate as tests-only - references: add goose links; flag AutoMigrate as tests-only
This commit is contained in:
@@ -317,10 +317,47 @@ Timeline (approximate, warm state):
|
||||
- t=60s: another old pod terminates
|
||||
- ...continues until all on new RS
|
||||
|
||||
For cold-boot (e.g., first deploy on a rebuilt cluster), the
|
||||
MigrateWithLock advisory lock extends this to several minutes. But the
|
||||
rollout is serialized — only one pod starts per iteration, so the lock
|
||||
queue is small.
|
||||
Migrations run as a separate Kubernetes Job that completes before any
|
||||
api/worker pod is rolled. So the rollout above never includes migration
|
||||
work — pods that boot are guaranteed to find the schema already at the
|
||||
expected version. See §"Migrations are gated, not interleaved" below.
|
||||
|
||||
## Migrations are gated, not interleaved
|
||||
|
||||
`03-deploy.sh` runs `goose up` as a one-shot Job before applying any
|
||||
api/worker manifests:
|
||||
|
||||
```
|
||||
1. kubectl delete job honeydue-migrate (idempotent, removes prior run)
|
||||
2. kubectl apply -f manifests/migrate/job.yaml (with current api image)
|
||||
3. kubectl wait --for=condition=complete --timeout=10m job/honeydue-migrate
|
||||
4. (only if Job succeeded) kubectl apply -f manifests/api/...
|
||||
```
|
||||
|
||||
The Job uses the api image — `/usr/local/bin/goose` is baked in at
|
||||
Dockerfile build time. The Job script strips the `-pooler` segment
|
||||
from `DB_HOST` before connecting (goose's session-scoped advisory
|
||||
lock can't survive PgBouncer transaction-mode), runs `goose up`, exits.
|
||||
|
||||
If the Job fails, the script aborts before any new app pod sees a
|
||||
stale schema. To debug:
|
||||
|
||||
```bash
|
||||
kubectl -n honeydue logs job/honeydue-migrate --tail=200
|
||||
kubectl -n honeydue describe job honeydue-migrate
|
||||
```
|
||||
|
||||
After investigating, fix the migration file and re-run `03-deploy.sh`.
|
||||
The Job is idempotent — successful migrations stay applied, only the
|
||||
new/failed file gets retried.
|
||||
|
||||
api/worker pods run a `RequireSchemaApplied` check at startup that
|
||||
queries `goose_db_version` and refuses to boot if the table is missing
|
||||
or the latest row is `is_applied=false`. This is the fail-fast for
|
||||
"someone bypassed the deploy script and the schema isn't current."
|
||||
|
||||
For full schema management background, see
|
||||
[Chapter 8 §Schema management](./08-database.md).
|
||||
|
||||
## Hotfix workflow
|
||||
|
||||
|
||||
Reference in New Issue
Block a user