Files
honeyDueAPI/docs/GO_TO_PROD.md
Trey t 4976eafc6c Rebrand from Casera/MyCrib to honeyDue
Total rebrand across all Go API source files:
- Go module path: casera-api -> honeydue-api
- All imports updated (130+ files)
- Docker: containers, images, networks renamed
- Email templates: support email, noreply, icon URL
- Domains: casera.app/mycrib.treytartt.com -> honeyDue.treytartt.com
- Bundle IDs: com.tt.casera -> com.tt.honeyDue
- IAP product IDs updated
- Landing page, admin panel, config defaults
- Seeds, CI workflows, Makefile, docs
- Database table names preserved (no migration needed)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-07 06:33:38 -06:00

7.3 KiB

Go To Prod Plan

This document is a phased production-readiness plan for the honeyDue Go API repo. Execute phases in order. Do not skip exit criteria.

How To Use This Plan

  1. Create an issue/epic per phase.
  2. Track each checklist item as a task.
  3. Only advance phases after all exit criteria pass in CI and staging.

Phase 0 - Baseline And Drift Cleanup

Goal: eliminate known repo/config drift before hardening.

Tasks

  1. Fix stale admin build/run targets in Makefile that reference cmd/admin (non-existent).
  2. Align worker env vars in docker-compose.yml with Go config:
    • use TASK_REMINDER_HOUR
    • use OVERDUE_REMINDER_HOUR
    • use DAILY_DIGEST_HOUR
  3. Align supported locales in internal/i18n/i18n.go with translation files in internal/i18n/translations.
  4. Remove any committed secrets/keys from repo and history; rotate immediately.

Validation

  1. go test ./...
  2. go build ./cmd/api ./cmd/worker
  3. docker compose config succeeds.

Exit Criteria

  1. No stale targets or mismatched env keys remain.
  2. CI and local boot work with a single source-of-truth config model.

Phase 1 - Non-Negotiable CI Gates

Goal: block regressions by policy.

Tasks

  1. Update /.github/workflows/backend-ci.yml with required jobs:
    • lint (go vet ./..., gofmt -l .)
    • test (go test -race -count=1 ./...)
    • contract (go test -v -run "TestRouteSpecContract|TestKMPSpecContract" ./internal/integration/)
    • build (go build ./cmd/api ./cmd/worker)
  2. Add govulncheck ./... job.
  3. Add secret scanning (for example, gitleaks).
  4. Set branch protection on main and develop:
    • require PR
    • require all status checks
    • require at least one review
    • dismiss stale reviews on new commits

Validation

  1. Open test PR with intentional formatting error; ensure merge is blocked.
  2. Open test PR with OpenAPI/route drift; ensure merge is blocked.

Exit Criteria

  1. No direct merge path exists without passing all gates.

Phase 2 - Contract, Data, And Migration Safety

Goal: guarantee deploy safety for API behavior and schema changes.

Tasks

  1. Keep OpenAPI as source of truth in docs/openapi.yaml.
  2. Require route/schema updates in same PR as handler changes.
  3. Add migration checks in CI:
    • migrate up on clean DB
    • migrate down one step
    • migrate up again
  4. Add DB constraints for business invariants currently enforced only in service code.
  5. Add idempotency protections for webhook/job handlers.

Validation

  1. Run migration smoke test pipeline against ephemeral Postgres.
  2. Re-run integration contract tests after each endpoint change.

Exit Criteria

  1. Schema changes are reversible and validated before merge.
  2. API contract drift is caught pre-merge.

Phase 3 - Test Hardening For Failure Modes

Goal: increase confidence in edge cases and concurrency.

Tasks

  1. Add table-driven tests for task lifecycle transitions:
    • cancel/uncancel
    • archive/unarchive
    • complete/quick-complete
    • recurring next due date transitions
  2. Add timezone boundary tests around midnight and DST.
  3. Add concurrency tests for race-prone flows in services/repositories.
  4. Add fuzz/property tests for:
    • task categorization predicates
    • reminder schedule logic
  5. Add unauthorized-access tests for media/document/task cross-residence access.

Validation

  1. go test -race -count=1 ./... stays green.
  2. New tests fail when logic is intentionally broken (mutation spot checks).

Exit Criteria

  1. High-risk flows have explicit edge-case coverage.

Phase 4 - Security Hardening

Goal: reduce breach and abuse risk.

Tasks

  1. Add strict request size/time limits for upload and auth endpoints.
  2. Add rate limits for:
    • login
    • forgot/reset password
    • verification endpoints
    • webhooks
  3. Ensure logs redact secrets/tokens/PII payloads.
  4. Enforce least-privilege for runtime creds and service accounts.
  5. Enable dependency update cadence with security review.

Validation

  1. Abuse test scripts for brute-force and oversized payload attempts.
  2. Verify logs do not expose secrets under failure paths.

Exit Criteria

  1. Security scans pass and abuse protections are enforced in runtime.

Phase 5 - Observability And Operations

Goal: make production behavior measurable and actionable.

Tasks

  1. Standardize request correlation IDs across API and worker logs.
  2. Define SLOs:
    • API availability
    • p95 latency for key endpoints
    • worker queue delay
  3. Add dashboards + alerts for:
    • 5xx error rate
    • auth failures
    • queue depth/retry spikes
    • DB latency
  4. Add dead-letter queue review and replay procedure.
  5. Document incident runbooks in docs/:
    • DB outage
    • Redis outage
    • push provider outage
    • webhook backlog

Validation

  1. Trigger synthetic failures in staging and confirm alerts fire.
  2. Execute at least one incident drill and capture MTTR.

Exit Criteria

  1. Team can detect and recover from common failures quickly.

Phase 6 - Performance And Capacity

Goal: prove headroom before production growth.

Tasks

  1. Define load profiles for hot endpoints:
    • /api/tasks/
    • /api/static_data/
    • /api/auth/login/
  2. Run load and soak tests in staging.
  3. Capture query plans for slow SQL and add indexes where needed.
  4. Validate Redis/cache fallback behavior under cache loss.
  5. Tune worker concurrency and queue weights from measured data.

Validation

  1. Meet agreed latency/error SLOs under target load.
  2. No sustained queue growth under steady-state load.

Exit Criteria

  1. Capacity plan is documented with clear limits and scaling triggers.

Phase 7 - Release Discipline And Recovery

Goal: safe deployments and verified rollback/recovery.

Tasks

  1. Adopt canary or blue/green deploy strategy.
  2. Add automatic rollback triggers based on SLO violations.
  3. Add pre-deploy checklist:
    • migrations reviewed
    • CI green
    • queue backlog healthy
    • dependencies healthy
  4. Validate backups with restore drills (not just backup existence).
  5. Document RPO/RTO targets and current measured reality.

Validation

  1. Perform one full staging rollback rehearsal.
  2. Perform one restore-from-backup rehearsal.

Exit Criteria

  1. Deploy and rollback are repeatable, scripted, and tested.

Definition Of Done (Every PR)

  1. go vet ./...
  2. gofmt -l . returns no files
  3. go test -race -count=1 ./...
  4. Contract tests pass
  5. OpenAPI updated for endpoint changes
  6. Migrations added and reversible for schema changes
  7. Security impact reviewed for auth/uploads/media/webhooks
  8. Observability impact reviewed for new critical paths

  1. Week 1: Phase 0 + Phase 1
  2. Week 2: Phase 2
  3. Week 3-4: Phase 3 + Phase 4
  4. Week 5: Phase 5
  5. Week 6: Phase 6 + Phase 7 rehearsal

Adjust timeline based on team size and release pressure, but keep ordering.