honeyDueAPI/deploy-k3s/manifests/kratos
Trey t 6de90acef7
feat(kratos): deploy Ory Kratos to production (Apple-only OIDC)
Auth was structurally broken — the api's Kratos middleware was pointing
at http://kratos:4433 but Kratos wasn't deployed. The only thing keeping
users logged in was a 5-min Redis cache; once it expired the middleware
called Whoami → no DNS → 401 → forced relogin with no path back.

This commit deploys Kratos for real:

Manifests:
  - kratos.yaml + migrate-job.yaml: pin oryd/kratos:v26.2.0@sha256:92eedc...
    (CalVer current stable as of 2026-06-03)
  - configmap.yaml: drop Google OIDC provider (not in scope); fill the
    Apple provider with real Services ID / Team ID / Key ID — Apple now
    sits at providers[0]
  - kratos.yaml: drop the Google-secret env binding; rebind APPLE_PRIVATE_KEY
    to PROVIDERS_0_APPLE_PRIVATE_KEY (shifted from index 1)
  - network-policies.yaml: add a kratos egress rule to allow-egress-from-api.
    Without this, even with kratos running, the api gets "connection refused"
    on http://kratos:4433 (post-DNAT NetworkPolicy enforcement — runbook §9.2).

Operator prerequisites that were completed alongside this commit:
  - Neon kratos database created (separate from honeyDue, owner neondb_owner)
  - Cloudflare DNS for auth.myhoneydue.com (3 A records, proxied)
  - kratos: block added to config.yaml (gitignored): DSN to the Neon DIRECT
    endpoint, cookie + cipher secrets generated, Fastmail SMTPS URI,
    .p8 contents inline

Out of scope intentionally:
  - Google sign-in (additive; can append providers[] later)
  - Migrating existing auth_user rows onto Kratos identities — pre-prod;
    existing users will need to sign in fresh, which creates a new Kratos
    identity and a new local user row (per migration plan in
    manifests/kratos/README.md).

Verified end-to-end:
  - 338 schema migrations applied successfully
  - 2/2 kratos pods Ready
  - api → kratos:4433/sessions/whoami returns 401 for invalid token (was
    "connection refused" before this commit's NetworkPolicy patch)
  - auth.myhoneydue.com resolves through CF; cloudflare-only middleware
    keeps the origin protected exactly like the other hostnames

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-06-03 11:08:09 -05:00

Ory Kratos — honeyDue identity service (Phase 1: infrastructure)

This directory deploys Ory Kratos into the honeydue namespace as the identity provider, replacing the hand-rolled auth in internal/services/auth_service.go and related files.

Phase 1 is infrastructure only. Once deployed, Kratos runs but nothing uses it yet: the honeyDue Go API still does its own auth. Phase 2 (backend swap) and Phase 3 (KMP/web clients) follow. Because honeyDue is pre-production, the migration onto Kratos is allowed to discard all existing user data, so no user import is done.

The deploy is gated: 03-deploy.sh applies Kratos only when the kratos-secrets Secret exists, and 02-setup-secrets.sh creates that Secret only when config.yaml has a kratos: block. Until then the existing stack deploys completely unaffected.
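
The config.yaml half of that gate could be sketched roughly as follows. This is an illustration, not the contents of 02-setup-secrets.sh: the function name has_kratos_block is hypothetical, and it assumes the kratos: key is unindented at the top level of the file.

```shell
# Hypothetical helper: succeeds when the given YAML file exists and
# contains a top-level "kratos:" key.
# Assumption: the key starts at column 0 and is not commented out.
has_kratos_block() {
  local config_file="$1"
  [ -f "$config_file" ] && grep -qE '^kratos:' "$config_file"
}

# Usage (illustrative):
#   if has_kratos_block config.yaml; then
#     echo "kratos: block found; kratos-secrets would be created"
#   else
#     echo "no kratos: block; Kratos is skipped"
#   fi
```

A plain grep keeps the check dependency-free; a YAML-aware tool would be more robust if the config layout ever changes.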

Files

File              What
configmap.yaml    kratos.yml, identity schema, Google/Apple OIDC claim mappers (no secrets)
migrate-job.yaml  kratos migrate sql: schema migration, run before the Deployment
kratos.yaml       Deployment (×2), Service, NetworkPolicies
ingress.yaml      auth.myhoneydue.com → Kratos public API :4433

Operator prerequisites (must be done before deploying)

  1. Kratos version — Ory uses CalVer (v25.x / v26.x). Pick the current stable, then replace REPLACE_WITH_CURRENT_STABLE_TAG in kratos.yaml and migrate-job.yaml with oryd/kratos:vXX.Y@sha256:<digest>, and set the matching version: in configmap.yaml.

  2. Kratos database — create a separate Neon database named kratos (do not share honeyDue's). Capture its connection string as the DSN.

  3. DNS — add auth.myhoneydue.com in Cloudflare (proxied), pointing at the cluster ingress like the other honeyDue hosts. Confirm the cloudflare-origin-cert TLS secret covers auth.myhoneydue.com.

  4. Google OAuth client — Google Cloud Console → create an OAuth 2.0 client. Redirect URI: https://auth.myhoneydue.com/self-service/methods/oidc/callback/google. Put the client ID into configmap.yaml (GOOGLE_OAUTH_CLIENT_ID); the client secret goes in config.yaml.

  5. Apple Sign In — Apple Developer → a Services ID + a Sign in with Apple key. Return URL: https://auth.myhoneydue.com/self-service/methods/oidc/callback/apple. Put the Services ID / Team ID / Key ID into configmap.yaml (APPLE_SERVICES_ID / APPLE_TEAM_ID / APPLE_PRIVATE_KEY_ID); the .p8 private key goes in config.yaml.

  6. config.yaml — add a kratos: block:

    kratos:
      dsn: "postgres://USER:PASS@HOST/kratos?sslmode=require"
      secrets_cookie: "<openssl rand -hex 16>"   # generate ONCE, keep stable
      secrets_cipher: "<openssl rand -hex 16>"   # must be exactly 32 chars
      smtp_connection_uri: "smtps://USER:PASS@smtp.fastmail.com:465/"
      google_client_secret: "<from Google Cloud Console>"
      apple_private_key: |
        -----BEGIN PRIVATE KEY-----
        ...
        -----END PRIVATE KEY-----
    

    secrets_cookie / secrets_cipher must stay stable forever — rotating them invalidates every session and makes encrypted data unreadable.
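
As a small sanity check for step 6, the two secrets can be generated and length-checked before they go into config.yaml. The helper names below are hypothetical; the only facts relied on are that openssl rand -hex 16 prints 16 random bytes as 32 hex characters, and that secrets_cipher must be exactly 32 characters.

```shell
# Hypothetical helpers for the kratos: secrets; not part of the deploy scripts.

# Prints a fresh 32-character hex string (16 random bytes, hex-encoded).
generate_kratos_secret() {
  openssl rand -hex 16
}

# Succeeds only when the argument is exactly 32 characters long,
# the length required of secrets_cipher.
is_valid_cipher_secret() {
  [ "${#1}" -eq 32 ]
}

# Usage (illustrative):
#   secret="$(generate_kratos_secret)"
#   is_valid_cipher_secret "$secret" && echo "ok to paste into config.yaml"
```

Run the generator once per secret and store the result; regenerating on every deploy would defeat the stability requirement above.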

Deploy

cd honeyDueAPI-go
export KUBECONFIG="$(pwd)/deploy-k3s/kubeconfig"
./deploy-k3s/scripts/02-setup-secrets.sh   # creates kratos-secrets from config.yaml
./deploy-k3s/scripts/03-deploy.sh          # applies kratos manifests, runs migrate, rolls out the Deployment

03-deploy.sh applies configmap.yaml → runs migrate-job.yaml → waits → applies kratos.yaml + ingress.yaml.
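
The ordering above might look roughly like this in shell. It is a sketch of the sequence only, not the real 03-deploy.sh: the function name, the KUBECTL and MANIFEST_DIR variables, and the 300s timeout are assumptions. The namespace, file names, and the kratos-migrate Job name come from this README.

```shell
# Sketch of the apply order only. KUBECTL can be overridden
# (for example KUBECTL=echo) to print the commands instead of running them.
KUBECTL="${KUBECTL:-kubectl}"
MANIFEST_DIR="${MANIFEST_DIR:-deploy-k3s/manifests/kratos}"

deploy_kratos() {
  local ns="honeydue"
  # 1. Config first, so the migrate Job and the Deployment can read it.
  "$KUBECTL" apply -n "$ns" -f "$MANIFEST_DIR/configmap.yaml" || return 1
  # 2. Schema migration runs as a Job.
  "$KUBECTL" apply -n "$ns" -f "$MANIFEST_DIR/migrate-job.yaml" || return 1
  # 3. Block until the Job completes (timeout value is an assumption).
  "$KUBECTL" wait -n "$ns" --for=condition=complete job/kratos-migrate --timeout=300s || return 1
  # 4. Only then roll out Kratos itself and expose it.
  "$KUBECTL" apply -n "$ns" -f "$MANIFEST_DIR/kratos.yaml" -f "$MANIFEST_DIR/ingress.yaml"
}
```

Waiting on the Job before applying kratos.yaml means the pods never start against an unmigrated schema. Calling deploy_kratos with KUBECTL=echo gives a dry run of the four commands.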

Verify

  • kubectl -n honeydue get pods -l app.kubernetes.io/name=kratos — 2/2 Running
  • kubectl -n honeydue logs job/kratos-migrate — migration succeeded
  • curl https://auth.myhoneydue.com/health/ready → {"status":"ok"}
  • curl https://auth.myhoneydue.com/self-service/registration/api — returns a flow
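
The readiness check can also be scripted. The helper below is a hypothetical sketch that only inspects a response body for the {"status":"ok"} shape shown above; fetching the body is left to the caller.

```shell
# Hypothetical helper: succeeds when a /health/ready response body
# reports status "ok". Optional whitespace around the colon is tolerated.
health_body_is_ok() {
  printf '%s' "$1" | grep -qE '"status"[[:space:]]*:[[:space:]]*"ok"'
}

# Usage (illustrative; needs network access to the deployed host):
#   body="$(curl -fsS https://auth.myhoneydue.com/health/ready)" \
#     && health_body_is_ok "$body" \
#     && echo "kratos is ready"
```

Keeping the body check separate from the curl call lets it run in CI without reaching the cluster.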

Not yet done (later phases)

  • Phase 2 — honeyDue Go backend: swap middleware/auth.go for Kratos session validation, drop the hand-rolled auth code, rebuild the users table keyed on the Kratos identity ID.
  • Phase 3 — KMP mobile + Next.js web clients point at Kratos flows.
  • Admin-panel auth stays on its own JWT (out of scope).