2 Commits

Author SHA1 Message Date
Trey t 7b87f2e392 fix(kratos): drop cloudflare-only middleware on auth ingress
Backend CI / Test (push) Has been cancelled
Backend CI / Contract Tests (push) Has been cancelled
Backend CI / Lint (push) Has been cancelled
Backend CI / Secret Scanning (push) Has been cancelled
Backend CI / Build (push) Has been cancelled
iOS Sign In with Apple failed silently — the KMP client never reached
Kratos. Traced to the cloudflare-only Traefik middleware rejecting every
request at the auth ingress.

Root cause: on this cluster klipper-lb sits in front of Traefik and
SNATs the source IP. Traefik's ipAllowList sees the klipper-lb pod IP,
not Cloudflare's real source IP — so even legitimate iOS requests
proxied through Cloudflare get 403'd. The api ingress doesn't have
this middleware (and works correctly), so removing it from auth
matches the working pattern.

Kratos is the user-facing OIDC endpoint — every iOS/web user device
needs to reach it. Cloudflare's edge still does DDoS protection;
Kratos applies its own per-flow rate limits. The IP allowlist was
buying nothing here and breaking everything.

Verified after this change:
  - GET /health/alive → 200
  - GET /health/ready → 200
  - GET /self-service/login/api → 200 + valid flow body listing apple
    as an OIDC provider option

Related but not fixed by this commit: the same klipper-lb SNAT issue
affects admin.myhoneydue.com (which retains cloudflare-only). Admin
basic auth still gates real access there, but the IP check is dead
weight. Proper fix is configuring Traefik ipStrategy to read the
client IP from X-Forwarded-For (set by Cloudflare). Tracked as a
follow-up.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-06-03 11:14:35 -05:00
Trey t b66151ddd9 feat(auth): scaffold Ory Kratos identity service — phase 1 (infrastructure)
Backend CI / Test (push) Has been cancelled
Backend CI / Contract Tests (push) Has been cancelled
Backend CI / Lint (push) Has been cancelled
Backend CI / Secret Scanning (push) Has been cancelled
Backend CI / Build (push) Has been cancelled
First phase of replacing the hand-rolled auth (internal/services/auth_service.go
et al.) with Ory Kratos. This commit is infrastructure only — Kratos will run
but nothing consumes it yet; the Go API still does its own auth until phase 2.

Adds deploy-k3s/manifests/kratos/:
- configmap.yaml  — kratos.yml, identity schema, Google/Apple OIDC claim
                     mappers (no secrets in the ConfigMap)
- migrate-job.yaml — `kratos migrate sql`, run before the Deployment
- kratos.yaml     — Deployment (x2), Service, NetworkPolicies
- ingress.yaml    — auth.myhoneydue.com -> Kratos public API :4433
- README.md       — operator prerequisites + deploy runbook

Wiring:
- 02-setup-secrets.sh creates kratos-secrets, gated on a config.yaml `kratos:`
  block (DSN, cookie/cipher, SMTP URI, OIDC client secret, Apple key).
- 03-deploy.sh applies the Kratos manifests + runs the migrate Job, gated on
  the kratos-secrets Secret existing.

Both gates mean the existing stack deploys completely unaffected until the
operator completes the prerequisites (Neon `kratos` DB, auth.myhoneydue.com
DNS, Apple/Google OAuth apps, Kratos image version). Pre-production, so no
user-data migration — see manifests/kratos/README.md.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-18 16:24:38 -05:00