feat(kratos): deploy Ory Kratos to production (Apple-only OIDC)

Auth was structurally broken: the api's Kratos middleware pointed at
http://kratos:4433, but Kratos was never deployed. The only thing keeping
users logged in was a 5-minute Redis cache. Once an entry expired, the
middleware called Whoami, the kratos hostname failed to resolve, the
request got a 401, and the user was forced to log in again with no
working login path.
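
The failure chain can be modelled in a few lines. This is a rough Python sketch, not the api's actual middleware (which this commit does not show): `check_session`, `whoami_without_kratos`, and the in-memory `cache` dict are hypothetical stand-ins for the real middleware, the Whoami call, and the Redis cache.

```python
import socket

CACHE_TTL_SECONDS = 300  # the 5-minute cache described in the commit message


def whoami_without_kratos(token):
    # Hypothetical stand-in: with no kratos Service, the hostname lookup fails.
    raise socket.gaierror("kratos: name does not resolve")


def check_session(token, cache, now, whoami):
    """Return the HTTP status this modelled middleware would answer with."""
    cached_at = cache.get(token)
    if cached_at is not None and now - cached_at < CACHE_TTL_SECONDS:
        return 200  # served from cache; Kratos is never contacted
    try:
        whoami(token)
    except OSError:
        return 401  # an unreachable upstream is treated as "not authenticated"
    cache[token] = now
    return 200


cache = {"session-abc": 0}  # session cached at t=0, while login still worked
print(check_session("session-abc", cache, 299, whoami_without_kratos))  # 200
print(check_session("session-abc", cache, 301, whoami_without_kratos))  # 401
```

Under these assumptions the same session flips from 200 to 401 as soon as the cache entry ages out, with no change on the user's side.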

This commit deploys Kratos for real:

Manifests:
  - kratos.yaml + migrate-job.yaml: pin oryd/kratos:v26.2.0@sha256:92eedc...
    (CalVer current stable as of 2026-06-03)
  - configmap.yaml: drop Google OIDC provider (not in scope); fill the
    Apple provider with real Services ID / Team ID / Key ID — Apple now
    sits at providers[0]
  - kratos.yaml: drop the Google-secret env binding; rebind APPLE_PRIVATE_KEY
    to PROVIDERS_0_APPLE_PRIVATE_KEY (shifted from index 1)
  - network-policies.yaml: add a kratos egress rule to allow-egress-from-api.
    Without this, even with kratos running, the api gets "connection refused"
    on http://kratos:4433 (post-DNAT NetworkPolicy enforcement — runbook §9.2).
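
The rebinding from index 1 to index 0 follows from the env var name encoding the provider's position in the list. A minimal Python sketch, assuming only the naming pattern visible in this commit's manifests; the helper `provider_env_var` is hypothetical and not part of Kratos:

```python
PREFIX = "SELFSERVICE_METHODS_OIDC_CONFIG_PROVIDERS"


def provider_env_var(providers, provider_id, field):
    """Build the env var name that targets `field` of one provider.

    The index is the provider's position in the list, so removing an
    earlier entry renames the variable for every entry after it.
    """
    index = [p["id"] for p in providers].index(provider_id)
    return f"{PREFIX}_{index}_{field.upper()}"


before = [{"id": "google"}, {"id": "apple"}]  # provider order before this commit
after = [{"id": "apple"}]                     # Apple-only, as deployed here

print(provider_env_var(before, "apple", "apple_private_key"))
print(provider_env_var(after, "apple", "apple_private_key"))
```

The first call yields the `..._PROVIDERS_1_APPLE_PRIVATE_KEY` name and the second the `..._PROVIDERS_0_APPLE_PRIVATE_KEY` name, which is why the env binding in kratos.yaml has to move whenever the providers list is reordered.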

Operator prerequisites that were completed alongside this commit:
  - Neon kratos database created (separate from honeyDue, owner neondb_owner)
  - Cloudflare DNS for auth.myhoneydue.com (3 A records, proxied)
  - kratos: block added to config.yaml (gitignored): DSN to the Neon DIRECT
    endpoint, cookie + cipher secrets generated, Fastmail SMTPS URI,
    .p8 contents inline

Out of scope intentionally:
  - Google sign-in (additive; can append providers[] later)
  - Migrating existing auth_user rows onto Kratos identities — pre-prod;
    existing users will need to sign in fresh, which creates a new Kratos
    identity and a new local user row (per migration plan in
    manifests/kratos/README.md).

Verified end-to-end:
  - 338 schema migrations applied successfully
  - 2/2 kratos pods Ready
  - api → kratos:4433/sessions/whoami returns 401 for invalid token (was
    "connection refused" before this commit's NetworkPolicy patch)
  - auth.myhoneydue.com resolves through CF; cloudflare-only middleware
    keeps the origin protected exactly like the other hostnames
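
The whoami check above distinguishes three outcomes, which can be written down as a small classifier. This is an illustrative Python sketch only: `classify_probe` and the injected `probe` callable are hypothetical, and a real probe built on an HTTP client would likely need to unwrap that client's own exception types first.

```python
def classify_probe(probe):
    """Map the outcome of one whoami probe to a diagnosis.

    `probe` is a hypothetical zero-argument callable that returns an HTTP
    status code or raises the underlying socket error.
    """
    try:
        status = probe()
    except ConnectionRefusedError:
        return "blocked"      # the pre-patch symptom: "connection refused"
    except OSError:
        return "unreachable"  # e.g. DNS failure or timeout
    if status == 401:
        return "reachable"    # Kratos answered and rejected the invalid token
    return f"unexpected status {status}"


def refused():
    raise ConnectionRefusedError("connection refused")


print(classify_probe(refused))      # blocked
print(classify_probe(lambda: 401))  # reachable
```

A 401 for a deliberately invalid token counts as the success case here, because it shows the request reached Kratos and got an answer.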

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Trey t
2026-06-03 11:08:09 -05:00
parent 64c656bde1
commit 6de90acef7
4 changed files with 46 additions and 34 deletions
+15 -18
@@ -5,9 +5,10 @@
 # kratos-secrets Secret (see kratos.yaml). Kratos is configured natively via
 # env vars, so this is the idiomatic split — only non-secret config here.
 #
-# OPERATOR: replace the GOOGLE_OAUTH_CLIENT_ID / APPLE_* client-id placeholders
-# below with the real (non-secret) OAuth client identifiers once the Apple and
-# Google OAuth apps exist. The matching secrets go in kratos-secrets.
+# OIDC scope: Apple-only as of 2026-06-03. Google is intentionally absent;
+# adding it later is additive — append a `- id: google` block under
+# selfservice.methods.oidc.config.providers (it becomes index 1) and bind a
+# matching CLIENT_SECRET env in kratos.yaml.
 apiVersion: v1
 kind: ConfigMap
 metadata:
@@ -18,9 +19,9 @@ metadata:
     app.kubernetes.io/part-of: honeydue
 data:
   kratos.yml: |
-    # version must track the Kratos image tag — confirm against the deployed
-    # Kratos release (Ory uses CalVer, e.g. v26.x). See kratos/README.md.
-    version: v1.3.0
+    # version must track the Kratos image tag — kratos.yaml + migrate-job.yaml
+    # both pin oryd/kratos:v26.2.0 (2026-06-03). See kratos/README.md.
+    version: v1.3.0 # internal config schema version; do not change unless Kratos release notes require it
     serve:
       public:
@@ -57,20 +58,16 @@ data:
           enabled: true
           config:
             providers:
-              # index 0 — Google. client_secret is injected via env var
-              # SELFSERVICE_METHODS_OIDC_CONFIG_PROVIDERS_0_CLIENT_SECRET.
-              - id: google
-                provider: google
-                client_id: GOOGLE_OAUTH_CLIENT_ID
-                mapper_url: file:///etc/kratos/oidc.google.jsonnet
-                scope: [openid, email, profile]
-              # index 1 — Apple. apple_private_key is injected via env var
-              # SELFSERVICE_METHODS_OIDC_CONFIG_PROVIDERS_1_APPLE_PRIVATE_KEY.
+              # index 0 — Apple Sign In. apple_private_key (.p8 contents) is
+              # injected via env SELFSERVICE_METHODS_OIDC_CONFIG_PROVIDERS_0_APPLE_PRIVATE_KEY.
+              # client_id is the Apple Services ID (here: the bundle ID, which
+              # was configured as a Services ID with Sign In with Apple
+              # capability — see operator notes in README.md §5).
               - id: apple
                 provider: apple
-                client_id: APPLE_SERVICES_ID
-                apple_team_id: APPLE_TEAM_ID
-                apple_private_key_id: APPLE_PRIVATE_KEY_ID
+                client_id: com.myhoneydue.honeyDue
+                apple_team_id: X86BR9WTLD
+                apple_private_key_id: HQD3NCF99C
                 mapper_url: file:///etc/kratos/oidc.apple.jsonnet
                 scope: [openid, email, name]
+14 -13
@@ -1,14 +1,17 @@
 # Ory Kratos — identity service for honeyDue.
 #
-# Deployed only once the operator has completed the prerequisites in
-# kratos/README.md (Neon `kratos` database, auth.myhoneydue.com DNS, Apple +
-# Google OAuth apps, and the kratos-secrets Secret). Until then 03-deploy.sh
-# skips the Kratos apply, so the existing stack is unaffected.
+# Deployed once the operator has completed the prerequisites in kratos/README.md
+# (Neon `kratos` database, auth.myhoneydue.com DNS, Apple Sign In OIDC client,
+# and the kratos-secrets Secret). Until then 03-deploy.sh skips the Kratos
+# apply, so the existing stack is unaffected.
 #
-# IMAGE: oryd/kratos uses CalVer (v25.x / v26.x). The tag below is a
-# fail-loud placeholder — set the current stable tag and pin a @sha256:
-# digest (like redis/vmagent) before deploying. See kratos/README.md.
-# The schema-migration Job is in migrate-job.yaml (run before this).
+# IMAGE: pinned to oryd/kratos v26.2.0 (CalVer current stable as of 2026-06-03)
+# with the linux/amd64 digest. The schema-migration Job is in migrate-job.yaml
+# and runs before this Deployment rolls.
+#
+# OIDC: currently Apple-only (configmap.yaml providers[0]). Google was scoped
+# out at deploy time; adding it later is additive — append to providers[] in
+# configmap.yaml and add the matching CLIENT_SECRET env binding here.
 ---
 apiVersion: apps/v1
 kind: Deployment
@@ -41,7 +44,7 @@ spec:
           type: RuntimeDefault
       containers:
         - name: kratos
-          image: oryd/kratos:REPLACE_WITH_CURRENT_STABLE_TAG
+          image: oryd/kratos:v26.2.0@sha256:92eedc292ff8e1a918ac442c88ed0abe44610c75121700963114549908a45ac3
           imagePullPolicy: IfNotPresent
           args:
             - serve
@@ -65,10 +68,8 @@ spec:
             - name: COURIER_SMTP_CONNECTION_URI
               valueFrom: { secretKeyRef: { name: kratos-secrets, key: smtp_connection_uri } }
             # OIDC provider secrets — index must match the providers list
-            # order in configmap.yaml (0 = google, 1 = apple).
-            - name: SELFSERVICE_METHODS_OIDC_CONFIG_PROVIDERS_0_CLIENT_SECRET
-              valueFrom: { secretKeyRef: { name: kratos-secrets, key: google_client_secret } }
-            - name: SELFSERVICE_METHODS_OIDC_CONFIG_PROVIDERS_1_APPLE_PRIVATE_KEY
+            # order in configmap.yaml. Apple-only for now (index 0).
+            - name: SELFSERVICE_METHODS_OIDC_CONFIG_PROVIDERS_0_APPLE_PRIVATE_KEY
               valueFrom: { secretKeyRef: { name: kratos-secrets, key: apple_private_key } }
           volumeMounts:
             - name: config
+3 -3
@@ -2,8 +2,8 @@
 # database before the Kratos Deployment rolls. 03-deploy.sh applies this,
 # waits for completion, then applies kratos.yaml.
 #
-# IMAGE: set the same oryd/kratos tag as kratos.yaml (Ory CalVer v25.x/v26.x);
-# pin a @sha256: digest. See kratos/README.md.
+# IMAGE: pinned to oryd/kratos v26.2.0 (CalVer current stable as of 2026-06-03)
+# with the linux/amd64 digest. Bump in sync with kratos.yaml's image.
 apiVersion: batch/v1
 kind: Job
 metadata:
@@ -28,7 +28,7 @@ spec:
           type: RuntimeDefault
       containers:
         - name: kratos-migrate
-          image: oryd/kratos:REPLACE_WITH_CURRENT_STABLE_TAG
+          image: oryd/kratos:v26.2.0@sha256:92eedc292ff8e1a918ac442c88ed0abe44610c75121700963114549908a45ac3
           imagePullPolicy: IfNotPresent
           args: ["migrate", "sql", "-e", "--yes"]
           env:
@@ -140,6 +140,20 @@ spec:
       ports:
         - protocol: TCP
           port: 6379
+    # Kratos (in-cluster). The auth middleware validates every session via
+    # http://kratos:4433/sessions/whoami; the AuthService also uses :4434
+    # for account deletion (DELETE /admin/identities/{id}). k3s evaluates
+    # egress rules AFTER kube-proxy DNAT (runbook §9.2), so this podSelector
+    # rule covers Service ClusterIP traffic correctly.
+    - to:
+        - podSelector:
+            matchLabels:
+              app.kubernetes.io/name: kratos
+      ports:
+        - protocol: TCP
+          port: 4433
+        - protocol: TCP
+          port: 4434
     # External services: Neon DB (5432), SMTP (587), HTTPS (443 — APNs, FCM, B2, PostHog)
     - to:
         - ipBlock: