Production is running with no Kratos deployed in-cluster (the deploy
script's kratos-secrets prerequisite isn't satisfied yet — see runbook
§11 #7). That means Whoami calls ALWAYS fail, so any time a user's Redis
session cache expires they get a 401, which the iOS app treats as session
invalid → forced re-login → can't re-authenticate because the same
Whoami is the only way back in.
Two-part mitigation:
1. Bump kratosSessionCacheTTL from 5 minutes to 24 hours. Active users
stay logged in indefinitely; idle users get bounced after a day.
2. Refresh the cache TTL on every successful cache hit (sliding window)
so usage-driven expiry is no longer a cliff at the original TTL.
When Kratos actually comes up:
- revert the TTL constant to a sensible value (1-15 min)
- the sliding-window refresh is fine to keep; it's good UX regardless
Caveat: this papers over the missing Kratos. New sign-ins still cannot
complete because the api needs Kratos to populate the cache the first
time. Real fix is to deploy Kratos.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Delegates all credential management (login, register, password reset,
email verification, social sign-in) to Ory Kratos. The Go API now acts
as a resource server: the new KratosAuth middleware validates sessions
against the Kratos whoami endpoint, writes the local User mirror into
Echo context, and all existing domain handlers continue working
unchanged. Hand-rolled token auth, AuthToken model, apple_auth/
google_auth services, and the auth refresh flow are removed. Tests are
updated to use the fake-token middleware pattern so existing integration
assertions require no rewrite.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>