88fb1751c7
Stack of optimizations against the same Hetzner→Neon transatlantic link.

The trace revealed every visible ms was network/proxy overhead: DB execution itself is sub-millisecond per query (verified via EXPLAIN ANALYZE: index scans on every hot path).

Connection layer:
- DB_HOST → Neon pooler endpoint (-pooler suffix). PgBouncer transaction-mode keeps backend Postgres connections warm so we no longer pay the ~110ms Postgres-startup RTT on cold queries.
- GORM pool tuned: MaxIdleConns 10→20, MaxLifetime 600s→1800s, MaxIdleTime added (default 0 = never close idle).
- Eager pool warm-up at boot via parallel pings, so the first user request no longer pays the ~440ms TCP+TLS+startup handshake.
- Redis maxmemory-policy noeviction → allkeys-lru. Cache writes will evict cold keys instead of erroring at the 256MB limit.

Auth layer:
- TokenCacheTTL 5min → 1 hour (Redis token cache).
- UserCacheTTL 30s → 5min (in-memory User cache, per pod).
- UserCache gains a 5,000-entry LRU cap so a flood of unique users can't blow up pod RSS. ~5MB worst-case per pod.
- Token + user lookup collapsed from 2 GORM Preload queries into a single INNER JOIN. Saves 1 RTT per cold-cache request.
- Auth middleware's m.db.* now use db.WithContext(ctx) so the SQL spans nest under the parent HTTP request in Jaeger.

Service layer:
- TaskService.ListTasks: replaced two-step FindResidenceIDsByUser → GetKanbanDataForMultipleResidences with a single GetKanbanDataForUser that uses a Postgres subquery for residence-access. One round-trip instead of two.
- New CacheService residence-IDs cache: "residence_ids_user:<id>" with 5-min TTL. Wired into Task/Residence/Contractor/Document services for the four hot read paths that need this list.
- Cache invalidation on every relevant mutation: CreateResidence, DeleteResidence, JoinWithCode, RemoveUser. DeleteResidence invalidates every member of the residence, not just the owner.
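The residence-IDs cache and its invalidation rule from the Service layer notes can be sketched roughly as follows. This is a minimal illustration, not the actual CacheService: an in-memory map stands in for Redis, and the names Store, GetResidenceIDs, InvalidateResidenceIDs, and residenceIDsKey are hypothetical. Only the key format and the 5-minute TTL come from the commit message.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// residenceIDsTTL mirrors the 5-min TTL named in the commit message.
const residenceIDsTTL = 5 * time.Minute

// cached is one cache entry: the ID list plus its expiry time.
type cached struct {
	ids       []uint
	expiresAt time.Time
}

// Store is a hypothetical in-memory stand-in for the Redis-backed cache.
type Store struct {
	mu   sync.Mutex
	data map[string]cached
}

func NewStore() *Store { return &Store{data: make(map[string]cached)} }

// residenceIDsKey builds the "residence_ids_user:<id>" key.
func residenceIDsKey(userID uint) string {
	return fmt.Sprintf("residence_ids_user:%d", userID)
}

// GetResidenceIDs returns the cached list, or calls load (the DB query)
// on a miss or expired entry and caches the result.
func (s *Store) GetResidenceIDs(userID uint, load func(uint) ([]uint, error)) ([]uint, error) {
	key := residenceIDsKey(userID)
	s.mu.Lock()
	c, ok := s.data[key]
	s.mu.Unlock()
	if ok && time.Now().Before(c.expiresAt) {
		return c.ids, nil
	}
	ids, err := load(userID)
	if err != nil {
		return nil, err
	}
	s.mu.Lock()
	s.data[key] = cached{ids: ids, expiresAt: time.Now().Add(residenceIDsTTL)}
	s.mu.Unlock()
	return ids, nil
}

// InvalidateResidenceIDs drops the cached list for every given user.
// A DeleteResidence-style mutation would pass all members, not just the owner.
func (s *Store) InvalidateResidenceIDs(userIDs ...uint) {
	s.mu.Lock()
	defer s.mu.Unlock()
	for _, id := range userIDs {
		delete(s.data, residenceIDsKey(id))
	}
}

func main() {
	store := NewStore()
	dbCalls := 0
	load := func(uint) ([]uint, error) { dbCalls++; return []uint{10, 11}, nil }

	store.GetResidenceIDs(7, load)     // miss: hits the "DB"
	store.GetResidenceIDs(7, load)     // hit: served from cache
	store.InvalidateResidenceIDs(7, 8) // owner 7 and member 8
	store.GetResidenceIDs(7, load)     // miss again after invalidation
	fmt.Println(dbCalls)               // prints 2
}
```

Passing every member to the invalidation call is the point of the DeleteResidence note: a member whose key is left in place would keep seeing the deleted residence until the TTL expires.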
What this stacks up to (Hetzner→Neon, before US migration):

  Path                                 Before    After (target)
  Cache-warm authed read               ~800ms    ~100-200ms
  Cache-cold authed read (1st in 1hr)  ~2500ms   ~500-700ms
  First request after deploy           ~2500ms   ~700-900ms

The endgame US-region migration on top of this gets us to ~30-50ms warm-cache, but we're shippable at ~150ms warm right now.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
111 lines
3.3 KiB
YAML
apiVersion: apps/v1
kind: Deployment
metadata:
  name: redis
  namespace: honeydue
  labels:
    app.kubernetes.io/name: redis
    app.kubernetes.io/part-of: honeydue
spec:
  replicas: 1
  strategy:
    type: Recreate  # ReadWriteOnce PVC can't attach to two pods
  selector:
    matchLabels:
      app.kubernetes.io/name: redis
  template:
    metadata:
      labels:
        app.kubernetes.io/name: redis
        app.kubernetes.io/part-of: honeydue
    spec:
      serviceAccountName: redis
      nodeSelector:
        honeydue/redis: "true"
      securityContext:
        runAsNonRoot: true
        runAsUser: 999
        runAsGroup: 999
        fsGroup: 999
        seccompProfile:
          type: RuntimeDefault
      containers:
        - name: redis
          image: redis:7-alpine
          command:
            - sh
            - -c
            - |
              # allkeys-lru: under memory pressure, evict the least-recently-used key.
              # honeyDue uses Redis as a cache + asynq queue. The cache layer falls
              # through to DB on miss, so eviction is graceful. asynq keys with TTLs
              # would be evicted only after older cache entries are gone.
              ARGS="--appendonly yes --appendfsync everysec --maxmemory 256mb --maxmemory-policy allkeys-lru"
              if [ -n "$REDIS_PASSWORD" ]; then
                ARGS="$ARGS --requirepass $REDIS_PASSWORD"
              fi
              exec redis-server $ARGS
          ports:
            - containerPort: 6379
              protocol: TCP
          securityContext:
            allowPrivilegeEscalation: false
            readOnlyRootFilesystem: true
            capabilities:
              drop: ["ALL"]
          env:
            - name: REDIS_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: honeydue-secrets
                  key: REDIS_PASSWORD
                  optional: true
          volumeMounts:
            - name: redis-data
              mountPath: /data
            - name: tmp
              mountPath: /tmp
          resources:
            requests:
              cpu: 100m
              memory: 128Mi
            limits:
              cpu: 500m
              memory: 512Mi
          readinessProbe:
            exec:
              command:
                - sh
                - -c
                - |
                  if [ -n "$REDIS_PASSWORD" ]; then
                    redis-cli -a "$REDIS_PASSWORD" ping 2>/dev/null | grep -q PONG
                  else
                    redis-cli ping | grep -q PONG
                  fi
            initialDelaySeconds: 5
            periodSeconds: 10
            timeoutSeconds: 5
          livenessProbe:
            exec:
              command:
                - sh
                - -c
                - |
                  if [ -n "$REDIS_PASSWORD" ]; then
                    redis-cli -a "$REDIS_PASSWORD" ping 2>/dev/null | grep -q PONG
                  else
                    redis-cli ping | grep -q PONG
                  fi
            initialDelaySeconds: 15
            periodSeconds: 20
            timeoutSeconds: 5
      volumes:
        - name: redis-data
          persistentVolumeClaim:
            claimName: redis-data
        - name: tmp
          emptyDir:
            medium: Memory
            sizeLimit: 64Mi