88fb1751c7
Stack of optimizations against the same Hetzner→Neon transatlantic link.

The trace revealed that every visible ms was network/proxy overhead; DB execution itself is sub-millisecond per query (verified via EXPLAIN ANALYZE: index scans on every hot path).

Connection layer:
- DB_HOST → Neon pooler endpoint (-pooler suffix). PgBouncer transaction-mode keeps backend Postgres connections warm so we no longer pay the ~110ms Postgres-startup RTT on cold queries.
- GORM pool tuned: MaxIdleConns 10→20, MaxLifetime 600s→1800s, MaxIdleTime added (default 0 = never close idle).
- Eager pool warm-up at boot via parallel pings, so the first user request no longer pays the ~440ms TCP+TLS+startup handshake.
- Redis maxmemory-policy noeviction → allkeys-lru. Cache writes will evict cold keys instead of erroring at the 256MB limit.

Auth layer:
- TokenCacheTTL 5min → 1 hour (Redis token cache).
- UserCacheTTL 30s → 5min (in-memory User cache, per pod).
- UserCache gains a 5,000-entry LRU cap so a flood of unique users can't blow up pod RSS. ~5MB worst-case per pod.
- Token + user lookup collapsed from 2 GORM Preload queries into a single INNER JOIN. Saves 1 RTT per cold-cache request.
- Auth middleware's m.db.* now use db.WithContext(ctx) so the SQL spans nest under the parent HTTP request in Jaeger.

Service layer:
- TaskService.ListTasks: replaced two-step FindResidenceIDsByUser → GetKanbanDataForMultipleResidences with a single GetKanbanDataForUser that uses a Postgres subquery for residence-access. One round-trip instead of two.
- New CacheService residence-IDs cache: "residence_ids_user:<id>" with 5-min TTL. Wired into Task/Residence/Contractor/Document services for the four hot read paths that need this list.
- Cache invalidation on every relevant mutation: CreateResidence, DeleteResidence, JoinWithCode, RemoveUser. DeleteResidence invalidates every member of the residence, not just the owner.

What this stacks up to (Hetzner→Neon, before US migration):

  Path                                  Before     After (target)
  Cache-warm authed read                ~800ms     ~100-200ms
  Cache-cold authed read (1st in 1hr)   ~2500ms    ~500-700ms
  First request after deploy            ~2500ms    ~700-900ms

The endgame US-region migration on top of this gets us to ~30-50ms warm-cache, but we're shippable at ~150ms warm right now.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
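The commit message mentions a 5,000-entry LRU cap on the in-memory UserCache but does not show it. The following is a minimal sketch of how such a cap could work, assuming a map plus a doubly linked list from `container/list`. The names `cappedUserCache`, `userEntry`, `Put`, `Get`, and `Len` are hypothetical and not taken from the repository, and the 5-minute TTL is deliberately left out to keep the eviction logic visible.

```go
package main

import (
	"container/list"
	"fmt"
	"sync"
)

// userEntry is a hypothetical cached value; the real User model is not shown in the source.
type userEntry struct {
	id   uint
	name string
}

// cappedUserCache is an illustrative LRU-capped cache, not the project's UserCache.
type cappedUserCache struct {
	mu    sync.Mutex
	limit int                    // maximum number of entries (5,000 in the commit message)
	order *list.List             // front = most recently used, back = least recently used
	items map[uint]*list.Element // userID -> element in order
}

func newCappedUserCache(limit int) *cappedUserCache {
	return &cappedUserCache{limit: limit, order: list.New(), items: make(map[uint]*list.Element)}
}

// Get returns the cached name and marks the entry as most recently used.
func (c *cappedUserCache) Get(id uint) (string, bool) {
	c.mu.Lock()
	defer c.mu.Unlock()
	el, ok := c.items[id]
	if !ok {
		return "", false
	}
	c.order.MoveToFront(el)
	return el.Value.(*userEntry).name, true
}

// Put inserts or updates an entry and evicts the least recently used one
// when the cache grows past its limit.
func (c *cappedUserCache) Put(id uint, name string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if el, ok := c.items[id]; ok {
		el.Value.(*userEntry).name = name
		c.order.MoveToFront(el)
		return
	}
	c.items[id] = c.order.PushFront(&userEntry{id: id, name: name})
	if c.order.Len() > c.limit {
		oldest := c.order.Back()
		c.order.Remove(oldest)
		delete(c.items, oldest.Value.(*userEntry).id)
	}
}

// Len reports the current number of cached entries.
func (c *cappedUserCache) Len() int {
	c.mu.Lock()
	defer c.mu.Unlock()
	return c.order.Len()
}

func main() {
	c := newCappedUserCache(2)
	c.Put(1, "a")
	c.Put(2, "b")
	c.Get(1)      // user 1 becomes most recently used
	c.Put(3, "c") // over the limit: evicts user 2
	_, ok := c.Get(2)
	fmt.Println(ok, c.Len()) // prints "false 2"
}
```

Because the entry count is bounded, worst-case memory is bounded by the limit times the per-entry size, which is what keeps a flood of unique users from growing pod RSS.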
42 lines
1.2 KiB
Go
package services

import (
	"context"

	"github.com/treytartt/honeydue-api/internal/repositories"
)

// cachedResidenceIDsForUser fetches the residence-ID list for a user, going
// through Redis (5-min TTL) before falling back to Postgres.
//
// Used on every authed read path (tasks, documents, contractors, summary)
// because the list rarely changes: only on share-code accept, member
// removal, or residence delete. Callers must invalidate after mutations
// via cache.InvalidateResidenceIDsForUsers.
//
// A nil cache is permitted; the function falls through to the repo
// directly, so this works in tests and in failure modes.
func cachedResidenceIDsForUser(
	ctx context.Context,
	cache *CacheService,
	residenceRepo *repositories.ResidenceRepository,
	userID uint,
) ([]uint, error) {
	if cache != nil {
		if ids, err := cache.GetCachedResidenceIDsForUser(ctx, userID); err == nil {
			return ids, nil
		}
	}

	ids, err := residenceRepo.WithContext(ctx).FindResidenceIDsByUser(userID)
	if err != nil {
		return nil, err
	}

	if cache != nil {
		// Best-effort cache fill; don't fail the request on Redis hiccup.
		_ = cache.CacheResidenceIDsForUser(ctx, userID, ids)
	}
	return ids, nil
}