perf(subscription-status): cache + parallelize + invalidate on mutations

GET /api/subscription/status/ was the slowest endpoint in the API at
p50≈1750ms / p95≈2425ms — about 12× the floor for our cluster→Neon
geography. Jaeger traces showed seven sequential SQL queries each
costing roughly one transatlantic RTT (~110ms), with the actual queries
running in 0.073ms at the database. Pure network serialization, not slow
SQL.

Three changes, in order of leverage:

1. Cache the assembled SubscriptionStatusResponse per-user in Redis with
   a 5-minute TTL. Hot path collapses to a single Redis GET (~5ms) on
   warm reads; the TTL is a safety net against missed invalidations.

2. Parallelize the three independent COUNT queries in getUserUsage
   (task_task / task_contractor / task_document) via golang.org/x/sync
   errgroup. Three RTTs collapse to one. Also dropped the redundant
   residence_residence COUNT — len(residenceIDs) from FindResidenceIDsByOwner
   is the same number, no need to re-query.

3. Wire explicit invalidation into every mutation that could change a
   user's response — residence/task/contractor/document CRUD,
   residence membership changes (JoinWithCode, RemoveUser, DeleteResidence),
   and every subscription tier flip across the IAP/Stripe/webhook surface.
   Residence-scoped invalidations fan out to every user with access via a
   new ResidenceRepository.FindUserIDsByResidence helper, so members of a
   shared residence don't see stale `usage` numbers when another member
   adds a task.

Net effect: warm path goes from ~1350ms to ~5ms (Redis hit). Cold path
goes from ~1350ms to ~250-450ms (5 sequential queries → 2 phases:
residence IDs lookup, then parallel task/contractor/document counts).

Also fixed a pre-existing CheckLimit signature drift in
internal/integration/subscription_is_free_test.go that was blocking the
package build.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Trey t
2026-05-01 11:00:23 -07:00
parent 0798ae8d74
commit 9bee436e86
11 changed files with 286 additions and 34 deletions
+21
@@ -197,6 +197,9 @@ func (s *TaskService) CreateTask(ctx context.Context, req *requests.CreateTaskRe
 		return nil, apperrors.Internal(err)
 	}

+	// tasks_count for every member of this residence just changed.
+	invalidateSubStatusForResidence(ctx, s.cache, s.residenceRepo, req.ResidenceID)
+
 	return &responses.TaskWithSummaryResponse{
 		Data:    responses.NewTaskResponseWithTime(task, 30, now),
 		Summary: s.getSummaryForUser(userID),
@@ -273,6 +276,10 @@ func (s *TaskService) BulkCreateTasks(ctx context.Context, req *requests.BulkCre
 		created = append(created, responses.NewTaskResponseWithTime(t, 30, now))
 	}

+	// One residence per batch, so a single fanout invalidation covers all
+	// affected users.
+	invalidateSubStatusForResidence(ctx, s.cache, s.residenceRepo, req.ResidenceID)
+
 	return &responses.BulkCreateTasksResponse{
 		Tasks:   created,
 		Summary: s.getSummaryForUser(userID),
@@ -385,6 +392,8 @@ func (s *TaskService) DeleteTask(ctx context.Context, taskID, userID uint) (*res
 		return nil, apperrors.Internal(err)
 	}

+	invalidateSubStatusForResidence(ctx, s.cache, s.residenceRepo, task.ResidenceID)
+
 	return &responses.DeleteWithSummaryResponse{
 		Data:    "task deleted",
 		Summary: s.getSummaryForUser(userID),
@@ -469,6 +478,9 @@ func (s *TaskService) CancelTask(ctx context.Context, taskID, userID uint, now t
 		return nil, apperrors.Internal(err)
 	}

+	// CountByResidenceIDs filters out is_cancelled, so this drops tasks_count.
+	invalidateSubStatusForResidence(ctx, s.cache, s.residenceRepo, task.ResidenceID)
+
 	return &responses.TaskWithSummaryResponse{
 		Data:    responses.NewTaskResponseWithTime(task, 30, now),
 		Summary: s.getSummaryForUser(userID),
@@ -508,6 +520,9 @@ func (s *TaskService) UncancelTask(ctx context.Context, taskID, userID uint, now
 		return nil, apperrors.Internal(err)
 	}

+	// Reverse of Cancel — tasks_count goes back up.
+	invalidateSubStatusForResidence(ctx, s.cache, s.residenceRepo, task.ResidenceID)
+
 	return &responses.TaskWithSummaryResponse{
 		Data:    responses.NewTaskResponseWithTime(task, 30, now),
 		Summary: s.getSummaryForUser(userID),
@@ -551,6 +566,9 @@ func (s *TaskService) ArchiveTask(ctx context.Context, taskID, userID uint, now
 		return nil, apperrors.Internal(err)
 	}

+	// Same as Cancel — CountByResidenceIDs filters is_archived too.
+	invalidateSubStatusForResidence(ctx, s.cache, s.residenceRepo, task.ResidenceID)
+
 	return &responses.TaskWithSummaryResponse{
 		Data:    responses.NewTaskResponseWithTime(task, 30, now),
 		Summary: s.getSummaryForUser(userID),
@@ -590,6 +608,9 @@ func (s *TaskService) UnarchiveTask(ctx context.Context, taskID, userID uint, no
 		return nil, apperrors.Internal(err)
 	}

+	// Reverse of Archive — tasks_count goes back up.
+	invalidateSubStatusForResidence(ctx, s.cache, s.residenceRepo, task.ResidenceID)
+
 	return &responses.TaskWithSummaryResponse{
 		Data:    responses.NewTaskResponseWithTime(task, 30, now),
 		Summary: s.getSummaryForUser(userID),