Replaces the multipart-via-API path for image uploads with a three-step
direct-to-storage flow:
1. Client POSTs /api/uploads/presign with content_length + content_type;
server validates size (10 MB cap), mime allow-list per category, rate
limit (50/hour/user via Redis sliding window), and concurrent unclaimed
cap (10 in-flight per user). On success it persists a pending_uploads
row, signs an S3 POST policy with content-length-range bound to the
claimed length ±256 bytes, and returns the URL+fields.
2. Client POSTs the bytes directly to B2 using the signed policy. B2
enforces size, content-type, and key match before accepting.
3. Client passes upload_ids[] to /api/task-completions/ or /api/documents/.
Service HEADs each B2 object, verifies size matches expected_bytes
within slack, marks pending_uploads claimed_at, and creates the
associated TaskCompletionImage / DocumentImage rows.
Bytes never traverse our API server. The 1 MB Echo BodyLimit middleware
that was rejecting all task-completion image uploads becomes irrelevant
for this path. Existing multipart endpoints stay functional alongside,
soak-testing the new path before legacy removal.
Cleanup:
- cmd/worker registers a new hourly cron (TypeUploadCleanup, "30 * * * *")
that reaps pending_uploads where claimed_at IS NULL AND expires_at < NOW().
Reaps both the B2 object and the row.
- B2 bucket lifecycle rule on `uploads/` prefix (7 days hide → 1 day delete)
documented in deploy-k3s/manifests/b2-lifecycle.md as a backstop.
Schema:
- migrations/000002_pending_uploads.sql adds the table + partial index for
cleanup + nullable pending_upload_id FKs on task_taskcompletionimage and
task_documentimage.
Policy (single tier, no free/pro split):
- 10 MB cap per upload
- 50 presigns/hour/user
- 10 concurrent unclaimed uploads/user
- allow-list: jpeg/png/heic/heif/webp for image categories;
+ pdf for document_file
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
GET /api/subscription/status/ was the slowest endpoint in the API at
p50≈1750ms / p95≈2425ms — about 12× the floor for our cluster→Neon
geography. Jaeger traces showed seven sequential SQL queries each
costing roughly one transatlantic RTT (~110ms), with the actual queries
running in 0.073ms at the database. Pure network serialization, not slow
SQL.
Three changes, in order of leverage:
1. Cache the assembled SubscriptionStatusResponse per-user in Redis with
a 5-minute TTL. Hot path collapses to a single Redis GET (~5ms) on
warm reads; the TTL is a safety net against missed invalidations.
2. Parallelize the three independent COUNT queries in getUserUsage
(task_task / task_contractor / task_document) via golang.org/x/sync
errgroup. Three RTTs collapse to one. Also dropped the redundant
residence_residence COUNT — len(residenceIDs) from FindResidenceIDsByOwner
is the same number, no need to re-query.
3. Wire explicit invalidation into every mutation that could change a
user's response — residence/task/contractor/document CRUD,
residence membership changes (JoinWithCode, RemoveUser, DeleteResidence),
and every subscription tier flip across the IAP/Stripe/webhook surface.
Residence-scoped invalidations fan out to every user with access via a
new ResidenceRepository.FindUserIDsByResidence helper, so members of a
shared residence don't see stale `usage` numbers when another member
adds a task.
Net effect: warm path goes from ~1350ms to ~5ms (Redis hit). Cold path
goes from ~1350ms to ~250-450ms (5 sequential queries → 2 phases:
residence IDs lookup, then parallel task/contractor/document counts).
Also fixed a pre-existing CheckLimit signature drift in
internal/integration/subscription_is_free_test.go that was blocking the
package build.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Stack of optimizations against the same Hetzner→Neon transatlantic link.
The trace revealed every visible ms was network/proxy overhead — DB
execution itself is sub-millisecond per query (verified via EXPLAIN
ANALYZE: index scans on every hot path).
Connection layer:
- DB_HOST → Neon pooler endpoint (-pooler suffix). PgBouncer
transaction-mode keeps backend Postgres connections warm so we no
longer pay the ~110ms Postgres-startup RTT on cold queries.
- GORM pool tuned: MaxIdleConns 10→20, MaxLifetime 600s→1800s,
MaxIdleTime added (default 0 = never close idle).
- Eager pool warm-up at boot via parallel pings — first user request
no longer pays the ~440ms TCP+TLS+startup handshake.
- Redis maxmemory-policy noeviction → allkeys-lru. Cache writes will
evict cold keys instead of erroring at the 256MB limit.
Auth layer:
- TokenCacheTTL 5min → 1 hour (Redis token cache).
- UserCacheTTL 30s → 5min (in-memory User cache, per pod).
- UserCache gains a 5,000-entry LRU cap so a flood of unique users
can't blow up pod RSS. ~5MB worst-case per pod.
- Token + user lookup collapsed from 2 GORM Preload queries into a
single INNER JOIN. Saves 1 RTT per cold-cache request.
- Auth middleware's m.db.* now use db.WithContext(ctx) so the SQL
spans nest under the parent HTTP request in Jaeger.
Service layer:
- TaskService.ListTasks: replaced two-step
FindResidenceIDsByUser → GetKanbanDataForMultipleResidences
with a single GetKanbanDataForUser that uses a Postgres subquery
for residence-access. One round-trip instead of two.
- New CacheService residence-IDs cache: \"residence_ids_user:<id>\"
with 5-min TTL. Wired into Task/Residence/Contractor/Document
services for the four hot read paths that need this list.
- Cache invalidation on every relevant mutation: CreateResidence,
DeleteResidence, JoinWithCode, RemoveUser. DeleteResidence
invalidates every member of the residence, not just the owner.
What this stacks up to (Hetzner→Neon, before US migration):
Path Before After (target)
Cache-warm authed read ~800ms ~100-200ms
Cache-cold authed read (1st in 1hr) ~2500ms ~500-700ms
First request after deploy ~2500ms ~700-900ms
The endgame US-region migration on top of this gets us to ~30-50ms
warm-cache, but we're shippable at ~150ms warm right now.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Every public method on these five services now takes ctx context.Context as
the first arg and routes its repo calls through .WithContext(ctx). With
TaskService and ResidenceService already migrated, this means every
in-process service that touches Postgres now produces a flame graph in
Jaeger where the SQL spans nest under the parent HTTP request span.
Endpoints now fully traced (HTTP → service → SQL):
- /api/auth/login, /register, /logout, /me, /verify-email, /resend-verification
- /api/auth/forgot-password, /verify-reset, /reset-password, /update-profile
- /api/contractors/* (CRUD + favorite + by-residence + tasks)
- /api/documents/* (CRUD + activate/deactivate + image upload/delete)
- /api/notifications/* (list, count, mark-read, prefs, devices)
- /api/subscription/* (status, purchase, cancel, triggers, promotions)
- All previously-migrated /api/tasks/* and /api/residences/* paths
Internal helpers also threaded:
- TaskService.sendTaskCompletedNotification → forwards ctx
- TaskService.UpdateUserTimezone → forwards ctx to NotificationService
- ResidenceService.CreateResidence → forwards ctx to SubscriptionService.CheckLimit
- NotificationService.registerAPNSDevice / registerGCMDevice → both take ctx
~75 method signatures, ~120 handler/test call sites updated. Tests pass
green; the only failure is the pre-existing flaky TaskHandler_QuickComplete
SQLite race that fails ~60% of runs on master.
Step 3 of the observability plan is now genuinely complete: every API
endpoint backed by a Go service emits a per-request flame graph with
HTTP → service → SQL spans, plus B2/APNs/FCM/asynq spans where applicable.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- Add document list filter support (residence, type, category, contractor, is_active, expiring_soon, search) to handler/service/repo
- Add `days` query param parsing to ListTasks handler (matches ListTasksByResidence)
- Add `error.invalid_token` i18n key to all 9 non-English locale files
- Update contract test to include VerificationResponse mapping
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Database Indexes (migrations 006-009):
- Add case-insensitive indexes for auth lookups (email, username)
- Add composite indexes for task kanban queries
- Add indexes for notification, document, and completion queries
- Add unique index for active share codes
- Remove redundant idx_share_code_active and idx_notification_user_sent
Repository Optimizations:
- Add FindResidenceIDsByUser() lightweight method (IDs only, no preloads)
- Optimize GetResidenceUsers() with single UNION query (was 2 queries)
- Optimize kanban completion preloads to minimal columns (id, task_id, completed_at)
Service Optimizations:
- Remove Category/Priority/Frequency preloads from task queries
- Remove summary calculations from CRUD responses (client calculates)
- Use lightweight FindResidenceIDsByUser() instead of full FindByUser()
These changes reduce database load and response times for common operations.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add TaskCompletionImage and DocumentImage models with one-to-many relationships
- Update admin panel to display images for completions and documents
- Add image arrays to API request/response DTOs
- Update repositories with Preload("Images") for eager loading
- Fix seed SQL execution to use raw SQL instead of prepared statements
- Fix table names in seed file (admin_users, push_notifications_*)
- Add comprehensive seed test data with 34 completion images and 24 document images
- Add subscription limitations admin feature with toggle
- Update admin sidebar with limitations link
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Features:
- PDF service for generating task reports with ReportLab-style formatting
- Storage service for file uploads (local and S3-compatible)
- Admin authentication middleware with JWT support
- Admin user model and repository
Infrastructure:
- Updated Docker configuration for admin panel builds
- Email service enhancements for task notifications
- Updated router with admin and file upload routes
- Environment configuration updates
Tests:
- Unit tests for handlers (auth, residence, task)
- Unit tests for models (user, residence, task)
- Unit tests for repositories (user, residence, task)
- Unit tests for services (residence, task)
- Integration test setup
- Test utilities for mocking database and services
Database:
- Admin user seed data
- Updated test data seeds
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Complete rewrite of Django REST API to Go with:
- Gin web framework for HTTP routing
- GORM for database operations
- GoAdmin for admin panel
- Gorush integration for push notifications
- Redis for caching and job queues
Features implemented:
- User authentication (login, register, logout, password reset)
- Residence management (CRUD, sharing, share codes)
- Task management (CRUD, kanban board, completions)
- Contractor management (CRUD, specialties)
- Document management (CRUD, warranties)
- Notifications (preferences, push notifications)
- Subscription management (tiers, limits)
Infrastructure:
- Docker Compose for local development
- Database migrations and seed data
- Admin panel for data management
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>