feat(uploads): direct-to-B2 presigned uploads with content-length-range policy

Adds a three-step direct-to-storage flow for image uploads, intended to
replace the multipart-via-API path once it has soaked:

  1. Client POSTs /api/uploads/presign with content_length + content_type;
     server validates size (10 MB cap), mime allow-list per category, rate
     limit (50/hour/user via Redis sliding window), and concurrent unclaimed
     cap (10 in-flight per user). On success it persists a pending_uploads
     row, signs an S3 POST policy with content-length-range bound to the
     claimed length ±256 bytes, and returns the URL+fields.
  2. Client POSTs the bytes directly to B2 using the signed policy. B2
     enforces size, content-type, and key match before accepting.
  3. Client passes upload_ids[] to /api/task-completions/ or /api/documents/.
     Service HEADs each B2 object, verifies size matches expected_bytes
     within slack, marks pending_uploads claimed_at, and creates the
     associated TaskCompletionImage / DocumentImage rows.
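A minimal sketch of the size binding in steps 1 and 3, under stated assumptions: the names `lengthRange`, `withinSlack`, and `buildPolicy`, the bucket and key values, and the 15-minute expiry are hypothetical and not taken from this commit. It builds only the unsigned S3 POST policy document; the signing step is omitted.

```go
package main

import (
	"encoding/base64"
	"encoding/json"
	"fmt"
	"time"
)

// slackBytes is the ±256 byte tolerance described in the commit message.
const slackBytes int64 = 256

// lengthRange returns the inclusive byte range allowed for a claimed length.
// Clamping the lower bound at zero is an assumption of this sketch.
func lengthRange(claimed int64) (int64, int64) {
	lo := claimed - slackBytes
	if lo < 0 {
		lo = 0
	}
	return lo, claimed + slackBytes
}

// withinSlack mirrors the step-3 check: the size reported by a HEAD on the
// stored object must fall inside the range derived from expected_bytes.
func withinSlack(actual, expected int64) bool {
	lo, hi := lengthRange(expected)
	return actual >= lo && actual <= hi
}

// buildPolicy returns the base64-encoded, unsigned POST policy document.
// The condition names follow the S3 POST policy format.
func buildPolicy(bucket, key, contentType string, claimed int64, expires time.Time) (string, error) {
	lo, hi := lengthRange(claimed)
	doc := map[string]any{
		"expiration": expires.UTC().Format("2006-01-02T15:04:05.000Z"),
		"conditions": []any{
			map[string]string{"bucket": bucket},
			map[string]string{"key": key},
			map[string]string{"Content-Type": contentType},
			[]any{"content-length-range", lo, hi},
		},
	}
	raw, err := json.Marshal(doc)
	if err != nil {
		return "", err
	}
	return base64.StdEncoding.EncodeToString(raw), nil
}

func main() {
	policy, err := buildPolicy("example-bucket", "uploads/42/abc.jpg",
		"image/jpeg", 1_000_000, time.Now().Add(15*time.Minute))
	if err != nil {
		panic(err)
	}
	fmt.Println(policy)
	fmt.Println(withinSlack(1_000_100, 1_000_000))
}
```

Using the same range helper for both the signed policy and the later HEAD check keeps the two size checks from drifting apart.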

Image bytes never traverse our API server, so the 1 MB Echo BodyLimit
middleware that was rejecting all task-completion image uploads no longer
applies to this path. The existing multipart endpoints stay functional
alongside it while the new path soaks, ahead of legacy removal.

Cleanup:
  - cmd/worker registers a new hourly cron (TypeUploadCleanup, "30 * * * *")
    that reaps pending_uploads where claimed_at IS NULL AND expires_at < NOW().
    Reaps both the B2 object and the row.
  - B2 bucket lifecycle rule on `uploads/` prefix (7 days hide → 1 day delete)
    documented in deploy-k3s/manifests/b2-lifecycle.md as a backstop.
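The reaping rule above can be sketched in memory as follows. This is an illustration only: the `pendingUpload` struct, `isReapable`, `reap`, and `demo` are hypothetical names, and deleting the object before the row (so a failed object delete leaves the row for the next run) is an assumption of this sketch, not something the commit states.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// pendingUpload is a trimmed, hypothetical stand-in for a pending_uploads row.
type pendingUpload struct {
	ID        uint
	B2Key     string
	ClaimedAt *time.Time
	ExpiresAt time.Time
}

// isReapable mirrors the predicate: claimed_at IS NULL AND expires_at < NOW().
func isReapable(u pendingUpload, now time.Time) bool {
	return u.ClaimedAt == nil && u.ExpiresAt.Before(now)
}

// reap deletes the stored object for every reapable row and drops the row.
// A row whose object delete fails is kept so a later run can retry it.
func reap(rows []pendingUpload, now time.Time, deleteObject func(key string) error) ([]pendingUpload, int) {
	var kept []pendingUpload
	reaped := 0
	for _, u := range rows {
		if !isReapable(u, now) {
			kept = append(kept, u)
			continue
		}
		if err := deleteObject(u.B2Key); err != nil {
			kept = append(kept, u)
			continue
		}
		reaped++
	}
	return kept, reaped
}

// demo runs reap over a small fixture and returns the surviving IDs.
func demo() ([]uint, int) {
	now := time.Date(2026, 5, 1, 12, 0, 0, 0, time.UTC)
	claimedAt := now.Add(-time.Hour)
	rows := []pendingUpload{
		{ID: 1, B2Key: "uploads/a", ExpiresAt: now.Add(-time.Minute)},                        // expired, unclaimed
		{ID: 2, B2Key: "uploads/b", ExpiresAt: now.Add(time.Hour)},                           // not yet expired
		{ID: 3, B2Key: "uploads/c", ClaimedAt: &claimedAt, ExpiresAt: now.Add(-time.Minute)}, // already claimed
		{ID: 4, B2Key: "uploads/fail", ExpiresAt: now.Add(-time.Minute)},                     // object delete fails
	}
	deleteObject := func(key string) error {
		if key == "uploads/fail" {
			return errors.New("simulated storage error")
		}
		return nil
	}
	kept, reaped := reap(rows, now, deleteObject)
	ids := make([]uint, 0, len(kept))
	for _, u := range kept {
		ids = append(ids, u.ID)
	}
	return ids, reaped
}

func main() {
	ids, reaped := demo()
	fmt.Println("kept:", ids, "reaped:", reaped)
}
```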

Schema:
  - migrations/000002_pending_uploads.sql adds the table + partial index for
    cleanup + nullable pending_upload_id FKs on task_taskcompletionimage and
    task_documentimage.

Policy (single tier, no free/pro split):
  - 10 MB cap per upload
  - 50 presigns/hour/user
  - 10 concurrent unclaimed uploads/user
  - allow-list: jpeg/png/heic/heif/webp for image categories;
    + pdf for document_file
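As a rough illustration of the size and allow-list checks in this policy, the sketch below validates a presign request. Several details are assumptions: the function name `validatePresign`, the error values, the category name `completion_image`, and reading "10 MB" as 10 MiB. The rate-limit and concurrency caps are left out.

```go
package main

import (
	"errors"
	"fmt"
)

// maxUploadBytes assumes the 10 MB cap means 10 MiB.
const maxUploadBytes int64 = 10 << 20

// imageTypes is the allow-list shared by every category.
var imageTypes = map[string]bool{
	"image/jpeg": true,
	"image/png":  true,
	"image/heic": true,
	"image/heif": true,
	"image/webp": true,
}

var (
	errBadLength = errors.New("content length must be positive")
	errTooLarge  = errors.New("upload exceeds size cap")
	errBadType   = errors.New("content type not allowed for category")
)

// validatePresign applies the size cap, then the MIME allow-list.
// Only the document_file category additionally accepts PDFs.
func validatePresign(category, contentType string, contentLength int64) error {
	if contentLength <= 0 {
		return errBadLength
	}
	if contentLength > maxUploadBytes {
		return errTooLarge
	}
	if imageTypes[contentType] {
		return nil
	}
	if category == "document_file" && contentType == "application/pdf" {
		return nil
	}
	return errBadType
}

func main() {
	fmt.Println(validatePresign("document_file", "application/pdf", 2<<20))
	fmt.Println(validatePresign("completion_image", "application/pdf", 2<<20))
	fmt.Println(validatePresign("completion_image", "image/png", 11<<20))
}
```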

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Trey t
2026-05-01 14:36:42 -07:00
parent 9bee436e86
commit 29c9014a33
20 changed files with 1032 additions and 9 deletions
+41 -1
@@ -37,9 +37,18 @@ type TaskService struct {
notificationService *NotificationService
emailService *EmailService
storageService *StorageService
uploadService *UploadService // optional — only set when S3 storage is configured
cache *CacheService
}
// SetUploadService wires the presigned-URL upload service so CreateCompletion
// can claim pending_uploads rows by id and convert them into completion image
// rows. Optional: with local-disk storage there's no presigned flow and the
// service is left nil.
func (s *TaskService) SetUploadService(us *UploadService) {
s.uploadService = us
}
// SetCacheService wires Redis caching for residence-ID lookups.
func (s *TaskService) SetCacheService(cache *CacheService) {
s.cache = cache
@@ -694,6 +703,21 @@ func (s *TaskService) CreateCompletion(ctx context.Context, req *requests.Create
task.InProgress = false
}
// New presigned-URL path: claim pending_uploads rows that the client
// already POSTed to B2. We do this BEFORE the txn because VerifyAndClaim
// HEADs each B2 object — we don't want to hold a Postgres transaction
// open across HTTP calls. If the txn rolls back later, the rows stay
// claimed but unreferenced; they're cents of storage and visible via
// admin queries if cleanup ever matters.
var claimedUploads []models.PendingUpload
if len(req.UploadIDs) > 0 && s.uploadService != nil {
var claimErr error
claimedUploads, claimErr = s.uploadService.VerifyAndClaim(ctx, userID, req.UploadIDs)
if claimErr != nil {
return nil, claimErr
}
}
// P1-5 + B-07: Wrap completion creation, task update, and image creation
// in a single transaction for atomicity. If any operation fails, all are rolled back.
txErr := s.taskRepo.WithContext(ctx).DB().Transaction(func(tx *gorm.DB) error {
@@ -703,7 +727,12 @@ func (s *TaskService) CreateCompletion(ctx context.Context, req *requests.Create
if err := s.taskRepo.WithContext(ctx).UpdateTx(tx, task); err != nil {
return err
}
// B-07: Create images inside the same transaction as completion.
// Two sources contribute, both produce TaskCompletionImage rows:
// 1. Legacy multipart path — client uploaded via the API and got
// back URLs in req.ImageURLs.
// 2. New presigned path — client uploaded direct to B2 and we
// claimed the pending_uploads rows above.
for _, imageURL := range req.ImageURLs {
if imageURL != "" {
img := &models.TaskCompletionImage{
@@ -715,6 +744,17 @@ func (s *TaskService) CreateCompletion(ctx context.Context, req *requests.Create
}
}
}
for i := range claimedUploads {
pu := claimedUploads[i]
img := &models.TaskCompletionImage{
CompletionID: completion.ID,
ImageURL: urlForUploadKey(s.storageService, pu.B2Key),
PendingUploadID: &pu.ID,
}
if err := tx.Create(img).Error; err != nil {
return fmt.Errorf("failed to create completion image from upload %d: %w", pu.ID, err)
}
}
return nil
})
if txErr != nil {