feat(uploads): direct-to-B2 presigned uploads with content-length-range policy

Replaces the multipart-via-API path for image uploads with a three-step
direct-to-storage flow:
1. Client POSTs /api/uploads/presign with content_length + content_type;
   server validates size (10 MB cap), mime allow-list per category, rate
   limit (50/hour/user via Redis sliding window), and concurrent unclaimed
   cap (10 in-flight per user). On success it persists a pending_uploads
   row, signs an S3 POST policy with content-length-range bound to the
   claimed length ±256 bytes, and returns the URL+fields.
2. Client POSTs the bytes directly to B2 using the signed policy. B2
   enforces size, content-type, and key match before accepting.
3. Client passes upload_ids[] to /api/task-completions/ or /api/documents/.
   Service HEADs each B2 object, verifies size matches expected_bytes
   within slack, marks pending_uploads claimed_at, and creates the
   associated TaskCompletionImage / DocumentImage rows.
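
The size bounds in steps 1 and 3 can be sketched as follows. This is a minimal illustration, not the code in this commit: the function names are hypothetical, clamping the lower bound at zero is an assumption, and it assumes step 3 reuses the same ±256-byte slack as the signed policy. The `["content-length-range", min, max]` shape is the standard S3 POST policy condition.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// sizeSlack is the ±256-byte tolerance named in the commit message.
const sizeSlack int64 = 256

// contentLengthRange returns the inclusive byte bounds for the policy's
// content-length-range condition. Clamping the lower bound at zero is an
// assumption of this sketch.
func contentLengthRange(claimed int64) (lo, hi int64) {
	lo = claimed - sizeSlack
	if lo < 0 {
		lo = 0
	}
	return lo, claimed + sizeSlack
}

// policyCondition renders the bounds in the S3 POST policy condition form:
// ["content-length-range", min, max].
func policyCondition(claimed int64) (string, error) {
	lo, hi := contentLengthRange(claimed)
	b, err := json.Marshal([]any{"content-length-range", lo, hi})
	return string(b), err
}

// sizeWithinSlack is the claim-time check: the size reported by the HEAD
// request must fall inside the same window that was signed at presign time.
func sizeWithinSlack(expected, actual int64) bool {
	lo, hi := contentLengthRange(expected)
	return actual >= lo && actual <= hi
}

func main() {
	cond, err := policyCondition(1000)
	if err != nil {
		panic(err)
	}
	fmt.Println(cond)
	fmt.Println(sizeWithinSlack(1000, 1200), sizeWithinSlack(1000, 1300))
}
```

Deriving both checks from one helper keeps the signed window and the claim-time window from drifting apart.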

Bytes never traverse our API server, so the 1 MB Echo BodyLimit middleware
that was rejecting all task-completion image uploads no longer applies to
this path. The existing multipart endpoints stay functional alongside it
while the new path soak-tests, ahead of legacy removal.

Cleanup:
- cmd/worker registers a new hourly cron (TypeUploadCleanup, "30 * * * *")
  that reaps pending_uploads where claimed_at IS NULL AND expires_at < NOW().
  Reaps both the B2 object and the row.
- B2 bucket lifecycle rule on `uploads/` prefix (7 days hide → 1 day delete)
  documented in deploy-k3s/manifests/b2-lifecycle.md as a backstop.
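
The reaping rule can be sketched in Go as below. The `PendingUpload` struct, the `reap` helper, and the two callback parameters are hypothetical stand-ins for the real model, B2 client, and repository. Deleting the object before the row is a design assumption of this sketch, chosen so that a failed object delete leaves the row in place for the next hourly run.

```go
package main

import (
	"fmt"
	"time"
)

// PendingUpload is a hypothetical in-memory view of a pending_uploads row.
type PendingUpload struct {
	ID        uint
	Key       string
	ClaimedAt *time.Time // nil while the upload is unclaimed
	ExpiresAt time.Time
}

// reapable mirrors the SQL predicate from the commit message:
// claimed_at IS NULL AND expires_at < NOW().
func reapable(p PendingUpload, now time.Time) bool {
	return p.ClaimedAt == nil && p.ExpiresAt.Before(now)
}

// reap removes the storage object first and the row second, and returns the
// IDs that were fully reaped. Rows whose object delete fails are skipped so
// a later run can retry them.
func reap(rows []PendingUpload, now time.Time,
	deleteObject func(key string) error,
	deleteRow func(id uint) error) []uint {
	var reaped []uint
	for _, p := range rows {
		if !reapable(p, now) {
			continue
		}
		if err := deleteObject(p.Key); err != nil {
			continue
		}
		if err := deleteRow(p.ID); err != nil {
			continue
		}
		reaped = append(reaped, p.ID)
	}
	return reaped
}

func main() {
	now := time.Now()
	rows := []PendingUpload{
		{ID: 1, Key: "uploads/expired", ExpiresAt: now.Add(-time.Hour)},
		{ID: 2, Key: "uploads/fresh", ExpiresAt: now.Add(time.Hour)},
	}
	noop := func(string) error { return nil }
	fmt.Println(reap(rows, now, noop, func(uint) error { return nil }))
}
```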

Schema:
- migrations/000002_pending_uploads.sql adds the table + partial index for
  cleanup + nullable pending_upload_id FKs on task_taskcompletionimage and
  task_documentimage.

Policy (single tier, no free/pro split):
- 10 MB cap per upload
- 50 presigns/hour/user
- 10 concurrent unclaimed uploads/user
- allow-list: jpeg/png/heic/heif/webp for image categories;
  + pdf for document_file
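
The four policy limits above can be sketched as a single check. This is an illustration only: the type names, error values, and the `task_completion_image` category in the usage are hypothetical. It assumes 10 MB means 10 * 1024 * 1024 bytes, that `document_file` accepts the image types plus PDF, and that the hourly and in-flight counts are supplied by the caller (from Redis and pending_uploads in the real service).

```go
package main

import (
	"errors"
	"fmt"
)

const (
	maxUploadBytes     int64 = 10 * 1024 * 1024 // 10 MB, read here as MiB (assumption)
	maxPresignsPerHour       = 50
	maxUnclaimed             = 10
)

// imageTypes is the image allow-list from the commit message.
var imageTypes = map[string]bool{
	"image/jpeg": true, "image/png": true, "image/heic": true,
	"image/heif": true, "image/webp": true,
}

// PresignRequest and Usage are hypothetical inputs to the policy check.
type PresignRequest struct {
	Category      string
	ContentType   string
	ContentLength int64
}

type Usage struct {
	PresignsLastHour int // would come from the Redis sliding window
	Unclaimed        int // would come from pending_uploads
}

var (
	ErrBadSize        = errors.New("content_length outside allowed range")
	ErrBadType        = errors.New("content_type not allowed for category")
	ErrRateLimited    = errors.New("presign rate limit reached")
	ErrTooManyPending = errors.New("too many unclaimed uploads")
)

// mimeAllowed accepts image types everywhere and PDF only for document_file.
func mimeAllowed(category, contentType string) bool {
	if imageTypes[contentType] {
		return true
	}
	return category == "document_file" && contentType == "application/pdf"
}

// checkPolicy applies the limits in the order the commit message lists them.
func checkPolicy(req PresignRequest, u Usage) error {
	switch {
	case req.ContentLength <= 0 || req.ContentLength > maxUploadBytes:
		return ErrBadSize
	case !mimeAllowed(req.Category, req.ContentType):
		return ErrBadType
	case u.PresignsLastHour >= maxPresignsPerHour:
		return ErrRateLimited
	case u.Unclaimed >= maxUnclaimed:
		return ErrTooManyPending
	}
	return nil
}

func main() {
	req := PresignRequest{Category: "task_completion_image",
		ContentType: "application/pdf", ContentLength: 2048}
	fmt.Println(checkPolicy(req, Usage{}))
}
```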

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@@ -0,0 +1,28 @@
package responses

// PresignUploadResponse is what /api/uploads/presign returns to the client.
//
// The client uses URL + Fields to build a multipart/form-data POST directly
// to S3-compatible storage (B2). Once the upload completes, the client calls
// the relevant entity-creation endpoint (POST /api/task-completions/, POST
// /api/documents/) with `upload_ids: [Id]` to claim and attach the object.
type PresignUploadResponse struct {
	// ID is the pending_uploads.id the client passes back via upload_ids[].
	ID uint `json:"id"`

	// URL is the storage endpoint to POST to (no query string).
	URL string `json:"upload_url"`

	// Fields are the form fields (policy, signature, key, etc.) that must be
	// submitted with the multipart form. The file part must be named "file"
	// and come last per S3 POST policy rules.
	Fields map[string]string `json:"fields"`

	// Key is the object key chosen by the server. Echoed for client logging
	// and debugging; the canonical reference is via ID.
	Key string `json:"key"`

	// ExpiresAt is when the signed URL stops working. Clients should retry
	// with a fresh presign rather than relying on long-lived URLs.
	ExpiresAt string `json:"expires_at"`
}
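
A client could assemble the form described by the Fields comment roughly as follows. This is a sketch using only the Go standard library, not code from this commit: `buildUploadForm` and `partNames` are hypothetical helpers, the field values in `main` are placeholders, and sorting the field names is only there to make the output deterministic. The one ordering constraint taken from the source is that the part named "file" comes last.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"mime"
	"mime/multipart"
	"sort"
)

// buildUploadForm writes every signed field first and the "file" part last.
// It returns the encoded body and the Content-Type header value to send.
func buildUploadForm(fields map[string]string, filename string, data []byte) (*bytes.Buffer, string, error) {
	body := &bytes.Buffer{}
	w := multipart.NewWriter(body)

	keys := make([]string, 0, len(fields))
	for k := range fields {
		keys = append(keys, k)
	}
	sort.Strings(keys) // deterministic order among the signed fields
	for _, k := range keys {
		if err := w.WriteField(k, fields[k]); err != nil {
			return nil, "", err
		}
	}

	part, err := w.CreateFormFile("file", filename)
	if err != nil {
		return nil, "", err
	}
	if _, err := part.Write(data); err != nil {
		return nil, "", err
	}
	if err := w.Close(); err != nil {
		return nil, "", err
	}
	return body, w.FormDataContentType(), nil
}

// partNames parses an encoded form and lists its part names in order,
// which makes the "file comes last" rule easy to check.
func partNames(body []byte, contentType string) ([]string, error) {
	_, params, err := mime.ParseMediaType(contentType)
	if err != nil {
		return nil, err
	}
	r := multipart.NewReader(bytes.NewReader(body), params["boundary"])
	var names []string
	for {
		p, err := r.NextPart()
		if err == io.EOF {
			return names, nil
		}
		if err != nil {
			return nil, err
		}
		names = append(names, p.FormName())
	}
}

func main() {
	// Placeholder values; real ones come from PresignUploadResponse.Fields.
	fields := map[string]string{"key": "uploads/example.jpg", "policy": "<base64 policy>"}
	body, contentType, err := buildUploadForm(fields, "example.jpg", []byte("fake image bytes"))
	if err != nil {
		panic(err)
	}
	names, err := partNames(body.Bytes(), contentType)
	if err != nil {
		panic(err)
	}
	fmt.Println(names)
}
```

The returned body and content type would then be sent in a POST to the `upload_url` value from the presign response.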