# 09 — Object Storage (Backblaze B2)

## Summary

User-uploaded files (photos, documents, task completion attachments) go to Backblaze B2 via its S3-compatible API. The Go API uses `minio-go/v7` as the client.

This works around a Swarm-era problem where named volumes are per-node: uploads on node A were invisible to replicas on B and C. With k3s we could use a shared PVC instead, but B2 is cheaper, offsite, and already set up.

## Why Backblaze B2

### Decision matrix

| Option | Price per TB stored | Egress | Pros | Cons |
|---|---|---|---|---|
| **Backblaze B2** | **$6/mo** | $0.01/GB, free via CF | Cheap, hard spending caps, S3-compatible | US-West/East regions only (not EU) |
| AWS S3 Standard | $23/mo | $0.09/GB | Most ubiquitous | Expensive |
| Cloudflare R2 | $15/mo | Free (!) | Zero egress, CF-native | Newer, fewer features |
| DigitalOcean Spaces | $5/mo for 250GB + $0.01/GB | Free 1TB, $0.01/GB after | Simple | Less reliable than AWS |
| Local PVC on k3s | $0 | $0 | Already in cluster | Per-node, no HA, no offsite |

B2 won because:

1. **Hard spending cap**: unique in the industry. No surprise AWS bill.
2. **Cheapest at rest**: 3–4× cheaper than S3.
3. **Free egress through Cloudflare**: we already use CF; when we eventually serve upload URLs through CF, egress is free.
4. **Mature S3-compatible API**: minio-go talks to it natively.

Rejected:

- **R2** was the close second. Zero egress is amazing. Rejected primarily for inertia (B2 already set up in the MyCrib era). A future migration to R2 would be reasonable.
- **Local PVC** doesn't work for our setup because we want uploads durable and accessible from any node/replica.

## Configuration

Bucket: `honeyDueProd` (mixed case; B2 allows this, minio-go handles it via path-style addressing; see §path-style below).

Region: `us-east-005` (B2's South Carolina region, closer to our Neon DB in AWS us-east-1 than the West Coast options).

Endpoint: `s3.us-east-005.backblazeb2.com`

### Environment variables

From ConfigMap:

| Var | Value |
|---|---|
| `B2_ENDPOINT` | `s3.us-east-005.backblazeb2.com` |
| `B2_BUCKET_NAME` | `honeyDueProd` |
| `B2_REGION` | `us-east-005` |
| `B2_USE_SSL` | `true` (but see §vestigial var below) |

From Secret:

| Var | Value |
|---|---|
| `B2_KEY_ID` | App key ID (B2-specific identifier) |
| `B2_APP_KEY` | App key secret |

### App key scope

The B2 app key is **bucket-scoped**, not account-scoped. It can only read/write the `honeyDueProd` bucket. It cannot:

- List other buckets
- Delete the bucket
- Create new buckets
- Touch account settings

This is the B2 equivalent of an IAM role with least privilege. If the key leaks, the damage is limited to the `honeyDueProd` bucket.

## The minio-go client

The Go app uses `github.com/minio/minio-go/v7`, a Go SDK compatible with any S3-flavored API.

Relevant code at `internal/services/storage_backend_s3.go`:

```go
client, err := minio.New(endpoint, &minio.Options{
	Creds:  credentials.NewStaticV4(keyID, appKey, ""),
	Secure: useSSL,
	Region: region,
})
```

### Path-style vs virtual-hosted addressing

S3's URL scheme has two flavors:

- **Virtual-hosted**: `https://mybucket.s3.amazonaws.com/mykey`
- **Path-style**: `https://s3.amazonaws.com/mybucket/mykey`

With virtual-hosted style, the bucket name must be DNS-compatible: lowercase, no uppercase letters. `honeyDueProd` fails this.

With path-style, the bucket name is just a URL path segment, so any valid string works.

minio-go auto-detects: for AWS S3 it prefers virtual-hosted; for non-AWS endpoints (like B2) it defaults to path-style. So `honeyDueProd` with capital letters works transparently.

## The `B2_USE_SSL` vestigial variable

`prod.env` has `B2_USE_SSL=true`.
But the Go app's `internal/config/config.go:295` reads the env var `STORAGE_USE_SSL`, not `B2_USE_SSL`:

```go
S3UseSSL: viper.GetString("STORAGE_USE_SSL") == "" || viper.GetBool("STORAGE_USE_SSL"),
```

Whoever wrote the original config used `B2_USE_SSL` in `prod.env` and `STORAGE_USE_SSL` in the code. They don't match.

**Net effect**: The app reads `STORAGE_USE_SSL`, which is unset. The empty-string check makes the left side of the `||` true, so the whole expression evaluates to `true`. SSL is therefore always on, no matter what `B2_USE_SSL` is set to.

This is a dormant bug. Anyone setting `B2_USE_SSL=false` expecting to disable TLS would be surprised that it stays on. Fortunately that's the right default for production B2 (which only accepts HTTPS anyway).

**TODO**: Rename `STORAGE_USE_SSL` → `B2_USE_SSL` in the Go code to match the config. Documented in Chapter 19 §Vestigial config.

## What we store there

Today (limited rollout):

- User profile photos
- Task completion photos
- Document uploads (PDFs, images attached to records)

File keys follow a hierarchy like:

```
users//profile/.jpg
residences//documents/.pdf
tasks//completions/.jpg
```

Max file size is **10 MB** per upload (`STORAGE_MAX_FILE_SIZE=10485760`).

Allowed MIME types: `image/jpeg`, `image/png`, `image/gif`, `image/webp`, `application/pdf` (`STORAGE_ALLOWED_TYPES`).

## Access control

### Upload flow

1. Client POSTs to `/api/upload/`
2. Go API validates the user is authenticated and authorized for the target resource
3. Go API streams the upload to B2 via minio-go's `PutObject`
4. B2 confirms the write (the object key is the one the Go API passed to `PutObject`; B2 does not generate it)
5. Go API stores the key in Postgres
6. Go API returns the key to the client

The B2 bucket is **private**. Clients can't GET directly; they always go through the Go API.

### Download flow (current)

1. Client requests `/api/media/`
2. Go API checks the user can access this key
3. Go API fetches from B2 and streams back to the client

This proxies every download through the API. For high-traffic media that's inefficient (the API becomes an egress bottleneck).

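As an illustration of the limits listed under "What we store there", the check that runs at step 2 of the upload flow, before any bytes are streamed to B2, might look roughly like the sketch below. The function, constant, and error names are hypothetical and are not taken from the actual codebase; the real service presumably reads the values from `STORAGE_MAX_FILE_SIZE` and `STORAGE_ALLOWED_TYPES` rather than hard-coding them.

```go
package main

import (
	"errors"
	"fmt"
)

// Limits quoted in this chapter. Hard-coded here for illustration only;
// the names below are hypothetical.
const maxFileSize int64 = 10485760 // 10 MB, mirrors STORAGE_MAX_FILE_SIZE

// allowedTypes mirrors the STORAGE_ALLOWED_TYPES list.
var allowedTypes = map[string]bool{
	"image/jpeg":      true,
	"image/png":       true,
	"image/gif":       true,
	"image/webp":      true,
	"application/pdf": true,
}

var (
	errTooLarge    = errors.New("upload exceeds maximum file size")
	errTypeBlocked = errors.New("content type not allowed")
)

// validateUpload rejects an upload based on its declared size and MIME type.
// It returns nil when the upload is within the configured limits.
func validateUpload(size int64, contentType string) error {
	if size > maxFileSize {
		return fmt.Errorf("%w: %d bytes", errTooLarge, size)
	}
	if !allowedTypes[contentType] {
		return fmt.Errorf("%w: %q", errTypeBlocked, contentType)
	}
	return nil
}

func main() {
	fmt.Println(validateUpload(2<<20, "image/png"))      // within limits
	fmt.Println(validateUpload(11<<20, "image/png"))     // over 10 MB
	fmt.Println(validateUpload(1024, "application/zip")) // type not in the list
}
```

Checking the declared size and type first keeps oversized or unsupported uploads from consuming API bandwidth and B2 transactions. A production check would likely also sniff the first bytes of the body rather than trust the client-supplied content type.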
### Future: signed URLs

We could generate time-limited signed URLs for B2 objects:

```go
url, err := s3Client.PresignedGetObject(ctx, bucket, key, 1*time.Hour, nil)
```

This returns a URL the client can GET directly from B2, scoped to a specific object, valid for 1h. It saves API bandwidth and latency.

Not yet implemented. TODO (Chapter 20).

## Lifecycle and retention

We have **no lifecycle rules** set on the bucket. Objects live forever unless the app deletes them.

When a user deletes their account, the app should delete their B2 objects. This is currently not automated, which is a compliance gap for any "right to be forgotten" request.

**TODO** (Chapter 20): Either:

- Implement explicit cleanup in the user deletion handler, or
- Add B2 lifecycle rule tied to object metadata (tag objects with user ID; rule deletes tagged objects when user is soft-deleted)

## Backup of B2

We have no backup of B2 objects. B2 itself replicates within the region, but:

- Accidental deletion via our app = data gone
- B2 itself being compromised = data gone

B2 offers **Object Lock** (WORM: write once, read many), which prevents deletion for a retention period. Not enabled; revisit if/when user data sensitivity justifies it.

## Cost projection

Current usage is **small**: estimated <50 GB stored.

```
50 GB × $0.006/GB = $0.30/mo storage
1 GB/mo egress (mostly uncached media served via api) → $0.01
  (first 3× of stored amount is free anyway, so effectively $0)
```

Total B2 cost: **< $1/mo**. Hard spending cap set to $20/mo in B2 console. If we ever breach that, something's wrong and we want to know immediately.

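The arithmetic above can be reproduced with a small helper, which may be handy when re-running the estimate as usage grows. This is only a sketch: the prices and the 3× free-egress allowance are the figures quoted in this chapter and are hard-coded as assumptions, and `monthlyCost` is a hypothetical name, not part of the codebase.

```go
package main

import "fmt"

// Figures quoted in this chapter; treat them as assumptions that may drift.
const (
	storagePerGB   = 0.006 // USD per GB stored per month
	egressPerGB    = 0.01  // USD per GB downloaded
	freeEgressMult = 3.0   // egress up to 3x the stored volume is free
)

// monthlyCost returns the estimated storage and egress charges in USD
// for a month with storedGB at rest and egressGB downloaded.
func monthlyCost(storedGB, egressGB float64) (storage, egress float64) {
	storage = storedGB * storagePerGB
	// Only egress beyond the free allowance is billed.
	billable := egressGB - freeEgressMult*storedGB
	if billable < 0 {
		billable = 0
	}
	egress = billable * egressPerGB
	return storage, egress
}

func main() {
	s, e := monthlyCost(50, 1)
	fmt.Printf("today: storage $%.2f, egress $%.2f\n", s, e)

	s, e = monthlyCost(1000, 500)
	fmt.Printf("1 TB:  storage $%.2f, egress $%.2f\n", s, e)
}
```

The model ignores per-transaction API charges and any Cloudflare-routed egress, so it is best read as an upper bound on bandwidth cost rather than a bill forecast.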
At 100k users each uploading ~10 MB average:

- 1 TB stored = $6/mo
- Egress depends on access patterns; with signed URLs served through CF, egress could still be close to free

## Operator cheat sheet

```bash
# List bucket contents (requires mc or aws CLI configured with B2 creds)
mc alias set b2 https://s3.us-east-005.backblazeb2.com
mc ls b2/honeyDueProd/

# Count objects (one line per object in a recursive listing)
mc ls --recursive b2/honeyDueProd/ | wc -l

# Download an object
mc cp b2/honeyDueProd/ ./

# Check B2 console for usage graphs:
# https://secure.backblaze.com/b2_buckets.htm
```

To inspect the environment of a running api pod:

```bash
# Check the in-cluster client config
kubectl exec -n honeydue deploy/api -- env | grep B2_
```

## References

- [Backblaze B2 docs][b2-docs]
- [B2 S3-compatible API][b2-s3]
- [minio-go/v7][minio-go]
- [S3 path-style vs virtual-hosted][s3-style]

[b2-docs]: https://www.backblaze.com/docs/
[b2-s3]: https://www.backblaze.com/docs/cloud-storage-s3-compatible-api
[minio-go]: https://github.com/minio/minio-go
[s3-style]: https://docs.aws.amazon.com/AmazonS3/latest/userguide/VirtualHosting.html