Casera Infrastructure Plan — February 2026

Architecture Overview

                    ┌─────────────┐
                    │  Cloudflare  │
                    │  (CDN/DNS)   │
                    └──────┬──────┘
                           │ HTTPS
                    ┌──────┴──────┐
                    │  Hetzner LB  │
                    │   ($5.99)    │
                    └──────┬──────┘
                           │
          ┌────────────────┼────────────────┐
          │                │                │
   ┌──────┴──────┐  ┌──────┴──────┐  ┌──────┴──────┐
   │  CX33 #1    │  │  CX33 #2    │  │  CX33 #3    │
   │  (manager)  │  │  (manager)  │  │  (manager)  │
   │             │  │             │  │             │
   │  api (x2)   │  │  api (x2)   │  │  api (x1)   │
   │  admin      │  │  worker     │  │  worker     │
   │  redis      │  │  dozzle     │  │             │
   └──────┬──────┘  └──────┬──────┘  └──────┬──────┘
          │                │                │
          │    Docker Swarm Overlay (IPsec) │
          └────────────────┼────────────────┘
                           │
              ┌────────────┼────────────────┐
              │                             │
       ┌──────┴──────┐              ┌───────┴──────┐
       │    Neon      │              │  Backblaze   │
       │  (Postgres)  │              │     B2       │
       │   Launch     │              │   (media)    │
       └─────────────┘              └──────────────┘

Swarm Nodes — Hetzner CX33

All 3 nodes are manager+worker. Raft stays available only while a majority of managers is reachable, so 3 managers tolerate the loss of 1 node and the cluster keeps operating.
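
The majority rule behind that statement can be written out as a short Go sketch. The function names are illustrative and not taken from any Docker API; the point is only that a fourth manager adds no extra fault tolerance, while a fifth does.

```go
package main

import "fmt"

// quorum returns how many managers must be reachable for the swarm
// to accept changes: a strict majority of the manager count.
func quorum(managers int) int {
	return managers/2 + 1
}

// tolerated returns how many managers can fail while a quorum remains.
func tolerated(managers int) int {
	return managers - quorum(managers)
}

func main() {
	for _, n := range []int{1, 3, 4, 5} {
		fmt.Printf("managers=%d quorum=%d tolerated failures=%d\n",
			n, quorum(n), tolerated(n))
	}
}
```

This is why extra capacity nodes are usually joined as workers rather than as additional managers.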

| Spec | Value |
|---|---|
| Plan | CX33 (Shared Regular Performance) |
| vCPU | 4 |
| RAM | 8 GB |
| Disk | 80 GB SSD |
| Traffic | 20 TB/mo included |
| Price | $6.59/mo per node |
| Region | Pick closest to users (US: Ashburn or Hillsboro, EU: Nuremberg/Falkenstein/Helsinki) |

Why CX33 over CX23: 8 GB RAM gives headroom for Redis, multiple API replicas, and the admin panel without pressure. The $2.50/mo difference per node isn't worth optimizing away.

Container Distribution

| Container | Replicas | Notes |
|---|---|---|
| api | 3-6 | Spread across all nodes by Swarm |
| worker | 2-3 | Asynq workers pull jobs from Redis concurrently |
| admin | 1 | Next.js admin panel |
| redis | 1 | Pinned to one node with its volume |
| dozzle | 1 | Pinned to a manager node (needs Docker socket) |

Scaling Path

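  • Need more capacity? Add another CX33 with docker swarm join (as a worker, to keep the manager count odd). Swarm schedules new tasks onto the new node but does not move running ones; redeploy or run docker service update --force to rebalance.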
  • Need more API throughput? Bump replicas in the compose file. No infra change.
  • Only infrastructure addition needed at scale: the Hetzner Load Balancer ($5.99/mo).

Load Balancer — Hetzner LB

| Spec | Value |
|---|---|
| Price | $5.99/mo |
| Purpose | Distribute traffic across Swarm nodes, TLS termination |
| When to add | When you need redundant ingress (not required day 1 if using Cloudflare to proxy to a single node) |

Database — Neon Postgres (Launch Plan)

| Spec | Value |
|---|---|
| Plan | Launch (usage-based, no monthly minimum) |
| Compute | $0.106/CU-hr, up to 16 CU (64 GB RAM) |
| Storage | $0.35/GB-month |
| Connections | Up to 10,000 via built-in PgBouncer |
| Typical cost | ~$5-15/mo for light load, ~$20-40/mo at 100k users |
| Free tier | Available for dev/staging (100 CU-hrs/mo, 0.5 GB) |

Connection Pooling

Neon includes built-in PgBouncer on all plans. Enable it by appending -pooler to the endpoint ID (the first label of the hostname):

# Direct connection
ep-cool-darkness-123456.us-east-2.aws.neon.tech

# Pooled connection (use this in production)
ep-cool-darkness-123456-pooler.us-east-2.aws.neon.tech

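The pooler runs in transaction mode, which works with GORM for ordinary queries and transactions. Session-level state (SET, advisory locks, LISTEN/NOTIFY) does not carry over between transactions in this mode, so run anything that depends on it over the direct connection.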
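
As a small illustration of the hostname rule, the sketch below derives the pooled hostname from the direct one. The helper name pooledHost is hypothetical, and the code assumes the endpoint ID is always the first DNS label, as in the examples above.

```go
package main

import (
	"fmt"
	"strings"
)

// pooledHost inserts "-pooler" after the endpoint ID (the first DNS label).
// The host is returned unchanged if it is already pooled or contains no dot.
func pooledHost(host string) string {
	id, rest, ok := strings.Cut(host, ".")
	if !ok || strings.HasSuffix(id, "-pooler") {
		return host
	}
	return id + "-pooler." + rest
}

func main() {
	fmt.Println(pooledHost("ep-cool-darkness-123456.us-east-2.aws.neon.tech"))
}
```

Keeping both forms derivable from one value makes it easier to point migrations at the direct host and the API at the pooled one.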

Configuration

DB_HOST=ep-xxxxx-pooler.us-east-2.aws.neon.tech
DB_PORT=5432
DB_SSLMODE=require
POSTGRES_USER=<from neon dashboard>
POSTGRES_PASSWORD=<from neon dashboard>
POSTGRES_DB=casera
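
A minimal sketch of turning these variables into a connection string follows. It assumes the application wants a postgres:// URL, which common Go Postgres drivers accept; how the real config package assembles its DSN is not shown in this document, and buildDSN is a hypothetical name.

```go
package main

import (
	"fmt"
	"net/url"
	"os"
)

// buildDSN assembles a postgres:// URL from the variables listed above.
// getenv is injected so the function can be exercised without touching
// the process environment.
func buildDSN(getenv func(string) string) string {
	u := url.URL{
		Scheme: "postgres",
		// UserPassword escapes special characters in the credentials.
		User: url.UserPassword(getenv("POSTGRES_USER"), getenv("POSTGRES_PASSWORD")),
		Host: getenv("DB_HOST") + ":" + getenv("DB_PORT"),
		Path: "/" + getenv("POSTGRES_DB"),
	}
	q := url.Values{}
	q.Set("sslmode", getenv("DB_SSLMODE"))
	u.RawQuery = q.Encode()
	return u.String()
}

func main() {
	// Do not log the result in real code: it contains the password.
	fmt.Println(buildDSN(os.Getenv))
}
```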

Object Storage — Backblaze B2

| Spec | Value |
|---|---|
| Storage | $6/TB/mo ($0.006/GB) |
| Egress | $0.01/GB (first 3x stored amount is free) |
| Free tier | 10 GB storage always free |
| API calls | Class A free, Class B/C free first 2,500/day |
| Spending cap | Built-in data caps with alerts at 75% and 100% |
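
To make the pricing rules concrete, here is a rough estimator built only from the figures in the table. It ignores API call charges and treats the 10 GB free tier as a simple deduction, both of which are simplifying assumptions; b2MonthlyCost is a hypothetical name.

```go
package main

import (
	"fmt"
	"math"
)

const (
	storagePerGB  = 0.006 // USD per GB-month
	egressPerGB   = 0.01  // USD per GB beyond the free allowance
	freeStorageGB = 10.0  // always-free storage
	freeEgressX   = 3.0   // free egress as a multiple of stored GB
)

// b2MonthlyCost estimates the monthly bill for a given stored volume and
// egress volume. Egress served through a partner CDN should be excluded
// from egressGB by the caller.
func b2MonthlyCost(storedGB, egressGB float64) float64 {
	billableStorage := math.Max(storedGB-freeStorageGB, 0)
	billableEgress := math.Max(egressGB-freeEgressX*storedGB, 0)
	return billableStorage*storagePerGB + billableEgress*egressPerGB
}

func main() {
	// 100 GB stored, 400 GB direct egress: 90 GB and 100 GB are billable.
	fmt.Printf("$%.2f\n", b2MonthlyCost(100, 400))
}
```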

Bucket Setup

| Bucket | Visibility | Key Permissions | Contents |
|---|---|---|---|
| casera-uploads | Private | Read/Write (API containers) | User-uploaded photos, documents |
| casera-certs | Private | Read-only (API + worker) | APNs push certificates |

Serve files through the API using signed URLs — never expose buckets publicly.
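
One way to read "signed URLs" is an expiring HMAC-signed link that the API itself verifies before streaming the object from B2. The sketch below shows that idea with the standard library only. The parameter names exp and sig, the path layout, and the secret are all assumptions; B2's own presigned URLs via its S3-compatible API would be an alternative and are not shown.

```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"net/url"
	"strconv"
	"time"
)

// sign returns a hex HMAC-SHA256 over the path and the expiry timestamp.
func sign(secret []byte, path string, expUnix int64) string {
	mac := hmac.New(sha256.New, secret)
	mac.Write([]byte(path + "\n" + strconv.FormatInt(expUnix, 10)))
	return hex.EncodeToString(mac.Sum(nil))
}

// signedURL appends exp and sig query parameters to path.
func signedURL(secret []byte, path string, expUnix int64) string {
	q := url.Values{}
	q.Set("exp", strconv.FormatInt(expUnix, 10))
	q.Set("sig", sign(secret, path, expUnix))
	return path + "?" + q.Encode()
}

// verify reports whether sig matches the path and the link has not expired.
func verify(secret []byte, path string, expUnix int64, sig string, nowUnix int64) bool {
	if nowUnix > expUnix {
		return false
	}
	want := sign(secret, path, expUnix)
	// hmac.Equal compares in constant time.
	return hmac.Equal([]byte(want), []byte(sig))
}

func main() {
	secret := []byte("demo-secret") // hypothetical; load from a Docker secret in practice
	exp := time.Now().Add(15 * time.Minute).Unix()
	fmt.Println(signedURL(secret, "/media/photo-123.jpg", exp))
}
```

Because the expiry is part of the signed message, a client cannot extend a link by editing the exp parameter.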

Why B2 Over Others

  • Spending cap: only S3-compatible provider with built-in hard caps and alerts. No surprise bills.
  • Cheapest storage: $6/TB vs Cloudflare R2 at $15/TB vs Tigris at $20/TB.
  • Free egress partner CDNs: Cloudflare, Fastly, bunny.net — zero egress when behind Cloudflare.

CDN — Cloudflare (Free Tier)

| Spec | Value |
|---|---|
| Price | $0 |
| Purpose | DNS, CDN caching, DDoS protection, TLS termination |
| Setup | Point DNS to Cloudflare, proxy traffic to Hetzner LB (or directly to a Swarm node) |

Add this on day 1. No reason not to.

Logging — Dozzle

| Spec | Value |
|---|---|
| Price | $0 (open source) |
| Port | 9999 (internal only, do not expose publicly) |
| Features | Real-time log viewer, webhook support for alerts |

Runs as a container in the Swarm. Needs Docker socket access, so it's pinned to a manager node.

For 100k+ users, consider adding Prometheus + Grafana (self-hosted, free) or Betterstack (~$10/mo) for metrics and alerting beyond log viewing.

Security

Swarm Node Firewall (Hetzner Cloud Firewall — free)

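| Port | Protocol | Source | Purpose |
|---|---|---|---|
| Custom (e.g. 2222) | TCP | Your IP only | SSH |
| 80, 443 | TCP | Anywhere | Public traffic |
| 2377 | TCP | Swarm nodes only | Cluster management |
| 7946 | TCP/UDP | Swarm nodes only | Node discovery |
| 4789 | UDP | Swarm nodes only | Overlay network (VXLAN) |
| n/a | ESP (IP protocol 50) | Swarm nodes only | Encrypted overlay traffic (IPsec) |
| Everything else | | | Blocked |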

Set up once in Hetzner dashboard, apply to all 3 nodes.

SSH Hardening

# /etc/ssh/sshd_config
Port 2222                    # Non-default port
PermitRootLogin no           # No root SSH
PasswordAuthentication no    # Key-only auth
PubkeyAuthentication yes
AllowUsers deploy            # Only your deploy user

Swarm ↔ Neon (Postgres)

| Layer | Method |
|---|---|
| Encryption | TLS enforced by Neon (DB_SSLMODE=require) |
| Authentication | Strong password stored as Docker secret |
| Access control | IP allowlist in Neon dashboard, restricted to the 3 Swarm node IPs |

Swarm ↔ B2 (Object Storage)

| Layer | Method |
|---|---|
| Encryption | HTTPS always (enforced by B2 API) |
| Authentication | Scoped application keys (not master key) |
| Access control | Per-bucket key permissions (read-only where possible) |

Swarm Internal

| Layer | Method |
|---|---|
| Overlay encryption | driver_opts: encrypted: "true" on overlay network (IPsec between nodes) |
| Secrets | Use docker secret create for DB password, SECRET_KEY, B2 keys, APNs keys. Mounted at /run/secrets/, encrypted in Swarm raft log. |
| Container isolation | Non-root users in all containers (already configured in Dockerfile) |

Docker Secrets Migration

Current setup uses environment variables for secrets. Migrate to Docker secrets for production:

# Create secrets (printf avoids the trailing newline that echo would store)
printf '%s' "your-db-password" | docker secret create postgres_password -
printf '%s' "your-secret-key" | docker secret create secret_key -
printf '%s' "your-b2-app-key" | docker secret create b2_app_key -

# Reference in compose file
services:
  api:
    secrets:
      - postgres_password
      - secret_key
      - b2_app_key
secrets:
  postgres_password:
    external: true
  secret_key:
    external: true
  b2_app_key:
    external: true

Application code reads from /run/secrets/<name> instead of env vars.
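
A minimal sketch of that lookup, assuming a fallback to the existing environment variable so local development keeps working. The function name readSecret and the fallback policy are illustrative, not the project's actual config code.

```go
package main

import (
	"errors"
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
	"strings"
)

// readSecret returns the contents of dir/name with surrounding whitespace
// trimmed (secrets created with echo carry a trailing newline). If the file
// does not exist, it falls back to getenv(envKey).
func readSecret(dir, name, envKey string, getenv func(string) string) (string, error) {
	b, err := os.ReadFile(filepath.Join(dir, name))
	if err == nil {
		return strings.TrimSpace(string(b)), nil
	}
	if !errors.Is(err, fs.ErrNotExist) {
		// Permission errors and similar should not be masked by the fallback.
		return "", fmt.Errorf("read secret %s: %w", name, err)
	}
	if v := getenv(envKey); v != "" {
		return v, nil
	}
	return "", fmt.Errorf("secret %s not found in %s or $%s", name, dir, envKey)
}

func main() {
	pw, err := readSecret("/run/secrets", "postgres_password", "POSTGRES_PASSWORD", os.Getenv)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("loaded secret of length", len(pw))
}
```

Failing loudly when neither source is present avoids starting a container with an empty password.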

Redis (In-Cluster)

Redis stays inside the Swarm — no need to externalize.

| Purpose | Details |
|---|---|
| Asynq job queue | Background jobs: push notifications, digests, reminders, onboarding emails |
| Static data cache | Cached lookup tables with ETag support |
| Resource usage | ~20-50 MB RAM, negligible CPU |

At 100k users, Redis handles job queuing for nightly digests (100k enqueue + dequeue operations) without issue: a single Redis instance typically sustains on the order of 100,000 simple operations per second, so the whole batch amounts to a few seconds of work.

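Asynq coordinates multiple worker replicas automatically: each job is dequeued atomically, so only one worker holds it at a time. Delivery is at-least-once, however, so a job can run again after a worker crash or a retry, and handlers should be idempotent.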
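
Asynq's README describes task execution as guaranteed at least once, so a handler may occasionally see the same job twice after a crash or retry. The sketch below shows the shape of an idempotency guard using only the standard library. The store is in-memory purely for illustration; with several worker replicas it would have to live in a shared place such as Redis, and sendDigest and the key format are hypothetical.

```go
package main

import (
	"fmt"
	"sync"
)

// onceStore records which job keys have already been handled.
// In-memory only: replicas on different nodes would not share it.
type onceStore struct {
	mu   sync.Mutex
	seen map[string]bool
}

func newOnceStore() *onceStore {
	return &onceStore{seen: make(map[string]bool)}
}

// firstTime reports true only for the first call with a given key.
func (s *onceStore) firstTime(key string) bool {
	s.mu.Lock()
	defer s.mu.Unlock()
	if s.seen[key] {
		return false
	}
	s.seen[key] = true
	return true
}

// sendDigest stands in for a handler body. It returns false when the
// digest for this user and date was already claimed by an earlier delivery.
func sendDigest(store *onceStore, userID int, date string) bool {
	key := fmt.Sprintf("digest:%d:%s", userID, date)
	if !store.firstTime(key) {
		return false // duplicate delivery, nothing to do
	}
	// ... build and send the digest here ...
	return true
}

func main() {
	store := newOnceStore()
	fmt.Println(sendDigest(store, 42, "2026-02-24")) // first delivery
	fmt.Println(sendDigest(store, 42, "2026-02-24")) // redelivery is skipped
}
```

Note that this version claims the key before doing the work, so a failure mid-send would suppress the retry; a production guard would record completion instead, or release the key on error.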

Performance Estimates

| Metric | Value |
|---|---|
| Single CX33 API throughput | ~1,000-2,000 req/s (blended, with Neon latency) |
| 3-node cluster throughput | ~3,000-6,000 req/s |
| Avg requests per user per day | ~50 |
| Estimated user capacity (3 nodes) | ~200k-500k registered users |
| Bottleneck at scale | Neon compute tier, not Go or Swarm |

These are napkin estimates. Load test before launch.
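
The relationship between the user counts and the throughput figures can be checked with a few lines of arithmetic. The sketch assumes every registered user is active and that traffic is spread evenly over the day; real traffic peaks well above the average, which is what the headroom is for.

```go
package main

import "fmt"

const secondsPerDay = 86400.0

// avgReqPerSec converts a daily per-user request volume into an average rate.
func avgReqPerSec(users, reqPerUserPerDay float64) float64 {
	return users * reqPerUserPerDay / secondsPerDay
}

func main() {
	const clusterLow = 3000.0 // req/s, low end of the 3-node estimate
	for _, users := range []float64{200_000, 500_000} {
		avg := avgReqPerSec(users, 50)
		fmt.Printf("%.0f users: avg %.0f req/s, headroom %.1fx\n",
			users, avg, clusterLow/avg)
	}
}
```

At 50 requests per user per day, 500k users average roughly 289 req/s, about a tenth of the low-end cluster estimate.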

Monthly Cost Summary

Starting Out

| Component | Provider | Cost |
|---|---|---|
| 3x Swarm nodes | Hetzner CX33 | $19.77/mo |
| Postgres | Neon Launch | ~$5-15/mo |
| Object storage | Backblaze B2 | <$1/mo |
| CDN | Cloudflare Free | $0 |
| Logging | Dozzle (self-hosted) | $0 |
| Total | | ~$25-35/mo |

At Scale (100k users)

| Component | Provider | Cost |
|---|---|---|
| 3x Swarm nodes | Hetzner CX33 | $19.77/mo |
| Load balancer | Hetzner LB | $5.99/mo |
| Postgres | Neon Launch | ~$20-40/mo |
| Object storage | Backblaze B2 | ~$1-3/mo |
| CDN | Cloudflare Free | $0 |
| Monitoring | Betterstack or self-hosted | ~$0-10/mo |
| Total | | ~$47-79/mo |

TODO

  • Set up 3x Hetzner CX33 instances
  • Initialize Docker Swarm (docker swarm init on first node, then docker swarm join on the others using the token from docker swarm join-token manager so they join as managers)
  • Configure Hetzner Cloud Firewall
  • Harden SSH on all nodes
  • Create Neon project (Launch plan), configure IP allowlist
  • Create Backblaze B2 buckets with scoped application keys
  • Set up Cloudflare DNS proxying
  • Update prod compose file: remove db service, add overlay encryption, add Docker secrets
  • Add B2 SDK integration for file uploads (code change)
  • Update config to read from /run/secrets/ for Docker secrets
  • Set B2 spending cap and alerts
  • Load test the deployed stack
  • Add Hetzner LB when needed