honeyDueAPI/docs/deployment/18-cost.md
Trey t 6f303dbbaa
Migrate prod deploy from Swarm to K3s; add full deployment book
Infrastructure:
- Stack now runs on K3s v1.34.6 HA (3 Hetzner CX33 nodes as managers)
- Traefik DaemonSet + hostNetwork replaces Caddy + ingress mesh
- All manifests in deploy-k3s/manifests/; Swarm config (deploy/) kept
  temporarily for reference

Bug fixes surfaced during migration:
- Dockerfile: golang:1.24-alpine -> 1.25-alpine (go.mod requires 1.25)
- cache_service.go: remove sync.Once reassignment from inside Do()
  callback (was causing 'unlock of unlocked mutex' fatal after
  Redis Ping failure)
- router.go: relax CSP from 'default-src none' to 'default-src self'
  + allowlist fonts.googleapis.com so the marketing landing page CSS
  actually loads in browsers
- deploy/scripts/deploy_prod.sh: use docker buildx with
  --platform linux/amd64 so arm64 (Apple Silicon) dev machines produce
  images runnable on x86_64 Hetzner nodes; fix array expansion under
  set -u
- deploy/swarm-stack.prod.yml: fix secret source references to use
  top-level aliases (the '\${X_SECRET}' form never actually resolved);
  dozzle ports: long-form host_ip is rejected by Swarm, switched to
  short-form (bound to 0.0.0.0 with UFW-based loopback restriction);
  worker replicas 2 -> 1 (Asynq scheduler singleton)
- deploy-k3s/manifests/admin/deployment.yaml: probe path '/admin/' -> '/'
  (Next.js serves at root; /admin/ returned 404 and killed pods);
  startupProbe failureThreshold 12 -> 24
- deploy-k3s/manifests/pod-disruption-budgets.yaml: worker minAvailable
  1 -> 0 (singleton)
- deploy-k3s/manifests/api/deployment.yaml: startupProbe failureThreshold
  12 -> 48 (MigrateWithLock serializes across 3 replicas on first-boot;
  real startup takes up to 240s)
- .gitignore: tighten 'api' -> '/api' (was matching deploy-k3s/manifests/api/
  and admin/src/app/api/*, hiding legitimate files)

New files:
- deploy-k3s/manifests/traefik-helmchartconfig.yaml: DaemonSet +
  hostNetwork override for k3s-bundled Traefik
- deploy-k3s/manifests/ingress/ingress-simple.yaml: plain Ingress
  without TLS (CF Flexible SSL) and without middleware
- deploy-k3s/MIGRATION_NOTES.md: operator-facing migration log

Documentation:
- docs/deployment/ — full deployment book, 26 files, ~42k words:
  - Part I Overview, infrastructure, orchestrator choice (Ch 0-2)
  - Part II Networking, firewall, Cloudflare (Ch 3-4, 13)
  - Part III Security, Traefik ingress (Ch 5-6)
  - Part IV Services, DB, storage, secrets, registry (Ch 7-11)
  - Part V Data flow, deploy process, observability, failures, runbook
    (Ch 12, 14-17)
  - Part VI Cost, Swarm postmortem, roadmap (Ch 18-20)
  - Appendices: glossary, kubectl cheat sheet, file locations,
    consolidated citations
- README.md: Production Deployment section replaced with pointer to
  the book; Go version bumped to 1.25

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-24 07:20:54 -05:00


18 — Cost

Summary

Current monthly infrastructure cost is ~$30-40. External SaaS (Fastmail, Apple Developer, Google Play) adds ~$8-17/mo depending on whether push notifications are enabled. This chapter itemizes every line, projects costs at scale (10k, 100k, 1M users), and shows which dials to turn when we need to save or spend.

Current monthly cost

Compute (Hetzner)

| Item | Unit cost | Count | Monthly |
|---|---|---|---|
| CX33 (4 vCPU, 8 GB RAM, 80 GB SSD) | $7.99 | 3 | $23.97 |
| Traffic | $0 (20 TB/mo included per node, well below) | | $0 |
| Hetzner Cloud Firewall | $0 | | $0 |
| IPv4 public address | $0 (included) | 3 | $0 |
| Subtotal | | | $23.97 |

Database (Neon)

Neon Launch plan: $0.106/CU-hour + $0.35/GB-month storage, $5 minimum.

At current usage (low traffic, small schema):

  • ~10 CU-hours/month × $0.106 ≈ $1
  • ~1 GB storage × $0.35 ≈ $0.35
  • Hits the $5 minimum

| Item | Monthly |
|---|---|
| Neon Launch ($5 min + usage) | ~$5 |
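
The arithmetic above can be sketched as a small estimator. This is a minimal sketch, not Neon's billing logic: it assumes the $5 minimum acts as a simple floor on the usage total, and the function names are hypothetical. The rates are the ones quoted in this section.

```python
# Rates quoted in this chapter for the Neon Launch plan.
CU_HOUR_RATE = 0.106    # USD per compute-unit hour
STORAGE_RATE = 0.35     # USD per GB-month
MONTHLY_MINIMUM = 5.00  # USD; ASSUMPTION: applied as a floor on total usage


def neon_monthly_cost(cu_hours: float, storage_gb: float) -> float:
    """Estimated monthly bill in USD for the given usage."""
    usage = cu_hours * CU_HOUR_RATE + storage_gb * STORAGE_RATE
    return round(max(MONTHLY_MINIMUM, usage), 2)


def breakeven_cu_hours(storage_gb: float) -> float:
    """CU-hours at which usage alone reaches the monthly minimum."""
    remaining = MONTHLY_MINIMUM - storage_gb * STORAGE_RATE
    return max(0.0, remaining / CU_HOUR_RATE)


if __name__ == "__main__":
    print(neon_monthly_cost(10, 1))            # current usage, below the minimum
    print(round(breakeven_cu_hours(1), 1))     # CU-hours before the bill starts to grow
```

Under these assumptions, with 1 GB stored the bill stays at the minimum until compute reaches roughly 44 CU-hours per month.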

Object storage (Backblaze B2)

At current usage (~50 GB stored):

| Item | Monthly |
|---|---|
| Storage ($0.006/GB × 50 GB) | $0.30 |
| Egress (effectively $0; mostly served through CF) | $0 |
| Subtotal | ~$0.30 |

Edge (Cloudflare)

| Item | Monthly |
|---|---|
| Cloudflare Free plan (DNS, TLS, CDN, basic DDoS) | $0 |

Registry (Gitea)

Self-hosted on the operator's existing Gitea VPS. Not charged to honeyDue.

| Item | Monthly |
|---|---|
| Gitea container registry | $0 |

Total infrastructure

| Category | Monthly |
|---|---|
| Compute | $23.97 |
| Database | ~$5 |
| Storage | ~$0.30 |
| Edge | $0 |
| Registry | $0 |
| Total | ~$30 |
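
As a cross-check on the total, the line items can be summed directly. This is a rough sketch using only the figures in this chapter; the function and parameter names are hypothetical, and edge and registry are left out because both are $0.

```python
def infrastructure_total(
    node_price: float = 7.99,   # USD per CX33 per month
    node_count: int = 3,
    neon: float = 5.00,         # Neon Launch, at the monthly minimum
    b2_gb: float = 50,          # GB stored in Backblaze B2
    b2_rate: float = 0.006,     # USD per GB-month
) -> float:
    """Monthly infrastructure total in USD (edge and registry are $0)."""
    compute = node_price * node_count
    storage = b2_gb * b2_rate
    return round(compute + neon + storage, 2)


if __name__ == "__main__":
    print(infrastructure_total())                 # current prices
    print(infrastructure_total(node_price=6.59))  # pre-April-2026 CX33 price
```

The exact sum at current prices is $29.27, which the table rounds to ~$30.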

External SaaS

Things not part of the deploy but required for the product:

| Item | Cost | Notes |
|---|---|---|
| Fastmail (SMTP for transactional email) | Part of operator's existing plan | |
| Apple Developer Program | $99/year = $8.25/mo | Required for iOS app + APNs |
| Google Play Developer | $25 one-time + $0/mo ongoing | |
| Hetzner Cloud Firewall | $0 | Free; we use UFW instead |

At push-enabled state, total monthly run rate is ~$38-42.

Hidden / untracked costs

  • Operator time: The biggest cost for a bootstrapped project. Treating ops time at $100/hr, a 4-hour incident = $400.
  • Electricity for operator workstation during builds: trivial.
  • Domain registration (myhoneydue.com): ~$12/year = $1/mo.

Cost drivers

1. Compute (scales with traffic)

If average CPU utilization across the api pods exceeds 70%, the HPA scales api from 3 up to 6 replicas. Memory is not the constraint: 3 replicas × 512Mi limit = 1.5 GiB, and each node has 8 GB, so there is plenty of room before we need more nodes.
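
To make the scaling behaviour concrete, here is a minimal sketch of the replica calculation described in the Kubernetes HPA documentation: desired = ceil(current × currentMetric / targetMetric), clamped to the min/max bounds. The 70% target and the 3-to-6 range come from this chapter. The 0.1 tolerance is the Kubernetes default and is assumed here, not read from our manifests. Stabilization windows and scaling policies are ignored.

```python
import math


def hpa_desired_replicas(
    current_replicas: int,
    current_cpu_pct: float,
    target_cpu_pct: float = 70.0,
    min_replicas: int = 3,
    max_replicas: int = 6,
    tolerance: float = 0.1,  # ASSUMPTION: Kubernetes default, not our config
) -> int:
    """Replica count the HPA would ask for, ignoring stabilization windows."""
    ratio = current_cpu_pct / target_cpu_pct
    # Within tolerance of the target, the HPA leaves the replica count alone.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    desired = math.ceil(current_replicas * ratio)
    return max(min_replicas, min(max_replicas, desired))


if __name__ == "__main__":
    for cpu in (72, 90, 140, 200):
        print(cpu, hpa_desired_replicas(3, cpu))
```

At the 6-replica ceiling the memory limits add up to 6 × 512Mi = 3 GiB, still well within the three 8 GB nodes.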

Tipping points:

  • Sustained need for 6 api replicas → move to larger CX43 nodes (8 vCPU, 16 GB, ~$16/mo each) or add more CX33s

  • Heavy worker throughput = need Asynq PeriodicTaskManager (code change, not infra)

2. Database (scales with query volume + data)

Neon Launch: pay per CU-hour of compute. If idle time ≫ active time, we stay near $5 min. If the app is busy, CU-hours grow.

Tipping points:

  • Consistently >$30/mo at Launch → evaluate Neon Scale plan
  • DB storage >50 GB → $17.50+/mo just for storage (at $0.35/GB-month)
  • Active query load → consider read replicas (paid feature)

3. Storage (scales with user uploads)

B2 at $0.006/GB is cheap. 1 TB = $6/mo.

Tipping points:

  • 5 TB stored = consider R2 (free egress) if egress becomes a factor

  • Very high egress = evaluate moving B2 behind CF Workers

4. Edge

Cloudflare Free is generous. We move to Pro ($20/mo) if:

  • We need custom WAF rules beyond 5
  • We need Image Resizing for user uploads
  • We need custom Page Rules beyond 3

Projections

10,000 daily active users

Assume 50 API requests per user per day = 500k req/day = ~6 req/s avg. Peaks maybe 3-5× = ~25 req/s.
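
The load estimate above is simple enough to recompute for any user count. A minimal sketch, assuming the same 50 requests per user per day and a 3-5× peak factor; the function name and return keys are hypothetical.

```python
SECONDS_PER_DAY = 86_400


def request_rates(
    daily_active_users: int,
    requests_per_user: int = 50,     # per-user daily requests assumed in this chapter
    peak_multipliers: tuple = (3, 5),  # peak-to-average range assumed in this chapter
) -> dict:
    """Average and peak request rates (req/s) for a given DAU count."""
    daily = daily_active_users * requests_per_user
    avg = daily / SECONDS_PER_DAY
    low, high = peak_multipliers
    return {
        "daily_requests": daily,
        "avg_rps": avg,
        "peak_rps_low": avg * low,
        "peak_rps_high": avg * high,
    }


if __name__ == "__main__":
    rates = request_rates(10_000)
    print({k: round(v, 1) for k, v in rates.items()})
```

For 10,000 daily active users this gives about 5.8 req/s on average and a peak range of roughly 17 to 29 req/s, consistent with the ~25 req/s figure used here.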

Bottleneck: probably Neon CU-hours on the Launch plan. At 25 req/s with DB calls, we'd burn through CU-hours fast. Neon bill: $15-30/mo.

Compute: 3 CX33s still handle this comfortably.

| Category | Projected monthly |
|---|---|
| Compute | $24 |
| Neon | ~$20 |
| Storage | ~$2 |
| Cloudflare | $0 |
| Total | ~$46 |

100,000 daily active users

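With the same 50 requests per user per day, that is 5M req/day = ~58 req/s avg, with peaks around 250 req/s. This means multi-node api scaling, and the HPA kicks in.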

| Category | Projected monthly |
|---|---|
| Compute (3x CX33) | $24 |
| Plus Hetzner LB | $8.49 |
| Neon Scale (pay-as-you-go, higher baseline) | $40-60 |
| B2 (200 GB stored, some egress) | $2 |
| Cloudflare Pro | $20 |
| Total | ~$95-115 |

At this scale, operator time becomes the bigger cost. Adding paid monitoring (Betterstack ~$15/mo) and uptime (Betterstack Uptime $5/mo) becomes reasonable.

1,000,000 daily active users

Bigger question. We'd be re-evaluating:

  • More Hetzner nodes or bigger instances
  • Neon at scale vs. self-hosted Postgres
  • Maybe Cloudflare Workers to offload traffic

Ballpark: $300-500/mo. At this scale, the company has revenue to justify an ops hire, and this chapter's assumptions break down.

Dials to save money

Immediate (reduce $)

| Lever | Savings | Trade-off |
|---|---|---|
| Switch 3 CX33 → 3 Netcup VPS1000G11 | ~$4/mo | Less polished provider, slightly worse UX |
| Disable Neon Launch, use Supabase free tier | ~$5/mo | Supabase free tier limits |
| 2 nodes instead of 3 | ~$8/mo | Lose HA, two-node Raft is worse than one |
| 1 CX23 (2 vCPU, 4 GB) for admin + worker; 2 CX33 for api | ~$5/mo | Complexity; node roles |

None of these are compelling. Current cost is in the "don't optimize" zone.

Dials to spend when it becomes worth it

| Spend | Return |
|---|---|
| Upgrade Neon to Scale ($20+) | More CU-hours, connection count room |
| Add Hetzner LB ($8.49) | Real active health checks, sub-second failover |
| Add monitoring (Betterstack $15) | Proactive detection of issues |
| Add uptime monitoring ($5) | Alerts when site is down |
| CF Pro ($20) | Better WAF, Image Resizing |
| CF Load Balancing ($5) | Multi-region failover, active checks on origins |

Together these come to roughly $73/mo (taking Neon Scale at its $20 floor) and give us a fully monitored, fully alerted setup with multi-region failover. At 100k users, that is worth it.

Historical spend

April 2026 MTD: ~$35 (Hetzner + Neon prorated).

April 2026 (projected): $30-40.

March 2026: Pre-launch; no user traffic yet. Just node rentals. ~$25.

Hetzner April 2026 price adjustment

CX33 went from ~$6.59 → $7.99/mo on 2026-04-01. Our monthly compute cost rose by $4.20 overnight. This is on our budget radar but isn't a forcing function to switch providers.

If Hetzner keeps raising prices (which they've historically resisted; the 2026 adjustment was their first in several years), reconsider.

Budget alerts

  • B2: hard-capped via B2 console at $20/mo. If we breach, something is wrong and B2 rejects further writes.
  • Neon: soft limits via Neon alerts. The threshold is set at $20 so we get an email as spend approaches it.
  • Hetzner: no variable cost at our scale, no alerts needed.
  • Cloudflare: Free plan has hard quotas; no surprise bills possible.
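
If we ever want a single place to check spend against these thresholds, a small script along the following lines could do it. This is a sketch only: the `Budget` type, the `check_budgets` helper, and the 80% warning fraction are all hypothetical, and the spend figures would have to be entered by hand or pulled from each provider's billing page, which is not shown here. The $20 limits for B2 and Neon are the ones listed above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Budget:
    provider: str
    limit_usd: float
    warn_fraction: float = 0.8  # HYPOTHETICAL: warn at 80% of the limit


# Limits taken from the list above; Hetzner and Cloudflare need no entry.
BUDGETS = [
    Budget("b2", 20.00),
    Budget("neon", 20.00),
]


def check_budgets(spend_usd: dict, budgets: list) -> list:
    """Return one human-readable alert per provider at or near its limit."""
    alerts = []
    for b in budgets:
        spent = spend_usd.get(b.provider, 0.0)
        if spent >= b.limit_usd:
            alerts.append(
                f"{b.provider}: ${spent:.2f} is at or over the ${b.limit_usd:.2f} limit"
            )
        elif spent >= b.limit_usd * b.warn_fraction:
            alerts.append(
                f"{b.provider}: ${spent:.2f} is approaching the ${b.limit_usd:.2f} limit"
            )
    return alerts


if __name__ == "__main__":
    # Month-to-date figures would be entered by hand from the billing pages.
    print(check_budgets({"b2": 0.30, "neon": 17.00}, BUDGETS))
```

At current spend (B2 ~$0.30, Neon ~$5) the check returns no alerts.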

References