honeyDueAPI/docs/deployment/18-cost.md
Trey t 6f303dbbaa
Migrate prod deploy from Swarm to K3s; add full deployment book
Infrastructure:
- Stack now runs on K3s v1.34.6 HA (3 Hetzner CX33 nodes as managers)
- Traefik DaemonSet + hostNetwork replaces Caddy + ingress mesh
- All manifests in deploy-k3s/manifests/; Swarm config (deploy/) kept
  temporarily for reference

Bug fixes surfaced during migration:
- Dockerfile: golang:1.24-alpine -> 1.25-alpine (go.mod requires 1.25)
- cache_service.go: remove sync.Once reassignment from inside Do()
  callback (was causing 'unlock of unlocked mutex' fatal after
  Redis Ping failure)
- router.go: relax CSP from 'default-src none' to 'default-src self'
  + allowlist fonts.googleapis.com so the marketing landing page CSS
  actually loads in browsers
- deploy/scripts/deploy_prod.sh: use docker buildx with
  --platform linux/amd64 so arm64 (Apple Silicon) dev machines produce
  images runnable on x86_64 Hetzner nodes; fix array expansion under
  set -u
- deploy/swarm-stack.prod.yml: fix secret source references to use
  top-level aliases (the '\${X_SECRET}' form never actually resolved);
  dozzle ports: long-form host_ip is rejected by Swarm, switched to
  short-form (bound to 0.0.0.0 with UFW-based loopback restriction);
  worker replicas 2 -> 1 (Asynq scheduler singleton)
- deploy-k3s/manifests/admin/deployment.yaml: probe path '/admin/' -> '/'
  (Next.js serves at root; /admin/ returned 404 and killed pods);
  startupProbe failureThreshold 12 -> 24
- deploy-k3s/manifests/pod-disruption-budgets.yaml: worker minAvailable
  1 -> 0 (singleton)
- deploy-k3s/manifests/api/deployment.yaml: startupProbe failureThreshold
  12 -> 48 (MigrateWithLock serializes across 3 replicas on first-boot;
  real startup takes up to 240s)
- .gitignore: tighten 'api' -> '/api' (was matching deploy-k3s/manifests/api/
  and admin/src/app/api/*, hiding legitimate files)

New files:
- deploy-k3s/manifests/traefik-helmchartconfig.yaml: DaemonSet +
  hostNetwork override for k3s-bundled Traefik
- deploy-k3s/manifests/ingress/ingress-simple.yaml: plain Ingress
  without TLS (CF Flexible SSL) and without middleware
- deploy-k3s/MIGRATION_NOTES.md: operator-facing migration log

Documentation:
- docs/deployment/ — full deployment book, 26 files, ~42k words:
  - Part I Overview, infrastructure, orchestrator choice (Ch 0-2)
  - Part II Networking, firewall, Cloudflare (Ch 3-4, 13)
  - Part III Security, Traefik ingress (Ch 5-6)
  - Part IV Services, DB, storage, secrets, registry (Ch 7-11)
  - Part V Data flow, deploy process, observability, failures, runbook
    (Ch 12, 14-17)
  - Part VI Cost, Swarm postmortem, roadmap (Ch 18-20)
  - Appendices: glossary, kubectl cheat sheet, file locations,
    consolidated citations
- README.md: Production Deployment section replaced with pointer to
  the book; Go version bumped to 1.25

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-24 07:20:54 -05:00

# 18 — Cost
## Summary
Current monthly infrastructure cost is ~$30-40. External SaaS (Fastmail,
Apple Developer, Google Play) adds ~$8-17/mo depending on push-enable
status. This chapter itemizes every line, projects costs at scale
(10k, 100k, 1M users), and shows what dials to turn when we need to
save or spend.
## Current monthly cost
### Compute (Hetzner)
| Item | Unit cost | Count | Monthly |
|---|---:|---|---:|
| CX33 (4 vCPU, 8 GB RAM, 80 GB SSD) | $7.99 | 3 | **$23.97** |
| Traffic | $0 (20 TB/mo included per node, well below) | — | $0 |
| Hetzner Cloud Firewall | $0 | — | $0 |
| IPv4 public address | $0 (included) | 3 | $0 |
| **Subtotal** | | | **$23.97** |
### Database (Neon)
Neon Launch plan: $0.106/CU-hour + $0.35/GB-month storage, $5 minimum.
At current usage (low traffic, small schema):
- ~10 CU-hours/month × $0.106 ≈ $1.06
- ~1 GB storage × $0.35 ≈ $0.35
- Usage totals ~$1.41, which is below the $5 minimum, so we pay the minimum
| Item | Monthly |
|---|---:|
| Neon Launch ($5 minimum; current usage is below it) | **~$5** |
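The Neon arithmetic above can be sketched as a small function. This is a rough model only: it assumes the $5 minimum acts as a floor on the usage charge, as this chapter describes it, and the function and parameter names are hypothetical.

```python
def neon_launch_bill(cu_hours: float, storage_gb: float,
                     cu_rate: float = 0.106,
                     storage_rate: float = 0.35,
                     minimum: float = 5.00) -> float:
    """Estimate a monthly Neon Launch bill in USD.

    Assumption: the plan minimum is a floor, so the bill is
    max(minimum, compute usage + storage usage).
    """
    usage = cu_hours * cu_rate + storage_gb * storage_rate
    return round(max(minimum, usage), 2)


# Current usage from this chapter: ~10 CU-hours and ~1 GB stored.
print(neon_launch_bill(10, 1))
```

Under this model the bill stays at the minimum until usage exceeds $5, which at 1 GB stored happens at roughly 44 CU-hours per month.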
### Object storage (Backblaze B2)
At current usage (~50 GB stored):
| Item | Monthly |
|---|---:|
| Storage ($0.006/GB × 50 GB) | $0.30 |
| Egress (effectively $0 — mostly served through CF) | $0 |
| **Subtotal** | **~$0.30** |
### Edge (Cloudflare)
| Item | Monthly |
|---|---:|
| Cloudflare Free plan (DNS, TLS, CDN, basic DDoS) | **$0** |
### Registry (Gitea)
Self-hosted on the operator's existing Gitea VPS. Not charged to
honeyDue.
| Item | Monthly |
|---|---:|
| Gitea container registry | **$0** |
### Total infrastructure
| Category | Monthly |
|---|---:|
| Compute | $23.97 |
| Database | ~$5 |
| Storage | ~$0.30 |
| Edge | $0 |
| Registry | $0 |
| **Total** | **~$30** |
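As a quick cross-check of the table above, the line items can be summed with exact decimal arithmetic. The dictionary keys mirror the table; the variable names are illustrative only.

```python
from decimal import Decimal

# Monthly line items in USD, taken from the tables in this chapter.
line_items = {
    "Compute": Decimal("7.99") * 3,   # 3 x CX33
    "Database": Decimal("5.00"),      # Neon Launch minimum
    "Storage": Decimal("0.30"),       # B2, ~50 GB stored
    "Edge": Decimal("0"),             # Cloudflare Free
    "Registry": Decimal("0"),         # self-hosted Gitea
}

total = sum(line_items.values(), Decimal("0"))
print(f"Total: ${total}")
```

The exact sum is $29.27, which the table rounds to ~$30.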
## External SaaS
Things not part of the deploy but required for the product:
| Item | Cost | Notes |
|---|---:|---|
| Fastmail (SMTP for transactional email) | Part of operator's existing plan | — |
| Apple Developer Program | $99/year = $8.25/mo | Required for iOS app + APNs |
| Google Play Developer | $25 one-time + $0/mo ongoing | — |
| Hetzner Cloud Firewall | $0 | Free; we use UFW instead |
At push-enabled state, total monthly run rate is **~$38-42**.
## Hidden / untracked costs
- **Operator time**: The biggest cost for a bootstrapped project.
Treating ops time at $100/hr, a 4-hour incident = $400.
- **Electricity for operator workstation during builds**: trivial.
- **Domain registration (myhoneydue.com)**: ~$12/year = $1/mo.
## Cost drivers
### 1. Compute (scales with traffic)
If average api CPU utilization exceeds 70%, the HPA scales out from 3
replicas up to a maximum of 6. Memory at 3 replicas × 512Mi limit =
1.5 GB (3 GB at 6 replicas); nodes have 8 GB each, so there is plenty
of room before we need more nodes.
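The scaling behavior described above can be approximated with the standard HPA ratio formula, `ceil(currentReplicas × currentUtilization / targetUtilization)`, clamped to the replica bounds. This is a simplified sketch: it ignores the HPA tolerance band and stabilization windows, and it assumes the 3 and 6 mentioned above are the HPA minimum and maximum.

```python
import math


def desired_replicas(current: int, cpu_pct: float,
                     target_pct: float = 70.0,
                     min_replicas: int = 3,
                     max_replicas: int = 6) -> int:
    """Approximate the HPA's desired replica count from CPU utilization."""
    raw = math.ceil(current * cpu_pct / target_pct)
    return max(min_replicas, min(max_replicas, raw))


def memory_limit_gib(replicas: int, limit_mib: int = 512) -> float:
    """Total memory limit across api replicas, in GiB."""
    return replicas * limit_mib / 1024


print(desired_replicas(3, 90))    # moderately over target
print(desired_replicas(3, 200))   # far over target, clamped to the maximum
print(memory_limit_gib(6))        # memory limit at full scale-out
```

Even at the maximum of 6 replicas, the combined memory limit is 3 GiB across a cluster with 24 GB of RAM, so CPU rather than memory is the likelier constraint.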
Tipping points:
- More than 6 api replicas needed on a sustained basis = bigger CX43
(8 vCPU, 16 GB, ~$16/mo each) or more CX33s
- Heavy worker throughput = need Asynq PeriodicTaskManager (code
change, not infra)
### 2. Database (scales with query volume + data)
Neon Launch: pay per CU-hour of compute. If idle time ≫ active time,
we stay near $5 min. If the app is busy, CU-hours grow.
Tipping points:
- Consistently >$30/mo at Launch → evaluate Neon Scale plan
- DB storage >50 GB → $17.50+/mo just for storage (50 GB × $0.35)
- Active query load → consider read replicas (paid feature)
### 3. Storage (scales with user uploads)
B2 at $0.006/GB is cheap. 1 TB = $6/mo.
Tipping points:
- >5 TB stored = consider R2 (free egress) if egress becomes a factor
- Very high egress = evaluate moving B2 behind CF Workers
### 4. Edge
Cloudflare Free is generous. We move to Pro ($20/mo) if:
- We need custom WAF rules beyond 5
- We need Image Resizing for user uploads
- We need custom Page Rules beyond 3
## Projections
### 10,000 daily active users
Assume 50 API requests per user per day = 500k req/day = ~6 req/s avg.
Peaks maybe 3-5× = ~25 req/s.
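The traffic estimate above is simple enough to script, which makes it easy to rerun with different assumptions. The function name and the default of 50 requests per user per day come from this section's assumption, not from measured traffic.

```python
SECONDS_PER_DAY = 86_400


def request_rates(daily_active_users: int,
                  requests_per_user_per_day: int = 50,
                  peak_factors: tuple = (3, 5)):
    """Return (requests/day, average req/s, peak req/s per factor)."""
    per_day = daily_active_users * requests_per_user_per_day
    average = per_day / SECONDS_PER_DAY
    peaks = tuple(average * factor for factor in peak_factors)
    return per_day, average, peaks


per_day, average, peaks = request_rates(10_000)
print(f"{per_day} req/day, {average:.1f} req/s avg, "
      f"{peaks[0]:.0f}-{peaks[1]:.0f} req/s at peak")
```

For 10,000 daily active users this gives about 5.8 req/s on average and roughly 17-29 req/s at peak, consistent with the ~6 and ~25 req/s figures used here.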
Bottleneck: probably Neon CU-hours (we are on the Launch plan, billed
per CU-hour). At 25 req/s with DB calls, we'd burn through CU-hours
fast. Neon bill: $15-30/mo.
Compute: 3 CX33s still handle this comfortably.
| Category | Projected monthly |
|---|---:|
| Compute | $24 |
| Neon | ~$20 |
| Storage | ~$2 |
| Cloudflare | $0 |
| **Total** | **~$46** |
### 100,000 daily active users
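At the same 50 requests per user per day, that is 5M req/day = ~58 req/s
avg, with 3-5× peaks around 175-290 req/s. That means multi-node api
scaling; HPA kicks in.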
| Category | Projected monthly |
|---|---:|
| Compute (3x CX33) | $24 |
| Plus Hetzner LB | $8.49 |
| Neon Scale (pay-as-you-go, higher baseline) | $40-60 |
| B2 (200 GB stored, some egress) | $2 |
| Cloudflare Pro | $20 |
| **Total** | **~$95-115** |
At this scale, operator time becomes the bigger cost. Adding paid
monitoring (Betterstack ~$15/mo) and uptime (Betterstack Uptime $5/mo)
becomes reasonable.
### 1,000,000 daily active users
Bigger question. We'd be re-evaluating:
- More Hetzner nodes or bigger instances
- Neon at scale vs. self-hosted Postgres
- Maybe Cloudflare Workers to offload traffic
Ballpark: $300-500/mo. At this scale, the company has revenue to
justify an ops hire, and this chapter's assumptions break down.
## Dials to save money
### Immediate (reduce $)
| Lever | Savings | Trade-off |
|---|---|---|
| Switch 3 CX33 → 3 Netcup VPS1000G11 | ~$4/mo | Less polished provider, slightly worse UX |
| Disable Neon Launch, use Supabase free tier | ~$5/mo | Supabase free tier limits |
| 2 nodes instead of 3 | ~$8/mo | Lose HA; a two-node Raft quorum tolerates no node loss, which is worse than a single node |
| 1 CX23 (2 vCPU, 4 GB) for admin + worker; 2 CX33 for api | ~$5/mo | Complexity; node roles |
None of these are compelling. Current cost is in the "don't optimize"
zone.
### Dials to spend when it becomes worth it
| Spend | Return |
|---|---|
| Upgrade Neon to Scale ($20+) | More CU-hours, connection count room |
| Add Hetzner LB ($8.49) | Real active health checks, sub-second failover |
| Add monitoring (Betterstack $15) | Proactive detection of issues |
| Add uptime monitoring ($5) | Alerts when site is down |
| CF Pro ($20) | Better WAF, Image Resizing |
| CF Load Balancing ($5) | Multi-region failover, active checks on origins |
Cumulatively **~$73/mo** takes us to a fully monitored, fully alerted
setup with multi-region failover. At 100k users, worth it.
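The cumulative figure can be checked by summing the listed prices. This sketch uses the table's floor of $20 for Neon Scale, so the real total would be higher if Neon usage exceeds that; the dictionary is illustrative only.

```python
from decimal import Decimal

# Monthly add-on prices in USD, as listed in the table above.
spend_dials = {
    "Neon Scale (floor)": Decimal("20"),
    "Hetzner LB": Decimal("8.49"),
    "Betterstack monitoring": Decimal("15"),
    "Uptime monitoring": Decimal("5"),
    "Cloudflare Pro": Decimal("20"),
    "Cloudflare Load Balancing": Decimal("5"),
}

cumulative = sum(spend_dials.values(), Decimal("0"))
print(f"All dials turned: ${cumulative}/mo")
```

The exact sum at these list prices is $73.49 per month.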
## Historical spend
**April 2026 MTD**: ~$35 (Hetzner + Neon prorated).
**April 2026 (projected)**: $30-40.
**March 2026**: Pre-launch; no user traffic yet. Just node rentals.
~$25.
## Hetzner April 2026 price adjustment
CX33 went from ~$6.59 → $7.99/mo on 2026-04-01. Our monthly compute
cost rose by $4.20 overnight. This is on our budget radar but isn't a
forcing function to switch providers.
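The impact of the adjustment follows directly from the per-node prices. A minimal check, treating the approximate old price of $6.59 as exact:

```python
from decimal import Decimal

old_price = Decimal("6.59")   # approximate pre-adjustment CX33 price
new_price = Decimal("7.99")   # price from 2026-04-01
nodes = 3

monthly_increase = (new_price - old_price) * nodes
percent_increase = (new_price - old_price) / old_price * 100

print(f"+${monthly_increase}/mo across {nodes} nodes "
      f"({percent_increase:.1f}% per node)")
```

That is a per-node increase of about 21%, or $50.40 per year across the three nodes.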
If Hetzner keeps raising prices (which they've historically resisted;
the 2026 adjustment was their first in several years), reconsider.
## Budget alerts
- **B2**: hard-capped via B2 console at $20/mo. If we breach, something
is wrong and B2 rejects further writes.
- **Neon**: soft limits via Neon alerts. Set threshold at $20 to get
email when approaching.
- **Hetzner**: no variable cost at our scale, no alerts needed.
- **Cloudflare**: Free plan has hard quotas; no surprise bills possible.
## References
- [Hetzner Cloud pricing][hetzner-cloud]
- [Neon pricing][neon-pricing]
- [Backblaze B2 pricing][b2-pricing]
- [Cloudflare Free plan][cf-free]

[hetzner-cloud]: https://www.hetzner.com/cloud/
[neon-pricing]: https://neon.com/pricing
[b2-pricing]: https://www.backblaze.com/cloud-storage/pricing
[cf-free]: https://www.cloudflare.com/plans/free/