Infrastructure:
- Stack now runs on K3s v1.34.6 HA (3 Hetzner CX33 nodes as managers)
- Traefik DaemonSet + hostNetwork replaces Caddy + ingress mesh
- All manifests in deploy-k3s/manifests/; Swarm config (deploy/) kept
temporarily for reference
Bug fixes surfaced during migration:
- Dockerfile: golang:1.24-alpine -> 1.25-alpine (go.mod requires 1.25)
- cache_service.go: remove sync.Once reassignment from inside Do()
callback (was causing 'unlock of unlocked mutex' fatal after
Redis Ping failure)
- router.go: relax CSP from 'default-src none' to 'default-src self'
+ allowlist fonts.googleapis.com so the marketing landing page CSS
actually loads in browsers
- deploy/scripts/deploy_prod.sh: use docker buildx with
--platform linux/amd64 so arm64 (Apple Silicon) dev machines produce
images runnable on x86_64 Hetzner nodes; fix array expansion under
set -u
- deploy/swarm-stack.prod.yml: fix secret source references to use
top-level aliases (the '${X_SECRET}' form never actually resolved);
dozzle ports: long-form host_ip is rejected by Swarm, switched to
short-form (bound to 0.0.0.0 with UFW-based loopback restriction);
worker replicas 2 -> 1 (Asynq scheduler singleton)
- deploy-k3s/manifests/admin/deployment.yaml: probe path '/admin/' -> '/'
(Next.js serves at root; /admin/ returned 404 and killed pods);
startupProbe failureThreshold 12 -> 24
- deploy-k3s/manifests/pod-disruption-budgets.yaml: worker minAvailable
1 -> 0 (singleton)
- deploy-k3s/manifests/api/deployment.yaml: startupProbe failureThreshold
12 -> 48 (MigrateWithLock serializes across 3 replicas on first-boot;
real startup takes up to 240s)
- .gitignore: tighten 'api' -> '/api' (was matching deploy-k3s/manifests/api/
and admin/src/app/api/*, hiding legitimate files)
New files:
- deploy-k3s/manifests/traefik-helmchartconfig.yaml: DaemonSet +
hostNetwork override for k3s-bundled Traefik
- deploy-k3s/manifests/ingress/ingress-simple.yaml: plain Ingress
without TLS (CF Flexible SSL) and without middleware
- deploy-k3s/MIGRATION_NOTES.md: operator-facing migration log
Documentation:
- docs/deployment/ — full deployment book, 26 files, ~42k words:
- Part I Overview, infrastructure, orchestrator choice (Ch 0-2)
- Part II Networking, firewall, Cloudflare (Ch 3-4, 13)
- Part III Security, Traefik ingress (Ch 5-6)
- Part IV Services, DB, storage, secrets, registry (Ch 7-11)
- Part V Data flow, deploy process, observability, failures, runbook
(Ch 12, 14-17)
- Part VI Cost, Swarm postmortem, roadmap (Ch 18-20)
- Appendices: glossary, kubectl cheat sheet, file locations,
consolidated citations
- README.md: Production Deployment section replaced with pointer to
the book; Go version bumped to 1.25
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
13 — Cloudflare
Summary
Cloudflare sits in front of every public request. It provides DNS
(authoritative nameservers for myhoneydue.com), TLS termination at
the edge, DDoS mitigation, caching, and the round-robin fan-out across
our three node IPs. We use the Free plan. TLS mode is "Flexible"
(HTTP between CF and origin). This chapter documents every Cloudflare
setting that matters.
DNS
Zone
myhoneydue.com, managed by Cloudflare. Authoritative nameservers:
carol.ns.cloudflare.com
ishaan.ns.cloudflare.com
Records that matter
| Type | Name | Content | Proxy | Notes |
|---|---|---|---|---|
| A | api | 178.104.247.152 | 🟠 Proxied | hetzner1 |
| A | api | 178.105.32.198 | 🟠 Proxied | hetzner2 |
| A | api | 178.104.249.189 | 🟠 Proxied | hetzner3 |
| A | admin | 178.104.247.152 | 🟠 Proxied | same 3 IPs |
| A | admin | 178.105.32.198 | 🟠 Proxied | |
| A | admin | 178.104.249.189 | 🟠 Proxied | |
| A | @ | 178.104.247.152 | 🟠 Proxied | same 3 IPs |
| A | @ | 178.105.32.198 | 🟠 Proxied | |
| A | @ | 178.104.249.189 | 🟠 Proxied | |
Three A records per name → Cloudflare selects one per connection. With proxying on (orange cloud), the client never sees these IPs; it sees a Cloudflare edge IP. CF internally picks which of the three origin IPs to connect to; if the connection to one origin fails, CF retries with the next.
TXT records for email (Fastmail sending domain): SPF, DKIM, DMARC. Not our immediate concern; configured by the Fastmail custom-domain setup.
Why three A records per name, not one
With one record pointing at hetzner1:
- Only hetzner1 sees traffic
- If hetzner1 is unreachable, everything breaks until we change DNS
With three records:
- CF chooses one origin per connection
- If one node's port :80 stops responding, CF tries the others
- Node upgrades can be done one at a time with no user impact
This is poor man's load balancing. A Hetzner Load Balancer or Cloudflare Load Balancer (paid) would be more sophisticated, with active health checks and sub-second automatic failover. Our DNS approach is "good enough" for the traffic volume.
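The retry-the-next-origin behaviour described above can be sketched in a few lines of Go. This is an illustration of the idea only, not Cloudflare's actual selection algorithm (which is internal to CF); `firstReachable` and the injected `probe` function are hypothetical names.

```go
package main

import "fmt"

// firstReachable walks the origin list in order and returns the first
// origin for which probe reports success. probe is injected so the logic
// can run without real network calls.
func firstReachable(origins []string, probe func(origin string) bool) (string, bool) {
	for _, origin := range origins {
		if probe(origin) {
			return origin, true
		}
	}
	return "", false
}

func main() {
	origins := []string{"178.104.247.152", "178.105.32.198", "178.104.249.189"}

	// Hypothetical probe: pretend hetzner1 is down for a node upgrade.
	probe := func(origin string) bool {
		return origin != "178.104.247.152"
	}

	if origin, ok := firstReachable(origins, probe); ok {
		fmt.Println("serving from", origin)
	} else {
		fmt.Println("all origins unreachable")
	}
}
```

With all three records in place, taking one node out of rotation only changes which origin answers; with a single record the loop would have nothing to fall back to.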
Cloudflare's origin health checks
On the Free plan, CF doesn't actively probe origins. It reacts to real connection failures: if an origin returns 5xx repeatedly or its connections time out, CF marks it unhealthy for that edge POP for some time.
Upgrading to Cloudflare Load Balancing ($5/mo add-on) would enable active health checks — explicit probes independent of traffic. Useful when you want sub-second failover.
TLS
Mode: Flexible
CF Dashboard → SSL/TLS → Overview → Flexible.
What this means:
- User ↔ Cloudflare: TLS (HTTPS)
- Cloudflare ↔ Origin: plaintext HTTP (port 80)
Why we chose it:
- No origin cert required on the Hetzner nodes
- Zero Traefik cert-management complexity
- Fine for a site where CF terminates all user-facing TLS
Downsides:
- An attacker with network access between CF and Hetzner could read traffic. Realistically, nobody sits between CF's POPs and Hetzner's Nuremberg DC, but the traffic is still plaintext on the wire.
- MitM risk if DNS gets hijacked and traffic is routed through an unintended origin.
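One practical consequence of Flexible mode is that the origin always sees plain HTTP, so application code cannot use the connection itself to tell whether the visitor used HTTPS. A minimal sketch, assuming the proxy forwards the visitor's scheme in the `X-Forwarded-Proto` header and that the request really arrived via Cloudflare; `clientScheme` and `schemeFromForwardedProto` are hypothetical helpers, not functions from our codebase.

```go
package main

import (
	"fmt"
	"net/http"
	"strings"
)

// schemeFromForwardedProto normalises an X-Forwarded-Proto value.
// Anything other than "https" is treated as plain HTTP.
func schemeFromForwardedProto(value string) string {
	if strings.EqualFold(strings.TrimSpace(value), "https") {
		return "https"
	}
	return "http"
}

// clientScheme reports the scheme the visitor used at the edge.
// Under Flexible mode r.TLS is nil at the origin, so the header is the
// only signal. Only trust it when the peer is a known proxy address.
func clientScheme(r *http.Request) string {
	if r.TLS != nil {
		return "https"
	}
	return schemeFromForwardedProto(r.Header.Get("X-Forwarded-Proto"))
}

func main() {
	req, err := http.NewRequest(http.MethodGet, "http://api.myhoneydue.com/api/health/", nil)
	if err != nil {
		panic(err)
	}
	req.Header.Set("X-Forwarded-Proto", "https")
	fmt.Println(clientScheme(req))
}
```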
Future: Full (strict)
The next step up is Full (strict): CF verifies origin's TLS cert and connects over HTTPS. Cloudflare provides free Origin CA certificates for this: they're issued by a CF-internal CA that only CF's own edge accepts. An attacker without a CF-signed cert can't impersonate our origin.
Path to enable:
- Generate Origin CA cert in CF dashboard → SSL/TLS → Origin Server
- Download as PEM
- Create k8s Secret cloudflare-origin-cert:

      kubectl create secret tls cloudflare-origin-cert -n honeydue \
        --cert=origin.crt --key=origin.key

- Add tls: block to our Ingress:

      spec:
        tls:
          - hosts: [api.myhoneydue.com]
            secretName: cloudflare-origin-cert

- Switch CF SSL mode to Full (strict)
Trade-off: the cloudflare-origin-cert does eventually expire, but the
default validity is 15 years, so maintenance is low. TODO (Chapter 20).
Edge certificate
CF provides a free edge certificate for *.myhoneydue.com and
myhoneydue.com. Auto-renewed by Cloudflare. We don't touch it.
Always Use HTTPS
SSL/TLS → Edge Certificates → Always Use HTTPS: On (default).
Redirects any HTTP → HTTPS at the CF edge. Clients that hit
http://api.myhoneydue.com/* get 301'd to https://.... Origin never
sees the HTTP request.
HSTS
Not currently enabled. HSTS (HTTP Strict Transport Security) sends
a header telling browsers "always use HTTPS for this domain." Once set
with long max-age, it's permanent until it expires — if we later
misconfigure TLS, HSTS-enabled browsers refuse to connect at all.
Enabling HSTS is a TODO but requires confidence in our TLS stability. Not tonight.
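For reference, this is roughly what the header looks like once we do enable it. A small sketch, assuming we would start with a short max-age and lengthen it later; `hstsValue` is a hypothetical helper, and in practice the header could just as well be set from the CF dashboard instead of in code.

```go
package main

import (
	"fmt"
	"strings"
)

// hstsValue builds a Strict-Transport-Security header value.
// maxAgeSeconds is how long browsers remember the HTTPS-only policy.
func hstsValue(maxAgeSeconds int, includeSubDomains, preload bool) string {
	parts := []string{fmt.Sprintf("max-age=%d", maxAgeSeconds)}
	if includeSubDomains {
		parts = append(parts, "includeSubDomains")
	}
	if preload {
		parts = append(parts, "preload")
	}
	return strings.Join(parts, "; ")
}

func main() {
	// Cautious first step: browsers forget the policy after 5 minutes.
	fmt.Println(hstsValue(300, false, false))
	// Long-lived policy once TLS has been stable for a while (one year).
	fmt.Println(hstsValue(31536000, true, false))
}
```

A short max-age limits the blast radius of a TLS misconfiguration to minutes rather than months.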
DDoS mitigation
CF's Free plan includes basic DDoS protection:
- Volumetric attacks absorbed at the edge
- Obvious bot patterns blocked (known-bad user agents, headless browsers doing suspicious things)
Under a large attack, CF might:
- Insert a "checking your browser" JavaScript challenge (the ~5-second "Cloudflare is checking your browser" page)
- Rate-limit by IP
Under a sustained, sophisticated attack we might need:
- CF Pro plan ($20/mo) for more rule customization
- Enterprise plan for negotiated protection
- Extra measures like Cloudflare Magic Transit
So far, not needed.
Caching
Default CF caching:
- Static assets (CSS, JS, images) cached aggressively based on extension
- HTML pages honored per Cache-Control headers from origin
- JSON API responses typically not cached (no Cache-Control: public)
Our Go API doesn't set Cache-Control: public on any endpoint, so CF
treats them as uncacheable. Every API call reaches origin.
If we wanted to cache certain endpoints (e.g., public lookup tables):
c.Response().Header().Set("Cache-Control", "public, max-age=300")
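With a Cache Rule that marks the path as eligible for caching, CF will then cache the response for 5 minutes. The header alone is not enough: by default CF decides cacheability by file extension and does not cache JSON.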
Firewall rules at CF
CF Dashboard → Security → WAF. On Free tier:
- Managed rules: a small free ruleset that blocks "obvious-attack" patterns
- Custom rules: limited (5 on Free, 20 on Pro)
We have no custom rules defined currently. The managed ruleset covers:
- SQL injection attempts in query strings
- Known-vulnerable bot User-Agents
- XSS attempts in common parameters
Rate limiting
CF Free: 10,000 requests per 10 minutes per IP for free rules (we haven't configured any). The API itself should have rate limits for sensitive endpoints; we don't rely on CF for that.
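As an illustration of app-side limiting for a sensitive endpoint, here is a minimal fixed-window limiter keyed by client IP. It is a sketch under stated assumptions, not the limiter our API uses: `windowLimiter`, the limit of 5, and the 60-second window are all hypothetical, and entries are never evicted, so a real version would need cleanup.

```go
package main

import (
	"fmt"
	"sync"
)

// windowLimiter counts requests per client IP inside fixed time windows.
type windowLimiter struct {
	mu         sync.Mutex
	limit      int
	windowSecs int64
	window     map[string]int64 // window index last seen per IP
	count      map[string]int   // requests seen in that window
}

func newWindowLimiter(limit int, windowSecs int64) *windowLimiter {
	if windowSecs < 1 {
		windowSecs = 1 // avoid division by zero
	}
	return &windowLimiter{
		limit:      limit,
		windowSecs: windowSecs,
		window:     make(map[string]int64),
		count:      make(map[string]int),
	}
}

// allow reports whether a request from ip at nowUnix is within the limit.
func (l *windowLimiter) allow(ip string, nowUnix int64) bool {
	l.mu.Lock()
	defer l.mu.Unlock()

	w := nowUnix / l.windowSecs
	if l.window[ip] != w {
		// New window for this IP: reset its counter.
		l.window[ip] = w
		l.count[ip] = 0
	}
	if l.count[ip] >= l.limit {
		return false
	}
	l.count[ip]++
	return true
}

func main() {
	// Hypothetical policy: 5 login attempts per minute per client IP.
	lim := newWindowLimiter(5, 60)
	for i := 1; i <= 7; i++ {
		fmt.Printf("attempt %d allowed=%v\n", i, lim.allow("203.0.113.7", 1000))
	}
}
```

The timestamp is passed in rather than read from the clock so the behaviour is easy to test.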
What CF does NOT do for us
- Authenticate users — our app does
- Authorize requests — our app does
- Encrypt pod-to-pod traffic — nothing Cloudflare can help with
- Backup origin data — CF caches but doesn't store copies persistently
Turnstile / bot management
Not enabled. If we start seeing account-creation spam, Cloudflare Turnstile (free) would be a good addition — a CAPTCHA replacement that doesn't require user interaction for most traffic.
Origin IP protection
CF proxying (orange cloud) is the primary protection of our origin IPs. When proxying is on:
- DNS queries return CF edge IPs, never origin
- HTTP/HTTPS traffic goes through CF
However, our origin IPs can leak via:
- Email sending (if the app ever sent email directly from the origin IP) — we use Fastmail so this isn't an issue
- Outbound connections (our pods connect out to Neon, B2, Fastmail from the nodes' public IPs; those IPs appear in external logs)
- Historical DNS records (services like SecurityTrails log historical DNS; if we ever had unproxied A records, attackers can look them up)
If origin IPs leak, attackers can bypass CF's protection by connecting directly to node IPs. Current mitigation:
- UFW allows only :80/:443 from anywhere
- Our app has no ports bound to the public IP
Future (Chapter 20): UFW rule to allow :80/:443 only from CF IP ranges. Prevents direct-connect bypass entirely.
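The future UFW lockdown could be generated rather than typed by hand. A sketch, assuming the rules take the form `ufw allow proto tcp from <range> to any port <port>`; `ufwAllowRules` is a hypothetical helper that only prints commands and does not run them.

```go
package main

import (
	"fmt"
	"net/netip"
)

// ufwAllowRules returns one "ufw allow" command per (range, port) pair.
// Each CIDR is parsed first so malformed input never reaches a shell.
func ufwAllowRules(cidrs []string, ports []int) ([]string, error) {
	var rules []string
	for _, c := range cidrs {
		prefix, err := netip.ParsePrefix(c)
		if err != nil {
			return nil, fmt.Errorf("bad CIDR %q: %w", c, err)
		}
		for _, port := range ports {
			rules = append(rules, fmt.Sprintf(
				"ufw allow proto tcp from %s to any port %d", prefix.Masked(), port))
		}
	}
	return rules, nil
}

func main() {
	// Two sample ranges; the full list is in the IP ranges section.
	cidrs := []string{"173.245.48.0/20", "2400:cb00::/32"}
	rules, err := ufwAllowRules(cidrs, []int{80, 443})
	if err != nil {
		panic(err)
	}
	for _, r := range rules {
		fmt.Println(r)
	}
}
```

These rules only help once the existing allow-from-anywhere rules for :80/:443 are removed; otherwise direct connections are still accepted.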
Cloudflare IP ranges (used in Traefik trustedIPs)
From cloudflare.com/ips:
IPv4 ranges:
173.245.48.0/20
103.21.244.0/22
103.22.200.0/22
103.31.4.0/22
141.101.64.0/18
108.162.192.0/18
190.93.240.0/20
188.114.96.0/20
197.234.240.0/22
198.41.128.0/17
162.158.0.0/15
104.16.0.0/13
104.24.0.0/14
172.64.0.0/13
131.0.72.0/22
IPv6 ranges:
2400:cb00::/32
2606:4700::/32
2803:f800::/32
2405:b500::/32
2405:8100::/32
2a06:98c0::/29
2c0f:f248::/32
These are used in two places:
- Traefik forwardedHeaders.trustedIPs: we already have this configured (Chapter 6)
- UFW allow 80/tcp from <cf-range>: NOT configured (TODO)
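To make the role of the trusted list concrete, here is a sketch of the check a proxy or app performs before believing a forwarded client address. It is illustrative only: Traefik does its own version of this for us, and `parsePrefixes`, `fromTrustedProxy`, and `clientIP` are hypothetical names. It assumes the visitor address arrives in Cloudflare's `CF-Connecting-IP` header.

```go
package main

import (
	"fmt"
	"net/netip"
)

// parsePrefixes converts CIDR strings into netip.Prefix values.
func parsePrefixes(cidrs []string) ([]netip.Prefix, error) {
	out := make([]netip.Prefix, 0, len(cidrs))
	for _, c := range cidrs {
		p, err := netip.ParsePrefix(c)
		if err != nil {
			return nil, fmt.Errorf("bad CIDR %q: %w", c, err)
		}
		out = append(out, p)
	}
	return out, nil
}

// fromTrustedProxy reports whether addr falls inside any trusted range.
func fromTrustedProxy(addr netip.Addr, trusted []netip.Prefix) bool {
	for _, p := range trusted {
		if p.Contains(addr) {
			return true
		}
	}
	return false
}

// clientIP returns the address to use for logging and rate limiting.
// The forwarded value is honoured only when the TCP peer is trusted;
// otherwise anyone connecting directly could spoof the header.
func clientIP(peer, cfConnectingIP string, trusted []netip.Prefix) string {
	peerAddr, err := netip.ParseAddr(peer)
	if err != nil || !fromTrustedProxy(peerAddr.Unmap(), trusted) {
		return peer
	}
	if visitor, err := netip.ParseAddr(cfConnectingIP); err == nil {
		return visitor.String()
	}
	return peer
}

func main() {
	trusted, err := parsePrefixes([]string{"173.245.48.0/20", "2400:cb00::/32"})
	if err != nil {
		panic(err)
	}
	fmt.Println(clientIP("173.245.48.10", "203.0.113.7", trusted)) // peer is a CF range
	fmt.Println(clientIP("198.51.100.9", "203.0.113.7", trusted))  // direct connection
}
```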
CF occasionally adds new ranges. If a future CF range isn't in our list, we'd either trust unknown IPs (if lax) or reject legitimate CF traffic (if strict). The canonical source is the public API:
curl -sS https://www.cloudflare.com/ips-v4
curl -sS https://www.cloudflare.com/ips-v6
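To notice when the published list drifts from what we have configured, the output of those two endpoints could be compared against our local copy. A sketch, assuming the endpoints return one CIDR per line; `parseRanges` and `diffRanges` are hypothetical helpers, and the sample body in `main` is made up for the demo rather than fetched.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// parseRanges splits a newline-separated response body into CIDR strings,
// dropping blank lines and surrounding whitespace.
func parseRanges(body string) []string {
	var out []string
	for _, line := range strings.Split(body, "\n") {
		if s := strings.TrimSpace(line); s != "" {
			out = append(out, s)
		}
	}
	return out
}

// diffRanges compares our configured list with the published one.
// missing: published by CF but absent locally (legit traffic at risk).
// stale: configured locally but no longer published (over-trusting).
func diffRanges(local, published []string) (missing, stale []string) {
	inLocal := make(map[string]bool, len(local))
	for _, r := range local {
		inLocal[r] = true
	}
	inPublished := make(map[string]bool, len(published))
	for _, r := range published {
		inPublished[r] = true
	}
	for r := range inPublished {
		if !inLocal[r] {
			missing = append(missing, r)
		}
	}
	for r := range inLocal {
		if !inPublished[r] {
			stale = append(stale, r)
		}
	}
	sort.Strings(missing)
	sort.Strings(stale)
	return missing, stale
}

func main() {
	local := []string{"173.245.48.0/20", "103.21.244.0/22"}
	// Made-up response body; 192.0.2.0/24 stands in for a newly added range.
	published := parseRanges("173.245.48.0/20\n103.21.244.0/22\n192.0.2.0/24\n")

	missing, stale := diffRanges(local, published)
	fmt.Println("missing locally:", missing)
	fmt.Println("stale locally:", stale)
}
```

Run periodically, a non-empty `missing` list is the signal to update both the Traefik trustedIPs and any UFW rules.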
API token for programmatic changes
If we automate DNS changes (e.g., adding a new subdomain on deploy),
we'd need a CF API token with Zone:DNS:Edit scope for the
myhoneydue.com zone.
Currently not automated; DNS is managed in the CF dashboard by hand.
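If we do automate it, creating a record is a single authenticated POST to the zone's `dns_records` endpoint. A sketch that only builds the request and does not send it; `dnsRecord` and `newCreateRecordRequest` are hypothetical names, and the zone ID, token, and `staging` subdomain in `main` are placeholders.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
)

// dnsRecord is the JSON body for creating a record.
type dnsRecord struct {
	Type    string `json:"type"`
	Name    string `json:"name"`
	Content string `json:"content"`
	Proxied bool   `json:"proxied"`
	TTL     int    `json:"ttl"` // 1 means "automatic"
}

// newCreateRecordRequest builds (but does not send) the API request.
func newCreateRecordRequest(zoneID, token string, rec dnsRecord) (*http.Request, error) {
	body, err := json.Marshal(rec)
	if err != nil {
		return nil, err
	}
	url := fmt.Sprintf("https://api.cloudflare.com/client/v4/zones/%s/dns_records", zoneID)
	req, err := http.NewRequest(http.MethodPost, url, bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Authorization", "Bearer "+token)
	req.Header.Set("Content-Type", "application/json")
	return req, nil
}

func main() {
	// Placeholder zone ID and token; real values would come from secrets.
	req, err := newCreateRecordRequest("ZONE_ID_PLACEHOLDER", "TOKEN_PLACEHOLDER", dnsRecord{
		Type:    "A",
		Name:    "staging.myhoneydue.com", // hypothetical subdomain
		Content: "178.104.247.152",
		Proxied: true,
		TTL:     1,
	})
	if err != nil {
		panic(err)
	}
	fmt.Println(req.Method, req.URL)
}
```

To match the existing setup, the same call would be repeated once per node IP so the new name also gets three A records.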
Cost
$0/mo. Free plan covers everything we use. Paid plans add features we don't need yet:
| Feature | Free | Pro ($20) | Business ($200) |
|---|---|---|---|
| DNS + proxying | ✓ | ✓ | ✓ |
| Basic DDoS | ✓ | ✓ | ✓ |
| SSL (edge + Flexible + Full + Full strict) | ✓ | ✓ | ✓ |
| WAF managed rules | ✓ (limited) | ✓ (more) | ✓ (all) |
| Custom firewall rules | 5 | 20 | 100 |
| Page Rules | 3 | 20 | 50 |
| Image Resizing | no | no | ✓ |
| Load Balancing | no | $5/mo add-on | ✓ |
We'd consider Pro ($20/mo) if:
- We needed a custom WAF rule beyond the 5-rule limit
- We wanted Image Resizing for user-uploaded photos
Neither is needed today.
Operator cheat sheet
# Query current CF-served DNS
dig +short @1.1.1.1 api.myhoneydue.com # returns CF edge IPs when proxied
# Query our origin directly (bypass CF)
curl -sS -H "Host: api.myhoneydue.com" http://178.104.247.152/api/health/
# Check CF headers (confirm you're going through CF)
curl -sS -I https://api.myhoneydue.com/api/health/ | grep -i cf-
# Purge CF cache (requires API token)
curl -X POST \
-H "Authorization: Bearer $CF_TOKEN" \
-H "Content-Type: application/json" \
"https://api.cloudflare.com/client/v4/zones/<zone_id>/purge_cache" \
-d '{"purge_everything":true}'