iOS Sign In with Apple failed silently — the KMP client never reached
Kratos. Traced to the cloudflare-only Traefik middleware rejecting every
request at the auth ingress.
Root cause: on this cluster klipper-lb sits in front of Traefik and
SNATs the source IP. Traefik's ipAllowList sees the klipper-lb pod IP,
not Cloudflare's real source IP — so even legitimate iOS requests
proxied through Cloudflare get 403'd. The api ingress doesn't have
this middleware (and works correctly), so removing it from auth
matches the working pattern.
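A minimal sketch of why the check fails, assuming the allowlist holds Cloudflare's published ranges (only two are listed here as a sample) and that pods use the K3s default pod CIDR 10.42.0.0/16. The addresses are illustrative and the function name is hypothetical; this is not Traefik's implementation.

```python
import ipaddress

# Two of Cloudflare's published IPv4 ranges, used only as a sample (assumption:
# the real middleware lists the full published set).
CLOUDFLARE_SAMPLE = [
    ipaddress.ip_network(cidr) for cidr in ("173.245.48.0/20", "104.16.0.0/13")
]


def allowed(source_ip: str, allowlist=CLOUDFLARE_SAMPLE) -> bool:
    """Return True if source_ip falls inside any allowlisted network."""
    addr = ipaddress.ip_address(source_ip)
    return any(addr in net for net in allowlist)


# An address inside 104.16.0.0/13, standing in for a Cloudflare edge node.
print(allowed("104.16.1.1"))  # True
# What the allowlist sees after SNAT: a pod IP from the K3s default pod CIDR.
print(allowed("10.42.0.17"))  # False
```

The second call is the situation described above: the request did come through Cloudflare, but the address being checked belongs to the klipper-lb pod.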
Kratos is the user-facing OIDC endpoint — every iOS/web user device
needs to reach it. Cloudflare's edge still does DDoS protection;
Kratos applies its own per-flow rate limits. The IP allowlist was
buying nothing here and breaking everything.
Verified after this change:
- GET /health/alive → 200
- GET /health/ready → 200
- GET /self-service/login/api → 200 + valid flow body listing apple
as an OIDC provider option
Related but not fixed by this commit: the same klipper-lb SNAT issue
affects admin.myhoneydue.com (which retains cloudflare-only). Admin
basic auth still gates real access there, but the IP check is dead
weight. Proper fix is configuring Traefik ipStrategy to read the
client IP from X-Forwarded-For (set by Cloudflare). Tracked as a
follow-up.
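For the follow-up, a rough sketch of how a depth-based ipStrategy picks an address out of X-Forwarded-For: the entry at position depth, counted from the right. The function name is hypothetical, and which depth suits this cluster is not established here.

```python
def client_ip_at_depth(x_forwarded_for: str, depth: int) -> str:
    """Pick the depth-th address from the right of X-Forwarded-For.

    Returns "" when depth exceeds the number of addresses. Traefik ignores
    depth <= 0 and falls back to the connection address; this sketch simply
    returns "" in that case.
    """
    ips = [part.strip() for part in x_forwarded_for.split(",") if part.strip()]
    if depth <= 0 or depth > len(ips):
        return ""
    return ips[-depth]


header = "10.0.0.1,11.0.0.1,12.0.0.1,13.0.0.1"
print(client_ip_at_depth(header, 1))  # 13.0.0.1
print(client_ip_at_depth(header, 3))  # 11.0.0.1
print(repr(client_ip_at_depth(header, 5)))  # ''
```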
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
honeyDue — K3s Production Deployment
Production Kubernetes deployment for honeyDue on Hetzner Cloud using K3s.
Architecture: 3-node HA K3s cluster (CX33), Neon Postgres, Redis (in-cluster), Backblaze B2 (uploads), Cloudflare CDN/TLS.
Domains: api.myhoneydue.com, admin.myhoneydue.com
Quick Start
cd honeyDueAPI-go/deploy-k3s
# 1. Fill in the single config file
cp config.yaml.example config.yaml
# Edit config.yaml — fill in ALL empty values
# 2. Create secret files
# See secrets/README.md for the full list
echo "your-neon-password" > secrets/postgres_password.txt
openssl rand -base64 48 > secrets/secret_key.txt
echo "your-smtp-password" > secrets/email_host_password.txt
echo "your-fcm-key" > secrets/fcm_server_key.txt
cp /path/to/AuthKey.p8 secrets/apns_auth_key.p8
cp /path/to/origin.pem secrets/cloudflare-origin.crt
cp /path/to/origin-key.pem secrets/cloudflare-origin.key
# 3. Provision → Secrets → Deploy
./scripts/01-provision-cluster.sh
./scripts/02-setup-secrets.sh
./scripts/03-deploy.sh
# 4. Set up Hetzner LB + Cloudflare DNS (see sections below)
# 5. Verify
./scripts/04-verify.sh
curl https://api.myhoneydue.com/api/health/
That's it. Everything reads from config.yaml + secrets/.
Table of Contents
- Prerequisites
- Configuration
- Provision Cluster
- Create Secrets
- Deploy
- Configure Load Balancer & DNS
- Verify
- Monitoring & Logs
- Scaling
- Rollback
- Backup & DR
- Security
- Troubleshooting
1. Prerequisites
| Tool | Install | Purpose |
|---|---|---|
| hetzner-k3s | gem install hetzner-k3s | Cluster provisioning |
| kubectl | https://kubernetes.io/docs/tasks/tools/ | Cluster management |
| helm | https://helm.sh/docs/intro/install/ | Optional: Prometheus/Grafana |
| stern | brew install stern | Multi-pod log tailing |
| docker | https://docs.docker.com/get-docker/ | Image building |
| python3 | Pre-installed on macOS | Config parsing |
| htpasswd | brew install httpd or apt install apache2-utils | Admin basic auth secret |
Verify:
hetzner-k3s version && kubectl version --client && docker version && python3 --version
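The one-line check above stops at the first failing command and skips helm, stern, and htpasswd. As an alternative, here is a small sketch that reports every tool from the table that is missing from PATH. The function name is hypothetical, and the script only checks presence, not versions.

```python
import shutil

# Tool names taken from the prerequisites table; helm is marked optional there.
TOOLS = ["hetzner-k3s", "kubectl", "helm", "stern", "docker", "python3", "htpasswd"]


def missing_tools(names, which=shutil.which):
    """Return the tools from names that cannot be found on PATH."""
    return [name for name in names if which(name) is None]


if __name__ == "__main__":
    missing = missing_tools(TOOLS)
    if missing:
        print("missing: " + ", ".join(missing))
    else:
        print("all tools found")
```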
2. Configuration
There are two things to fill in:
config.yaml — all string configuration
cp config.yaml.example config.yaml
Open config.yaml and fill in every empty "" value:
| Section | What to fill in |
|---|---|
| cluster.hcloud_token | Hetzner API token (Read/Write); generate at console.hetzner.cloud |
| registry.* | GHCR credentials (same as Docker Swarm setup) |
| database.host, database.user | Neon PostgreSQL connection info |
| email.user | Fastmail email address |
| push.apns_key_id, push.apns_team_id | Apple Push Notification identifiers |
| storage.b2_* | Backblaze B2 bucket and credentials |
| redis.password | Strong password for Redis authentication (required for production) |
| admin.basic_auth_user | HTTP basic auth username for admin panel |
| admin.basic_auth_password | HTTP basic auth password for admin panel |
Everything else has sensible defaults. config.yaml is gitignored.
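To spot values that are still empty, a line-based scan is usually enough. The sketch below assumes the empty placeholders are written as `key: ""` on a single line, as in the description above. It reports only the leaf key name (not the full section path) and is not a YAML parser. The sample text is made up.

```python
import re

# Matches lines such as `  hcloud_token: ""`, optionally followed by a comment.
EMPTY_VALUE = re.compile(r'^\s*([A-Za-z0-9_]+):\s*""\s*(#.*)?$')


def empty_keys(yaml_text: str) -> list[str]:
    """Return key names whose value is an empty double-quoted string."""
    found = []
    for line in yaml_text.splitlines():
        match = EMPTY_VALUE.match(line)
        if match:
            found.append(match.group(1))
    return found


sample = '''cluster:
  hcloud_token: ""
database:
  host: "db.example.internal"
  user: ""
'''
print(empty_keys(sample))  # ['hcloud_token', 'user']
```

To run it against the real file, pass `open("config.yaml").read()` instead of `sample`.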
secrets/ — file-based secrets
These are binary or multi-line files that can't go in YAML:
| File | Source |
|---|---|
| secrets/postgres_password.txt | Your Neon database password |
| secrets/secret_key.txt | openssl rand -base64 48 (min 32 chars) |
| secrets/email_host_password.txt | Fastmail app password |
| secrets/fcm_server_key.txt | Firebase console → Project Settings → Cloud Messaging |
| secrets/apns_auth_key.p8 | Apple Developer → Keys → APNs key |
| secrets/cloudflare-origin.crt | Cloudflare → SSL/TLS → Origin Server → Create Certificate |
| secrets/cloudflare-origin.key | (saved with the certificate above) |
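Before running the secrets script, it can help to confirm that all seven files exist and are non-empty. The following is a sketch of such a preflight check and is not part of the repo's scripts. The function name is hypothetical, and the 32-character minimum for secret_key.txt comes from the table above.

```python
from pathlib import Path

# File names from the table above, relative to the secrets/ directory.
EXPECTED = [
    "postgres_password.txt", "secret_key.txt", "email_host_password.txt",
    "fcm_server_key.txt", "apns_auth_key.p8",
    "cloudflare-origin.crt", "cloudflare-origin.key",
]


def check_secrets(secrets_dir) -> list[str]:
    """Return human-readable problems; an empty list means nothing was flagged."""
    root = Path(secrets_dir)
    problems = []
    for name in EXPECTED:
        path = root / name
        if not path.is_file():
            problems.append(f"missing: {name}")
        elif path.stat().st_size == 0:
            problems.append(f"empty: {name}")
    key = root / "secret_key.txt"
    if key.is_file() and len(key.read_text().strip()) < 32:
        problems.append("secret_key.txt is shorter than 32 characters")
    return problems


if __name__ == "__main__":
    for line in check_secrets("secrets") or ["secrets/ looks complete"]:
        print(line)
```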
3. Provision Cluster
export KUBECONFIG="$(pwd)/kubeconfig"
./scripts/01-provision-cluster.sh
This script:
- Reads cluster config from config.yaml
- Generates cluster-config.yaml for hetzner-k3s
- Provisions 3x CX33 nodes with HA etcd (5-10 minutes)
- Writes node IPs back into config.yaml
- Labels the Redis node
After provisioning:
kubectl get nodes
4. Create Secrets
./scripts/02-setup-secrets.sh
This reads config.yaml for registry credentials and creates all Kubernetes Secrets from the secrets/ files:
- honeydue-secrets: DB password, app secret, email password, FCM key, Redis password (if configured)
- honeydue-apns-key: APNS .p8 key (mounted as volume in pods)
- ghcr-credentials: GHCR image pull credentials
- cloudflare-origin-cert: TLS certificate for Ingress
- admin-basic-auth: htpasswd secret for admin panel basic auth (if configured)
5. Deploy
Full deploy (build + push + apply):
./scripts/03-deploy.sh
Deploy pre-built images (skip build):
./scripts/03-deploy.sh --skip-build --tag abc1234
The script:
- Reads registry config from config.yaml
- Builds and pushes 3 Docker images to GHCR
- Generates a Kubernetes ConfigMap from config.yaml (converts to flat env vars)
- Applies all manifests with image tag substitution
- Waits for all rollouts to complete
6. Configure Load Balancer & DNS
Hetzner Load Balancer
- Hetzner Console → Load Balancers → Create
- Location: fsn1, add all 3 nodes as targets
- Service: TCP 443 → 443, health check on TCP 443
- Note the LB IP and update load_balancer_ip in config.yaml
Cloudflare DNS
- Cloudflare Dashboard → myhoneydue.com → DNS
  | Type | Name | Content | Proxy |
  |---|---|---|---|
  | A | api | <LB_IP> | Proxied (orange cloud) |
  | A | admin | <LB_IP> | Proxied (orange cloud) |
- SSL/TLS → Overview → Set mode to Full (Strict)
- If you haven't generated the origin cert yet: SSL/TLS → Origin Server → Create Certificate
  - Hostnames: *.myhoneydue.com, myhoneydue.com
  - Validity: 15 years
  - Save to secrets/cloudflare-origin.crt and secrets/cloudflare-origin.key
  - Re-run ./scripts/02-setup-secrets.sh
7. Verify
# Automated cluster health check
./scripts/04-verify.sh
# External health check (after DNS propagation)
curl -v https://api.myhoneydue.com/api/health/
Expected: {"status": "ok"} with HTTP 200.
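The same check can be scripted, for example from a cron job or CI step. The sketch below assumes the response body is exactly the JSON shown above. The function names are hypothetical, and any network error is treated as unhealthy.

```python
import json
import urllib.request


def is_healthy(status_code: int, body: str) -> bool:
    """True when the response is HTTP 200 and the JSON body has status == "ok"."""
    if status_code != 200:
        return False
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        return False
    return isinstance(payload, dict) and payload.get("status") == "ok"


def check(url: str = "https://api.myhoneydue.com/api/health/",
          timeout: float = 10.0) -> bool:
    """Fetch the health endpoint and evaluate it; network errors count as unhealthy."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return is_healthy(resp.status, resp.read().decode("utf-8", "replace"))
    except OSError:  # URLError and HTTPError are OSError subclasses
        return False
```

Calling `check()` returns True or False; non-2xx responses raise HTTPError inside urlopen and therefore come back as False.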
8. Monitoring & Logs
Logs with stern
stern -n honeydue api # All API pod logs
stern -n honeydue worker # All worker logs
stern -n honeydue . # Everything
stern -n honeydue api | grep ERROR # Filter
kubectl logs
kubectl logs -n honeydue deployment/api -f
kubectl logs -n honeydue <pod-name> --previous # Crashed container
Resource usage
kubectl top pods -n honeydue
kubectl top nodes
Optional: Prometheus + Grafana
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update
helm install monitoring prometheus-community/kube-prometheus-stack \
--namespace monitoring --create-namespace \
--set grafana.adminPassword=your-password
# Access Grafana
kubectl port-forward -n monitoring svc/monitoring-grafana 3001:80
# Open http://localhost:3001
9. Scaling
Manual
kubectl scale deployment/api -n honeydue --replicas=5
kubectl scale deployment/worker -n honeydue --replicas=3
HPA (auto-scaling)
API auto-scales 3→6 replicas on CPU > 70% or memory > 80%:
kubectl get hpa -n honeydue
kubectl describe hpa api -n honeydue
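To reason about when the API will scale, the core of the Kubernetes HPA calculation is desired = ceil(current × currentUtilization / targetUtilization), clamped to the min and max replica counts. A simplified sketch using the 3 to 6 range and the 70% CPU target from above; it leaves out the controller's tolerance band and stabilization windows, and the function name is hypothetical.

```python
import math


def desired_replicas(current: int, utilization: float, target: float,
                     min_replicas: int = 3, max_replicas: int = 6) -> int:
    """ceil(current * utilization / target), clamped to [min_replicas, max_replicas]."""
    raw = math.ceil(current * utilization / target)
    return max(min_replicas, min(max_replicas, raw))


print(desired_replicas(3, 90, 70))   # 4: 3 pods at 90% CPU against a 70% target
print(desired_replicas(3, 35, 70))   # 3: low load, held at the minimum
print(desired_replicas(5, 140, 70))  # 6: heavy load, capped at the maximum
```

When both CPU and memory metrics are configured, the HPA computes a count per metric and uses the larger one.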
Adding nodes
Edit config.yaml to add nodes, then re-run provisioning:
./scripts/01-provision-cluster.sh
10. Rollback
./scripts/rollback.sh
The script shows the rollout history, asks for confirmation, then rolls back all deployments to the previous revision.
Single deployment rollback:
kubectl rollout undo deployment/api -n honeydue
11. Backup & DR
| Component | Strategy | Action Required |
|---|---|---|
| PostgreSQL | Neon PITR (automatic) | None |
| Redis | Reconstructible cache + Asynq queue | None |
| etcd | K3s auto-snapshots (12h, keeps 5) | None |
| B2 Storage | B2 versioning + lifecycle rules | Enable in B2 settings |
| Secrets | Local secrets/ + config.yaml | Keep secure offline backup |
Disaster recovery: Re-provision → re-create secrets → re-deploy. Database recovers via Neon PITR.
12. Security
See SECURITY.md for the comprehensive hardening guide, incident response playbooks, and full compliance checklist.
Summary of deployed security controls
| Control | Status | Manifests |
|---|---|---|
| Pod security contexts (non-root, read-only FS, no caps) | Applied | All deployment.yaml |
| Network policies (default-deny + explicit allows) | Applied | manifests/network-policies.yaml |
| RBAC (dedicated SAs, no K8s API access) | Applied | manifests/rbac.yaml |
| Pod disruption budgets | Applied | manifests/pod-disruption-budgets.yaml |
| Redis authentication | Applied (if redis.password set) | redis/deployment.yaml |
| Cloudflare-only origin lockdown | Applied | ingress/ingress.yaml |
| Admin basic auth | Applied (if admin.* set) | ingress/middleware.yaml |
| Security headers (HSTS, CSP, Permissions-Policy) | Applied | ingress/middleware.yaml |
| Secret encryption at rest | K3s config | --secrets-encryption |
Quick checklist
- Hetzner Firewall: allow only 22, 443, 6443 from your IP
- SSH: key-only auth (PasswordAuthentication no)
- redis.password set in config.yaml
- admin.basic_auth_user and admin.basic_auth_password set in config.yaml
- kubeconfig: chmod 600 kubeconfig, never commit
- config.yaml: contains tokens; never commit, keep secure backup
- Image scanning: trivy image or docker scout cves before deploy
- Run ./scripts/04-verify.sh (includes automated security checks)
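The file-permission items can be checked mechanically. Here is a small sketch that flags files readable or writable by group or others. Applying it to config.yaml as well as kubeconfig is an assumption based on the note that it contains tokens, and the function name is hypothetical.

```python
import os
import stat


def too_permissive(path: str) -> bool:
    """True if group or others have any permission bits set on path."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    return bool(mode & (stat.S_IRWXG | stat.S_IRWXO))


if __name__ == "__main__":
    for name in ("kubeconfig", "config.yaml"):
        if os.path.exists(name) and too_permissive(name):
            print(f"{name}: run chmod 600 {name}")
```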
13. Troubleshooting
ImagePullBackOff
kubectl describe pod <pod-name> -n honeydue
# Check: image name, GHCR credentials, image exists
Fix: verify registry.* in config.yaml, re-run 02-setup-secrets.sh.
CrashLoopBackOff
kubectl logs <pod-name> -n honeydue --previous
# Common: missing env vars, DB connection failure, invalid APNS key
Redis connection refused / NOAUTH
kubectl get pods -n honeydue -l app.kubernetes.io/name=redis
# If redis.password is set, you must authenticate:
kubectl exec -it deploy/redis -n honeydue -- redis-cli -a "$REDIS_PASSWORD" ping
# Without -a: (error) NOAUTH Authentication required.
Health check failures
kubectl exec -it deploy/api -n honeydue -- curl -v http://localhost:8000/api/health/
kubectl exec -it deploy/api -n honeydue -- env | sort
Pods stuck in Pending
kubectl describe pod <pod-name> -n honeydue
# For Redis: ensure a node has label honeydue/redis=true
kubectl get nodes --show-labels | grep redis
DNS not resolving
dig api.myhoneydue.com +short
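# Proxied (orange cloud) records resolve to Cloudflare IPs, not the LB IP.
# Verify the A record content in the Cloudflare dashboard matches load_balancer_ip in config.yaml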
Certificate / TLS errors
kubectl get secret cloudflare-origin-cert -n honeydue
kubectl describe ingress honeydue -n honeydue
curl -vk --resolve api.myhoneydue.com:443:<NODE_IP> https://api.myhoneydue.com/api/health/