honeyDue — Production Security Hardening Guide
Comprehensive security documentation for the honeyDue K3s deployment. Covers every layer from cloud provider to application.
Last updated: 2026-03-28
Table of Contents
- Threat Model
- Hetzner Cloud (Host)
- K3s Cluster
- Pod Security
- Network Segmentation
- Redis
- PostgreSQL (Neon)
- Cloudflare
- Container Images
- Secrets Management
- B2 Object Storage
- Monitoring & Alerting
- Incident Response
- Compliance Checklist
1. Threat Model
What We're Protecting
| Asset | Impact if Compromised |
|---|---|
| User credentials (bcrypt hashes) | Account takeover, password reuse attacks |
| Auth tokens | Session hijacking |
| Personal data (email, name, residences) | Privacy violation, regulatory exposure |
| Push notification keys (APNs, FCM) | Spam push to all users, key revocation |
| Cloudflare origin cert | Direct TLS impersonation |
| Database credentials | Full data exfiltration |
| Redis data | Session replay, job queue manipulation |
| B2 storage keys | Document theft or deletion |
Attack Surface
Internet
    │
    ▼
Cloudflare (WAF, DDoS protection, TLS termination)
    │
    ▼ (origin cert, Full Strict)
Hetzner Cloud Firewall (ports 22, 443, 6443)
    │
    ▼
K3s Traefik Ingress (Cloudflare-only IP allowlist)
    │
    ├──► API pods (Go) ──► Neon PostgreSQL (external, TLS)
    │                  ──► Redis (internal, authenticated)
    │                  ──► APNs/FCM (external, TLS)
    │                  ──► B2 Storage (external, TLS)
    │                  ──► SMTP (external, TLS)
    │
    ├──► Admin pods (Next.js) ──► API pods (internal)
    │
    └──► Worker pods (Go) ──► same as API
Trust Boundaries
- Internet → Cloudflare: Untrusted. Cloudflare handles DDoS, WAF, TLS.
- Cloudflare → Origin: Semi-trusted. Origin cert validates, IP allowlist enforces.
- Ingress → Pods: Trusted network, but segmented by NetworkPolicy.
- Pods → External Services: Outbound only, TLS required, credentials scoped.
- Pods → K8s API: Denied. Service accounts have no permissions.
2. Hetzner Cloud (Host)
Firewall Rules
Only three ports should be open on the Hetzner Cloud Firewall:
| Port | Protocol | Source | Purpose |
|---|---|---|---|
| 22 | TCP | Your IP(s) only | SSH management |
| 443 | TCP | Cloudflare IPs only | HTTPS traffic |
| 6443 | TCP | Your IP(s) only | K3s API (kubectl) |
# Verify Hetzner firewall rules (Hetzner CLI)
hcloud firewall describe honeydue-fw
SSH Hardening
- Key-only authentication: password auth disabled in /etc/ssh/sshd_config
- Root login disabled: PermitRootLogin no
- fail2ban active: auto-bans IPs after 5 failed SSH attempts
# Verify SSH config on each node
ssh user@NODE_IP "grep -E 'PasswordAuthentication|PermitRootLogin' /etc/ssh/sshd_config"
# Expected: PasswordAuthentication no, PermitRootLogin no
# Check fail2ban status
ssh user@NODE_IP "sudo fail2ban-client status sshd"
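The grep above also matches commented-out lines, so a stricter check can be useful. The following is a minimal sketch, not part of the deployment scripts: the helper name check_sshd_config is hypothetical, and it assumes the settings arrive on stdin as "Keyword value" lines.

```shell
# check_sshd_config (hypothetical helper): read sshd settings on stdin and
# succeed only if PasswordAuthentication and PermitRootLogin are both "no".
# sshd honours the first occurrence of a keyword, so later duplicates are ignored.
# Commented lines never match because their first field starts with "#".
check_sshd_config() {
  awk '
    tolower($1) == "passwordauthentication" && !seen_pa  { pa  = tolower($2); seen_pa  = 1 }
    tolower($1) == "permitrootlogin"        && !seen_prl { prl = tolower($2); seen_prl = 1 }
    END { exit !(pa == "no" && prl == "no") }
  '
}
```

One way to use it is to feed it the effective configuration, which also covers included drop-in files: ssh user@NODE_IP "sudo sshd -T" | check_sshd_config && echo "SSH hardening OK".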
OS Updates
# Enable unattended security updates (Ubuntu 24.04)
ssh user@NODE_IP "sudo apt install unattended-upgrades && sudo dpkg-reconfigure -plow unattended-upgrades"
3. K3s Cluster
Secret Encryption at Rest
K3s is configured with secrets-encryption: true in the server config. This encrypts all Secret resources in etcd using AES-CBC.
# Verify encryption is active
k3s secrets-encrypt status
# Expected: Encryption Status: Enabled
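# Rotate encryption keys (do periodically)
# Newer K3s releases: rotate-keys creates a new key and re-encrypts all secrets in one step
k3s secrets-encrypt rotate-keys
# Older releases use the staged flow instead (restart K3s between stages):
# k3s secrets-encrypt prepare, then rotate, then reencrypt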
RBAC
Each workload has a dedicated ServiceAccount with automountServiceAccountToken: false:
| ServiceAccount | Used By | K8s API Access |
|---|---|---|
| api | API deployment | None |
| worker | Worker deployment | None |
| admin | Admin deployment | None |
| redis | Redis deployment | None |
No Roles or RoleBindings are created — pods have zero K8s API access.
# Verify service accounts exist
kubectl get sa -n honeydue
# Verify no roles are bound
kubectl get rolebindings -n honeydue
kubectl get clusterrolebindings | grep honeydue
# Expected: no results
Pod Disruption Budgets
Prevent node maintenance from taking down all replicas:
| Workload | Replicas | minAvailable |
|---|---|---|
| API | 3 | 2 |
| Worker | 2 | 1 |
# Verify PDBs
kubectl get pdb -n honeydue
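The arithmetic behind the table can be sketched as follows. The helper name allowed_disruptions is hypothetical, and the calculation assumes every replica is currently healthy; Kubernetes itself works from the number of healthy pods, which may be lower.

```shell
# allowed_disruptions REPLICAS MIN_AVAILABLE (hypothetical helper)
# Prints how many pods may be evicted voluntarily at the same time, floored at 0.
allowed_disruptions() {
  n=$(( $1 - $2 ))
  if [ "$n" -lt 0 ]; then
    n=0
  fi
  echo "$n"
}

allowed_disruptions 3 2   # API: prints 1
allowed_disruptions 2 1   # Worker: prints 1
```

With these values a node drain can evict at most one API pod and one worker pod at a time, so each workload keeps serving during maintenance.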
Audit Logging (Optional Enhancement)
K3s supports audit logging for API server requests:
# Add to K3s server config for detailed audit logging
# /etc/rancher/k3s/audit-policy.yaml
apiVersion: audit.k8s.io/v1
kind: Policy
rules:
- level: Metadata
resources:
- group: ""
resources: ["secrets", "configmaps"]
- level: RequestResponse
users: ["system:anonymous"]
- level: None
resources:
- group: ""
resources: ["events"]
WireGuard (Optional Enhancement)
K3s supports WireGuard for encrypting inter-node traffic:
# Enable WireGuard on K3s (add to server args)
# --flannel-backend=wireguard-native
4. Pod Security
Security Contexts
Every pod runs with these security restrictions:
Pod-level:
securityContext:
runAsNonRoot: true
runAsUser: <uid> # 1000 (api/worker), 1001 (admin), 999 (redis)
runAsGroup: <gid>
fsGroup: <gid>
seccompProfile:
type: RuntimeDefault # Linux kernel syscall filtering
Container-level:
securityContext:
allowPrivilegeEscalation: false # Cannot gain more privileges than parent
readOnlyRootFilesystem: true # Filesystem is immutable
capabilities:
drop: ["ALL"] # No Linux capabilities
Writable Directories
With readOnlyRootFilesystem: true, writable paths use emptyDir volumes:
| Pod | Path | Purpose | Backing |
|---|---|---|---|
| API | /tmp | Temp files | emptyDir (64Mi) |
| Worker | /tmp | Temp files | emptyDir (64Mi) |
| Admin | /app/.next/cache | Next.js ISR cache | emptyDir (256Mi) |
| Admin | /tmp | Temp files | emptyDir (64Mi) |
| Redis | /data | Persistence | PVC (5Gi) |
| Redis | /tmp | AOF rewrite temp | emptyDir tmpfs (64Mi) |
User IDs
| Container | UID:GID | Source |
|---|---|---|
| API | 1000:1000 | Dockerfile app user |
| Worker | 1000:1000 | Dockerfile app user |
| Admin | 1001:1001 | Dockerfile nextjs user |
| Redis | 999:999 | Alpine redis user |
# Verify all pods run as non-root
kubectl get pods -n honeydue -o jsonpath='{range .items[*]}{.metadata.name}{" runAsNonRoot="}{.spec.securityContext.runAsNonRoot}{"\n"}{end}'
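To turn that output into a pass/fail check, a small filter along these lines may help. It is a sketch only: pods_missing_nonroot is a hypothetical name, and it assumes the exact "<pod> runAsNonRoot=<value>" line format printed by the command above. It inspects the pod-level setting only, so a value set per container would be reported as missing.

```shell
# pods_missing_nonroot (hypothetical helper): read "<pod> runAsNonRoot=<value>"
# lines on stdin and print every pod whose value is not exactly "true".
# Exits non-zero if at least one such pod is found.
pods_missing_nonroot() {
  awk '
    NF == 0 { next }                                   # skip blank lines
    $2 != "runAsNonRoot=true" { print $1; bad = 1 }    # unset or false
    END { exit bad + 0 }
  '
}
```

Piping the jsonpath command into pods_missing_nonroot prints nothing and exits 0 when every pod is compliant.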
5. Network Segmentation
Default-Deny Policy
All ingress and egress traffic in the honeydue namespace is denied by default. Explicit NetworkPolicy rules allow only necessary traffic.
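A default-deny policy of this kind usually looks like the manifest below. This is a sketch rather than the deployed file: the policy name default-deny-all and the honeydue namespace come from this guide, while the wrapper function default_deny_manifest is hypothetical and exists only so the manifest can be printed and reviewed before applying.

```shell
# default_deny_manifest (hypothetical helper): print a NetworkPolicy that
# selects every pod in the honeydue namespace and allows no traffic.
default_deny_manifest() {
  cat <<'EOF'
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all
  namespace: honeydue
spec:
  podSelector: {}      # empty selector = all pods in the namespace
  policyTypes:         # both types listed with no rules = deny both directions
    - Ingress
    - Egress
EOF
}

# Review the output, then pipe it to: kubectl apply -f -
default_deny_manifest
```

Because NetworkPolicies are additive, every allow policy applied alongside this one opens only the specific path it describes.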
Allowed Traffic
                ┌─────────────┐
                │   Traefik   │
                │(kube-system)│
                └──────┬──────┘
                       │
            ┌──────────┼──────────┐
            │          │          │
            ▼          ▼          │
        ┌────────┐ ┌────────┐     │
        │  API   │ │ Admin  │     │
        │ :8000  │ │ :3000  │     │
        └───┬────┘ └────┬───┘     │
            │           │         │
    ┌───────┤           │         │
    │       │           │         │
    ▼       ▼           ▼         │
┌───────┐ ┌────────┐ ┌────────┐   │
│ Redis │ │External│ │  API   │   │
│ :6379 │ │Services│ │(in-clr)│   │
└───────┘ └────────┘ └────────┘   │
    ▲                             │
    │       ┌────────┐            │
    └───────│ Worker │────────────┘
            └────────┘
| Policy | From | To | Ports |
|---|---|---|---|
| default-deny-all | all | all | none |
| allow-dns | all pods | kube-dns | 53 UDP/TCP |
| allow-ingress-to-api | Traefik (kube-system) | API pods | 8000 |
| allow-ingress-to-admin | Traefik (kube-system) | Admin pods | 3000 |
| allow-ingress-to-redis | API + Worker pods | Redis | 6379 |
| allow-egress-from-api | API pods | Redis, external (443, 5432, 587) | various |
| allow-egress-from-worker | Worker pods | Redis, external (443, 5432, 587) | various |
| allow-egress-from-admin | Admin pods | API pods (in-cluster) | 8000 |
Key restrictions:
- Redis is reachable ONLY from API and Worker pods
- Admin can ONLY reach the API service (no direct DB/Redis access)
- No pod can reach private IP ranges except in-cluster services
- External egress limited to specific ports (443, 5432, 587)
# Verify network policies
kubectl get networkpolicy -n honeydue
# Test: admin pod should NOT be able to reach Redis
kubectl exec -n honeydue deploy/admin -- nc -zv redis.honeydue.svc.cluster.local 6379
# Expected: timeout/refused
6. Redis
Authentication
Redis requires a password when redis.password is set in config.yaml:
- Password passed via REDIS_PASSWORD environment variable from honeydue-secrets
- Redis starts with --requirepass $REDIS_PASSWORD
- Health probes authenticate with -a $REDIS_PASSWORD
- Go API connects via redis://:PASSWORD@redis.honeydue.svc.cluster.local:6379/0
Network Isolation
- Redis has no Ingress — not exposed outside the cluster
- NetworkPolicy restricts access to API and Worker pods only
- Admin pods cannot reach Redis
Memory Limits
- --maxmemory 256mb: hard cap on Redis memory
- --maxmemory-policy noeviction: returns errors rather than silently evicting data
- K8s resource limit: 512Mi (headroom for AOF rewrite)
Dangerous Command Renaming (Optional Enhancement)
For additional protection, rename dangerous commands in a custom redis.conf:
rename-command FLUSHDB ""
rename-command FLUSHALL ""
rename-command DEBUG ""
rename-command CONFIG "HONEYDUE_CONFIG_a7f3b"
# Verify Redis auth is required
kubectl exec -n honeydue deploy/redis -- redis-cli ping
# Expected: (error) NOAUTH Authentication required.
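# Run through sh inside the container so $REDIS_PASSWORD expands there, not in your local shell
kubectl exec -n honeydue deploy/redis -- sh -c 'redis-cli -a "$REDIS_PASSWORD" ping'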
# Expected: PONG
7. PostgreSQL (Neon)
Connection Security
- SSL required: sslmode=require in connection string
- Connection limits: max_open_conns=25, max_idle_conns=10
- Scoped credentials: Database user has access only to honeydue database
- Password rotation: Change in Neon dashboard, update secrets/postgres_password.txt, re-run 02-setup-secrets.sh
Access Control
- Only API and Worker pods have egress to port 5432 (NetworkPolicy enforced)
- Admin pods cannot reach the database directly
- Redis pods have no external egress
# Verify only API/Worker can reach Neon
kubectl exec -n honeydue deploy/admin -- nc -zv ep-xxx.us-east-2.aws.neon.tech 5432
# Expected: timeout (blocked by network policy)
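The sslmode requirement can also be checked before a connection string is stored as a secret. The sketch below assumes a libpq-style DSN; the helper name dsn_enforces_tls and the example host are hypothetical.

```shell
# dsn_enforces_tls DSN (hypothetical helper): succeed if the DSN sets sslmode
# to a value that refuses unencrypted connections.
dsn_enforces_tls() {
  case "$1" in
    *sslmode=require*|*sslmode=verify-ca*|*sslmode=verify-full*) return 0 ;;
    *) return 1 ;;
  esac
}

# Example with a placeholder DSN (not a real endpoint)
dsn_enforces_tls "postgres://app:secret@db.example.com:5432/honeydue?sslmode=require" \
  && echo "TLS enforced"
```

Note that sslmode=require encrypts the connection without verifying the server certificate; verify-full additionally checks the certificate chain and hostname.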
Query Safety
- GORM uses parameterized queries (SQL injection prevention)
- No raw SQL in handlers — all queries go through repositories
- Decimal fields use shopspring/decimal (no floating-point errors)
8. Cloudflare
TLS Configuration
- Mode: Full (Strict) — Cloudflare validates the origin certificate
- Origin cert: Stored as K8s Secret cloudflare-origin-cert
- Minimum TLS: 1.2 (set in Cloudflare dashboard)
- HSTS: Enabled via security headers middleware
Origin Lockdown
The cloudflare-only Traefik middleware restricts all ingress to Cloudflare IP ranges only. Direct requests to the origin IP are rejected with 403.
# Test: direct request to origin should fail
curl -k https://ORIGIN_IP/api/health/
# Expected: 403 Forbidden
# Test: request through Cloudflare should work
curl https://api.myhoneydue.com/api/health/
# Expected: 200 OK
Cloudflare IP Range Updates
Cloudflare IP ranges change infrequently but should be checked periodically:
# Compare current ranges with deployed middleware
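# echo guards against a missing trailing newline; sort makes the comparison order-independent
diff <({ curl -s https://www.cloudflare.com/ips-v4; echo; curl -s https://www.cloudflare.com/ips-v6; echo; } | grep . | sort) \
<(kubectl get middleware cloudflare-only -n honeydue -o jsonpath='{.spec.ipAllowList.sourceRange[*]}' | tr ' ' '\n' | grep . | sort)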
WAF & Rate Limiting
- Cloudflare WAF: Enable managed rulesets in dashboard (OWASP Core, Cloudflare Specials)
- Rate limiting: Traefik middleware (100 req/min, burst 200) + Go API auth rate limiting
- Bot management: Enable in Cloudflare dashboard for API routes
Security Headers
Applied via Traefik middleware to all responses:
| Header | Value |
|---|---|
| Strict-Transport-Security | max-age=31536000; includeSubDomains |
| X-Frame-Options | DENY |
| X-Content-Type-Options | nosniff |
| X-XSS-Protection | 1; mode=block |
| Referrer-Policy | strict-origin-when-cross-origin |
| Content-Security-Policy | default-src 'self'; frame-ancestors 'none' |
| Permissions-Policy | camera=(), microphone=(), geolocation=() |
| X-Permitted-Cross-Domain-Policies | none |
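A quick presence check for these headers could look like the sketch below. The function name missing_security_headers is hypothetical, and it only checks that each header in the table exists in the response, not that its value matches.

```shell
# missing_security_headers (hypothetical helper): read HTTP response headers
# on stdin (for example from `curl -sI`) and print each expected header that
# is absent, one per line. Names are matched case-insensitively.
# Returns non-zero if anything is missing.
missing_security_headers() {
  headers=$(tr -d '\r' | tr '[:upper:]' '[:lower:]')
  rc=0
  for name in strict-transport-security x-frame-options x-content-type-options \
              x-xss-protection referrer-policy content-security-policy \
              permissions-policy x-permitted-cross-domain-policies; do
    if ! printf '%s\n' "$headers" | grep -q "^${name}:"; then
      echo "$name"
      rc=1
    fi
  done
  return "$rc"
}
```

For example: curl -sI https://api.myhoneydue.com/api/health/ | missing_security_headers prints nothing when all eight headers are present.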
9. Container Images
Build Security
- Multi-stage builds: Build stage discarded, only runtime artifacts copied
- Alpine base: Minimal attack surface (~5MB base)
- Non-root users: app:1000 (Go), nextjs:1001 (admin)
- Stripped binaries: Go binaries built with -ldflags "-s -w" (no debug symbols)
- No shell in final image (Go containers): Only the binary + CA certs
Image Scanning (Recommended)
Add image scanning to CI/CD before pushing to GHCR:
# Trivy scan (run in CI)
trivy image --severity HIGH,CRITICAL --exit-code 1 ghcr.io/NAMESPACE/honeydue-api:latest
# Grype alternative
grype ghcr.io/NAMESPACE/honeydue-api:latest --fail-on high
Version Pinning
- Redis image: redis:7-alpine (pin to specific tag in production, e.g., redis:7.4.2-alpine)
- Go base: pinned in Dockerfile
- Node base: pinned in admin Dockerfile
10. Secrets Management
At-Rest Encryption
K3s encrypts all Secret resources in etcd with AES-CBC (--secrets-encryption flag).
Secret Inventory
| Secret | Contains | Rotation Procedure |
|---|---|---|
| honeydue-secrets | DB password, SECRET_KEY, SMTP password, FCM key, Redis password | Update source files + re-run 02-setup-secrets.sh |
| honeydue-apns-key | APNs .p8 private key | Replace file + re-run 02-setup-secrets.sh |
| cloudflare-origin-cert | TLS cert + key | Regenerate in Cloudflare + re-run 02-setup-secrets.sh |
| ghcr-credentials | Registry PAT | Regenerate GitHub PAT + re-run 02-setup-secrets.sh |
| admin-basic-auth | htpasswd hash | Update config.yaml + re-run 02-setup-secrets.sh |
Rotation Procedure
# 1. Update the secret source (file or config.yaml value)
# 2. Re-run the secrets script
./scripts/02-setup-secrets.sh
# 3. Restart affected pods to pick up new secret values
kubectl rollout restart deployment/api deployment/worker -n honeydue
# 4. Verify pods are healthy
kubectl get pods -n honeydue -w
Secret Hygiene
- secrets/ directory is gitignored (never committed)
- config.yaml is gitignored (never committed)
- Scripts validate secret files exist and aren't empty before creating K8s secrets
- SECRET_KEY requires minimum 32 characters
- ConfigMap redacts sensitive values in 04-verify.sh output
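The existence, non-empty, and minimum-length checks can be sketched as below. This illustrates the idea and is not the actual logic in 02-setup-secrets.sh; the function name validate_secret_file is hypothetical.

```shell
# validate_secret_file PATH [MIN_LENGTH] (hypothetical helper)
# Fails if the file is missing, empty, or shorter than MIN_LENGTH characters.
# Trailing newlines are not counted toward the length.
validate_secret_file() {
  path=$1
  min_len=${2:-1}
  if [ ! -s "$path" ]; then
    echo "error: $path is missing or empty" >&2
    return 1
  fi
  value=$(cat "$path")   # command substitution strips trailing newlines
  if [ "${#value}" -lt "$min_len" ]; then
    echo "error: $path has ${#value} characters, need at least $min_len" >&2
    return 1
  fi
  return 0
}
```

For the SECRET_KEY rule this would be called as validate_secret_file secrets/secret_key.txt 32 before any kubectl create secret step.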
11. B2 Object Storage
Access Control
- Scoped application key: Create a B2 key with access to only the honeydue bucket
- Permissions: Read + Write only (no deleteFiles, no listAllBucketNames)
- Bucket-only: Key cannot access other buckets in the account
# Create scoped B2 key (Backblaze CLI)
b2 create-key --bucket BUCKET_NAME honeydue-api readFiles,writeFiles,listFiles
Upload Validation (Go API)
- File size limit: STORAGE_MAX_FILE_SIZE (10MB default)
- Allowed MIME types: STORAGE_ALLOWED_TYPES (images + PDF only)
- Path traversal protection in upload handler
- Files served via authenticated proxy (media_handler): no direct B2 URLs exposed to clients
Versioning
Enable B2 bucket versioning to protect against accidental deletion:
# Enable versioning on the B2 bucket
b2 update-bucket --versioning enabled BUCKET_NAME
12. Monitoring & Alerting
Log Aggregation
Pod logs are available via kubectl logs. They are lost when a pod is deleted, so the commands below cover live inspection only:
# View API logs
kubectl logs -n honeydue -l app.kubernetes.io/name=api --tail=100 -f
# View worker logs
kubectl logs -n honeydue -l app.kubernetes.io/name=worker --tail=100 -f
# View all warning events
kubectl get events -n honeydue --field-selector type=Warning --sort-by='.lastTimestamp'
Recommended: Deploy Loki + Grafana for persistent log search and alerting.
Health Monitoring
# Continuous health monitoring
watch -n 10 "kubectl get pods -n honeydue -o wide && echo && kubectl top pods -n honeydue 2>/dev/null"
# Check pod restart counts (indicator of crashes)
kubectl get pods -n honeydue -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{range .status.containerStatuses[*]}{.restartCount}{end}{"\n"}{end}'
Alerting Thresholds
| Metric | Warning | Critical | Check Command |
|---|---|---|---|
| Pod restarts | > 3 in 1h | > 10 in 1h | kubectl get pods |
| API response time | > 500ms p95 | > 2s p95 | Cloudflare Analytics |
| Memory usage | > 80% limit | > 95% limit | kubectl top pods |
| Redis memory | > 200MB | > 250MB | redis-cli info memory |
| Disk (PVC) | > 80% | > 95% | kubectl exec ... df -h |
| Certificate expiry | < 30 days | < 7 days | Cloudflare dashboard |
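For the integer thresholds in the table, the warning/critical split can be expressed as a small helper. The name classify_threshold is hypothetical, and the caller is assumed to supply a value already measured over the right window; restartCount from kubectl is cumulative, so a per-hour figure needs two samples.

```shell
# classify_threshold VALUE WARN CRIT (hypothetical helper)
# Prints "critical" if VALUE > CRIT, "warning" if VALUE > WARN, else "ok".
# Integer values only.
classify_threshold() {
  if [ "$1" -gt "$3" ]; then
    echo critical
  elif [ "$1" -gt "$2" ]; then
    echo warning
  else
    echo ok
  fi
}

classify_threshold 2 3 10      # pod restarts in the last hour: ok
classify_threshold 5 3 10      # warning
classify_threshold 260 200 250 # Redis memory in MB: critical
```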
Audit Trail
- K8s events: kubectl get events -n honeydue (auto-pruned after 1h)
- Go API: zerolog structured logging with credential masking
- Cloudflare: Access logs, WAF logs, rate limiting logs in dashboard
- Hetzner: SSH auth logs in /var/log/auth.log
13. Incident Response
Playbook: Compromised API Token
# 1. Rotate SECRET_KEY to invalidate ALL tokens
echo "$(openssl rand -hex 32)" > secrets/secret_key.txt
./scripts/02-setup-secrets.sh
kubectl rollout restart deployment/api deployment/worker -n honeydue
# 2. All users will need to re-authenticate
Playbook: Compromised Database Credentials
# 1. Rotate password in Neon dashboard
# 2. Update local secret file
echo "NEW_PASSWORD" > secrets/postgres_password.txt
./scripts/02-setup-secrets.sh
kubectl rollout restart deployment/api deployment/worker -n honeydue
# 3. Monitor for connection errors
kubectl logs -n honeydue -l app.kubernetes.io/name=api --tail=50 -f
Playbook: Compromised Push Notification Keys
# APNs: Revoke key in Apple Developer Console, generate new .p8
cp new_key.p8 secrets/apns_auth_key.p8
./scripts/02-setup-secrets.sh
kubectl rollout restart deployment/api deployment/worker -n honeydue
# FCM: Rotate server key in Firebase Console
echo "NEW_FCM_KEY" > secrets/fcm_server_key.txt
./scripts/02-setup-secrets.sh
kubectl rollout restart deployment/api deployment/worker -n honeydue
Playbook: Suspicious Pod Behavior
# 1. Isolate the pod (remove from service)
kubectl label pod SUSPICIOUS_POD -n honeydue app.kubernetes.io/name-
# 2. Capture state for investigation
kubectl logs SUSPICIOUS_POD -n honeydue > /tmp/suspicious-logs.txt
kubectl describe pod SUSPICIOUS_POD -n honeydue > /tmp/suspicious-describe.txt
# 3. Delete the isolated pod (if the removed label is part of the Deployment selector, a replacement was already created in step 1)
kubectl delete pod SUSPICIOUS_POD -n honeydue
Communication Plan
- Internal: Document incident timeline in a private channel
- Users: If data breach — notify affected users within 72 hours
- Vendors: Revoke/rotate all potentially compromised credentials
- Post-mortem: Document root cause, timeline, remediation, prevention
14. Compliance Checklist
Run through this checklist before production launch and periodically thereafter.
Infrastructure
- Hetzner firewall allows only ports 22, 443, 6443
- SSH password auth disabled on all nodes
- fail2ban active on all nodes
- OS security updates enabled (unattended-upgrades)
# Verify
hcloud firewall describe honeydue-fw
ssh user@NODE "grep PasswordAuthentication /etc/ssh/sshd_config"
ssh user@NODE "sudo fail2ban-client status sshd"
K3s Cluster
- Secret encryption enabled
- Service accounts created with no API access
- Pod disruption budgets deployed
- No default service account used by workloads
# Verify
k3s secrets-encrypt status
kubectl get sa -n honeydue
kubectl get pdb -n honeydue
kubectl get pods -n honeydue -o jsonpath='{range .items[*]}{.metadata.name}{" sa="}{.spec.serviceAccountName}{"\n"}{end}'
Pod Security
- All pods: runAsNonRoot: true
- All containers: allowPrivilegeEscalation: false
- All containers: readOnlyRootFilesystem: true
- All containers: capabilities.drop: ["ALL"]
- All pods: seccompProfile.type: RuntimeDefault
# Verify (automated check in 04-verify.sh)
./scripts/04-verify.sh
Network
- Default-deny NetworkPolicy applied
- 8+ explicit allow policies deployed
- Redis only reachable from API + Worker
- Admin only reaches API service
- Cloudflare-only middleware applied to all ingress
# Verify
kubectl get networkpolicy -n honeydue
kubectl get ingress -n honeydue -o yaml | grep cloudflare-only
Authentication & Authorization
- Redis requires password
- Admin panel has basic auth layer
- API uses bcrypt for passwords
- Auth tokens have expiration
- Rate limiting on auth endpoints
# Verify Redis auth
kubectl exec -n honeydue deploy/redis -- redis-cli ping
# Expected: NOAUTH error
# Verify admin auth
kubectl get secret admin-basic-auth -n honeydue
Secrets
- All secrets stored as K8s Secrets (not ConfigMap)
- Secrets encrypted at rest (K3s)
- No secrets in git history
- SECRET_KEY >= 32 characters
- Secret rotation documented
# Verify no secrets in ConfigMap
kubectl get configmap honeydue-config -n honeydue -o yaml | grep -iE 'password|secret|token|key'
# Should show only non-sensitive config keys (EMAIL_HOST, APNS_KEY_ID, etc.)
TLS & Headers
- Cloudflare Full (Strict) mode enabled
- Origin cert valid and not expired
- HSTS header present with includeSubDomains
- CSP header: default-src 'self'; frame-ancestors 'none'
- Permissions-Policy blocks camera/mic/geo
- X-Frame-Options: DENY
# Verify headers (via Cloudflare)
curl -sI https://api.myhoneydue.com/api/health/ | grep -iE 'strict-transport|content-security|permissions-policy|x-frame'
Container Images
- Multi-stage Dockerfile (no build tools in runtime)
- Non-root user in all images
- Alpine base (minimal surface)
- No secrets baked into images
# Verify non-root
kubectl get pods -n honeydue -o jsonpath='{range .items[*]}{.metadata.name}{" uid="}{.spec.securityContext.runAsUser}{"\n"}{end}'
External Services
- PostgreSQL: sslmode=require
- B2: Scoped application key (single bucket)
- APNs: .p8 key (not .p12 certificate)
- SMTP: TLS enabled (use_tls: true)
Quick Reference Commands
# Full security verification
./scripts/04-verify.sh
# Rotate all secrets
./scripts/02-setup-secrets.sh && \
kubectl rollout restart deployment/api deployment/worker deployment/admin -n honeydue
# Check for security events
kubectl get events -n honeydue --field-selector type=Warning
# Emergency: scale down everything
kubectl scale deployment --all -n honeydue --replicas=0
# Emergency: restore
# Bring Redis back first so API and worker pods find it on startup
kubectl scale deployment redis -n honeydue --replicas=1
kubectl scale deployment api -n honeydue --replicas=3
kubectl scale deployment worker -n honeydue --replicas=2
kubectl scale deployment admin -n honeydue --replicas=1