Add K3s dev deployment setup for single-node VPS
Mirrors the prod deploy-k3s/ setup but runs all services in-cluster on a single node: PostgreSQL (replaces Neon), MinIO S3-compatible storage (replaces B2), Redis, API, worker, and admin. Includes fully automated setup scripts (00-init through 04-verify), server hardening (SSH, fail2ban, ufw), Let's Encrypt TLS via Traefik, network policies, RBAC, and security contexts matching prod.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
391
deploy-k3s/README.md
Normal file
@@ -0,0 +1,391 @@
# honeyDue: K3s Production Deployment

Production Kubernetes deployment for honeyDue on Hetzner Cloud using K3s.

**Architecture**: 3-node HA K3s cluster (CX33), Neon Postgres, Redis (in-cluster), Backblaze B2 (uploads), Cloudflare CDN/TLS.

**Domains**: `api.myhoneydue.com`, `admin.myhoneydue.com`

---

## Quick Start

```bash
cd honeyDueAPI-go/deploy-k3s

# 1. Fill in the single config file
cp config.yaml.example config.yaml
# Edit config.yaml: fill in ALL empty values

# 2. Create secret files
# See secrets/README.md for the full list
echo "your-neon-password" > secrets/postgres_password.txt
openssl rand -base64 48 > secrets/secret_key.txt
echo "your-smtp-password" > secrets/email_host_password.txt
echo "your-fcm-key" > secrets/fcm_server_key.txt
cp /path/to/AuthKey.p8 secrets/apns_auth_key.p8
cp /path/to/origin.pem secrets/cloudflare-origin.crt
cp /path/to/origin-key.pem secrets/cloudflare-origin.key

# 3. Provision → Secrets → Deploy
./scripts/01-provision-cluster.sh
./scripts/02-setup-secrets.sh
./scripts/03-deploy.sh

# 4. Set up Hetzner LB + Cloudflare DNS (see sections below)

# 5. Verify
./scripts/04-verify.sh
curl https://api.myhoneydue.com/api/health/
```

That's it. Everything reads from `config.yaml` + `secrets/`.

---

## Table of Contents

1. [Prerequisites](#1-prerequisites)
2. [Configuration](#2-configuration)
3. [Provision Cluster](#3-provision-cluster)
4. [Create Secrets](#4-create-secrets)
5. [Deploy](#5-deploy)
6. [Configure Load Balancer & DNS](#6-configure-load-balancer--dns)
7. [Verify](#7-verify)
8. [Monitoring & Logs](#8-monitoring--logs)
9. [Scaling](#9-scaling)
10. [Rollback](#10-rollback)
11. [Backup & DR](#11-backup--dr)
12. [Security](#12-security)
13. [Troubleshooting](#13-troubleshooting)

---

## 1. Prerequisites

| Tool | Install | Purpose |
|------|---------|---------|
| `hetzner-k3s` | `gem install hetzner-k3s` | Cluster provisioning |
| `kubectl` | https://kubernetes.io/docs/tasks/tools/ | Cluster management |
| `helm` | https://helm.sh/docs/intro/install/ | Optional: Prometheus/Grafana |
| `stern` | `brew install stern` | Multi-pod log tailing |
| `docker` | https://docs.docker.com/get-docker/ | Image building |
| `python3` | Pre-installed on macOS | Config parsing |
| `htpasswd` | `brew install httpd` or `apt install apache2-utils` | Admin basic auth secret |

Verify:

```bash
hetzner-k3s version && kubectl version --client && docker version && python3 --version
```
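
The one-liner above stops at the first failing tool and does not cover `helm`, `stern`, or `htpasswd`. As a rough sketch (the `missing_tools` helper is hypothetical and not part of `scripts/`), a preflight check could list everything that is absent from `PATH` in one pass:

```bash
#!/usr/bin/env bash
# Print the name of every listed tool that is not on PATH, one per line.
# Hypothetical helper; it only checks presence, not versions.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Tool names are taken from the prerequisites table.
missing=$(missing_tools hetzner-k3s kubectl helm stern docker python3 htpasswd)
if [ -n "$missing" ]; then
  printf 'Missing tools:\n%s\n' "$missing" >&2
fi
```

`helm` is optional here, so a report that lists only `helm` can be ignored unless you plan to install Prometheus/Grafana.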

## 2. Configuration

There are two things to fill in:

### config.yaml: all string configuration

```bash
cp config.yaml.example config.yaml
```

Open `config.yaml` and fill in every empty `""` value:

| Section | What to fill in |
|---------|----------------|
| `cluster.hcloud_token` | Hetzner API token (Read/Write), generated at console.hetzner.cloud |
| `registry.*` | GHCR credentials (same as Docker Swarm setup) |
| `database.host`, `database.user` | Neon PostgreSQL connection info |
| `email.user` | Fastmail email address |
| `push.apns_key_id`, `push.apns_team_id` | Apple Push Notification identifiers |
| `storage.b2_*` | Backblaze B2 bucket and credentials |
| `redis.password` | Strong password for Redis authentication (required for production) |
| `admin.basic_auth_user` | HTTP basic auth username for admin panel |
| `admin.basic_auth_password` | HTTP basic auth password for admin panel |

Everything else has sensible defaults. `config.yaml` is gitignored.

### secrets/: file-based secrets

These are binary or multi-line files that can't go in YAML:

| File | Source |
|------|--------|
| `secrets/postgres_password.txt` | Your Neon database password |
| `secrets/secret_key.txt` | `openssl rand -base64 48` (min 32 chars) |
| `secrets/email_host_password.txt` | Fastmail app password |
| `secrets/fcm_server_key.txt` | Firebase console → Project Settings → Cloud Messaging |
| `secrets/apns_auth_key.p8` | Apple Developer → Keys → APNs key |
| `secrets/cloudflare-origin.crt` | Cloudflare → SSL/TLS → Origin Server → Create Certificate |
| `secrets/cloudflare-origin.key` | (saved with the certificate above) |
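
Before running the setup scripts it can help to confirm that every file in the table exists and is non-empty. The sketch below is a hypothetical helper (`check_secrets` is not part of the repository, and `02-setup-secrets.sh` may do its own validation); the file names and the 32-character minimum come from the table above.

```bash
#!/usr/bin/env bash
# Report missing or empty secret files in the given directory.
# Prints one problem per line and nothing when all checks pass.
check_secrets() {
  dir=${1:-secrets}
  for f in postgres_password.txt secret_key.txt email_host_password.txt \
           fcm_server_key.txt apns_auth_key.p8 \
           cloudflare-origin.crt cloudflare-origin.key; do
    [ -s "$dir/$f" ] || printf 'missing or empty: %s\n' "$dir/$f"
  done
  # The table asks for at least 32 characters in secret_key.txt.
  # wc -c counts bytes; the trailing newline is stripped first.
  if [ -s "$dir/secret_key.txt" ]; then
    len=$(( $(tr -d '\n' < "$dir/secret_key.txt" | wc -c) ))
    if [ "$len" -lt 32 ]; then
      printf 'too short (%s bytes): %s\n' "$len" "$dir/secret_key.txt"
    fi
  fi
}
```

Run it as `check_secrets secrets` from `deploy-k3s/`; empty output means all seven files are present.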

## 3. Provision Cluster

```bash
export KUBECONFIG="$(pwd)/kubeconfig"
./scripts/01-provision-cluster.sh
```

This script:
1. Reads cluster config from `config.yaml`
2. Generates `cluster-config.yaml` for hetzner-k3s
3. Provisions 3x CX33 nodes with HA etcd (5-10 minutes)
4. Writes node IPs back into `config.yaml`
5. Labels the Redis node (`honeydue/redis=true`)

After provisioning:

```bash
kubectl get nodes
```
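
All three nodes should report `Ready`. As a small sketch for scripting that check (`count_ready_nodes` is a hypothetical helper, not something the provisioning script provides), the node list can be piped through `awk`:

```bash
#!/usr/bin/env bash
# Count nodes whose STATUS column is exactly "Ready" in
# `kubectl get nodes --no-headers` output read from stdin.
# Cordoned nodes ("Ready,SchedulingDisabled") are deliberately not counted.
count_ready_nodes() {
  awk '$2 == "Ready" { n++ } END { print n + 0 }'
}

# Usage against a live cluster (this setup expects 3):
#   kubectl get nodes --no-headers | count_ready_nodes
```

If blocking until the nodes are up is preferable to counting, `kubectl wait --for=condition=Ready nodes --all --timeout=300s` does that directly.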

## 4. Create Secrets

```bash
./scripts/02-setup-secrets.sh
```

This reads `config.yaml` for registry credentials and creates all Kubernetes Secrets from the `secrets/` files:
- `honeydue-secrets`: DB password, app secret, email password, FCM key, Redis password (if configured)
- `honeydue-apns-key`: APNS .p8 key (mounted as volume in pods)
- `ghcr-credentials`: GHCR image pull credentials
- `cloudflare-origin-cert`: TLS certificate for Ingress
- `admin-basic-auth`: htpasswd secret for admin panel basic auth (if configured)

## 5. Deploy

**Full deploy** (build + push + apply):

```bash
./scripts/03-deploy.sh
```

**Deploy pre-built images** (skip build):

```bash
./scripts/03-deploy.sh --skip-build --tag abc1234
```

The script:
1. Reads registry config from `config.yaml`
2. Builds and pushes 3 Docker images to GHCR
3. Generates a Kubernetes ConfigMap from `config.yaml` (converts to flat env vars)
4. Applies all manifests with image tag substitution
5. Waits for all rollouts to complete
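
To illustrate what "converts to flat env vars" in step 3 could look like, here is a minimal sketch. The naming scheme (dots to underscores, upper-cased) and the `dotted.key=value` input format are assumptions made for illustration; the actual mapping used by `03-deploy.sh` may differ, and both function names are hypothetical.

```bash
#!/usr/bin/env bash
# Convert a dotted config path to an upper-case env var name,
# e.g. database.host -> DATABASE_HOST. (Assumed naming scheme.)
to_env_name() {
  printf '%s' "$1" | tr '.' '_' | tr 'a-z' 'A-Z'
}

# Read "dotted.key=value" lines on stdin and print "ENV_NAME=value" lines.
# Only the first "=" separates key from value, so values may contain "=".
flatten_pairs() {
  while IFS= read -r line; do
    [ -n "$line" ] || continue
    printf '%s=%s\n' "$(to_env_name "${line%%=*}")" "${line#*=}"
  done
}
```

For example, `printf 'database.host=db.example.com\n' | flatten_pairs` prints `DATABASE_HOST=db.example.com` (the host name is a placeholder).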

## 6. Configure Load Balancer & DNS

### Hetzner Load Balancer

1. [Hetzner Console](https://console.hetzner.cloud/) → **Load Balancers → Create**
2. Location: **fsn1**, add all 3 nodes as targets
3. Service: TCP 443 → 443, health check on TCP 443
4. Note the LB IP and update `load_balancer_ip` in `config.yaml`

### Cloudflare DNS

1. [Cloudflare Dashboard](https://dash.cloudflare.com/) → `myhoneydue.com` → **DNS**

   | Type | Name | Content | Proxy |
   |------|------|---------|-------|
   | A | `api` | `<LB_IP>` | Proxied (orange cloud) |
   | A | `admin` | `<LB_IP>` | Proxied (orange cloud) |

2. **SSL/TLS → Overview** → Set mode to **Full (Strict)**

3. If you haven't generated the origin cert yet:
   **SSL/TLS → Origin Server → Create Certificate**
   - Hostnames: `*.myhoneydue.com`, `myhoneydue.com`
   - Validity: 15 years
   - Save to `secrets/cloudflare-origin.crt` and `secrets/cloudflare-origin.key`
   - Re-run `./scripts/02-setup-secrets.sh`

## 7. Verify

```bash
# Automated cluster health check
./scripts/04-verify.sh

# External health check (after DNS propagation)
curl -v https://api.myhoneydue.com/api/health/
```

Expected: `{"status": "ok"}` with HTTP 200.
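
Because DNS propagation can take a while, the external check may fail on the first few attempts. A minimal retry wrapper, sketched here as a hypothetical `retry` function (not part of `04-verify.sh`), avoids re-running `curl` by hand:

```bash
#!/usr/bin/env bash
# Run a command until it succeeds, up to `attempts` times, sleeping
# `delay` seconds between tries. Returns 0 on the first success and
# 1 if every attempt fails.
retry() {
  attempts=$1; delay=$2; shift 2
  i=1
  while [ "$i" -le "$attempts" ]; do
    "$@" && return 0
    if [ "$i" -lt "$attempts" ]; then
      sleep "$delay"
    fi
    i=$((i + 1))
  done
  return 1
}

# Usage: poll the health endpoint for up to about 5 minutes.
# curl -f makes HTTP errors (status >= 400) count as failures.
#   retry 30 10 curl -fsS -o /dev/null https://api.myhoneydue.com/api/health/
```

The attempt count and delay are arbitrary choices; adjust them to however long your DNS changes usually take to become visible.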

## 8. Monitoring & Logs

### Logs with stern

```bash
stern -n honeydue api              # All API pod logs
stern -n honeydue worker           # All worker logs
stern -n honeydue .                # Everything
stern -n honeydue api | grep ERROR # Filter
```

### kubectl logs

```bash
kubectl logs -n honeydue deployment/api -f
kubectl logs -n honeydue <pod-name> --previous  # Crashed container
```

### Resource usage

```bash
kubectl top pods -n honeydue
kubectl top nodes
```

### Optional: Prometheus + Grafana

```bash
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update
helm install monitoring prometheus-community/kube-prometheus-stack \
  --namespace monitoring --create-namespace \
  --set grafana.adminPassword=your-password

# Access Grafana
kubectl port-forward -n monitoring svc/monitoring-grafana 3001:80
# Open http://localhost:3001
```

## 9. Scaling

### Manual

```bash
kubectl scale deployment/api -n honeydue --replicas=5
kubectl scale deployment/worker -n honeydue --replicas=3
```

### HPA (auto-scaling)

The API auto-scales between 3 and 6 replicas when average CPU utilization exceeds 70% or memory utilization exceeds 80%:

```bash
kubectl get hpa -n honeydue
kubectl describe hpa api -n honeydue
```
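
For intuition about how the replica count moves, the Kubernetes HPA computes `desired = ceil(current * currentMetric / targetMetric)` and clamps the result to the configured bounds. The sketch below applies that formula with integer arithmetic; `hpa_desired` is a hypothetical helper, the 3 and 6 defaults come from the description above, and the real controller also applies a tolerance band and takes the largest result across metrics, which this sketch ignores.

```bash
#!/usr/bin/env bash
# desired = ceil(current * metric / target), clamped to [min, max].
# All arguments are integers; metric and target are percentages.
hpa_desired() {
  current=$1; metric=$2; target=$3; min=${4:-3}; max=${5:-6}
  # Integer ceiling division: (a + b - 1) / b
  desired=$(( (current * metric + target - 1) / target ))
  if [ "$desired" -lt "$min" ]; then desired=$min; fi
  if [ "$desired" -gt "$max" ]; then desired=$max; fi
  echo "$desired"
}

# Example: 3 replicas at 90% average CPU against a 70% target.
#   hpa_desired 3 90 70
```

With 3 replicas at 90% CPU the formula gives ceil(270 / 70) = 4 replicas; at 200% it would ask for 9, which the upper bound caps at 6.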

### Adding nodes

Edit `config.yaml` to add nodes, then re-run provisioning:

```bash
./scripts/01-provision-cluster.sh
```

## 10. Rollback

```bash
./scripts/rollback.sh
```

The script shows the rollout history, asks for confirmation, and then rolls back all deployments to the previous revision.

To roll back a single deployment:

```bash
kubectl rollout undo deployment/api -n honeydue
```
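
To preview what a multi-deployment rollback involves without touching the cluster, the commands can be printed first. This is only a sketch: `rollback_plan` is a hypothetical helper, and it is not a description of how `rollback.sh` is implemented. The `api` and `worker` deployment names appear elsewhere in this README; any other name you pass is your own assumption.

```bash
#!/usr/bin/env bash
# Print the rollback commands for each deployment instead of running
# them, so the plan can be reviewed before anything changes.
rollback_plan() {
  ns=$1; shift
  for dep in "$@"; do
    printf 'kubectl rollout undo deployment/%s -n %s\n' "$dep" "$ns"
    printf 'kubectl rollout status deployment/%s -n %s\n' "$dep" "$ns"
  done
}

# Usage:
#   rollback_plan honeydue api worker          # review the plan
#   rollback_plan honeydue api worker | sh     # then execute it
```

Printing before executing keeps the confirmation step explicit, which matches the spirit of the interactive prompt described above.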

## 11. Backup & DR

| Component | Strategy | Action Required |
|-----------|----------|-----------------|
| PostgreSQL | Neon PITR (automatic) | None |
| Redis | Reconstructible cache + Asynq queue | None |
| etcd | K3s auto-snapshots (12h, keeps 5) | None |
| B2 Storage | B2 versioning + lifecycle rules | Enable in B2 settings |
| Secrets | Local `secrets/` + `config.yaml` | Keep secure offline backup |

**Disaster recovery**: Re-provision → re-create secrets → re-deploy. Database recovers via Neon PITR.

## 12. Security

See **[SECURITY.md](SECURITY.md)** for the comprehensive hardening guide, incident response playbooks, and full compliance checklist.

### Summary of deployed security controls

| Control | Status | Manifests |
|---------|--------|-----------|
| Pod security contexts (non-root, read-only FS, no caps) | Applied | All `deployment.yaml` |
| Network policies (default-deny + explicit allows) | Applied | `manifests/network-policies.yaml` |
| RBAC (dedicated SAs, no K8s API access) | Applied | `manifests/rbac.yaml` |
| Pod disruption budgets | Applied | `manifests/pod-disruption-budgets.yaml` |
| Redis authentication | Applied (if `redis.password` set) | `redis/deployment.yaml` |
| Cloudflare-only origin lockdown | Applied | `ingress/ingress.yaml` |
| Admin basic auth | Applied (if `admin.*` set) | `ingress/middleware.yaml` |
| Security headers (HSTS, CSP, Permissions-Policy) | Applied | `ingress/middleware.yaml` |
| Secret encryption at rest | K3s config | `--secrets-encryption` |

### Quick checklist

- [ ] Hetzner Firewall: allow only 22, 443, 6443 from your IP
- [ ] SSH: key-only auth (`PasswordAuthentication no`)
- [ ] `redis.password` set in `config.yaml`
- [ ] `admin.basic_auth_user` and `admin.basic_auth_password` set in `config.yaml`
- [ ] `kubeconfig`: `chmod 600 kubeconfig`, never commit
- [ ] `config.yaml`: contains tokens, so never commit it and keep a secure backup
- [ ] Image scanning: `trivy image` or `docker scout cves` before deploy
- [ ] Run `./scripts/04-verify.sh` (includes automated security checks)

## 13. Troubleshooting

### ImagePullBackOff

```bash
kubectl describe pod <pod-name> -n honeydue
# Check: image name, GHCR credentials, image exists
```

Fix: verify `registry.*` in `config.yaml`, then re-run `./scripts/02-setup-secrets.sh`.

### CrashLoopBackOff

```bash
kubectl logs <pod-name> -n honeydue --previous
# Common: missing env vars, DB connection failure, invalid APNS key
```

### Redis connection refused / NOAUTH

```bash
kubectl get pods -n honeydue -l app.kubernetes.io/name=redis

# If redis.password is set, you must authenticate.
# $REDIS_PASSWORD is expanded by your local shell, so export it first.
kubectl exec -it deploy/redis -n honeydue -- redis-cli -a "$REDIS_PASSWORD" ping
# Without -a: (error) NOAUTH Authentication required.
```

### Health check failures

```bash
kubectl exec -it deploy/api -n honeydue -- curl -v http://localhost:8000/api/health/
kubectl exec -it deploy/api -n honeydue -- env | sort
```

### Pods stuck in Pending

```bash
kubectl describe pod <pod-name> -n honeydue
# For Redis: ensure a node has label honeydue/redis=true
kubectl get nodes --show-labels | grep redis
```

### DNS not resolving

```bash
dig api.myhoneydue.com +short
# Verify LB IP matches what's in config.yaml
```

### Certificate / TLS errors

```bash
kubectl get secret cloudflare-origin-cert -n honeydue
kubectl describe ingress honeydue -n honeydue
curl -vk --resolve api.myhoneydue.com:443:<NODE_IP> https://api.myhoneydue.com/api/health/
```