# 14 — Deployment Process

## Summary

A production deploy is: build a new image, push to Gitea, update the
Deployment's image field with the new SHA, Kubernetes rolls new pods in.
No downtime if the change is backward-compatible. Rollback is
`kubectl rollout undo`.

This chapter walks through the full process, plus alternate paths
(config-only changes, manifest changes, hotfixes).

## TL;DR using the unified deploy script

The recommended path. `deploy-k3s/scripts/03-deploy.sh` builds all four
images (api, worker, admin, web), pushes to Gitea, regenerates the
ConfigMap from `config.yaml`, applies every manifest under
`deploy-k3s/manifests/` (including the observability vmagent), and waits
for all rollouts.

```bash
cd /Users/treyt/Desktop/code/honeyDue/honeyDueAPI-go
git add . && git commit -m "..." && git push gitea master

export KUBECONFIG=~/.kube/honeydue.yaml
bash deploy-k3s/scripts/03-deploy.sh              # full build + push + rollout
# or, to redeploy without rebuilding:
bash deploy-k3s/scripts/03-deploy.sh --skip-build
# or, to pin a specific tag:
bash deploy-k3s/scripts/03-deploy.sh --tag d3708e6
```

What the script does, in order:

1. Read registry creds from `deploy-k3s/config.yaml`.
2. `docker login gitea.treytartt.com`.
3. Build all four images with `--platform linux/amd64` (so arm64 Macs
   don't push images that crash on Hetzner amd64 nodes with
   "exec format error").
4. Push to the gitea registry, plus tag and push `:latest`.
5. Generate the env file from `config.yaml` and apply as ConfigMap
   `honeydue-config` (uses dry-run + apply for diff-free idempotence).
6. Apply `manifests/namespace.yaml`, `redis/`, `ingress/`,
   `api/{deployment,service,hpa}`, `worker/`, `admin/`, `web/`.
7. Apply `manifests/observability/vmagent.yaml`, substituting
   `TOKEN_PLACEHOLDER` with `OBS_INGEST_TOKEN` from `deploy/prod.env`
   (gitignored). Skipped with a warning if the token isn't present.
8. `kubectl rollout status` for every Deployment, including vmagent.

~7–10 minutes for a full rebuild.
~1–2 minutes with `--skip-build`.

## TL;DR for a single-service code change (manual)

```bash
# 1. Commit + get SHA
cd /Users/treyt/Desktop/code/honeyDue/honeyDueAPI-go
git add . && git commit -m "..." && SHA=$(git rev-parse --short HEAD)

# 2. Login to Gitea registry (creds in config.yaml)
docker login gitea.treytartt.com -u admin

# 3. Build + push amd64 image
docker build --platform linux/amd64 --target api \
  -t "gitea.treytartt.com/admin/honeydue-api:${SHA}" .
docker push "gitea.treytartt.com/admin/honeydue-api:${SHA}"

# 4. Roll it in
export KUBECONFIG=~/.kube/honeydue.yaml
kubectl set image deployment/api -n honeydue \
  api="gitea.treytartt.com/admin/honeydue-api:${SHA}"

# 5. Watch
kubectl rollout status -n honeydue deployment/api

# 6. Log out
docker logout gitea.treytartt.com
```

~3–5 minutes end to end for api.

> **Gotcha:** For any tag other than `:latest`, `imagePullPolicy`
> defaults to `IfNotPresent`, which means kubelet won't re-fetch an
> image with a tag it already has cached locally, even if the registry
> now has different bytes at that tag. Always change tags (use the SHA),
> or temporarily flip `imagePullPolicy: Always` and
> `kubectl rollout restart` if you need to overwrite a tag.

## The build

### Step 1 — Prepare

```bash
cd /Users/treyt/Desktop/code/honeyDue/honeyDueAPI-go
git status                  # clean working tree?
git log -1 --oneline        # this is the SHA that'll ship
```

### Step 2 — Login to Gitea

```bash
set -a; source deploy/registry.env; set +a
printf '%s' "$REGISTRY_TOKEN" | \
  docker login "$REGISTRY" -u "$REGISTRY_USERNAME" --password-stdin
```

**Note**: passing the token as a command-line argument
(`docker login -p ...`) writes it to shell history and exposes it in the
process list. Pipe it through `--password-stdin` instead; don't skip the
`printf` trick.

### Step 3 — Build + push

```bash
SHA=$(git rev-parse --short HEAD)

# For API
docker buildx build \
  --platform linux/amd64 \
  --target api \
  -t "gitea.treytartt.com/admin/honeydue-api:${SHA}" \
  --push .

# For Worker
docker buildx build \
  --platform linux/amd64 \
  --target worker \
  -t "gitea.treytartt.com/admin/honeydue-worker:${SHA}" \
  --push .

# For Admin (Next.js)
docker buildx build \
  --platform linux/amd64 \
  --target admin \
  -t "gitea.treytartt.com/admin/honeydue-admin:${SHA}" \
  --push .
```

- `--platform linux/amd64`: cross-compile from the operator's arm64 machine to the Hetzner nodes' amd64
- `--target X`: select a stage from the multi-stage Dockerfile
- `--push`: push to the registry in one step; don't leave the image in local Docker

First build is slow (~3–5 min cold). Subsequent builds hit the BuildKit
layer cache and complete in ~30–60s if only app code changed.

### Build platform note

If `docker buildx` isn't configured:

```bash
docker buildx create --name honeydue-builder --use
docker buildx inspect --bootstrap
```

This creates a BuildKit container that supports cross-platform builds.
The `--bootstrap` line spins it up immediately so errors surface now
instead of on first build.

## The deploy

### For a single service

```bash
export KUBECONFIG=~/.kube/honeydue-k3s.yaml

kubectl set image deployment/api -n honeydue \
  api="gitea.treytartt.com/admin/honeydue-api:${SHA}"
```

This updates the Deployment's image field. Kubernetes:

1. Creates a new ReplicaSet with the new image (an annotation records the revision)
2. Starts a new pod (per `maxSurge: 1`)
3. Waits for readinessProbe to pass on the new pod (up to 240s for cold api boot)
4. Once ready, removes a pod from the old ReplicaSet
5. Repeats until all pods are on the new ReplicaSet
6. Marks rollout complete

### Watching the rollout

```bash
kubectl rollout status -n honeydue deployment/api
```

Prints progress and returns when the rollout completes or fails. With no
`--timeout` flag, it gives up once the Deployment exceeds its
`progressDeadlineSeconds`, which is 600 seconds (10 minutes) unless the
manifest overrides it.
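When an explicit deadline is wanted, `kubectl rollout status` accepts `--timeout`. The wrapper below is a minimal sketch of that idea, not something `03-deploy.sh` is known to contain: the function name `wait_for_rollout` and the 300s default are assumptions chosen for illustration. On failure it prints the most recent namespace events so the cause is visible without a second command.

```bash
# Hypothetical helper (not part of the repo): wait for a rollout with an
# explicit deadline, and show recent events if the wait fails.
# Args: namespace, deployment name, optional timeout (default 300s, an assumption).
wait_for_rollout() {
  local ns="$1"
  local deploy="$2"
  local timeout="${3:-300s}"

  if kubectl rollout status -n "$ns" "deployment/$deploy" --timeout="$timeout"; then
    echo "rollout complete: $deploy"
    return 0
  fi

  echo "rollout did not finish within $timeout: $deploy" >&2
  # Events are sorted oldest-first, so the tail holds the newest ones.
  kubectl get events -n "$ns" --sort-by=.lastTimestamp | tail -n 20 >&2
  return 1
}

# Example call (commented out so sourcing this file has no side effects):
# wait_for_rollout honeydue api 240s
```

The function returns non-zero on failure, so it can gate later steps in a script, for example `wait_for_rollout honeydue api 240s || exit 1`.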
For a more detailed view of a rollout:

```bash
# Watch pods transition
kubectl get pods -n honeydue -l app.kubernetes.io/name=api -w

# Watch events
kubectl get events -n honeydue --sort-by=.lastTimestamp -w
```

### For all three services

```bash
for svc in api worker admin; do
  kubectl set image deployment/$svc -n honeydue \
    $svc="gitea.treytartt.com/admin/honeydue-${svc}:${SHA}"
done

# Watch all rollouts
for svc in api worker admin; do
  kubectl rollout status -n honeydue deployment/$svc
done
```

## Config-only changes (no new image)

When you change `prod.env` but code is unchanged:

```bash
# 1. Update prod.env locally

# 2. Regenerate ConfigMap
kubectl create configmap honeydue-config -n honeydue \
  --from-env-file=deploy/prod.env \
  --dry-run=client -o yaml | kubectl apply -f -

# 3. Pods do NOT auto-reload env vars. Restart them.
kubectl rollout restart -n honeydue deployment/api deployment/admin deployment/worker
```

`rollout restart` triggers a rolling update with the *same* image but
forces pod recreation. New pods pick up the updated ConfigMap.

### Why not auto-reload?

Kubernetes has no built-in mechanism to restart pods on ConfigMap
change. There's no `envFromWatch` equivalent. Third-party operators like
Reloader can do it, but we don't run one.

For sensitive config (like the `SECRET_KEY`), this is actually good:
pods don't cycle unexpectedly when someone tweaks the ConfigMap.

## Secret changes

Same flow as config:

```bash
# Rotate a value
kubectl patch secret honeydue-secrets -n honeydue \
  --type=merge -p "{\"data\":{\"SECRET_KEY\":\"$(printf '%s' 'newvalue' | base64)\"}}"

# Restart pods
kubectl rollout restart -n honeydue deployment/api deployment/worker
```

## Manifest changes

When you add/modify a deployment YAML:

```bash
kubectl apply -f deploy-k3s/manifests/api/deployment.yaml
```

If the change touches the pod template (`spec.template`: for example
resource limits, env, or volumes), Kubernetes treats it as a new
revision and pods roll.
If the change is outside the pod template, such as `replicas`, existing
pods are not restarted; pods are only added or removed.

## Rollback

### Last-known-good rollback

```bash
kubectl rollout undo deployment/api -n honeydue
```

Reverts to the previous ReplicaSet (the one with the previous image).
Takes ~30s to stabilize.

### Rollback to a specific revision

```bash
# See revision history
kubectl rollout history deployment/api -n honeydue

# Revert to specific revision number
kubectl rollout undo deployment/api -n honeydue --to-revision=3
```

Kubernetes keeps up to 10 ReplicaSet revisions by default
(`spec.revisionHistoryLimit`).

### Hard rollback (deploy an older image)

```bash
# Replace <known-good-sha> with the tag to restore
kubectl set image deployment/api -n honeydue \
  api="gitea.treytartt.com/admin/honeydue-api:<known-good-sha>"
```

Useful when you want to go back further than the revision history, or to
a specific known-good SHA.

## Rolling update semantics

```yaml
strategy:
  type: RollingUpdate
  rollingUpdate:
    maxUnavailable: 0
    maxSurge: 1
```

For api (3 replicas):

- `maxUnavailable: 0`: no pod is removed until its replacement is ready
- `maxSurge: 1`: up to 4 pods exist simultaneously during rollout

Timeline (approximate, warm state):

- t=0: kubectl set image
- t=0: k8s creates new RS with 1 pod
- t=30s (or so): new pod readiness probe passes
- t=30s: k8s terminates 1 old pod
- t=60s: next new pod ready
- t=60s: another old pod terminates
- ...continues until all on new RS

For cold-boot (e.g., first deploy on a rebuilt cluster), the
MigrateWithLock advisory lock extends this to several minutes. But the
rollout is serialized: only one pod starts per iteration, so the lock
queue is small.

## Hotfix workflow

When we need to ship a fix fast and skip the usual steps:

1. Fix in code
2. Build + push
3. `kubectl set image` on the affected service only
4. Monitor with `kubectl logs -f`

In a real org, don't skip CI and tests; for a solo operator this is the
tradeoff.

## Integration with Gitea

Currently no CI/CD. The operator builds from the workstation and pushes
manually.
Future:

- Gitea Actions (Drone-like CI) could trigger on push to `main`
- Build + push step could run in a GitHub Actions-compatible workflow
- Auto-deploy on tag push, manual promote to prod

**TODO** (Chapter 20).

## What the old Swarm deploy script did

Contrast: `deploy/scripts/deploy_prod.sh` (Swarm-era) did:

1. Validate every config file (placeholder detection, APNS key format, B2 all-or-none)
2. Buildx to amd64
3. Push to Gitea (we retrofitted this from GHCR)
4. SCP bundle to manager node
5. `docker secret create` + `docker config create` with versioned names
6. `docker stack deploy --with-registry-auth`
7. Poll stack services until convergence (420s timeout)
8. Prune old secret/config versions
9. Healthcheck the final URL; auto-rollback on failure
10. Log out of registries

The current k3s replacement, `deploy-k3s/scripts/03-deploy.sh`, covers
the same ground in fewer steps because Kubernetes does the
versioning/rollout/health bookkeeping natively. See the TL;DR section at
the top of this chapter.
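Step 9 of the Swarm script (healthcheck the final URL, roll back on failure) has a straightforward analogue built from commands already used in this chapter. This chapter does not say whether `03-deploy.sh` does anything like it, so the following is only a sketch under assumptions: the function name `healthcheck_or_undo` is hypothetical, and the health URL is whatever endpoint the operator considers authoritative.

```bash
# Hypothetical helper (not part of the repo): probe a URL once after a
# rollout and undo the Deployment if the probe fails.
# Args: namespace, deployment name, health URL (the URL is an assumption).
healthcheck_or_undo() {
  local ns="$1"
  local deploy="$2"
  local url="$3"

  # -f makes curl return non-zero on HTTP errors (4xx/5xx).
  if curl -fsS -o /dev/null --max-time 5 "$url"; then
    echo "healthy: $url"
    return 0
  fi

  echo "healthcheck failed, rolling back $deploy" >&2
  kubectl rollout undo -n "$ns" "deployment/$deploy"
  kubectl rollout status -n "$ns" "deployment/$deploy"
  return 1
}

# Example call (commented out so sourcing this file has no side effects):
# healthcheck_or_undo honeydue api "https://example.invalid/health"
```

A single probe is deliberately simple; a real check would probably retry a few times before deciding to undo.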
## Common deploy failures

| Symptom | Likely cause |
|---|---|
| `ImagePullBackOff` | Image not in registry, or pull secret expired |
| Stuck at "Progressing" | Readiness probe not passing; check pod logs |
| `CrashLoopBackOff` immediately | App won't start; check pod logs for panic/exit reason |
| `CrashLoopBackOff` after migration | Cache service, Redis connection, or post-init code issue |
| Old pods never terminate | New pods not ready; rollout doesn't progress |
| Rollout succeeds but app is broken | Readiness probe is too lenient; passes on broken app |

### Debugging commands

```bash
# Describe the deployment (shows events, conditions)
kubectl describe deployment api -n honeydue

# Describe the api pods
kubectl describe pod -n honeydue -l app.kubernetes.io/name=api

# Logs from currently-running pods
kubectl logs -n honeydue -l app.kubernetes.io/name=api --tail=100 --prefix

# Logs from the previous (terminated) container of a pod;
# replace <pod-name> with a name from `kubectl get pods`
kubectl logs -n honeydue <pod-name> --previous

# Events in the namespace (sorted by time, newest last)
kubectl get events -n honeydue --sort-by=.lastTimestamp

# Pause a rollout (stops new pods from being created)
kubectl rollout pause deployment/api -n honeydue

# Resume
kubectl rollout resume deployment/api -n honeydue
```

## Zero-downtime considerations

For zero-downtime deploys, the new image must be:

1. **Backward-compatible** with the current database schema (schema migrations run before new code)
2. **Backward-compatible** with in-flight API requests (don't remove endpoints mid-deploy; deprecate first)
3. **Backward-compatible** with Redis data structures (don't change cache key formats abruptly)

For breaking changes:

1. Deploy intermediate version that handles both old and new
2. Once rolled out everywhere, deploy breaking-change version
3. Two deploys, same day or different days

We don't have this discipline yet; our API has too few clients to worry
about. As mobile clients proliferate, this becomes more important.
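One rough way to check whether a given rollout actually was zero-downtime is to poll an endpoint from a second terminal while the pods roll and count failed responses. The sketch below assumes a health URL exists and is reachable from the workstation; the function name `probe_endpoint`, the URL, and the default attempt count are all hypothetical.

```bash
# Hypothetical helper (not part of the repo): poll a URL a fixed number
# of times and print how many requests failed.
# Args: URL, optional attempts (default 30), optional delay in seconds (default 1).
probe_endpoint() {
  local url="$1"
  local attempts="${2:-30}"
  local delay="${3:-1}"
  local failures=0
  local i=0

  while [ "$i" -lt "$attempts" ]; do
    # -f makes curl return non-zero on HTTP errors; errors go to stderr.
    if ! curl -fsS -o /dev/null --max-time 2 "$url"; then
      failures=$((failures + 1))
    fi
    i=$((i + 1))
    sleep "$delay"
  done

  echo "$failures"
}

# Example call (commented out so sourcing this file has no side effects):
# probe_endpoint "https://example.invalid/health" 120 1
```

A result of `0` over a window that spans the whole rollout is weak evidence of no downtime; it says nothing about in-flight requests that were cut off mid-response.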
## Blue-green / canary (not yet)

Kubernetes supports advanced rollout strategies:

- **Canary**: route 5% of traffic to new version, scale up gradually
- **Blue-green**: run new version alongside old, flip traffic all at once

These require Traefik's TraefikService CRD with weighted routing, or a
service mesh. **TODO** if traffic scale justifies.

## Cleanup: the old Swarm config

`deploy/` directory contains the Swarm-era config. It's still there but
unused. After we're confident in k3s (a few weeks? month?), remove it:

```bash
rm -rf deploy/
```

Keep the useful files in `deploy-k3s/` only.

## Operator cheat sheet

```bash
# Full build + deploy
cd /Users/treyt/Desktop/code/honeyDue/honeyDueAPI-go
SHA=$(git rev-parse --short HEAD)
set -a; source deploy/registry.env; set +a
printf '%s' "$REGISTRY_TOKEN" | docker login "$REGISTRY" -u admin --password-stdin

docker buildx build --platform linux/amd64 --target api -t "gitea.treytartt.com/admin/honeydue-api:${SHA}" --push .
docker buildx build --platform linux/amd64 --target worker -t "gitea.treytartt.com/admin/honeydue-worker:${SHA}" --push .
docker buildx build --platform linux/amd64 --target admin -t "gitea.treytartt.com/admin/honeydue-admin:${SHA}" --push .

docker logout gitea.treytartt.com

export KUBECONFIG=~/.kube/honeydue-k3s.yaml
for svc in api worker admin; do
  kubectl set image deployment/$svc -n honeydue "$svc=gitea.treytartt.com/admin/honeydue-${svc}:${SHA}"
done
for svc in api worker admin; do
  kubectl rollout status -n honeydue deployment/$svc
done
```

## References

- [Kubernetes Deployment rolling update][rolling]
- [kubectl rollout][rollout]
- [Docker buildx][buildx]

[rolling]: https://kubernetes.io/docs/concepts/workloads/controllers/deployment/#rolling-update-deployment
[rollout]: https://kubernetes.io/docs/reference/generated/kubectl/kubectl-commands#rollout
[buildx]: https://docs.docker.com/build/buildx/