Infrastructure:
- Stack now runs on K3s v1.34.6 HA (3 Hetzner CX33 nodes as managers)
- Traefik DaemonSet + hostNetwork replaces Caddy + ingress mesh
- All manifests in deploy-k3s/manifests/; Swarm config (deploy/) kept
temporarily for reference
Bug fixes surfaced during migration:
- Dockerfile: golang:1.24-alpine -> 1.25-alpine (go.mod requires 1.25)
- cache_service.go: remove sync.Once reassignment from inside Do()
callback (was causing 'unlock of unlocked mutex' fatal after
Redis Ping failure)
- router.go: relax CSP from 'default-src none' to 'default-src self'
+ allowlist fonts.googleapis.com so the marketing landing page CSS
actually loads in browsers
- deploy/scripts/deploy_prod.sh: use docker buildx with
--platform linux/amd64 so arm64 (Apple Silicon) dev machines produce
images runnable on x86_64 Hetzner nodes; fix array expansion under
set -u
- deploy/swarm-stack.prod.yml: fix secret source references to use
top-level aliases (the '\${X_SECRET}' form never actually resolved);
dozzle ports: long-form host_ip is rejected by Swarm, switched to
short-form (bound to 0.0.0.0 with UFW-based loopback restriction);
worker replicas 2 -> 1 (Asynq scheduler singleton)
- deploy-k3s/manifests/admin/deployment.yaml: probe path '/admin/' -> '/'
(Next.js serves at root; /admin/ returned 404 and killed pods);
startupProbe failureThreshold 12 -> 24
- deploy-k3s/manifests/pod-disruption-budgets.yaml: worker minAvailable
1 -> 0 (singleton)
- deploy-k3s/manifests/api/deployment.yaml: startupProbe failureThreshold
12 -> 48 (MigrateWithLock serializes across 3 replicas on first-boot;
real startup takes up to 240s)
- .gitignore: tighten 'api' -> '/api' (was matching deploy-k3s/manifests/api/
and admin/src/app/api/*, hiding legitimate files)
New files:
- deploy-k3s/manifests/traefik-helmchartconfig.yaml: DaemonSet +
hostNetwork override for k3s-bundled Traefik
- deploy-k3s/manifests/ingress/ingress-simple.yaml: plain Ingress
without TLS (CF Flexible SSL) and without middleware
- deploy-k3s/MIGRATION_NOTES.md: operator-facing migration log
Documentation:
- docs/deployment/ — full deployment book, 26 files, ~42k words:
- Part I Overview, infrastructure, orchestrator choice (Ch 0-2)
- Part II Networking, firewall, Cloudflare (Ch 3-4, 13)
- Part III Security, Traefik ingress (Ch 5-6)
- Part IV Services, DB, storage, secrets, registry (Ch 7-11)
- Part V Data flow, deploy process, observability, failures, runbook
(Ch 12, 14-17)
- Part VI Cost, Swarm postmortem, roadmap (Ch 18-20)
- Appendices: glossary, kubectl cheat sheet, file locations,
consolidated citations
- README.md: Production Deployment section replaced with pointer to
the book; Go version bumped to 1.25
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
11 — Container Registry (Gitea)
Summary
We host our own container registry on Gitea at gitea.treytartt.com.
Every image push and pull goes there, not to Docker Hub or GHCR. The
Gitea instance runs outside this k3s cluster (on its own VPS) and is
reachable over public HTTPS at https://gitea.treytartt.com. Image pulls
are authenticated with a Personal Access Token stored as a Kubernetes
Secret of type kubernetes.io/dockerconfigjson.
Why Gitea
Decision matrix
| Option | Cost | Auth model | Pros | Cons |
|---|---|---|---|---|
| Gitea built-in registry | $0 (already running Gitea) | Gitea PAT | Self-hosted, integrated with code | Another service to maintain |
| GHCR (GitHub Container Registry) | Free for public images; private images bundled with a paid plan | GitHub PAT | Popular, reliable | Uses GitHub; vendor dependency |
| Docker Hub | Free tier limited; paid $5-7/mo | Docker Hub account | Ubiquitous | Rate limits on anonymous pulls |
| AWS ECR | ~$1/mo for small use | IAM | Integrates with AWS workloads | AWS account required |
| Harbor (self-hosted) | $0 | Many options | Best enterprise features | Heavy to operate |
Gitea won primarily because the operator was already running Gitea for code hosting. The container registry has been built into Gitea since 1.17 at no extra cost, so there was one fewer service to set up.
Side benefits:
- Code and images live together (one backup policy, one access model)
- PATs are scoped and rotatable via the same UI
- No external vendor to worry about for this critical piece of the deploy pipeline
Rejected alternatives:
- Docker Hub — rate limits on unauthenticated pulls would bite us if nodes pull the same image repeatedly during rolling updates
- GHCR — fine but adds GitHub dependency we don't otherwise have
- Harbor — massive overkill; we're not a 100-team enterprise
Layout
Images live under the authenticated user's namespace:
gitea.treytartt.com/admin/honeydue-api:237c6b8
gitea.treytartt.com/admin/honeydue-worker:237c6b8
gitea.treytartt.com/admin/honeydue-admin:237c6b8
admin is the Gitea user that owns the images. Images are private
by default.
Image tagging strategy
Tags are git short SHAs (e.g., 237c6b8). Not :latest. Not semantic
version.
Rationale:
- :latest is ambiguous: which build? Rolling updates should roll a specific tag so rollbacks are deterministic.
- :v1.2.3 works for released libraries but our app rolls forward continuously; versioning per deploy is unnecessary overhead.
- Git SHAs are unique, immutable, and tie each image to the exact commit that built it.
PUSH_LATEST_TAG=false is set in deploy/cluster.env. When we rebuild
and push, only the SHA tag gets pushed. The latest tag is never
created by our deploy pipeline.
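As a small illustration of the SHA-only tagging rule, the helper below composes an image reference and refuses anything that is not a hex commit ID. The function name image_ref and the 7-to-40 character bound are assumptions made for this sketch; they are not taken from the deploy scripts.

```shell
# Sketch: build "<registry>/<namespace>/honeydue-<component>:<sha>".
# image_ref is a hypothetical helper; registry and namespace come from this chapter.
REGISTRY="gitea.treytartt.com"
REGISTRY_NAMESPACE="admin"

image_ref() {
  local component="$1" sha="$2"
  # Reject empty values and anything with a non-hex character ("latest", "v1.2.3").
  case "$sha" in
    ""|*[!0-9a-f]*)
      echo "refusing non-SHA tag: '$sha'" >&2
      return 1
      ;;
  esac
  # Assumed bounds: git's default 7-char short SHA up to a full 40-char SHA-1.
  if [ "${#sha}" -lt 7 ] || [ "${#sha}" -gt 40 ]; then
    echo "unexpected SHA length: '$sha'" >&2
    return 1
  fi
  printf '%s/%s/honeydue-%s:%s\n' "$REGISTRY" "$REGISTRY_NAMESPACE" "$component" "$sha"
}
```

A typical call would be image_ref api "$(git rev-parse --short HEAD)", which prints a reference in the same shape as the Layout examples.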
Authentication
Creating the PAT
At https://gitea.treytartt.com/-/user/settings/applications, we created a token with scopes:
- read:package
- write:package
No other scopes. This token can only interact with the package registry; it can't read repo contents, create issues, or touch account settings.
PAT on the operator workstation
Stored in deploy/registry.env:
REGISTRY=gitea.treytartt.com
REGISTRY_NAMESPACE=admin
REGISTRY_USERNAME=admin
REGISTRY_TOKEN=<pat>
This file is .gitignored in deploy/.gitignore. If it ever gets
committed accidentally, rotate the PAT immediately.
PAT in the cluster
Stored as the gitea-credentials Secret (type
kubernetes.io/dockerconfigjson) in the honeydue namespace. See Chapter 10.
Kubelet reads this Secret when a pod needs to pull from the Gitea registry.
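For orientation, the payload inside a kubernetes.io/dockerconfigjson Secret (under the .dockerconfigjson key) is a small JSON document whose auth field is base64 of user:token. The sketch below prints that document; the function name is hypothetical, the token shown in the usage line is a placeholder, and the sketch assumes the values contain no quotes or backslashes.

```shell
# Sketch: print the JSON that a dockerconfigjson Secret carries.
# dockerconfigjson is a hypothetical helper, not part of the deploy scripts.
dockerconfigjson() {
  local server="$1" user="$2" token="$3" auth
  # "auth" is base64("user:token"); strip the line wrapping some base64 tools add.
  auth=$(printf '%s:%s' "$user" "$token" | base64 | tr -d '\n')
  printf '{"auths":{"%s":{"username":"%s","password":"%s","auth":"%s"}}}\n' \
    "$server" "$user" "$token" "$auth"
}
```

Running dockerconfigjson gitea.treytartt.com admin "$REGISTRY_TOKEN" shows roughly what the kubectl create secret docker-registry command in the operator cheat sheet generates. In practice that kubectl command is the safer route, since it handles JSON escaping.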
The build pipeline
Dockerfile multi-stage
honeyDueAPI-go/Dockerfile has three target stages:
- api: compiled Go binary + static assets for the HTTP API
- worker: compiled Go binary for the background worker
- admin: Next.js standalone build of the admin panel
A single Dockerfile keeps build-cache sharing efficient (the Go builder stage produces binaries for both api and worker; admin reuses its own Node builder stage).
Multi-arch cross-compilation
The operator workstation is arm64 (Apple Silicon). The Hetzner nodes
are x86_64. A naive docker build on arm64 produces arm64 images
that won't run on the nodes (exec format error).
The deploy pipeline uses docker buildx:
docker buildx build \
--platform linux/amd64 \
--target api \
-t gitea.treytartt.com/admin/honeydue-api:$SHA \
--push \
/Users/treyt/Desktop/code/honeyDue/honeyDueAPI-go
- --platform linux/amd64: cross-compile to x86_64
- --target api: which Dockerfile stage to build
- --push: push directly to the registry (skip local image cache)
The Go stages use the TARGETARCH build arg to produce the right
architecture binary. Node stages use QEMU emulation (which is slower but
acceptable for our ~1 min admin build).
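The same invocation applies to all three stages. As a sketch, the helper below prints (rather than runs) the buildx command for each target so the commands can be reviewed first. The function names build_cmd and build_all are hypothetical, and the mapping from stage name to image name (honeydue-<target>) is inferred from the Layout section.

```shell
# Sketch: emit one "docker buildx build" line per Dockerfile target.
# build_cmd / build_all are hypothetical helpers; nothing is executed here.
REGISTRY="gitea.treytartt.com"
REGISTRY_NAMESPACE="admin"

build_cmd() {
  local target="$1" sha="$2" context="${3:-.}"
  printf 'docker buildx build --platform linux/amd64 --target %s -t %s/%s/honeydue-%s:%s --push %s\n' \
    "$target" "$REGISTRY" "$REGISTRY_NAMESPACE" "$target" "$sha" "$context"
}

build_all() {
  local sha="$1" target
  # The three stages named in the Dockerfile section above.
  for target in api worker admin; do
    build_cmd "$target" "$sha"
  done
}
```

For example, build_all "$(git rev-parse --short HEAD)" prints three lines; once they look right they can be pasted into a shell or piped to sh from the build context directory.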
Buildx builder
We use a named buildx builder to keep state out of Docker's default environment:
docker buildx create --name honeydue-builder --use
docker buildx inspect --bootstrap
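The honeydue-builder uses the docker-container driver: BuildKit runs
in its own container, which starts on first use (or with --bootstrap)
and keeps running between builds until the builder is stopped or
removed. It supports multi-platform builds and caches layers across
builds.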
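Running docker buildx create a second time with the same name fails, so a script that may run repeatedly needs to check first. A minimal sketch of an idempotent setup follows; the function name ensure_builder is hypothetical and not taken from the deploy scripts.

```shell
# Sketch: create the named builder only if it does not exist yet,
# otherwise just select it. ensure_builder is a hypothetical helper.
BUILDER="honeydue-builder"

ensure_builder() {
  if docker buildx inspect "$BUILDER" >/dev/null 2>&1; then
    # Builder already exists: make it the current one.
    docker buildx use "$BUILDER"
  else
    docker buildx create --name "$BUILDER" --use
  fi
  # Start the BuildKit container for the current builder if it is not running.
  docker buildx inspect --bootstrap >/dev/null
}
```

Calling ensure_builder at the top of a build script keeps the two commands above safe to repeat on a workstation that has already been set up.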
From local file to cluster — the full path
flowchart LR
subgraph dev[Operator workstation]
Code[Source code]
Dockerfile
Buildx[docker buildx]
end
subgraph Gitea[gitea.treytartt.com]
Reg[Package registry]
end
subgraph K8s[k3s cluster]
Kubelet
Containerd
Pod
end
Code --> Dockerfile
Dockerfile --> Buildx
Buildx -- push --> Reg
Reg -- pull --> Kubelet
Kubelet --> Containerd
Containerd --> Pod
End-to-end
- Operator pushes code: commits to main locally
- Operator builds + pushes image: docker buildx build --push ... from the repo root. Build takes 1–3 minutes first time, seconds on warm cache.
- Image lands in Gitea: visible at https://gitea.treytartt.com/admin/-/packages/container/honeydue-api
- Operator updates Deployment: kubectl set image deployment/api api=gitea.treytartt.com/admin/honeydue-api:$NEW_SHA -n honeydue
- K8s begins rolling update: creates new ReplicaSet with new image
- Kubelet on target node sees a pod with an image it doesn't have
- Kubelet calls containerd: "pull this image using these creds"
- Containerd authenticates to Gitea registry using the PAT from gitea-credentials Secret, downloads the image
- Containerd starts the container with the new image
- Readiness probe passes: new pod joins the Service endpoints
- Kubelet tears down an old pod
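The operator-side part of this sequence (update the Deployment, then wait for the rollout) could be wrapped as below. This is a sketch under assumptions: the function name roll_api, the 15-minute timeout, and the automatic undo on failure are illustrative choices, not the project's actual deploy script.

```shell
# Sketch: point deployment/api at a new SHA tag, wait for the rollout,
# and roll back if it does not complete. roll_api is a hypothetical helper.
REGISTRY="gitea.treytartt.com"
REGISTRY_NAMESPACE="admin"
K8S_NAMESPACE="honeydue"
ROLLOUT_TIMEOUT="15m"   # assumed value; pick one that covers slow first boots

roll_api() {
  local sha="$1"
  local image="${REGISTRY}/${REGISTRY_NAMESPACE}/honeydue-api:${sha}"
  kubectl set image deployment/api "api=${image}" -n "$K8S_NAMESPACE" || return 1
  # Blocks until all replicas run the new image or the timeout expires.
  if ! kubectl rollout status deployment/api -n "$K8S_NAMESPACE" --timeout="$ROLLOUT_TIMEOUT"; then
    echo "rollout failed, reverting to previous ReplicaSet" >&2
    kubectl rollout undo deployment/api -n "$K8S_NAMESPACE"
    return 1
  fi
}
```

Because every tag is a commit SHA, the undo step returns the Deployment to a specific, known image rather than to whatever a floating tag points at.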
Pushing manually
If you need to push a one-off image (e.g., testing a fix):
# Login (once per session)
set -a; source deploy/registry.env; set +a
printf '%s' "$REGISTRY_TOKEN" | docker login "$REGISTRY" -u "$REGISTRY_USERNAME" --password-stdin
# Build + push
cd honeyDueAPI-go
SHA=$(git rev-parse --short HEAD)
docker buildx build \
--platform linux/amd64 \
--target api \
-t "gitea.treytartt.com/admin/honeydue-api:${SHA}" \
--push .
# Logout (don't leave creds in ~/.docker/config.json)
docker logout gitea.treytartt.com
Image sizes
Current images:
| Image | Size | Layers |
|---|---|---|
| honeydue-api | ~53 MB | Alpine base + Go binary |
| honeydue-worker | ~50 MB | Alpine base + Go binary |
| honeydue-admin | ~150 MB | Node 20 alpine + Next.js standalone |
The Go binaries are statically compiled, CGO_ENABLED=0. Alpine is the base for smallest footprint.
Image retention
Gitea does not auto-prune images. Every :<sha> tag accumulates
forever. The package page at
https://gitea.treytartt.com/admin/-/packages/container/honeydue-api
lists them all.
At the current pace (a few deploys per week, images of ~50-150 MB each), this grows by ~10 GB/year. Not critical; 80 GB node disk can take years.
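The growth figure can be sanity-checked with simple arithmetic. The sketch below assumes 3 deploys per week (one reading of "a few") and decimal gigabytes; both inputs are assumptions for illustration, not measurements from the registry.

```shell
# Sketch: yearly registry growth in GB for a given deploy rate and
# amount of new data stored per deploy. Inputs are illustrative assumptions.
yearly_growth_gb() {
  local deploys_per_week="$1" mb_per_deploy="$2"
  awk -v d="$deploys_per_week" -v m="$mb_per_deploy" \
    'BEGIN { printf "%.1f\n", d * 52 * m / 1000 }'
}
```

yearly_growth_gb 3 253 gives 39.5, the upper bound if every deploy stored all three images in full (53 + 50 + 150 MB). yearly_growth_gb 3 64 gives 10.0, so the ~10 GB/year estimate corresponds to roughly 64 MB of new layer data per deploy, which would fit a registry that stores unchanged base layers only once.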
TODO: Add a monthly cleanup: delete all but last 30 tags per image. Can be a cron job or a manual quarterly cleanup.
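The selection half of that cleanup is simple to sketch: given one image's tags listed newest first, keep the first 30 and emit the rest. The function name tags_to_prune is hypothetical, and how the newest-first list is produced is deliberately left open here.

```shell
# Sketch: read tags (newest first, one per line) on stdin and print
# every tag beyond the first N. tags_to_prune is a hypothetical helper.
tags_to_prune() {
  local keep="${1:-30}"
  # tail -n +K starts output at line K, so this skips the first $keep lines.
  tail -n "+$((keep + 1))"
}
```

Each printed tag could then be passed to the DELETE call in the operator cheat sheet. Whatever produces the input list should leave out the tag that is currently deployed, so a rollback target is never pruned by accident.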
Image verification — not yet
We do not sign images or verify signatures. An attacker who compromised Gitea could push a malicious image under an existing tag (though Gitea should prevent tag reuse if immutable tags are configured).
TODO (Chapter 20): Add cosign
for signing at build time + Kyverno or Connaisseur policy to verify
at pull time.
Gitea registry itself
The Gitea instance runs outside this k3s cluster on its own VPS (operator's existing infrastructure). It's not part of the honeyDue deployment — it's adjacent infrastructure.
If the Gitea host goes down:
- Currently-running pods keep working (they already pulled their images)
- New deployments/scale-ups fail at the image-pull step
- No impact on existing user traffic
This is an acceptable external dependency. Gitea host has its own uptime story.
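Since a registry outage only shows up at the image-pull step, a quick check before deploying can fail earlier and more clearly. The sketch below probes the registry's /v2/ endpoint, which container registries answer with 200 or, when authentication is required, 401; either means the service is up. The function names and the 10-second timeout are assumptions for illustration.

```shell
# Sketch: decide whether the registry is reachable before starting a deploy.
# registry_reachable / check_registry are hypothetical helpers.
registry_reachable() {
  # 200 = open registry, 401 = up but wants credentials; anything else = problem.
  case "$1" in
    200|401) return 0 ;;
    *) return 1 ;;
  esac
}

check_registry() {
  local host="$1" code
  # -w prints only the HTTP status; curl reports 000 when no response arrived.
  code=$(curl -sS -o /dev/null -m 10 -w '%{http_code}' "https://${host}/v2/" 2>/dev/null) || code=000
  registry_reachable "$code"
}
```

A deploy script might start with check_registry gitea.treytartt.com || { echo "registry unreachable" >&2; exit 1; } so that nothing is changed in the cluster while pulls would fail.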
Cost
$0/mo. The registry is part of the Gitea install on a VPS we already pay for (not counted toward honeyDue's cost).
If we ever switched to GHCR, cost would still be $0 for public images or bundled with our (nonexistent) GitHub Team subscription.
What we don't have
- Image scanning (Trivy, Snyk) — scan images for known CVEs on push
- Image signing (cosign)
- Multi-region replication — only hosted in one place
- High availability — Gitea is single-instance
For our scale, none of these are needed. TODO (Chapter 20) if the operator's appetite increases.
Operator cheat sheet
# List packages via API (packages are private, so send the PAT)
curl -sS "https://gitea.treytartt.com/api/v1/packages/admin?type=container" \
-H "Authorization: token $GITEA_PAT" \
-H "Accept: application/json" | jq .
# Browse in UI
# https://gitea.treytartt.com/admin/-/packages
# Delete a specific tag via API
curl -X DELETE \
-H "Authorization: token $GITEA_PAT" \
"https://gitea.treytartt.com/api/v1/packages/admin/container/honeydue-api/237c6b8"
# Refresh the pull Secret in the cluster (new PAT in $GITEA_PAT)
kubectl create secret docker-registry gitea-credentials -n honeydue \
--docker-server=gitea.treytartt.com \
--docker-username=admin \
--docker-password="$GITEA_PAT" \
--dry-run=client -o yaml | kubectl apply -f -
# After rotating PAT, restart pods that use it for pulls
kubectl rollout restart -n honeydue deploy/api deploy/admin deploy/worker