honeyDueAPI/docs/deployment/11-registry.md
Trey t 6f303dbbaa
2026-04-24 07:20:54 -05:00


11 — Container Registry (Gitea)

Summary

We host our own container registry on Gitea at gitea.treytartt.com. Every push and pull of our application images goes there, not to Docker Hub or GHCR. The Gitea instance runs outside this k3s cluster (on its own VPS) and is reachable over public HTTPS at https://gitea.treytartt.com. Image pulls are authenticated with a Personal Access Token (PAT) stored as a Kubernetes Secret of type kubernetes.io/dockerconfigjson.

Why Gitea

Decision matrix

| Option | Cost | Auth model | Pros | Cons |
| --- | --- | --- | --- | --- |
| Gitea built-in registry | $0 (already running Gitea) | Gitea PAT | Self-hosted, integrated with code | Another service to maintain |
| GHCR (GitHub Container Registry) | Free for public, $0 for private with paid plan | GitHub PAT | Popular, reliable | Uses GitHub; vendor dependency |
| Docker Hub | Free tier limited; paid $5-7/mo | Docker Hub account | Ubiquitous | Rate limits on anonymous pulls |
| AWS ECR | ~$1/mo for small use | IAM | Integrates with AWS workloads | AWS account required |
| Harbor (self-hosted) | $0 | Many options | Best enterprise features | Heavy to operate |

Gitea won primarily because the operator was already running Gitea for code hosting. Container registry is built into Gitea 1.17+ as a free feature. One fewer service to set up.

Side benefits:

  • Code and images live together (one backup policy, one access model)
  • PATs are scoped and rotatable via the same UI
  • No external vendor to worry about for this critical piece of the deploy pipeline

Rejected alternatives:

  • Docker Hub — rate limits on unauthenticated pulls would bite us if nodes pull the same image repeatedly during rolling updates
  • GHCR — fine but adds GitHub dependency we don't otherwise have
  • Harbor — massive overkill; we're not a 100-team enterprise

Layout

Images live under the authenticated user's namespace:

gitea.treytartt.com/admin/honeydue-api:237c6b8
gitea.treytartt.com/admin/honeydue-worker:237c6b8
gitea.treytartt.com/admin/honeydue-admin:237c6b8

admin is the Gitea user that owns the images. Images are private by default.

Image tagging strategy

Tags are git short SHAs (e.g., 237c6b8). Not :latest, and not semantic versions.

Rationale:

  • :latest is ambiguous — which build? Rolling updates should roll a specific tag so rollbacks are deterministic.
  • :v1.2.3 works for released libraries but our app rolls forward continuously; versioning per deploy is unnecessary overhead.
  • Git SHAs are unique, immutable, and tie each image to the exact commit that built it.

PUSH_LATEST_TAG=false is set in deploy/cluster.env. When we rebuild and push, only the SHA tag gets pushed. The latest tag is never created by our deploy pipeline.
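The tagging rule above can be enforced mechanically. The following is a minimal sketch, assuming a hypothetical helper named image_ref that is not part of the deploy scripts; it assembles an image reference and refuses any tag that does not look like a git SHA.

```shell
# Hypothetical helper (not in the deploy scripts): build an image reference
# from its parts, refusing tags that are not git SHAs.
image_ref() {
  local registry="$1" namespace="$2" name="$3" tag="$4"
  # Accept 7 to 40 lowercase hex characters; this rejects "latest" and "v1.2.3"
  if ! printf '%s' "$tag" | grep -Eq '^[0-9a-f]{7,40}$'; then
    echo "refusing non-SHA tag: $tag" >&2
    return 1
  fi
  printf '%s/%s/%s:%s\n' "$registry" "$namespace" "$name" "$tag"
}

# Example:
#   image_ref gitea.treytartt.com admin honeydue-api "$(git rev-parse --short HEAD)"
```

A guard like this fails fast if a script is ever called with :latest or an empty tag, instead of pushing an ambiguous image.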

Authentication

Creating the PAT

At https://gitea.treytartt.com/-/user/settings/applications, we created a token with scopes:

  • read:package
  • write:package

No other scopes. This token can only interact with the package registry; it can't read repo contents, create issues, or touch account settings.

PAT on the operator workstation

Stored in deploy/registry.env:

REGISTRY=gitea.treytartt.com
REGISTRY_NAMESPACE=admin
REGISTRY_USERNAME=admin
REGISTRY_TOKEN=<pat>

This file is .gitignored in deploy/.gitignore. If it ever gets committed accidentally, rotate the PAT immediately.
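Before a build, it can help to confirm that the file is complete. This is a sketch only: check_registry_env is a hypothetical helper, and it checks nothing beyond the four keys listed above each having a non-empty value.

```shell
# Hypothetical pre-flight check: every key in registry.env must be present
# with a non-empty value. Returns non-zero if any key is missing.
check_registry_env() {
  local file="${1:-deploy/registry.env}" key missing=0
  for key in REGISTRY REGISTRY_NAMESPACE REGISTRY_USERNAME REGISTRY_TOKEN; do
    if ! grep -Eq "^${key}=.+" "$file"; then
      echo "missing or empty: $key" >&2
      missing=1
    fi
  done
  return "$missing"
}

# To confirm git really ignores the file (exit status 0 means ignored):
#   git check-ignore -q deploy/registry.env && echo "ignored"
```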

PAT in the cluster

Stored as the gitea-credentials Secret (type kubernetes.io/dockerconfigjson) in the honeydue namespace. See Chapter 10.

Kubelet reads this Secret when a pod needs to pull from the Gitea registry.
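For orientation, here is a sketch of the payload such a Secret carries. The helper name docker_config_json is hypothetical, and the function does no JSON escaping, so it assumes the values contain no quotes or backslashes. In practice kubectl create secret docker-registry generates the same structure. Kubelet only uses the Secret for pods whose spec (or ServiceAccount) lists it under imagePullSecrets.

```shell
# Hypothetical helper: print the JSON stored under the .dockerconfigjson key
# of a kubernetes.io/dockerconfigjson Secret. No JSON escaping is performed.
docker_config_json() {
  local server="$1" user="$2" token="$3" auth
  # "auth" is base64("user:token"); tr strips any line wrapping base64 adds
  auth=$(printf '%s:%s' "$user" "$token" | base64 | tr -d '\n')
  printf '{"auths":{"%s":{"username":"%s","password":"%s","auth":"%s"}}}\n' \
    "$server" "$user" "$token" "$auth"
}

# To inspect what the cluster currently holds:
#   kubectl get secret gitea-credentials -n honeydue \
#     -o jsonpath='{.data.\.dockerconfigjson}' | base64 -d
```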

The build pipeline

Dockerfile multi-stage

honeyDueAPI-go/Dockerfile has three target stages:

  • api — compiled Go binary + static assets for the HTTP API
  • worker — compiled Go binary for the background worker
  • admin — Next.js standalone build of the admin panel

A single Dockerfile keeps build-cache sharing efficient (the Go builder stage produces binaries for both api and worker; admin reuses its own Node builder stage).

Multi-arch cross-compilation

The operator workstation is arm64 (Apple Silicon). The Hetzner nodes are x86_64. A naive docker build on arm64 produces arm64 images that won't run on the nodes (exec format error).

The deploy pipeline uses docker buildx:

docker buildx build \
  --platform linux/amd64 \
  --target api \
  -t "gitea.treytartt.com/admin/honeydue-api:$SHA" \
  --push \
  /Users/treyt/Desktop/code/honeyDue/honeyDueAPI-go

  • --platform linux/amd64: cross-compile to x86_64
  • --target api: which Dockerfile stage to build
  • --push: push directly to the registry (skip local image cache)

The Go stages use the TARGETARCH build arg to produce the right architecture binary. Node stages use QEMU emulation (which is slower but acceptable for our ~1 min admin build).
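BuildKit derives TARGETARCH from the --platform value, using the same names as GOARCH (amd64, arm64). The sketch below uses two hypothetical helpers to show the mapping from uname -m output to those names, and to detect when a plain docker build on the current host would produce the wrong architecture.

```shell
# Map `uname -m` output to the architecture name used by GOARCH/TARGETARCH.
to_goarch() {
  case "$1" in
    x86_64|amd64)  echo amd64 ;;
    aarch64|arm64) echo arm64 ;;
    *) echo "unknown architecture: $1" >&2; return 1 ;;
  esac
}

# Succeeds when the host architecture differs from the target, i.e. when
# --platform is required. The second argument overrides the detected host.
needs_cross_build() {
  local host target="$1"
  host=$(to_goarch "${2:-$(uname -m)}") || return 2
  [ "$host" != "$target" ]
}

# Example:
#   needs_cross_build amd64 && echo "use: docker buildx build --platform linux/amd64"
```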

Buildx builder

We use a named buildx builder to keep state out of Docker's default environment:

docker buildx create --name honeydue-builder --use
docker buildx inspect --bootstrap

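The honeydue-builder uses the docker-container driver: BuildKit runs in its own container rather than inside the Docker daemon. That container is started on first use (or by --bootstrap) and stays available until the builder is stopped or removed. The driver supports multi-platform builds and caches layers across builds.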
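Running docker buildx create a second time with the same name fails, so a script that sets up the builder benefits from being idempotent. A minimal sketch, with ensure_builder as a hypothetical helper name:

```shell
# Hypothetical helper: create the named builder only if it does not exist,
# select it, and make sure its BuildKit container is running.
ensure_builder() {
  local name="${1:-honeydue-builder}"
  if docker buildx inspect "$name" >/dev/null 2>&1; then
    docker buildx use "$name"
  else
    docker buildx create --name "$name" --driver docker-container --use
  fi
  # Start the BuildKit container now rather than on the first build
  docker buildx inspect "$name" --bootstrap >/dev/null
}
```

Passing --driver docker-container explicitly is redundant with the default for docker buildx create, but it documents the intent.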

From local file to cluster — the full path

flowchart LR
    subgraph dev[Operator workstation]
        Code[Source code]
        Dockerfile
        Buildx[docker buildx]
    end
    subgraph Gitea[gitea.treytartt.com]
        Reg[Package registry]
    end
    subgraph K8s[k3s cluster]
        Kubelet
        Containerd
        Pod
    end

    Code --> Dockerfile
    Dockerfile --> Buildx
    Buildx -- push --> Reg
    Reg -- pull --> Kubelet
    Kubelet --> Containerd
    Containerd --> Pod

End-to-end

  1. Operator commits code: changes are committed to main locally
  2. Operator builds + pushes image: docker buildx build --push ... from the repo root. Build takes 13 minutes first time, seconds on warm cache.
  3. Image lands in Gitea: visible at https://gitea.treytartt.com/admin/-/packages/container/honeydue-api
  4. Operator updates Deployment: kubectl set image deployment/api api=gitea.treytartt.com/admin/honeydue-api:$NEW_SHA -n honeydue
  5. K8s begins rolling update: creates new ReplicaSet with new image
  6. Kubelet on target node sees a pod with an image it doesn't have
  7. Kubelet calls containerd: "pull this image using these creds"
  8. Containerd authenticates to Gitea registry using the PAT from gitea-credentials Secret, downloads the image
  9. Containerd starts the container with the new image
  10. Readiness probe passes: new pod joins the Service endpoints
  11. Kubelet tears down an old pod
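Steps 4 through 11 can be wrapped in one function that waits for the rollout and reverts if it does not complete. This is a sketch under assumptions: roll_image is a hypothetical wrapper, and the 300s default timeout is an illustrative value that should be tuned to the Deployment's startup probe budget.

```shell
# Hypothetical wrapper: roll one Deployment to a new image and wait.
# ROLL_TIMEOUT defaults to 300s, which is an assumption, not a project setting.
roll_image() {
  local deploy="$1" container="$2" image="$3" ns="${4:-honeydue}"
  kubectl set image "deployment/$deploy" "$container=$image" -n "$ns" || return 1
  if ! kubectl rollout status "deployment/$deploy" -n "$ns" \
      --timeout="${ROLL_TIMEOUT:-300s}"; then
    echo "rollout of $deploy did not complete, reverting" >&2
    kubectl rollout undo "deployment/$deploy" -n "$ns"
    return 1
  fi
}

# Example:
#   roll_image api api "gitea.treytartt.com/admin/honeydue-api:$NEW_SHA"
```

Because tags are immutable SHAs, kubectl rollout undo returns to a specific earlier image rather than to whatever a moving tag currently points at.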

Pushing manually

If you need to push a one-off image (e.g., testing a fix):

# Login (once per session)
set -a; source deploy/registry.env; set +a
printf '%s' "$REGISTRY_TOKEN" | docker login "$REGISTRY" -u "$REGISTRY_USERNAME" --password-stdin

# Build + push
cd honeyDueAPI-go
SHA=$(git rev-parse --short HEAD)
docker buildx build \
  --platform linux/amd64 \
  --target api \
  -t "gitea.treytartt.com/admin/honeydue-api:${SHA}" \
  --push .

# Logout (don't leave creds in ~/.docker/config.json)
docker logout gitea.treytartt.com

Image sizes

Current images:

| Image | Size | Layers |
| --- | --- | --- |
| honeydue-api | ~53 MB | Alpine base + Go binary |
| honeydue-worker | ~50 MB | Alpine base + Go binary |
| honeydue-admin | ~150 MB | Node 20 alpine + Next.js standalone |

The Go binaries are statically compiled with CGO_ENABLED=0. Alpine is the base image to keep the footprint small.

Image retention

Gitea does not auto-prune images. Every :<sha> tag accumulates forever. The package page at https://gitea.treytartt.com/admin/-/packages/container/honeydue-api lists them all.

At the current pace (a few deploys per week, images of ~50-150 MB each), storage grows by roughly 10 GB/year. Not critical; an 80 GB node disk can take years.

TODO: Add a monthly cleanup: delete all but last 30 tags per image. Can be a cron job or a manual quarterly cleanup.
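One possible shape for that cleanup is sketched below. It is an assumption-laden sketch, not an existing script. It relies on the Gitea package API returning name, version, and created_at fields for each entry, and on $GITEA_PAT holding a token allowed to delete packages. It defaults to a dry run, and it ignores pagination, so the listing is capped at the page size shown. All function names are hypothetical.

```shell
# Read versions (newest first, one per line) on stdin; print all but the first $1.
versions_to_prune() {
  local keep="${1:-30}"
  tail -n "+$((keep + 1))"
}

# List SHA-tagged versions of one container package, newest first.
# Assumes one page of results is enough; paginate for larger histories.
list_versions() {
  local name="$1"
  curl -fsS -H "Authorization: token $GITEA_PAT" \
    "https://gitea.treytartt.com/api/v1/packages/admin?type=container&q=$name&limit=50" |
    jq -r --arg n "$name" \
      '[.[] | select(.name == $n)] | sort_by(.created_at) | reverse | .[].version' |
    grep -E '^[0-9a-f]{7,40}$'   # skip anything that is not a SHA tag
}

# Delete everything except the newest $2 versions. DRY_RUN=1 (default) only prints.
prune_package() {
  local name="$1" keep="${2:-30}" v
  list_versions "$name" | versions_to_prune "$keep" | while read -r v; do
    if [ "${DRY_RUN:-1}" = 1 ]; then
      echo "would delete $name:$v"
    else
      curl -fsS -X DELETE -H "Authorization: token $GITEA_PAT" \
        "https://gitea.treytartt.com/api/v1/packages/admin/container/$name/$v"
    fi
  done
}
```

Before running with DRY_RUN=0, check that no running Deployment still references a tag in the delete list; a pruned tag cannot be re-pulled if a pod is rescheduled onto a node that lacks the image.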

Image verification — not yet

We do not sign images or verify signatures. An attacker who compromised Gitea could push a malicious image under an existing tag (though Gitea should prevent tag reuse if immutable tags are configured).

TODO (Chapter 20): Add cosign for signing at build time + Kyverno or Connaisseur policy to verify at pull time.

Gitea registry itself

The Gitea instance runs outside this k3s cluster on its own VPS (operator's existing infrastructure). It's not part of the honeyDue deployment — it's adjacent infrastructure.

If the Gitea host goes down:

  • Currently-running pods keep working (they already pulled their images)
  • New deployments/scale-ups fail at the image-pull step
  • No impact on existing user traffic

This is an acceptable external dependency. Gitea host has its own uptime story.

Cost

$0/mo. Gitea registry is included in the Gitea install we already pay the VPS for (not accounted to honeyDue's cost).

If we ever switched to GHCR, cost would still be $0 for public images or bundled with our (nonexistent) GitHub Team subscription.

What we don't have

  • Image scanning (Trivy, Snyk) — scan images for known CVEs on push
  • Image signing (cosign)
  • Multi-region replication — only hosted in one place
  • High availability — Gitea is single-instance

For our scale, none of these are needed. TODO (Chapter 20) if the operator's appetite increases.

Operator cheat sheet

# List packages via API (packages are private, so send the PAT)
curl -sS "https://gitea.treytartt.com/api/v1/packages/admin?type=container" \
  -H "Authorization: token $GITEA_PAT" \
  -H "Accept: application/json" | jq .

# Browse in UI
#   https://gitea.treytartt.com/admin/-/packages

# Delete a specific tag via API
curl -X DELETE \
  -H "Authorization: token $GITEA_PAT" \
  "https://gitea.treytartt.com/api/v1/packages/admin/container/honeydue-api/237c6b8"

# Login from kubectl side (refresh the Secret)
kubectl create secret docker-registry gitea-credentials -n honeydue \
  --docker-server=gitea.treytartt.com \
  --docker-username=admin \
  --docker-password=<new PAT> \
  --dry-run=client -o yaml | kubectl apply -f -

# After rotating PAT, restart pods that use it for pulls
kubectl rollout restart -n honeydue deploy/api deploy/admin deploy/worker

References