honeyDue Production Deployment — The Book

This is the complete reference for the honeyDue production deployment as it exists on 2026-04-24. It serves two audiences:

  1. A new engineer learning the system for the first time. Start at Chapter 0 (Overview) and read in order. Concepts are built up; nothing is assumed beyond "you've deployed web apps before."
  2. The operator (future-you) needing a specific fact fast. Every chapter opens with a one-paragraph summary and has an operator runbook at its end. The appendices are a cheat sheet.

The deployment is non-trivial. It's a 3-node HA Kubernetes cluster running a Go API, a Next.js admin panel, a background worker, Redis, and Traefik — all fronted by Cloudflare, integrated with Neon Postgres, Backblaze B2, and a self-hosted Gitea registry. This book explains why each of those pieces was chosen (often over two or three alternatives we tried first), what they do, and how to operate them.

Table of Contents

Part I — The System

Part II — Networking

Part III — Security

Part IV — Workloads

Part V — Operation

Part VI — Context

Appendices

Quick Facts

| Field | Value |
|---|---|
| Orchestrator | K3s v1.34.6+k3s1 (3 nodes, HA control plane) |
| Ingress | Traefik v3 (DaemonSet, hostNetwork) |
| Nodes | 3× Hetzner Cloud CX33 (4 vCPU, 8 GB RAM, 80 GB SSD) in nbg1 (Nuremberg) |
| DNS & Edge | Cloudflare (Free plan), SSL=Flexible, round-robin across 3 node A records |
| Database | Neon Postgres, ep-floral-truth-amttbc5a.c-5.us-east-1.aws.neon.tech |
| Cache + Queue | Redis 7-alpine, in-cluster, 1 replica, PVC-backed, pinned to nbg1-2 |
| Object Storage | Backblaze B2, honeyDueProd bucket, us-east-005 region |
| Image Registry | Self-hosted Gitea v1.25.5 at gitea.treytartt.com |
| Transactional Email | Fastmail SMTP (smtp.fastmail.com:587) |
| Domains | api.myhoneydue.com, admin.myhoneydue.com, myhoneydue.com |
| Monthly Cost (current) | ~$30–40 (3× Hetzner + Neon Launch + B2 + Cloudflare Free + Gitea free) |
| kubeconfig | ~/.kube/honeydue-k3s.yaml on operator workstation |
| Repo | honeyDueAPI-go/deploy-k3s/ for manifests; deploy/ is the legacy Swarm config |
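
As a minimal sketch of how the kubeconfig entry above is typically used, the helper below builds a kubectl command line pinned to that file and to the honeydue namespace named under Conventions. The function name `kubectl_argv` is hypothetical and not part of the repo; `--kubeconfig` and `-n` are standard kubectl flags. The script only prints the command, it does not run it.

```python
import os
import shlex

# Path taken from the Quick Facts table; "~" is expanded for the current user.
KUBECONFIG = os.path.expanduser("~/.kube/honeydue-k3s.yaml")
# App namespace, per the Conventions section.
NAMESPACE = "honeydue"


def kubectl_argv(*args: str) -> list[str]:
    """Build a kubectl argv pinned to the honeyDue kubeconfig and namespace.

    Hypothetical helper for illustration; it only assembles arguments.
    """
    return ["kubectl", "--kubeconfig", KUBECONFIG, "-n", NAMESPACE, *args]


if __name__ == "__main__":
    # Print the command instead of executing it, so the sketch is safe to run
    # on a machine that has no access to the cluster.
    print(shlex.join(kubectl_argv("get", "pods", "-o", "wide")))
```

Passing the kubeconfig explicitly on every call, rather than relying on the KUBECONFIG environment variable, makes it harder to point a command at the wrong cluster by accident.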

How to Read This Book

  • "Why did we…?" answers are in the chapter covering that component. Every major design choice has an explicit rejection of one to three alternatives.
  • Historical bugs are in Chapter 19. The rest of the book describes the current (fixed) state; 19 is the forensic record of what was broken and how we figured it out.
  • Operator commands you'll run regularly are in Appendix B. Chapter 17 has longer procedures (cert rotation, DB migration, etc.).
  • Citations throughout use footnote-style links to the canonical source (k3s docs, moby issues, Cloudflare docs, etc.). Appendix D collects them.

Conventions

  • Kubernetes namespace for the app is honeydue.
  • SSH aliases are hetzner1, hetzner2, hetzner3 in your ~/.ssh/config.
  • Node hostnames in the cluster are ubuntu-8gb-nbg1-{1,2,3} (Hetzner-assigned).
  • The mapping is non-obvious because the Hetzner hostname suffix order does not match SSH alias order:
| SSH alias | Public IP | Hostname in k3s |
|---|---|---|
| hetzner1 | 178.104.247.152 | ubuntu-8gb-nbg1-2 |
| hetzner2 | 178.105.32.198 | ubuntu-8gb-nbg1-1 |
| hetzner3 | 178.104.249.189 | ubuntu-8gb-nbg1-3 |

When a chapter refers to "hetzner1" it means the box at 178.104.247.152 / nbg1-2.
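
Because the alias and hostname orders disagree, it can help to keep the mapping in one place and look nodes up by whichever name you have in hand. The following is a small sketch that encodes the table above; the `Node` type and `resolve` function are hypothetical names chosen for illustration, and only the three rows listed in this section are assumed.

```python
from typing import NamedTuple


class Node(NamedTuple):
    ssh_alias: str      # entry in ~/.ssh/config
    public_ip: str      # Hetzner public IPv4
    k3s_hostname: str   # node name as seen in the cluster


# Rows copied from the Conventions table; note hetzner1 is nbg1-2, not nbg1-1.
NODES = (
    Node("hetzner1", "178.104.247.152", "ubuntu-8gb-nbg1-2"),
    Node("hetzner2", "178.105.32.198", "ubuntu-8gb-nbg1-1"),
    Node("hetzner3", "178.104.249.189", "ubuntu-8gb-nbg1-3"),
)


def resolve(key: str) -> Node:
    """Return the node whose SSH alias, public IP, or k3s hostname equals key."""
    for node in NODES:
        if key in node:  # exact match against any of the three fields
            return node
    raise KeyError(f"unknown node: {key}")


if __name__ == "__main__":
    node = resolve("hetzner1")
    print(f"{node.ssh_alias} -> {node.public_ip} / {node.k3s_hostname}")
```

For example, `resolve("ubuntu-8gb-nbg1-1").ssh_alias` returns `"hetzner2"`, which is the direction you usually need when a cluster event names a node and you want to SSH into it.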