fix(security): remediate 2026-05-12 audit findings (Stages 2–5)
Backend CI / Test (push) Has been cancelled
Backend CI / Contract Tests (push) Has been cancelled
Backend CI / Lint (push) Has been cancelled
Backend CI / Secret Scanning (push) Has been cancelled
Backend CI / Build (push) Has been cancelled

Remediation of the 2026-05-12/13 audits (78 findings + cluster gaps),
tracked in deploy-k3s/SECURITY.md, plus fixes from two independent
post-remediation reviews.

Auth & sessions:
- SHA-256 hashed auth-token storage (C1); prior-token cache eviction on
  re-login (MEDIUM-1)
- local Google JWKS verification, iss/aud/exp checks (C2/C3)
- constant-time login + generic errors (L1/LIVE-L11/LIVE-L13)
- per-account login lockout keyed on distinct source IPs (M5/MEDIUM-3)
- verified-email gating, login rate limiting (LIVE-L19, H1-H3)

IAP & webhooks:
- Apple/Google cross-account replay protection (C5/C6/C10/C13, H5/H6)
- migrations 000003-000006 (token hashing, IAP replay, audit_log +
  webhook_event_log table creation, append-only audit log)

Authorization & races:
- file-ownership owner-OR-member fix (C7), atomic share-code join
  (C9/H9), device-token reassignment (C8/LOW-3)

Secrets & deploy:
- secrets file-mounted at /etc/honeydue/secrets, not env (F8); Redis
  password out of the ConfigMap (HIGH-1); B2 keys reconciled
- digest-pinned images, admin ingress hardening, CSP/HSTS, /metrics
  lockdown; kubeconfig 0600, etcd secrets-encryption, fail2ban +
  unattended-upgrades at provision; secret-rotation runbook

Build, vet, and the full test suite (incl. -race) pass; the goose
migration chain is verified against PostgreSQL 16.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This commit is contained in:
Trey t
2026-05-16 22:28:33 -05:00
parent 2004f9c5b2
commit c77ff07ce9
59 changed files with 2819 additions and 1245 deletions
+27 -5
View File
@@ -68,6 +68,25 @@ SECRET_ARGS=(
if [[ -n "${REDIS_PASSWORD}" ]]; then
log " Including REDIS_PASSWORD in secrets"
SECRET_ARGS+=(--from-literal="REDIS_PASSWORD=${REDIS_PASSWORD}")
else
# Audit K3S-F1 (CRITICAL) / MEDIUM-4: refuse to deploy with an unauthenticated
# Redis. A previous version only warned here, which let a deploy from an
# unedited config.yaml silently bring Redis up with no password.
die "redis.password is empty in config.yaml — refusing to deploy: Redis would run with NO authentication (audit K3S-F1). Set a strong value, e.g.: openssl rand -base64 32"
fi
# B2 (Backblaze) object-storage credentials. The api/worker manifests
# reference B2_KEY_ID / B2_APP_KEY as required secret keys, so honeydue-secrets
# MUST carry them or those pods fail to start. Sourced from config.yaml so the
# script and the manifests no longer drift (was a latent gap before 2026-05-16).
B2_KEY_ID_VAL="$(cfg storage.b2_key_id 2>/dev/null || true)"
B2_APP_KEY_VAL="$(cfg storage.b2_app_key 2>/dev/null || true)"
if [[ -n "${B2_KEY_ID_VAL}" && -n "${B2_APP_KEY_VAL}" ]]; then
log " Including B2_KEY_ID / B2_APP_KEY in secrets"
SECRET_ARGS+=(--from-literal="B2_KEY_ID=${B2_KEY_ID_VAL}")
SECRET_ARGS+=(--from-literal="B2_APP_KEY=${B2_APP_KEY_VAL}")
else
warn "storage.b2_key_id / b2_app_key not set in config.yaml — B2 uploads will be disabled."
fi
# Observability ingest credentials live in deploy/prod.env (gitignored) so
@@ -100,22 +119,24 @@ kubectl create secret generic honeydue-apns-key \
--from-file="apns_auth_key.p8=${SECRETS_DIR}/apns_auth_key.p8" \
--dry-run=client -o yaml | kubectl apply -f -
# --- Create GHCR registry credentials ---
# --- Create container registry credentials ---
# Secret name is gitea-credentials (audit F6): the registry is self-hosted
# Gitea, not GHCR. Every deployment manifest references this same name.
REGISTRY_SERVER="$(cfg registry.server)"
REGISTRY_USER="$(cfg registry.username)"
REGISTRY_TOKEN="$(cfg registry.token)"
if [[ -n "${REGISTRY_SERVER}" && -n "${REGISTRY_USER}" && -n "${REGISTRY_TOKEN}" ]]; then
log "Creating ghcr-credentials..."
kubectl create secret docker-registry ghcr-credentials \
log "Creating gitea-credentials..."
kubectl create secret docker-registry gitea-credentials \
--namespace="${NAMESPACE}" \
--docker-server="${REGISTRY_SERVER}" \
--docker-username="${REGISTRY_USER}" \
--docker-password="${REGISTRY_TOKEN}" \
--dry-run=client -o yaml | kubectl apply -f -
else
warn "Registry credentials incomplete in config.yaml — skipping ghcr-credentials."
warn "Registry credentials incomplete in config.yaml — skipping gitea-credentials."
fi
# --- Create Cloudflare origin cert ---
@@ -132,7 +153,8 @@ kubectl create secret tls cloudflare-origin-cert \
if [[ -n "${ADMIN_AUTH_USER}" && -n "${ADMIN_AUTH_PASSWORD}" ]]; then
command -v htpasswd >/dev/null 2>&1 || die "Missing: htpasswd (install apache2-utils)"
log "Creating admin-basic-auth secret..."
HTPASSWD="$(htpasswd -nb "${ADMIN_AUTH_USER}" "${ADMIN_AUTH_PASSWORD}")"
# -B forces bcrypt (Traefik BasicAuth supports it; avoids weak apr1-MD5).
HTPASSWD="$(htpasswd -nbB "${ADMIN_AUTH_USER}" "${ADMIN_AUTH_PASSWORD}")"
kubectl create secret generic admin-basic-auth \
--namespace="${NAMESPACE}" \
--from-literal=users="${HTPASSWD}" \
+55 -5
View File
@@ -128,6 +128,56 @@ else
warn "Skipping build. Using images for tag: ${DEPLOY_TAG}"
fi
# --- Resolve immutable image digests (audit F5) ---
# A short-SHA tag is mutable — anyone who can push to the registry can
# overwrite it, and imagePullPolicy then pulls the new bits silently. We
# deploy by @sha256: digest instead, pinning the exact image that was just
# built and pushed. `docker push` populates RepoDigests; with --skip-build
# (no local image) resolve_ref falls back to the tag.
resolve_ref() {
local img="$1" digest
digest="$(docker inspect --format='{{range .RepoDigests}}{{println .}}{{end}}' "${img}" 2>/dev/null | grep -m1 '@sha256:' || true)"
if [[ -n "${digest}" ]]; then
printf '%s' "${digest}"
else
warn "could not resolve a digest for ${img} — deploying by mutable tag"
printf '%s' "${img}"
fi
}
API_REF="$(resolve_ref "${API_IMAGE}")"
WORKER_REF="$(resolve_ref "${WORKER_IMAGE}")"
ADMIN_REF="$(resolve_ref "${ADMIN_IMAGE}")"
WEB_REF="$(resolve_ref "${WEB_IMAGE}")"
log "Deploying by digest:"
log " API: ${API_REF}"
log " Worker: ${WORKER_REF}"
log " Admin: ${ADMIN_REF}"
# --- Image scan + signing (audit CODE-L5) ---
# Both steps are best-effort: the deploy does NOT fail if the tools are
# absent, so an operator who has not set up cosign/trivy yet is not blocked.
# Install trivy + cosign and export COSIGN_KEY to enforce. Cluster-side
# admission verification (Kyverno/Connaisseur) is a separate operator step.
if [[ "${SKIP_BUILD}" == "false" ]]; then
if command -v trivy >/dev/null 2>&1; then
log "Scanning images with Trivy (HIGH,CRITICAL)..."
for img in "${API_IMAGE}" "${WORKER_IMAGE}" "${ADMIN_IMAGE}"; do
trivy image --severity HIGH,CRITICAL --exit-code 0 --quiet "${img}" \
|| warn "Trivy reported findings for ${img}"
done
else
warn "trivy not installed — skipping image vulnerability scan (audit L5)"
fi
if command -v cosign >/dev/null 2>&1 && [[ -n "${COSIGN_KEY:-}" ]]; then
log "Signing images with cosign..."
for ref in "${API_REF}" "${WORKER_REF}" "${ADMIN_REF}"; do
cosign sign --yes --key "${COSIGN_KEY}" "${ref}" || warn "cosign sign failed for ${ref}"
done
else
warn "cosign not configured (need cosign + COSIGN_KEY) — skipping image signing (audit L5)"
fi
fi
# --- Generate and apply ConfigMap from config.yaml ---
log "Generating env from config.yaml..."
@@ -166,7 +216,7 @@ kubectl apply -f "${MANIFESTS}/ingress/"
# pod sees a stale schema.
log "Running database migrations (goose Job)..."
kubectl delete job honeydue-migrate -n "${NAMESPACE}" --ignore-not-found --wait=true >/dev/null
sed "s|image: IMAGE_PLACEHOLDER|image: ${API_IMAGE}|" "${MANIFESTS}/migrate/job.yaml" | kubectl apply -f -
sed "s|image: IMAGE_PLACEHOLDER|image: ${API_REF}|" "${MANIFESTS}/migrate/job.yaml" | kubectl apply -f -
if ! kubectl wait --namespace="${NAMESPACE}" --for=condition=complete --timeout=10m job/honeydue-migrate; then
warn "migration Job failed — see logs:"
kubectl logs -n "${NAMESPACE}" job/honeydue-migrate --tail=200 || true
@@ -175,17 +225,17 @@ fi
log "Migrations applied; proceeding with api/worker rollout"
# Apply deployments with image substitution
sed "s|image: IMAGE_PLACEHOLDER|image: ${API_IMAGE}|" "${MANIFESTS}/api/deployment.yaml" | kubectl apply -f -
sed "s|image: IMAGE_PLACEHOLDER|image: ${API_REF}|" "${MANIFESTS}/api/deployment.yaml" | kubectl apply -f -
kubectl apply -f "${MANIFESTS}/api/service.yaml"
kubectl apply -f "${MANIFESTS}/api/hpa.yaml"
sed "s|image: IMAGE_PLACEHOLDER|image: ${WORKER_IMAGE}|" "${MANIFESTS}/worker/deployment.yaml" | kubectl apply -f -
sed "s|image: IMAGE_PLACEHOLDER|image: ${WORKER_REF}|" "${MANIFESTS}/worker/deployment.yaml" | kubectl apply -f -
sed "s|image: IMAGE_PLACEHOLDER|image: ${ADMIN_IMAGE}|" "${MANIFESTS}/admin/deployment.yaml" | kubectl apply -f -
sed "s|image: IMAGE_PLACEHOLDER|image: ${ADMIN_REF}|" "${MANIFESTS}/admin/deployment.yaml" | kubectl apply -f -
kubectl apply -f "${MANIFESTS}/admin/service.yaml"
if [[ -d "${MANIFESTS}/web" ]]; then
sed "s|image: IMAGE_PLACEHOLDER|image: ${WEB_IMAGE}|" "${MANIFESTS}/web/deployment.yaml" | kubectl apply -f -
sed "s|image: IMAGE_PLACEHOLDER|image: ${WEB_REF}|" "${MANIFESTS}/web/deployment.yaml" | kubectl apply -f -
kubectl apply -f "${MANIFESTS}/web/service.yaml"
fi
+21 -6
View File
@@ -100,7 +100,7 @@ lines = [
# API
'DEBUG=false',
f\"ALLOWED_HOSTS={d['api']},{d['base']}\",
f\"CORS_ALLOWED_ORIGINS=https://{d['base']},https://{d['admin']}\",
f\"CORS_ALLOWED_ORIGINS=https://{d['base']},https://{d['admin']},https://{d.get('app', 'app.' + d['base'])}\",
'TIMEZONE=UTC',
f\"BASE_URL=https://{d['base']}\",
'PORT=8000',
@@ -119,9 +119,14 @@ lines = [
f\"DB_MAX_IDLE_CONNS={db['max_idle_conns']}\",
f\"DB_MAX_LIFETIME={db['max_lifetime']}\",
f\"DB_MAX_IDLE_TIME={db.get('max_idle_time', '0s')}\",
# Redis (in-namespace DNS short form — password injected if configured;
# short form works because /etc/resolv.conf in pods searches honeydue.svc.cluster.local)
f\"REDIS_URL=redis://{':%s@' % val(rd.get('password')) if rd.get('password') else ''}redis:6379/0\",
# Redis in-namespace DNS short form (works because pod /etc/resolv.conf
# searches honeydue.svc.cluster.local). Audit HIGH-1: the password is
# intentionally NOT embedded here. This URL is emitted into the
# honeydue-config ConfigMap, which is NOT encrypted at rest and is
# readable by anyone with `get configmap`. The Redis password travels
# only in honeydue-secrets as REDIS_PASSWORD (file-mounted, F8); the API
# applies it in cache_service.go and the worker onto its Asynq opt.
'REDIS_URL=redis://redis:6379/0',
'REDIS_DB=0',
# Email
f\"EMAIL_HOST={em['host']}\",
@@ -218,8 +223,18 @@ config = {
'image': 'ubuntu-24.04',
},
'additional_packages': ['open-iscsi'],
'post_create_commands': ['sudo systemctl enable --now iscsid'],
'k3s_config_file': 'secrets-encryption: true\n',
# Audit K3S-CG2: harden the node OS at provision time — fail2ban for SSH
# brute-force, unattended-upgrades for automatic security patches.
'post_create_commands': [
'sudo systemctl enable --now iscsid',
'sudo apt-get update -qq',
'sudo DEBIAN_FRONTEND=noninteractive apt-get install -y -qq fail2ban unattended-upgrades',
'sudo systemctl enable --now fail2ban',
'sudo dpkg-reconfigure -f noninteractive -plow unattended-upgrades',
],
# Audit K3S-CG1 / K3S-F4: encrypt Secrets at rest in etcd, and write the
# node kubeconfig as mode 0600 (not world-readable).
'k3s_config_file': 'secrets-encryption: true\nwrite-kubeconfig-mode: \"0600\"\n',
}
print(yaml.dump(config, default_flow_style=False, sort_keys=False))