6de90acef7
Auth was structurally broken: the api's Kratos middleware was pointing at http://kratos:4433 but Kratos wasn't deployed. The only thing keeping users logged in was a 5-min Redis cache; once it expired the middleware called Whoami → no DNS → 401 → forced relogin with no path back.

This commit deploys Kratos for real:

Manifests:
- kratos.yaml + migrate-job.yaml: pin oryd/kratos:v26.2.0@sha256:92eedc... (CalVer current stable as of 2026-06-03)
- configmap.yaml: drop Google OIDC provider (not in scope); fill the Apple provider with real Services ID / Team ID / Key ID. Apple now sits at providers[0]
- kratos.yaml: drop the Google-secret env binding; rebind APPLE_PRIVATE_KEY to PROVIDERS_0_APPLE_PRIVATE_KEY (shifted from index 1)
- network-policies.yaml: add a kratos egress rule to allow-egress-from-api. Without this, even with kratos running, the api gets "connection refused" on http://kratos:4433 (post-DNAT NetworkPolicy enforcement; see runbook §9.2).

Operator prerequisites that were completed alongside this commit:
- Neon kratos database created (separate from honeyDue, owner neondb_owner)
- Cloudflare DNS for auth.myhoneydue.com (3 A records, proxied)
- kratos: block added to config.yaml (gitignored): DSN to the Neon DIRECT endpoint, cookie + cipher secrets generated, Fastmail SMTPS URI, .p8 contents inline

Out of scope intentionally:
- Google sign-in (additive; can append providers[] later)
- Migrating existing auth_user rows onto Kratos identities (pre-prod): existing users will need to sign in fresh, which creates a new Kratos identity and a new local user row (per migration plan in manifests/kratos/README.md).

Verified end-to-end:
- 338 schema migrations applied successfully
- 2/2 kratos pods Ready
- api → kratos:4433/sessions/whoami returns 401 for invalid token (was "connection refused" before this commit's NetworkPolicy patch)
- auth.myhoneydue.com resolves through CF; cloudflare-only middleware keeps the origin protected exactly like the other hostnames

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
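The provider index shift mentioned in the commit message can be sketched as follows. This is a minimal illustration under stated assumptions, not the repository's actual manifests: every value is a placeholder, the Secret name `kratos-secrets` and its key are hypothetical, and the full variable prefix `SELFSERVICE_METHODS_OIDC_CONFIG_` is an assumption (the message only names the `PROVIDERS_0_APPLE_PRIVATE_KEY` suffix). Other provider fields are omitted.

```yaml
# configmap.yaml (fragment): with Google removed, Apple is the first entry
# in the list, i.e. providers[0]. All values below are placeholders.
selfservice:
  methods:
    oidc:
      enabled: true
      config:
        providers:
          - id: apple
            provider: apple
            client_id: "<Services ID>"
            apple_team_id: "<Team ID>"
            apple_private_key_id: "<Key ID>"
            issuer_url: https://appleid.apple.com
            scope:
              - email
---
# kratos.yaml (fragment of the container spec): the private key is injected
# through an index-addressed environment variable, so the index in the name
# has to match the provider's position in the list above.
env:
  # Assumed full name; only the PROVIDERS_0_APPLE_PRIVATE_KEY suffix is
  # given in the commit message.
  - name: SELFSERVICE_METHODS_OIDC_CONFIG_PROVIDERS_0_APPLE_PRIVATE_KEY
    valueFrom:
      secretKeyRef:
        name: kratos-secrets     # hypothetical Secret name
        key: APPLE_PRIVATE_KEY   # hypothetical key
```

Because the variable name encodes the list position, removing or reordering an earlier provider changes the name that must be bound. That is why dropping Google moved the Apple key binding from index 1 to index 0.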
443 lines
11 KiB
YAML
# Network Policies: default-deny with explicit allows
# Apply AFTER namespace and deployments are created.
# Verify: kubectl get networkpolicy -n honeydue

# --- Default deny all ingress and egress ---
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all
  namespace: honeydue
spec:
  podSelector: {}
  policyTypes:
    - Ingress
    - Egress

---
# --- Allow DNS for all pods (required for service discovery) ---
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-dns
  namespace: honeydue
spec:
  podSelector: {}
  policyTypes:
    - Egress
  egress:
    - to: []
      ports:
        - protocol: UDP
          port: 53
        - protocol: TCP
          port: 53

---
# --- API: allow ingress from Traefik (kube-system namespace) ---
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-ingress-to-api
  namespace: honeydue
spec:
  podSelector:
    matchLabels:
      app.kubernetes.io/name: api
  policyTypes:
    - Ingress
  ingress:
    # Traefik runs as DaemonSet with hostNetwork=true, so traffic from it
    # arrives with the NODE IP as source (not a pod IP). The node pod CIDR
    # 10.42.0.0/16 covers any intra-cluster caller; the three node IPs
    # cover Traefik on hostNetwork.
    - from:
        - ipBlock:
            cidr: 178.105.32.198/32  # ubuntu-8gb-nbg1-1
        - ipBlock:
            cidr: 178.104.247.152/32  # ubuntu-8gb-nbg1-2
        - ipBlock:
            cidr: 178.104.249.189/32  # ubuntu-8gb-nbg1-3
        - ipBlock:
            cidr: 10.42.0.0/16  # cluster pod CIDR
      ports:
        - protocol: TCP
          port: 8000

---
# --- Admin: allow ingress from Traefik (kube-system namespace) ---
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-ingress-to-admin
  namespace: honeydue
spec:
  podSelector:
    matchLabels:
      app.kubernetes.io/name: admin
  policyTypes:
    - Ingress
  ingress:
    # Traefik runs as DaemonSet with hostNetwork=true; see allow-ingress-to-api
    # for the rationale. Same ipBlock list.
    - from:
        - ipBlock:
            cidr: 178.105.32.198/32
        - ipBlock:
            cidr: 178.104.247.152/32
        - ipBlock:
            cidr: 178.104.249.189/32
        - ipBlock:
            cidr: 10.42.0.0/16
      ports:
        - protocol: TCP
          port: 3000

---
# --- Redis: allow ingress ONLY from api + worker pods ---
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-ingress-to-redis
  namespace: honeydue
spec:
  podSelector:
    matchLabels:
      app.kubernetes.io/name: redis
  policyTypes:
    - Ingress
  ingress:
    - from:
        - podSelector:
            matchLabels:
              app.kubernetes.io/name: api
        - podSelector:
            matchLabels:
              app.kubernetes.io/name: worker
      ports:
        - protocol: TCP
          port: 6379

---
# --- API: allow egress to Redis, Kratos, external services (Neon DB, APNs, FCM, B2, SMTP) ---
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-egress-from-api
  namespace: honeydue
spec:
  podSelector:
    matchLabels:
      app.kubernetes.io/name: api
  policyTypes:
    - Egress
  egress:
    # Redis (in-cluster)
    - to:
        - podSelector:
            matchLabels:
              app.kubernetes.io/name: redis
      ports:
        - protocol: TCP
          port: 6379
    # Kratos (in-cluster). The auth middleware validates every session via
    # http://kratos:4433/sessions/whoami; the AuthService also uses :4434
    # for account deletion (DELETE /admin/identities/{id}). k3s evaluates
    # egress rules AFTER kube-proxy DNAT (runbook §9.2), so this podSelector
    # rule covers Service ClusterIP traffic correctly.
    - to:
        - podSelector:
            matchLabels:
              app.kubernetes.io/name: kratos
      ports:
        - protocol: TCP
          port: 4433
        - protocol: TCP
          port: 4434
    # External services: Neon DB (5432), SMTP (587), HTTPS (443: APNs, FCM, B2, PostHog)
    - to:
        - ipBlock:
            cidr: 0.0.0.0/0
            except:
              - 10.0.0.0/8
              - 172.16.0.0/12
              - 192.168.0.0/16
      ports:
        - protocol: TCP
          port: 5432
        - protocol: TCP
          port: 587
        - protocol: TCP
          port: 443

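# Sketch only (an assumption, NOT part of the applied policy set):
# default-deny-all above also denies ingress to the kratos pods, so the
# Kratos egress rule in allow-egress-from-api is only half of the path. The
# matching ingress allow is not shown in this file; it may already live with
# the other Kratos manifests. If it did not exist, a rule along these lines
# would be needed. The policy name is hypothetical; the pod label and ports
# mirror the egress rule above. Left commented out so applying this file
# does not change behavior.
#
# apiVersion: networking.k8s.io/v1
# kind: NetworkPolicy
# metadata:
#   name: allow-api-to-kratos   # hypothetical name
#   namespace: honeydue
# spec:
#   podSelector:
#     matchLabels:
#       app.kubernetes.io/name: kratos
#   policyTypes:
#     - Ingress
#   ingress:
#     - from:
#         - podSelector:
#             matchLabels:
#               app.kubernetes.io/name: api
#       ports:
#         - protocol: TCP
#           port: 4433
#         - protocol: TCP
#           port: 4434
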
---
# --- Worker: allow egress to Redis, external services ---
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-egress-from-worker
  namespace: honeydue
spec:
  podSelector:
    matchLabels:
      app.kubernetes.io/name: worker
  policyTypes:
    - Egress
  egress:
    # Redis (in-cluster)
    - to:
        - podSelector:
            matchLabels:
              app.kubernetes.io/name: redis
      ports:
        - protocol: TCP
          port: 6379
    # External services: Neon DB (5432), SMTP (587), HTTPS (443: APNs, FCM, B2)
    - to:
        - ipBlock:
            cidr: 0.0.0.0/0
            except:
              - 10.0.0.0/8
              - 172.16.0.0/12
              - 192.168.0.0/16
      ports:
        - protocol: TCP
          port: 5432
        - protocol: TCP
          port: 587
        - protocol: TCP
          port: 443

---
# --- Admin: allow egress to API (internal) for SSR ---
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-egress-from-admin
  namespace: honeydue
spec:
  podSelector:
    matchLabels:
      app.kubernetes.io/name: admin
  policyTypes:
    - Egress
  egress:
    # API service (in-cluster, for server-side API calls)
    - to:
        - podSelector:
            matchLabels:
              app.kubernetes.io/name: api
      ports:
        - protocol: TCP
          port: 8000

---
# --- Web: allow ingress from Traefik (kube-system namespace) ---
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-ingress-to-web
  namespace: honeydue
spec:
  podSelector:
    matchLabels:
      app.kubernetes.io/name: web
  policyTypes:
    - Ingress
  ingress:
    # Traefik runs as DaemonSet with hostNetwork=true; see allow-ingress-to-api
    # for the rationale. Same ipBlock list.
    - from:
        - ipBlock:
            cidr: 178.105.32.198/32
        - ipBlock:
            cidr: 178.104.247.152/32
        - ipBlock:
            cidr: 178.104.249.189/32
        - ipBlock:
            cidr: 10.42.0.0/16
      ports:
        - protocol: TCP
          port: 3000

---
# --- Web: allow egress for the Next.js server-side proxy routes ---
# Browser → app.myhoneydue.com → web pod (Node.js) → api.myhoneydue.com
# The web pod resolves api.myhoneydue.com via public DNS and hits
# Cloudflare (143.). We don't know which CF IP yet at policy time, so
# allow HTTPS to public ipBlock (except private CIDRs).
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-egress-from-web
  namespace: honeydue
spec:
  podSelector:
    matchLabels:
      app.kubernetes.io/name: web
  policyTypes:
    - Egress
  egress:
    # HTTPS to public (api.myhoneydue.com via CF, PostHog, any other remote)
    - to:
        - ipBlock:
            cidr: 0.0.0.0/0
            except:
              - 10.0.0.0/8
              - 172.16.0.0/12
              - 192.168.0.0/16
      ports:
        - protocol: TCP
          port: 443

---
# vmagent egress.
#
# IMPORTANT (gotcha): k3s's built-in NetworkPolicy controller appears to
# evaluate egress rules AFTER kube-proxy's DNAT, not before (contrary to
# the k8s spec). So traffic from a pod to the kubernetes Service
# (ClusterIP 10.43.0.1:443) is policy-checked as dst=<node_public_ip>:6443.
# That's why we need an explicit rule for :6443 to public IPs, even though
# we already allow :443 to the cluster service CIDR.
#
# Without the :6443 rule, vmagent's k8s service discovery silently fails
# and zero pods get scraped. See deploy-k3s/RUNBOOK.md ("vmagent SD broken").
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-egress-from-vmagent
  namespace: honeydue
spec:
  podSelector:
    matchLabels:
      app.kubernetes.io/name: vmagent
  policyTypes:
    - Egress
  egress:
    # DNS (cluster-internal)
    - to:
        - namespaceSelector: {}
      ports:
        - port: 53
          protocol: UDP
        - port: 53
          protocol: TCP
    # k8s API server via ClusterIP (pre-DNAT view)
    - to:
        - ipBlock:
            cidr: 10.43.0.0/16
      ports:
        - port: 443
          protocol: TCP
    # k8s API server post-DNAT (real path k3s NetPol enforcer sees). REQUIRED
    - to:
        - ipBlock:
            cidr: 0.0.0.0/0
            except:
              - 10.42.0.0/16
      ports:
        - port: 6443
          protocol: TCP
    # Scrape api Pods on :8000
    - to:
        - ipBlock:
            cidr: 10.42.0.0/16
      ports:
        - port: 8000
          protocol: TCP
    # Scrape kube-state-metrics Pod on :8080 (pod CIDR)
    - to:
        - ipBlock:
            cidr: 10.42.0.0/16
      ports:
        - port: 8080
          protocol: TCP
    # HTTPS to public (remote-write to obs.88oakapps.com via Cloudflare)
    - to:
        - ipBlock:
            cidr: 0.0.0.0/0
            except:
              - 10.42.0.0/16
              - 10.43.0.0/16
      ports:
        - port: 443
          protocol: TCP

---
# Allow vmagent → api ingress on :8000 so api pods accept scrapes.
# api Pods are otherwise locked down by default-deny-all + allow-ingress-to-api
# (which only allows Traefik). This adds vmagent specifically.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-vmagent-to-api
  namespace: honeydue
spec:
  podSelector:
    matchLabels:
      app.kubernetes.io/name: api
  policyTypes:
    - Ingress
  ingress:
    - from:
        - podSelector:
            matchLabels:
              app.kubernetes.io/name: vmagent
      ports:
        - port: 8000
          protocol: TCP

---
# alloy-logs egress: Grafana Alloy discovers honeydue pods via the k8s API
# and pushes their logs to Loki at obs.88oakapps.com. Same k3s NetworkPolicy
# DNAT gotcha as vmagent: API-server traffic is policy-checked as
# dst=<node_public_ip>:6443, so an explicit :6443 rule is required.
# Alloy reads log FILES from a hostPath, so it needs no ingress and no
# egress to pod :8000/:8080; only DNS, the API server, and obs HTTPS.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-egress-from-alloy-logs
  namespace: honeydue
spec:
  podSelector:
    matchLabels:
      app.kubernetes.io/name: alloy-logs
  policyTypes:
    - Egress
  egress:
    # DNS (cluster-internal)
    - to:
        - namespaceSelector: {}
      ports:
        - port: 53
          protocol: UDP
        - port: 53
          protocol: TCP
    # k8s API server via ClusterIP (pre-DNAT view)
    - to:
        - ipBlock:
            cidr: 10.43.0.0/16
      ports:
        - port: 443
          protocol: TCP
    # k8s API server post-DNAT (real path k3s NetPol enforcer sees). REQUIRED
    - to:
        - ipBlock:
            cidr: 0.0.0.0/0
            except:
              - 10.42.0.0/16
      ports:
        - port: 6443
          protocol: TCP
    # HTTPS to public (log push to obs.88oakapps.com via Cloudflare)
    - to:
        - ipBlock:
            cidr: 0.0.0.0/0
            except:
              - 10.42.0.0/16
              - 10.43.0.0/16
      ports:
        - port: 443
          protocol: TCP