139a990ebc
vmagent's k8s service discovery has been silently broken for 17+ days
because k3s's NetworkPolicy controller evaluates egress AFTER kube-proxy's
DNAT (contrary to the k8s spec). Pod → ClusterIP 10.43.0.1:443 was
DNAT'd to <node_public_ip>:6443, and the resulting :6443 destination
matched none of vmagent's egress rules → TCP RST → "connection refused"
on every SD watch attempt. Grafana panels using kube_* or up{} metrics
returned empty as a result.

Changes:

- network-policies.yaml: commit the previously-cluster-only NetPols
  (allow-egress-from-vmagent, allow-vmagent-to-api) so a fresh deploy
  produces a working cluster. The vmagent egress rule now includes :6443
  to public IPs (the post-DNAT path) and :8080 to the pod CIDR (for
  scraping kube-state-metrics).
- observability/kube-state-metrics.yaml: new manifest. Provides the
  kube_pod_*, kube_deployment_*, kube_service_* metrics that Grafana
  panels need to count pods, replicas, etc. Runs in kube-system with
  cluster-scoped RBAC.
- observability/vmagent.yaml:
  * add kube-state-metrics scrape job to the ConfigMap
  * add vmagent-kube-system Role+RoleBinding so cross-namespace SD works
  * replace the misleading liveness probe (was /-/healthy, which lies
    while SD is broken) with an exec probe that checks /api/v1/targets
    for at least one healthy target, giving automatic recovery from
    future stale-SD incidents
- scripts/03-deploy.sh: actually apply network-policies.yaml (was
  committed but never applied) and apply kube-state-metrics.yaml.
- RUNBOOK.md (new): documents the post-DNAT gotcha, the liveness probe
  trap, bearer-token recovery procedure, drift-detection diff, and a
  post-redeploy verification checklist.
- .gitignore: cover kubeconfig.tunnel (created during SSH-tunnelled
  kubectl sessions) so the admin client cert can't be committed by
  accident.

Verified via kubectl --dry-run on all three modified manifests.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
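
The liveness probe change described above could look roughly like the fragment below. This is a hedged sketch, not the probe from observability/vmagent.yaml: the port (8429, vmagent's default), the presence of busybox `wget` and `grep` in the container image, and all timing values are assumptions chosen for illustration.

```yaml
# Hypothetical sketch of an exec liveness probe for the vmagent container.
# Assumptions (not taken from the commit): vmagent listens on its default
# port 8429, the image ships a shell with wget and grep, and the timing
# values below are placeholders.
livenessProbe:
  exec:
    command:
      - /bin/sh
      - -c
      # Succeeds only if the targets API lists at least one target whose
      # health is "up". If wget fails, grep sees empty input and the
      # probe fails as well.
      - >-
        wget -qO- http://127.0.0.1:8429/api/v1/targets
        | grep -Eq '"health" *: *"up"'
  initialDelaySeconds: 60
  periodSeconds: 30
  timeoutSeconds: 10
  failureThreshold: 5
```

The point of probing the targets list rather than a plain health endpoint is that the process can be alive while service discovery returns nothing; a generous `failureThreshold` avoids restart loops during a short API server outage.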
375 lines
9.2 KiB
YAML
# Network Policies: default-deny with explicit allows
# Apply AFTER namespace and deployments are created.
# Verify: kubectl get networkpolicy -n honeydue

# --- Default deny all ingress and egress ---
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all
  namespace: honeydue
spec:
  podSelector: {}
  policyTypes:
    - Ingress
    - Egress

---
# --- Allow DNS for all pods (required for service discovery) ---
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-dns
  namespace: honeydue
spec:
  podSelector: {}
  policyTypes:
    - Egress
  egress:
    - to: []
      ports:
        - protocol: UDP
          port: 53
        - protocol: TCP
          port: 53

---
# --- API: allow ingress from Traefik (kube-system namespace) ---
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-ingress-to-api
  namespace: honeydue
spec:
  podSelector:
    matchLabels:
      app.kubernetes.io/name: api
  policyTypes:
    - Ingress
  ingress:
    # Traefik runs as DaemonSet with hostNetwork=true, so traffic from it
    # arrives with the NODE IP as source (not a pod IP). The cluster pod CIDR
    # 10.42.0.0/16 covers any intra-cluster caller; the three node IPs
    # cover Traefik on hostNetwork.
    - from:
        - ipBlock:
            cidr: 178.105.32.198/32  # ubuntu-8gb-nbg1-1
        - ipBlock:
            cidr: 178.104.247.152/32  # ubuntu-8gb-nbg1-2
        - ipBlock:
            cidr: 178.104.249.189/32  # ubuntu-8gb-nbg1-3
        - ipBlock:
            cidr: 10.42.0.0/16  # cluster pod CIDR
      ports:
        - protocol: TCP
          port: 8000

---
# --- Admin: allow ingress from Traefik (kube-system namespace) ---
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-ingress-to-admin
  namespace: honeydue
spec:
  podSelector:
    matchLabels:
      app.kubernetes.io/name: admin
  policyTypes:
    - Ingress
  ingress:
    # Traefik runs as DaemonSet with hostNetwork=true; see allow-ingress-to-api
    # for the rationale. Same ipBlock list.
    - from:
        - ipBlock:
            cidr: 178.105.32.198/32
        - ipBlock:
            cidr: 178.104.247.152/32
        - ipBlock:
            cidr: 178.104.249.189/32
        - ipBlock:
            cidr: 10.42.0.0/16
      ports:
        - protocol: TCP
          port: 3000

---
# --- Redis: allow ingress ONLY from api + worker pods ---
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-ingress-to-redis
  namespace: honeydue
spec:
  podSelector:
    matchLabels:
      app.kubernetes.io/name: redis
  policyTypes:
    - Ingress
  ingress:
    - from:
        - podSelector:
            matchLabels:
              app.kubernetes.io/name: api
        - podSelector:
            matchLabels:
              app.kubernetes.io/name: worker
      ports:
        - protocol: TCP
          port: 6379

---
# --- API: allow egress to Redis, external services (Neon DB, APNs, FCM, B2, SMTP) ---
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-egress-from-api
  namespace: honeydue
spec:
  podSelector:
    matchLabels:
      app.kubernetes.io/name: api
  policyTypes:
    - Egress
  egress:
    # Redis (in-cluster)
    - to:
        - podSelector:
            matchLabels:
              app.kubernetes.io/name: redis
      ports:
        - protocol: TCP
          port: 6379
    # External services: Neon DB (5432), SMTP (587), HTTPS (443: APNs, FCM, B2, PostHog)
    - to:
        - ipBlock:
            cidr: 0.0.0.0/0
            except:
              - 10.0.0.0/8
              - 172.16.0.0/12
              - 192.168.0.0/16
      ports:
        - protocol: TCP
          port: 5432
        - protocol: TCP
          port: 587
        - protocol: TCP
          port: 443

---
# --- Worker: allow egress to Redis, external services ---
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-egress-from-worker
  namespace: honeydue
spec:
  podSelector:
    matchLabels:
      app.kubernetes.io/name: worker
  policyTypes:
    - Egress
  egress:
    # Redis (in-cluster)
    - to:
        - podSelector:
            matchLabels:
              app.kubernetes.io/name: redis
      ports:
        - protocol: TCP
          port: 6379
    # External services: Neon DB (5432), SMTP (587), HTTPS (443: APNs, FCM, B2)
    - to:
        - ipBlock:
            cidr: 0.0.0.0/0
            except:
              - 10.0.0.0/8
              - 172.16.0.0/12
              - 192.168.0.0/16
      ports:
        - protocol: TCP
          port: 5432
        - protocol: TCP
          port: 587
        - protocol: TCP
          port: 443

---
# --- Admin: allow egress to API (internal) for SSR ---
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-egress-from-admin
  namespace: honeydue
spec:
  podSelector:
    matchLabels:
      app.kubernetes.io/name: admin
  policyTypes:
    - Egress
  egress:
    # API service (in-cluster, for server-side API calls)
    - to:
        - podSelector:
            matchLabels:
              app.kubernetes.io/name: api
      ports:
        - protocol: TCP
          port: 8000

---
# --- Web: allow ingress from Traefik (kube-system namespace) ---
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-ingress-to-web
  namespace: honeydue
spec:
  podSelector:
    matchLabels:
      app.kubernetes.io/name: web
  policyTypes:
    - Ingress
  ingress:
    # Traefik runs as DaemonSet with hostNetwork=true; see allow-ingress-to-api
    # for the rationale. Same ipBlock list.
    - from:
        - ipBlock:
            cidr: 178.105.32.198/32
        - ipBlock:
            cidr: 178.104.247.152/32
        - ipBlock:
            cidr: 178.104.249.189/32
        - ipBlock:
            cidr: 10.42.0.0/16
      ports:
        - protocol: TCP
          port: 3000

---
# --- Web: allow egress for the Next.js server-side proxy routes ---
# Browser → app.myhoneydue.com → web pod (Node.js) → api.myhoneydue.com
# The web pod resolves api.myhoneydue.com via public DNS and hits
# Cloudflare (143.). We don't know which CF IP yet at policy time, so
# allow HTTPS to public ipBlock (except private CIDRs).
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-egress-from-web
  namespace: honeydue
spec:
  podSelector:
    matchLabels:
      app.kubernetes.io/name: web
  policyTypes:
    - Egress
  egress:
    # HTTPS to public (api.myhoneydue.com via CF, PostHog, any other remote)
    - to:
        - ipBlock:
            cidr: 0.0.0.0/0
            except:
              - 10.0.0.0/8
              - 172.16.0.0/12
              - 192.168.0.0/16
      ports:
        - protocol: TCP
          port: 443

---
# vmagent egress.
#
# IMPORTANT (gotcha): k3s's built-in NetworkPolicy controller appears to
# evaluate egress rules AFTER kube-proxy's DNAT, not before (contrary to
# the k8s spec). So traffic from a pod to the kubernetes Service
# (ClusterIP 10.43.0.1:443) is policy-checked as dst=<node_public_ip>:6443.
# That's why we need an explicit rule for :6443 to public IPs, even though
# we already allow :443 to the cluster service CIDR.
#
# Without the :6443 rule, vmagent's k8s service discovery silently fails
# and zero pods get scraped. See deploy-k3s/RUNBOOK.md ("vmagent SD broken").
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-egress-from-vmagent
  namespace: honeydue
spec:
  podSelector:
    matchLabels:
      app.kubernetes.io/name: vmagent
  policyTypes:
    - Egress
  egress:
    # DNS (cluster-internal)
    - to:
        - namespaceSelector: {}
      ports:
        - port: 53
          protocol: UDP
        - port: 53
          protocol: TCP
    # k8s API server via ClusterIP (pre-DNAT view)
    - to:
        - ipBlock:
            cidr: 10.43.0.0/16
      ports:
        - port: 443
          protocol: TCP
    # k8s API server post-DNAT (real path k3s NetPol enforcer sees): REQUIRED
    - to:
        - ipBlock:
            cidr: 0.0.0.0/0
            except:
              - 10.42.0.0/16
      ports:
        - port: 6443
          protocol: TCP
    # Scrape api Pods on :8000
    - to:
        - ipBlock:
            cidr: 10.42.0.0/16
      ports:
        - port: 8000
          protocol: TCP
    # Scrape kube-state-metrics Pod on :8080 (pod CIDR)
    - to:
        - ipBlock:
            cidr: 10.42.0.0/16
      ports:
        - port: 8080
          protocol: TCP
    # HTTPS to public (remote-write to obs.88oakapps.com via Cloudflare)
    - to:
        - ipBlock:
            cidr: 0.0.0.0/0
            except:
              - 10.42.0.0/16
              - 10.43.0.0/16
      ports:
        - port: 443
          protocol: TCP

---
# Allow vmagent → api ingress on :8000 so api pods accept scrapes.
# api Pods are otherwise locked down by default-deny-all + allow-ingress-to-api
# (which only allows Traefik). This adds vmagent specifically.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-vmagent-to-api
  namespace: honeydue
spec:
  podSelector:
    matchLabels:
      app.kubernetes.io/name: api
  policyTypes:
    - Ingress
  ingress:
    - from:
        - podSelector:
            matchLabels:
              app.kubernetes.io/name: vmagent
      ports:
        - port: 8000
          protocol: TCP