fix(security): remediate 2026-05-12 audit findings (Stages 2–5)

Remediation of the 2026-05-12/13 audits (78 findings + cluster gaps),
tracked in deploy-k3s/SECURITY.md, plus fixes from two independent
post-remediation reviews.

Auth & sessions:
- SHA-256 hashed auth-token storage (C1); prior-token cache eviction on
  re-login (MEDIUM-1)
- local Google JWKS verification, iss/aud/exp checks (C2/C3)
- constant-time login + generic errors (L1/LIVE-L11/LIVE-L13)
- per-account login lockout keyed on distinct source IPs (M5/MEDIUM-3)
- verified-email gating, login rate limiting (LIVE-L19, H1-H3)

IAP & webhooks:
- Apple/Google cross-account replay protection (C5/C6/C10/C13, H5/H6)
- migrations 000003-000006 (token hashing, IAP replay, audit_log +
  webhook_event_log table creation, append-only audit log)

Authorization & races:
- file-ownership owner-OR-member fix (C7), atomic share-code join
  (C9/H9), device-token reassignment (C8/LOW-3)

Secrets & deploy:
- secrets file-mounted at /etc/honeydue/secrets, not env (F8); Redis
  password out of the ConfigMap (HIGH-1); B2 keys reconciled
- digest-pinned images, admin ingress hardening, CSP/HSTS, /metrics
  lockdown; kubeconfig 0600, etcd secrets-encryption, fail2ban +
  unattended-upgrades at provision; secret-rotation runbook

Build, vet, and the full test suite (incl. -race) pass; the goose
migration chain is verified against PostgreSQL 16.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Trey t
2026-05-16 22:28:33 -05:00
parent 2004f9c5b2
commit c77ff07ce9
59 changed files with 2819 additions and 1245 deletions
@@ -0,0 +1,13 @@
-- +goose Up
-- Audit C1: auth tokens are stored as SHA-256 hashes (hex, 64 chars), never
-- as plaintext, so a database compromise no longer yields usable session
-- tokens. Widen the key column from 40 to 64 chars. Existing plaintext rows
-- cannot be rehashed in place, so they are dropped — every user logs in
-- once after this deploy. This is expected and one-time.
ALTER TABLE user_authtoken ALTER COLUMN key TYPE varchar(64);
DELETE FROM user_authtoken;
-- +goose Down
-- Tokens cannot be un-hashed; clearing the table is the only safe rollback.
DELETE FROM user_authtoken;
ALTER TABLE user_authtoken ALTER COLUMN key TYPE varchar(40);
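The column width above follows from the digest size: SHA-256 yields 32 bytes, which is 64 hex characters. A minimal Go sketch of the hashing step, assuming the stored value is the lowercase hex digest of the plaintext token; `hashToken` and `newPlaintextToken` are hypothetical names, and the real token format is not shown in this commit:

```go
package main

import (
	"crypto/rand"
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// hashToken returns the lowercase hex SHA-256 digest of a plaintext token.
// The digest is 32 bytes, so the hex form is always 64 characters, which is
// what the varchar(64) key column is sized for.
func hashToken(plaintext string) string {
	sum := sha256.Sum256([]byte(plaintext))
	return hex.EncodeToString(sum[:])
}

// newPlaintextToken is a hypothetical generator: 20 random bytes rendered as
// 40 hex characters (assumed here only to match the old 40-char column).
func newPlaintextToken() (string, error) {
	buf := make([]byte, 20)
	if _, err := rand.Read(buf); err != nil {
		return "", err
	}
	return hex.EncodeToString(buf), nil
}

func main() {
	token, err := newPlaintextToken()
	if err != nil {
		panic(err)
	}
	// Only the digest would be written to user_authtoken.key; the plaintext
	// goes to the client and is hashed again on each request for the lookup.
	stored := hashToken(token)
	fmt.Println(len(token), len(stored))
}
```

Because lookups compare digests, a leaked table row cannot be presented as a session token without inverting SHA-256.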
@@ -0,0 +1,47 @@
-- +goose Up
-- Audit C5/C6/C10/C13: bind each in-app-purchase transaction to exactly one
-- account so a valid receipt cannot be replayed against a second account to
-- grant Pro for free.
--
-- apple_original_transaction_id is a dedicated, indexed column — it replaces
-- the LIKE '%...%' scan over apple_receipt_data that the Apple webhook used
-- to find users (C13). google_purchase_token already exists; we just add the
-- uniqueness guarantee.
ALTER TABLE subscription_usersubscription
ADD COLUMN IF NOT EXISTS apple_original_transaction_id text;
-- Partial unique indexes: one account per transaction. NULL/empty rows are
-- excluded so accounts without an IAP are unaffected.
CREATE UNIQUE INDEX IF NOT EXISTS uq_subscription_apple_original_txn
ON subscription_usersubscription (apple_original_transaction_id)
WHERE apple_original_transaction_id IS NOT NULL
AND apple_original_transaction_id <> '';
-- Pre-flight dedup for the Google index below. apple_original_transaction_id
-- is brand-new (added above), so it is all-NULL and cannot collide. But
-- google_purchase_token is a pre-existing column, and the C6 replay bug being
-- fixed here is exactly "the same token bound to multiple accounts" — so
-- duplicate rows may exist and would make the UNIQUE index below fail to
-- build, aborting the migrate Job. Keep the earliest subscription row for
-- each token and clear the token on the rest; those rows lose a binding that
-- was disputed anyway, while the original (earliest) owner keeps it.
UPDATE subscription_usersubscription s
   SET google_purchase_token = NULL
 WHERE s.google_purchase_token IS NOT NULL
   AND s.google_purchase_token <> ''
   AND s.id <> (
       SELECT MIN(s2.id)
         FROM subscription_usersubscription s2
        WHERE s2.google_purchase_token = s.google_purchase_token
   );
CREATE UNIQUE INDEX IF NOT EXISTS uq_subscription_google_purchase_token
ON subscription_usersubscription (google_purchase_token)
WHERE google_purchase_token IS NOT NULL
AND google_purchase_token <> '';
-- +goose Down
DROP INDEX IF EXISTS uq_subscription_google_purchase_token;
DROP INDEX IF EXISTS uq_subscription_apple_original_txn;
ALTER TABLE subscription_usersubscription
DROP COLUMN IF EXISTS apple_original_transaction_id;
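The pre-flight dedup rule in this migration (keep the lowest-id row per token, clear the token on the rest, ignore NULL and empty values) can be modelled in memory. This is an illustrative Go sketch of the rule only, not code from the repository; the `subscription` struct and `dedupGoogleTokens` are hypothetical, and a nil pointer stands in for SQL NULL:

```go
package main

import "fmt"

// subscription models just the two columns the dedup touches.
// A nil token stands in for SQL NULL.
type subscription struct {
	ID                  int64
	GooglePurchaseToken *string
}

// dedupGoogleTokens clears the token on every row except the lowest-id row
// for each non-empty token, mirroring the UPDATE in the migration.
func dedupGoogleTokens(rows []subscription) {
	keep := make(map[string]int64) // token -> lowest id holding it
	for _, r := range rows {
		if r.GooglePurchaseToken == nil || *r.GooglePurchaseToken == "" {
			continue // excluded by the WHERE clause
		}
		tok := *r.GooglePurchaseToken
		if id, ok := keep[tok]; !ok || r.ID < id {
			keep[tok] = r.ID
		}
	}
	for i := range rows {
		tok := rows[i].GooglePurchaseToken
		if tok == nil || *tok == "" {
			continue
		}
		if rows[i].ID != keep[*tok] {
			rows[i].GooglePurchaseToken = nil // later claimants lose the binding
		}
	}
}

func strPtr(s string) *string { return &s }

func main() {
	rows := []subscription{
		{ID: 3, GooglePurchaseToken: strPtr("tok-a")},
		{ID: 1, GooglePurchaseToken: strPtr("tok-a")},
		{ID: 2, GooglePurchaseToken: strPtr("tok-b")},
		{ID: 4, GooglePurchaseToken: nil},
	}
	dedupGoogleTokens(rows)
	for _, r := range rows {
		if r.GooglePurchaseToken == nil {
			fmt.Println(r.ID, "NULL")
		} else {
			fmt.Println(r.ID, *r.GooglePurchaseToken)
		}
	}
}
```

After this pass every non-empty token appears on at most one row, which is the precondition the partial unique index needs in order to build.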
@@ -0,0 +1,52 @@
-- +goose Up
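-- Audit M7: make audit_log append-only. BEFORE row-level triggers reject
-- UPDATE and DELETE so the security-event history cannot be altered or erased
-- row by row after the fact, even by a database account with broad table
-- privileges. Row-level triggers do not fire on TRUNCATE, and the table owner
-- can still disable or drop them, so those rights must be withheld separately.
-- The application only ever INSERTs into this table.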
-- The audit_log table itself was never created by a goose migration — it is
-- only built by GORM AutoMigrate in the test harness, and production never
-- runs AutoMigrate. CREATE TABLE IF NOT EXISTS brings it under migration
-- control without disturbing an existing table: a no-op on a DB that already
-- has it, and a correct build (matching models.AuditLog) on a from-scratch
-- redeploy — so the trigger below has a table to attach to and a clean
-- redeploy comes up with a working, append-only audit log.
CREATE TABLE IF NOT EXISTS audit_log (
id BIGSERIAL PRIMARY KEY,
user_id BIGINT,
event_type VARCHAR(50) NOT NULL,
ip_address VARCHAR(45),
user_agent TEXT,
details JSONB,
created_at TIMESTAMPTZ NOT NULL DEFAULT now()
);
CREATE INDEX IF NOT EXISTS idx_audit_log_user_id ON audit_log (user_id);
CREATE INDEX IF NOT EXISTS idx_audit_log_created_at ON audit_log (created_at);
-- +goose StatementBegin
CREATE OR REPLACE FUNCTION audit_log_append_only() RETURNS trigger AS $$
BEGIN
RAISE EXCEPTION 'audit_log is append-only: % is not permitted', TG_OP;
END;
$$ LANGUAGE plpgsql;
-- +goose StatementEnd
-- DROP ... IF EXISTS before CREATE keeps this idempotent (CREATE OR REPLACE
-- TRIGGER is only available from PostgreSQL 14 onward).
DROP TRIGGER IF EXISTS audit_log_no_update ON audit_log;
CREATE TRIGGER audit_log_no_update
BEFORE UPDATE ON audit_log
FOR EACH ROW EXECUTE FUNCTION audit_log_append_only();
DROP TRIGGER IF EXISTS audit_log_no_delete ON audit_log;
CREATE TRIGGER audit_log_no_delete
BEFORE DELETE ON audit_log
FOR EACH ROW EXECUTE FUNCTION audit_log_append_only();
-- +goose Down
-- Reverses only the append-only guard, which is this migration's purpose.
-- The audit_log table is intentionally NOT dropped — it may hold security
-- history that predates this migration.
DROP TRIGGER IF EXISTS audit_log_no_delete ON audit_log;
DROP TRIGGER IF EXISTS audit_log_no_update ON audit_log;
DROP FUNCTION IF EXISTS audit_log_append_only();
@@ -0,0 +1,30 @@
-- +goose Up
-- Audit H6 follow-up. The Apple/Google webhook handler now fails CLOSED on a
-- deduplication-store error: if it cannot consult webhook_event_log it returns
-- 500 rather than risk processing a replayed event. That makes the presence of
-- the webhook_event_log table mandatory.
--
-- Like audit_log, this table was never created by a goose migration — only by
-- GORM AutoMigrate in tests — so a from-scratch redeploy would come up without
-- it and 500 every subscription webhook. CREATE TABLE IF NOT EXISTS brings it
-- under migration control: a no-op where the table already exists, and a
-- correct build (matching repositories.WebhookEvent) on a fresh database.
CREATE TABLE IF NOT EXISTS webhook_event_log (
id BIGSERIAL PRIMARY KEY,
event_id VARCHAR(255) NOT NULL,
provider VARCHAR(20) NOT NULL,
event_type VARCHAR(100) NOT NULL,
processed_at TIMESTAMPTZ NOT NULL DEFAULT now(),
payload_hash VARCHAR(64)
);
-- (provider, event_id) is the dedup key — matches the
-- uniqueIndex:idx_provider_event_id GORM tags on repositories.WebhookEvent.
CREATE UNIQUE INDEX IF NOT EXISTS idx_provider_event_id
ON webhook_event_log (provider, event_id);
-- +goose Down
-- The table is intentionally NOT dropped — it may hold deduplication history
-- that predates this migration, and dropping it would let already-processed
-- webhook events be replayed. Down is a documented no-op.
SELECT 1;
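The fail-closed behavior described in this migration's header can be sketched as follows. This is an illustrative Go model, not the handler from the repository: `dedupStore`, `memStore`, `brokenStore`, and `handleWebhook` are hypothetical names, and answering a duplicate with 200 without reprocessing is an assumption, since the commit only states that a store error yields 500.

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
)

// eventKey mirrors the (provider, event_id) unique index on webhook_event_log.
type eventKey struct{ Provider, EventID string }

// dedupStore is a hypothetical abstraction over webhook_event_log.
type dedupStore interface {
	// Record stores the key and reports whether it was seen for the first time.
	Record(key eventKey) (isNew bool, err error)
}

// memStore is an in-memory stand-in for the table.
type memStore struct{ seen map[eventKey]bool }

func (m *memStore) Record(k eventKey) (bool, error) {
	if m.seen[k] {
		return false, nil
	}
	m.seen[k] = true
	return true, nil
}

// brokenStore simulates a missing table or an unreachable database.
type brokenStore struct{}

func (brokenStore) Record(eventKey) (bool, error) {
	return false, errors.New("dedup store unavailable")
}

// handleWebhook fails closed: if the dedup store cannot be consulted, the
// event is not processed and the caller gets a 500.
func handleWebhook(store dedupStore, key eventKey, process func()) int {
	isNew, err := store.Record(key)
	if err != nil {
		return http.StatusInternalServerError
	}
	if isNew {
		process()
	}
	return http.StatusOK // duplicates are acknowledged but not reprocessed (assumption)
}

func main() {
	processed := 0
	process := func() { processed++ }
	store := &memStore{seen: map[eventKey]bool{}}
	key := eventKey{Provider: "apple", EventID: "evt-1"}

	first := handleWebhook(store, key, process)
	replay := handleWebhook(store, key, process)
	outage := handleWebhook(brokenStore{}, key, process)
	fmt.Println(first, replay, outage, processed)
}
```

With this shape a missing webhook_event_log table turns every webhook into a 500, which is why the migration creates the table before the handler can depend on it.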