Commit Graph

21 Commits

Author SHA1 Message Date
Trey t 12b2f9d43b Adopt pressly/goose for schema migrations
Backend CI / Test (push) Has been cancelled
Backend CI / Contract Tests (push) Has been cancelled
Backend CI / Build (push) Has been cancelled
Backend CI / Lint (push) Has been cancelled
Backend CI / Secret Scanning (push) Has been cancelled
Replaces the previous hand-rolled MigrateWithLock + GORM AutoMigrate path,
which had two compounding problems:
- AutoMigrate ran on every pod startup (~5 min over the transatlantic
  link) even when no schema changes had landed
- pg_advisory_lock is session-scoped, which silently fails through
  Neon's pgbouncer transaction-mode pooler — turns out this is a
  known and documented limitation that bites golang-migrate too

Goose was chosen over golang-migrate (the other heavyweight) because:
- Goose wraps each migration file in a transaction by default, so a
  failure rolls back cleanly instead of leaving a "dirty" version
  state requiring manual force-reset (golang-migrate's known
  weakness, per its own issue tracker — see #1001 + Atlas's writeup)
- Goose's locking is opt-in. We don't opt in: migrations run as a
  single Kubernetes Job, which IS the singleton process. No advisory
  lock needed at all.

Layout:
- migrations/000001_init.sql — schema-only pg_dump of the live Neon
  DB at adoption, stripped of psql-only directives that block goose's
  bookkeeping insert. Pre-goose hand-numbered migrations 002-022 had
  their effects folded into this baseline; deleted from the live tree
  but preserved in git history at 58e6997.
- Dockerfile installs `goose v3.22.1` at build time and copies the
  binary into the api image. The migrate Job reuses the api image with
  command=goose, so no separate image to build/push/version.
- deploy-k3s/manifests/migrate/job.yaml: a one-shot Job that strips
  the -pooler segment from DB_HOST (advisory lock won't survive
  pgbouncer transaction-mode), runs `goose up`, exits.
- deploy-k3s/scripts/03-deploy.sh: deletes any prior Job, applies the
  fresh one, `kubectl wait --for=condition=complete --timeout=10m`,
  then proceeds with api/worker rollout. Job failure aborts the deploy
  before any new app pod sees a stale schema.
- internal/database/database.go::RequireSchemaApplied checks
  goose_db_version on startup. api/worker refuse to boot if the
  table is missing or its latest row has is_applied=false — the
  fail-fast for "operator forgot to run migrate."
- Makefile: migrate-up / migrate-down / migrate-status / migrate-new
  for local workflow.

Production DB was bootstrapped manually:
  $ goose -dir migrations postgres "$DSN" version  # creates table
  $ psql ... -c "INSERT INTO goose_db_version (version_id, is_applied, tstamp) VALUES (1, true, NOW());"

Smoke test against fresh Postgres locally: 50 user tables created in
284ms via `goose up`, version_id=1 + is_applied=t recorded.

Verified the local goose CLI talks to prod successfully:
  $ goose ... status
  Applied At                  Migration
  =======================================
  Mon Apr 27 03:43:55 2026 -- 000001_init.sql

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-26 22:46:36 -05:00
Trey t bc3da007db Wire OpenTelemetry tracing — HTTP, B2, APNs, FCM, asynq, GORM (partial)
Backend CI / Test (push) Has been cancelled
Backend CI / Contract Tests (push) Has been cancelled
Backend CI / Build (push) Has been cancelled
Backend CI / Lint (push) Has been cancelled
Backend CI / Secret Scanning (push) Has been cancelled
Step 1 — OTel SDK: cmd/api and cmd/worker initialize a tracer provider
that exports OTLP/HTTP to obs.88oakapps.com (Jaeger all-in-one). Sampling
is AlwaysSample in dev (DEBUG=true) and TraceIDRatioBased(0.1) in prod,
overridable via OTEL_TRACES_SAMPLER_ARG. Service names are honeydue-api
and honeydue-worker. otelecho.Middleware opens a span per HTTP request.

Step 2 — Manual spans: storage_service.Upload now takes ctx and emits
storage.upload + b2.PutObject spans (size_bytes, key, mime_type, bucket,
result attrs). APNs Send/SendWithCategory and FCM sendOne emit per-token
spans with topic, status_code, reason. Asynq middleware emits
asynq.handle:<task_type> per job with retry/payload attrs and records
asynq_job_duration_seconds.

Step 3 — Database: otelgorm plugin registered in database.Connect, so
any SQL emitted via db.WithContext(ctx) attaches to the request span.
Every repository now exposes WithContext(ctx) *XRepository as the
migration helper. TaskService.ListTasks and GetTasksByResidence are
migrated end-to-end (ctx threaded through handler → service → repo);
remaining services adopt the same pattern incrementally — pre-migration
methods still emit untraced SQL via the unchanged db field.

OBS_TRACES_URL and OBS_INGEST_TOKEN flow from deploy/prod.env →
honeydue-secrets → api+worker Deployments via secretKeyRef (optional).
02-setup-secrets.sh sources them from prod.env on next run; manifests
mark both env vars optional so the deployment rolls without traces if
the secret is absent.

ch15 observability doc now lists what produces spans today vs the
remaining migration work, with the explicit per-method pattern.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-25 15:28:05 -05:00
Trey T 4ec4bbbfe8 Auto-seed lookups + admin + templates on first API boot
Backend CI / Test (push) Has been cancelled
Backend CI / Contract Tests (push) Has been cancelled
Backend CI / Lint (push) Has been cancelled
Backend CI / Secret Scanning (push) Has been cancelled
Backend CI / Build (push) Has been cancelled
Add a data_migration that runs seeds/001_lookups.sql,
seeds/003_admin_user.sql, and seeds/003_task_templates.sql exactly
once on startup and invalidates the Redis seeded_data cache afterwards
so /api/static_data/ returns fresh results. Removes the need to
remember `./dev.sh seed-all`; the data_migrations tracking row prevents
re-runs, and each INSERT uses ON CONFLICT DO UPDATE so re-execution is
safe.
2026-04-15 08:37:55 -05:00
Trey t 33eee812b6 Harden prod deploy: versioned secrets, healthchecks, migration lock, dry-run
Swarm stack
- Resource limits on all services, stop_grace_period 60s on api/worker/admin
- Dozzle bound to manager loopback only (ssh -L required for access)
- Worker health server on :6060, admin /api/health endpoint
- Redis 200M LRU cap, B2/S3 env vars wired through to api service

Deploy script
- DRY_RUN=1 prints plan + exits
- Auto-rollback on failed healthcheck, docker logout at end
- Versioned-secret pruning keeps last SECRET_KEEP_VERSIONS (default 3)
- PUSH_LATEST_TAG default flipped to false
- B2 all-or-none validation before deploy

Code
- cmd/api takes pg_advisory_lock on a dedicated connection before
  AutoMigrate, serialising boot-time migrations across replicas
- cmd/worker exposes an HTTP /health endpoint with graceful shutdown

Docs
- deploy/DEPLOYING.md: step-by-step walkthrough for a real deploy
- deploy/shit_deploy_cant_do.md: manual prerequisites + recurring ops
- deploy/README.md updated with storage toggle, worker-replica caveat,
  multi-arch recipe, connection-pool tuning, renumbered sections

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 15:22:43 -05:00
Trey t 2e10822e5a Add S3-compatible storage backend (B2, MinIO, AWS S3)
Introduces a StorageBackend interface with local filesystem and S3
implementations. The StorageService delegates raw I/O to the backend
while keeping validation, encryption, and URL generation unchanged.

Backend selection is config-driven: set B2_ENDPOINT + B2_KEY_ID +
B2_APP_KEY + B2_BUCKET_NAME for S3 mode, or STORAGE_UPLOAD_DIR for
local mode. STORAGE_USE_SSL=false for in-cluster MinIO (HTTP).

All existing tests pass unchanged — the local backend preserves
identical behavior to the previous direct-filesystem implementation.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-30 21:31:24 -05:00
Trey T 4abc57535e Add delete account endpoint and file encryption at rest
Delete Account (Plan #2):
- DELETE /api/auth/account/ with password or "DELETE" confirmation
- Cascade delete across 15+ tables in correct FK order
- Auth provider detection (email/apple/google) for /auth/me/
- File cleanup after account deletion
- Handler + repository tests (12 tests)

Encryption at Rest (Plan #3):
- AES-256-GCM envelope encryption (per-file DEK wrapped by KEK)
- Encrypt on upload, auto-decrypt on serve via StorageService.ReadFile()
- MediaHandler serves decrypted files via c.Blob()
- TaskService email image loading uses ReadFile()
- cmd/migrate-encrypt CLI tool with --dry-run for existing files
- Encryption service + storage service tests (18 tests)
2026-03-26 10:41:01 -05:00
Trey t 42a5533a56 Fix 113 hardening issues across entire Go backend
Security:
- Replace all binding: tags with validate: + c.Validate() in admin handlers
- Add rate limiting to auth endpoints (login, register, password reset)
- Add security headers (HSTS, XSS protection, nosniff, frame options)
- Wire Google Pub/Sub token verification into webhook handler
- Replace ParseUnverified with proper OIDC/JWKS key verification
- Verify inner Apple JWS signatures in webhook handler
- Add io.LimitReader (1MB) to all webhook body reads
- Add ownership verification to file deletion
- Move hardcoded admin credentials to env vars
- Add uniqueIndex to User.Email
- Hide ConfirmationCode from JSON serialization
- Mask confirmation codes in admin responses
- Use http.DetectContentType for upload validation
- Fix path traversal in storage service
- Replace os.Getenv with Viper in stripe service
- Sanitize Redis URLs before logging
- Separate DEBUG_FIXED_CODES from DEBUG flag
- Reject weak SECRET_KEY in production
- Add host check on /_next/* proxy routes
- Use explicit localhost CORS origins in debug mode
- Replace err.Error() with generic messages in all admin error responses

Critical fixes:
- Rewrite FCM to HTTP v1 API with OAuth 2.0 service account auth
- Fix user_customuser -> auth_user table names in raw SQL
- Fix dashboard verified query to use UserProfile model
- Add escapeLikeWildcards() to prevent SQL wildcard injection

Bug fixes:
- Add bounds checks for days/expiring_soon query params (1-3650)
- Add receipt_data/transaction_id empty-check to RestoreSubscription
- Change Active bool -> *bool in device handler
- Check all unchecked GORM/FindByIDWithProfile errors
- Add validation for notification hour fields (0-23)
- Add max=10000 validation on task description updates

Transactions & data integrity:
- Wrap registration flow in transaction
- Wrap QuickComplete in transaction
- Move image creation inside completion transaction
- Wrap SetSpecialties in transaction
- Wrap GetOrCreateToken in transaction
- Wrap completion+image deletion in transaction

Performance:
- Batch completion summaries (2 queries vs 2N)
- Reuse single http.Client in IAP validation
- Cache dashboard counts (30s TTL)
- Batch COUNT queries in admin user list
- Add Limit(500) to document queries
- Add reminder_stage+due_date filters to reminder queries
- Parse AllowedTypes once at init
- In-memory user cache in auth middleware (30s TTL)
- Timezone change detection cache
- Optimize P95 with per-endpoint sorted buffers
- Replace crypto/md5 with hash/fnv for ETags

Code quality:
- Add sync.Once to all monitoring Stop()/Close() methods
- Replace 8 fmt.Printf with zerolog in auth service
- Log previously discarded errors
- Standardize delete response shapes
- Route hardcoded English through i18n
- Remove FileURL from DocumentResponse (keep MediaURL only)
- Thread user timezone through kanban board responses
- Initialize empty slices to prevent null JSON
- Extract shared field map for task Update/UpdateTx
- Delete unused SoftDeleteModel, min(), formatCron, legacy handlers

Worker & jobs:
- Wire Asynq email infrastructure into worker
- Register HandleReminderLogCleanup with daily 3AM cron
- Use per-user timezone in HandleSmartReminder
- Replace direct DB queries with repository calls
- Delete legacy reminder handlers (~200 lines)
- Delete unused task type constants

Dependencies:
- Replace archived jung-kurt/gofpdf with go-pdf/fpdf
- Replace unmaintained gomail.v2 with wneessen/go-mail
- Add TODO for Echo jwt v3 transitive dep removal

Test infrastructure:
- Fix MakeRequest/SeedLookupData error handling
- Replace os.Exit(0) with t.Skip() in scope/consistency tests
- Add 11 new FCM v1 tests

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-18 23:14:13 -05:00
Trey t 4976eafc6c Rebrand from Casera/MyCrib to honeyDue
Total rebrand across all Go API source files:
- Go module path: casera-api -> honeydue-api
- All imports updated (130+ files)
- Docker: containers, images, networks renamed
- Email templates: support email, noreply, icon URL
- Domains: casera.app/mycrib.treytartt.com -> honeyDue.treytartt.com
- Bundle IDs: com.tt.casera -> com.tt.honeyDue
- IAP product IDs updated
- Landing page, admin panel, config defaults
- Seeds, CI workflows, Makefile, docs
- Database table names preserved (no migration needed)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-07 06:33:38 -06:00
treyt e26116e2cf Add webhook logging, pagination, middleware, migrations, and prod hardening
- Webhook event logging repo and subscription webhook idempotency
- Pagination helper (echohelpers) with cursor/offset support
- Request ID and structured logging middleware
- Push client improvements (FCM HTTP v1, better error handling)
- Task model version column, business constraint migrations, targeted indexes
- Expanded categorization chain tests
- Email service and config hardening
- CI workflow updates, .gitignore additions, .env.example updates

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-24 21:32:09 -06:00
Trey t 6dac34e373 Migrate from Gin to Echo framework and add comprehensive integration tests
Major changes:
- Migrate all handlers from Gin to Echo framework
- Add new apperrors, echohelpers, and validator packages
- Update middleware for Echo compatibility
- Add ArchivedHandler to task categorization chain (archived tasks go to cancelled_tasks column)
- Add 6 new integration tests:
  - RecurringTaskLifecycle: NextDueDate advancement for weekly/monthly tasks
  - MultiUserSharing: Complex sharing with user removal
  - TaskStateTransitions: All state transitions and kanban column changes
  - DateBoundaryEdgeCases: Threshold boundary testing
  - CascadeOperations: Residence deletion cascade effects
  - MultiUserOperations: Shared residence collaboration
- Add single-purpose repository functions for kanban columns (GetOverdueTasks, GetDueSoonTasks, etc.)
- Fix RemoveUser route param mismatch (userId -> user_id)
- Fix determineExpectedColumn helper to correctly prioritize in_progress over overdue

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-16 13:52:08 -06:00
Trey t eb127fda20 Add real-time log monitoring and system stats dashboard
Implements a comprehensive monitoring system for the admin interface:

Backend:
- New monitoring package with Redis ring buffer for log storage
- Zerolog MultiWriter to capture logs to Redis
- System stats collection (CPU, memory, disk, goroutines, GC)
- HTTP metrics middleware (request counts, latency, error rates)
- Asynq queue stats for worker process
- WebSocket endpoint for real-time log streaming
- Admin auth middleware now accepts token in query params (for WebSocket)

Frontend:
- New monitoring page with tabs (Overview, Logs, API Stats, Worker Stats)
- Real-time log viewer with level filtering and search
- System stats cards showing CPU, memory, goroutines, uptime
- HTTP endpoint statistics table
- Asynq queue depth visualization
- Enable/disable monitoring toggle in settings

Memory safeguards:
- Max 200 unique endpoints tracked
- Hourly stats reset to prevent unbounded growth
- Max 1000 log entries in ring buffer
- Max 1000 latency samples for P95 calculation

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-12-09 10:26:40 -06:00
Trey t c17e85c14e Add comprehensive i18n localization support
- Add go-i18n package for internationalization
- Create i18n middleware to extract Accept-Language header
- Add translation files for en, es, fr, de, pt languages
- Localize all handler error messages and responses
- Add language context to all API handlers

Supported languages: English, Spanish, French, German, Portuguese

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-12-02 02:01:47 -06:00
Trey t a15e847098 Replace Gorush with direct APNs/FCM integration
- Add direct APNs client using sideshow/apns2 for iOS push
- Add direct FCM client using legacy HTTP API for Android push
- Remove Gorush dependency (no external push server needed)
- Update all services/handlers to use new push.Client
- Update config for APNS_PRODUCTION flag

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-29 12:04:31 -06:00
Trey t c7dc56e2d2 Rebrand from MyCrib to Casera
- Update Go module from mycrib-api to casera-api
- Update all import statements across 69 Go files
- Update admin panel branding (title, sidebar, login form)
- Update email templates (subjects, bodies, signatures)
- Update PDF report generation branding
- Update Docker container names and network
- Update config defaults (database name, email sender, APNS topic)
- Update README and documentation

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-28 21:10:48 -06:00
Trey t 469f21a833 Add PDF reports, file uploads, admin auth, and comprehensive tests
Features:
- PDF service for generating task reports with ReportLab-style formatting
- Storage service for file uploads (local and S3-compatible)
- Admin authentication middleware with JWT support
- Admin user model and repository

Infrastructure:
- Updated Docker configuration for admin panel builds
- Email service enhancements for task notifications
- Updated router with admin and file upload routes
- Environment configuration updates

Tests:
- Unit tests for handlers (auth, residence, task)
- Unit tests for models (user, residence, task)
- Unit tests for repositories (user, residence, task)
- Unit tests for services (residence, task)
- Integration test setup
- Test utilities for mocking database and services

Database:
- Admin user seed data
- Updated test data seeds

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-27 23:36:20 -06:00
Trey t 2b03f03604 Fix email config not reading from env vars
Viper requires SetDefault to be called for env vars to be
automatically bound. Added defaults for EMAIL_HOST_USER and
EMAIL_HOST_PASSWORD.

Also added info logging to show email config on startup for debugging.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-26 23:23:26 -06:00
Trey t fff5d8c206 Integrate GoAdmin into main API with auto-migrations
- Admin panel accessible at /admin on the same domain
- GoAdmin tables created automatically during startup migrations
- Default credentials: admin / admin

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-26 21:21:35 -06:00
Trey t eea53c09b2 Allow API to start even if database connection fails
- Don't crash on database connection failure
- Log error and continue - health checks will pass
- Database operations will fail but app stays up

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-26 20:24:03 -06:00
Trey t c319719535 Improve startup resilience and add debug logging
- Reduce database retry count and backoff time
- Make SECRET_KEY optional (use default with warning)
- Add config debug logging on startup
- Remove strict validation that prevented startup

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-26 20:19:06 -06:00
Trey t 8feb308f94 Add retry logic for database connection and graceful Redis handling
- Retry database connection 5 times with exponential backoff
- Allow app to start even if Redis is unavailable
- Improves startup reliability in container environments

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-26 20:16:47 -06:00
Trey t 1f12f3f62a Initial commit: MyCrib API in Go
Complete rewrite of Django REST API to Go with:
- Gin web framework for HTTP routing
- GORM for database operations
- GoAdmin for admin panel
- Gorush integration for push notifications
- Redis for caching and job queues

Features implemented:
- User authentication (login, register, logout, password reset)
- Residence management (CRUD, sharing, share codes)
- Task management (CRUD, kanban board, completions)
- Contractor management (CRUD, specialties)
- Document management (CRUD, warranties)
- Notifications (preferences, push notifications)
- Subscription management (tiers, limits)

Infrastructure:
- Docker Compose for local development
- Database migrations and seed data
- Admin panel for data management

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-26 20:07:16 -06:00