Shipping commit 88fb175 changed the trace shape and added a new caching
layer with required invalidation rules. Updating the operator-facing
docs so they match the running system.
ch08 (database):
- DB_HOST is the -pooler Neon endpoint, not direct compute
- Connection pool: MaxIdleConns 20 (was 10), MaxLifetime 30m (was 10m),
MaxIdleTime 0 (never close idle)
- New \"Pool warm-up at boot\" section documenting the 20-parallel-ping
warm-up in database.Connect
- Replaced the \"Neon regions\" section: explicit RTT numbers, the
optimization stack that minimizes round-trips, when this still matters
ch15 (observability):
- Replaced the 2,473ms/5-span sample trace with the new 229ms/2-span
post-optimization trace; kept the old one underneath for diff context
ch16 (failure modes):
- Added: stale residence-IDs cache (data freshness bug + recovery)
- Added: Redis at maxmemory limit (verify allkeys-lru policy)
- Added: Neon pooler unreachable but direct endpoint up — emergency
switchover procedure
ch17 (runbook):
- §23 Invalidate residence-IDs cache for a user (DEL key + grep for
missing invalidation in new code)
- §24 Verify DB pool warm-up is working (log pattern + impact test)
- §25 Switch DB host between pooler and direct endpoints
observability-plan.md status flipped from \"plan only\" to shipped
with the latency-cut summary.
README links to the new ch08 latency section.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Every authed API endpoint now produces a nested flame graph
(HTTP → auth → service → SQL). Replaces the in-flight section with the
final span-source matrix and a sample 5-span /api/tasks/ trace. Notes
the visible Hetzner→Neon transatlantic RTT as the perf bottleneck the
flame graph surfaced.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Step 1 — OTel SDK: cmd/api and cmd/worker initialize a tracer provider
that exports OTLP/HTTP to obs.88oakapps.com (Jaeger all-in-one). Sampling
is AlwaysSample in dev (DEBUG=true) and TraceIDRatioBased(0.1) in prod,
overridable via OTEL_TRACES_SAMPLER_ARG. Service names are honeydue-api
and honeydue-worker. otelecho.Middleware opens a span per HTTP request.
Step 2 — Manual spans: storage_service.Upload now takes ctx and emits
storage.upload + b2.PutObject spans (size_bytes, key, mime_type, bucket,
result attrs). APNs Send/SendWithCategory and FCM sendOne emit per-token
spans with topic, status_code, reason. Asynq middleware emits
asynq.handle:<task_type> per job with retry/payload attrs and records
asynq_job_duration_seconds.
Step 3 — Database: otelgorm plugin registered in database.Connect, so
any SQL emitted via db.WithContext(ctx) attaches to the request span.
Every repository now exposes WithContext(ctx) *XRepository as the
migration helper. TaskService.ListTasks and GetTasksByResidence are
migrated end-to-end (ctx threaded through handler → service → repo);
remaining services adopt the same pattern incrementally — pre-migration
methods still emit untraced SQL via the unchanged db field.
OBS_TRACES_URL and OBS_INGEST_TOKEN flow from deploy/prod.env →
honeydue-secrets → api+worker Deployments via secretKeyRef (optional).
02-setup-secrets.sh sources them from prod.env on next run; manifests
mark both env vars optional so the deployment rolls without traces if
the secret is absent.
ch15 observability doc now lists what produces spans today vs the
remaining migration work, with the explicit per-method pattern.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
ch15 is now an account of what's actually running, not a roadmap for
what we'd add: VictoriaMetrics + Jaeger + Grafana on 88oakappsUpdate
fronted by Cloudflare and bearer-gated nginx, vmagent in-cluster, the
internal/prom histogram set, the rollout's NetworkPolicy footprint,
the obs.88oakapps.com endpoint shape, the ~$0/700MB resource budget,
and a token-rotation runbook. The "what we still don't have" section
keeps log aggregation, alerting, and full distributed tracing as the
honest gap list.
Other touched docs:
- 00-overview: \"deliberately absent\" no longer claims we have no
metrics — calls out the cross-cluster shape instead.
- 14-deployment-process: TL;DR now points at deploy-k3s/scripts/03-deploy.sh
(full build + push + apply + obs vmagent), with the manual
kubectl-set-image flow kept as the single-service path. Notes the
IfNotPresent gotcha that bit us during the rollout.
- 16-failure-modes: adds vmagent-can't-reach-obs and Grafana-no-data.
- 18-cost: $0 line item for the obs stack on 88oakappsUpdate, with the
CX32 migration trigger.
- 17/18 README + appendix b: link the new ch15, add the obs cheat
sheet block.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds internal/prom package with histograms for HTTP, GORM, B2, APNs, and
FCM, wired into the Echo router (HTTPMiddleware + /metrics) and GORM via
statement-level callbacks (no ctx plumbing needed). Storage and push
clients call ObserveB2Upload / ObserveAPNsSend / ObserveFCMSend at the
network round-trip points.
Existing internal/monitoring metrics move to /metrics/legacy so the
canonical /metrics emits proper histogram buckets for p50/p95/p99 rollups.
deploy-k3s/manifests/observability/vmagent.yaml deploys a single-replica
vmagent in the honeydue namespace that scrapes api Pods on :8000/metrics
every 15s and remote-writes to https://obs.88oakapps.com/api/v1/write
with a bearer token (substituted at deploy time from OBS_INGEST_TOKEN
in deploy/prod.env). NetworkPolicies allow vmagent egress to api Pods
and to the public obs endpoint over :443; the obs side runs
VictoriaMetrics + Jaeger + Grafana on 88oakappsUpdate.
docs/observability-plan.md captures the full plan including resource
budget, instrumentation table, 4-step rollout, and migration triggers.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The iOS app was renamed (MyCrib → Casera → honeyDue) and the bundle ID
was updated to com.myhoneydue.honeyDue (release) / .dev (debug), but
APPLE_CLIENT_ID and APNS_TOPIC across env templates and k3s configs
still pointed at the old com.tt.honeyDue.honeyDueDev value. This made
verifyAudience reject every Apple identity token (aud claim mismatch).
Updated:
- deploy/prod.env.example: bundle ID + comment that empty client_id
rejects all tokens with DEBUG=false
- .env.example: add Sign in with Apple block (was missing entirely)
- deploy-k3s{,-dev}/config.yaml.example: apple_auth.client_id default
- deploy-k3s-dev/scripts/00-init.sh: same
- docker-compose.dev.yml: APNS_TOPIC fallback
- docs/deployment/10-secrets-config.md: doc reference
The live deploy/prod.env and local .env are .gitignored — they were
edited in place and need to ship via deploy_prod.sh to take effect.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Clients that send users through a multi-task onboarding step no longer
loop N POST /api/tasks/ calls and no longer create "orphan" tasks with
no reference to the TaskTemplate they came from.
Task model
- New task_template_id column + GORM FK (migration 000016)
- CreateTaskRequest.template_id, TaskResponse.template_id
- task_service.CreateTask persists the backlink
Bulk endpoint
- POST /api/tasks/bulk/ — 1-50 tasks in a single transaction,
returns every created row + TotalSummary. Single residence access
check, per-entry residence_id is overridden with batch value
- task_handler.BulkCreateTasks + task_service.BulkCreateTasks using
db.Transaction; task_repo.CreateTx + FindByIDTx helpers
Climate-region scoring
- templateConditions gains ClimateRegionID; suggestion_service scores
residence.PostalCode -> ZipToState -> GetClimateRegionIDByState against
the template's conditions JSON (no penalty on mismatch / unknown ZIP)
- regionMatchBonus 0.35, totalProfileFields 14 -> 15
- Standalone GET /api/tasks/templates/by-region/ removed; legacy
task_tasktemplate_regions many-to-many dropped (migration 000017).
Region affinity now lives entirely in the template's conditions JSON
Tests
- +11 cases across task_service_test, task_handler_test, suggestion_
service_test: template_id persistence, bulk rollback + cap + auth,
region match / mismatch / no-ZIP / unknown-ZIP / stacks-with-others
Docs
- docs/openapi.yaml: /tasks/bulk/ + BulkCreateTasks schemas, template_id
on TaskResponse + CreateTaskRequest, /templates/by-region/ removed
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Adds a new endpoint GET /api/tasks/templates/by-region/?zip= that resolves
ZIP codes to IECC climate regions and returns relevant home maintenance
task templates. Includes climate region model, region lookup service with
tests, seed data for all 8 climate zones with 50+ templates, and OpenAPI spec.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Stripe integration: add StripeService with checkout sessions, customer
portal, and webhook handling for subscription lifecycle events.
- Free trials: auto-start configurable trial on first subscription check,
with admin-controllable duration and enable/disable toggle.
- Cross-platform guard: prevent duplicate subscriptions across iOS, Android,
and Stripe by checking existing platform before allowing purchase.
- Subscription model: add Stripe fields (customer_id, subscription_id,
price_id), trial fields (trial_start, trial_end, trial_used), and
SubscriptionSource/IsTrialActive helpers.
- API: add trial and source fields to status response, update OpenAPI spec.
- Clean up stale migration and audit docs.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The TimeoutMiddleware wraps the response writer in *http.timeoutWriter which
doesn't implement http.Flusher. When the admin reverse proxy or WebSocket
upgrader tries to flush, it panics and crashes the container (502 Bad Gateway).
Skip timeout for /admin, /_next, and /ws routes.
Also fix the Dockerfile HEALTHCHECK to detect the worker process — the worker
has no HTTP server so the curl-based check always failed, marking it unhealthy.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Add Apple App Store Server API integration for receipt/transaction validation
- Add Google Play Developer API integration for purchase token validation
- Add webhook endpoints for server-to-server subscription notifications
- POST /api/subscription/webhook/apple/ (App Store Server Notifications v2)
- POST /api/subscription/webhook/google/ (Real-time Developer Notifications)
- Support both StoreKit 1 (receipt_data) and StoreKit 2 (transaction_id)
- Add repository methods to find users by transaction ID or purchase token
- Add configuration for IAP credentials (APPLE_IAP_*, GOOGLE_IAP_*)
- Add setup documentation for configuring webhooks
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add competitor analysis with feature comparison and pricing research
- Add social media kit with Twitter, Instagram, LinkedIn content
- Add press release for app launch announcements
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
This refactor eliminates duplicate task logic across the codebase by
creating a centralized task package with three layers:
- predicates/: Pure Go functions defining task state logic (IsCompleted,
IsOverdue, IsDueSoon, IsUpcoming, IsActive, IsInProgress, EffectiveDate)
- scopes/: GORM scope functions mirroring predicates for database queries
- categorization/: Chain of Responsibility pattern for kanban column assignment
Key fixes:
- Fixed PostgreSQL DATE vs TIMESTAMP comparison bug in scopes (added
explicit ::timestamp casts) that caused summary/kanban count mismatches
- Fixed models/task.go IsOverdue() and IsDueSoon() to use EffectiveDate
(NextDueDate ?? DueDate) instead of only DueDate
- Removed duplicate isTaskCompleted() helpers from task_repo.go and
task_button_types.go
Files refactored to use consolidated logic:
- task_repo.go: Uses scopes for statistics, predicates for filtering
- task_button_types.go: Uses predicates instead of inline logic
- responses/task.go: Delegates to categorization package
- dashboard_handler.go: Uses scopes for task statistics
- residence_service.go: Uses predicates for report generation
- worker/jobs/handler.go: Documented SQL with predicate references
Added comprehensive tests:
- predicates_test.go: Unit tests for all predicate functions
- scopes_test.go: Integration tests verifying scopes match predicates
- consistency_test.go: Three-layer consistency tests ensuring predicates,
scopes, and categorization all return identical results
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
- Remove Gorush push server dependency (now using direct APNs/FCM)
- Update docker-compose.yml to remove gorush service
- Update config.go to remove GORUSH_URL
- Fix worker queries:
- Use auth_user instead of user_user table
- Use completed_at instead of completion_date column
- Add NotificationService to worker handler for actionable notifications
- Add docs/PUSH_NOTIFICATIONS.md with architecture documentation
- Update README.md, DOKKU_SETUP.md, and dev.sh
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Features:
- Add task action buttons to push notifications (complete, view, cancel, etc.)
- Add button types logic for different task states (overdue, in_progress, etc.)
- Implement Chain of Responsibility pattern for task categorization
- Add comprehensive kanban categorization documentation
Fixes:
- Reset recurring task status to Pending after completion so tasks appear
in correct kanban column (was staying in "In Progress")
- Fix PostgreSQL EXTRACT function error in overdue notifications query
- Update seed data to properly set next_due_date for recurring tasks
Admin:
- Add tasks list to residence detail page
- Fix task edit page to properly handle all fields
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
- Add TaskTemplate model with category and frequency support
- Add task template repository with CRUD and search operations
- Add task template service layer
- Add public API endpoints for templates (no auth required):
- GET /api/tasks/templates/ - list all templates
- GET /api/tasks/templates/grouped/ - templates grouped by category
- GET /api/tasks/templates/search/?q= - search templates
- GET /api/tasks/templates/by-category/:id/ - templates by category
- GET /api/tasks/templates/:id/ - single template
- Add admin panel for task template management (CRUD)
- Add admin API endpoints for templates
- Add seed file with predefined task templates
- Add i18n translations for template errors
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
- Integrate landing page into Go app (served at root /)
- Add STATIC_DIR config for static file serving
- Redesign all email templates with modern dark theme styling
- Add app icon to email headers
- Return updated task with kanban_column in completion response
- Update task DTO to include kanban column for client-side state updates
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
- Add go-i18n package for internationalization
- Create i18n middleware to extract Accept-Language header
- Add translation files for en, es, fr, de, pt languages
- Localize all handler error messages and responses
- Add language context to all API handlers
Supported languages: English, Spanish, French, German, Portuguese
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
- Add IsFree boolean field to UserSubscription model
- When IsFree is true, user sees limitations_enabled=false regardless of global setting
- CheckLimit() bypasses all limit checks for IsFree users
- Add admin endpoint GET /api/admin/subscriptions/user/:user_id
- Add IsFree toggle to admin user detail page under Subscription card
- Add database migration 004_subscription_is_free
- Add integration tests for IsFree functionality
- Add task kanban categorization documentation
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
- Add TASK_REMINDER_HOUR, TASK_REMINDER_MINUTE, OVERDUE_REMINDER_HOUR, DAILY_DIGEST_HOUR config
- Document schedule times in UTC with explanation table
- Add worker verification steps to check logs
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
The KMM mobile app calls /api/notifications/devices/register/ but
the Go API had /api/notifications/devices/. Added alias to support both.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
- Next.js admin now runs on internal port 3001
- Go API uses Dokku's PORT environment variable (5000)
- Updated admin proxy default to localhost:3001
- Added /_next/* static asset proxy route
This fixes the "address already in use" error when deploying to Dokku
where both services were trying to bind to port 3000.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
- Moved Gorush configuration from optional section to main setup flow (step 7)
- Removed duplicate/broken Push Notifications (Optional) section
- Fixed section numbering in environment variables
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Comprehensive step-by-step instructions for deploying to a remote server:
- Server setup and prerequisites
- Dokku installation and plugins
- PostgreSQL and Redis configuration
- Environment variables reference
- SSL setup with Let's Encrypt
- Worker process scaling
- Push notifications setup
- Maintenance commands and troubleshooting
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>