Sportstime/.planning/phases/05-cloudkit-crud/05-01-PLAN.md
Trey t e5c6d0fec7 docs(05): create CloudKit CRUD phase plans
Phase 5: CloudKit CRUD
- 2 plans created
- 4 total tasks defined
- Ready for execution

Plan 05-01: Smart sync with change detection
- Change detection with diff reporting
- Differential sync (upload only changed records)

Plan 05-02: Verification and record management
- Sync verification (CloudKit vs local comparison)
- Individual record CRUD operations

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-10 10:02:06 -06:00


phase: 05-cloudkit-crud
plan: 01
type: execute
domain: data-pipeline
Add smart sync with change detection to cloudkit_import.py.

Purpose: Enable differential uploads that only sync new/changed records, reducing CloudKit API calls and sync time.

Output: Enhanced cloudkit_import.py with --diff, --smart-sync, and --changes-only flags.

<execution_context> ~/.claude/get-shit-done/workflows/execute-phase.md ~/.claude/get-shit-done/templates/summary.md </execution_context>

@.planning/PROJECT.md @.planning/ROADMAP.md @.planning/STATE.md @.planning/phases/04-canonical-linking/04-01-SUMMARY.md

Relevant source files: @Scripts/cloudkit_import.py

Tech stack available: Python 3, requests, cryptography, CloudKit server-to-server API

Established patterns: forceReplace for create/update, query() for read, delete_all() for deletion, batch operations with BATCH_SIZE=200

Constraining decisions:

  • Phase 04-01: 5760 games canonicalized with 100% team/stadium resolution
  • Existing CloudKit import uses forceReplace (creates or replaces) for all operations
  • recordChangeTag must be used for conflict detection in updates
Task 1: Add change detection with diff reporting

Files: Scripts/cloudkit_import.py

Add change detection capability to compare local canonical data against CloudKit records.
  1. Add query_all(record_type, verbose) method to CloudKit class:

    • Query with pagination (use continuationMarker for >200 records)
    • Return dict mapping recordName to record data (including recordChangeTag)
    • Handle query errors gracefully
  2. Add compute_diff(local_records, cloud_records) function:

    • Returns dict with keys: 'new', 'updated', 'unchanged', 'deleted'
    • 'new': records in local but not in cloud (by recordName)
    • 'updated': records in both where fields differ (compare field values)
    • 'unchanged': records in both with same field values
    • 'deleted': records in cloud but not in local
    • Include count for each category
  3. Add --diff flag to argparse:

    • When set, query CloudKit and show diff report for each record type
    • Format: "Stadiums: 32 unchanged, 2 new, 1 updated, 0 deleted"
    • Do NOT perform any imports, just report
  4. Field comparison for 'updated' detection:

    • Compare string/int fields directly
    • For location fields, compare lat/lng with tolerance (0.0001)
    • For reference fields, compare recordName only
    • Ignore recordChangeTag and timestamps in comparison
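Steps 2 and 4 above could be sketched roughly as follows. This is a minimal illustration, not the script's actual code: it assumes each record is a flat dict of field name to plain value, that location fields are dicts with `latitude`/`longitude` keys, that reference fields are dicts with a `recordName` key, and that timestamps are stored under the hypothetical names `created` and `modified`. The extra `counts` key is one possible way to "include count for each category".

```python
from typing import Any, Dict

LOCATION_TOLERANCE = 0.0001  # lat/lng tolerance from step 4
# Assumed names for the tag/timestamp fields that the comparison ignores.
IGNORED_FIELDS = {"recordChangeTag", "created", "modified"}


def values_equal(local: Any, cloud: Any) -> bool:
    """Compare a single field value using the rules from step 4."""
    if isinstance(local, dict) and isinstance(cloud, dict):
        # Location fields: compare coordinates within a tolerance.
        if "latitude" in local and "longitude" in local:
            return all(
                axis in cloud
                and abs(local[axis] - cloud[axis]) <= LOCATION_TOLERANCE
                for axis in ("latitude", "longitude")
            )
        # Reference fields: compare the target recordName only.
        if "recordName" in local:
            return local["recordName"] == cloud.get("recordName")
    # Strings, ints, and anything else: direct comparison.
    return local == cloud


def fields_equal(local: Dict[str, Any], cloud: Dict[str, Any]) -> bool:
    """True when every non-ignored field matches on both sides."""
    keys = (set(local) | set(cloud)) - IGNORED_FIELDS
    return all(
        key in local and key in cloud and values_equal(local[key], cloud[key])
        for key in keys
    )


def compute_diff(local_records: Dict[str, dict], cloud_records: Dict[str, dict]) -> dict:
    """Classify recordNames into new / updated / unchanged / deleted."""
    diff = {"new": [], "updated": [], "unchanged": [], "deleted": []}
    for name, local in local_records.items():
        if name not in cloud_records:
            diff["new"].append(name)
        elif fields_equal(local, cloud_records[name]):
            diff["unchanged"].append(name)
        else:
            diff["updated"].append(name)
    diff["deleted"] = [name for name in cloud_records if name not in local_records]
    # Counts are computed before the "counts" key itself is added.
    diff["counts"] = {category: len(names) for category, names in diff.items()}
    return diff
```

Keeping the per-field rules in `values_equal` means the tolerance and reference handling can be adjusted without touching the classification loop.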

Avoid: Using forceReplace for everything. The goal is to identify WHAT changed before deciding HOW to sync.
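The paginated read from step 1 might look like the sketch below. To stay self-contained it takes a hypothetical `post_query` callable (standing in for whatever method in cloudkit_import.py sends a `records/query` request) instead of touching the network. The request and response keys (`resultsLimit`, `continuationMarker`, `records`) follow the CloudKit Web Services query format; everything else is an assumption.

```python
from typing import Callable, Dict, Optional

PAGE_SIZE = 200  # same value as BATCH_SIZE in the established patterns


def query_all(
    post_query: Callable[[dict], dict],
    record_type: str,
    verbose: bool = False,
) -> Dict[str, dict]:
    """Fetch every record of one type, following continuationMarker."""
    records: Dict[str, dict] = {}
    marker: Optional[str] = None
    while True:
        body = {"query": {"recordType": record_type}, "resultsLimit": PAGE_SIZE}
        if marker:
            body["continuationMarker"] = marker
        try:
            response = post_query(body)
        except Exception as exc:
            # Fail loudly with context rather than return a partial result.
            raise RuntimeError(
                f"Query for {record_type} failed after {len(records)} records"
            ) from exc
        for record in response.get("records", []):
            # Keep the whole record so recordChangeTag stays available.
            records[record["recordName"]] = record
        marker = response.get("continuationMarker")
        if verbose:
            print(f"{record_type}: fetched {len(records)} records so far")
        if not marker:
            return records
```

Raising on a failed page is a deliberate choice in this sketch: a silently truncated result would make records that exist in CloudKit look "new" in the diff.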

cd Scripts && python cloudkit_import.py --diff --verbose

Should output a diff report showing counts for each record type (stadiums, teams, games, etc.).

Done: --diff flag works and reports new/updated/unchanged/deleted counts for each record type.
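The report line from step 3 could be produced by a small formatter like this one. It assumes the per-category counts arrive as a dict keyed by 'new', 'updated', 'unchanged', and 'deleted'; the function name is hypothetical.

```python
from typing import Dict


def format_diff_line(label: str, counts: Dict[str, int]) -> str:
    """Render one record type in the format given in the plan."""
    return (
        f"{label}: {counts['unchanged']} unchanged, {counts['new']} new, "
        f"{counts['updated']} updated, {counts['deleted']} deleted"
    )


def format_diff_report(counts_by_type: Dict[str, Dict[str, int]]) -> str:
    """Join one line per record type, in insertion order."""
    return "\n".join(
        format_diff_line(label, counts) for label, counts in counts_by_type.items()
    )
```

For the counts in the plan's example, `format_diff_line("Stadiums", {"unchanged": 32, "new": 2, "updated": 1, "deleted": 0})` returns `Stadiums: 32 unchanged, 2 new, 1 updated, 0 deleted`.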

Task 2: Add differential sync with smart-sync flag

Files: Scripts/cloudkit_import.py

Add differential sync capability that only uploads new and changed records.
  1. Add sync_diff(ck, diff, record_type, dry_run, verbose) function:

    • For 'new' records: use forceReplace (creates new)
    • For 'updated' records: use 'update' operationType with recordChangeTag
    • For 'deleted' records: use 'delete' operationType (optional, controlled by flag)
    • Skip 'unchanged' records entirely
    • Return counts: created, updated, deleted, skipped
  2. Add CloudKit update operation handling in modify():

    • update operationType requires recordChangeTag field
    • Handle CONFLICT error (409), which means the record was modified since the query
    • On conflict: re-query that record and recompute whether it still needs an update
  3. Add --smart-sync flag:

    • Query CloudKit first to get current state
    • Compute diff against local data
    • Sync only new and updated records
    • Print summary: "Created N, Updated M, Skipped K unchanged"
  4. Add --delete-orphans flag (used with --smart-sync):

    • When set, also delete records in CloudKit but not in local
    • Default: do NOT delete orphans (safe mode)
    • Print warning: "Would delete N orphan records (use --delete-orphans to confirm)"
  5. Menu integration:

    • Add option 12: "Smart sync (diff-based)"
    • Add option 13: "Smart sync + delete orphans"
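One way to structure step 1 is to separate building the operations from sending them, which also makes --dry-run straightforward. The sketch below assumes the diff holds lists of recordNames, that local records are already in CloudKit field format, and that cloud records carry `recordChangeTag` at the top level; `build_operations` and `batches` are hypothetical names.

```python
from typing import Dict, Iterator, List, Tuple

BATCH_SIZE = 200  # batch size from the established patterns


def build_operations(
    diff: Dict[str, List[str]],
    local_records: Dict[str, dict],
    cloud_records: Dict[str, dict],
    record_type: str,
    delete_orphans: bool = False,
) -> Tuple[List[dict], Dict[str, int]]:
    """Turn a diff into modify operations plus the counts to report."""
    ops: List[dict] = []
    for name in diff["new"]:
        # New records: forceReplace creates the record.
        ops.append({
            "operationType": "forceReplace",
            "record": {"recordName": name, "recordType": record_type,
                       "fields": local_records[name]},
        })
    for name in diff["updated"]:
        # Updated records: send the tag seen at query time for conflict detection.
        ops.append({
            "operationType": "update",
            "record": {"recordName": name, "recordType": record_type,
                       "recordChangeTag": cloud_records[name]["recordChangeTag"],
                       "fields": local_records[name]},
        })
    if delete_orphans:
        for name in diff["deleted"]:
            ops.append({
                "operationType": "delete",
                "record": {"recordName": name,
                           "recordChangeTag": cloud_records[name]["recordChangeTag"]},
            })
    counts = {
        "created": len(diff["new"]),
        "updated": len(diff["updated"]),
        "deleted": len(diff["deleted"]) if delete_orphans else 0,
        "skipped": len(diff["unchanged"]),  # unchanged records produce no operation
    }
    return ops, counts


def batches(ops: List[dict], size: int = BATCH_SIZE) -> Iterator[List[dict]]:
    """Yield operations in chunks no larger than the batch size."""
    for start in range(0, len(ops), size):
        yield ops[start:start + size]
```

With this split, a dry run can print `counts` and stop, while a real run passes each chunk from `batches(ops)` to the existing modify call.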

Avoid: Deleting records without explicit flag. forceReplace on unchanged records.
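The conflict path from step 2 could be handled along these lines. This sketch assumes conflicts show up as per-record entries in the modify response carrying `serverErrorCode` set to `CONFLICT`; if the script instead sees a bare HTTP 409, the same retry logic would apply once the affected recordNames are known. `refetch` and `still_differs` are hypothetical callables standing in for a single-record lookup and the field comparison.

```python
from typing import Callable, Dict, List, Tuple


def split_conflicts(response: dict) -> Tuple[List[str], List[str]]:
    """Separate saved recordNames from those rejected with CONFLICT."""
    saved: List[str] = []
    conflicts: List[str] = []
    for item in response.get("records", []):
        if item.get("serverErrorCode") == "CONFLICT":
            conflicts.append(item["recordName"])
        elif "serverErrorCode" not in item:
            saved.append(item["recordName"])
    return saved, conflicts


def build_retry_operations(
    conflicts: List[str],
    local_records: Dict[str, dict],
    record_type: str,
    refetch: Callable[[str], dict],
    still_differs: Callable[[dict, dict], bool],
) -> List[dict]:
    """Re-query each conflicted record and retry only if it still differs."""
    ops: List[dict] = []
    for name in conflicts:
        fresh = refetch(name)  # latest server copy, including the new tag
        if not still_differs(local_records[name], fresh.get("fields", {})):
            continue  # the concurrent write already matches local data
        ops.append({
            "operationType": "update",
            "record": {"recordName": name, "recordType": record_type,
                       "recordChangeTag": fresh["recordChangeTag"],
                       "fields": local_records[name]},
        })
    return ops
```

Retrying with the freshly fetched tag, rather than falling back to forceReplace, keeps the conflict check in place for the second attempt as well.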

cd Scripts && python cloudkit_import.py --smart-sync --dry-run --verbose

Should show what would be created/updated/skipped without making changes.

cd Scripts && python cloudkit_import.py --smart-sync --verbose

Should perform differential sync, reporting created/updated/skipped counts.

Done: --smart-sync flag performs differential upload, skipping unchanged records. Created/updated counts are accurate.
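The flag wiring and summary output for Task 2 might be sketched as below. The flag names and message formats come from the plan; the helper names and the standalone parser are assumptions, since the real script would extend its existing argparse setup.

```python
import argparse
from typing import Dict, List


def build_parser() -> argparse.ArgumentParser:
    """Standalone parser covering only the flags discussed in this task."""
    parser = argparse.ArgumentParser(description="CloudKit smart sync (sketch)")
    parser.add_argument("--smart-sync", action="store_true",
                        help="sync only new and updated records")
    parser.add_argument("--delete-orphans", action="store_true",
                        help="with --smart-sync, also delete cloud-only records")
    parser.add_argument("--dry-run", action="store_true",
                        help="show what would change without uploading")
    parser.add_argument("--verbose", action="store_true")
    return parser


def summary_lines(counts: Dict[str, int], orphan_count: int,
                  delete_orphans: bool) -> List[str]:
    """Build the summary, plus the safe-mode warning when orphans exist."""
    lines = [
        f"Created {counts['created']}, Updated {counts['updated']}, "
        f"Skipped {counts['skipped']} unchanged"
    ]
    if orphan_count and not delete_orphans:
        lines.append(
            f"Would delete {orphan_count} orphan records "
            "(use --delete-orphans to confirm)"
        )
    return lines
```

Because `--delete-orphans` defaults to off, a plain `--smart-sync` run only ever prints the warning and never issues delete operations.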

Before declaring plan complete:
- [ ] `python cloudkit_import.py --diff` reports accurate counts for all record types
- [ ] `python cloudkit_import.py --smart-sync --dry-run` shows correct preview
- [ ] `python cloudkit_import.py --smart-sync` uploads only changed records
- [ ] Update with recordChangeTag handles conflicts gracefully
- [ ] Interactive menu has new options 12 and 13

<success_criteria>

  • Change detection accurately identifies new/updated/unchanged/deleted records
  • Smart sync reduces CloudKit API calls by skipping unchanged records
  • Conflict handling prevents data loss on concurrent updates
  • No regressions to existing import functionality

</success_criteria>
After completion, create `.planning/phases/05-cloudkit-crud/05-01-SUMMARY.md`