Phase 5: CloudKit CRUD

- 2 plans created
- 4 total tasks defined
- Ready for execution

Plan 05-01: Smart sync with change detection
- Change detection with diff reporting
- Differential sync (upload only changed records)

Plan 05-02: Verification and record management
- Sync verification (CloudKit vs local comparison)
- Individual record CRUD operations

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
---
phase: 05-cloudkit-crud
plan: 01
type: execute
domain: data-pipeline
---

<objective>
Add smart sync with change detection to cloudkit_import.py.

Purpose: Enable differential uploads that only sync new/changed records, reducing CloudKit API calls and sync time.
Output: Enhanced cloudkit_import.py with --diff, --smart-sync, and --delete-orphans flags.
</objective>

<execution_context>
~/.claude/get-shit-done/workflows/execute-phase.md
~/.claude/get-shit-done/templates/summary.md
</execution_context>

<context>
@.planning/PROJECT.md
@.planning/ROADMAP.md
@.planning/STATE.md
@.planning/phases/04-canonical-linking/04-01-SUMMARY.md

**Relevant source files:**
@Scripts/cloudkit_import.py

**Tech stack available:** Python 3, requests, cryptography, CloudKit server-to-server API
**Established patterns:** forceReplace for create/update, query() for read, delete_all() for deletion, batch operations with BATCH_SIZE=200

**Constraining decisions:**
- Phase 04-01: 5760 games canonicalized with 100% team/stadium resolution
- Existing CloudKit import uses forceReplace (creates or replaces) for all operations
- recordChangeTag must be used for conflict detection in updates
</context>

<tasks>

<task type="auto">
<name>Task 1: Add change detection with diff reporting</name>
<files>Scripts/cloudkit_import.py</files>
<action>
Add change detection capability to compare local canonical data against CloudKit records.

1. Add `query_all(record_type, verbose)` method to CloudKit class:
   - Query with pagination (use continuationMarker for >200 records)
   - Return dict mapping recordName to record data (including recordChangeTag)
   - Handle query errors gracefully

2. Add `compute_diff(local_records, cloud_records)` function:
   - Returns dict with keys: 'new', 'updated', 'unchanged', 'deleted'
   - 'new': records in local but not in cloud (by recordName)
   - 'updated': records in both where fields differ (compare field values)
   - 'unchanged': records in both with same field values
   - 'deleted': records in cloud but not in local
   - Include count for each category

3. Add `--diff` flag to argparse:
   - When set, query CloudKit and show diff report for each record type
   - Format: "Stadiums: 32 unchanged, 2 new, 1 updated, 0 deleted"
   - Do NOT perform any imports, just report

4. Field comparison for 'updated' detection:
   - Compare string/int fields directly
   - For location fields, compare lat/lng with tolerance (0.0001)
   - For reference fields, compare recordName only
   - Ignore recordChangeTag and timestamps in comparison
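
The pagination loop from step 1 could look roughly like the sketch below. It is an illustration only, not the existing CloudKit class: `fetch_page` is a hypothetical callable standing in for whatever POSTs a body to the records/query endpoint and returns the decoded JSON, and `PAGE_SIZE` is assumed to match the existing BATCH_SIZE. The sketch also assumes the response carries a `records` list plus a `continuationMarker` whenever more pages remain.

```python
from typing import Callable, Dict, Optional

PAGE_SIZE = 200  # assumption: same limit as the existing BATCH_SIZE


def query_all(fetch_page: Callable[[dict], dict], record_type: str) -> Dict[str, dict]:
    """Collect every record of one type, keyed by recordName.

    fetch_page is a hypothetical stand-in for the HTTP call: it receives
    the query body and returns the decoded JSON response.
    """
    records: Dict[str, dict] = {}
    marker: Optional[str] = None
    while True:
        body = {"query": {"recordType": record_type}, "resultsLimit": PAGE_SIZE}
        if marker:
            # Ask for the page that follows the previous response.
            body["continuationMarker"] = marker
        response = fetch_page(body)
        for record in response.get("records", []):
            # Keep the whole record so recordChangeTag stays available.
            records[record["recordName"]] = record
        marker = response.get("continuationMarker")
        if not marker:
            return records
```

Error handling is left out on purpose; in the real method a failed page would presumably be reported and the partial result discarded rather than diffed.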

Avoid: Using forceReplace for everything. The goal is to identify WHAT changed before deciding HOW to sync.
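
One way to identify what changed, following steps 2 to 4, is sketched below. It assumes each record is a dict with a `fields` mapping of `{"value": ...}` entries, that location values use `latitude`/`longitude` keys, and that reference values carry a `recordName`; `values_equal`, `fields_equal`, and `format_report` are hypothetical helper names. Because only `fields` is compared, recordChangeTag and timestamps stored at the record level are ignored.

```python
LOCATION_TOLERANCE = 0.0001


def values_equal(local, cloud) -> bool:
    """Compare two field values using the rules from step 4."""
    if isinstance(local, dict) and isinstance(cloud, dict):
        if "latitude" in local and "latitude" in cloud:
            # Location: coordinates must match within the tolerance.
            return (
                abs(local["latitude"] - cloud["latitude"]) <= LOCATION_TOLERANCE
                and abs(local["longitude"] - cloud["longitude"]) <= LOCATION_TOLERANCE
            )
        if "recordName" in local and "recordName" in cloud:
            # Reference: only the target recordName matters.
            return local["recordName"] == cloud["recordName"]
    return local == cloud


def fields_equal(local_fields: dict, cloud_fields: dict) -> bool:
    """True when both records have the same field names and equal values."""
    if local_fields.keys() != cloud_fields.keys():
        return False
    return all(
        values_equal(local_fields[key]["value"], cloud_fields[key]["value"])
        for key in local_fields
    )


def compute_diff(local_records: dict, cloud_records: dict) -> dict:
    """Sort recordNames into new / updated / unchanged / deleted."""
    diff = {"new": [], "updated": [], "unchanged": [], "deleted": []}
    for name, local in local_records.items():
        cloud = cloud_records.get(name)
        if cloud is None:
            diff["new"].append(name)
        elif fields_equal(local["fields"], cloud["fields"]):
            diff["unchanged"].append(name)
        else:
            diff["updated"].append(name)
    diff["deleted"] = [name for name in cloud_records if name not in local_records]
    return diff


def format_report(label: str, diff: dict) -> str:
    """Build one report line in the format given in step 3."""
    return (
        f"{label}: {len(diff['unchanged'])} unchanged, {len(diff['new'])} new, "
        f"{len(diff['updated'])} updated, {len(diff['deleted'])} deleted"
    )
```

Returning lists of recordNames rather than bare counts keeps the counts one `len()` away while still letting a later sync step look up the records themselves.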
</action>
<verify>
```bash
cd Scripts && python cloudkit_import.py --diff --verbose
```
Should output diff report showing counts for each record type (stadiums, teams, games, etc.)
</verify>
<done>--diff flag works and reports new/updated/unchanged/deleted counts for each record type</done>
</task>

<task type="auto">
<name>Task 2: Add differential sync with smart-sync flag</name>
<files>Scripts/cloudkit_import.py</files>
<action>
Add differential sync capability that only uploads new and changed records.

1. Add `sync_diff(ck, diff, record_type, dry_run, verbose)` function:
   - For 'new' records: use forceReplace (creates new)
   - For 'updated' records: use 'update' operationType with recordChangeTag
   - For 'deleted' records: use 'delete' operationType (optional, controlled by flag)
   - Skip 'unchanged' records entirely
   - Return counts: created, updated, deleted, skipped

2. Add CloudKit `update` operation handling in modify():
   - update operationType requires recordChangeTag field
   - Handle CONFLICT error (409) - means record was modified since query
   - On conflict: re-query that record, recompute if still needs update

3. Add `--smart-sync` flag:
   - Query CloudKit first to get current state
   - Compute diff against local data
   - Sync only new and updated records
   - Print summary: "Created N, Updated M, Skipped K unchanged"

4. Add `--delete-orphans` flag (used with --smart-sync):
   - When set, also delete records that exist in CloudKit but not in local data
   - Default: do NOT delete orphans (safe mode)
   - When not set and orphans exist, print warning: "Would delete N orphan records (use --delete-orphans to confirm)"

5. Menu integration:
   - Add option 12: "Smart sync (diff-based)"
   - Add option 13: "Smart sync + delete orphans"
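
The mapping from diff categories to operations in step 1 might be sketched as follows. This is a minimal illustration under stated assumptions: the diff holds lists of recordNames under 'new', 'updated', and 'deleted'; `local_records` and `cloud_records` are dicts keyed by recordName; and `build_operations` and `batches` are hypothetical helper names, not existing functions in cloudkit_import.py.

```python
BATCH_SIZE = 200  # matches the established batch size


def build_operations(diff, local_records, cloud_records, delete_orphans=False):
    """Turn a diff into a list of modify operations; unchanged records are skipped."""
    operations = []
    for name in diff["new"]:
        # New records: forceReplace creates them.
        operations.append({"operationType": "forceReplace", "record": local_records[name]})
    for name in diff["updated"]:
        # Updated records: copy the local record and attach the tag seen in the query.
        record = dict(local_records[name])
        record["recordChangeTag"] = cloud_records[name]["recordChangeTag"]
        operations.append({"operationType": "update", "record": record})
    if delete_orphans:
        # Orphans are only deleted when the caller opts in.
        for name in diff["deleted"]:
            operations.append({
                "operationType": "delete",
                "record": {
                    "recordName": name,
                    "recordChangeTag": cloud_records[name]["recordChangeTag"],
                },
            })
    return operations


def batches(operations, size=BATCH_SIZE):
    """Yield the operations in chunks of at most `size`."""
    for start in range(0, len(operations), size):
        yield operations[start:start + size]
```

With `delete_orphans` left at its default, the 'deleted' list is never touched, so the caller can still count it for the warning message.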

Avoid: Deleting records without an explicit flag, and using forceReplace on unchanged records.
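
The conflict handling from step 2 could be approached along these lines. The sketch assumes that per-record failures come back in the modify response as entries carrying `recordName` and `serverErrorCode`; `refetch_record` (returns the current cloud record or None) and `needs_update` (compares a local record with a cloud record) are hypothetical callables supplied by the caller.

```python
def retry_operations(results, local_records, refetch_record, needs_update):
    """Build fresh 'update' operations for records that failed with CONFLICT."""
    retries = []
    for result in results:
        if result.get("serverErrorCode") != "CONFLICT":
            continue  # succeeded, or failed for a reason this sketch does not handle
        name = result["recordName"]
        fresh = refetch_record(name)
        if fresh is None or not needs_update(local_records[name], fresh):
            # Record vanished or already matches local data: nothing to retry.
            continue
        record = dict(local_records[name])
        # Use the tag from the re-queried record so the retry is not stale.
        record["recordChangeTag"] = fresh["recordChangeTag"]
        retries.append({"operationType": "update", "record": record})
    return retries
```

A record that disappeared between the query and the update is simply left for the next diff run here; whether it should instead be recreated is a decision for the real implementation.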
</action>
<verify>
```bash
cd Scripts && python cloudkit_import.py --smart-sync --dry-run --verbose
```
Should show what would be created/updated/skipped without making changes.

```bash
cd Scripts && python cloudkit_import.py --smart-sync --verbose
```
Should perform differential sync, reporting created/updated/skipped counts.
</verify>
<done>--smart-sync flag performs differential upload, skipping unchanged records. Created/updated counts are accurate.</done>
</task>

</tasks>

<verification>
Before declaring plan complete:
- [ ] `python cloudkit_import.py --diff` reports accurate counts for all record types
- [ ] `python cloudkit_import.py --smart-sync --dry-run` shows correct preview
- [ ] `python cloudkit_import.py --smart-sync` uploads only changed records
- [ ] Update with recordChangeTag handles conflicts gracefully
- [ ] Interactive menu has new options 12 and 13
</verification>

<success_criteria>

- Change detection accurately identifies new/updated/unchanged/deleted records
- Smart sync reduces CloudKit API calls by skipping unchanged records
- Conflict handling prevents data loss on concurrent updates
- No regressions to existing import functionality
</success_criteria>

<output>
After completion, create `.planning/phases/05-cloudkit-crud/05-01-SUMMARY.md`
</output>