Design covers three improvements: - Paginated CloudKit fetches to handle >400 records - Graceful cancellation with per-entity progress saving - NWPathMonitor for auto-sync on network restoration Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
305 lines
11 KiB
Markdown
305 lines
11 KiB
Markdown
# Sync Reliability Design
|
|
|
|
**Date:** 2026-01-13
|
|
**Status:** Approved
|
|
**Author:** Claude Code + User collaboration
|
|
|
|
## Summary
|
|
|
|
Improve CloudKit sync reliability with three enhancements:
|
|
1. **Pagination** - Fetch all records, not just first ~400
|
|
2. **Cancellation** - Graceful background task expiration with progress saving
|
|
3. **Network monitoring** - Auto-sync when connection restored
|
|
|
|
## Problem Statement
|
|
|
|
The current sync system has a critical gap: CloudKit queries return max 400 records per request, but the code discards the pagination cursor. With ~5,000 games in the dataset, most records may never sync from CloudKit.
|
|
|
|
Additionally, background tasks can be terminated by iOS with no graceful handling, and there's no automatic sync when network connectivity is restored.
|
|
|
|
## Architecture
|
|
|
|
```
|
|
┌─────────────────────────────────────────────────────────────────┐
|
|
│ Sync Reliability System │
|
|
├─────────────────────────────────────────────────────────────────┤
|
|
│ │
|
|
│ ┌──────────────────┐ ┌──────────────────────────────────┐ │
|
|
│ │ NetworkMonitor │────►│ BackgroundSyncManager │ │
|
|
│ │ (NWPathMonitor) │ │ - Cancellation token support │ │
|
|
│ │ - Debounced │ │ - Progress persistence │ │
|
|
│ └──────────────────┘ └──────────────────────────────────┘ │
|
|
│ │ │
|
|
│ ▼ │
|
|
│ ┌──────────────────────────────────┐ │
|
|
│ │ CloudKitService │ │
|
|
│ │ - Paginated fetches │ │
|
|
│ │ - CKQueryOperation + cursors │ │
|
|
│ │ - Cancellation-aware │ │
|
|
│ └──────────────────────────────────┘ │
|
|
│ │ │
|
|
│ ▼ │
|
|
│ ┌──────────────────────────────────┐ │
|
|
│ │ SyncState (enhanced) │ │
|
|
│ │ - Per-entity lastSync dates │ │
|
|
│ │ - Partial sync progress │ │
|
|
│ └──────────────────────────────────┘ │
|
|
└─────────────────────────────────────────────────────────────────┘
|
|
```
|
|
|
|
## Design Details
|
|
|
|
### 1. Paginated CloudKit Fetches
|
|
|
|
**New Protocol:**
|
|
|
|
```swift
|
|
/// Token checked between pages to support graceful cancellation
|
|
protocol SyncCancellationToken: Sendable {
|
|
var isCancelled: Bool { get }
|
|
}
|
|
```
|
|
|
|
**Paginated Fetch Pattern:**
|
|
|
|
Each fetch method (`fetchGamesForSync`, `fetchTeamsForSync`, etc.) refactored to:
|
|
|
|
```swift
|
|
func fetchGamesForSync(
|
|
since lastSync: Date?,
|
|
cancellationToken: SyncCancellationToken? = nil
|
|
) async throws -> [SyncGame] {
|
|
var allResults: [SyncGame] = []
|
|
var cursor: CKQueryOperation.Cursor? = nil
|
|
|
|
repeat {
|
|
// Check cancellation before each page
|
|
if cancellationToken?.isCancelled == true {
|
|
break // Return what we have so far
|
|
}
|
|
|
|
let (pageResults, nextCursor) = try await fetchGamesPage(
|
|
since: lastSync,
|
|
cursor: cursor
|
|
)
|
|
allResults.append(contentsOf: pageResults)
|
|
cursor = nextCursor
|
|
|
|
} while cursor != nil
|
|
|
|
return allResults
|
|
}
|
|
|
|
private func fetchGamesPage(
|
|
since lastSync: Date?,
|
|
cursor: CKQueryOperation.Cursor?
|
|
) async throws -> ([SyncGame], CKQueryOperation.Cursor?) {
|
|
// Uses CKQueryOperation for explicit cursor control
|
|
// Returns up to 400 records per page
|
|
}
|
|
```
|
|
|
|
**Rationale:**
|
|
- Cancellation checked between pages (not mid-fetch)
|
|
- Returns partial results on cancellation (not thrown away)
|
|
- Page size of 400 balances throughput vs. memory
|
|
|
|
### 2. Cancellation & Progress Saving
|
|
|
|
**Cancellation Token Implementation:**
|
|
|
|
```swift
|
|
final class BackgroundTaskCancellationToken: SyncCancellationToken, @unchecked Sendable {
|
|
private let _isCancelled = OSAllocatedUnfairLock(initialState: false)
|
|
|
|
var isCancelled: Bool { _isCancelled.withLock { $0 } }
|
|
|
|
func cancel() { _isCancelled.withLock { $0 = true } }
|
|
}
|
|
```
|
|
|
|
**Enhanced SyncState:**
|
|
|
|
Add per-entity sync timestamps for partial progress:
|
|
|
|
```swift
|
|
// Add to existing SyncState model
|
|
var lastStadiumSync: Date?
|
|
var lastTeamSync: Date?
|
|
var lastGameSync: Date?
|
|
var lastLeagueStructureSync: Date?
|
|
var lastTeamAliasSync: Date?
|
|
var lastStadiumAliasSync: Date?
|
|
var lastSportSync: Date?
|
|
```
|
|
|
|
**Modified Sync Flow:**
|
|
|
|
```swift
|
|
func syncAll(context: ModelContext, cancellationToken: SyncCancellationToken? = nil) async throws -> SyncResult {
|
|
// Sync each entity type, saving progress after each completes
|
|
|
|
let stadiums = try await syncStadiums(context: context, cancellationToken: cancellationToken)
|
|
syncState.lastStadiumSync = Date() // Save progress immediately
|
|
try context.save()
|
|
|
|
if cancellationToken?.isCancelled == true {
|
|
return partialResult(stadiums: stadiums)
|
|
}
|
|
|
|
let teams = try await syncTeams(context: context, cancellationToken: cancellationToken)
|
|
syncState.lastTeamSync = Date()
|
|
try context.save()
|
|
|
|
// ... continue for each entity type
|
|
}
|
|
```
|
|
|
|
**Key behaviors:**
|
|
- Progress saved after each entity type completes
|
|
- Cancellation returns partial results instead of throwing
|
|
- Next sync uses per-entity timestamps to resume
|
|
|
|
### 3. Network Monitor
|
|
|
|
**NetworkMonitor Singleton:**
|
|
|
|
```swift
|
|
import Network
|
|
|
|
@MainActor
|
|
final class NetworkMonitor: ObservableObject {
|
|
static let shared = NetworkMonitor()
|
|
|
|
@Published private(set) var isConnected: Bool = true
|
|
@Published private(set) var connectionType: ConnectionType = .unknown
|
|
|
|
enum ConnectionType { case wifi, cellular, wired, unknown }
|
|
|
|
private let monitor = NWPathMonitor()
|
|
private let queue = DispatchQueue(label: "NetworkMonitor")
|
|
private var debounceTask: Task<Void, Never>?
|
|
|
|
private init() {
|
|
monitor.pathUpdateHandler = { [weak self] path in
|
|
Task { @MainActor in
|
|
self?.handlePathUpdate(path)
|
|
}
|
|
}
|
|
monitor.start(queue: queue)
|
|
}
|
|
|
|
private func handlePathUpdate(_ path: NWPath) {
|
|
let wasConnected = isConnected
|
|
isConnected = (path.status == .satisfied)
|
|
connectionType = detectConnectionType(path)
|
|
|
|
// Debounce: only trigger sync if we GAINED connection
|
|
if !wasConnected && isConnected {
|
|
debounceTask?.cancel()
|
|
debounceTask = Task {
|
|
try? await Task.sleep(for: .seconds(2.5))
|
|
guard !Task.isCancelled else { return }
|
|
NotificationCenter.default.post(name: .networkRestored, object: nil)
|
|
}
|
|
}
|
|
}
|
|
}
|
|
|
|
extension Notification.Name {
|
|
static let networkRestored = Notification.Name("NetworkRestored")
|
|
}
|
|
```
|
|
|
|
**BackgroundSyncManager Integration:**
|
|
|
|
```swift
|
|
// In BackgroundSyncManager.configure(with:)
|
|
NotificationCenter.default.addObserver(
|
|
forName: .networkRestored,
|
|
object: nil,
|
|
queue: .main
|
|
) { [weak self] _ in
|
|
Task { @MainActor in
|
|
await self?.performSyncIfNeeded()
|
|
}
|
|
}
|
|
```
|
|
|
|
**Rationale:**
|
|
- 2.5 second debounce handles WiFi<->cellular transitions
|
|
- Only triggers on connection *gained* (not every path update)
|
|
- Uses NotificationCenter for loose coupling
|
|
|
|
## Error Handling
|
|
|
|
| Scenario | Behavior |
|
|
|----------|----------|
|
|
| Cancellation mid-page | Current page completes, then stops. Partial data saved. |
|
|
| Network drops during sync | CloudKit throws error, caught by existing handling. Per-entity progress preserved. |
|
|
| Multiple sync triggers overlap | Existing `syncInProgress` flag prevents duplicate syncs. |
|
|
| App killed during background task | Per-entity saves ensure most progress kept. |
|
|
| Network flapping | 2.5s debounce collapses rapid transitions into single sync. |
|
|
| First launch offline | Bootstrap from bundled JSON. NetworkMonitor triggers sync when connected. |
|
|
|
|
**Cancellation vs Error distinction:**
|
|
|
|
```swift
|
|
struct SyncResult {
|
|
// ... existing fields
|
|
let wasCancelled: Bool // NEW: distinguish clean cancel from error
|
|
}
|
|
```
|
|
|
|
When `wasCancelled == true`:
|
|
- Don't increment `consecutiveFailures`
|
|
- Don't set `lastSyncError`
|
|
- Do update per-entity timestamps
|
|
|
|
## Testing Strategy
|
|
|
|
### Unit Tests (CloudKitService)
|
|
|
|
| Test | Validates |
|
|
|------|-----------|
|
|
| `test_pagination_fetchesAllPages` | Mock returns 3 pages of 400, verifies all 1200 returned |
|
|
| `test_pagination_stopsOnCancellation` | Cancel after page 1, verify only ~400 returned |
|
|
| `test_pagination_handlesEmptyResult` | No records returns empty array, not error |
|
|
| `test_pagination_handlesSinglePage` | <400 records returns without extra fetch |
|
|
|
|
### Unit Tests (Cancellation)
|
|
|
|
| Test | Validates |
|
|
|------|-----------|
|
|
| `test_cancellationToken_threadSafe` | Concurrent read/write doesn't crash |
|
|
| `test_syncAll_savesProgressPerEntity` | Mock cancel after teams, verify stadiums saved but games not |
|
|
| `test_syncAll_doesNotIncrementFailuresOnCancel` | `consecutiveFailures` unchanged after clean cancel |
|
|
|
|
### Unit Tests (NetworkMonitor)
|
|
|
|
| Test | Validates |
|
|
|------|-----------|
|
|
| `test_debounce_collapsesRapidChanges` | 5 changes in 1s = 1 notification |
|
|
| `test_onlyNotifiesOnGainedConnection` | connected->disconnected doesn't trigger sync |
|
|
| `test_cancelsDebounceIfDisconnects` | connect, disconnect before 2.5s = no notification |
|
|
|
|
### Integration Test
|
|
|
|
| Test | Validates |
|
|
|------|-----------|
|
|
| `test_fullSyncFlow_withMockCloudKit` | Bootstrap -> sync -> cancel -> resume picks up correctly |
|
|
|
|
## Files to Modify
|
|
|
|
| File | Changes |
|
|
|------|---------|
|
|
| `CloudKitService.swift` | Add pagination to all fetch methods, accept cancellation token |
|
|
| `CanonicalSyncService.swift` | Per-entity progress saving, pass cancellation token, handle partial results |
|
|
| `BackgroundSyncManager.swift` | Create/pass cancellation token, subscribe to network notifications |
|
|
| `CanonicalModels.swift` | Add per-entity sync timestamps to SyncState |
|
|
| `NetworkMonitor.swift` | **NEW FILE** - NWPathMonitor wrapper with debouncing |
|
|
|
|
## Migration
|
|
|
|
No data migration required. New `SyncState` fields will be `nil` initially, causing a full sync on first run after update (existing behavior).
|