plan: Phase 8 DAG System TDD with 2 plans

- 08-01: GameDAGRouter edge cases and anchor validation TDD (17+ tests)
- 08-02: Performance with large datasets (10K+ games) and diversity coverage TDD

TDD discipline: tests define correctness, code must match.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-10 11:36:29 -06:00
parent 8c98e95801
commit a786d7e2aa
4 changed files with 313 additions and 8 deletions
@@ -37,10 +37,11 @@
 **Goal**: Performance and edge case tests with tens of thousands of objects; fix code if tests fail
 **Depends on**: v1.0 milestone complete
 **Research**: Unlikely (internal patterns, existing code)
-**Plans**: TBD
+**Plans**: 2
 Plans:
- [ ] 08-01: TBD (run /gsd:plan-phase 8 to break down)
+- [ ] 08-01: GameDAGRouter edge cases and anchor validation TDD
+- [ ] 08-02: GameDAGRouter performance with large datasets and diversity coverage TDD
 #### Phase 9: Trip Planner Modes TDD
@@ -108,7 +109,7 @@ Plans:
 | 5. CloudKit CRUD | v1.0 | 2/2 | Complete | 2026-01-10 |
 | 6. Validation Reports | v1.0 | 1/1 | Complete | 2026-01-10 |
 | 7. Testing & Documentation | v1.0 | 1/1 | Complete | 2026-01-10 |
-| 8. DAG System TDD | v1.1 | 0/? | Not started | - |
+| 8. DAG System TDD | v1.1 | 0/2 | Planned | - |
 | 9. Trip Planner Modes TDD | v1.1 | 0/? | Not started | - |
 | 10. Trip Builder Options TDD | v1.1 | 0/? | Not started | - |
 | 11. Itinerary & Constraints TDD | v1.1 | 0/? | Not started | - |
@@ -10,9 +10,9 @@ See: .planning/PROJECT.md (updated 2026-01-10)
 ## Current Position
 Phase: 8 of 12 (DAG System TDD)
-Plan: Not started
+Plan: 08-01 ready (2 plans total)
-Status: Ready to plan
+Status: Ready to execute
-Last activity: 2026-01-10 — Milestone v1.1 TDD & Correctness created
+Last activity: 2026-01-10 — Phase 8 planned (2 TDD plans)
 Progress: ░░░░░░░░░░ 0%
@@ -43,6 +43,6 @@ None.
 ## Session Continuity
 Last session: 2026-01-10
-Stopped at: Milestone v1.1 initialization
+Stopped at: Phase 8 planning complete
 Resume file: None
-Next action: /gsd:plan-phase 8 to plan first phase
+Next action: /gsd:execute-plan 08-01 to start TDD execution
@@ -0,0 +1,131 @@
 ---
 phase: 08-dag-system-tdd
 type: execute
 ---
 <objective>
 TDD for GameDAGRouter edge cases and anchor game validation.
 Purpose: Ensure the DAG routing algorithm handles boundary conditions correctly before testing at scale.
 Output: Comprehensive edge case test suite for GameDAGRouter, with code fixes if tests fail.
 </objective>
 <execution_context>
 ~/.claude/get-shit-done/workflows/execute-phase.md
 ./summary.md
 ~/.claude/get-shit-done/references/tdd.md
 </execution_context>
 <context>
@.planning/PROJECT.md
@.planning/ROADMAP.md
@SportsTime/Planning/Engine/GameDAGRouter.swift
@SportsTimeTests/ScenarioBPlannerTests.swift
 </context>
 <tasks>
 <task type="auto">
  <name>Task 1: Create GameDAGRouterTests with edge case tests</name>
  <files>SportsTimeTests/GameDAGRouterTests.swift</files>
  <action>
 Create a new test file using Swift Testing framework (@Test attributes, #expect assertions).
 Include test helpers:
 - makeStadium(id:city:lat:lon:) with default coordinates spread across US
 - makeGame(id:stadiumId:startTime:) with default sport/teams
 - date(_:) helper for "yyyy-MM-dd HH:mm" parsing
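A minimal sketch of the `date(_:)` helper listed above, assuming fixtures are interpreted in UTC with a fixed POSIX locale so parsing does not depend on the host machine. The time zone choice and the crash-on-bad-input behavior are assumptions, not something the plan specifies.

```swift
import Foundation

// Hypothetical version of the plan's date(_:) test helper.
// Assumption: fixture strings are UTC; the real helper may use a local time zone.
func date(_ string: String) -> Date {
    let formatter = DateFormatter()
    formatter.dateFormat = "yyyy-MM-dd HH:mm"
    formatter.locale = Locale(identifier: "en_US_POSIX") // fixed locale for stable parsing
    formatter.timeZone = TimeZone(secondsFromGMT: 0)
    guard let parsed = formatter.date(from: string) else {
        // A malformed fixture is a bug in the test itself, so fail loudly.
        fatalError("Unparseable fixture date: \(string)")
    }
    return parsed
}

let first = date("2026-06-01 19:00")
let second = date("2026-06-01 23:00")
let gapHours = second.timeIntervalSince(first) / 3600
```

Pinning the locale and time zone keeps hour gaps such as "4 hours apart" exact regardless of where the simulator runs.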
 Write RED tests for these edge cases:
 1. Empty games array → returns empty routes
 2. Single game → returns [[game]] (unless anchor mismatch)
 3. Single game with non-matching anchor → returns []
 4. Two games, chronological and feasible → returns route containing both
 5. Two games, chronological but infeasible (too far) → returns two separate single-game routes
 6. Two games, reverse chronological (second before first) → returns two separate single-game routes
 7. Three games where only pairs are feasible → returns all valid pairs/singles
 8. Anchor game filtering: routes missing anchors are excluded
 9. Repeat cities OFF: routes with same city twice are excluded
 10. Repeat cities ON: routes with same city twice are included
 Run tests expecting failures for any code gaps:
 ```bash
 xcodebuild -project SportsTime.xcodeproj -scheme SportsTime -destination 'platform=iOS Simulator,name=iPhone 17,OS=26.2' -only-testing:SportsTimeTests/GameDAGRouterTests test
 ```
  </action>
  <verify>All edge case tests exist and execute (RED or GREEN)</verify>
  <done>GameDAGRouterTests.swift contains 10+ edge case tests using Swift Testing framework</done>
 </task>
 <task type="auto">
  <name>Task 2: Fix any failing edge case tests</name>
  <files>SportsTime/Planning/Engine/GameDAGRouter.swift</files>
  <action>
 Run the edge case tests. For any failures:
 1. Identify the exact assertion that fails
 2. Trace the code path in GameDAGRouter.swift
 3. Fix the logic bug (do NOT modify the test - tests define correctness)
 4. Re-run until GREEN
 Common fix areas:
 - Edge case handling in findRoutes() lines 101-120
 - canTransition() feasibility logic lines 463-507
 - Anchor filtering logic lines 181-184
 Do NOT change test expectations. If a test fails, the code is wrong.
  </action>
  <verify>
 ```bash
 xcodebuild -project SportsTime.xcodeproj -scheme SportsTime -destination 'platform=iOS Simulator,name=iPhone 17,OS=26.2' -only-testing:SportsTimeTests/GameDAGRouterTests test
 ```
 All tests pass
  </verify>
  <done>All 10+ edge case tests pass (GREEN)</done>
 </task>
 <task type="auto">
  <name>Task 3: Add canTransition boundary tests</name>
  <files>SportsTimeTests/GameDAGRouterTests.swift</files>
  <action>
 Add tests for canTransition edge cases via findRoutes() behavior:
 1. Same stadium, same day, 4 hours apart → transition feasible
 2. Different stadium, 1000 miles apart, same day → infeasible (not enough driving time)
 3. Different stadium, 1000 miles apart, 2 days apart → feasible (enough driving days)
 4. Different stadium, 100 miles apart, 4 hours available → feasible
 5. Different stadium, 100 miles apart, 1 hour available → infeasible (need 3hr buffer after game)
 6. Game end buffer: 3hr buffer after game end before departure
 7. Arrival buffer: 1hr buffer before next game start
 These test the canTransition() logic indirectly through findRoutes() results.
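The distance-based cases above (100 and 1000 miles apart) need stadium fixtures whose separation is known. One way to sanity-check fixture coordinates is the haversine great-circle formula, sketched below. The function name and the Earth radius constant are assumptions; the router may compute distance differently, so treat this as a fixture check rather than a copy of its logic.

```swift
import Foundation

// Great-circle distance in miles using the haversine formula.
// Assumption: mean Earth radius of 3958.8 miles.
func haversineMiles(lat1: Double, lon1: Double, lat2: Double, lon2: Double) -> Double {
    let earthRadiusMiles = 3958.8
    let toRadians = Double.pi / 180
    let dLat = (lat2 - lat1) * toRadians
    let dLon = (lon2 - lon1) * toRadians
    let a = sin(dLat / 2) * sin(dLat / 2)
        + cos(lat1 * toRadians) * cos(lat2 * toRadians) * sin(dLon / 2) * sin(dLon / 2)
    // Clamp to 1 to guard against rounding pushing sqrt(a) slightly above 1.
    return 2 * earthRadiusMiles * asin(min(1, sqrt(a)))
}

// Illustrative coordinates: New York and Los Angeles city centers.
let nyToLa = haversineMiles(lat1: 40.7128, lon1: -74.0060, lat2: 34.0522, lon2: -118.2437)
let samePoint = haversineMiles(lat1: 40.7128, lon1: -74.0060, lat2: 40.7128, lon2: -74.0060)
```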
  </action>
  <verify>
 ```bash
 xcodebuild -project SportsTime.xcodeproj -scheme SportsTime -destination 'platform=iOS Simulator,name=iPhone 17,OS=26.2' -only-testing:SportsTimeTests/GameDAGRouterTests test
 ```
 All tests pass
  </verify>
  <done>7 additional boundary tests pass (17+ total tests)</done>
 </task>
 </tasks>
 <verification>
 Before declaring phase complete:
 - [ ] `xcodebuild test` for GameDAGRouterTests passes with 17+ tests
 - [ ] All edge cases documented in test names
 - [ ] No tests were modified to pass (code was fixed instead)
 - [ ] Existing tests in other test files still pass
 </verification>
 <success_criteria>
 - All tasks completed
 - All verification checks pass
 - 17+ edge case tests for GameDAGRouter
 - TDD discipline maintained (tests define correctness)
 </success_criteria>
 <output>
 After completion, create `.planning/phases/08-dag-system-tdd/08-01-SUMMARY.md`
 </output>
@@ -0,0 +1,173 @@
 ---
 phase: 08-dag-system-tdd
 type: execute
 ---
 <objective>
 TDD for GameDAGRouter performance with large datasets and diversity coverage.
 Purpose: Ensure the DAG algorithm performs well with production-scale data (10K+ games) and produces diverse route options.
 Output: Performance test suite validating scalability and diversity guarantees, with code fixes if tests fail.
 </objective>
 <execution_context>
 ~/.claude/get-shit-done/workflows/execute-phase.md
 ./summary.md
 ~/.claude/get-shit-done/references/tdd.md
 </execution_context>
 <context>
@.planning/PROJECT.md
@.planning/ROADMAP.md
@SportsTime/Planning/Engine/GameDAGRouter.swift
@SportsTimeTests/GameDAGRouterTests.swift
 </context>
 <tasks>
 <task type="auto">
  <name>Task 1: Add performance tests with large datasets</name>
  <files>SportsTimeTests/GameDAGRouterTests.swift</files>
  <action>
 Add performance tests to existing GameDAGRouterTests file.
 Create helper to generate large test data:
 ```swift
 private func generateLargeDataset(
    gameCount: Int,
    stadiumCount: Int,
    daysSpan: Int
 ) -> (games: [Game], stadiums: [UUID: Stadium])
 ```
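One possible shape for the generator body, as a sketch only. `SketchGame`, `SketchStadium`, `SplitMix64`, the `seed` parameter, and the coordinate bounds are all hypothetical stand-ins; the project's real `Game` and `Stadium` models have more fields. A seeded generator is used so that coordinates and start times repeat between runs, which helps keep timing results comparable (the UUIDs themselves are still fresh each run).

```swift
import Foundation

// Hypothetical stand-ins for the project's Game and Stadium models.
struct SketchStadium { let id: UUID; let latitude: Double; let longitude: Double }
struct SketchGame { let id: UUID; let stadiumId: UUID; let startTime: Date }

// SplitMix64: a small seedable generator for repeatable test data.
struct SplitMix64: RandomNumberGenerator {
    private var state: UInt64
    init(seed: UInt64) { state = seed }
    mutating func next() -> UInt64 {
        state &+= 0x9E3779B97F4A7C15
        var z = state
        z = (z ^ (z >> 30)) &* 0xBF58476D1CE4E5B9
        z = (z ^ (z >> 27)) &* 0x94D049BB133111EB
        return z ^ (z >> 31)
    }
}

func generateSketchDataset(gameCount: Int, stadiumCount: Int, daysSpan: Int, seed: UInt64 = 42)
    -> (games: [SketchGame], stadiums: [UUID: SketchStadium]) {
    precondition(stadiumCount > 0 && daysSpan > 0, "Need at least one stadium and one day")
    var rng = SplitMix64(seed: seed)
    let base = Date(timeIntervalSince1970: 1_780_000_000) // arbitrary fixed start instant

    // Scatter stadiums over a rough bounding box of the contiguous US (assumed bounds).
    let stadiumList = (0..<stadiumCount).map { _ in
        SketchStadium(id: UUID(),
                      latitude: Double.random(in: 25.0...48.0, using: &rng),
                      longitude: Double.random(in: (-123.0)...(-71.0), using: &rng))
    }

    // Assign each game a random stadium, day, and afternoon/evening start hour.
    let games = (0..<gameCount).map { _ -> SketchGame in
        let stadium = stadiumList[Int.random(in: 0..<stadiumCount, using: &rng)]
        let day = Int.random(in: 0..<daysSpan, using: &rng)
        let hour = Int.random(in: 12...20, using: &rng)
        let start = base.addingTimeInterval(TimeInterval(day * 86_400 + hour * 3_600))
        return SketchGame(id: UUID(), stadiumId: stadium.id, startTime: start)
    }.sorted { $0.startTime < $1.startTime }

    let stadiums = Dictionary(uniqueKeysWithValues: stadiumList.map { ($0.id, $0) })
    return (games, stadiums)
}

let sample = generateSketchDataset(gameCount: 1000, stadiumCount: 50, daysSpan: 30)
```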
 Write RED tests:
 1. 1000 games, 50 stadiums, 30 days → completes in <2 seconds
 2. 5000 games, 100 stadiums, 60 days → completes in <10 seconds
 3. 10000 games, 150 stadiums, 90 days → completes in <30 seconds
 4. Memory: 10K games do not cause a memory spike (verify routes are returned, no OOM)
 Use the Swift standard library's `ContinuousClock` for timing:
 ```swift
@Test("10K games completes in reasonable time")
 func performance_10KGames_CompletesInTime() async {
     let (games, stadiums) = generateLargeDataset(gameCount: 10000, stadiumCount: 150, daysSpan: 90)
     // Time only the routing call, not dataset generation.
     let start = ContinuousClock.now
     let routes = GameDAGRouter.findRoutes(games: games, stadiums: stadiums, constraints: .default)
     let elapsed = start.duration(to: .now)
     #expect(!routes.isEmpty, "Should return routes")
     #expect(elapsed < .seconds(30), "Should complete within 30 seconds")
 }
 ```
 Run tests and note any performance failures.
  </action>
  <verify>Performance tests exist and execute</verify>
  <done>4 performance tests written with timing assertions</done>
 </task>
 <task type="auto">
  <name>Task 2: Fix any performance issues</name>
  <files>SportsTime/Planning/Engine/GameDAGRouter.swift</files>
  <action>
 If performance tests fail (timeouts), optimize GameDAGRouter:
 Potential optimizations (apply only if tests fail):
 1. Reduce beam width for very large inputs (dynamic scaling based on game count)
 2. Early termination when enough diverse routes found
 3. More aggressive diversity pruning during expansion
 4. Pre-compute stadium distances instead of recalculating
 Do NOT weaken test expectations. If the test says 30 seconds, the code must meet that.
 Re-run tests until performance requirements met.
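For option 4, a lazily filled cache is a close cousin of full pre-computation and can be sketched as follows. `StadiumPair`, `DistanceCache`, and the placeholder distance closure are hypothetical names for illustration; nothing here reflects how GameDAGRouter currently stores distances.

```swift
import Foundation

// Hypothetical key that treats (a, b) and (b, a) as the same stadium pair.
struct StadiumPair: Hashable {
    let first: UUID
    let second: UUID
    init(_ a: UUID, _ b: UUID) {
        // Normalize order so both directions share one cache entry.
        if a.uuidString <= b.uuidString { first = a; second = b } else { first = b; second = a }
    }
}

// Memoizes a distance function so each unordered pair is computed once.
struct DistanceCache {
    private var cache: [StadiumPair: Double] = [:]
    private let distance: (UUID, UUID) -> Double
    private(set) var misses = 0 // number of times the underlying function ran

    init(distance: @escaping (UUID, UUID) -> Double) { self.distance = distance }

    mutating func miles(from a: UUID, to b: UUID) -> Double {
        let key = StadiumPair(a, b)
        if let hit = cache[key] { return hit }
        misses += 1
        let value = distance(key.first, key.second)
        cache[key] = value
        return value
    }
}

let x = UUID(), y = UUID()
var cache = DistanceCache { _, _ in 250.0 } // placeholder distance function
let d1 = cache.miles(from: x, to: y)
let d2 = cache.miles(from: y, to: x)
```

With 150 stadiums there are at most 11,175 unordered pairs, so the table stays small even when the game count reaches 10K.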
  </action>
  <verify>
 ```bash
 xcodebuild -project SportsTime.xcodeproj -scheme SportsTime -destination 'platform=iOS Simulator,name=iPhone 17,OS=26.2' -only-testing:SportsTimeTests/GameDAGRouterTests test
 ```
 All performance tests pass within time limits
  </verify>
  <done>All 4 performance tests pass within specified time limits</done>
 </task>
 <task type="auto">
  <name>Task 3: Add diversity coverage tests</name>
  <files>SportsTimeTests/GameDAGRouterTests.swift</files>
  <action>
 Add tests verifying the diversity selection produces varied results:
 1. Game count diversity: With 50 games over 10 days, routes include 2-game, 3-game, 4-game, and 5+ game options
 2. City count diversity: Routes span different numbers of cities (2, 3, 4, 5+)
 3. Mileage diversity: Routes include short (<500mi), medium (500-1000mi), and long (1000+mi) options
 4. Duration diversity: Routes include 2-day, 3-day, 5-day, and 7+ day options
 5. Bucket coverage: At least 3 of 5 game count buckets represented in output
 6. No duplicates: All returned routes have unique game combinations
 Test diversity by analyzing the returned routes:
 ```swift
@Test("diversity includes varied game counts")
 func diversity_VariedGameCounts() {
    let (games, stadiums) = generateDiverseDataset() // 50 games, 20 stadiums, 14 days
    let routes = GameDAGRouter.findRoutes(games: games, stadiums: stadiums, constraints: .default)
    let gameCounts = Set(routes.map { $0.count })
    #expect(gameCounts.count >= 3, "Should have at least 3 different route lengths")
    #expect(gameCounts.contains { $0 <= 3 }, "Should include short routes")
    #expect(gameCounts.contains { $0 >= 5 }, "Should include long routes")
 }
 ```
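The "no duplicates" check (item 6) can be sketched by comparing routes as sets of game IDs. This assumes each route can be reduced to an array of `UUID` values, for example via `route.map(\.id)`; the function name is hypothetical.

```swift
import Foundation

// Returns true when no two routes contain exactly the same set of game IDs.
func hasUniqueGameCombinations(_ routes: [[UUID]]) -> Bool {
    var seen = Set<Set<UUID>>()
    for route in routes {
        // insert(_:) reports whether the element was newly added.
        if !seen.insert(Set(route)).inserted { return false }
    }
    return true
}

let g1 = UUID(), g2 = UUID(), g3 = UUID()
let uniqueRoutes = hasUniqueGameCombinations([[g1, g2], [g1, g3], [g2]])
let duplicatedRoutes = hasUniqueGameCombinations([[g1, g2], [g2, g1]])
```

Comparing sets rather than arrays means two routes with the same games in a different order also count as duplicates.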
  </action>
  <verify>
 ```bash
 xcodebuild -project SportsTime.xcodeproj -scheme SportsTime -destination 'platform=iOS Simulator,name=iPhone 17,OS=26.2' -only-testing:SportsTimeTests/GameDAGRouterTests test
 ```
 All diversity tests pass
  </verify>
  <done>6 diversity tests pass, verifying multi-dimensional variety</done>
 </task>
 <task type="auto">
  <name>Task 4: Fix any diversity issues</name>
  <files>SportsTime/Planning/Engine/GameDAGRouter.swift</files>
  <action>
 If diversity tests fail, fix selectDiverseRoutes() logic:
 Common issues:
 1. Not all buckets being sampled (check passes 1-4 in selectDiverseRoutes)
 2. Short routes getting pruned too early (check diversityPrune)
 3. Bucket calculations wrong (check RouteProfile bucket properties)
 Fix the diversity algorithm to ensure varied output. Do NOT modify test expectations.
  </action>
  <verify>
 ```bash
 xcodebuild -project SportsTime.xcodeproj -scheme SportsTime -destination 'platform=iOS Simulator,name=iPhone 17,OS=26.2' -only-testing:SportsTimeTests/GameDAGRouterTests test
 ```
 All tests pass
  </verify>
  <done>All diversity tests pass, proving multi-dimensional route variety</done>
 </task>
 </tasks>
 <verification>
 Before declaring phase complete:
 - [ ] Performance tests pass for 1K, 5K, 10K game datasets
 - [ ] Diversity tests verify varied route lengths, cities, miles, durations
 - [ ] No test assertions weakened to pass
 - [ ] All existing GameDAGRouter edge case tests still pass
 - [ ] Full test suite runs successfully
 </verification>
 <success_criteria>
 - All tasks completed
 - All verification checks pass
 - 10+ new tests (4 performance + 6 diversity)
 - GameDAGRouter handles 10K+ games efficiently
 - Diversity selection produces varied results
 </success_criteria>
 <output>
 After completion, create `.planning/phases/08-dag-system-tdd/08-02-SUMMARY.md`
 </output>