---
phase: 9.1-fix-flaky-test-when-ran-in-parallel
plan: 01
type: execute
---

Fix test suite flakiness where 6 tests fail in parallel execution but pass individually.

Purpose: Ensure reliable CI/CD testing by resolving Swift Testing parallel execution state pollution discovered in Phase 9.
Output: All 6 flaky tests pass consistently in full parallel test suite execution.

~/.claude/get-shit-done/workflows/execute-phase.md
~/.claude/get-shit-done/templates/summary.md

@.planning/PROJECT.md
@.planning/ROADMAP.md
@.planning/STATE.md
@.planning/codebase/TESTING.md

# Prior phase context
@.planning/phases/09-trip-planner-modes-tdd/09-01-SUMMARY.md
@.planning/phases/09-trip-planner-modes-tdd/09-02-SUMMARY.md
@.planning/phases/09-trip-planner-modes-tdd/09-03-SUMMARY.md

# Test files with flaky tests
@SportsTimeTests/ScenarioAPlannerSwiftTests.swift
@SportsTimeTests/ScenarioBPlannerTests.swift
@SportsTimeTests/ScenarioCPlannerTests.swift

**Flaky tests identified (6 total):**

From ScenarioAPlannerSwiftTests.swift (3 tests):
- `plan_StopDepartureDate_IsLastGameDate()` (line ~300)
- `plan_ManyGames_HandledEfficiently()` (line ~493)
- `plan_ThreeSameDayGames_PicksFeasibleCombinations()` (line ~944)

From ScenarioBPlannerTests.swift (2 tests):
- `plan_FillerSameDayAsAnchor_Excluded()`
- `plan_MustSeeGamesTooFarApart_Fails()`

From ScenarioCPlannerTests.swift (1 test):
- `corridor_MultipleGamesMixed_FiltersCorrectly()`

**Suspected root cause:** Swift Testing runs tests in parallel by default. The tests likely share mutable state (actor instances, simulator state), so one test's writes can interfere with another's reads.

**Tech stack available:**
- Swift Testing framework (iOS 26+)
- `#expect()` assertions
- `@Suite` and `@Test` attributes
- `.serialized` trait for controlling parallelization
- `confirmation()` for confirming that expected asynchronous events occur

**Key observation:** ScenarioAPlannerSwiftTests is missing the `@Suite` attribute (the other test files have it).
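To make the suspected root cause concrete, the sketch below shows one way per-test state can be kept separate in Swift Testing. It is an illustration only: `HypotheticalGameStore` and both test functions are invented names, not code from SportsTime. It relies on Swift Testing creating a new instance of the suite type for each test function, so a stored property is not shared between tests.

```swift
import Testing

// Hypothetical stand-in for mutable planner state; not taken from the SportsTime codebase.
actor HypotheticalGameStore {
    private var gameIDs: [String] = []

    func add(_ id: String) { gameIDs.append(id) }
    func count() -> Int { gameIDs.count }
}

// If both tests wrote to one shared instance (for example a singleton),
// parallel execution could interleave their add() calls and change the counts.
// Here each test function gets its own suite instance, and therefore its own store.
@Suite("Isolated store")
struct IsolatedStoreTests {
    let store = HypotheticalGameStore()

    @Test("stores one game")
    func storesOneGame() async {
        await store.add("game-1")
        let count = await store.count()
        #expect(count == 1)
    }

    @Test("stores two games")
    func storesTwoGames() async {
        await store.add("game-a")
        await store.add("game-b")
        let count = await store.count()
        #expect(count == 2)
    }
}
```

Whether this applies depends on what the `plan()` helper actually shares; if the shared state is a simulator resource rather than an in-process instance, per-test instances will not help.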
Task 1: Add Swift Testing isolation to flaky tests

SportsTimeTests/ScenarioAPlannerSwiftTests.swift, SportsTimeTests/ScenarioBPlannerTests.swift, SportsTimeTests/ScenarioCPlannerTests.swift

Apply Swift Testing isolation patterns to prevent state pollution:

1. **Add @Suite attribute to ScenarioAPlannerSwiftTests** (currently missing):
   - Add `@Suite("ScenarioA Tests")` before `struct ScenarioAPlannerSwiftTests`
   - Matches pattern used in ScenarioBPlannerTests and ScenarioCPlannerTests
   - The attribute alone does not change how tests run; it is the place to attach a display name and traits

2. **Apply .serialized trait to the suites that contain the flaky tests:**
   - On a non-parameterized `@Test` function `.serialized` has no effect (on a function it only serializes the cases of a parameterized test), so attach it to the enclosing suite
   - Swift Testing syntax: `@Suite("suite description", .serialized)` (display name first, traits after)
   - To keep the remaining tests in a file parallel, move the flaky tests into a nested `@Suite` and put `.serialized` on that nested suite only
   - Example: Change `@Suite("ScenarioA Tests")` to `@Suite("ScenarioA Tests", .serialized)`

3. **Check for shared actor state:**
   - Check planner initialization in the `plan()` helper to see whether actor instances are shared across tests
   - `confirmation()` verifies that an expected event occurs during a test; it does not isolate or synchronize shared state, so add it only where a test has to wait for such an event

**Why .serialized:** These tests likely interfere due to parallel execution timing. Running them serially removes interleaving between the tests of a suite while the other tests keep their parallel execution benefits. The trait does not order a suite relative to unrelated suites, so if the interference crosses the three test files, group the flaky tests under one common serialized parent suite.

**What to avoid and WHY:**
- Don't serialize more than necessary - keep `.serialized` on the smallest suite that contains the 6 flaky tests. Other tests benefit from parallel execution speed.
- Don't use XCTest patterns (like `setUp()`/`tearDown()`) - this is Swift Testing, use Swift Testing patterns only.
- Don't modify test assertions - the tests validate correct behavior individually; the issue is isolation, not correctness.
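The following is a minimal sketch of suite-level serialization, assuming the flaky tests can be moved into a nested suite. `HypotheticalPlanner`, the nested `OrderSensitive` suite, `plan_NoGames_ReturnsEmpty`, and all test bodies are placeholders invented for illustration; only the outer type name and two test names come from this plan.

```swift
import Testing

// Hypothetical stub so the sketch compiles on its own; the real planner lives in SportsTime.
struct HypotheticalPlanner {
    func plan(gameIDs: [String]) -> [String] { gameIDs.sorted() }
}

@Suite("ScenarioA Tests")
struct ScenarioAPlannerSwiftTests {
    // Tests declared directly in the outer suite are still eligible to run in parallel.
    @Test("handles empty input")
    func plan_NoGames_ReturnsEmpty() {
        #expect(HypotheticalPlanner().plan(gameIDs: []).isEmpty)
    }

    // Display name first, traits after. The trait makes the tests inside
    // this nested suite run one at a time relative to each other.
    @Suite("Order-sensitive tests", .serialized)
    struct OrderSensitive {
        @Test("handles many games")
        func plan_ManyGames_HandledEfficiently() {
            let stops = HypotheticalPlanner().plan(gameIDs: ["b", "a", "c"])
            #expect(stops == ["a", "b", "c"])
        }

        @Test("stop departure date is last game date")
        func plan_StopDepartureDate_IsLastGameDate() {
            let stops = HypotheticalPlanner().plan(gameIDs: ["late", "early"])
            #expect(stops.last == "late")
        }
    }
}
```

Nesting keeps the serialized scope small. The simpler alternative is to put `.serialized` on the outer suite, at the cost of running every test in that file one at a time.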
Run full test suite in parallel:

```bash
xcodebuild -project SportsTime.xcodeproj -scheme SportsTime -destination 'platform=iOS Simulator,name=iPhone 17,OS=26.2' test
```

All 6 previously flaky tests must pass:
- `plan_StopDepartureDate_IsLastGameDate()`
- `plan_ManyGames_HandledEfficiently()`
- `plan_ThreeSameDayGames_PicksFeasibleCombinations()`
- `plan_FillerSameDayAsAnchor_Excluded()`
- `plan_MustSeeGamesTooFarApart_Fails()`
- `corridor_MultipleGamesMixed_FiltersCorrectly()`

Run the suite 3 times to confirm consistency. Three consecutive passes make remaining flakiness unlikely but do not rule it out; Task 2 adds further runs.

- @Suite attribute added to ScenarioAPlannerSwiftTests
- .serialized trait applied so that all 6 flaky tests run serially
- Full test suite passes 3 consecutive times
- Test output shows no failures caused by parallel interference

Task 2: Verify test isolation doesn't reduce coverage

SportsTimeTests/ScenarioAPlannerSwiftTests.swift, SportsTimeTests/ScenarioBPlannerTests.swift, SportsTimeTests/ScenarioCPlannerTests.swift

Confirm serialization fixes flakiness without compromising test quality:

1. **Run tests individually to confirm behavior unchanged:**

   ```bash
   # Run each flaky test individually (same project, scheme, and destination as the full run)
   ARGS=(-project SportsTime.xcodeproj -scheme SportsTime -destination 'platform=iOS Simulator,name=iPhone 17,OS=26.2')
   xcodebuild "${ARGS[@]}" -only-testing:SportsTimeTests/ScenarioAPlannerSwiftTests/plan_StopDepartureDate_IsLastGameDate test
   xcodebuild "${ARGS[@]}" -only-testing:SportsTimeTests/ScenarioAPlannerSwiftTests/plan_ManyGames_HandledEfficiently test
   # ... repeat for all 6
   # If a filter matches no tests, try quoting the identifier with a trailing "()".
   ```

   All must still pass individually (sanity check).

2. **Measure performance impact:**
   - Note full suite execution time before changes (from Task 1 baseline)
   - Note full suite execution time after .serialized changes
   - Document in SUMMARY.md (expected: minimal impact since only 6/180+ tests serialized)

3. **Verify no new flaky tests introduced:**
   - Check test output for any other tests now showing intermittent failures
   - Run suite 5 times total (2 more runs after Task 1's 3 runs)
   - All 180+ tests must pass consistently

**What to avoid and WHY:**
- Don't skip individual test verification - it ensures serialization didn't break test logic
- Don't accept new flaky tests - if any appear, investigate whether .serialized needs to be applied more broadly or whether there is a different issue

```bash
# Verify all tests pass consistently
for i in {1..5}; do
  echo "Run $i/5"
  # Also match xcodebuild's overall result line (** TEST SUCCEEDED ** / ** TEST FAILED **)
  xcodebuild -project SportsTime.xcodeproj -scheme SportsTime -destination 'platform=iOS Simulator,name=iPhone 17,OS=26.2' test | grep -E "Test Suite 'All tests'|\*\* TEST (SUCCEEDED|FAILED) \*\*"
done
```

Expected output (all 5 runs):
- "Test Suite 'All tests' passed"
- "** TEST SUCCEEDED **"
- No failures in any of the 5 runs

- All 6 previously flaky tests pass individually
- Full test suite passes 5 consecutive times (100% success rate)
- No new flaky tests detected
- Performance impact documented (estimate: serializing 6/180+ tests should add ~1-2 seconds max; record the measured value)

Before declaring phase complete:
- [ ] Full test suite passes consistently (5/5 runs successful)
- [ ] All 6 previously flaky tests now pass in parallel execution
- [ ] No new flaky tests introduced
- [ ] Individual test execution still passes for all 6 tests
- [ ] Test isolation change documented in SUMMARY.md

- All tasks completed
- All verification checks pass
- 6 flaky tests now pass reliably in parallel test suite
- No degradation in test coverage or execution
- CI/CD testing is now reliable

After completion, create `.planning/phases/9.1-fix-flaky-test-when-ran-in-parallel/9.1-01-SUMMARY.md`:

# Phase 9.1 Plan 01: Fix Flaky Test Parallel Execution Summary

**[Substantive one-liner describing solution - e.g., "Applied .serialized trait to 6 flaky tests for reliable CI/CD execution"]**

## Performance

- **Duration:** [time]
- **Started:** [timestamp]
- **Completed:** [timestamp]
- **Tests:** 6 flaky tests fixed

## Accomplishments

- Fixed test suite flakiness affecting 6 tests
- Added @Suite attribute to ScenarioAPlannerSwiftTests
- Applied .serialized trait to prevent parallel execution interference
- Verified consistent test suite execution (5/5 runs pass)

## Files Created/Modified

- `SportsTimeTests/ScenarioAPlannerSwiftTests.swift` - Added @Suite, serialized 3 tests
- `SportsTimeTests/ScenarioBPlannerTests.swift` - Serialized 2 tests
- `SportsTimeTests/ScenarioCPlannerTests.swift` - Serialized 1 test

## Decisions Made

| Decision | Rationale |
|----------|-----------|
| Use .serialized trait instead of global serialization | Only 6/180+ tests affected, maintains parallel execution benefits for other tests |
| Add @Suite to ScenarioAPlannerSwiftTests | Matches pattern in other test files, provides better test organization |

## Deviations from Plan

[Auto-fixed issues or deferred enhancements]

## Issues Encountered

[Problems and resolutions, or "None"]

## Next Phase Readiness

- Test suite reliability: 100% (5/5 parallel runs pass)
- Ready for Phase 10: Trip Builder Options TDD
- No blockers