91 Commits

Author SHA1 Message Date
Trey t
e195944297 feat(08-02): optimize GameDAGRouter performance for large datasets
Implemented dynamic beam width scaling and early termination to handle
5K-10K game datasets efficiently. All performance tests now pass.

Optimizations:
- Dynamic beam width: 800+ games use width 50, 2K+ use 30, 5K+ use 25
- Early termination: stop expanding when beam reaches 3x target size
- Prevents exponential blowup during day-by-day expansion
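
As a rough illustration of the two optimizations listed above, the following Python sketch picks a beam width from the dataset size and checks the early-termination condition. It is not the GameDAGRouter code: the function names and `DEFAULT_BEAM_WIDTH` are hypothetical, and treating "800+" as "800 or more" is an assumption.

```python
# Thresholds come from the commit message. DEFAULT_BEAM_WIDTH is a
# hypothetical value for small datasets; the message does not state one.
DEFAULT_BEAM_WIDTH = 100


def beam_width_for(game_count: int) -> int:
    """Return a narrower beam as the number of games grows."""
    if game_count >= 5000:
        return 25
    if game_count >= 2000:
        return 30
    if game_count >= 800:
        return 50
    return DEFAULT_BEAM_WIDTH


def should_stop_expanding(beam_size: int, target_size: int) -> bool:
    """Early termination: stop once the beam holds 3x the target size."""
    return beam_size >= 3 * target_size
```

Checking the largest threshold first keeps the tiers mutually exclusive, so a 10K-game dataset gets width 25 rather than 50.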

Performance results:
- 1K games: 2s (was 2s baseline) ✓
- 5K games: 1s (was 13s) - 13x faster ✓
- 10K games: 1-2s (was 34s) - 17x faster ✓

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-01-10 12:16:58 -06:00
Trey t
13be6ffca5 test(08-02): add performance tests with large datasets
Added 4 performance tests with 1K, 5K, 10K games to validate DAG
algorithm scalability. Tests currently failing (RED phase).

Tests:
- 1K games: <2s expected
- 5K games: <10s expected
- 10K games: <30s expected
- 10K games: memory stability

Helper generateLargeDataset() creates realistic test data with
distributed stadiums and games across time spans.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-01-10 11:58:51 -06:00
Trey t
6e00663fec docs(08-01): complete GameDAGRouter edge cases plan
Plan 08-01 complete:
- 17 TDD tests for GameDAGRouter edge cases
- canTransition boundary validation tests
- Anchor filtering and repeat city handling tests

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-10 11:52:51 -06:00
Trey t
02cb09f4e5 test(08-01): canTransition boundary tests and cleanup
Add 7 canTransition boundary tests:
- Same stadium same day 4 hours apart is feasible
- Different stadium 1000 miles apart same day is infeasible
- Different stadium 380 miles apart 2 days apart is feasible
- Different stadium 100 miles apart 4 hours available is feasible
- Different stadium 100 miles apart 1 hour available is infeasible
- Game end buffer (3 hour) validation
- Arrival buffer (1 hour) validation
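
The boundary cases above can be reproduced with a small feasibility check. This is a sketch only, not the actual canTransition logic: the 3-hour and 1-hour buffers come from the list, while the 60 mph average speed, the function names, and measuring the gap between game start times are all assumptions.

```python
GAME_END_BUFFER_HOURS = 3.0  # from the commit message
ARRIVAL_BUFFER_HOURS = 1.0   # from the commit message
ASSUMED_SPEED_MPH = 60.0     # assumption: the source does not state a speed


def available_travel_hours(hours_between_starts: float) -> float:
    """Time left for driving after both buffers are subtracted."""
    return hours_between_starts - GAME_END_BUFFER_HOURS - ARRIVAL_BUFFER_HOURS


def can_transition(distance_miles: float, hours_between_starts: float) -> bool:
    """True if the drive fits between the two games, buffers included."""
    if hours_between_starts <= 0:
        return False  # second game does not start after the first
    hours_needed = distance_miles / ASSUMED_SPEED_MPH
    return hours_needed <= available_travel_hours(hours_between_starts)
```

Under these assumptions, "4 hours available" corresponds to start times 8 hours apart, and the same-stadium case passes because zero driving time fits in a window of exactly zero hours.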

Also removes broken DayCardTests that referenced types removed in
previous refactor (DayCard, DayConflictInfo).

Total: 17 GameDAGRouter edge case tests all passing.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-10 11:51:38 -06:00
Trey t
a4db9a92eb test(08-01): GameDAGRouter edge case tests
Add 10 TDD tests for GameDAGRouter covering:
- Empty games array returns empty routes
- Single game returns single-game route
- Single game with non-matching anchor returns empty
- Two chronological feasible games returns combined route
- Two games too far apart same day returns separate routes
- Two games reverse chronological returns separate routes
- Three games with only feasible pairs returns valid combinations
- Anchor filtering excludes routes missing anchors
- Repeat cities OFF excludes same city twice
- Repeat cities ON allows same city twice
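
The anchor and repeat-city rules in the list above amount to a filter over candidate routes. A minimal sketch, assuming a route is a list of games with an ID and a city; the `Game` type and `keep_route` name are hypothetical, and counting any second visit to a city as a repeat is an assumption.

```python
from typing import Iterable, List, NamedTuple


class Game(NamedTuple):
    game_id: str
    city: str


def keep_route(route: List[Game], anchor_ids: Iterable[str],
               allow_repeat_cities: bool) -> bool:
    """Keep a route only if it has every anchor and obeys the city rule."""
    route_ids = {game.game_id for game in route}
    if not set(anchor_ids) <= route_ids:
        return False  # at least one anchor game is missing
    if not allow_repeat_cities:
        cities = [game.city for game in route]
        if len(cities) != len(set(cities)):
            return False  # some city appears more than once
    return True
```

With no anchors the subset test is trivially true, so only the repeat-city rule applies.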

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-10 11:42:31 -06:00
Trey t
a786d7e2aa plan: Phase 8 DAG System TDD with 2 plans
- 08-01: GameDAGRouter edge cases and anchor validation TDD (17+ tests)
- 08-02: Performance with large datasets (10K+ games) and diversity coverage TDD

TDD discipline: tests define correctness, code must match.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-10 11:36:29 -06:00
Trey t
8c98e95801 docs: create milestone v1.1 TDD & Correctness (5 phases)
Phases:
- 8. DAG System TDD: Performance/edge case tests with 10k+ objects
- 9. Trip Planner Modes TDD: By dates, must-see games, start/end cities
- 10. Trip Builder Options TDD: Repeat cities, must-stops
- 11. Itinerary & Constraints TDD: Travel directions, driving limits
- 12. Integration Validation: End-to-end scenarios, regression suite

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-10 11:32:38 -06:00
Trey t
9ef4b1a770 remove cbb
2026-01-10 11:16:15 -06:00
Trey t
ca9fa535f1 chore: complete v1.0 Data Pipeline milestone
- Added MILESTONES.md entry with key accomplishments
- Evolved PROJECT.md with validated requirements
- Reorganized ROADMAP.md with milestone grouping
- Created milestone archive: milestones/v1.0-ROADMAP.md
- Updated STATE.md for next milestone planning
- Tagged v1.0

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-10 11:15:19 -06:00
Trey t
1b796a604c chore: remove CBB from pipeline scripts
CBB (College Basketball) was deferred in Phase 2.1 because its 350+ D1 teams
require a separately scoped approach. Remove it from the pipeline scripts.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-10 10:56:24 -06:00
Trey t
63fb06c41a fix: update pipeline imports to use sport modules
After Phase 1 refactoring moved scraper functions to sport-specific
modules (nba.py, mlb.py, etc.), these pipeline scripts still imported
from scrape_schedules.py.

- run_pipeline.py: import from core.py and sport modules
- validate_data.py: import from core.py and sport modules
- run_canonicalization_pipeline.py: import from core.py and sport modules

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-10 10:52:13 -06:00
Trey t
78f08449fc docs(07-01): complete documentation & finalization plan
Tasks completed: 2/2
- Create Scripts/README.md with pipeline documentation
- Update PROJECT.md with completion status

SUMMARY: .planning/phases/07-testing-documentation/07-01-SUMMARY.md

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-10 10:44:48 -06:00
Trey t
f1adaf342e docs(07-01): update PROJECT.md with completion status
- Mark all Active requirements as complete (7 items)
- Update Key Decisions outcomes (split by sport, validation reports, full CRUD)
- Update Current State to reflect resolved data quality and complete pipeline
- Update last updated date to 2026-01-10

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-10 10:43:32 -06:00
Trey t
d9f446bccb docs(07-01): create Scripts/README.md with pipeline documentation
- Overview and quick start commands
- ASCII architecture diagram showing data flow
- Module reference table for all Python scripts
- Sport modules table with stadium counts
- Data files and alias file documentation
- Pipeline commands for scraping, canonicalization, CloudKit

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-10 10:42:47 -06:00
Trey t
aeeb160a90 docs(07-01): create phase plan
Phase 7: Testing & Documentation
- 1 plan created
- 2 tasks defined (README.md, PROJECT.md updates)
- Ready for execution

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-10 10:41:04 -06:00
Trey t
baa3dfef0b docs(06-01): complete validation reports plan
Add SUMMARY.md documenting validation capabilities:
- --validate flag with local/CloudKit/sync validation
- --list-orphans flag with completeness metrics and health score
- Menu options 16-17 for interactive mode

Update STATE.md: Phase 6 complete (14/14 plans, 100%)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-10 10:35:49 -06:00
Trey t
9d2dbf61dd feat(06-01): add orphan listing and completeness metrics
Add --list-orphans flag with orphan detection by record type,
data completeness metrics (coordinates, capacity, team/stadium refs),
health score calculation (0-100), and actionable recommendations.
Includes JSON export and menu option 17.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-10 10:34:39 -06:00
Trey t
9f0edc4228 feat(06-01): add comprehensive validation command
Add --validate flag with local validation, CloudKit relationship
checking, and sync status comparison. Includes JSON export via
--output flag and menu option 16 for interactive mode.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-10 10:31:52 -06:00
Trey t
4266940c8f docs(06-01): create validation reports phase plan
Phase 6: Validation Reports
- 1 plan created
- 2 tasks defined
- Ready for execution
2026-01-10 10:24:59 -06:00
Trey t
ad7a396704 docs(05-02): complete Phase 5 CloudKit CRUD
- Add 05-02-SUMMARY.md
- Update STATE.md: Phase 5 complete, ready for Phase 6
- Update ROADMAP.md: Mark Phase 5 and plan 05-02 complete

Phase 5 delivers full CRUD operations:
- Create: forceReplace import
- Read: --get, --list, --verify, query_all()
- Update: --update-record, --smart-sync
- Delete: --delete-record, --delete-orphans, --delete-all

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-10 10:18:48 -06:00
Trey t
5a08659837 feat(05-02): add individual record management commands
Add commands for managing individual CloudKit records:
- --get TYPE ID: Retrieve and display single record
- --list TYPE [--count]: List all recordNames for a type
- --update-record TYPE ID FIELD=VALUE: Update fields with conflict handling
- --delete-record TYPE ID [--force]: Delete with confirmation

Features:
- Type validation against VALID_RECORD_TYPES
- Triple lookup fallback: direct -> deterministic UUID -> canonicalId query
- Automatic type parsing for numeric field values
- Conflict detection with automatic forceReplace retry
- Deletion confirmation (skip with --force)
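
One way the FIELD=VALUE argument and the automatic numeric parsing could work is sketched below. The helper name `parse_field_assignment` is hypothetical and the real script may parse its arguments differently.

```python
from typing import Tuple, Union

FieldValue = Union[int, float, str]


def parse_field_assignment(arg: str) -> Tuple[str, FieldValue]:
    """Split 'FIELD=VALUE' and convert VALUE to int or float when possible."""
    name, separator, raw_value = arg.partition("=")
    if not separator or not name:
        raise ValueError(f"expected FIELD=VALUE, got {arg!r}")
    for cast in (int, float):
        try:
            return name, cast(raw_value)
        except ValueError:
            continue  # not this numeric type, try the next one
    return name, raw_value  # fall back to the raw string
```

Trying `int` before `float` keeps whole numbers such as a stadium capacity as integers, and splitting on the first `=` only lets values contain further `=` characters.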

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-10 10:17:40 -06:00
Trey t
5763db4a61 feat(05-02): add sync verification with --verify flag
- Add --verify flag for quick verification (counts + 5-record spot-check)
- Add --verify-deep flag for full field-by-field comparison
- Add verify_sync() function to compare CloudKit vs local data
- Add lookup() method to CloudKit class for record lookups
- Add menu options 14-15 for verify sync quick/deep
2026-01-10 10:13:08 -06:00
Trey t
b42a57fba2 docs(05-01): complete smart sync with change detection plan
Tasks completed: 2/2
- Add change detection with diff reporting
- Add differential sync with smart-sync flag

SUMMARY: .planning/phases/05-cloudkit-crud/05-01-SUMMARY.md
2026-01-10 10:09:43 -06:00
Trey t
d9a6aa4fe4 feat(05-01): add differential sync with smart-sync flag
- sync_diff() for differential uploads
- update operation with recordChangeTag conflict handling
- --smart-sync and --delete-orphans flags
- Menu options 12-13 for smart sync

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-10 10:08:04 -06:00
Trey t
0c74495ee5 feat(05-01): add change detection with diff reporting
- query_all() method with pagination
- compute_diff() returns new/updated/unchanged/deleted
- --diff flag shows report without importing
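
The four categories that compute_diff() reports can be illustrated by comparing two dictionaries keyed by record name. This is a sketch under that assumption; `diff_records` is a hypothetical name and the project's actual comparison logic is not shown in the log.

```python
from typing import Dict, List


def diff_records(local: Dict[str, dict],
                 remote: Dict[str, dict]) -> Dict[str, List[str]]:
    """Classify record names as new, updated, unchanged, or deleted."""
    diff: Dict[str, List[str]] = {
        "new": [], "updated": [], "unchanged": [], "deleted": [],
    }
    for name, fields in local.items():
        if name not in remote:
            diff["new"].append(name)        # only exists locally
        elif remote[name] != fields:
            diff["updated"].append(name)    # exists on both sides, differs
        else:
            diff["unchanged"].append(name)
    # Records present remotely but missing locally.
    diff["deleted"] = [name for name in remote if name not in local]
    return {category: sorted(names) for category, names in diff.items()}
```

A differential sync would then upload only the "new" and "updated" names and leave "unchanged" records alone.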

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-10 10:05:29 -06:00
Trey t
e5c6d0fec7 docs(05): create CloudKit CRUD phase plans
Phase 5: CloudKit CRUD
- 2 plans created
- 4 total tasks defined
- Ready for execution

Plan 05-01: Smart sync with change detection
- Change detection with diff reporting
- Differential sync (upload only changed records)

Plan 05-02: Verification and record management
- Sync verification (CloudKit vs local comparison)
- Individual record CRUD operations

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-10 10:02:06 -06:00
Trey t
1675e22b26 docs(04-01): complete canonical linking phase
Create 04-01-SUMMARY.md documenting:
- 5760 games canonicalized with 100% resolution rate
- 3 team aliases added (WSH, NY, ATX)
- All validation checks passed

Update STATE.md:
- Phase 4 complete (11/19 plans done, 58%)
- Add 04-01 decision on iterative alias discovery

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-10 09:59:09 -06:00
Trey t
b6286119d7 feat(04-01): run game canonicalization pipeline
Generate canonical games with team/stadium links for 5760 games across
NBA, MLB, NHL, NFL, and MLS.

Added missing team aliases:
- NFL WSH -> team_nfl_was (Washington Commanders)
- MLS NY -> team_mls_nyrb (NY Red Bulls)
- MLS ATX -> team_mls_aus (Austin FC)
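
The three aliases above suggest a lookup keyed by sport and abbreviation. A minimal sketch: the mapping entries are taken from this commit message, while the function name and the `team_<sport>_<abbrev>` fallback pattern are assumptions inferred from the ID format.

```python
# Entries copied from the commit message; the table shape is an assumption.
GAME_TEAM_ALIASES = {
    ("NFL", "WSH"): "team_nfl_was",
    ("MLS", "NY"): "team_mls_nyrb",
    ("MLS", "ATX"): "team_mls_aus",
}


def canonical_team_id(sport: str, abbrev: str) -> str:
    """Resolve a scraped abbreviation to a canonical team ID."""
    key = (sport.upper(), abbrev.upper())
    if key in GAME_TEAM_ALIASES:
        return GAME_TEAM_ALIASES[key]
    # Hypothetical fallback: assume the abbreviation is already canonical.
    return f"team_{sport.lower()}_{abbrev.lower()}"
```

Scoping the key by sport matters because an abbreviation such as NY is ambiguous across leagues.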

Remaining 8 warnings are expected NFL playoff placeholders (TBD/AFC/NFC).

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-10 09:55:53 -06:00
Trey t
dbfaca206d docs(04-01): create canonical linking plan
Phase 4: Canonical Linking
- 1 plan created
- 3 tasks defined (game canonicalization, validation, fix issues)
- Ready for execution
2026-01-10 09:52:58 -06:00
Trey t
80bfb5919b docs(03-02): complete secondary sports canonicalization plan
Tasks completed: 3/3
- Add MLS to canonicalization pipeline (30 teams + 10 aliases + 8 stadium aliases)
- Add WNBA to canonicalization pipeline (13 teams + 6 aliases + 4 stadium aliases)
- Add NWSL to canonicalization pipeline (13 teams + 7 aliases + 3 stadium aliases)

Phase 3 complete - all 7 sports now have alias support (180 teams total)

SUMMARY: .planning/phases/03-alias-systems/03-02-SUMMARY.md
2026-01-10 09:45:09 -06:00
Trey t
81f620defe feat(03-02): add NWSL to canonicalization pipeline
- Import NWSL_TEAMS from nwsl module
- Add NWSL_DIVISIONS dict (single league structure, no divisions)
- Add NWSL to sport_mappings for team canonicalization
- Add NWSL team abbreviation aliases (ANG, GOTHAM, NCC, BAY, etc.)
- Add NWSL stadium aliases (CPKC Stadium, SeatGeek Stadium, WakeMed, etc.)

Total teams: 180 (13 NWSL teams added)
Final breakdown: NBA(30) + MLB(30) + NHL(32) + NFL(32) + MLS(30) + WNBA(13) + NWSL(13)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-10 09:43:18 -06:00
Trey t
285bc075d7 feat(03-02): add WNBA to canonicalization pipeline
- Import WNBA_TEAMS from wnba module
- Add WNBA_DIVISIONS dict (single league structure, no divisions)
- Add WNBA to sport_mappings for team canonicalization
- Update arena_key to use 'arena' for WNBA (like NBA/NHL)
- Add WNBA team abbreviation aliases (LV, LAS, NYL, PHX, etc.)
- Add WNBA stadium aliases (Michelob Ultra Arena, Gateway Center, etc.)

Total teams: 167 (13 WNBA teams added)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-10 09:41:59 -06:00
Trey t
b6a913df1d feat(03-02): add MLS to canonicalization pipeline
- Import MLS_TEAMS from mls module
- Add MLS_DIVISIONS dict (Eastern/Western conferences)
- Add MLS to sport_mappings for team canonicalization
- Add MLS team abbreviation aliases (LA, NYC, RBNY, etc.)
- Add MLS stadium historical aliases (BMO, PayPal Park, Shell Energy, etc.)

Total teams: 154 (30 MLS teams added)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-10 09:40:39 -06:00
Trey t
3e9bd24214 docs(03-01): complete NFL canonicalization plan
Tasks completed: 3/3
- Add NFL to canonicalize_teams.py (32 teams with division structure)
- Add NFL team abbreviation aliases to canonicalize_games.py (11 aliases)
- Add NFL stadium historical aliases to canonicalize_stadiums.py (14 stadiums)

SUMMARY: .planning/phases/03-alias-systems/03-01-SUMMARY.md
2026-01-10 09:38:03 -06:00
Trey t
90c9cef0bd feat(03-01): add NFL stadium historical aliases
Add NFL entries to HISTORICAL_STADIUM_ALIASES dict:
- Caesars Superdome (Mercedes-Benz, Louisiana Superdome)
- Paycor Stadium (Paul Brown Stadium)
- Empower Field at Mile High (Broncos Stadium, Sports Authority, Invesco, Mile High)
- Acrisure Stadium (Heinz Field)
- EverBank Stadium (TIAA Bank, Alltel, Jacksonville Municipal)
- Northwest Stadium (FedExField, Jack Kent Cooke)
- Hard Rock Stadium (Sun Life, Land Shark, Dolphin, Pro Player, Joe Robbie)
- Highmark Stadium (Bills Stadium, New Era, Ralph Wilson, Rich Stadium)
- GEHA Field at Arrowhead Stadium (Arrowhead Stadium)
- AT&T Stadium (Cowboys Stadium)
- Lumen Field (CenturyLink, Qwest, Seahawks Stadium)
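
A historical alias table like this one is typically inverted into a lookup from any old name to the current name. The sketch below uses four entries from the list above; the dictionary shape and the `canonical_stadium_name` helper are assumptions, not the contents of canonicalize_stadiums.py.

```python
# Current name -> former names, copied from the commit message.
CURRENT_TO_HISTORICAL = {
    "Paycor Stadium": ["Paul Brown Stadium"],
    "Acrisure Stadium": ["Heinz Field"],
    "AT&T Stadium": ["Cowboys Stadium"],
    "Northwest Stadium": ["FedExField", "Jack Kent Cooke"],
}

# Inverted, case-insensitive index: any known name -> current name.
_NAME_INDEX = {
    old_name.lower(): current
    for current, old_names in CURRENT_TO_HISTORICAL.items()
    for old_name in [current, *old_names]
}


def canonical_stadium_name(name: str) -> str:
    """Return the current stadium name, or the input if it is unknown."""
    return _NAME_INDEX.get(name.strip().lower(), name)
```

Building the index once at import time keeps each lookup a single dictionary access, and unknown names pass through unchanged so they can be reported rather than silently dropped.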

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-10 09:36:11 -06:00
Trey t
41496b6bea feat(03-01): add NFL team abbreviation aliases
Add NFL entries to TEAM_ABBREV_ALIASES dict:
- Historical relocations: OAK→LV, SD→LAC, STL→LAR
- Common 3-letter variations: JAC, GNB, KAN, NWE, NOR, TAM, SFO
- Direct match for WAS included for completeness

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-10 09:35:21 -06:00
Trey t
d4d0d95c54 feat(03-01): add NFL to team canonicalization
Add NFL support to canonicalize_teams.py:
- Import NFL_TEAMS from scrape_schedules
- Add NFL_DIVISIONS dict with all 32 teams mapped to conference/division
- Include NFL in sport_mappings for canonicalization
- Add NFL_DIVISIONS to division_map lookup

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-10 09:34:49 -06:00
Trey t
6ad7de6484 docs(03): update project state for Phase 3
- Current focus: Phase 3 - Alias Systems
- Phase planned, ready for execution
- Next action: Execute 03-01-PLAN.md

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-10 09:32:14 -06:00
Trey t
163d57bc3b docs(03): create phase plans for Alias Systems
Phase 03: Alias Systems
- 2 plans created
- 6 total tasks defined
- Ready for execution

Plan 1: Add NFL to canonicalization pipeline with aliases
Plan 2: Add MLS, WNBA, NWSL to canonicalization pipeline

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-10 09:31:51 -06:00
Trey t
33dc15f729 docs(02.1-03): complete Phase 2.1 Additional Sports Stadiums
- STATE.md: Phase 2.1 complete (3/3 plans)
- ROADMAP.md: Phase 2.1 marked complete

Phase 2.1 delivered:
- MLS: 30 stadiums
- WNBA: 13 arenas
- NWSL: 13 stadiums
- Total: 56 new venues added
2026-01-10 01:08:25 -06:00
Trey t
90b2210e44 docs(02.1-03): complete NWSL sport module plan
Tasks completed: 2/2
- Create NWSL sport module with hardcoded stadiums
- Integrate NWSL module with scrape_schedules.py

Phase 2.1 complete: MLS, WNBA, NWSL modules created

SUMMARY: .planning/phases/2.1-add-stadium-data-mls-wnba-nwsl-cbb/02.1-03-SUMMARY.md

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-10 01:07:21 -06:00
Trey t
5307fdf6a4 feat(02.1-03): integrate NWSL module with scrape_schedules.py
Update scrape_schedules.py to import NWSL stadium functionality from nwsl.py:
- Add import for NWSL_TEAMS, get_nwsl_team_abbrev, scrape_nwsl_stadiums
- Remove inline NWSL_TEAMS dict (now imported from nwsl.py)
- Remove stub scrape_nwsl_stadiums function (now using module implementation)
- Update docstrings and comments to reflect module structure

Stadium scraping now uses modules for all secondary sports:
- MLS: 30 stadiums from mls.py
- WNBA: 13 arenas from wnba.py
- NWSL: 13 stadiums from nwsl.py

Only CBB remains inline (350+ D1 teams require a separately scoped phase).

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-10 01:06:14 -06:00
Trey t
75e2498382 feat(02.1-03): create NWSL sport module with hardcoded stadiums
Create nwsl.py following the established sport module pattern:
- 13 NWSL teams matching current 2025 season roster
- All 13 stadiums with complete data (capacity, year_opened, coordinates)
- Cross-referenced MLS coordinates for shared stadiums (10 shared with MLS)
- 3 NWSL-specific stadiums: SeatGeek Stadium, CPKC Stadium, WakeMed Soccer Park

Module exports:
- NWSL_TEAMS dict
- get_nwsl_team_abbrev() function
- scrape_nwsl_stadiums_hardcoded() function
- scrape_nwsl_stadiums() function with fallback system
- NWSL_STADIUM_SOURCES configuration

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-10 01:04:15 -06:00
Trey t
67792279f1 docs(02.1-02): update project state and roadmap
- STATE.md: Position 2/3 in phase 2.1, metrics updated
- ROADMAP.md: 02.1-02 marked complete
2026-01-10 01:01:58 -06:00
Trey t
b529fd592a docs(02.1-02): complete WNBA sport module plan
Tasks completed: 2/2
- Create WNBA sport module with 13 hardcoded arenas
- Integrate WNBA module with scrape_schedules.py

SUMMARY: .planning/phases/2.1-add-stadium-data-mls-wnba-nwsl-cbb/02.1-02-SUMMARY.md
2026-01-10 01:00:50 -06:00
Trey t
f141136bb4 feat(02.1-02): integrate WNBA module with scrape_schedules.py
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-10 00:59:53 -06:00
Trey t
5a51dab59f feat(02.1-02): create WNBA sport module with 13 hardcoded arenas
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-10 00:59:02 -06:00
Trey t
4cce2e0d48 docs(02.1-01): update project state and roadmap
- STATE.md: Position 1/3 in phase 2.1, metrics updated
- ROADMAP.md: 02.1-01 marked complete
2026-01-10 00:56:22 -06:00
Trey t
e2d629b76f docs(02.1-01): complete MLS sport module plan
Tasks completed: 2/2
- Create MLS sport module with 30 hardcoded stadiums
- Integrate MLS module with scrape_schedules.py

SUMMARY: .planning/phases/2.1-add-stadium-data-mls-wnba-nwsl-cbb/02.1-01-SUMMARY.md

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-10 00:55:01 -06:00
Trey t
8f1803b10d feat(02.1-01): integrate MLS module with scrape_schedules.py
- Import MLS_TEAMS, get_mls_team_abbrev, scrape_mls_stadiums from mls.py
- Remove inline MLS_TEAMS dict (now imported from module)
- Remove inline MLS stadium scraper functions (now in mls.py)
- Update TODO comments to reflect MLS extraction complete

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-10 00:52:17 -06:00