74 Commits

Author SHA1 Message Date
Trey t
9f0edc4228 feat(06-01): add comprehensive validation command
Add --validate flag with local validation, CloudKit relationship
checking, and sync status comparison. Includes JSON export via
--output flag and menu option 16 for interactive mode.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-10 10:31:52 -06:00
Trey t
4266940c8f docs(06-01): create validation reports phase plan
Phase 6: Validation Reports
- 1 plan created
- 2 tasks defined
- Ready for execution
2026-01-10 10:24:59 -06:00
Trey t
ad7a396704 docs(05-02): complete Phase 5 CloudKit CRUD
- Add 05-02-SUMMARY.md
- Update STATE.md: Phase 5 complete, ready for Phase 6
- Update ROADMAP.md: Mark Phase 5 and plan 05-02 complete

Phase 5 delivers full CRUD operations:
- Create: forceReplace import
- Read: --get, --list, --verify, query_all()
- Update: --update-record, --smart-sync
- Delete: --delete-record, --delete-orphans, --delete-all

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-10 10:18:48 -06:00
Trey t
5a08659837 feat(05-02): add individual record management commands
Add commands for managing individual CloudKit records:
- --get TYPE ID: Retrieve and display single record
- --list TYPE [--count]: List all recordNames for a type
- --update-record TYPE ID FIELD=VALUE: Update fields with conflict handling
- --delete-record TYPE ID [--force]: Delete with confirmation

Features:
- Type validation against VALID_RECORD_TYPES
- Triple lookup fallback: direct -> deterministic UUID -> canonicalId query
- Automatic type parsing for numeric field values
- Conflict detection with automatic forceReplace retry
- Deletion confirmation (skip with --force)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-10 10:17:40 -06:00
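The triple lookup fallback named in this commit (direct -> deterministic UUID -> canonicalId query) could look roughly like the sketch below. The helper names, the UUID namespace, and the callable signatures are assumptions made for illustration; the log does not show the project's actual scheme.

```python
import uuid
from typing import Callable, Optional

# Hypothetical namespace: the project's real UUID derivation is not shown in the log.
NAMESPACE = uuid.NAMESPACE_DNS


def deterministic_uuid(record_type: str, record_id: str) -> str:
    """Derive a stable UUID from the type and ID (assumed scheme, uuid5-based)."""
    return str(uuid.uuid5(NAMESPACE, f"{record_type}:{record_id}")).upper()


def find_record(
    record_type: str,
    record_id: str,
    lookup: Callable[[str, str], Optional[dict]],
    query_by_canonical_id: Callable[[str, str], Optional[dict]],
) -> Optional[dict]:
    """Try three strategies in order and return the first record found."""
    # 1. Direct lookup: treat the ID as the recordName itself.
    record = lookup(record_type, record_id)
    if record is not None:
        return record
    # 2. Deterministic UUID: the recordName may have been derived from the ID.
    record = lookup(record_type, deterministic_uuid(record_type, record_id))
    if record is not None:
        return record
    # 3. Last resort: query on the canonicalId field.
    return query_by_canonical_id(record_type, record_id)
```

Passing the two lookups in as callables keeps the fallback order testable without a live CloudKit connection.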
Trey t
5763db4a61 feat(05-02): add sync verification with --verify flag
- Add --verify flag for quick verification (counts + 5-record spot-check)
- Add --verify-deep flag for full field-by-field comparison
- Add verify_sync() function to compare CloudKit vs local data
- Add lookup() method to CloudKit class for record lookups
- Add menu options 14-15 for verify sync quick/deep
2026-01-10 10:13:08 -06:00
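A quick verification of the kind described here (record counts plus a 5-record spot-check) might be sketched as follows. The function name `quick_verify`, the dict-of-dicts input shape, and the report keys are assumptions; the real `verify_sync()` is not shown in the log.

```python
import random
from typing import Dict, Optional


def quick_verify(
    local: Dict[str, dict],
    remote: Dict[str, dict],
    sample_size: int = 5,
    seed: Optional[int] = None,
) -> dict:
    """Compare record counts, then spot-check a random sample of shared records.

    Both inputs map recordName -> field dict (assumed shape).
    """
    report = {
        "local_count": len(local),
        "remote_count": len(remote),
        "count_match": len(local) == len(remote),
        "mismatched": [],
    }
    # Only records present on both sides can be compared field by field.
    shared = sorted(set(local) & set(remote))
    rng = random.Random(seed)
    for name in rng.sample(shared, min(sample_size, len(shared))):
        if local[name] != remote[name]:
            report["mismatched"].append(name)
    return report
```

A deep verification would drop the sampling step and compare every shared record.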
Trey t
b42a57fba2 docs(05-01): complete smart sync with change detection plan
Tasks completed: 2/2
- Add change detection with diff reporting
- Add differential sync with smart-sync flag

SUMMARY: .planning/phases/05-cloudkit-crud/05-01-SUMMARY.md
2026-01-10 10:09:43 -06:00
Trey t
d9a6aa4fe4 feat(05-01): add differential sync with smart-sync flag
- sync_diff() for differential uploads
- update operation with recordChangeTag conflict handling
- --smart-sync and --delete-orphans flags
- Menu options 12-13 for smart sync

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-10 10:08:04 -06:00
Trey t
0c74495ee5 feat(05-01): add change detection with diff reporting
- query_all() method with pagination
- compute_diff() returns new/updated/unchanged/deleted
- --diff flag shows report without importing

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-10 10:05:29 -06:00
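The new/updated/unchanged/deleted classification can be illustrated with a minimal sketch. It assumes both sides are available as dicts keyed by recordName; the function name `diff_records` is hypothetical and is not the project's `compute_diff()`.

```python
from typing import Dict, List


def diff_records(local: Dict[str, dict], remote: Dict[str, dict]) -> Dict[str, List[str]]:
    """Classify record names by comparing local data against the remote copy."""
    diff: Dict[str, List[str]] = {"new": [], "updated": [], "unchanged": [], "deleted": []}
    for name, fields in local.items():
        if name not in remote:
            diff["new"].append(name)        # exists locally only
        elif remote[name] != fields:
            diff["updated"].append(name)    # exists on both sides, fields differ
        else:
            diff["unchanged"].append(name)  # identical on both sides
    # Remote records with no local counterpart are orphans.
    diff["deleted"] = [name for name in remote if name not in local]
    return diff
```

A differential sync would then upload only the `new` and `updated` names and leave `deleted` to an explicit orphan-removal step.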
Trey t
e5c6d0fec7 docs(05): create CloudKit CRUD phase plans
Phase 5: CloudKit CRUD
- 2 plans created
- 4 total tasks defined
- Ready for execution

Plan 05-01: Smart sync with change detection
- Change detection with diff reporting
- Differential sync (upload only changed records)

Plan 05-02: Verification and record management
- Sync verification (CloudKit vs local comparison)
- Individual record CRUD operations

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-10 10:02:06 -06:00
Trey t
1675e22b26 docs(04-01): complete canonical linking phase
Create 04-01-SUMMARY.md documenting:
- 5760 games canonicalized with 100% resolution rate
- 3 team aliases added (WSH, NY, ATX)
- All validation checks passed

Update STATE.md:
- Phase 4 complete (11/19 plans done, 58%)
- Add 04-01 decision on iterative alias discovery

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-10 09:59:09 -06:00
Trey t
b6286119d7 feat(04-01): run game canonicalization pipeline
Generate canonical games with team/stadium links for 5760 games across
NBA, MLB, NHL, NFL, and MLS.

Added missing team aliases:
- NFL WSH -> team_nfl_was (Washington Commanders)
- MLS NY -> team_mls_nyrb (NY Red Bulls)
- MLS ATX -> team_mls_aus (Austin FC)

Remaining 8 warnings are expected NFL playoff placeholders (TBD/AFC/NFC).

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-10 09:55:53 -06:00
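How alias resolution and the expected placeholder warnings interact can be sketched as below. The three alias targets come from the commit message; the `team_{sport}_{abbrev}` ID pattern is inferred from those targets, and the function names and game dict shape are assumptions.

```python
from typing import Dict, List, Optional, Set, Tuple

# Alias targets taken from the commit message above.
TEAM_ALIASES: Dict[Tuple[str, str], str] = {
    ("NFL", "WSH"): "team_nfl_was",
    ("MLS", "NY"): "team_mls_nyrb",
    ("MLS", "ATX"): "team_mls_aus",
}


def resolve_team(sport: str, abbrev: str, known_ids: Set[str]) -> Optional[str]:
    """Return a canonical team ID, or None when the abbreviation is unresolved."""
    key = (sport.upper(), abbrev.upper())
    if key in TEAM_ALIASES:
        return TEAM_ALIASES[key]
    # Assumed ID pattern, inferred from the alias targets.
    candidate = f"team_{sport.lower()}_{abbrev.lower()}"
    return candidate if candidate in known_ids else None


def canonicalize(games: List[dict], known_ids: Set[str]) -> Tuple[List[dict], List[dict]]:
    """Split games into resolved games and warnings (for example TBD placeholders)."""
    resolved, warnings = [], []
    for game in games:
        home = resolve_team(game["sport"], game["home"], known_ids)
        away = resolve_team(game["sport"], game["away"], known_ids)
        if home is None or away is None:
            warnings.append(game)
        else:
            resolved.append({**game, "home_id": home, "away_id": away})
    return resolved, warnings
```

Under this scheme a playoff game whose teams are still `TBD` lands in the warnings list rather than failing the run.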
Trey t
dbfaca206d docs(04-01): create canonical linking plan
Phase 4: Canonical Linking
- 1 plan created
- 3 tasks defined (game canonicalization, validation, fix issues)
- Ready for execution
2026-01-10 09:52:58 -06:00
Trey t
80bfb5919b docs(03-02): complete secondary sports canonicalization plan
Tasks completed: 3/3
- Add MLS to canonicalization pipeline (30 teams + 10 aliases + 8 stadium aliases)
- Add WNBA to canonicalization pipeline (13 teams + 6 aliases + 4 stadium aliases)
- Add NWSL to canonicalization pipeline (13 teams + 7 aliases + 3 stadium aliases)

Phase 3 complete - all 7 sports now have alias support (180 teams total)

SUMMARY: .planning/phases/03-alias-systems/03-02-SUMMARY.md
2026-01-10 09:45:09 -06:00
Trey t
81f620defe feat(03-02): add NWSL to canonicalization pipeline
- Import NWSL_TEAMS from nwsl module
- Add NWSL_DIVISIONS dict (single league structure, no divisions)
- Add NWSL to sport_mappings for team canonicalization
- Add NWSL team abbreviation aliases (ANG, GOTHAM, NCC, BAY, etc.)
- Add NWSL stadium aliases (CPKC Stadium, SeatGeek Stadium, WakeMed, etc.)

Total teams: 180 (13 NWSL teams added)
Final breakdown: NBA(30) + MLB(30) + NHL(32) + NFL(32) + MLS(30) + WNBA(13) + NWSL(13)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-10 09:43:18 -06:00
Trey t
285bc075d7 feat(03-02): add WNBA to canonicalization pipeline
- Import WNBA_TEAMS from wnba module
- Add WNBA_DIVISIONS dict (single league structure, no divisions)
- Add WNBA to sport_mappings for team canonicalization
- Update arena_key to use 'arena' for WNBA (like NBA/NHL)
- Add WNBA team abbreviation aliases (LV, LAS, NYL, PHX, etc.)
- Add WNBA stadium aliases (Michelob Ultra Arena, Gateway Center, etc.)

Total teams: 167 (13 WNBA teams added)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-10 09:41:59 -06:00
Trey t
b6a913df1d feat(03-02): add MLS to canonicalization pipeline
- Import MLS_TEAMS from mls module
- Add MLS_DIVISIONS dict (Eastern/Western conferences)
- Add MLS to sport_mappings for team canonicalization
- Add MLS team abbreviation aliases (LA, NYC, RBNY, etc.)
- Add MLS stadium historical aliases (BMO, PayPal Park, Shell Energy, etc.)

Total teams: 154 (30 MLS teams added)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-10 09:40:39 -06:00
Trey t
3e9bd24214 docs(03-01): complete NFL canonicalization plan
Tasks completed: 3/3
- Add NFL to canonicalize_teams.py (32 teams with division structure)
- Add NFL team abbreviation aliases to canonicalize_games.py (11 aliases)
- Add NFL stadium historical aliases to canonicalize_stadiums.py (14 stadiums)

SUMMARY: .planning/phases/03-alias-systems/03-01-SUMMARY.md
2026-01-10 09:38:03 -06:00
Trey t
90c9cef0bd feat(03-01): add NFL stadium historical aliases
Add NFL entries to HISTORICAL_STADIUM_ALIASES dict:
- Caesars Superdome (Mercedes-Benz, Louisiana Superdome)
- Paycor Stadium (Paul Brown Stadium)
- Empower Field at Mile High (Broncos Stadium, Sports Authority, Invesco, Mile High)
- Acrisure Stadium (Heinz Field)
- EverBank Stadium (TIAA Bank, Alltel, Jacksonville Municipal)
- Northwest Stadium (FedExField, Jack Kent Cooke)
- Hard Rock Stadium (Sun Life, Land Shark, Dolphin, Pro Player, Joe Robbie)
- Highmark Stadium (Bills Stadium, New Era, Ralph Wilson, Rich Stadium)
- GEHA Field at Arrowhead Stadium (Arrowhead Stadium)
- AT&T Stadium (Cowboys Stadium)
- Lumen Field (CenturyLink, Qwest, Seahawks Stadium)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-10 09:36:11 -06:00
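A historical-alias table like this is typically inverted into a lookup index so that any old name resolves to the current one. The sketch below uses a few entries from the list above; the dict shape (current name -> list of former names) and the helper names are assumptions about `HISTORICAL_STADIUM_ALIASES`, not its confirmed structure.

```python
from typing import Dict, List

# Subset of the entries listed in the commit message; shape is assumed.
HISTORICAL_STADIUM_ALIASES: Dict[str, List[str]] = {
    "Paycor Stadium": ["Paul Brown Stadium"],
    "Acrisure Stadium": ["Heinz Field"],
    "AT&T Stadium": ["Cowboys Stadium"],
    "Lumen Field": ["CenturyLink", "Qwest", "Seahawks Stadium"],
}


def build_alias_index(aliases: Dict[str, List[str]]) -> Dict[str, str]:
    """Map every known name (current or historical, case-folded) to the current name."""
    index: Dict[str, str] = {}
    for current, old_names in aliases.items():
        index[current.casefold()] = current
        for old in old_names:
            index[old.casefold()] = current
    return index


def canonical_stadium_name(name: str, index: Dict[str, str]) -> str:
    """Return the current name for a stadium, or the input unchanged if unknown."""
    cleaned = name.strip()
    return index.get(cleaned.casefold(), cleaned)
```

Case-folding the keys makes the lookup tolerant of capitalization differences between data sources.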
Trey t
41496b6bea feat(03-01): add NFL team abbreviation aliases
Add NFL entries to TEAM_ABBREV_ALIASES dict:
- Historical relocations: OAK→LV, SD→LAC, STL→LAR
- Common 3-letter variations: JAC, GNB, KAN, NWE, NOR, TAM, SFO
- Direct match for WAS included for completeness

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-10 09:35:21 -06:00
Trey t
d4d0d95c54 feat(03-01): add NFL to team canonicalization
Add NFL support to canonicalize_teams.py:
- Import NFL_TEAMS from scrape_schedules
- Add NFL_DIVISIONS dict with all 32 teams mapped to conference/division
- Include NFL in sport_mappings for canonicalization
- Add NFL_DIVISIONS to division_map lookup

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-10 09:34:49 -06:00
Trey t
6ad7de6484 docs(03): update project state for Phase 3
- Current focus: Phase 3 - Alias Systems
- Phase planned, ready for execution
- Next action: Execute 03-01-PLAN.md

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-10 09:32:14 -06:00
Trey t
163d57bc3b docs(03): create phase plans for Alias Systems
Phase 03: Alias Systems
- 2 plans created
- 6 total tasks defined
- Ready for execution

Plan 1: Add NFL to canonicalization pipeline with aliases
Plan 2: Add MLS, WNBA, NWSL to canonicalization pipeline

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-10 09:31:51 -06:00
Trey t
33dc15f729 docs(02.1-03): complete Phase 2.1 Additional Sports Stadiums
- STATE.md: Phase 2.1 complete (3/3 plans)
- ROADMAP.md: Phase 2.1 marked complete

Phase 2.1 delivered:
- MLS: 30 stadiums
- WNBA: 13 arenas
- NWSL: 13 stadiums
- Total: 56 new venues added
2026-01-10 01:08:25 -06:00
Trey t
90b2210e44 docs(02.1-03): complete NWSL sport module plan
Tasks completed: 2/2
- Create NWSL sport module with hardcoded stadiums
- Integrate NWSL module with scrape_schedules.py

Phase 2.1 complete: MLS, WNBA, NWSL modules created

SUMMARY: .planning/phases/2.1-add-stadium-data-mls-wnba-nwsl-cbb/02.1-03-SUMMARY.md

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-10 01:07:21 -06:00
Trey t
5307fdf6a4 feat(02.1-03): integrate NWSL module with scrape_schedules.py
Update scrape_schedules.py to import NWSL stadium functionality from nwsl.py:
- Add import for NWSL_TEAMS, get_nwsl_team_abbrev, scrape_nwsl_stadiums
- Remove inline NWSL_TEAMS dict (now imported from nwsl.py)
- Remove stub scrape_nwsl_stadiums function (now using module implementation)
- Update docstrings and comments to reflect module structure

Stadium scraping now uses modules for all secondary sports:
- MLS: 30 stadiums from mls.py
- WNBA: 13 arenas from wnba.py
- NWSL: 13 stadiums from nwsl.py

Only CBB remains inline (350+ D1 teams require a separately scoped phase).

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-10 01:06:14 -06:00
Trey t
75e2498382 feat(02.1-03): create NWSL sport module with hardcoded stadiums
Create nwsl.py following the established sport module pattern:
- 13 NWSL teams matching current 2025 season roster
- All 13 stadiums with complete data (capacity, year_opened, coordinates)
- Cross-referenced MLS coordinates for shared stadiums (10 shared with MLS)
- 3 NWSL-specific stadiums: SeatGeek Stadium, CPKC Stadium, WakeMed Soccer Park

Module exports:
- NWSL_TEAMS dict
- get_nwsl_team_abbrev() function
- scrape_nwsl_stadiums_hardcoded() function
- scrape_nwsl_stadiums() function with fallback system
- NWSL_STADIUM_SOURCES configuration

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-10 01:04:15 -06:00
Trey t
67792279f1 docs(02.1-02): update project state and roadmap
- STATE.md: Position 2/3 in phase 2.1, metrics updated
- ROADMAP.md: 02.1-02 marked complete
2026-01-10 01:01:58 -06:00
Trey t
b529fd592a docs(02.1-02): complete WNBA sport module plan
Tasks completed: 2/2
- Create WNBA sport module with 13 hardcoded arenas
- Integrate WNBA module with scrape_schedules.py

SUMMARY: .planning/phases/2.1-add-stadium-data-mls-wnba-nwsl-cbb/02.1-02-SUMMARY.md
2026-01-10 01:00:50 -06:00
Trey t
f141136bb4 feat(02.1-02): integrate WNBA module with scrape_schedules.py
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-10 00:59:53 -06:00
Trey t
5a51dab59f feat(02.1-02): create WNBA sport module with 13 hardcoded arenas
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-10 00:59:02 -06:00
Trey t
4cce2e0d48 docs(02.1-01): update project state and roadmap
- STATE.md: Position 1/3 in phase 2.1, metrics updated
- ROADMAP.md: 02.1-01 marked complete
2026-01-10 00:56:22 -06:00
Trey t
e2d629b76f docs(02.1-01): complete MLS sport module plan
Tasks completed: 2/2
- Create MLS sport module with 30 hardcoded stadiums
- Integrate MLS module with scrape_schedules.py

SUMMARY: .planning/phases/2.1-add-stadium-data-mls-wnba-nwsl-cbb/02.1-01-SUMMARY.md

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-10 00:55:01 -06:00
Trey t
8f1803b10d feat(02.1-01): integrate MLS module with scrape_schedules.py
- Import MLS_TEAMS, get_mls_team_abbrev, scrape_mls_stadiums from mls.py
- Remove inline MLS_TEAMS dict (now imported from module)
- Remove inline MLS stadium scraper functions (now in mls.py)
- Update TODO comments to reflect MLS extraction complete

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-10 00:52:17 -06:00
Trey t
addc9b37f7 feat(02.1-01): create MLS sport module with 30 hardcoded stadiums
Add complete MLS stadium data following established sport module pattern:
- 30 MLS stadiums with capacity (soccer configuration) and year_opened
- MLS_TEAMS dict with all 30 teams
- get_mls_team_abbrev() function for team abbreviation lookup
- scrape_mls_stadiums_hardcoded() as primary source
- scrape_mls_stadiums_gavinr() as fallback source
- MLS_STADIUM_SOURCES configuration for fallback system

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-10 00:50:56 -06:00
Trey t
02d154cf46 docs(02.1): create phase plan for additional sports stadiums
Phase 2.1: Additional Sports Stadiums
- 3 plans created (MLS, WNBA, NWSL modules)
- CBB deferred to future phase (350+ D1 teams)
- 6 total tasks defined
- Ready for execution

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-10 00:47:02 -06:00
Trey t
64137b57bf docs(02-02): complete Phase 2 Stadium Foundation
- Add 02-02-SUMMARY.md documenting pipeline regeneration
- Update STATE.md: Phase 2 complete, next is Phase 2.1
- Update ROADMAP.md: Mark Phase 2 as complete (2/2 plans)
- Performance: 5 plans, 37 min total, 7.4 min average

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-10 00:41:07 -06:00
Trey t
1808d2c3d0 feat(02-02): bundle 122 core stadiums (MLB/NBA/NHL/NFL)
- Filter bundled JSON to core 4 sports only (152 → 122 stadiums)
- Exclude MLS stadiums (incomplete data, deferred to Phase 2.1)
- Filter aliases to match (200 → 165 aliases)
- All fields populated: no empty state, zero capacity, or null year

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-10 00:39:09 -06:00
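The filtering and completeness check described in this commit could be sketched as follows. The `capacity` and `year_opened` field names appear elsewhere in the log; the `sport` and `state` keys and both function names are assumptions.

```python
from typing import List

CORE_SPORTS = {"MLB", "NBA", "NHL", "NFL"}


def filter_core(stadiums: List[dict]) -> List[dict]:
    """Keep only stadiums that belong to the four core sports."""
    return [s for s in stadiums if s.get("sport") in CORE_SPORTS]


def incomplete(stadiums: List[dict]) -> List[dict]:
    """Return stadiums with an empty state, zero capacity, or a null year_opened."""
    return [
        s
        for s in stadiums
        if not s.get("state") or not s.get("capacity") or s.get("year_opened") is None
    ]
```

Running `incomplete(filter_core(stadiums))` and expecting an empty list is one way to assert the "all fields populated" claim before bundling.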
Trey t
c2da6a7770 feat(02-02): regenerate stadium data with canonicalization pipeline
- Ran scrape_schedules.py --stadiums-update
- Ran canonicalize_stadiums.py for canonical IDs
- Core sports: MLB:30, NBA:30, NHL:32, NFL:30 (122 total)
- MLS stadiums also included from comprehensive scrape (30)
- Stadium aliases generated for historical name mappings

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-10 00:37:09 -06:00
Trey t
90bdf1608c feat(02-01): add year_opened to all 122 hardcoded stadiums
Added year_opened field to stadium data across all 4 sport modules:
- MLB: 30 ballparks (1912-2023)
- NBA: 30 arenas (1968-2024)
- NHL: 32 arenas (1968-2021)
- NFL: 30 stadiums (1924-2020)

Updated Stadium object creation in all modules to pass year_opened.
Stadium dataclass already supported the field.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-10 00:31:45 -06:00
Trey t
95861cae40 docs(02): create stadium foundation phase plans
Phase 2: Stadium Foundation
- 2 plans created
- 5 total tasks defined
- Ready for execution

Plan 02-01: Audit & complete hardcoded stadium data
Plan 02-02: Regenerate canonical data and verify pipeline

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-10 00:24:41 -06:00
Trey t
3f84890fec docs(01-03): complete nfl.py + orchestrator refactor plan
- Create 01-03-SUMMARY.md documenting NFL module and orchestrator refactor
- Update STATE.md: Phase 1 complete, ready for Phase 2
- Update ROADMAP.md: Mark Phase 1 as complete (3/3 plans)
- Phase 1 total duration: 23 min across 3 plans

Phase 1: Script Architecture complete. All 4 core sports (MLB, NBA, NHL, NFL)
now have dedicated modules with consistent patterns.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-10 00:20:13 -06:00
Trey t
b93205e7fb feat(01-03): refactor scrape_schedules.py to orchestrator
Transform monolithic 3359-line script into thin 733-line orchestrator:

- Import core utilities from core.py (Game, Stadium, fallback system)
- Import MLB/NBA/NHL/NFL scrapers from dedicated sport modules
- Core sports now use module convenience functions (scrape_{sport}_games)
- Non-core sports (WNBA, MLS, NWSL, CBB) remain inline with TODO markers
- CLI unchanged: --sport, --season, --stadiums-only, --stadiums-update
- 78% reduction in orchestrator size (3359 -> 733 lines)

Phase 1: Script Architecture complete - all 4 core sports modularized.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-10 00:18:09 -06:00
Trey t
a6c9230335 feat(01-03): create nfl.py sport module
Extract NFL scrapers from monolithic scrape_schedules.py into dedicated
sport module following established pattern from nba.py/nhl.py:

- NFL_TEAMS: 32 teams with stadiums
- Game scrapers: ESPN API, Pro-Football-Reference, CBS Sports
- Stadium scrapers: ScoreBot, GeoJSON gist, hardcoded fallback
- NFL_GAME_SOURCES and NFL_STADIUM_SOURCES configurations
- get_nfl_season_string() for cross-calendar-year format (2025-26)
- scrape_nfl_games() convenience function with fallback

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-10 00:14:42 -06:00
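The cross-calendar-year season format (2025-26) can be produced with a small helper. This is a sketch only: the name `season_string` and its single start-year parameter are assumptions, since the signature of `get_nfl_season_string()` is not shown in the log.

```python
def season_string(start_year: int) -> str:
    """Format a season that spans two calendar years, e.g. 2025 -> '2025-26'."""
    # Two-digit end year; the modulo handles century rollover (1999 -> '1999-00').
    end_suffix = (start_year + 1) % 100
    return f"{start_year}-{end_suffix:02d}"
```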
Trey t
bf65a59fb4 docs(01-02): complete NBA + NHL modules plan
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-10 00:08:27 -06:00
Trey t
c229fa73fd feat(01-02): create nhl.py sport module
NHL team mappings, Hockey-Reference/NHL API/ESPN scrapers, stadium data with coordinates.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-10 00:07:38 -06:00
Trey t
70acfb7bc6 feat(01-02): create nba.py sport module
NBA team mappings, Basketball-Reference/ESPN/CBS scrapers, stadium data with coordinates.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-10 00:07:37 -06:00
Trey t
60b450d869 docs: add Phase 1 plans and codebase documentation
- 01-01-PLAN.md: core.py + mlb.py (executed)
- 01-02-PLAN.md: nba.py + nhl.py
- 01-03-PLAN.md: nfl.py + orchestrator refactor
- Codebase documentation for planning context

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-10 00:00:45 -06:00
Trey t
504187059f docs(01-01): complete core.py + mlb.py plan
Tasks completed: 2/2
- Create core.py shared module
- Create mlb.py sport module

SUMMARY: .planning/phases/01-script-architecture/01-01-SUMMARY.md

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-10 00:00:33 -06:00
Trey t
cdf4c775ff feat(01-01): create mlb.py sport module
- MLB_TEAMS dictionary with all 30 teams
- Game scrapers: Baseball-Reference, MLB Stats API, ESPN
- Stadium scrapers: MLBScoreBot, GeoJSON, hardcoded fallback
- MLB_GAME_SOURCES and MLB_STADIUM_SOURCES configurations
- scrape_mlb_games() convenience function

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-09 23:59:04 -06:00
Trey t
edbb5dbbda feat(01-01): create core.py shared module
- Rate limiting utilities (REQUEST_DELAY, rate_limit, fetch_page)
- Data classes (Game, Stadium)
- Multi-source fallback system (ScraperSource, scrape_with_fallback)
- Stadium fallback system (StadiumScraperSource, scrape_stadiums_with_fallback)
- ID generation (assign_stable_ids)
- Export utilities (export_to_json, validate_games)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-09 23:58:55 -06:00
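The multi-source fallback idea can be illustrated with a minimal sketch: try each source in priority order and accept the first one that returns enough results. `ScraperSource` and `scrape_with_fallback` are names from the commit message, but the fields, parameters, and behavior shown here are assumptions rather than the contents of core.py.

```python
import time
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class ScraperSource:
    """One candidate data source (assumed fields)."""
    name: str
    scrape: Callable[[], list]
    min_games: int = 1  # smallest result size that counts as a success


def scrape_with_fallback(sources: List[ScraperSource], delay: float = 0.0) -> list:
    """Return results from the first source that succeeds, or [] if all fail."""
    for source in sources:
        try:
            games = source.scrape()
        except Exception as exc:  # a failing source must not abort the whole run
            print(f"{source.name} failed: {exc}")
            games = []
        if len(games) >= source.min_games:
            return games
        # Pause before hitting the next source (simple rate limiting).
        time.sleep(delay)
    return []
```

Ordering the list from most to least trusted source means a hardcoded dataset can sit last as a guaranteed fallback.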