diff --git a/.planning/PROJECT.md b/.planning/PROJECT.md
index b9f5297..58dc9e1 100644
--- a/.planning/PROJECT.md
+++ b/.planning/PROJECT.md
@@ -19,13 +19,13 @@ Every game must correctly link to its teams and stadium — a game at the wrong
 
 ### Active
 
-- [ ] Split scripts by sport (MLB, NBA, NHL, NFL as separate modules)
-- [ ] Complete stadium database with correct coordinates and names
-- [ ] Stadium alias system for name variations across sources
-- [ ] Correct game→team→stadium canonical linking for all sports
-- [ ] Full CRUD CloudKit management (create, read, update, delete)
-- [ ] Validation reports showing counts, gaps, and orphan records
-- [ ] Team alias system for name variations across sources
+- ✓ Split scripts by sport (MLB, NBA, NHL, NFL as separate modules) — 7 sport modules
+- ✓ Complete stadium database with correct coordinates and names — 148 stadiums
+- ✓ Stadium alias system for name variations across sources — alias JSON files
+- ✓ Correct game→team→stadium canonical linking for all sports — canonicalize_games.py
+- ✓ Full CRUD CloudKit management (create, read, update, delete) — cloudkit_import.py
+- ✓ Validation reports showing counts, gaps, and orphan records — --validate flag
+- ✓ Team alias system for name variations across sources — TEAM_ABBREV_ALIASES
 
 ### Out of Scope
 
@@ -36,10 +36,10 @@ Every game must correctly link to its teams and stadium — a game at the wrong
 ## Context
 
 **Current State:**
-- Data quality issues exist across all sports (wrong stadiums, missing games, broken team links)
-- Stadium problems include: missing venues, wrong coordinates, name mismatches between sources
-- Single large script files that are hard to debug and maintain
-- Existing CloudKit import works but lacks verification and CRUD operations
+- Data quality: Resolved — all games correctly link to teams and stadiums via canonical IDs
+- Stadium database: Complete — 148 stadiums across 7 sports with verified coordinates
+- Script organization: Resolved — sport-specific modules (mlb.py, nba.py, nhl.py, nfl.py, mls.py, wnba.py, nwsl.py)
+- CloudKit: Full CRUD — create, update, delete with diff reporting, verification, and orphan detection
 
 **Existing Infrastructure:**
 - Python 3 with requests, beautifulsoup4, pandas, lxml
@@ -63,9 +63,9 @@ Every game must correctly link to its teams and stadium — a game at the wrong
 
 | Decision | Rationale | Outcome |
 |----------|-----------|---------|
-| Split by sport, not function | User preference for organization | — Pending |
-| Validation reports over automated tests | Faster feedback, easier debugging | — Pending |
-| Full CRUD over upload-only | Enable data corrections without full rebuild | — Pending |
+| Split by sport, not function | User preference for organization | ✓ Completed — 7 sport modules (mlb.py, nba.py, nhl.py, nfl.py, mls.py, wnba.py, nwsl.py) |
+| Validation reports over automated tests | Faster feedback, easier debugging | ✓ Completed — --validate flag with health scores and completeness metrics |
+| Full CRUD over upload-only | Enable data corrections without full rebuild | ✓ Completed — create/update/delete with diff reporting and orphan detection |
 
 ---
-*Last updated: 2026-01-09 after initialization*
+*Last updated: 2026-01-10 — Project complete (all 7 phases finished)*