Phase 2: Stadium Foundation
- 2 plans created
- 5 total tasks defined
- Ready for execution

Plan 02-01: Audit & complete hardcoded stadium data
Plan 02-02: Regenerate canonical data and verify pipeline

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
| phase | plan | type |
|---|---|---|
| 02-stadium-foundation | 01 | execute |
Purpose: Ensure all sport modules have complete, accurate stadium data that will flow through the canonicalization pipeline.
Output: All 4 sport modules with complete stadium data (city, state, lat/lng, capacity, year_opened, teams).
<execution_context> ~/.claude/get-shit-done/workflows/execute-phase.md ~/.claude/get-shit-done/templates/summary.md </execution_context>
@.planning/PROJECT.md @.planning/ROADMAP.md @.planning/STATE.md @.planning/phases/01-script-architecture/01-03-SUMMARY.md
Key files: @Scripts/mlb.py @Scripts/nba.py @Scripts/nhl.py @Scripts/nfl.py
Current state:
- MLB, NBA, NHL, NFL modules have hardcoded stadium data with city, state, lat/lng, capacity, teams
- Missing field: year_opened (null in all canonical data)
- NFL module created in Phase 1 Plan 03 with 30 hardcoded stadiums
- Bundled stadiums_canonical.json has incomplete data (state="", capacity=0, missing NFL)
Expected stadium counts:
- MLB: 30 stadiums (30 teams)
- NBA: 30 stadiums (30 teams)
- NHL: 32 stadiums (32 teams)
- NFL: 30 stadiums (32 teams, 2 shared: SoFi Stadium, MetLife Stadium)
Stadium data structure:
Each module has scrape_{sport}_stadiums_hardcoded() returning Stadium objects with:
- name, city, state, lat/lng, capacity, teams
- Missing: year_opened for filtering historical/renamed venues
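As a minimal sketch of the structure described above, the following shows what a Stadium record with an optional year_opened field might look like, and how that field could support filtering by opening year. The class here is a hypothetical stand-in, not the actual definition in core.py; the attribute names latitude and longitude and the helper opened_by are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Stadium:
    """Hypothetical stand-in for the Stadium class in core.py."""
    name: str
    city: str
    state: str
    latitude: float   # assumed attribute name for "lat"
    longitude: float  # assumed attribute name for "lng"
    capacity: int
    teams: List[str] = field(default_factory=list)
    # Optional with a None default, so existing call sites keep working
    # until every hardcoded entry has an opening year.
    year_opened: Optional[int] = None


def opened_by(stadiums: List[Stadium], year: int) -> List[Stadium]:
    """Return venues whose known opening year is `year` or earlier.

    Venues with no year_opened are excluded rather than guessed at.
    """
    return [
        s for s in stadiums
        if s.year_opened is not None and s.year_opened <= year
    ]
```

Defaulting year_opened to None mirrors the current state, where the field is null in all canonical data, and lets the field be introduced without breaking code that does not yet pass it.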
Do NOT modify any files in this task; it is audit only. The goal is to understand the current state before making changes.
Print an audit summary showing stadium counts per sport and any data quality issues found.
Audit report shows MLB: 30, NBA: 30, NHL: 32, NFL: 30 stadiums, with all required fields documented.
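The audit described in this task could be sketched roughly as follows. This is an illustration under stated assumptions, not the project's audit code: it reads attributes with getattr so it works on any object, the attribute names latitude and longitude are guesses, and the sample records at the bottom are placeholders rather than real stadium data. A real run would pass in the lists returned by each module's hardcoded scrape function.

```python
from types import SimpleNamespace
from typing import Any, Dict, Iterable, List

# Expected counts taken from the plan above.
EXPECTED_COUNTS: Dict[str, int] = {"MLB": 30, "NBA": 30, "NHL": 32, "NFL": 30}
# Assumed attribute names; adjust to match the real Stadium class.
REQUIRED_FIELDS = ("name", "city", "state", "latitude", "longitude",
                   "capacity", "teams")
# Values treated as "not filled in" (for example state="" or capacity=0).
EMPTY_VALUES = (None, "", 0, [])


def audit_sport(sport: str, stadiums: Iterable[Any]) -> List[str]:
    """Return a list of human-readable issues for one sport (read-only)."""
    stadiums = list(stadiums)
    issues: List[str] = []
    expected = EXPECTED_COUNTS[sport]
    if len(stadiums) != expected:
        issues.append(
            f"{sport}: expected {expected} stadiums, found {len(stadiums)}")
    for stadium in stadiums:
        label = getattr(stadium, "name", None) or "<unnamed>"
        for field_name in REQUIRED_FIELDS:
            if getattr(stadium, field_name, None) in EMPTY_VALUES:
                issues.append(f"{sport}: {label} has empty {field_name}")
        if getattr(stadium, "year_opened", None) is None:
            issues.append(f"{sport}: {label} is missing year_opened")
    return issues


def print_audit(data: Dict[str, List[Any]]) -> None:
    """Print stadium counts per sport followed by any issues found."""
    for sport, stadiums in data.items():
        issues = audit_sport(sport, stadiums)
        print(f"{sport}: {len(stadiums)} stadiums, {len(issues)} issue(s)")
        for issue in issues:
            print(f"  - {issue}")


# Placeholder record, only to show the report format.
sample = SimpleNamespace(name="Example Park", city="Somewhere", state="",
                         latitude=1.0, longitude=-1.0, capacity=0,
                         teams=["EX"], year_opened=None)
print_audit({"MLB": [sample]})
```

Because the functions only read attributes and print, they stay within the "audit only" constraint of this task.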
Task 2: Add year_opened to all hardcoded stadiums
Scripts/mlb.py, Scripts/nba.py, Scripts/nhl.py, Scripts/nfl.py
Add year_opened to each stadium's hardcoded data. Use the actual opening year for each venue:
MLB stadiums (sample):
- Fenway Park: 1912
- Wrigley Field: 1914
- Dodger Stadium: 1962
- Globe Life Field: 2020
NBA arenas (sample):
- TD Garden: 1995
- Madison Square Garden: 1968
- Chase Center: 2019
- Intuit Dome: 2024
NHL arenas: Many share with NBA - verify and match
NFL stadiums (sample):
- Lambeau Field: 1957
- SoFi Stadium: 2020
- Allegiant Stadium: 2020
For each module:
- Update the hardcoded dict to include 'year_opened' key
- Update Stadium object creation to include year_opened parameter
- Ensure Stadium dataclass in core.py has year_opened field (verify first)
Research actual opening years from Wikipedia if unsure. Use the original opening year, not renovation years.
Run python -c "from mlb import scrape_mlb_stadiums; s=scrape_mlb_stadiums(); print(f'MLB: {len(s)} stadiums, year_opened example: {s[0].year_opened if hasattr(s[0], \"year_opened\") else \"MISSING\"}')" for MLB, then repeat for each other sport with the matching module and function names.
All 4 sport modules have year_opened in hardcoded data, Stadium objects include year_opened field
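The dict and constructor changes in Task 2 might look roughly like the sketch below. It is not the code in Scripts/mlb.py: the Stadium class is a hypothetical stand-in for the one in core.py, the dict layout and the team abbreviations are assumptions, and the coordinates and capacities are approximate values included only so the example runs. The opening years are the ones listed in the samples above.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Stadium:
    """Hypothetical stand-in for core.Stadium."""
    name: str
    city: str
    state: str
    latitude: float
    longitude: float
    capacity: int
    teams: List[str]
    year_opened: Optional[int] = None


# Assumed shape of the hardcoded data, keyed by stadium name.
# Coordinates and capacities are approximate and illustrative only.
MLB_STADIUMS: Dict[str, dict] = {
    "Fenway Park": {
        "city": "Boston", "state": "MA",
        "lat": 42.3467, "lng": -71.0972,
        "capacity": 37755, "teams": ["BOS"],
        "year_opened": 1912,  # new key: original opening year
    },
    "Wrigley Field": {
        "city": "Chicago", "state": "IL",
        "lat": 41.9484, "lng": -87.6553,
        "capacity": 41649, "teams": ["CHC"],
        "year_opened": 1914,
    },
}


def build_stadiums(raw: Dict[str, dict]) -> List[Stadium]:
    """Turn the hardcoded dict into Stadium objects, passing year_opened."""
    return [
        Stadium(
            name=name,
            city=info["city"],
            state=info["state"],
            latitude=info["lat"],
            longitude=info["lng"],
            capacity=info["capacity"],
            teams=list(info["teams"]),
            # .get() yields None for any entry not yet updated, which
            # an audit can then report instead of raising KeyError.
            year_opened=info.get("year_opened"),
        )
        for name, info in raw.items()
    ]
```

Using info.get("year_opened") instead of info["year_opened"] is a design choice for the transition period: partially updated modules still import cleanly, and missing years show up as None rather than as import-time errors.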
<success_criteria>
- Task 1: Audit complete with documented counts and any gaps identified
- Task 2: year_opened added to all hardcoded stadiums in all 4 modules
- No import errors when loading modules
- Ready for Plan 02-02 (pipeline regeneration)
</success_criteria>
Phase 2 Plan 01: Stadium Data Audit & Completion Summary
[Substantive one-liner]
Accomplishments
- [Stadium counts verified]
- [year_opened added to all modules]
Files Created/Modified
- Scripts/mlb.py - Added year_opened
- Scripts/nba.py - Added year_opened
- Scripts/nhl.py - Added year_opened
- Scripts/nfl.py - Added year_opened
Decisions Made
[Any gaps found and how resolved]
Issues Encountered
[Any data issues discovered]
Next Step
Ready for 02-02-PLAN.md (pipeline regeneration)