Phase 2.1: Additional Sports Stadiums

- 3 plans created (MLS, WNBA, NWSL modules)
- CBB deferred to future phase (350+ D1 teams)
- 6 total tasks defined
- Ready for execution

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
| phase | plan | type |
|---|---|---|
| 2.1-additional-sports-stadiums | 01 | execute |
Purpose: Enable MLS stadium data to flow through the canonicalization pipeline like the core 4 sports.

Output: mls.py module with 30 stadiums including capacity, year_opened, and coordinates.
<execution_context>
~/.claude/get-shit-done/workflows/execute-phase.md
~/.claude/get-shit-done/templates/summary.md
</execution_context>
@.planning/PROJECT.md
@.planning/ROADMAP.md
@.planning/STATE.md

Prior phase context:
@.planning/phases/02-stadium-foundation/02-02-SUMMARY.md
Pattern reference (follow this module structure):
@Scripts/mlb.py @Scripts/nba.py
Current MLS data location:
@Scripts/scrape_schedules.py (MLS_TEAMS dict at line 93)
@Scripts/data/stadiums.json (MLS entries have lat/lng but are missing capacity/year_opened)
Core module for imports:
@Scripts/core.py
Tech stack available: Python 3, dataclasses, requests

Established patterns: Sport module structure (team dict, get_abbrev function, hardcoded stadiums, scraper sources)

Constraining decisions:
- Phase 02-02: MLS excluded from bundled JSON due to incomplete data (zero capacity, null year_opened)
- Module docstring and imports (try/except for core imports)
- `__all__` exports list
- MLS_TEAMS dict (copy from scrape_schedules.py, 30 teams)
- get_mls_team_abbrev() function
- Hardcoded MLS stadiums dict with COMPLETE data:
  - All 30 MLS stadiums
  - Each entry needs: city, state, lat, lng, capacity, teams (list of abbrevs), year_opened
  - Use existing lat/lng from Scripts/data/stadiums.json where available
  - Research capacity and year_opened for each stadium
Key stadiums to research (capacity/year_opened):
- Mercedes-Benz Stadium (ATL) - shared with NFL
- Q2 Stadium (Austin) - MLS-specific, opened 2021
- Bank of America Stadium (CLT) - shared with NFL
- Soldier Field (CHI) - shared with NFL
- TQL Stadium (CIN) - MLS-specific, opened 2021
- Dick's Sporting Goods Park (COL)
- Lower.com Field (CLB) - opened 2021
- Toyota Stadium (DAL)
- Audi Field (DC) - MLS-specific, opened 2018
- Shell Energy Stadium (HOU) - MLS-specific
- Dignity Health Sports Park (LAG)
- BMO Stadium (LAFC) - opened 2018
- Chase Stadium (MIA) - MLS-specific
- Allianz Field (MIN) - opened 2019
- Stade Saputo (MTL)
- Geodis Park (NSH) - opened 2022
- Gillette Stadium (NE) - shared with NFL
- Yankee Stadium (NYCFC) - shared with MLB
- Red Bull Arena (NYRB)
- Inter&Co Stadium (ORL)
- Subaru Park (PHI)
- Providence Park (POR)
- America First Field (RSL)
- PayPal Park (SJ)
- Lumen Field (SEA) - shared with NFL
- Children's Mercy Park (SKC)
- CityPark (STL) - opened 2023
- BMO Field (TOR)
- BC Place (VAN) - shared stadium
- Snapdragon Stadium (SD) - shared, opened 2022
- scrape_mls_stadiums_hardcoded() function returning list[Stadium]
- scrape_mls_stadiums() function with fallback sources
- MLS_STADIUM_SOURCES configuration
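The module pieces listed above could fit together roughly as follows. This is a minimal sketch, not the actual contents of mls.py: the `Stadium` dataclass is a local stand-in for the type in Scripts/core.py (its real fields may differ), the shape of `MLS_STADIUM_SOURCES` as a list of `(label, callable)` pairs is an assumption, and only two sample stadiums are shown. The opening years come from the list above; the capacities and coordinates are approximate and still need to be confirmed during research.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Stadium:
    """Stand-in for the Stadium type in Scripts/core.py (assumed fields)."""
    name: str
    city: str
    state: str
    latitude: float
    longitude: float
    capacity: int
    sport: str
    team_abbrevs: list
    year_opened: Optional[int] = None


# Two sample entries only; the real dict would hold all 30 stadiums.
# Capacity and coordinates are approximate placeholders to be verified.
MLS_STADIUMS = {
    "Audi Field": {
        "city": "Washington", "state": "DC",
        "lat": 38.8686, "lng": -77.0128,
        "capacity": 20000, "teams": ["DC"], "year_opened": 2018,
    },
    "Allianz Field": {
        "city": "Saint Paul", "state": "MN",
        "lat": 44.9531, "lng": -93.1653,
        "capacity": 19400, "teams": ["MIN"], "year_opened": 2019,
    },
}


def scrape_mls_stadiums_hardcoded() -> list:
    """Convert the hardcoded dict into Stadium objects."""
    return [
        Stadium(
            name=name,
            city=info["city"],
            state=info["state"],
            latitude=info["lat"],
            longitude=info["lng"],
            capacity=info["capacity"],
            sport="MLS",
            team_abbrevs=list(info["teams"]),
            year_opened=info["year_opened"],
        )
        for name, info in MLS_STADIUMS.items()
    ]


# Assumed shape: ordered (label, fetch function) pairs, primary source first.
MLS_STADIUM_SOURCES: list = [
    ("hardcoded", scrape_mls_stadiums_hardcoded),
]


def scrape_mls_stadiums() -> list:
    """Try each source in order and return the first non-empty result."""
    for label, fetch in MLS_STADIUM_SOURCES:
        try:
            stadiums = fetch()
        except Exception as exc:  # a failing source should not stop the fallback chain
            print(f"MLS stadium source '{label}' failed: {exc}")
            continue
        if stadiums:
            return stadiums
    return []
```

Keeping the hardcoded function as the first source means a network failure in any secondary scraper cannot leave MLS without stadium data.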
Note: Some stadiums are shared with NFL/MLB - use correct MLS-specific capacity where different (soccer configuration).

Verify: `python3 -c "from Scripts.mls import MLS_TEAMS, scrape_mls_stadiums_hardcoded; s = scrape_mls_stadiums_hardcoded(); print(f'{len(s)} stadiums'); assert len(s) == 30; assert all(st.capacity > 0 for st in s); assert all(st.year_opened for st in s)"`

Done: mls.py exists with 30 teams, 30 stadiums, all with non-zero capacity and year_opened values
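The one-line check stops at the first failed assertion and does not say which stadium is incomplete. A small helper that reports every problem at once may be easier to work with while researching the data. This is a hypothetical sketch: `find_stadium_problems` is not part of the plan, and it assumes each stadium object exposes `name`, `capacity`, and `year_opened` attributes. The sample records use `SimpleNamespace` stand-ins with made-up names.

```python
from types import SimpleNamespace


def find_stadium_problems(stadiums, expected_count=30):
    """Return a list of human-readable problems; an empty list means the data is complete."""
    problems = []
    if len(stadiums) != expected_count:
        problems.append(f"expected {expected_count} stadiums, found {len(stadiums)}")
    for st in stadiums:
        if not st.capacity or st.capacity <= 0:
            problems.append(f"{st.name}: capacity missing or zero")
        if not st.year_opened:
            problems.append(f"{st.name}: year_opened missing")
    return problems


# Stand-in records for illustration only (not real stadium data).
sample = [
    SimpleNamespace(name="Complete Park", capacity=20000, year_opened=2018),
    SimpleNamespace(name="Incomplete Park", capacity=0, year_opened=None),
]

for problem in find_stadium_problems(sample, expected_count=2):
    print(problem)
```

In the real pipeline the helper would be called with the result of `scrape_mls_stadiums_hardcoded()` and the default `expected_count` of 30.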
Task 2: Integrate MLS module with scrape_schedules.py

Files: Scripts/scrape_schedules.py

Update scrape_schedules.py to use the new mls.py module:

- Add import at top (with try/except pattern):
  - `from mls import MLS_TEAMS, get_mls_team_abbrev, scrape_mls_stadiums, MLS_STADIUM_SOURCES`
- Remove inline MLS_TEAMS dict (lines ~93-124) - now imported from mls.py
- Update get_team_abbrev() function to use get_mls_team_abbrev() for MLS
- Update scrape_mls_stadiums_gavinr() to be a secondary source (keep it, but mls.py hardcoded is primary)
- Update the stadium scraping section to use scrape_mls_stadiums() from mls.py
- Verify MLS games scraping still works (uses MLS_TEAMS for abbreviation lookup)

Do NOT remove the game scraping functions (scrape_mls_fbref, etc.) - those stay inline for now.

Verify: `cd Scripts && python3 -c "from scrape_schedules import MLS_TEAMS, get_team_abbrev; print(f'MLS teams: {len(MLS_TEAMS)}'); abbrev = get_team_abbrev('Atlanta United FC', 'MLS'); print(f'ATL United abbrev: {abbrev}'); assert abbrev == 'ATL'"`

Done: scrape_schedules.py imports MLS_TEAMS from mls.py, get_team_abbrev works for MLS, inline MLS_TEAMS removed
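One way get_team_abbrev() could delegate to the MLS module is a per-sport lookup table. The sketch below is an assumption about the wiring, not the existing code in scrape_schedules.py: `MLS_TEAMS` and `get_mls_team_abbrev` are defined inline as stand-ins so the example runs on its own (in the real file they would come from the `from mls import ...` line), the dict layout of abbreviation to `{"name": ...}` is assumed, and the three-letter fallback for unknown teams is hypothetical.

```python
# Stand-ins for the names that scrape_schedules.py would import from mls.py.
# Assumed layout: abbreviation -> team info dict with a "name" key.
MLS_TEAMS = {
    "ATL": {"name": "Atlanta United FC"},
}


def get_mls_team_abbrev(team_name: str) -> str:
    """Look up an MLS abbreviation by full team name (case-insensitive)."""
    wanted = team_name.strip().lower()
    for abbrev, info in MLS_TEAMS.items():
        if info["name"].lower() == wanted:
            return abbrev
    # Hypothetical fallback: first three letters, upper-cased.
    return team_name.strip()[:3].upper()


# Per-sport dispatch table; other sports would register their own lookups here.
_ABBREV_LOOKUPS = {
    "MLS": get_mls_team_abbrev,
}


def get_team_abbrev(team_name: str, sport: str) -> str:
    """Route the lookup to the sport-specific function when one is registered."""
    lookup = _ABBREV_LOOKUPS.get(sport.upper())
    if lookup is None:
        return team_name.strip()[:3].upper()
    return lookup(team_name)


print(get_team_abbrev("Atlanta United FC", "MLS"))
```

A dispatch table keeps get_team_abbrev() unchanged as the WNBA and NWSL modules from the same phase are added; each new module only registers its own lookup function.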
Before declaring plan complete:

- [ ] mls.py exists with complete module structure
- [ ] All 30 MLS stadiums have capacity > 0 and year_opened values
- [ ] scrape_schedules.py imports from mls.py successfully
- [ ] `python3 Scripts/scrape_schedules.py --stadiums-update` includes MLS stadiums with complete data

<success_criteria>
- mls.py module created following established pattern
- 30 MLS stadiums with complete data (capacity, year_opened, coordinates)
- scrape_schedules.py integration works
- No import errors when running pipeline
</success_criteria>