Sportstime/.planning/phases/2.1-add-stadium-data-mls-wnba-nwsl-cbb/02.1-01-PLAN.md

---
phase: 2.1-additional-sports-stadiums
plan: 01
type: execute
---

<objective>
Create MLS sport module with complete hardcoded stadium data.

Purpose: Enable MLS stadium data to flow through the canonicalization pipeline like the core 4 sports.
Output: mls.py module with 30 stadiums including capacity, year_opened, and coordinates.
</objective>

<execution_context>
~/.claude/get-shit-done/workflows/execute-phase.md
~/.claude/get-shit-done/templates/summary.md
</execution_context>

<context>
@.planning/PROJECT.md
@.planning/ROADMAP.md
@.planning/STATE.md

# Prior phase context:
@.planning/phases/02-stadium-foundation/02-02-SUMMARY.md

# Pattern reference (follow this module structure):
@Scripts/mlb.py
@Scripts/nba.py

# Current MLS data location:
@Scripts/scrape_schedules.py (MLS_TEAMS dict at line 93)
@Scripts/data/stadiums.json (MLS entries have lat/lng but missing capacity/year_opened)

# Core module for imports:
@Scripts/core.py

**Tech stack available:** Python 3, dataclasses, requests
**Established patterns:** Sport module structure (team dict, get_abbrev function, hardcoded stadiums, scraper sources)
**Constraining decisions:**
- Phase 02-02: MLS excluded from bundled JSON due to incomplete data (zero capacity, null year_opened)
</context>

<tasks>

<task type="auto">
  <name>Task 1: Create mls.py module with complete stadium data</name>
  <files>Scripts/mls.py</files>
  <action>
Create mls.py following the mlb.py/nba.py pattern:

1. Module docstring and imports (try/except for core imports)
2. __all__ exports list
3. MLS_TEAMS dict (copy from scrape_schedules.py, 30 teams)
4. get_mls_team_abbrev() function
5. Hardcoded MLS stadiums dict with COMPLETE data:
   - All 30 MLS stadiums
   - Each entry needs: city, state, lat, lng, capacity, teams (list of abbrevs), year_opened
   - Use existing lat/lng from Scripts/data/stadiums.json where available
   - Research capacity and year_opened for each stadium

Key stadiums to research (capacity/year_opened):
- Mercedes-Benz Stadium (ATL) - shared with NFL
- Q2 Stadium (Austin) - MLS-specific, opened 2021
- Bank of America Stadium (CLT) - shared with NFL
- Soldier Field (CHI) - shared with NFL
- TQL Stadium (CIN) - MLS-specific, opened 2021
- Dick's Sporting Goods Park (COL)
- Lower.com Field (CLB) - opened 2021
- Toyota Stadium (DAL)
- Audi Field (DC) - MLS-specific, opened 2018
- Shell Energy Stadium (HOU) - MLS-specific
- Dignity Health Sports Park (LAG)
- BMO Stadium (LAFC) - opened 2018
- Chase Stadium (MIA) - MLS-specific
- Allianz Field (MIN) - opened 2019
- Stade Saputo (MTL)
- Geodis Park (NSH) - opened 2022
- Gillette Stadium (NE) - shared with NFL
- Yankee Stadium (NYCFC) - shared with MLB
- Red Bull Arena (NYRB)
- Inter&Co Stadium (ORL)
- Subaru Park (PHI)
- Providence Park (POR)
- America First Field (RSL)
- PayPal Park (SJ)
- Lumen Field (SEA) - shared with NFL
- Children's Mercy Park (SKC)
- CityPark (STL) - opened 2023
- BMO Field (TOR)
- BC Place (VAN) - shared with CFL
- Snapdragon Stadium (SD) - shared with NCAA football (San Diego State), opened 2022

6. scrape_mls_stadiums_hardcoded() function returning list[Stadium]
7. scrape_mls_stadiums() function with fallback sources
8. MLS_STADIUM_SOURCES configuration

Note: Several stadiums are shared with teams in other leagues (NFL, MLB). Where the soccer configuration differs from the full-stadium capacity, record the MLS (soccer-configuration) capacity.
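
A minimal sketch of the module layout described in steps 1-8, for orientation only. The `Stadium` dataclass here is a stand-in for whatever `core.py` actually exports, the `MLS_TEAMS` value shape and the three-letter fallback in `get_mls_team_abbrev()` are assumptions, and the two stadium entries use approximate coordinates and capacities that still need to be confirmed in the research step. Follow mlb.py/nba.py wherever they differ from this.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Stadium:
    """Stand-in for core.Stadium; the real field names may differ."""
    name: str
    city: str
    state: str
    lat: float
    lng: float
    capacity: int
    year_opened: int
    teams: list[str] = field(default_factory=list)


# Assumed shape: abbreviation -> team info. Only 3 of the 30 teams are shown.
MLS_TEAMS = {
    "ATL": {"name": "Atlanta United FC", "city": "Atlanta"},
    "CIN": {"name": "FC Cincinnati", "city": "Cincinnati"},
    "DC": {"name": "D.C. United", "city": "Washington"},
}


def get_mls_team_abbrev(team_name: str) -> str:
    """Look up an abbreviation by full team name (case-insensitive)."""
    for abbrev, info in MLS_TEAMS.items():
        if info["name"].lower() == team_name.lower():
            return abbrev
    # Assumed fallback: first three letters, uppercased.
    return team_name[:3].upper()


# Illustrative entries only; coordinates and capacities are approximate.
MLS_STADIUMS = {
    "TQL Stadium": {
        "city": "Cincinnati", "state": "OH", "lat": 39.111, "lng": -84.522,
        "capacity": 26000, "teams": ["CIN"], "year_opened": 2021,
    },
    "Audi Field": {
        "city": "Washington", "state": "DC", "lat": 38.868, "lng": -77.013,
        "capacity": 20000, "teams": ["DC"], "year_opened": 2018,
    },
}


def scrape_mls_stadiums_hardcoded() -> list[Stadium]:
    """Build Stadium objects from the hardcoded dict."""
    return [Stadium(name=name, **info) for name, info in MLS_STADIUMS.items()]


# Ordered (label, callable) pairs; the hardcoded data is tried first.
MLS_STADIUM_SOURCES: list[tuple[str, Callable[[], list[Stadium]]]] = [
    ("hardcoded", scrape_mls_stadiums_hardcoded),
]


def scrape_mls_stadiums() -> list[Stadium]:
    """Return the first non-empty result, trying sources in order."""
    for source_name, fetch in MLS_STADIUM_SOURCES:
        try:
            stadiums = fetch()
        except Exception as exc:  # a failing source should not stop the run
            print(f"{source_name} failed: {exc}")
            continue
        if stadiums:
            return stadiums
    return []
```

Keeping the sources as an ordered list means a secondary scraper can be appended later without touching `scrape_mls_stadiums()`.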
  </action>
  <verify>python3 -c "from Scripts.mls import MLS_TEAMS, scrape_mls_stadiums_hardcoded; s = scrape_mls_stadiums_hardcoded(); print(f'{len(s)} stadiums'); assert len(s) == 30; assert all(st.capacity > 0 for st in s); assert all(st.year_opened for st in s)"</verify>
  <done>mls.py exists with 30 teams, 30 stadiums, all with non-zero capacity and year_opened values</done>
</task>

<task type="auto">
  <name>Task 2: Integrate MLS module with scrape_schedules.py</name>
  <files>Scripts/scrape_schedules.py</files>
  <action>
Update scrape_schedules.py to use the new mls.py module:

1. Add import at top (with try/except pattern):
   - from mls import MLS_TEAMS, get_mls_team_abbrev, scrape_mls_stadiums, MLS_STADIUM_SOURCES

2. Remove inline MLS_TEAMS dict (lines ~93-124) - now imported from mls.py

3. Update get_team_abbrev() function to use get_mls_team_abbrev() for MLS

4. Keep scrape_mls_stadiums_gavinr(), but demote it to a secondary source; the hardcoded data in mls.py is the primary source

5. Update the stadium scraping section to use scrape_mls_stadiums() from mls.py

6. Verify that MLS game scraping still works (it uses MLS_TEAMS for abbreviation lookup)

Do NOT remove the game scraping functions (scrape_mls_fbref, etc.) - those stay inline for now.
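
One possible shape for step 3, sketched under assumptions: the existing `get_team_abbrev()` body is not shown in this plan, so the dispatch table and the three-letter fallback below are hypothetical. `get_mls_team_abbrev()` is defined inline only so the sketch runs on its own; in scrape_schedules.py it would come from the mls.py import in step 1.

```python
# Stand-in for the function imported from mls.py (one team shown).
def get_mls_team_abbrev(team_name: str) -> str:
    known = {"atlanta united fc": "ATL"}
    return known.get(team_name.lower(), team_name[:3].upper())


# Hypothetical dispatch table: sport code -> per-sport lookup function.
_ABBREV_LOOKUPS = {
    "MLS": get_mls_team_abbrev,
}


def get_team_abbrev(team_name: str, sport: str) -> str:
    """Delegate to the sport module's lookup when one is registered."""
    lookup = _ABBREV_LOOKUPS.get(sport.upper())
    if lookup is not None:
        return lookup(team_name)
    # Assumed default for sports without a registered lookup.
    return team_name[:3].upper()


print(get_team_abbrev("Atlanta United FC", "MLS"))  # prints ATL
```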
  </action>
  <verify>cd Scripts && python3 -c "from scrape_schedules import MLS_TEAMS, get_team_abbrev; print(f'MLS teams: {len(MLS_TEAMS)}'); abbrev = get_team_abbrev('Atlanta United FC', 'MLS'); print(f'ATL United abbrev: {abbrev}'); assert abbrev == 'ATL'"</verify>
  <done>scrape_schedules.py imports MLS_TEAMS from mls.py, get_team_abbrev works for MLS, inline MLS_TEAMS removed</done>
</task>

</tasks>

<verification>
Before declaring plan complete:
- [ ] mls.py exists with complete module structure
- [ ] All 30 MLS stadiums have capacity > 0 and year_opened values
- [ ] scrape_schedules.py imports from mls.py successfully
- [ ] `python3 Scripts/scrape_schedules.py --stadiums-update` includes MLS stadiums with complete data
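
If a bare `assert` is not informative enough when a check fails, a small report helper along these lines can name the offending stadiums. This is a sketch, not part of the pipeline: `find_incomplete()` is a hypothetical helper, the `Stadium` class is a reduced stand-in for the one in core.py, and the sample entries are made up to mirror the zero-capacity / null-year_opened case noted for Phase 02-02.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Stadium:
    """Reduced stand-in for core.Stadium with only the checked fields."""
    name: str
    capacity: int
    year_opened: Optional[int]


def find_incomplete(stadiums):
    """Map stadium name -> list of fields that are missing or zero."""
    problems = {}
    for stadium in stadiums:
        missing = []
        if not stadium.capacity or stadium.capacity <= 0:
            missing.append("capacity")
        if not stadium.year_opened:
            missing.append("year_opened")
        if missing:
            problems[stadium.name] = missing
    return problems


# Made-up sample data: one complete entry, one incomplete entry.
sample = [
    Stadium("Example Park A", 20000, 2018),
    Stadium("Example Park B", 0, None),
]
print(find_incomplete(sample))
```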
</verification>

<success_criteria>

- mls.py module created following established pattern
- 30 MLS stadiums with complete data (capacity, year_opened, coordinates)
- scrape_schedules.py integration works
- No import errors when running pipeline
</success_criteria>

<output>
After completion, create `.planning/phases/2.1-add-stadium-data-mls-wnba-nwsl-cbb/02.1-01-SUMMARY.md`
</output>