docs(02.1): create phase plan for additional sports stadiums

Phase 2.1: Additional Sports Stadiums
- 3 plans created (MLS, WNBA, NWSL modules)
- CBB deferred to future phase (350+ D1 teams)
- 6 total tasks defined
- Ready for execution

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Trey t
2026-01-10 00:47:02 -06:00
parent 64137b57bf
commit 02d154cf46
4 changed files with 424 additions and 5 deletions


@@ -46,13 +46,15 @@ Plans:
- [x] 02-02: Regenerate canonical data and verify pipeline
### Phase 2.1: Additional Sports Stadiums (INSERTED)
-**Goal**: Add hardcoded stadium data for secondary sports: MLS, WNBA, NWSL, and CBB (College Basketball)
+**Goal**: Add hardcoded stadium data for secondary sports: MLS, WNBA, NWSL (CBB deferred - 350+ D1 teams requires separate scoped phase)
**Depends on**: Phase 2
-**Research**: Unlikely (stadium data compilation)
-**Plans**: TBD
+**Research**: No (stadium data compilation following established patterns)
+**Plans**: 3 plans
Plans:
-- [ ] 02.1-01: TBD (run /gsd:plan-phase 2.1 to break down)
+- [ ] 02.1-01: Create MLS module with 30 hardcoded stadiums
+- [ ] 02.1-02: Create WNBA module with 13 hardcoded arenas
+- [ ] 02.1-03: Create NWSL module with 13+ hardcoded stadiums
### Phase 3: Alias Systems
**Goal**: Implement alias systems for both stadiums and teams to handle name variations across data sources
@@ -100,7 +102,7 @@ Phases execute in numeric order: 1 → 2 → 2.1 → 3 → 4 → 5 → 6
|-------|----------------|--------|-----------|
| 1. Script Architecture | 3/3 | Complete | 2026-01-10 |
| 2. Stadium Foundation | 2/2 | Complete | 2026-01-10 |
-| 2.1. Additional Sports Stadiums | 0/TBD | Not started | - |
+| 2.1. Additional Sports Stadiums | 0/3 | Not started | - |
| 3. Alias Systems | 0/TBD | Not started | - |
| 4. Canonical Linking | 0/TBD | Not started | - |
| 5. CloudKit CRUD | 0/TBD | Not started | - |


@@ -0,0 +1,149 @@
---
phase: 2.1-additional-sports-stadiums
plan: 01
type: execute
---
<objective>
Create MLS sport module with complete hardcoded stadium data.
Purpose: Enable MLS stadium data to flow through the canonicalization pipeline like the core 4 sports.
Output: mls.py module with 30 stadiums including capacity, year_opened, and coordinates.
</objective>
<execution_context>
~/.claude/get-shit-done/workflows/execute-phase.md
~/.claude/get-shit-done/templates/summary.md
</execution_context>
<context>
@.planning/PROJECT.md
@.planning/ROADMAP.md
@.planning/STATE.md
# Prior phase context:
@.planning/phases/02-stadium-foundation/02-02-SUMMARY.md
# Pattern reference (follow this module structure):
@Scripts/mlb.py
@Scripts/nba.py
# Current MLS data location:
@Scripts/scrape_schedules.py (MLS_TEAMS dict at line 93)
@Scripts/data/stadiums.json (MLS entries have lat/lng but missing capacity/year_opened)
# Core module for imports:
@Scripts/core.py
**Tech stack available:** Python 3, dataclasses, requests
**Established patterns:** Sport module structure (team dict, get_abbrev function, hardcoded stadiums, scraper sources)
**Constraining decisions:**
- Phase 02-02: MLS excluded from bundled JSON due to incomplete data (zero capacity, null year_opened)
</context>
<tasks>
<task type="auto">
<name>Task 1: Create mls.py module with complete stadium data</name>
<files>Scripts/mls.py</files>
<action>
Create mls.py following the mlb.py/nba.py pattern:
1. Module docstring and imports (try/except for core imports)
2. __all__ exports list
3. MLS_TEAMS dict (copy from scrape_schedules.py, 30 teams)
4. get_mls_team_abbrev() function
5. Hardcoded MLS stadiums dict with COMPLETE data:
- All 30 MLS stadiums
- Each entry needs: city, state, lat, lng, capacity, teams (list of abbrevs), year_opened
- Use existing lat/lng from Scripts/data/stadiums.json where available
- Research capacity and year_opened for each stadium
Key stadiums to research (capacity/year_opened):
- Mercedes-Benz Stadium (ATL) - shared with NFL
- Q2 Stadium (Austin) - MLS-specific, opened 2021
- Bank of America Stadium (CLT) - shared with NFL
- Soldier Field (CHI) - shared with NFL
- TQL Stadium (CIN) - MLS-specific, opened 2021
- Dick's Sporting Goods Park (COL)
- Lower.com Field (CLB) - opened 2021
- Toyota Stadium (DAL)
- Audi Field (DC) - MLS-specific, opened 2018
- Shell Energy Stadium (HOU) - MLS-specific
- Dignity Health Sports Park (LAG)
- BMO Stadium (LAFC) - opened 2018
- Chase Stadium (MIA) - MLS-specific
- Allianz Field (MIN) - opened 2019
- Stade Saputo (MTL)
- Geodis Park (NSH) - opened 2022
- Gillette Stadium (NE) - shared with NFL
- Yankee Stadium (NYCFC) - shared with MLB
- Red Bull Arena (NYRB)
- Inter&Co Stadium (ORL)
- Subaru Park (PHI)
- Providence Park (POR)
- America First Field (RSL)
- PayPal Park (SJ)
- Lumen Field (SEA) - shared with NFL
- Children's Mercy Park (SKC)
- CityPark (STL) - opened 2023
- BMO Field (TOR)
- BC Place (VAN) - shared stadium
- Snapdragon Stadium (SD) - shared, opened 2022
6. scrape_mls_stadiums_hardcoded() function returning list[Stadium]
7. scrape_mls_stadiums() function with fallback sources
8. MLS_STADIUM_SOURCES configuration
Note: Some stadiums are shared with NFL/MLB - use correct MLS-specific capacity where different (soccer configuration).
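Illustrative sketch only (not the actual mlb.py/nba.py code): one way steps 5 and 6 could fit together, assuming core.py exposes a `Stadium` dataclass. The `Stadium` fields, `EXAMPLE_STADIUMS`, and `stadiums_from_hardcoded` are hypothetical stand-ins, and the single entry uses placeholder values rather than researched data.

```python
from dataclasses import dataclass, field


@dataclass
class Stadium:
    """Hypothetical stand-in for the Stadium type expected from core.py."""
    name: str
    city: str
    state: str
    latitude: float
    longitude: float
    capacity: int
    year_opened: int
    sport: str
    team_abbrevs: list = field(default_factory=list)


# Placeholder entry: real lat/lng, capacity, and year_opened must be researched.
EXAMPLE_STADIUMS = {
    "Example Park": {
        "city": "Example City", "state": "EX",
        "lat": 0.0, "lng": 0.0,
        "capacity": 20000, "teams": ["EXA"], "year_opened": 2020,
    },
}


def stadiums_from_hardcoded(entries: dict, sport: str) -> list:
    """Build Stadium records from a hardcoded dict, rejecting incomplete rows."""
    stadiums = []
    for name, info in entries.items():
        # Same conditions the verify step checks: positive capacity, year_opened set.
        if info["capacity"] <= 0 or not info["year_opened"]:
            raise ValueError(f"incomplete stadium data for {name}")
        stadiums.append(Stadium(
            name=name,
            city=info["city"],
            state=info["state"],
            latitude=info["lat"],
            longitude=info["lng"],
            capacity=info["capacity"],
            year_opened=info["year_opened"],
            sport=sport,
            team_abbrevs=list(info["teams"]),
        ))
    return stadiums
```

Failing fast on a zero capacity or missing year_opened would surface the gap that kept MLS out of the bundled JSON in Phase 02-02, rather than letting incomplete rows reach the pipeline.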
</action>
<verify>python3 -c "from Scripts.mls import MLS_TEAMS, scrape_mls_stadiums_hardcoded; s = scrape_mls_stadiums_hardcoded(); print(f'{len(s)} stadiums'); assert len(s) == 30; assert all(st.capacity > 0 for st in s); assert all(st.year_opened for st in s)"</verify>
<done>mls.py exists with 30 teams, 30 stadiums, all with non-zero capacity and year_opened values</done>
</task>
<task type="auto">
<name>Task 2: Integrate MLS module with scrape_schedules.py</name>
<files>Scripts/scrape_schedules.py</files>
<action>
Update scrape_schedules.py to use the new mls.py module:
1. Add import at top (with try/except pattern):
- from mls import MLS_TEAMS, get_mls_team_abbrev, scrape_mls_stadiums, MLS_STADIUM_SOURCES
2. Remove inline MLS_TEAMS dict (lines ~93-124) - now imported from mls.py
3. Update get_team_abbrev() function to use get_mls_team_abbrev() for MLS
4. Update scrape_mls_stadiums_gavinr() to be a secondary source (keep it, but mls.py hardcoded is primary)
5. Update the stadium scraping section to use scrape_mls_stadiums() from mls.py
6. Verify MLS games scraping still works (uses MLS_TEAMS for abbreviation lookup)
Do NOT remove the game scraping functions (scrape_mls_fbref, etc.) - those stay inline for now.
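Illustrative sketch only: the primary/secondary ordering in step 4 could be expressed as an ordered source list that is tried until one returns data. `first_successful`, `hardcoded_source`, and `secondary_source` are hypothetical names; the real MLS_STADIUM_SOURCES structure in this repo may differ.

```python
from typing import Callable


def first_successful(sources: list) -> tuple:
    """Try each (name, fetch) pair in order; return the first non-empty result."""
    for name, fetch in sources:
        try:
            result = fetch()
        except Exception as exc:  # a failing source should not abort the chain
            print(f"  {name} failed: {exc}")
            continue
        if result:
            return name, result
    return "", []


# Hypothetical stand-ins for the real scrapers.
def hardcoded_source() -> list:
    return ["stadium-from-hardcoded-dict"]


def secondary_source() -> list:
    raise RuntimeError("network unavailable")


EXAMPLE_SOURCES: list = [
    ("hardcoded", hardcoded_source),   # primary: always available offline
    ("secondary", secondary_source),   # fallback: only reached if primary is empty
]
```

Listing the hardcoded source first means a network failure in the secondary scraper never affects the result unless the hardcoded data is empty.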
</action>
<verify>cd Scripts && python3 -c "from scrape_schedules import MLS_TEAMS, get_team_abbrev; print(f'MLS teams: {len(MLS_TEAMS)}'); abbrev = get_team_abbrev('Atlanta United FC', 'MLS'); print(f'ATL United abbrev: {abbrev}'); assert abbrev == 'ATL'"</verify>
<done>scrape_schedules.py imports MLS_TEAMS from mls.py, get_team_abbrev works for MLS, inline MLS_TEAMS removed</done>
</task>
</tasks>
<verification>
Before declaring plan complete:
- [ ] mls.py exists with complete module structure
- [ ] All 30 MLS stadiums have capacity > 0 and year_opened values
- [ ] scrape_schedules.py imports from mls.py successfully
- [ ] `python3 Scripts/scrape_schedules.py --stadiums-update` includes MLS stadiums with complete data
</verification>
<success_criteria>
- mls.py module created following established pattern
- 30 MLS stadiums with complete data (capacity, year_opened, coordinates)
- scrape_schedules.py integration works
- No import errors when running pipeline
</success_criteria>
<output>
After completion, create `.planning/phases/2.1-add-stadium-data-mls-wnba-nwsl-cbb/02.1-01-SUMMARY.md`
</output>


@@ -0,0 +1,128 @@
---
phase: 2.1-additional-sports-stadiums
plan: 02
type: execute
---
<objective>
Create WNBA sport module with complete hardcoded stadium data.
Purpose: Enable WNBA stadium data to flow through the canonicalization pipeline.
Output: wnba.py module with 13 arenas including capacity, year_opened, and coordinates.
</objective>
<execution_context>
~/.claude/get-shit-done/workflows/execute-phase.md
~/.claude/get-shit-done/templates/summary.md
</execution_context>
<context>
@.planning/PROJECT.md
@.planning/ROADMAP.md
@.planning/STATE.md
# Prior plan in this phase:
@.planning/phases/2.1-add-stadium-data-mls-wnba-nwsl-cbb/02.1-01-SUMMARY.md
# Pattern reference:
@Scripts/mlb.py
@Scripts/mls.py (created in 02.1-01)
# Current WNBA data:
@Scripts/scrape_schedules.py (WNBA_TEAMS dict at line 77)
# NBA arenas (many shared with WNBA):
@Scripts/nba.py
# Core module:
@Scripts/core.py
**Tech stack available:** Python 3, dataclasses, requests
**Established patterns:** Sport module structure from mlb.py, mls.py
**Key insight:** Many WNBA teams share arenas with NBA teams - can reference nba.py hardcoded data for coordinates/capacity
</context>
<tasks>
<task type="auto">
<name>Task 1: Create wnba.py module with complete stadium data</name>
<files>Scripts/wnba.py</files>
<action>
Create wnba.py following the established pattern:
1. Module docstring and imports (try/except for core imports)
2. __all__ exports list
3. WNBA_TEAMS dict (copy from scrape_schedules.py, 13 teams)
4. get_wnba_team_abbrev() function
5. Hardcoded WNBA arenas dict with COMPLETE data:
WNBA Teams and Arenas (2025 season - 13 teams):
- ATL: Atlanta Dream → Gateway Center Arena (College Park, GA) - WNBA-specific, ~3,500 capacity, opened 2019
- CHI: Chicago Sky → Wintrust Arena (Chicago, IL) - WNBA-specific, ~10,387 capacity, opened 2017
- CON: Connecticut Sun → Mohegan Sun Arena (Uncasville, CT) - ~10,000 capacity, opened 2001
- DAL: Dallas Wings → College Park Center (Arlington, TX) - ~7,000 capacity, opened 2012
- GSV: Golden State Valkyries → Chase Center (San Francisco, CA) - shared with NBA Warriors, ~18,064, opened 2019
- IND: Indiana Fever → Gainbridge Fieldhouse (Indianapolis, IN) - shared with NBA Pacers, ~17,923, opened 1999
- LVA: Las Vegas Aces → Michelob Ultra Arena (Las Vegas, NV) - ~12,000 capacity, opened 1999 (as Mandalay Bay Events Center)
- LA: Los Angeles Sparks → Crypto.com Arena (Los Angeles, CA) - shared with NBA Lakers (Clippers moved to Intuit Dome in 2024), ~19,079, opened 1999
- MIN: Minnesota Lynx → Target Center (Minneapolis, MN) - shared with NBA Timberwolves, ~18,978, opened 1990
- NY: New York Liberty → Barclays Center (Brooklyn, NY) - shared with NBA Nets, ~17,732, opened 2012
- PHO: Phoenix Mercury → Footprint Center (Phoenix, AZ) - shared with NBA Suns, ~17,071, opened 1992
- SEA: Seattle Storm → Climate Pledge Arena (Seattle, WA) - shared with NHL Kraken, ~17,100, opened 1962 (renovated 2021)
- WAS: Washington Mystics → Entertainment & Sports Arena (Washington, DC) - WNBA-specific, ~4,200, opened 2018
6. scrape_wnba_stadiums_hardcoded() function returning list[Stadium]
7. scrape_wnba_stadiums() function with fallback sources
8. WNBA_STADIUM_SOURCES configuration
Note: Use WNBA-specific capacity where different from NBA configuration.
Cross-reference nba.py for shared arena coordinates.
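Illustrative sketch only: shared arenas could reuse the NBA entry's coordinates and year_opened while overriding tenants and, where needed, capacity. The entry shape in `EXAMPLE_NBA_ARENAS` and the helper `shared_arena` are assumptions, not the actual nba.py structure; values are placeholders.

```python
from typing import Optional

# Hypothetical shape for an nba.py arena entry; values are placeholders.
EXAMPLE_NBA_ARENAS = {
    "Example Arena": {
        "city": "Example City", "state": "EX",
        "lat": 1.5, "lng": -2.5,
        "capacity": 18000, "teams": ["EXN"], "year_opened": 2000,
    },
}


def shared_arena(nba_arenas: dict, name: str, teams: list,
                 capacity: Optional[int] = None) -> dict:
    """Copy an NBA arena entry for WNBA use, optionally overriding capacity."""
    entry = dict(nba_arenas[name])   # shallow copy so the NBA entry is untouched
    entry["teams"] = list(teams)     # replace NBA tenants with the WNBA tenant
    if capacity is not None:
        entry["capacity"] = capacity  # WNBA-specific configuration
    return entry
```

Copying rather than referencing the NBA entry keeps the two modules independent, so a later edit to WNBA tenants or capacity cannot leak back into nba.py data.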
</action>
<verify>python3 -c "from Scripts.wnba import WNBA_TEAMS, scrape_wnba_stadiums_hardcoded; s = scrape_wnba_stadiums_hardcoded(); print(f'{len(s)} arenas'); assert len(s) == 13; assert all(st.capacity > 0 for st in s); assert all(st.year_opened for st in s)"</verify>
<done>wnba.py exists with 13 teams, 13 arenas, all with non-zero capacity and year_opened values</done>
</task>
<task type="auto">
<name>Task 2: Integrate WNBA module with scrape_schedules.py</name>
<files>Scripts/scrape_schedules.py</files>
<action>
Update scrape_schedules.py to use the new wnba.py module:
1. Add import at top (with try/except pattern):
- from wnba import WNBA_TEAMS, get_wnba_team_abbrev, scrape_wnba_stadiums, WNBA_STADIUM_SOURCES
2. Remove inline WNBA_TEAMS dict (lines ~77-91) - now imported from wnba.py
3. Update get_team_abbrev() function to use get_wnba_team_abbrev() for WNBA
4. Update scrape_wnba_stadiums() stub function to use the new module's implementation
5. Verify WNBA games scraping still works
Do NOT remove the game scraping functions - those stay inline for now.
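Illustrative sketch only: step 3 could route each sport to its own lookup through a small dispatch table. The team-dict shape (`abbrev -> {"name": ...}`) is an assumption about WNBA_TEAMS/NWSL_TEAMS, and `make_abbrev_lookup` and `dispatch_team_abbrev` are hypothetical names, not the existing get_team_abbrev() implementation.

```python
from typing import Callable, Optional

# Assumed shape: abbreviation -> team info with a "name" key.
EXAMPLE_WNBA_TEAMS = {"LVA": {"name": "Las Vegas Aces"}}
EXAMPLE_NWSL_TEAMS = {"POR": {"name": "Portland Thorns FC"}}


def make_abbrev_lookup(teams: dict) -> Callable[[str], Optional[str]]:
    """Build a case-insensitive full-name -> abbreviation lookup for one sport."""
    by_name = {info["name"].lower(): abbrev for abbrev, info in teams.items()}

    def lookup(team_name: str) -> Optional[str]:
        return by_name.get(team_name.strip().lower())

    return lookup


# One lookup per sport; adding a sport means adding one entry here.
LOOKUPS = {
    "WNBA": make_abbrev_lookup(EXAMPLE_WNBA_TEAMS),
    "NWSL": make_abbrev_lookup(EXAMPLE_NWSL_TEAMS),
}


def dispatch_team_abbrev(team_name: str, sport: str) -> Optional[str]:
    """Return the abbreviation for team_name in sport, or None if unknown."""
    lookup = LOOKUPS.get(sport.upper())
    return lookup(team_name) if lookup else None
```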
</action>
<verify>cd Scripts && python3 -c "from scrape_schedules import WNBA_TEAMS, get_team_abbrev; print(f'WNBA teams: {len(WNBA_TEAMS)}'); abbrev = get_team_abbrev('Las Vegas Aces', 'WNBA'); print(f'Aces abbrev: {abbrev}'); assert abbrev == 'LVA'"</verify>
<done>scrape_schedules.py imports WNBA_TEAMS from wnba.py, get_team_abbrev works for WNBA, inline WNBA_TEAMS removed</done>
</task>
</tasks>
<verification>
Before declaring plan complete:
- [ ] wnba.py exists with complete module structure
- [ ] All 13 WNBA arenas have capacity > 0 and year_opened values
- [ ] scrape_schedules.py imports from wnba.py successfully
- [ ] No import errors when running pipeline
</verification>
<success_criteria>
- wnba.py module created following established pattern
- 13 WNBA arenas with complete data (capacity, year_opened, coordinates)
- scrape_schedules.py integration works
- Shared NBA arenas have correct coordinates
</success_criteria>
<output>
After completion, create `.planning/phases/2.1-add-stadium-data-mls-wnba-nwsl-cbb/02.1-02-SUMMARY.md`
</output>


@@ -0,0 +1,140 @@
---
phase: 2.1-additional-sports-stadiums
plan: 03
type: execute
---
<objective>
Create NWSL sport module with complete hardcoded stadium data.
Purpose: Enable NWSL stadium data to flow through the canonicalization pipeline.
Output: nwsl.py module with 13+ stadiums including capacity, year_opened, and coordinates.
</objective>
<execution_context>
~/.claude/get-shit-done/workflows/execute-phase.md
~/.claude/get-shit-done/templates/summary.md
</execution_context>
<context>
@.planning/PROJECT.md
@.planning/ROADMAP.md
@.planning/STATE.md
# Prior plans in this phase:
@.planning/phases/2.1-add-stadium-data-mls-wnba-nwsl-cbb/02.1-01-SUMMARY.md
@.planning/phases/2.1-add-stadium-data-mls-wnba-nwsl-cbb/02.1-02-SUMMARY.md
# Pattern reference:
@Scripts/mlb.py
@Scripts/mls.py (created in 02.1-01)
@Scripts/wnba.py (created in 02.1-02)
# Current NWSL data:
@Scripts/scrape_schedules.py (NWSL_TEAMS dict at line 126)
# MLS stadiums (some shared with NWSL):
@Scripts/mls.py
# Core module:
@Scripts/core.py
**Tech stack available:** Python 3, dataclasses, requests
**Established patterns:** Sport module structure from mlb.py, mls.py, wnba.py
**Key insight:** Several NWSL teams share stadiums with MLS teams - can reference mls.py hardcoded data
</context>
<tasks>
<task type="auto">
<name>Task 1: Create nwsl.py module with complete stadium data</name>
<files>Scripts/nwsl.py</files>
<action>
Create nwsl.py following the established pattern:
1. Module docstring and imports (try/except for core imports)
2. __all__ exports list
3. NWSL_TEAMS dict (copy from scrape_schedules.py, which lists 13 teams - verify against the current 14-team roster and add any missing team)
4. get_nwsl_team_abbrev() function
5. Hardcoded NWSL stadiums dict with COMPLETE data:
NWSL Teams and Stadiums (2025 season - 14 teams as of expansion):
- LA: Angel City FC → BMO Stadium (Los Angeles, CA) - shared with LAFC, ~22,000, opened 2018
- SJ: Bay FC → PayPal Park (San Jose, CA) - shared with SJ Earthquakes, ~18,000, opened 2015
- CHI: Chicago Stars FC (rebranded from Chicago Red Stars for 2025) → SeatGeek Stadium (Bridgeview, IL) - ~20,000 capacity, opened 2006
- HOU: Houston Dash → Shell Energy Stadium (Houston, TX) - shared with Houston Dynamo, ~22,039, opened 2012
- KC: Kansas City Current → CPKC Stadium (Kansas City, MO) - NWSL-specific, ~11,500, opened 2024
- NJ: NJ/NY Gotham FC → Red Bull Arena (Harrison, NJ) - shared with NY Red Bulls, ~25,000, opened 2010
- NC: North Carolina Courage → WakeMed Soccer Park (Cary, NC) - ~10,000, opened 2002
- ORL: Orlando Pride → Inter&Co Stadium (Orlando, FL) - shared with Orlando City SC, ~25,500, opened 2017
- POR: Portland Thorns FC → Providence Park (Portland, OR) - shared with Portland Timbers, ~25,218, opened 1926 (renovated 2019)
- SEA: Seattle Reign FC → Lumen Field (Seattle, WA) - shared with Sounders/Seahawks, ~69,000, opened 2002
- SD: San Diego Wave FC → Snapdragon Stadium (San Diego, CA) - shared, ~35,000, opened 2022
- UTA: Utah Royals FC → America First Field (Sandy, UT) - shared with Real Salt Lake, ~20,213, opened 2008
- WAS: Washington Spirit → Audi Field (Washington, DC) - shared with DC United, ~20,000, opened 2018
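- LOU: Racing Louisville FC → Lynn Family Stadium (Louisville, KY) - ~11,700, opened 2020 (abbrev is an assumption - confirm against NWSL_TEAMS)
- Boston is NOT active in 2025: the Boston Breakers folded in 2018, and the Boston expansion club is not scheduled to start play until 2026 - do not add a BOS entry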
Cross-reference mls.py for shared stadium coordinates and verify current league membership.
6. scrape_nwsl_stadiums_hardcoded() function returning list[Stadium]
7. scrape_nwsl_stadiums() function with fallback sources
8. NWSL_STADIUM_SOURCES configuration
Note: NWSL has had expansion and contraction - verify current team roster matches actual 2025 season.
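Illustrative sketch only: the roster verification called for above could include a mechanical cross-check that every team has a stadium and every stadium tenant is on the roster. `roster_mismatches` is a hypothetical helper, and the dict shapes (`abbrev -> info`, stadium entries with a `"teams"` list) are assumptions about the module data.

```python
def roster_mismatches(teams: dict, stadiums: dict) -> tuple:
    """Return (teams with no stadium, stadium tenants missing from the roster)."""
    rostered = set(teams)
    housed = {abbrev for info in stadiums.values() for abbrev in info["teams"]}
    return rostered - housed, housed - rostered


# Placeholder data using the assumed shapes.
EXAMPLE_TEAMS = {"AAA": {"name": "Team A"}, "BBB": {"name": "Team B"}}
EXAMPLE_STADIUMS = {
    "Park One": {"teams": ["AAA"]},
    "Park Two": {"teams": ["CCC"]},  # tenant not in the roster
}

missing_stadium, unknown_tenant = roster_mismatches(EXAMPLE_TEAMS, EXAMPLE_STADIUMS)
```

This only confirms the two dicts agree with each other; whether the roster matches the actual 2025 league membership still has to be checked against an external source.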
</action>
<verify>python3 -c "from Scripts.nwsl import NWSL_TEAMS, scrape_nwsl_stadiums_hardcoded; s = scrape_nwsl_stadiums_hardcoded(); print(f'{len(s)} stadiums'); print(f'{len(NWSL_TEAMS)} teams'); assert all(st.capacity > 0 for st in s); assert all(st.year_opened for st in s)"</verify>
<done>nwsl.py exists with current NWSL teams, all stadiums with non-zero capacity and year_opened values</done>
</task>
<task type="auto">
<name>Task 2: Integrate NWSL module and finalize phase</name>
<files>Scripts/scrape_schedules.py</files>
<action>
Update scrape_schedules.py to use the new nwsl.py module:
1. Add import at top (with try/except pattern):
- from nwsl import NWSL_TEAMS, get_nwsl_team_abbrev, scrape_nwsl_stadiums, NWSL_STADIUM_SOURCES
2. Remove inline NWSL_TEAMS dict (lines ~126-140) - now imported from nwsl.py
3. Update get_team_abbrev() function to use get_nwsl_team_abbrev() for NWSL
4. Update scrape_nwsl_stadiums() stub function to use the new module's implementation
5. Verify NWSL games scraping still works
6. Run full stadium update to verify all 3 new sports integrate:
python3 scrape_schedules.py --stadiums-update
Do NOT remove the game scraping functions - those stay inline for now.
</action>
<verify>cd Scripts && python3 -c "from scrape_schedules import NWSL_TEAMS, get_team_abbrev; print(f'NWSL teams: {len(NWSL_TEAMS)}'); abbrev = get_team_abbrev('Portland Thorns FC', 'NWSL'); print(f'Thorns abbrev: {abbrev}'); assert abbrev == 'POR'"</verify>
<done>scrape_schedules.py imports NWSL_TEAMS from nwsl.py, get_team_abbrev works for NWSL, all 3 secondary sport modules integrated</done>
</task>
</tasks>
<verification>
Before declaring plan complete:
- [ ] nwsl.py exists with complete module structure
- [ ] All NWSL stadiums have capacity > 0 and year_opened values
- [ ] scrape_schedules.py imports from nwsl.py successfully
- [ ] `python3 Scripts/scrape_schedules.py --stadiums-update` includes MLS, WNBA, and NWSL stadiums
- [ ] No import errors when running pipeline
</verification>
<success_criteria>
- nwsl.py module created following established pattern
- All NWSL stadiums with complete data (capacity, year_opened, coordinates)
- scrape_schedules.py integration works for all 3 new sports
- Phase 2.1 complete (MLS, WNBA, NWSL modules created)
- CBB deferred to future phase (documented in summary)
</success_criteria>
<output>
After completion, create `.planning/phases/2.1-add-stadium-data-mls-wnba-nwsl-cbb/02.1-03-SUMMARY.md` with:
- Summary of all 3 plans (MLS, WNBA, NWSL modules)
- Note that CBB was deferred (350+ D1 teams requires separate scoped phase)
- Phase 2.1 complete status
</output>