# WNBA, MLS, and NWSL Implementation Guide
Complete end-to-end implementation for adding WNBA, MLS, and NWSL to SportsTime.
---
## 1. League Overview
### WNBA (Women's National Basketball Association)
- **Teams**: 13 (expanding to 15 by 2026)
- **Season**: May - September (regular season), September - October (playoffs)
- **Game Cadence**: ~40 games per team, 3-4 games per week
- **Special Considerations**:
  - Many teams share arenas with NBA teams (key for stadium handling)
  - Olympic break in summer every 4 years
  - Commissioner's Cup midseason tournament
**Shared Venues (WNBA/NBA)**:
| WNBA Team | NBA Team | Arena |
|-----------|----------|-------|
| Atlanta Dream | Hawks | Gateway Center Arena (different) |
| Chicago Sky | Bulls | Wintrust Arena (different) |
| Dallas Wings | Mavericks | College Park Center (different) |
| Indiana Fever | Pacers | Gainbridge Fieldhouse |
| Los Angeles Sparks | Lakers/Clippers | Crypto.com Arena |
| Minnesota Lynx | Timberwolves | Target Center |
| New York Liberty | Nets | Barclays Center |
| Phoenix Mercury | Suns | Footprint Center |
| Washington Mystics | Wizards | Entertainment & Sports Arena (different) |
### MLS (Major League Soccer)
- **Teams**: 29 teams (2024), expanding
- **Season**: February/March - October (regular season), October - December (playoffs)
- **Game Cadence**: 34 games per team, 1-2 games per week
- **Special Considerations**:
  - Some teams share NFL stadiums (Atlanta, Seattle, New England)
  - Midweek matches (Wednesday/Thursday) are common
  - US Open Cup adds additional games
  - Canadian teams (Toronto, Vancouver, Montreal) require timezone handling
**Shared Venues (MLS/NFL)**:
| MLS Team | NFL Team | Stadium |
|----------|----------|---------|
| Atlanta United | Falcons | Mercedes-Benz Stadium |
| Seattle Sounders | Seahawks | Lumen Field |
| New England Revolution | Patriots | Gillette Stadium |
### NWSL (National Women's Soccer League)
- **Teams**: 14 teams (2024)
- **Season**: March - November (regular season + playoffs)
- **Game Cadence**: 26 games per team, 1-2 games per week
- **Special Considerations**:
  - Some teams share MLS stadiums (Portland, Orlando, Houston)
  - Many use smaller soccer-specific venues
  - Expansion teams are added frequently
---
## 2. Schedule & Data Sources
### WNBA Data Sources
**Primary: Basketball-Reference (Women)**
```
URL Pattern: https://www.basketball-reference.com/wnba/years/{YEAR}_games.html
Example: https://www.basketball-reference.com/wnba/years/2025_games.html
```
**HTML Structure**:
```html
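<tr>
  <th data-stat="date_game">Fri, May 17, 2024</th>
  <td data-stat="game_start_time">7:30p</td>
  <td data-stat="visitor_team_name">Dallas Wings</td>
  <td data-stat="home_team_name">Atlanta Dream</td>
  <td data-stat="arena_name">Gateway Center Arena</td>
</tr>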
```
**Fields Available**: date, time, home_team, away_team, arena, attendance, box_score_link
**Secondary: ESPN WNBA**
```
URL Pattern: https://www.espn.com/wnba/schedule/_/date/{YYYYMMDD}
```
### MLS Data Sources
**Primary: FBref (Football Reference)**
```
URL Pattern: https://fbref.com/en/comps/22/{YEAR}/schedule/{YEAR}-Major-League-Soccer-Scores-and-Fixtures
Example: https://fbref.com/en/comps/22/2024/schedule/2024-Major-League-Soccer-Scores-and-Fixtures
```
**HTML Structure**:
```html
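<tr>
  <td data-stat="date">2024-02-24</td>
  <td data-stat="time">19:30</td>
  <td data-stat="home_team">LA Galaxy</td>
  <td data-stat="away_team">Inter Miami</td>
  <td data-stat="venue">Dignity Health Sports Park</td>
</tr>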
```
**Fields Available**: date, time (24hr), home_team, away_team, venue, score, attendance
**Secondary: MLS Official**
```
URL Pattern: https://www.mlssoccer.com/schedule/scores
API Endpoint: https://sportapi.mlssoccer.com/api/matches?culture=en-us&dateFrom={date}&dateTo={date}
```
### NWSL Data Sources
**Primary: FBref (NWSL)**
```
URL Pattern: https://fbref.com/en/comps/182/{YEAR}/schedule/{YEAR}-NWSL-Scores-and-Fixtures
Example: https://fbref.com/en/comps/182/2024/schedule/2024-NWSL-Scores-and-Fixtures
```
**HTML Structure**: Same as MLS (FBref standard format)
**Secondary: NWSL Official**
```
URL Pattern: https://www.nwslsoccer.com/schedule
```
---
## 3. Schedule Parser Changes
### File: `Scripts/scrape_schedules.py`
#### 3.1 Add Team Mappings (after NHL_TEAMS ~line 180)
```python
# =============================================================================
# WNBA TEAMS
# =============================================================================
WNBA_TEAMS = {
    'ATL': {'name': 'Atlanta Dream', 'city': 'Atlanta', 'arena': 'Gateway Center Arena'},
    'CHI': {'name': 'Chicago Sky', 'city': 'Chicago', 'arena': 'Wintrust Arena'},
    'CON': {'name': 'Connecticut Sun', 'city': 'Uncasville', 'arena': 'Mohegan Sun Arena'},
    'DAL': {'name': 'Dallas Wings', 'city': 'Arlington', 'arena': 'College Park Center'},
    'IND': {'name': 'Indiana Fever', 'city': 'Indianapolis', 'arena': 'Gainbridge Fieldhouse'},
    'LVA': {'name': 'Las Vegas Aces', 'city': 'Las Vegas', 'arena': 'Michelob Ultra Arena'},
    'LAS': {'name': 'Los Angeles Sparks', 'city': 'Los Angeles', 'arena': 'Crypto.com Arena'},
    'MIN': {'name': 'Minnesota Lynx', 'city': 'Minneapolis', 'arena': 'Target Center'},
    'NYL': {'name': 'New York Liberty', 'city': 'Brooklyn', 'arena': 'Barclays Center'},
    'PHO': {'name': 'Phoenix Mercury', 'city': 'Phoenix', 'arena': 'Footprint Center'},
    'SEA': {'name': 'Seattle Storm', 'city': 'Seattle', 'arena': 'Climate Pledge Arena'},
    'WAS': {'name': 'Washington Mystics', 'city': 'Washington', 'arena': 'Entertainment & Sports Arena'},
    # Expansion teams (add as announced)
    'GSV': {'name': 'Golden State Valkyries', 'city': 'San Francisco', 'arena': 'Chase Center'},
    'POR': {'name': 'Portland Expansion', 'city': 'Portland', 'arena': 'TBD'},
    'TOR': {'name': 'Toronto Expansion', 'city': 'Toronto', 'arena': 'TBD'},
}
# =============================================================================
# MLS TEAMS
# =============================================================================
MLS_TEAMS = {
    'ATL': {'name': 'Atlanta United FC', 'city': 'Atlanta', 'stadium': 'Mercedes-Benz Stadium'},
    'AUS': {'name': 'Austin FC', 'city': 'Austin', 'stadium': 'Q2 Stadium'},
    'CHI': {'name': 'Chicago Fire FC', 'city': 'Chicago', 'stadium': 'Soldier Field'},
    'CIN': {'name': 'FC Cincinnati', 'city': 'Cincinnati', 'stadium': 'TQL Stadium'},
    'CLB': {'name': 'Columbus Crew', 'city': 'Columbus', 'stadium': 'Lower.com Field'},
    'CLT': {'name': 'Charlotte FC', 'city': 'Charlotte', 'stadium': 'Bank of America Stadium'},
    'COL': {'name': 'Colorado Rapids', 'city': 'Commerce City', 'stadium': "Dick's Sporting Goods Park"},
    'DAL': {'name': 'FC Dallas', 'city': 'Frisco', 'stadium': 'Toyota Stadium'},
    'DCU': {'name': 'D.C. United', 'city': 'Washington', 'stadium': 'Audi Field'},
    'HOU': {'name': 'Houston Dynamo FC', 'city': 'Houston', 'stadium': 'Shell Energy Stadium'},
    'LAG': {'name': 'LA Galaxy', 'city': 'Carson', 'stadium': 'Dignity Health Sports Park'},
    'LAF': {'name': 'Los Angeles FC', 'city': 'Los Angeles', 'stadium': 'BMO Stadium'},
    'MIA': {'name': 'Inter Miami CF', 'city': 'Fort Lauderdale', 'stadium': 'Chase Stadium'},
    'MIN': {'name': 'Minnesota United FC', 'city': 'Saint Paul', 'stadium': 'Allianz Field'},
    'MTL': {'name': 'CF Montréal', 'city': 'Montreal', 'stadium': 'Stade Saputo'},
    'NSH': {'name': 'Nashville SC', 'city': 'Nashville', 'stadium': 'Geodis Park'},
    'NER': {'name': 'New England Revolution', 'city': 'Foxborough', 'stadium': 'Gillette Stadium'},
    'NYC': {'name': 'New York City FC', 'city': 'New York', 'stadium': 'Yankee Stadium'},
    'NYR': {'name': 'New York Red Bulls', 'city': 'Harrison', 'stadium': 'Red Bull Arena'},
    'ORL': {'name': 'Orlando City SC', 'city': 'Orlando', 'stadium': 'Exploria Stadium'},
    'PHI': {'name': 'Philadelphia Union', 'city': 'Chester', 'stadium': 'Subaru Park'},
    'POR': {'name': 'Portland Timbers', 'city': 'Portland', 'stadium': 'Providence Park'},
    'RSL': {'name': 'Real Salt Lake', 'city': 'Sandy', 'stadium': 'America First Field'},
    'SJE': {'name': 'San Jose Earthquakes', 'city': 'San Jose', 'stadium': 'PayPal Park'},
    'SEA': {'name': 'Seattle Sounders FC', 'city': 'Seattle', 'stadium': 'Lumen Field'},
    'SKC': {'name': 'Sporting Kansas City', 'city': 'Kansas City', 'stadium': "Children's Mercy Park"},
    'STL': {'name': 'St. Louis City SC', 'city': 'St. Louis', 'stadium': 'CityPark'},
    'TOR': {'name': 'Toronto FC', 'city': 'Toronto', 'stadium': 'BMO Field'},
    'VAN': {'name': 'Vancouver Whitecaps FC', 'city': 'Vancouver', 'stadium': 'BC Place'},
    'SDG': {'name': 'San Diego FC', 'city': 'San Diego', 'stadium': 'Snapdragon Stadium'},  # 2025 expansion
}
# =============================================================================
# NWSL TEAMS
# =============================================================================
NWSL_TEAMS = {
    'ANG': {'name': 'Angel City FC', 'city': 'Los Angeles', 'stadium': 'BMO Stadium'},
    'CHI': {'name': 'Chicago Red Stars', 'city': 'Chicago', 'stadium': 'SeatGeek Stadium'},
    'HOU': {'name': 'Houston Dash', 'city': 'Houston', 'stadium': 'Shell Energy Stadium'},
    'KCC': {'name': 'Kansas City Current', 'city': 'Kansas City', 'stadium': 'CPKC Stadium'},
    'LOU': {'name': 'Racing Louisville FC', 'city': 'Louisville', 'stadium': 'Lynn Family Stadium'},
    'NCC': {'name': 'North Carolina Courage', 'city': 'Cary', 'stadium': 'WakeMed Soccer Park'},
    'NJG': {'name': 'NJ/NY Gotham FC', 'city': 'Harrison', 'stadium': 'Red Bull Arena'},
    'ORL': {'name': 'Orlando Pride', 'city': 'Orlando', 'stadium': 'Exploria Stadium'},
    'POR': {'name': 'Portland Thorns FC', 'city': 'Portland', 'stadium': 'Providence Park'},
    'SDW': {'name': 'San Diego Wave FC', 'city': 'San Diego', 'stadium': 'Snapdragon Stadium'},
    'SEA': {'name': 'Seattle Reign FC', 'city': 'Seattle', 'stadium': 'Lumen Field'},
    'UTA': {'name': 'Utah Royals FC', 'city': 'Sandy', 'stadium': 'America First Field'},
    'WAS': {'name': 'Washington Spirit', 'city': 'Washington', 'stadium': 'Audi Field'},
    # PayPal Park is in San Jose (same venue as the MLS Earthquakes entry)
    'BAY': {'name': 'Bay FC', 'city': 'San Jose', 'stadium': 'PayPal Park'},  # 2024 expansion
}
```
#### 3.2 Update `get_team_abbrev()` Function
```python
def get_team_abbrev(team_name: str, sport: str) -> str:
    """Get team abbreviation from full name."""
    team_maps = {
        'NBA': NBA_TEAMS,
        'MLB': MLB_TEAMS,
        'NHL': NHL_TEAMS,
        'WNBA': WNBA_TEAMS,
        'MLS': MLS_TEAMS,
        'NWSL': NWSL_TEAMS,
    }
    teams = team_maps.get(sport, {})
    needle = team_name.strip().lower()
    if not needle:
        # An empty string is a substring of every name, so bail out early
        return ''

    # Pass 1: exact match on the full team name
    for abbrev, data in teams.items():
        if needle == data['name'].lower():
            return abbrev

    # Pass 2: partial match (e.g., "Hawks" matches "Atlanta Hawks").
    # Runs only after pass 1 so an exact match always wins.
    for abbrev, data in teams.items():
        if needle in data['name'].lower():
            return abbrev

    # Fallback: first 3 characters
    return team_name.strip()[:3].upper()
```
#### 3.3 Add WNBA Scraper
```python
def scrape_wnba_basketball_reference(season: int) -> list[Game]:
    """
    Scrape WNBA schedule from Basketball-Reference.
    URL: https://www.basketball-reference.com/wnba/years/{YEAR}_games.html
    Season year is the calendar year (e.g., 2025 for 2025 season)
    """
    games = []
    url = f"https://www.basketball-reference.com/wnba/years/{season}_games.html"
    print(f"Scraping WNBA {season} from Basketball-Reference...")
    soup = fetch_page(url, 'basketball-reference.com')
    if not soup:
        return games
    table = soup.find('table', {'id': 'schedule'})
    if not table:
        print(" No schedule table found")
        return games
    tbody = table.find('tbody')
    if not tbody:
        return games
    for row in tbody.find_all('tr'):
        # Skip header rows repeated inside the table body
        if 'thead' in (row.get('class') or []):
            continue
        try:
            # Parse date
            date_cell = row.find('th', {'data-stat': 'date_game'})
            if not date_cell:
                continue
            date_link = date_cell.find('a')
            date_str = date_link.text if date_link else date_cell.text
            # Parse time
            time_cell = row.find('td', {'data-stat': 'game_start_time'})
            time_str = time_cell.text.strip() if time_cell else None
            # Parse teams
            visitor_cell = row.find('td', {'data-stat': 'visitor_team_name'})
            home_cell = row.find('td', {'data-stat': 'home_team_name'})
            if not visitor_cell or not home_cell:
                continue
            away_team = visitor_cell.find('a').text if visitor_cell.find('a') else visitor_cell.text
            home_team = home_cell.find('a').text if home_cell.find('a') else home_cell.text
            # Parse arena
            arena_cell = row.find('td', {'data-stat': 'arena_name'})
            arena = arena_cell.text.strip() if arena_cell else ''
            # Convert date (format: "Sat, May 18, 2024")
            try:
                parsed_date = datetime.strptime(date_str.strip(), '%a, %b %d, %Y')
                date_formatted = parsed_date.strftime('%Y-%m-%d')
            except ValueError:
                # Not a game row (unparseable date), skip it
                continue
            # Generate a provisional game ID
            home_abbrev = get_team_abbrev(home_team, 'WNBA')
            away_abbrev = get_team_abbrev(away_team, 'WNBA')
            game_id = f"wnba_{date_formatted}_{away_abbrev}_{home_abbrev}".lower()
            game = Game(
                id=game_id,
                sport='WNBA',
                season=str(season),
                date=date_formatted,
                time=time_str,
                home_team=home_team,
                away_team=away_team,
                home_team_abbrev=home_abbrev,
                away_team_abbrev=away_abbrev,
                venue=arena,
                source='basketball-reference.com'
            )
            games.append(game)
        except Exception as e:
            print(f" Error parsing row: {e}")
            continue
    print(f" Found {len(games)} games from Basketball-Reference")
    return games
```
#### 3.4 Add MLS Scraper
```python
def scrape_mls_fbref(season: int) -> list[Game]:
    """
    Scrape MLS schedule from FBref.
    URL: https://fbref.com/en/comps/22/{YEAR}/schedule/{YEAR}-Major-League-Soccer-Scores-and-Fixtures
    """
    games = []
    url = f"https://fbref.com/en/comps/22/{season}/schedule/{season}-Major-League-Soccer-Scores-and-Fixtures"
    print(f"Scraping MLS {season} from FBref...")
    soup = fetch_page(url, 'fbref.com')
    if not soup:
        return games
    # FBref uses table with id like sched_{year}_22_1
    table = soup.find('table', {'id': lambda x: x and 'sched_' in x})
    if not table:
        print(" No schedule table found")
        return games
    tbody = table.find('tbody')
    if not tbody:
        return games
    for row in tbody.find_all('tr'):
        try:
            # Parse date (format: 2024-02-24)
            date_cell = row.find('td', {'data-stat': 'date'})
            if not date_cell:
                continue
            date_str = date_cell.text.strip()
            if not date_str:
                continue  # Spacer row
            # Parse time (24hr format: 19:30)
            time_cell = row.find('td', {'data-stat': 'time'})
            time_str = time_cell.text.strip() if time_cell else None
            # Convert 24hr to the 12hr style used by the WNBA source (e.g., "7:30p")
            if time_str:
                try:
                    t = datetime.strptime(time_str, '%H:%M')
                    time_str = t.strftime('%I:%M%p').lstrip('0').lower().rstrip('m')
                except ValueError:
                    pass  # Keep the raw string if it is not plain HH:MM
            # Parse teams
            home_cell = row.find('td', {'data-stat': 'home_team'})
            away_cell = row.find('td', {'data-stat': 'away_team'})
            if not home_cell or not away_cell:
                continue
            home_team = home_cell.find('a').text if home_cell.find('a') else home_cell.text
            away_team = away_cell.find('a').text if away_cell.find('a') else away_cell.text
            home_team = home_team.strip()
            away_team = away_team.strip()
            if not home_team or not away_team:
                continue
            # Parse venue
            venue_cell = row.find('td', {'data-stat': 'venue'})
            venue = venue_cell.text.strip() if venue_cell else ''
            # Generate a provisional game ID
            home_abbrev = get_team_abbrev(home_team, 'MLS')
            away_abbrev = get_team_abbrev(away_team, 'MLS')
            game_id = f"mls_{date_str}_{away_abbrev}_{home_abbrev}".lower()
            game = Game(
                id=game_id,
                sport='MLS',
                season=str(season),
                date=date_str,
                time=time_str,
                home_team=home_team,
                away_team=away_team,
                home_team_abbrev=home_abbrev,
                away_team_abbrev=away_abbrev,
                venue=venue,
                source='fbref.com'
            )
            games.append(game)
        except Exception as e:
            print(f" Error parsing row: {e}")
            continue
    print(f" Found {len(games)} games from FBref")
    return games
```
#### 3.5 Add NWSL Scraper
```python
def scrape_nwsl_fbref(season: int) -> list[Game]:
    """
    Scrape NWSL schedule from FBref.
    URL: https://fbref.com/en/comps/182/{YEAR}/schedule/{YEAR}-NWSL-Scores-and-Fixtures
    """
    games = []
    url = f"https://fbref.com/en/comps/182/{season}/schedule/{season}-NWSL-Scores-and-Fixtures"
    print(f"Scraping NWSL {season} from FBref...")
    soup = fetch_page(url, 'fbref.com')
    if not soup:
        return games
    table = soup.find('table', {'id': lambda x: x and 'sched_' in x})
    if not table:
        print(" No schedule table found")
        return games
    tbody = table.find('tbody')
    if not tbody:
        return games
    for row in tbody.find_all('tr'):
        try:
            # Parse date (format: 2024-03-16)
            date_cell = row.find('td', {'data-stat': 'date'})
            if not date_cell:
                continue
            date_str = date_cell.text.strip()
            if not date_str:
                continue  # Spacer row
            # Parse time and convert 24hr to the 12hr style (e.g., "7:30p")
            time_cell = row.find('td', {'data-stat': 'time'})
            time_str = time_cell.text.strip() if time_cell else None
            if time_str:
                try:
                    t = datetime.strptime(time_str, '%H:%M')
                    time_str = t.strftime('%I:%M%p').lstrip('0').lower().rstrip('m')
                except ValueError:
                    pass  # Keep the raw string if it is not plain HH:MM
            # Parse teams
            home_cell = row.find('td', {'data-stat': 'home_team'})
            away_cell = row.find('td', {'data-stat': 'away_team'})
            if not home_cell or not away_cell:
                continue
            home_team = home_cell.find('a').text if home_cell.find('a') else home_cell.text
            away_team = away_cell.find('a').text if away_cell.find('a') else away_cell.text
            home_team = home_team.strip()
            away_team = away_team.strip()
            if not home_team or not away_team:
                continue
            # Parse venue
            venue_cell = row.find('td', {'data-stat': 'venue'})
            venue = venue_cell.text.strip() if venue_cell else ''
            # Generate a provisional game ID
            home_abbrev = get_team_abbrev(home_team, 'NWSL')
            away_abbrev = get_team_abbrev(away_team, 'NWSL')
            game_id = f"nwsl_{date_str}_{away_abbrev}_{home_abbrev}".lower()
            game = Game(
                id=game_id,
                sport='NWSL',
                season=str(season),
                date=date_str,
                time=time_str,
                home_team=home_team,
                away_team=away_team,
                home_team_abbrev=home_abbrev,
                away_team_abbrev=away_abbrev,
                venue=venue,
                source='fbref.com'
            )
            games.append(game)
        except Exception as e:
            print(f" Error parsing row: {e}")
            continue
    print(f" Found {len(games)} games from FBref")
    return games
```
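The MLS and NWSL scrapers differ only in the FBref competition ID, the URL slug, and the sport label. As a rough sketch of how that difference could be captured in one table, the snippet below builds the schedule URL from a small config. `FBREF_COMPETITIONS` and `fbref_schedule_url` are hypothetical names introduced here for illustration; only the URL patterns come from Section 2.

```python
# Hypothetical config table: competition ID and URL slug per sport,
# taken from the FBref URL patterns listed in Section 2.
FBREF_COMPETITIONS = {
    "MLS": {"comp_id": 22, "slug": "Major-League-Soccer"},
    "NWSL": {"comp_id": 182, "slug": "NWSL"},
}


def fbref_schedule_url(sport: str, season: int) -> str:
    """Build the FBref schedule URL for a sport and season."""
    comp = FBREF_COMPETITIONS[sport]
    return (
        f"https://fbref.com/en/comps/{comp['comp_id']}/{season}/schedule/"
        f"{season}-{comp['slug']}-Scores-and-Fixtures"
    )


if __name__ == "__main__":
    print(fbref_schedule_url("MLS", 2024))
    print(fbref_schedule_url("NWSL", 2024))
```

With a table like this, a single row-parsing function could serve both leagues, at the cost of one more level of indirection.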
---
## 4. Stadium & Team Canonicalization
### 4.1 Canonical ID Patterns
**Stadiums** (per-sport, even for shared venues):
```
stadium_{sport}_{normalized_name}
```
Examples:
- `stadium_wnba_barclays_center` (WNBA Liberty)
- `stadium_nba_barclays_center` (NBA Nets)
- `stadium_mls_mercedes_benz_stadium`
- `stadium_nwsl_providence_park`
**Teams**:
```
team_{sport}_{abbrev}
```
Examples:
- `team_wnba_nyl` (New York Liberty)
- `team_mls_atl` (Atlanta United)
- `team_nwsl_por` (Portland Thorns)
**Games**:
```
game_{sport}_{season}_{date}_{away}_{home}
```
Examples:
- `game_wnba_2025_20250518_dal_atl`
- `game_mls_2025_20250301_mia_lag`
- `game_nwsl_2025_20250315_por_ang`
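The three patterns above can be sketched as small builder functions. This is a minimal illustration, not the pipeline's actual code: `normalize_name_sketch` is a hypothetical stand-in for the existing `normalize_name()` helper, whose exact rules (accents, apostrophes, punctuation) may differ.

```python
import re
import unicodedata


def normalize_name_sketch(name: str) -> str:
    """Assumed normalization: strip accents, lowercase, non-alphanumerics to '_'."""
    ascii_name = (
        unicodedata.normalize("NFKD", name).encode("ascii", "ignore").decode("ascii")
    )
    return re.sub(r"[^a-z0-9]+", "_", ascii_name.lower()).strip("_")


def stadium_id(sport: str, venue_name: str) -> str:
    return f"stadium_{sport.lower()}_{normalize_name_sketch(venue_name)}"


def team_id(sport: str, abbrev: str) -> str:
    return f"team_{sport.lower()}_{abbrev.lower()}"


def game_id(sport: str, season: str, date_iso: str, away: str, home: str) -> str:
    # date_iso is "YYYY-MM-DD"; the canonical ID uses the compact YYYYMMDD form
    compact_date = date_iso.replace("-", "")
    return f"game_{sport.lower()}_{season}_{compact_date}_{away.lower()}_{home.lower()}"


if __name__ == "__main__":
    print(stadium_id("MLS", "Mercedes-Benz Stadium"))
    print(team_id("WNBA", "NYL"))
    print(game_id("WNBA", "2025", "2025-05-18", "DAL", "ATL"))
```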
### 4.2 Shared Venue Handling
**Critical Rule**: Stadiums are per-sport entities. A physical venue shared between sports creates MULTIPLE canonical stadium records.
**Example: Barclays Center**
```json
// Stadium for NBA Nets
{
  "canonical_id": "stadium_nba_barclays_center",
  "name": "Barclays Center",
  "city": "Brooklyn",
  "sport": "NBA",
  "primary_team_abbrevs": ["BRK"]
}
// Stadium for WNBA Liberty
{
  "canonical_id": "stadium_wnba_barclays_center",
  "name": "Barclays Center",
  "city": "Brooklyn",
  "sport": "WNBA",
  "primary_team_abbrevs": ["NYL"]
}
```
**Rationale**: Trip planning needs sport-specific filtering. A user planning an NBA trip shouldn't see WNBA games unless explicitly requested.
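Because one building can now have several stadium records, anything that reasons about physical stops (for example, avoiding duplicate stops on a mixed-sport trip) needs a way to recognize that two records are the same place. The sketch below groups records by name and city; the function name and the choice of grouping key are assumptions for illustration, and a real implementation might key on coordinates instead.

```python
from collections import defaultdict


def group_by_physical_venue(stadiums: list[dict]) -> dict[tuple[str, str], list[str]]:
    """Map (venue name, city) to every per-sport canonical_id at that venue."""
    venues: dict[tuple[str, str], list[str]] = defaultdict(list)
    for stadium in stadiums:
        key = (stadium["name"].strip().lower(), stadium["city"].strip().lower())
        venues[key].append(stadium["canonical_id"])
    return dict(venues)


if __name__ == "__main__":
    records = [
        {"canonical_id": "stadium_nba_barclays_center", "name": "Barclays Center", "city": "Brooklyn"},
        {"canonical_id": "stadium_wnba_barclays_center", "name": "Barclays Center", "city": "Brooklyn"},
        {"canonical_id": "stadium_wnba_target_center", "name": "Target Center", "city": "Minneapolis"},
    ]
    for venue, ids in group_by_physical_venue(records).items():
        print(venue, ids)
```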
### 4.3 Update `canonicalize_stadiums.py`
Add to `generate_stadiums_from_teams()`:
```python
def generate_stadiums_from_teams() -> list[Stadium]:
    """Generate stadium entries from team mappings."""
    stadiums = []
    # Existing: NBA, MLB, NHL
    for abbrev, data in NBA_TEAMS.items():
        stadiums.append(create_stadium(data, 'NBA', [abbrev]))
    # ... existing MLB, NHL
    # NEW: WNBA
    for abbrev, data in WNBA_TEAMS.items():
        if data['arena'] == 'TBD':
            continue  # Expansion placeholders have no venue yet
        stadiums.append(Stadium(
            id=f"wnba_{normalize_name(data['arena'])}",
            name=data['arena'],
            city=data['city'],
            state=get_state_for_city(data['city']),
            latitude=0.0,  # Geocoded later
            longitude=0.0,
            capacity=0,
            sport='WNBA',
            team_abbrevs=[abbrev],
            source='team_mapping'
        ))
    # NEW: MLS
    for abbrev, data in MLS_TEAMS.items():
        stadiums.append(Stadium(
            id=f"mls_{normalize_name(data['stadium'])}",
            name=data['stadium'],
            city=data['city'],
            state=get_state_for_city(data['city']),
            latitude=0.0,
            longitude=0.0,
            capacity=0,
            sport='MLS',
            team_abbrevs=[abbrev],
            source='team_mapping'
        ))
    # NEW: NWSL
    for abbrev, data in NWSL_TEAMS.items():
        stadiums.append(Stadium(
            id=f"nwsl_{normalize_name(data['stadium'])}",
            name=data['stadium'],
            city=data['city'],
            state=get_state_for_city(data['city']),
            latitude=0.0,
            longitude=0.0,
            capacity=0,
            sport='NWSL',
            team_abbrevs=[abbrev],
            source='team_mapping'
        ))
    return stadiums
```
### 4.4 Update `canonicalize_teams.py`
Add league structure mappings:
```python
# WNBA has no conferences/divisions in traditional sense
WNBA_DIVISIONS = {abbrev: (None, None) for abbrev in WNBA_TEAMS}

# MLS Conferences
MLS_DIVISIONS = {
    # Eastern Conference
    'ATL': ('mls_eastern', None),
    'CHI': ('mls_eastern', None),
    'CIN': ('mls_eastern', None),
    'CLB': ('mls_eastern', None),
    'CLT': ('mls_eastern', None),
    'DCU': ('mls_eastern', None),
    'MIA': ('mls_eastern', None),
    'MTL': ('mls_eastern', None),
    'NSH': ('mls_eastern', None),
    'NER': ('mls_eastern', None),
    'NYC': ('mls_eastern', None),
    'NYR': ('mls_eastern', None),
    'ORL': ('mls_eastern', None),
    'PHI': ('mls_eastern', None),
    'TOR': ('mls_eastern', None),
    # Western Conference
    'AUS': ('mls_western', None),
    'COL': ('mls_western', None),
    'DAL': ('mls_western', None),
    'HOU': ('mls_western', None),
    'LAG': ('mls_western', None),
    'LAF': ('mls_western', None),
    'MIN': ('mls_western', None),
    'POR': ('mls_western', None),
    'RSL': ('mls_western', None),
    'SJE': ('mls_western', None),
    'SEA': ('mls_western', None),
    'SKC': ('mls_western', None),
    'STL': ('mls_western', None),
    'VAN': ('mls_western', None),
    'SDG': ('mls_western', None),
}

# NWSL has no conferences
NWSL_DIVISIONS = {abbrev: (None, None) for abbrev in NWSL_TEAMS}
```
---
## 5. Local Canonical JSON Updates
### 5.1 stadiums_canonical.json
New entries follow existing format:
```json
{
  "canonical_id": "stadium_wnba_barclays_center",
  "name": "Barclays Center",
  "city": "Brooklyn",
  "state": "NY",
  "latitude": 40.6826,
  "longitude": -73.9754,
  "capacity": 17732,
  "sport": "WNBA",
  "primary_team_abbrevs": ["NYL"],
  "year_opened": 2012
}
```
### 5.2 teams_canonical.json
```json
{
  "canonical_id": "team_wnba_nyl",
  "name": "New York Liberty",
  "abbreviation": "NYL",
  "sport": "WNBA",
  "city": "Brooklyn",
  "stadium_canonical_id": "stadium_wnba_barclays_center",
  "conference_id": null,
  "division_id": null,
  "primary_color": "#6ECEB2",
  "secondary_color": "#000000"
}
```
### 5.3 games_canonical.json
```json
{
  "canonical_id": "game_wnba_2025_20250518_dal_atl",
  "sport": "WNBA",
  "season": "2025",
  "date": "2025-05-18",
  "time": "7:30p",
  "home_team_canonical_id": "team_wnba_atl",
  "away_team_canonical_id": "team_wnba_dal",
  "stadium_canonical_id": "stadium_wnba_gateway_center_arena",
  "is_playoff": false,
  "broadcast": null
}
```
### 5.4 Validation Rules
Update `validate_canonical.py`:
```python
VALID_SPORTS = {'NBA', 'MLB', 'NHL', 'WNBA', 'MLS', 'NWSL'}


def validate_sport_field(sport: str) -> list[str]:
    """Validate sport is one of the supported values."""
    errors = []
    if sport not in VALID_SPORTS:
        errors.append(f"Invalid sport: {sport}. Must be one of {sorted(VALID_SPORTS)}")
    return errors
```
---
## 6. CloudKit Integration
### 6.1 Record Types (Already Exist)
No new record types needed. Existing types support new sports:
- `Stadium` - add records with sport="WNBA"/"MLS"/"NWSL"
- `Team` - add records with sport="WNBA"/"MLS"/"NWSL"
- `Game` - add records with sport="WNBA"/"MLS"/"NWSL"
- `StadiumAlias` - unchanged
- `TeamAlias` - unchanged
- `LeagueStructure` - add new entries for MLS conferences
### 6.2 Field Mapping (Unchanged)
**Stadium Record**:
```
recordName: canonical_id (e.g., "stadium_wnba_barclays_center")
fields:
- uuid: STRING (deterministic from canonical_id)
- name: STRING
- city: STRING
- state: STRING
- latitude: DOUBLE
- longitude: DOUBLE
- capacity: INT64
- sport: STRING ("WNBA", "MLS", "NWSL")
- yearOpened: INT64
- imageURL: STRING (optional)
- lastModified: TIMESTAMP
- schemaVersion: INT64
```
**Team Record**:
```
recordName: canonical_id (e.g., "team_wnba_nyl")
fields:
- uuid: STRING
- name: STRING
- abbreviation: STRING
- sport: STRING
- city: STRING
- stadiumCanonicalId: STRING (reference by canonical_id)
- conferenceId: STRING (optional)
- divisionId: STRING (optional)
- primaryColor: STRING
- secondaryColor: STRING
- lastModified: TIMESTAMP
- schemaVersion: INT64
```
**Game Record**:
```
recordName: canonical_id (e.g., "game_wnba_2025_20250518_dal_atl")
fields:
- uuid: STRING
- sport: STRING
- season: STRING
- dateTime: TIMESTAMP
- homeTeamCanonicalId: STRING
- awayTeamCanonicalId: STRING
- stadiumCanonicalId: STRING
- isPlayoff: INT64 (0 or 1)
- broadcastInfo: STRING (optional)
- lastModified: TIMESTAMP
- schemaVersion: INT64
```
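The `uuid` fields are described as deterministic from `canonical_id`, but the derivation itself is not shown here. One common way to get that property is a name-based UUID (version 5), sketched below. The namespace value is a made-up placeholder, not the project's actual one; if the existing import script already derives UUIDs differently, that method must be kept so record identities do not change.

```python
import uuid

# Hypothetical namespace for illustration only; the real pipeline may use
# a different namespace or a different derivation altogether.
SPORTSTIME_NAMESPACE = uuid.uuid5(uuid.NAMESPACE_DNS, "sportstime.example")


def uuid_for(canonical_id: str) -> str:
    """Derive a stable UUID string from a canonical ID (same input, same output)."""
    return str(uuid.uuid5(SPORTSTIME_NAMESPACE, canonical_id))


if __name__ == "__main__":
    print(uuid_for("stadium_wnba_barclays_center"))
    print(uuid_for("stadium_nba_barclays_center"))
```

Because the per-sport canonical IDs differ, the two Barclays Center records get different UUIDs, which matches the per-sport stadium rule in Section 4.2.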
### 6.3 Index Requirements
Ensure CloudKit has indexes for:
- `Game`: `sport` (queryable, sortable), `dateTime` (sortable, queryable)
- `Team`: `sport` (queryable)
- `Stadium`: `sport` (queryable)
### 6.4 Import Script Updates
Update `cloudkit_import.py` to handle new sports in validation:
```python
VALID_SPORTS = {'NBA', 'MLB', 'NHL', 'WNBA', 'MLS', 'NWSL'}


def validate_game_record(game: dict) -> list[str]:
    errors = []
    if game.get('sport') not in VALID_SPORTS:
        errors.append(f"Invalid sport: {game.get('sport')}")
    return errors
```
---
## 7. App-Side Integration (SwiftUI)
### 7.1 Update Sport Enum
**File**: `SportsTime/Core/Models/Domain/Sport.swift`
```swift
import SwiftUI  // Needed for Color

enum Sport: String, Codable, CaseIterable, Identifiable {
    case mlb = "MLB"
    case nba = "NBA"
    case nhl = "NHL"
    case nfl = "NFL"
    case mls = "MLS"
    case wnba = "WNBA"
    case nwsl = "NWSL"

    var id: String { rawValue }

    var displayName: String {
        switch self {
        case .mlb: return "Major League Baseball"
        case .nba: return "National Basketball Association"
        case .nhl: return "National Hockey League"
        case .nfl: return "National Football League"
        case .mls: return "Major League Soccer"
        case .wnba: return "Women's National Basketball Association"
        case .nwsl: return "National Women's Soccer League"
        }
    }

    var iconName: String {
        switch self {
        case .mlb: return "baseball.fill"
        case .nba: return "basketball.fill"
        case .nhl: return "hockey.puck.fill"
        case .nfl: return "football.fill"
        case .mls: return "soccerball"
        case .wnba: return "basketball.fill"
        case .nwsl: return "soccerball"
        }
    }

    var color: Color {
        switch self {
        case .mlb: return .red
        case .nba: return .orange
        case .nhl: return .blue
        case .nfl: return .brown
        case .mls: return .green
        case .wnba: return .purple
        case .nwsl: return .pink
        }
    }

    /// Inclusive month range; start > end means the season wraps past December.
    var seasonMonths: (start: Int, end: Int) {
        switch self {
        case .mlb: return (3, 10)   // March - October
        case .nba: return (10, 6)   // October - June (wraps)
        case .nhl: return (10, 6)   // October - June (wraps)
        case .nfl: return (9, 2)    // September - February (wraps)
        case .mls: return (2, 12)   // February - December
        case .wnba: return (5, 10)  // May - October
        case .nwsl: return (3, 11)  // March - November
        }
    }

    /// Currently supported sports
    static var supported: [Sport] {
        [.mlb, .nba, .nhl, .wnba, .mls, .nwsl]
    }
}
```
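The `seasonMonths` tuples marked "(wraps)" have a start month greater than the end month, so a plain `start <= month <= end` test is not enough. If the Python pipeline ever needs the same check (for example, to skip scraping a league that is out of season), the logic could look like the sketch below. `SEASON_MONTHS` and `is_in_season` are hypothetical names; the month values are copied from the Swift enum above.

```python
# Month ranges copied from Sport.seasonMonths; (start, end) are inclusive.
SEASON_MONTHS = {
    "MLB": (3, 10),
    "NBA": (10, 6),   # wraps past December
    "NHL": (10, 6),   # wraps past December
    "NFL": (9, 2),    # wraps past December
    "MLS": (2, 12),
    "WNBA": (5, 10),
    "NWSL": (3, 11),
}


def is_in_season(sport: str, month: int) -> bool:
    """Return True if the 1-12 month falls inside the sport's season window."""
    start, end = SEASON_MONTHS[sport]
    if start <= end:
        return start <= month <= end
    # Wrapping season: in season from start through December, or January through end
    return month >= start or month <= end


if __name__ == "__main__":
    for sport in ("NBA", "WNBA"):
        months = [m for m in range(1, 13) if is_in_season(sport, m)]
        print(sport, months)
```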
### 7.2 Trip Planner - No Changes Required
The trip planner uses `Sport` enum and fetches games by sport. New sports automatically work because:
1. `DataProvider.fetchGames(sports:startDate:endDate:)` queries by sport string
2. Games are filtered by `sportStrings.contains(canonical.sport)`
3. Route planning is sport-agnostic (uses stadium coordinates)
### 7.3 Stadium Tracker - No Changes Required
Stadium progress uses `Stadium.sport` field. New sports automatically appear in:
- Stadium list filtering by sport
- Progress tracking per sport
### 7.4 UI Considerations
**Sport Selection Chips**: The `SportSelectionChip` already uses `Sport.allCases`. Adding new cases automatically adds them to the UI.
**Filter Sections**: Update default selections if desired:
```swift
// In TripCreationViewModel
var selectedSports: Set<Sport> = [.mlb, .nba, .nhl] // Consider adding new sports
```
---
## 8. Testing & Validation
### 8.1 Data Integrity Checks
**Python validation queries** (add to `validate_canonical.py`):
```python
def validate_new_sports(stadiums, teams, games):
    """Validate WNBA, MLS, NWSL data integrity."""
    errors = []
    # Check every new sport has stadiums, teams, and games
    for sport in ['WNBA', 'MLS', 'NWSL']:
        sport_stadiums = [s for s in stadiums if s['sport'] == sport]
        if not sport_stadiums:
            errors.append(f"No stadiums for {sport}")
        sport_teams = [t for t in teams if t['sport'] == sport]
        if not sport_teams:
            errors.append(f"No teams for {sport}")
        sport_games = [g for g in games if g['sport'] == sport]
        if not sport_games:
            errors.append(f"No games for {sport}")
    # Check team->stadium references
    stadium_ids = {s['canonical_id'] for s in stadiums}
    for team in teams:
        if team['stadium_canonical_id'] not in stadium_ids:
            errors.append(f"Team {team['canonical_id']} references unknown stadium {team['stadium_canonical_id']}")
    # Check game->team and game->stadium references
    team_ids = {t['canonical_id'] for t in teams}
    for game in games:
        if game['home_team_canonical_id'] not in team_ids:
            errors.append(f"Game {game['canonical_id']} references unknown home team")
        if game['away_team_canonical_id'] not in team_ids:
            errors.append(f"Game {game['canonical_id']} references unknown away team")
        if game['stadium_canonical_id'] not in stadium_ids:
            errors.append(f"Game {game['canonical_id']} references unknown stadium")
    return errors
```
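Reference checks catch broken links but not a scrape that silently returned half a season. A coarse count check can help: each game involves two teams, so a full regular season has roughly `teams * games_per_team / 2` games. The sketch below uses the team and game counts from Section 1; the function names and the 15% tolerance are assumptions, and playoff or cup games in the scrape will push the real count above the estimate.

```python
# (teams, regular-season games per team), taken from the League Overview.
# WNBA's "~40" is approximate, so treat its estimate as rough.
LEAGUE_SHAPE = {
    "WNBA": (13, 40),
    "MLS": (29, 34),
    "NWSL": (14, 26),
}


def expected_regular_season_games(sport: str) -> int:
    """Each game is shared by two teams, so divide the per-team total by 2."""
    teams, games_per_team = LEAGUE_SHAPE[sport]
    return teams * games_per_team // 2


def game_count_warning(sport: str, scraped_count: int, tolerance: float = 0.15):
    """Return a warning string if the scraped count is far from the estimate."""
    expected = expected_regular_season_games(sport)
    if abs(scraped_count - expected) > expected * tolerance:
        return f"{sport}: scraped {scraped_count} games, expected about {expected}"
    return None


if __name__ == "__main__":
    for sport in LEAGUE_SHAPE:
        print(sport, expected_regular_season_games(sport))
```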
### 8.2 App Smoke Tests
1. **Sport Selection**:
   - Open Trip Creation
   - Verify WNBA, MLS, NWSL chips appear
   - Select each new sport
   - Verify games load for date range
2. **Trip Planning**:
   - Select WNBA + dates during WNBA season
   - Verify trip results show WNBA games
   - Verify stadium locations are correct
3. **Stadium Progress**:
   - Navigate to Progress tab
   - Filter by WNBA/MLS/NWSL
   - Verify stadium list shows correct venues
4. **Mixed Sport Trips**:
   - Select NBA + WNBA (they share arenas)
   - Verify trips correctly handle both sports
   - Verify no duplicate stadiums in single stop
### 8.3 Edge Case Tests
1. **Shared Venues**:
   - Create trip with MLS Portland Timbers + NWSL Portland Thorns (same venue)
   - Verify games at Providence Park appear for both sports
2. **Canadian Teams** (MLS):
   - Create trip including Toronto FC
   - Verify timezone handling is correct
3. **Midweek Matches** (MLS):
   - Verify Wednesday/Thursday games don't break route planning
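For the timezone case, a small reference conversion makes the expected values concrete. The sketch below turns a scraped date plus a `7:30p`-style time into a UTC timestamp using the venue's IANA timezone. It assumes the scraped time is local to the venue, which should be confirmed per source before relying on it; `to_utc` and the timezone names passed in are illustrative, not part of the existing pipeline.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo


def to_utc(date_str: str, time_str: str, tz_name: str) -> datetime:
    """Combine 'YYYY-MM-DD' and a 12hr time like '7:30p' or '7:30pm' into UTC.

    Assumes time_str is expressed in the venue's local timezone (tz_name).
    """
    clock = time_str.strip().upper()
    if not clock.endswith("M"):
        clock += "M"  # '7:30P' -> '7:30PM' so %p can parse it
    local = datetime.strptime(f"{date_str} {clock}", "%Y-%m-%d %I:%M%p")
    return local.replace(tzinfo=ZoneInfo(tz_name)).astimezone(timezone.utc)


if __name__ == "__main__":
    # Same wall-clock time, different venues, different UTC instants
    print(to_utc("2025-07-12", "7:30p", "America/Toronto"))
    print(to_utc("2025-07-12", "7:30p", "America/Vancouver"))
```

A 7:30 p.m. kickoff in Toronto and one in Vancouver on the same date are three hours apart in UTC, which is the difference the `dateTime` field has to preserve.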
---
## 9. Pipeline Update Summary
### run_canonicalization_pipeline.py Changes
```python
# In run_pipeline():
# STAGE 1: SCRAPING
# ... existing NBA, MLB, NHL ...
# NEW: WNBA
print_section(f"WNBA {season}")
wnba_games = scrape_wnba_basketball_reference(season)
wnba_games = assign_stable_ids(wnba_games, 'WNBA', str(season))
all_games.extend(wnba_games)
print(f" Scraped {len(wnba_games)} WNBA games")
# NEW: MLS
print_section(f"MLS {season}")
mls_games = scrape_mls_fbref(season)
mls_games = assign_stable_ids(mls_games, 'MLS', str(season))
all_games.extend(mls_games)
print(f" Scraped {len(mls_games)} MLS games")
# NEW: NWSL
print_section(f"NWSL {season}")
nwsl_games = scrape_nwsl_fbref(season)
nwsl_games = assign_stable_ids(nwsl_games, 'NWSL', str(season))
all_games.extend(nwsl_games)
print(f" Scraped {len(nwsl_games)} NWSL games")
```
---
## 10. Checklist
### Definition of Done
- [ ] **Scraping**: WNBA, MLS, NWSL scrapers added and tested
- [ ] **Team Mappings**: All current teams with correct abbreviations
- [ ] **Stadiums**: All venues canonicalized with coordinates
- [ ] **Canonicalization**: Pipeline runs without errors for new sports
- [ ] **Validation**: All integrity checks pass
- [ ] **CloudKit**: Records uploaded successfully
- [ ] **Swift Enum**: Sport cases added with correct metadata
- [ ] **Trip Planning**: New sports can be planned into trips
- [ ] **Stadium Tracking**: New stadiums appear in progress
- [ ] **No Regressions**: Existing MLB/NBA/NHL functionality unchanged
### Files Modified
| File | Changes |
|------|---------|
| `Scripts/scrape_schedules.py` | Add team mappings, scrapers |
| `Scripts/canonicalize_stadiums.py` | Generate new sport stadiums |
| `Scripts/canonicalize_teams.py` | Add league structure mappings |
| `Scripts/run_canonicalization_pipeline.py` | Add scraping calls |
| `Scripts/validate_canonical.py` | Add new sport validation |
| `Scripts/cloudkit_import.py` | Add sport validation |
| `SportsTime/Core/Models/Domain/Sport.swift` | Add enum cases |
| `SportsTime/Resources/stadiums_canonical.json` | New venue records |
| `SportsTime/Resources/teams_canonical.json` | New team records |
| `SportsTime/Resources/games_canonical.json` | New game records |