feat(scripts): complete data pipeline remediation

Scripts changes:
- Add WNBA abbreviation aliases to team_resolver.py
- Fix NHL stadium coordinates in stadium_resolver.py
- Add validate_aliases.py script for orphan detection
- Update scrapers with improved error handling
- Add DATA_AUDIT.md and REMEDIATION_PLAN.md documentation
- Update alias JSON files with new mappings

iOS bundle updates:
- Update games_canonical.json with latest scraped data
- Update teams_canonical.json and stadiums_canonical.json
- Sync alias files with Scripts versions

All 5 remediation phases complete.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Trey t
2026-01-20 18:58:47 -06:00
parent 51419fccf2
commit 8ea3e6112a
21 changed files with 56592 additions and 35714 deletions


@@ -531,6 +531,16 @@ class NHLScraper(BaseScraper):
stadium_id = stadium_result.canonical_id
# Fallback: Use home team's default stadium if no venue provided
# This is common for Hockey-Reference which doesn't have venue data
if not stadium_id:
    home_team_data = TEAM_MAPPINGS.get("nhl", {})
    home_abbrev = self._get_abbreviation(home_result.canonical_id)
    for abbrev, (team_id, _, _, default_stadium) in home_team_data.items():
        if team_id == home_result.canonical_id:
            stadium_id = default_stadium
            break
# Get abbreviations for game ID
home_abbrev = self._get_abbreviation(home_result.canonical_id)
away_abbrev = self._get_abbreviation(away_result.canonical_id)
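
Review note: the fallback in this hunk can be read in isolation as a small lookup helper. The sketch below is a minimal illustration, not the code in the commit. It assumes that each `TEAM_MAPPINGS` entry is a 4-tuple whose first element is the canonical team ID and whose last element is the default stadium ID (inferred from the loop's unpacking). The sample entries and the helper names `default_stadium_for` and `resolve_stadium` are hypothetical. It iterates over `.values()` because the hunk does not use the `abbrev` key, and it omits the `home_abbrev` assignment inside the fallback because that value is reassigned just below before it is read.

```python
from typing import Dict, Optional, Tuple

# Assumed layout: (team_id, <unknown>, <unknown>, default_stadium_id).
# All values below are placeholders, not real canonical IDs.
TeamEntry = Tuple[str, str, str, str]

TEAM_MAPPINGS: Dict[str, Dict[str, TeamEntry]] = {
    "nhl": {
        "AAA": ("team_a", "placeholder", "placeholder", "stadium_a"),
        "BBB": ("team_b", "placeholder", "placeholder", "stadium_b"),
    }
}


def default_stadium_for(team_id: str, sport: str = "nhl") -> Optional[str]:
    """Return the default stadium for a canonical team ID, or None if unknown."""
    for candidate_id, _, _, default_stadium in TEAM_MAPPINGS.get(sport, {}).values():
        if candidate_id == team_id:
            return default_stadium
    return None


def resolve_stadium(stadium_id: Optional[str], home_team_id: str) -> Optional[str]:
    """Keep a resolved stadium ID; otherwise fall back to the home team's default."""
    return stadium_id or default_stadium_for(home_team_id)
```

With the placeholder data above, `resolve_stadium(None, "team_b")` falls back to `"stadium_b"`, while an already resolved stadium ID is returned unchanged.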