feat(scripts): add sportstime-parser data pipeline
Complete Python package for scraping, normalizing, and uploading sports schedule data to CloudKit. Includes:
- Multi-source scrapers for NBA, MLB, NFL, NHL, MLS, WNBA, NWSL
- Canonical ID system for teams, stadiums, and games
- Fuzzy matching with manual alias support
- CloudKit uploader with batch operations and deduplication
- Comprehensive test suite with fixtures
- WNBA abbreviation aliases for improved team resolution
- Alias validation script to detect orphan references

All 5 phases of data remediation plan completed:
- Phase 1: Alias fixes (team/stadium alias additions)
- Phase 2: NHL stadium coordinate fixes
- Phase 3: Re-scrape validation
- Phase 4: iOS bundle update
- Phase 5: Code quality improvements (WNBA aliases)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
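
The commit message mentions fuzzy matching with manual alias support and an alias validation script that detects orphan references. The diff shown here does not include that code, so the following is only a minimal sketch of how such a lookup might work. The canonical ID format, the `CANONICAL_TEAMS` and `ALIASES` tables, the function names, and the 0.85 threshold are all hypothetical, and the standard library's `difflib` stands in for `rapidfuzz` so the example runs without third-party packages.

```python
from difflib import SequenceMatcher

# Hypothetical canonical IDs and aliases, invented for this illustration;
# the real package's ID scheme and alias data are not shown in this diff.
CANONICAL_TEAMS = {
    "team_wnba_las_vegas_aces": "Las Vegas Aces",
    "team_wnba_new_york_liberty": "New York Liberty",
}
ALIASES = {
    "lva": "team_wnba_las_vegas_aces",
    "nyl": "team_wnba_new_york_liberty",
    "sea": "team_wnba_seattle_storm",  # orphan: target is not a known canonical ID
}


def resolve_team(name, threshold=0.85):
    """Return a canonical team ID for `name`, or None if nothing matches well."""
    key = name.strip().lower()

    # 1. Manual aliases take priority over fuzzy matching.
    target = ALIASES.get(key)
    if target in CANONICAL_TEAMS:
        return target

    # 2. Fall back to a similarity score against every canonical name.
    best_id, best_score = None, 0.0
    for team_id, canonical_name in CANONICAL_TEAMS.items():
        score = SequenceMatcher(None, key, canonical_name.lower()).ratio()
        if score > best_score:
            best_id, best_score = team_id, score
    return best_id if best_score >= threshold else None


def find_orphan_aliases():
    """List aliases whose target is not a known canonical ID."""
    return sorted(a for a, target in ALIASES.items() if target not in CANONICAL_TEAMS)
```

Checking aliases first keeps short abbreviations, which score poorly under string similarity, from depending on the fuzzy threshold. The orphan check is the kind of validation that would catch an alias left pointing at a renamed or removed ID.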
15
requirements.txt
Normal file
@@ -0,0 +1,15 @@
# Core dependencies
requests>=2.31.0
beautifulsoup4>=4.12.0
lxml>=5.0.0
rapidfuzz>=3.5.0
python-dateutil>=2.8.0
pytz>=2024.1
rich>=13.7.0
pyjwt>=2.8.0
cryptography>=42.0.0

# Development dependencies
pytest>=8.0.0
pytest-cov>=4.1.0
responses>=0.25.0