feat(scripts): rewrite parser as modular Python CLI

Replace monolithic scraping scripts with sportstime_parser package:

- Multi-source scrapers with automatic fallback for 7 sports
- Canonical ID generation for games, teams, and stadiums
- Fuzzy matching with configurable thresholds for name resolution
- CloudKit Web Services uploader with JWT auth, diff-based updates
- Resumable uploads with checkpoint state persistence
- Validation reports with manual review items and suggested matches
- Comprehensive test suite (249 tests)

CLI: sportstime-parser scrape|validate|upload|status|retry|clear

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
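
Note: the "fuzzy matching with configurable thresholds" and "manual review items and suggested matches" items can be illustrated with a minimal sketch. This is not the package's code. The function names, the two threshold parameters, and their default values are hypothetical, and the standard-library difflib is swapped in for rapidfuzz so the example runs without third-party installs.

```python
from difflib import SequenceMatcher


def normalize(name: str) -> str:
    """Lowercase and collapse whitespace so trivial differences do not affect scoring."""
    return " ".join(name.lower().split())


def similarity(a: str, b: str) -> float:
    """Return a 0-100 similarity score (difflib stand-in for a rapidfuzz scorer)."""
    return 100.0 * SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def resolve_name(raw, known, auto_threshold=90.0, review_threshold=70.0):
    """Resolve a scraped name against known canonical names.

    Returns (status, match, suggestions):
      - ("matched", name, [])      best score >= auto_threshold
      - ("review", None, [...])    no automatic match, but candidates >= review_threshold
      - ("unmatched", None, [])    nothing plausible
    Both thresholds are hypothetical defaults, not values from this commit.
    """
    scored = sorted(((similarity(raw, k), k) for k in known), reverse=True)
    if not scored:
        return ("unmatched", None, [])
    best_score, best = scored[0]
    if best_score >= auto_threshold:
        return ("matched", best, [])
    # Keep up to three candidates as suggested matches for manual review.
    suggestions = [k for score, k in scored if score >= review_threshold][:3]
    if suggestions:
        return ("review", None, suggestions)
    return ("unmatched", None, [])


if __name__ == "__main__":
    teams = ["Boston Celtics", "Boston Bruins", "Brooklyn Nets"]
    print(resolve_name("boston  celtics", teams))
    print(resolve_name("zzzz", teams))
```

Splitting the decision into an automatic threshold and a lower review threshold lets near-certain matches resolve automatically while borderline names are surfaced for manual review.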
@@ -1,8 +1,15 @@
-# Sports Schedule Scraper Dependencies
-requests>=2.28.0
-beautifulsoup4>=4.11.0
-pandas>=2.0.0
-lxml>=4.9.0
+# Core dependencies
+requests>=2.31.0
+beautifulsoup4>=4.12.0
+lxml>=5.0.0
+rapidfuzz>=3.5.0
+python-dateutil>=2.8.0
+pytz>=2024.1
+rich>=13.7.0
+pyjwt>=2.8.0
+cryptography>=42.0.0

-# CloudKit Import (optional - only needed for cloudkit_import.py)
-cryptography>=41.0.0
+# Development dependencies
+pytest>=8.0.0
+pytest-cov>=4.1.0
+responses>=0.25.0