feat(scripts): rewrite parser as modular Python CLI

Replace monolithic scraping scripts with sportstime_parser package:

- Multi-source scrapers with automatic fallback for 7 sports
- Canonical ID generation for games, teams, and stadiums
- Fuzzy matching with configurable thresholds for name resolution
- CloudKit Web Services uploader with JWT auth and diff-based updates
- Resumable uploads with checkpoint state persistence
- Validation reports with manual review items and suggested matches
- Comprehensive test suite (249 tests)
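
As an illustration of the "fuzzy matching with configurable thresholds" item above, here is a minimal sketch of threshold-based name resolution. It is not the package's actual matcher: `similarity`, `resolve_name`, and the default threshold of 0.85 are hypothetical, and the standard library's `difflib.SequenceMatcher` stands in for whatever scoring the package uses.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Return a 0..1 similarity ratio between two names, ignoring case."""
    return SequenceMatcher(None, a.strip().lower(), b.strip().lower()).ratio()


def resolve_name(raw: str, candidates: list[str], threshold: float = 0.85):
    """Return (best_match, score), or (None, score) if no candidate clears the threshold.

    The threshold is a hypothetical default; a real tool would make it configurable.
    """
    best, best_score = None, 0.0
    for candidate in candidates:
        score = similarity(raw, candidate)
        if score > best_score:
            best, best_score = candidate, score
    if best_score >= threshold:
        return best, best_score
    # Below threshold: the caller could queue this name for manual review.
    return None, best_score
```

A name that falls below the threshold returns `None` together with its best score, which is the kind of case a validation report could list as a manual review item with a suggested match.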

CLI: sportstime-parser scrape|validate|upload|status|retry|clear
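
A rough sketch of how the six subcommands listed above could be wired up with `argparse`. Only the program name and the subcommand names come from this commit message; `build_parser`, `main`, and the absence of per-command options are assumptions, and the real CLI may use a different framework entirely.

```python
import argparse

# Subcommand names taken from the commit message; everything else is illustrative.
COMMANDS = ("scrape", "validate", "upload", "status", "retry", "clear")


def build_parser() -> argparse.ArgumentParser:
    """Build a parser that requires exactly one of the known subcommands."""
    parser = argparse.ArgumentParser(prog="sportstime-parser")
    subparsers = parser.add_subparsers(dest="command", required=True)
    for name in COMMANDS:
        subparsers.add_parser(name)
    return parser


def main(argv=None) -> str:
    """Parse argv and return the chosen subcommand name (hypothetical dispatch point)."""
    args = build_parser().parse_args(argv)
    return args.command
```

Calling `main(["upload"])` returns `"upload"`; an unknown subcommand makes `argparse` print a usage error and exit.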

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Trey t
2026-01-10 21:06:12 -06:00
parent 284a10d9e1
commit eeaf900e5a
109 changed files with 18415 additions and 266211 deletions

@@ -0,0 +1,58 @@
"""Utility modules for sportstime-parser."""

from .logging import (
    get_console,
    get_logger,
    is_verbose,
    log_error,
    log_failure,
    log_game,
    log_stadium,
    log_success,
    log_team,
    log_warning,
    set_verbose,
)
from .http import (
    RateLimitedSession,
    get_session,
    fetch_url,
    fetch_json,
    fetch_html,
)
from .progress import (
    create_progress,
    create_spinner_progress,
    progress_bar,
    track_progress,
    ProgressTracker,
    ScrapeProgress,
)

__all__ = [
    # Logging
    "get_console",
    "get_logger",
    "is_verbose",
    "log_error",
    "log_failure",
    "log_game",
    "log_stadium",
    "log_success",
    "log_team",
    "log_warning",
    "set_verbose",
    # HTTP
    "RateLimitedSession",
    "get_session",
    "fetch_url",
    "fetch_json",
    "fetch_html",
    # Progress
    "create_progress",
    "create_spinner_progress",
    "progress_bar",
    "track_progress",
    "ProgressTracker",
    "ScrapeProgress",
]