- 01-01-PLAN.md: core.py + mlb.py (executed) - 01-02-PLAN.md: nba.py + nhl.py - 01-03-PLAN.md: nfl.py + orchestrator refactor - Codebase documentation for planning context Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
| phase | plan | type |
|---|---|---|
| 01-script-architecture | 03 | execute |
Purpose: Complete the modular architecture and update the main entry point.
Output: Scripts/nfl.py and refactored Scripts/scrape_schedules.py.
<execution_context>
@/.claude/get-shit-done/workflows/execute-phase.md
@/.claude/get-shit-done/templates/summary.md
</execution_context>
Prior work: @.planning/phases/01-script-architecture/01-01-SUMMARY.md @.planning/phases/01-script-architecture/01-02-SUMMARY.md
Source files: @Scripts/core.py @Scripts/mlb.py @Scripts/nba.py @Scripts/nhl.py @Scripts/scrape_schedules.py
Task 1: Create nfl.py sport module

Files: Scripts/nfl.py

Create `Scripts/nfl.py` following the established pattern:

- Import from core:
  `from core import Game, Stadium, ScraperSource, StadiumScraperSource, fetch_page, scrape_with_fallback, scrape_stadiums_with_fallback`
- NFL game scrapers:
  - `scrape_nfl_espn(season: int) -> list[Game]`
  - `scrape_nfl_pro_football_reference(season: int) -> list[Game]`
  - `scrape_nfl_cbssports(season: int) -> list[Game]`
- NFL stadium scrapers:
  - `scrape_nfl_stadiums_scorebot() -> list[Stadium]`
  - `scrape_nfl_stadiums_geojson() -> list[Stadium]`
  - `scrape_nfl_stadiums_hardcoded() -> list[Stadium]`
  - `scrape_nfl_stadiums() -> list[Stadium]`
- Source configurations:
  - `NFL_GAME_SOURCES`: list of `ScraperSource`
  - `NFL_STADIUM_SOURCES`: list of `StadiumScraperSource`
- Convenience functions:
  - `scrape_nfl_games(season: int) -> list[Game]`
  - `get_nfl_season_string(season: int) -> str`: returns "2025-26" format

Copy the exact parsing logic from scrape_schedules.py; do not rewrite it.

Verify (run from `Scripts/`, because the module uses the bare `from core import ...` form): `cd Scripts && python3 -c "from nfl import scrape_nfl_games, NFL_GAME_SOURCES; print('OK')"`

Done when: nfl.py exists, imports from core.py, and exports the NFL scrapers.
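For planning context, the shape Task 1 asks for can be sketched as follows. This is a minimal illustration, not the project's code: the `Game` fields, the `ScraperSource` fields, and the body of `scrape_with_fallback` are assumptions standing in for whatever core.py actually defines, and the two scrapers are offline stubs rather than the real parsers. The season-string helper assumes `season` is the starting year; the real helper may key on the ending year instead.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Game:
    # Hypothetical minimal fields; the real Game in core.py likely has more.
    home: str
    away: str
    date: str


@dataclass
class ScraperSource:
    # Hypothetical shape: a label plus the scraper callable to try.
    name: str
    scrape: Callable[[int], list[Game]]


def scrape_with_fallback(sources: list[ScraperSource], season: int) -> list[Game]:
    """Try each source in order and return the first non-empty result."""
    for source in sources:
        try:
            games = source.scrape(season)
        except Exception as exc:  # one failing source should not abort the run
            print(f"{source.name} failed: {exc}")
            continue
        if games:
            return games
    return []


def scrape_nfl_espn(season: int) -> list[Game]:
    # Stub that simulates an outage instead of doing network I/O.
    raise RuntimeError("simulated outage")


def scrape_nfl_pro_football_reference(season: int) -> list[Game]:
    # Stub returning a single placeholder game (not real schedule data).
    return [Game(home="HOME", away="AWAY", date=f"{season}-09-07")]


# Ordered by preference: the first source that yields games wins.
NFL_GAME_SOURCES = [
    ScraperSource("ESPN", scrape_nfl_espn),
    ScraperSource("Pro-Football-Reference", scrape_nfl_pro_football_reference),
]


def scrape_nfl_games(season: int) -> list[Game]:
    return scrape_with_fallback(NFL_GAME_SOURCES, season)


def get_nfl_season_string(season: int) -> str:
    # Assumption: `season` is the starting year, so 2025 -> "2025-26".
    return f"{season}-{(season + 1) % 100:02d}"
```

With these stubs, `scrape_nfl_games(2025)` skips the failing ESPN stub and returns the placeholder game from the second source, which is the behavior the ordered source list is meant to provide.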
Task 2: Refactor scrape_schedules.py to orchestrator

Files: Scripts/scrape_schedules.py

Rewrite `Scripts/scrape_schedules.py` as a thin orchestrator:

- Replace inline scrapers with imports:

      from core import Game, Stadium, assign_stable_ids, export_to_json
      from mlb import scrape_mlb_games, scrape_mlb_stadiums, MLB_GAME_SOURCES
      from nba import scrape_nba_games, scrape_nba_stadiums, NBA_GAME_SOURCES, get_nba_season_string
      from nhl import scrape_nhl_games, scrape_nhl_stadiums, NHL_GAME_SOURCES, get_nhl_season_string
      from nfl import scrape_nfl_games, scrape_nfl_stadiums, NFL_GAME_SOURCES, get_nfl_season_string

- Keep the main() function with argparse for the CLI.
- Update sport scraping blocks to use the new imports:
  - `if args.sport in ['nba', 'all']:` uses `scrape_nba_games(season)`
  - `if args.sport in ['mlb', 'all']:` uses `scrape_mlb_games(season)`
  - etc.
- Keep stadium scraping, using the new module imports.
- For non-core sports (WNBA, MLS, NWSL, CBB), keep them inline for now with a `# TODO: Extract to separate module` comment.
- Update the file header docstring to explain the modular structure:

      """
      Sports Schedule Scraper Orchestrator

      This script coordinates scraping across sport-specific modules:
      - core.py: Shared utilities, data classes, fallback system
      - mlb.py: MLB scrapers
      - nba.py: NBA scrapers
      - nhl.py: NHL scrapers
      - nfl.py: NFL scrapers

      Usage:
          python scrape_schedules.py --sport nba --season 2026
          python scrape_schedules.py --sport all --season 2026
      """

Target: ~500 lines (down from 3359) for the orchestrator, with sport logic in modules.

Verify: `cd Scripts && python3 scrape_schedules.py --help`

Done when: scrape_schedules.py is a thin orchestrator, imports from the sport modules, and `--help` works.
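As a rough picture of what "thin orchestrator" means here, the sketch below dispatches on `--sport` and delegates everything else. It is illustrative only: the four scrapers are local stubs returning strings in place of the real module imports, and the `GAME_SCRAPERS` registry and `build_parser` helper are hypothetical names. The plan itself keeps explicit `if args.sport in [...]` blocks; the dictionary loop is a swapped-in equivalent that gives the same selection behavior.

```python
import argparse
from typing import Callable, Optional


# Stubs standing in for `from mlb import scrape_mlb_games`, etc.
def scrape_mlb_games(season: int) -> list[str]:
    return [f"mlb-{season}"]


def scrape_nba_games(season: int) -> list[str]:
    return [f"nba-{season}"]


def scrape_nhl_games(season: int) -> list[str]:
    return [f"nhl-{season}"]


def scrape_nfl_games(season: int) -> list[str]:
    return [f"nfl-{season}"]


# Hypothetical registry: sport name -> game scraper for that sport.
GAME_SCRAPERS: dict[str, Callable[[int], list[str]]] = {
    "mlb": scrape_mlb_games,
    "nba": scrape_nba_games,
    "nhl": scrape_nhl_games,
    "nfl": scrape_nfl_games,
}


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(
        description="Sports Schedule Scraper Orchestrator"
    )
    parser.add_argument("--sport", choices=[*GAME_SCRAPERS, "all"], default="all")
    parser.add_argument("--season", type=int, default=2026)
    return parser


def main(argv: Optional[list[str]] = None) -> list[str]:
    args = build_parser().parse_args(argv)
    games: list[str] = []
    for sport, scrape in GAME_SCRAPERS.items():
        # Same test as `if args.sport in ['nba', 'all']:` in the plan.
        if args.sport in (sport, "all"):
            games.extend(scrape(args.season))
    return games
```

Calling `main(["--sport", "nba", "--season", "2026"])` runs only the NBA stub, while `--sport all` runs every registered scraper in registry order. Keeping the flag names and choices identical to the existing CLI is what preserves backward compatibility.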
Before declaring phase complete:

- [ ] All sport modules exist: core.py, mlb.py, nba.py, nhl.py, nfl.py
- [ ] `python3 -m py_compile Scripts/*.py` passes for all files
- [ ] `cd Scripts && python3 scrape_schedules.py --help` shows usage
- [ ] scrape_schedules.py is significantly smaller (~500 lines vs 3359)
- [ ] No circular imports between modules

<success_criteria>
- Phase 1: Script Architecture complete
- All 4 core sports have dedicated modules
- Shared utilities in core.py
- scrape_schedules.py is thin orchestrator
- CLI unchanged (backward compatible) </success_criteria>
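Part of the verification checklist could be automated with a small helper like the one below. It is a sketch under stated assumptions: `check_scripts` is a hypothetical function, and the 600-line ceiling is an arbitrary allowance around the ~500-line target. It covers only file existence, compilation, and orchestrator size; the `--help` and circular-import checks still need to be run separately.

```python
import py_compile
from pathlib import Path

# File names taken from the checklist above.
EXPECTED_FILES = (
    "core.py", "mlb.py", "nba.py", "nhl.py", "nfl.py", "scrape_schedules.py",
)


def check_scripts(scripts_dir: str, max_orchestrator_lines: int = 600) -> list[str]:
    """Return a list of problems; an empty list means every check passed."""
    root = Path(scripts_dir)
    problems: list[str] = []

    for name in EXPECTED_FILES:
        path = root / name
        if not path.is_file():
            problems.append(f"missing: {name}")
            continue
        try:
            # doraise=True turns syntax errors into PyCompileError.
            py_compile.compile(str(path), doraise=True)
        except py_compile.PyCompileError:
            problems.append(f"does not compile: {name}")

    orchestrator = root / "scrape_schedules.py"
    if orchestrator.is_file():
        line_count = len(orchestrator.read_text(encoding="utf-8").splitlines())
        if line_count > max_orchestrator_lines:
            problems.append(f"orchestrator too long: {line_count} lines")

    return problems
```

Running `check_scripts("Scripts")` from the repository root returns an empty list when the files exist, compile, and the orchestrator is within the size allowance.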