---
phase: 01-script-architecture
plan: 03
type: execute
---

Extract NFL scrapers and refactor scrape_schedules.py to be a thin orchestrator.

Purpose: Complete the modular architecture and update the main entry point.
Output: `Scripts/nfl.py` and refactored `Scripts/scrape_schedules.py`.

@~/.claude/get-shit-done/workflows/execute-phase.md
@~/.claude/get-shit-done/templates/summary.md

@.planning/PROJECT.md
@.planning/ROADMAP.md
@.planning/STATE.md

**Prior work:**
@.planning/phases/01-script-architecture/01-01-SUMMARY.md
@.planning/phases/01-script-architecture/01-02-SUMMARY.md

**Source files:**
@Scripts/core.py
@Scripts/mlb.py
@Scripts/nba.py
@Scripts/nhl.py
@Scripts/scrape_schedules.py

**Task 1: Create nfl.py sport module**

**Files:** Scripts/nfl.py

Create `Scripts/nfl.py` following the established pattern:

1. Import from core:
   ```python
   from core import (
       Game,
       Stadium,
       ScraperSource,
       StadiumScraperSource,
       fetch_page,
       scrape_with_fallback,
       scrape_stadiums_with_fallback,
   )
   ```

2. NFL game scrapers:
   - `scrape_nfl_espn(season: int) -> list[Game]`
   - `scrape_nfl_pro_football_reference(season: int) -> list[Game]`
   - `scrape_nfl_cbssports(season: int) -> list[Game]`

3. NFL stadium scrapers:
   - `scrape_nfl_stadiums_scorebot() -> list[Stadium]`
   - `scrape_nfl_stadiums_geojson() -> list[Stadium]`
   - `scrape_nfl_stadiums_hardcoded() -> list[Stadium]`
   - `scrape_nfl_stadiums() -> list[Stadium]`

4. Source configurations:
   - `NFL_GAME_SOURCES` list of ScraperSource
   - `NFL_STADIUM_SOURCES` list of StadiumScraperSource

5. Convenience functions:
   - `scrape_nfl_games(season: int) -> list[Game]`
   - `get_nfl_season_string(season: int) -> str` - returns "2025-26" format

Copy the parsing logic from scrape_schedules.py exactly; do not rewrite it.

**Verify:** `cd Scripts && python3 -c "from nfl import scrape_nfl_games, NFL_GAME_SOURCES; print('OK')"` (run from inside `Scripts/`, because nfl.py uses `from core import ...`, which does not resolve when the module is imported as `Scripts.nfl` from the repository root)

**Done:** nfl.py exists, imports from core.py, and exports the NFL scrapers.

**Task 2: Refactor scrape_schedules.py to orchestrator**

**Files:** Scripts/scrape_schedules.py

Rewrite `Scripts/scrape_schedules.py` as a thin orchestrator:

1.
   Replace inline scrapers with imports:
   ```python
   from core import Game, Stadium, assign_stable_ids, export_to_json
   from mlb import scrape_mlb_games, scrape_mlb_stadiums, MLB_GAME_SOURCES
   from nba import scrape_nba_games, scrape_nba_stadiums, NBA_GAME_SOURCES, get_nba_season_string
   from nhl import scrape_nhl_games, scrape_nhl_stadiums, NHL_GAME_SOURCES, get_nhl_season_string
   from nfl import scrape_nfl_games, scrape_nfl_stadiums, NFL_GAME_SOURCES, get_nfl_season_string
   ```

2. Keep the main() function with argparse for CLI

3. Update sport scraping blocks to use new imports:
   - `if args.sport in ['nba', 'all']:` uses `scrape_nba_games(season)`
   - `if args.sport in ['mlb', 'all']:` uses `scrape_mlb_games(season)`
   - etc.

4. Keep stadium scraping with the new module imports

5. For non-core sports (WNBA, MLS, NWSL, CBB), keep them inline for now with a `# TODO: Extract to separate modules` comment

6. Update file header docstring to explain the modular structure:
   ```python
   """
   Sports Schedule Scraper Orchestrator

   This script coordinates scraping across sport-specific modules:
   - core.py: Shared utilities, data classes, fallback system
   - mlb.py: MLB scrapers
   - nba.py: NBA scrapers
   - nhl.py: NHL scrapers
   - nfl.py: NFL scrapers

   Usage:
       python scrape_schedules.py --sport nba --season 2026
       python scrape_schedules.py --sport all --season 2026
   """
   ```

Target: ~500 lines (down from 3359) for the orchestrator, with sport logic in modules.
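For orientation, here is a minimal sketch of how the sport dispatch in the orchestrator could look. It is not the existing `main()`: the plan keeps explicit `if args.sport in [...]` blocks, while this sketch swaps in a dispatch table to keep the example short. The stub scrapers (built by the hypothetical helper `_stub`), the `SPORT_SCRAPERS` and `run` names, and the argparse defaults are all assumptions made for illustration; only the `--sport` and `--season` flags come from the usage text above.

```python
import argparse
from typing import Callable


def _stub(sport: str) -> Callable[[int], list[dict]]:
    """Hypothetical stand-in for a real scraper such as nfl.scrape_nfl_games."""
    return lambda season: [{"sport": sport, "season": season}]


# In the real orchestrator these would come from the sport modules, e.g.
# `from nfl import scrape_nfl_games`. Stubs keep the sketch self-contained.
SPORT_SCRAPERS: dict[str, Callable[[int], list[dict]]] = {
    "mlb": _stub("mlb"),
    "nba": _stub("nba"),
    "nhl": _stub("nhl"),
    "nfl": _stub("nfl"),
}


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="Sports Schedule Scraper Orchestrator")
    # Defaults below are assumptions, not taken from the existing script.
    parser.add_argument("--sport", choices=[*SPORT_SCRAPERS, "all"], default="all")
    parser.add_argument("--season", type=int, default=2026)
    return parser


def run(argv: list[str]) -> dict[str, list[dict]]:
    """Scrape every requested sport and return the games keyed by sport."""
    args = build_parser().parse_args(argv)
    results: dict[str, list[dict]] = {}
    for sport, scrape in SPORT_SCRAPERS.items():
        # Equivalent to the plan's `if args.sport in ['nba', 'all']:` blocks.
        if args.sport in (sport, "all"):
            results[sport] = scrape(args.season)
    return results
```

Calling `run(["--sport", "nfl", "--season", "2025"])` would invoke only the NFL scraper. The inline non-core sports (WNBA, MLS, NWSL, CBB) are left out of the table here, since the plan keeps them inline for now.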
**Verify:** `cd Scripts && python3 scrape_schedules.py --help`

**Done:** scrape_schedules.py is a thin orchestrator, imports from the sport modules, and `--help` works.

Before declaring phase complete:
- [ ] All sport modules exist: core.py, mlb.py, nba.py, nhl.py, nfl.py
- [ ] `python3 -m py_compile Scripts/*.py` passes for all files
- [ ] `cd Scripts && python3 scrape_schedules.py --help` shows usage
- [ ] scrape_schedules.py is significantly smaller (~500 lines vs 3359)
- [ ] No circular imports between modules

**Success criteria:**
- Phase 1: Script Architecture complete
- All 4 core sports have dedicated modules
- Shared utilities in core.py
- scrape_schedules.py is thin orchestrator
- CLI unchanged (backward compatible)

After completion, create `.planning/phases/01-script-architecture/01-03-SUMMARY.md` with:
- Phase 1 complete
- Ready for Phase 2: Stadium Foundation
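The "No circular imports between modules" check has no command attached, so one possible way to automate it is sketched below. It statically parses each module with the standard-library `ast` module and looks for a cycle among the local modules. The function names `local_imports` and `find_cycle` are hypothetical, and the sketch assumes the modules use absolute imports such as `from core import ...`, as the plan shows.

```python
import ast
from typing import Optional


def local_imports(source: str, local_modules: set[str]) -> set[str]:
    """Return the local module names that `source` imports (absolute imports only)."""
    found: set[str] = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            names = [node.module]
        else:
            continue
        for name in names:
            root = name.split(".")[0]
            if root in local_modules:
                found.add(root)
    return found


def find_cycle(sources: dict[str, str]) -> Optional[list[str]]:
    """Return one import cycle as a list of module names, or None if there is none."""
    graph = {name: local_imports(src, set(sources)) for name, src in sources.items()}
    state: dict[str, str] = {}  # module name -> "visiting" or "done"

    def visit(name: str, path: list[str]) -> Optional[list[str]]:
        state[name] = "visiting"
        for dep in sorted(graph[name]):
            if state.get(dep) == "visiting":
                # dep is on the current DFS path, so the path closes a cycle.
                return path[path.index(dep):] + [dep]
            if dep not in state:
                cycle = visit(dep, path + [dep])
                if cycle:
                    return cycle
        state[name] = "done"
        return None

    for name in sorted(graph):
        if name not in state:
            cycle = visit(name, [name])
            if cycle:
                return cycle
    return None
```

To run it against the real files, one could build the input with `{p.stem: p.read_text() for p in pathlib.Path("Scripts").glob("*.py")}` and treat any non-`None` result as a failed check. Because the analysis is static, it also flags imports placed inside functions, which would not necessarily fail at import time.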