Sportstime/.planning/phases/01-script-architecture/01-03-PLAN.md
Trey t 60b450d869 docs: add Phase 1 plans and codebase documentation
- 01-01-PLAN.md: core.py + mlb.py (executed)
- 01-02-PLAN.md: nba.py + nhl.py
- 01-03-PLAN.md: nfl.py + orchestrator refactor
- Codebase documentation for planning context

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-10 00:00:45 -06:00


| phase | plan | type |
|---|---|---|
| 01-script-architecture | 03 | execute |

Extract NFL scrapers and refactor scrape_schedules.py to be a thin orchestrator.

Purpose: Complete the modular architecture and update the main entry point.

Output: Scripts/nfl.py and refactored Scripts/scrape_schedules.py.

<execution_context> @/.claude/get-shit-done/workflows/execute-phase.md @/.claude/get-shit-done/templates/summary.md </execution_context>

@.planning/PROJECT.md @.planning/ROADMAP.md @.planning/STATE.md

Prior work: @.planning/phases/01-script-architecture/01-01-SUMMARY.md @.planning/phases/01-script-architecture/01-02-SUMMARY.md

Source files: @Scripts/core.py @Scripts/mlb.py @Scripts/nba.py @Scripts/nhl.py @Scripts/scrape_schedules.py

Task 1: Create nfl.py sport module

Files: Scripts/nfl.py

Create `Scripts/nfl.py` following the established pattern:
  1. Import from core:

    from core import Game, Stadium, ScraperSource, StadiumScraperSource, fetch_page, scrape_with_fallback, scrape_stadiums_with_fallback
    
  2. NFL game scrapers:

    • scrape_nfl_espn(season: int) -> list[Game]
    • scrape_nfl_pro_football_reference(season: int) -> list[Game]
    • scrape_nfl_cbssports(season: int) -> list[Game]
  3. NFL stadium scrapers:

    • scrape_nfl_stadiums_scorebot() -> list[Stadium]
    • scrape_nfl_stadiums_geojson() -> list[Stadium]
    • scrape_nfl_stadiums_hardcoded() -> list[Stadium]
    • scrape_nfl_stadiums() -> list[Stadium]
  4. Source configurations:

    • NFL_GAME_SOURCES list of ScraperSource
    • NFL_STADIUM_SOURCES list of StadiumScraperSource
  5. Convenience functions:

    • scrape_nfl_games(season: int) -> list[Game]
    • get_nfl_season_string(season: int) -> str - returns "2025-26" format
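The season-string helper in item 5 could look roughly like the sketch below. The plan only specifies the output format ("2025-26"); whether `season` names the starting or the ending year is not stated, so this sketch assumes the starting year. If the existing code in scrape_schedules.py uses the other convention, that logic should be copied instead.

```python
def get_nfl_season_string(season: int) -> str:
    """Format a season as 'YYYY-YY', e.g. 2025 -> '2025-26'.

    Assumption (not confirmed by the plan): `season` is the year in
    which the season starts.
    """
    end_year = (season + 1) % 100  # two-digit year in which the season ends
    return f"{season}-{end_year:02d}"
```

The `% 100` with zero padding keeps the century rollover readable, so a season starting in 1999 is rendered as "1999-00".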

Copy exact parsing logic from scrape_schedules.py.

Verify (run from `Scripts/` so that `from core import ...` inside nfl.py resolves): `cd Scripts && python3 -c "from nfl import scrape_nfl_games, NFL_GAME_SOURCES; print('OK')"`

Done: nfl.py exists, imports from core.py, exports NFL scrapers
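As a rough illustration of how the pieces of Task 1 fit together, the sketch below wires two game scrapers into a source list and a convenience function. core.py is not shown in this plan, so `Game`, `ScraperSource`, and `scrape_with_fallback` are hypothetical stand-ins defined inline (including the `priority` field and the "first non-empty result wins" rule); the scraper bodies are placeholders, not real parsing logic.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Game:
    """Hypothetical stand-in for core.Game; the real fields are not shown in the plan."""
    home: str
    away: str
    date: str


@dataclass
class ScraperSource:
    """Hypothetical stand-in for core.ScraperSource."""
    name: str
    scraper: Callable[[int], list[Game]]
    priority: int = 1


def scrape_with_fallback(sources: list[ScraperSource], season: int) -> list[Game]:
    """Try sources in priority order and return the first non-empty result."""
    for source in sorted(sources, key=lambda s: s.priority):
        try:
            games = source.scraper(season)
        except Exception as exc:  # one failing source should not abort the run
            print(f"{source.name} failed: {exc}")
            continue
        if games:
            return games
    return []


def scrape_nfl_espn(season: int) -> list[Game]:
    # Placeholder body: simulates a source that is currently unavailable.
    raise RuntimeError("simulated network failure")


def scrape_nfl_pro_football_reference(season: int) -> list[Game]:
    # Placeholder body: returns made-up data purely for illustration.
    return [Game(home="HOME", away="AWAY", date=f"{season}-09-05")]


NFL_GAME_SOURCES = [
    ScraperSource("ESPN", scrape_nfl_espn, priority=1),
    ScraperSource("Pro-Football-Reference", scrape_nfl_pro_football_reference, priority=2),
]


def scrape_nfl_games(season: int) -> list[Game]:
    """Convenience wrapper: run the NFL sources through the fallback system."""
    return scrape_with_fallback(NFL_GAME_SOURCES, season)
```

In the real module the stand-in classes would be replaced by the `from core import ...` line from item 1, and each scraper body by the logic copied from scrape_schedules.py.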

Task 2: Refactor scrape_schedules.py to orchestrator

Files: Scripts/scrape_schedules.py

Rewrite `Scripts/scrape_schedules.py` as a thin orchestrator:
  1. Replace inline scrapers with imports:

    from core import Game, Stadium, assign_stable_ids, export_to_json
    from mlb import scrape_mlb_games, scrape_mlb_stadiums, MLB_GAME_SOURCES
    from nba import scrape_nba_games, scrape_nba_stadiums, NBA_GAME_SOURCES, get_nba_season_string
    from nhl import scrape_nhl_games, scrape_nhl_stadiums, NHL_GAME_SOURCES, get_nhl_season_string
    from nfl import scrape_nfl_games, scrape_nfl_stadiums, NFL_GAME_SOURCES, get_nfl_season_string
    
  2. Keep the main() function with argparse for CLI

  3. Update sport scraping blocks to use new imports:

    • if args.sport in ['nba', 'all']: uses scrape_nba_games(season)
    • if args.sport in ['mlb', 'all']: uses scrape_mlb_games(season)
    • etc.
  4. Keep stadium scraping with the new module imports

  5. For non-core sports (WNBA, MLS, NWSL, CBB), keep them inline for now with a `# TODO: Extract to separate modules` comment

  6. Update file header docstring to explain the modular structure:

    """
    Sports Schedule Scraper Orchestrator
    
    This script coordinates scraping across sport-specific modules:
    - core.py: Shared utilities, data classes, fallback system
    - mlb.py: MLB scrapers
    - nba.py: NBA scrapers
    - nhl.py: NHL scrapers
    - nfl.py: NFL scrapers
    
    Usage:
        python scrape_schedules.py --sport nba --season 2026
        python scrape_schedules.py --sport all --season 2026
    """
    

Target: ~500 lines (down from 3359) for the orchestrator, with sport logic in modules.

Verify: `cd Scripts && python3 scrape_schedules.py --help`

Done: scrape_schedules.py is a thin orchestrator, imports from sport modules, `--help` works
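A minimal sketch of what the thin orchestrator's `main()` might reduce to is shown below. The per-sport scrapers are hypothetical stubs so the example runs on its own; in the real script they would come from the mlb, nba, nhl, and nfl imports listed in step 1. The sketch uses a dictionary lookup instead of the `if args.sport in [...]` blocks described in step 3, which is a swapped-in alternative rather than what the plan prescribes, and it marks both flags as required because the real CLI's defaults are not shown here.

```python
import argparse
from typing import Optional


# Hypothetical stubs standing in for the functions imported from the sport modules.
def scrape_mlb_games(season: int) -> list[dict]:
    return [{"sport": "mlb", "season": season}]


def scrape_nba_games(season: int) -> list[dict]:
    return [{"sport": "nba", "season": season}]


def scrape_nhl_games(season: int) -> list[dict]:
    return [{"sport": "nhl", "season": season}]


def scrape_nfl_games(season: int) -> list[dict]:
    return [{"sport": "nfl", "season": season}]


# One entry per core sport; adding a sport means adding one line here.
GAME_SCRAPERS = {
    "mlb": scrape_mlb_games,
    "nba": scrape_nba_games,
    "nhl": scrape_nhl_games,
    "nfl": scrape_nfl_games,
}


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="Sports Schedule Scraper Orchestrator")
    parser.add_argument("--sport", choices=[*GAME_SCRAPERS, "all"], required=True)
    parser.add_argument("--season", type=int, required=True)
    return parser


def main(argv: Optional[list[str]] = None) -> list[dict]:
    """Parse CLI arguments and delegate to the selected sport scrapers."""
    args = build_parser().parse_args(argv)
    sports = list(GAME_SCRAPERS) if args.sport == "all" else [args.sport]
    games: list[dict] = []
    for sport in sports:
        games.extend(GAME_SCRAPERS[sport](args.season))
    print(f"Scraped {len(games)} games for: {', '.join(sports)}")
    return games
```

Calling `main(["--sport", "all", "--season", "2026"])` runs every registered scraper, while `main(["--sport", "nfl", "--season", "2026"])` runs only the NFL one. Accepting `argv` as a parameter keeps the function testable without touching `sys.argv`.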

Before declaring phase complete:
- [ ] All sport modules exist: core.py, mlb.py, nba.py, nhl.py, nfl.py
- [ ] `python3 -m py_compile Scripts/*.py` passes for all files
- [ ] `cd Scripts && python3 scrape_schedules.py --help` shows usage
- [ ] scrape_schedules.py is significantly smaller (~500 lines vs 3359)
- [ ] No circular imports between modules
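The "no circular imports" item has no verification command in the checklist. One way to check it, sketched here as an assumption rather than as part of the plan, is to parse each module with the standard `ast` module, build a graph of imports between the local modules, and search it for a cycle. The helper names `local_imports` and `find_cycle` are hypothetical.

```python
import ast
from typing import Optional


def local_imports(source: str, module_names: set[str]) -> set[str]:
    """Return the members of module_names that `source` imports (absolute imports only)."""
    found = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            found.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            found.add(node.module.split(".")[0])
    return found & module_names


def find_cycle(sources: dict[str, str]) -> Optional[list[str]]:
    """Depth-first search over the import graph; return one cycle or None."""
    names = set(sources)
    graph = {name: local_imports(text, names) for name, text in sources.items()}
    state: dict[str, str] = {}  # module name -> "visiting" or "done"

    def visit(name: str, path: list[str]) -> Optional[list[str]]:
        state[name] = "visiting"
        chain = path + [name]
        for dep in sorted(graph[name]):
            if state.get(dep) == "visiting":
                # dep is already on the current path, so the path loops back to it
                return chain[chain.index(dep):] + [dep]
            if dep not in state:
                cycle = visit(dep, chain)
                if cycle:
                    return cycle
        state[name] = "done"
        return None

    for name in sorted(graph):
        if name not in state:
            cycle = visit(name, [])
            if cycle:
                return cycle
    return None
```

To run it against the real files, `sources` could be built as `{p.stem: p.read_text() for p in pathlib.Path("Scripts").glob("*.py")}`. The check is static, so it also reports imports made inside functions, which would not necessarily fail at runtime.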

<success_criteria>

  • Phase 1: Script Architecture complete
  • All 4 core sports have dedicated modules
  • Shared utilities in core.py
  • scrape_schedules.py is thin orchestrator
  • CLI unchanged (backward compatible) </success_criteria>
After completion, create `.planning/phases/01-script-architecture/01-03-SUMMARY.md` with:
- Phase 1 complete
- Ready for Phase 2: Stadium Foundation