SportstimeAPI/templates/dashboard/scraper_status.html
Trey t 63acf7accb feat: add Django web app, CloudKit sync, dashboard, and game_datetime_utc export
Adds the full Django application layer on top of sportstime_parser:
- core: Sport, Team, Stadium, Game models with aliases and league structure
- scraper: orchestration engine, adapter, job management, Celery tasks
- cloudkit: CloudKit sync client, sync state tracking, sync jobs
- dashboard: staff dashboard for monitoring scrapers, sync, review queue
- notifications: email reports for scrape/sync results
- Docker setup for deployment (Dockerfile, docker-compose, entrypoint)

Game exports now use game_datetime_utc (ISO 8601 UTC) instead of
venue-local date+time strings, matching the canonical format used
by the iOS app.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-19 14:04:27 -06:00


{% extends 'base.html' %}
{% block content %}
<div class="page-header">
<h2>Scraper Status</h2>
</div>
<!-- Status Overview -->
<div class="stat-grid mb-2">
<div class="stat-card {% if running_jobs > 0 %}primary{% endif %}">
<div class="stat-value">{{ running_jobs }}</div>
<div class="stat-label">Running Jobs</div>
</div>
<div class="stat-card {% if pending_jobs > 0 %}warning{% endif %}">
<div class="stat-value">{{ pending_jobs }}</div>
<div class="stat-label">Pending Jobs</div>
</div>
</div>
<!-- Scraper Configs -->
<div class="card">
<div class="card-header" style="display: flex; justify-content: space-between; align-items: center;">
<h3 style="margin: 0;">Scraper Configurations</h3>
<form method="post" action="{% url 'dashboard:run_all_scrapers' %}" style="display: inline;">
{% csrf_token %}
<button type="submit" class="btn btn-primary">Run All Enabled</button>
</form>
</div>
<table class="table">
<thead>
<tr>
<th>Sport</th>
<th>Season</th>
<th>Status</th>
<th>Last Run</th>
<th>Games Found</th>
<th>Source</th>
<th>Actions</th>
</tr>
</thead>
<tbody>
{% for config in configs %}
<tr>
<td><strong>{{ config.sport.short_name }}</strong></td>
<td>{{ config.season }}</td>
<td>
{% if config.is_enabled %}
<span class="badge badge-success">Active</span>
{% else %}
<span class="badge badge-secondary">Inactive</span>
{% endif %}
</td>
<td>
{% if config.last_run %}
{{ config.last_run|timesince }} ago
{% if config.last_run_status == 'completed' %}
<span class="text-success" title="Completed">&#10003;</span>
{% elif config.last_run_status == 'failed' %}
<span class="text-danger" title="Failed">&#10007;</span>
{% endif %}
{% else %}
<span class="text-muted">Never</span>
{% endif %}
</td>
<td>
{% if config.last_run_games %}
{{ config.last_run_games }} game{{ config.last_run_games|pluralize }}
{% else %}
<span class="text-muted">-</span>
{% endif %}
</td>
<td>{{ config.primary_source|default:"auto" }}</td>
<td>
<form method="post" action="{% url 'dashboard:run_scraper' config.sport.code config.season %}" style="display: inline;">
{% csrf_token %}
<button type="submit" class="btn btn-primary" style="padding: 0.25rem 0.5rem; font-size: 0.85rem;">
Run Now
</button>
</form>
</td>
</tr>
{% empty %}
<tr>
<td colspan="7" class="text-muted">No scraper configurations</td>
</tr>
{% endfor %}
</tbody>
</table>
</div>
<!-- Recent Jobs -->
<div class="card">
<div class="card-header">
<h3>Recent Jobs</h3>
</div>
<table class="table">
<thead>
<tr>
<th>Sport</th>
<th>Season</th>
<th>Status</th>
<th>Games</th>
<th>Started</th>
<th>Duration</th>
<th>Error</th>
</tr>
</thead>
<tbody>
{% for job in recent_jobs %}
<tr>
<td><strong>{{ job.config.sport.short_name }}</strong></td>
<td>{{ job.config.season }}</td>
<td>
{% if job.status == 'completed' %}
<span class="badge badge-success">Completed</span>
{% elif job.status == 'running' %}
<span class="badge badge-info">Running</span>
{% elif job.status == 'failed' %}
<span class="badge badge-danger">Failed</span>
{% elif job.status == 'pending' %}
<span class="badge badge-warning">Pending</span>
{% else %}
<span class="badge badge-secondary">{{ job.status }}</span>
{% endif %}
</td>
<td>
{% if job.status == 'completed' %}
{{ job.games_found }} found, +{{ job.games_new }} new
{% else %}
-
{% endif %}
</td>
<td class="text-muted">{{ job.created_at|timesince }} ago</td>
<td>{{ job.duration_display }}</td>
<td>
{% if job.error_message %}
<span class="text-danger" title="{{ job.error_message }}">{{ job.error_message|truncatechars:50 }}</span>
{% else %}
-
{% endif %}
</td>
</tr>
{% empty %}
<tr>
<td colspan="7" class="text-muted">No recent jobs</td>
</tr>
{% endfor %}
</tbody>
</table>
</div>
<!-- Documentation -->
<div class="card">
<div class="card-header">
<h3>How Scrapers Work</h3>
</div>
<div style="padding: 1rem;">
<h4 style="margin-top: 0;">What Gets Updated Automatically</h4>
<p>When a scraper runs, it fetches schedule data from official sources and updates the following:</p>
<table class="table" style="margin-bottom: 1.5rem;">
<thead>
<tr>
<th>Data Type</th>
<th>Behavior</th>
</tr>
</thead>
<tbody>
<tr>
<td><strong>Games</strong></td>
<td>Creates new games and updates scores and status for existing ones. Existing games are matched by canonical ID.</td>
</tr>
<tr>
<td><strong>Teams</strong></td>
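<td>Auto-created from the scraper's built-in team mappings. A team name missing from those mappings is not created; it becomes a Review Item until the team or an alias is added.</td>
</tr>
<tr>
<td><strong>Stadiums</strong></td>
<td>Auto-created from the scraper's built-in stadium mappings. A venue name missing from those mappings is not created; it becomes a Review Item until the stadium or an alias is added.</td>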
</tr>
<tr>
<td><strong>Conferences</strong></td>
<td>Auto-created based on team data (e.g., Eastern, Western).</td>
</tr>
<tr>
<td><strong>Divisions</strong></td>
<td>Auto-created based on team data (e.g., Atlantic, Pacific).</td>
</tr>
</tbody>
</table>
<h4>New Team Scenario</h4>
<p>If a league adds a new team (e.g., an expansion team):</p>
<ol>
<li>Add the team via <a href="/admin/core/team/add/">Admin → Teams</a></li>
<li>Add <a href="/admin/core/teamalias/add/">Team Aliases</a> for any names/abbreviations used by data sources</li>
<li>Add the stadium via <a href="/admin/core/stadium/add/">Admin → Stadiums</a> (if it's a new venue)</li>
<li>Add <a href="/admin/core/stadiumalias/add/">Stadium Aliases</a> for any alternate names used by data sources</li>
<li>Run the scraper; it imports the new team's games automatically</li>
</ol>
<p class="text-muted" style="font-size: 0.9rem;">If the scraper encounters an unknown team or stadium name, it creates a <strong>Review Item</strong> for manual resolution.</p>
<h4>What Requires Manual Action</h4>
<table class="table" style="margin-bottom: 1.5rem;">
<thead>
<tr>
<th>Situation</th>
<th>Action Required</th>
</tr>
</thead>
<tbody>
<tr>
<td>Unknown team name in schedule</td>
<td>Add a <a href="/admin/core/teamalias/add/">Team Alias</a> in the admin, or resolve in <a href="{% url 'dashboard:review_queue' %}">Review Queue</a></td>
</tr>
<tr>
<td>Unknown stadium name</td>
<td>Add a <a href="/admin/core/stadiumalias/add/">Stadium Alias</a> in the admin, or resolve in <a href="{% url 'dashboard:review_queue' %}">Review Queue</a></td>
</tr>
<tr>
<td>New expansion team</td>
<td>Add a <a href="/admin/core/team/add/">new Team</a> in the admin, then add aliases for any alternate names</td>
</tr>
<tr>
<td>Team relocation/rename</td>
<td>Add a <a href="/admin/core/teamalias/add/">Team Alias</a> with validity dates for the old name</td>
</tr>
<tr>
<td>Stadium rename (naming rights)</td>
<td>Add a <a href="/admin/core/stadiumalias/add/">Stadium Alias</a> with validity dates (e.g., "Staples Center" valid until 2021)</td>
</tr>
</tbody>
</table>
<h4>Managing Aliases via Admin</h4>
<p>Team and stadium name mappings can be managed directly in the admin interface:</p>
<ul>
<li><a href="/admin/core/teamalias/">Team Aliases</a> - Map alternate team names, abbreviations, historical names</li>
<li><a href="/admin/core/stadiumalias/">Stadium Aliases</a> - Map alternate stadium names, former names (naming rights changes)</li>
</ul>
<p>Aliases support <strong>validity dates</strong> - useful for historical names like "Washington Redskins" (valid until 2020) or stadium naming rights changes.</p>
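{% comment %}
The validity-date behavior described above can be sketched in Python as follows. This is a minimal illustration and not the project's actual resolver: the Alias class, its field names (valid_from, valid_until), the resolve_alias function, and the canonical ID string are all hypothetical, and the exact end date chosen for "Staples Center" is an assumption based on the "valid until 2021" wording.

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class Alias:
    """Hypothetical stand-in for a team or stadium alias row."""
    name: str
    canonical_id: str
    valid_from: Optional[date] = None   # None means no lower bound
    valid_until: Optional[date] = None  # None means still valid


def resolve_alias(name: str, on: date, aliases: list[Alias]) -> Optional[str]:
    """Return the canonical ID that `name` maps to on date `on`, else None."""
    wanted = name.strip().casefold()
    for alias in aliases:
        if alias.name.casefold() != wanted:
            continue
        if alias.valid_from is not None and on < alias.valid_from:
            continue
        if alias.valid_until is not None and on > alias.valid_until:
            continue
        return alias.canonical_id
    # No alias matched for that date; a caller would create a review item.
    return None


# Assumed example data: both names point at the same (made-up) canonical ID.
ALIASES = [
    Alias("Staples Center", "stadium_crypto_com_arena", valid_until=date(2021, 12, 31)),
    Alias("Crypto.com Arena", "stadium_crypto_com_arena"),
]
```

With this data, "Staples Center" resolves for a game dated in 2020 but returns None for a game dated in 2023, which is the case where a review item would be expected.
{% endcomment %}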
<h4>Data Flow</h4>
<p style="font-family: monospace; background: #f5f5f5; padding: 0.75rem; border-radius: 4px;">
Scraper runs → Fetches from source (ESPN, league API, etc.) → Normalizes team/stadium names → Creates/updates records → Marks changed records for CloudKit sync → Creates review items for unresolved names
</p>
<h4>Tips</h4>
<ul style="margin-bottom: 0;">
<li>Run scrapers regularly to keep scores and game statuses current</li>
<li>Check the <a href="{% url 'dashboard:review_queue' %}">Review Queue</a> after scrapes for items needing attention</li>
<li>Scrapers are idempotent - running multiple times is safe and won't duplicate data</li>
<li>Each sport uses multiple data sources with automatic fallback if one fails</li>
</ul>
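{% comment %}
The idempotency tip above follows from matching on canonical IDs. A rough Python sketch of that idea, assuming an in-memory dict in place of the database and a hypothetical "canonical_id" key on each scraped record (the real scraper's storage and field names may differ):

```python
from typing import Dict, Iterable, Tuple


def upsert_games(store: Dict[str, dict], scraped: Iterable[dict]) -> Tuple[int, int]:
    """Merge scraped games into `store`; return (created, updated) counts."""
    created = 0
    updated = 0
    for game in scraped:
        key = game["canonical_id"]  # hypothetical field name
        existing = store.get(key)
        if existing is None:
            # First time this canonical ID is seen: create the record.
            store[key] = dict(game)
            created += 1
        elif any(existing.get(field) != value for field, value in game.items()):
            # Same game, but at least one scraped field changed: update in place.
            existing.update(game)
            updated += 1
        # Otherwise nothing changed, so a repeat run is a no-op.
    return created, updated
```

Running the same batch twice creates each record once and then reports zero creations and zero updates, so only records that actually changed would need to be marked for sync.
{% endcomment %}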
</div>
</div>
{% endblock %}