Spanish

Files

Trey T 05a367fdbe Books — capture <li> vocab bullets the extractor was silently dropping

extract_epub.py walked only <p> elements, but the entries of every
"Vocabulario" section in the Olly Richards EPUB live inside
<ul><li>...</li></ul>. The heading made it through but the entries
didn't: 680 vocab lines across 24 sections in this book were missing
from the bundled JSON.

Audit (text-node owner by closest block ancestor) confirmed <li> is the
only silent drop: 5,260 nodes in <p>, 1,960 in <li>, 0 anywhere else.
No <h1>-<h6>, tables, or blockquotes in this EPUB at all.
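
A minimal sketch of how a "text-node owner by closest block ancestor" audit could be run. It is not the script used for this commit: it assumes well-formed XHTML and uses the standard library's html.parser, and BLOCK_TAGS, VOID_TAGS, BlockOwnerAudit, audit_text_owners and AUDIT_SAMPLE are hypothetical names chosen for illustration.

```python
from collections import Counter
from html.parser import HTMLParser

# Hypothetical list of block-level tags to attribute text to.
BLOCK_TAGS = {"p", "li", "h1", "h2", "h3", "h4", "h5", "h6",
              "blockquote", "td", "th"}
# Void elements never get a closing tag, so they are not pushed on the stack.
VOID_TAGS = {"br", "hr", "img", "meta", "link", "input"}


class BlockOwnerAudit(HTMLParser):
    """Count non-blank text nodes by their closest block-level ancestor."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.stack = []          # currently open tags, outermost first
        self.counts = Counter()  # owner tag -> number of text nodes

    def handle_starttag(self, tag, attrs):
        if tag not in VOID_TAGS:
            self.stack.append(tag)

    def handle_endtag(self, tag):
        # Pop back to the most recent matching open tag.
        if tag in self.stack:
            while self.stack.pop() != tag:
                pass

    def handle_data(self, data):
        if not data.strip():
            return  # ignore whitespace-only nodes
        owner = next((t for t in reversed(self.stack) if t in BLOCK_TAGS),
                     "(none)")
        self.counts[owner] += 1


def audit_text_owners(xhtml):
    parser = BlockOwnerAudit()
    parser.feed(xhtml)
    parser.close()
    return parser.counts


AUDIT_SAMPLE = (
    "<body><p>Vocabulario</p><p>Uno <b>dos</b> tres</p>"
    "<ul><li>la casa = house</li><li>el perro = dog</li></ul></body>"
)

if __name__ == "__main__":
    for tag, n in audit_text_owners(AUDIT_SAMPLE).most_common():
        print(tag, n)
```

Any tag that shows a non-zero count but is not walked by the extractor is a candidate silent drop; text with no block ancestor at all is reported under "(none)".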

Fix: walk find_all(["p", "li"]) in document order so bullet entries
slot in right after their "Vocabulario" / list heading. Re-extracted
(2,646 → 3,326 paragraphs, +680) and re-translated all 118 jobs with
parallel Claude Code subagents. The translate_chapters.py prompt
template now tells subagents to keep bilingual `palabra = meaning`
lines verbatim, since both languages are already on the line.
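
To illustrate the effect of walking <p> and <li> together in document order, here is a small stand-in sketch. It does not reproduce extract_epub.py: it swaps BeautifulSoup's find_all for the standard library's html.parser so it runs without dependencies, assumes well-formed XHTML, and BlockTextExtractor, extract_blocks and SAMPLE_XHTML are hypothetical names.

```python
from html.parser import HTMLParser


class BlockTextExtractor(HTMLParser):
    """Collect the text of selected block tags in document order."""

    def __init__(self, tags):
        super().__init__(convert_charrefs=True)
        self.tags = set(tags)
        self.depth = 0     # > 0 while inside a selected tag
        self.buffer = []   # text pieces of the block being read
        self.blocks = []   # finished blocks, in document order

    def handle_starttag(self, tag, attrs):
        if tag in self.tags:
            self.depth += 1

    def handle_endtag(self, tag):
        if tag in self.tags and self.depth > 0:
            self.depth -= 1
            if self.depth == 0:
                # Collapse internal whitespace and skip empty blocks.
                text = " ".join("".join(self.buffer).split())
                if text:
                    self.blocks.append(text)
                self.buffer = []

    def handle_data(self, data):
        if self.depth > 0:
            self.buffer.append(data)


def extract_blocks(xhtml, tags):
    parser = BlockTextExtractor(tags)
    parser.feed(xhtml)
    parser.close()
    return parser.blocks


SAMPLE_XHTML = (
    "<p>Vocabulario</p>"
    "<ul><li>la casa = house</li><li>el perro = dog</li></ul>"
    "<p>Siguiente parte.</p>"
)

if __name__ == "__main__":
    print(extract_blocks(SAMPLE_XHTML, ["p"]))        # heading only, bullets dropped
    print(extract_blocks(SAMPLE_XHTML, ["p", "li"]))  # bullets follow their heading
```

Unlike find_all, this sketch merges nested selected tags (for example a <p> inside an <li>) into a single block instead of returning both, so it should be read as an illustration of the ordering, not as a drop-in replacement.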

Bumped bookDataVersion to 2 so refreshBooksDataIfNeeded re-seeds.
Verified in simulator: all 13 chapter row sizes grew (e.g. ch6
18,295→20,951 chars).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

2026-05-11 10:10:34 -05:00

books

Books — capture <li> vocab bullets the extractor was silently dropping

2026-05-11 10:10:34 -05:00

textbook

Issue #32 cleanup — drop the last 5 mis-oriented vocab pairs

2026-05-03 18:52:53 -05:00

all_courses_data.json

Initial commit: Conjuga Spanish conjugation app

2026-04-09 20:58:33 -05:00

build_store.swift

Initial commit: Conjuga Spanish conjugation app