Previously the chapter reader showed vocab tables as a flat list of OCR
lines — because Vision reads columns top-to-bottom, the Spanish column
appeared as one block followed by the English column, making pairings
illegible.
Now every vocab table renders as a 2-column grid with Spanish on the
left and English on the right. Supporting changes:
- New ocr_all_vocab.swift: bounding-box OCR over all 931 vocab images,
cluster lines into rows by Y-coordinate, split rows by largest X-gap,
detect 2- / 3- / 4-column layouts automatically. ~2800 pairs extracted
this pass vs ~1100 from the old block-alternation heuristic.
- merge_pdf_into_book.py now prefers bounding-box pairs when present,
falls back to the heuristic, embeds the resulting pairs as
vocab_table.cards in book.json.
- DataLoader passes cards through to TextbookBlock on seed.
- TextbookChapterView renders cards via SwiftUI Grid (2 cols).
- fix_vocab.py quarantine rule relaxed — only mis-pairs where both
sides are clearly the same language are removed. "unknown" sides
stay (bbox pipeline already oriented them correctly).
Textbook card count jumps from 1044 → 3118 active pairs.
textbookDataVersion bumped to 9.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>