Spanish

admin/Spanish

Fork 0

Commit Graph

Author	SHA1	Message	Date
Trey T	7da98d786c	Vocab study — noun & adjective flashcards with CEFR level toggles Add SRS-driven noun and adjective flashcards modeled on the existing verb flashcard flow: - SharedModels/Lexeme — catalog of non-verb vocab, frequency-ranked, with gender for nouns and optional example sentences. Seeded from a bundled vocab_lexemes.json built by Scripts/vocab/build_lexemes.py, which joins frequency.csv + es-en.data from a pinned doozan/spanish_data commit (CC-BY-SA: hermitdave/FrequencyWords + Wiktionary). 1,449 nouns and 600 adjectives, each with Wiktionary-sourced gender and (where available) an example sentence with English translation. - LexemeReviewCard + LexemeReviewStore — cloud-synced SM-2 SRS, keyed by partOfSpeech + lexemeId + drillMode so future drill modes can coexist. - LexemeSessionQueue + LexemePool — parallel to VocabSessionQueue; fresh cards sort by frequency rank. - LexemeStudyGroup — cloud-synced resumable session per (partOfSpeech, drillMode). - NounFlashcardPracticeView + AdjectiveFlashcardPracticeView — same flow as VocabFlashcardPracticeView: English prompt → tap to reveal Spanish → Again/Hard/Good/Easy. Nouns reveal with their article (la taza, el problema) so gender is taught alongside meaning, not as a separate quiz. Example sentence shown when present. CEFR-style level toggles: - LexemeLevel enum (A1/A2/B1/B2/C1+) derived from frequencyRank with standard Spanish-frequency-dictionary cutoffs (250/500/1000/2000). - UserProgress.selectedLexemeLevels — cloud-synced multi-select, defaults to A1+A2 on first launch. - SettingsView gains a "Vocabulary Levels" section with five toggles; the existing "Levels" section is renamed "Verb Levels" for clarity. - Due SRS cards always surface regardless of toggles. Disabling a level only stops new cards from that band entering the pool. PracticeView gets "Nouns" and "Adjectives" rows under "Books". DataLoader: new lexemeDataVersion gate that re-seeds the Lexeme table from vocab_lexemes.json independent of book seeding. project.yml lists the new JSON resource and the existing book_olly-vol2.json (which the previous build was silently excluding because xcodegen rewrote the project from project.yml). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-19 20:16:55 -05:00
Trey T	3ee1563cb0	Books — pre-computed per-book glossary for context-correct word lookup The book reader's word lookup used DictionaryService, a verb-conjugation index plus ~200 hand-typed words: ordinary nouns like "taza" returned nothing, and homographs always lost (tapping "como" in "como siempre" gave the verb "comer" because the verb index is checked first). Add a glossary phase to the books pipeline (build_glossary.py): every distinct Spanish word is translated once, in its sentence context, by the same Claude-Code-subagent LLM step the pipeline already uses for chapter translation. English front matter is excluded by an ES==EN paragraph-ratio heuristic. The glossary is bundled into book_<slug>.json and is now part of the pipeline for every book. In the app, Book carries the decoded glossary and BookReaderView resolves each tap automatically through cache -> glossary -> DictionaryService -> on-device LLM, citing which source answered so a curated glossary hit reads differently from a best-effort AI guess. book_olly-vol2.json regenerated with a 3,658-word glossary. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-18 10:44:32 -05:00

Author

SHA1

Message

Date

Trey T

7da98d786c

Vocab study — noun & adjective flashcards with CEFR level toggles

Add SRS-driven noun and adjective flashcards modeled on the existing verb
flashcard flow:

- SharedModels/Lexeme — catalog of non-verb vocab, frequency-ranked, with
  gender for nouns and optional example sentences. Seeded from a bundled
  vocab_lexemes.json built by Scripts/vocab/build_lexemes.py, which joins
  frequency.csv + es-en.data from a pinned doozan/spanish_data commit
  (CC-BY-SA: hermitdave/FrequencyWords + Wiktionary). 1,449 nouns and 600
  adjectives, each with Wiktionary-sourced gender and (where available)
  an example sentence with English translation.
- LexemeReviewCard + LexemeReviewStore — cloud-synced SM-2 SRS, keyed by
  partOfSpeech + lexemeId + drillMode so future drill modes can coexist.
- LexemeSessionQueue + LexemePool — parallel to VocabSessionQueue; fresh
  cards sort by frequency rank.
- LexemeStudyGroup — cloud-synced resumable session per
  (partOfSpeech, drillMode).
- NounFlashcardPracticeView + AdjectiveFlashcardPracticeView — same flow
  as VocabFlashcardPracticeView: English prompt → tap to reveal Spanish
  → Again/Hard/Good/Easy. Nouns reveal with their article (la taza, el
  problema) so gender is taught alongside meaning, not as a separate
  quiz. Example sentence shown when present.

CEFR-style level toggles:
- LexemeLevel enum (A1/A2/B1/B2/C1+) derived from frequencyRank with
  standard Spanish-frequency-dictionary cutoffs (250/500/1000/2000).
- UserProgress.selectedLexemeLevels — cloud-synced multi-select, defaults
  to A1+A2 on first launch.
- SettingsView gains a "Vocabulary Levels" section with five toggles; the
  existing "Levels" section is renamed "Verb Levels" for clarity.
- Due SRS cards always surface regardless of toggles. Disabling a level
  only stops new cards from that band entering the pool.

PracticeView gets "Nouns" and "Adjectives" rows under "Books".

DataLoader: new lexemeDataVersion gate that re-seeds the Lexeme table
from vocab_lexemes.json independent of book seeding. project.yml lists
the new JSON resource and the existing book_olly-vol2.json (which the
previous build was silently excluding because xcodegen rewrote the
project from project.yml).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

2026-05-19 20:16:55 -05:00

Trey T

3ee1563cb0

Books — pre-computed per-book glossary for context-correct word lookup

The book reader's word lookup used DictionaryService, a verb-conjugation
index plus ~200 hand-typed words: ordinary nouns like "taza" returned
nothing, and homographs always lost (tapping "como" in "como siempre"
gave the verb "comer" because the verb index is checked first).

Add a glossary phase to the books pipeline (build_glossary.py): every
distinct Spanish word is translated once, in its sentence context, by
the same Claude-Code-subagent LLM step the pipeline already uses for
chapter translation. English front matter is excluded by an ES==EN
paragraph-ratio heuristic. The glossary is bundled into book_<slug>.json
and is now part of the pipeline for every book.

In the app, Book carries the decoded glossary and BookReaderView resolves
each tap automatically through cache -> glossary -> DictionaryService ->
on-device LLM, citing which source answered so a curated glossary hit
reads differently from a best-effort AI guess.

book_olly-vol2.json regenerated with a 3,658-word glossary.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

2026-05-18 10:44:32 -05:00

2 Commits