Add 13 new grammar notes with 1010 exercises from video extraction

Scraped a 4h Spanish fundamentals YouTube video (transcript + OCR on
14810 frames), extracted structured content across 52 chapters, and
generated fill-in-the-blank quizzes for every grammar topic.

- 13 new GrammarNote entries (articles, possessives, demonstratives,
  greetings, poder, al/del, prepositional pronouns, irregular yo,
  stem-changing, stressed possessives, present/future perfect, present
  indicative conjugation)
- 1010 generated exercises across all 36 grammar notes (new + existing)
- Fix tense guide parser to handle unnumbered *Usages* blocks
- Rewrite 6 broken tense guide bodies (imperative, subj pluperfect,
  subj future) with numbered usage format
- Bump courseDataVersion 5→6 with TenseGuide refresh on upgrade
- Add docs/spanish-fundamentals/ with raw transcripts, polished notes,
  structured JSON, and exercise data

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Trey t
2026-04-16 08:40:05 -05:00
parent ff4f906128
commit 47a7871c38
297 changed files with 114661 additions and 14 deletions

@@ -0,0 +1,54 @@
# Exercise-generator instructions
You are generating Spanish grammar quiz items for the iOS app **Conjuga**. Each item is a fill-in-the-blank multiple-choice question with ONE correct answer and ONE distractor.
## Your inputs
- **Registry:** `/Users/treyt/Desktop/code/Spanish/docs/spanish-fundamentals/note_registry.json` — every note's metadata (noteId, title, category, status, target_count, prompt).
- **Seed pool (per note):** `/Users/treyt/Desktop/code/Spanish/docs/spanish-fundamentals/exercises/seed/<noteId>.json` — exercises already extracted from the source video. Use these as-is (don't rewrite unless obviously broken). Generate ADDITIONAL items to hit `target_count`.
- **Existing note bodies (for `existing_*` status):** `/Users/treyt/Desktop/code/Spanish/Conjuga/Conjuga/Models/GrammarNote.swift` — grep for your noteId, read the surrounding `body: """..."""` block for context.
- **New-note source content (for `new` status):** `/Users/treyt/Desktop/code/Spanish/docs/spanish-fundamentals/notes/NN-<slug>.md` (find NN via registry `source_chapters[0].id`).
- **Polished structured data (for new notes):** `/Users/treyt/Desktop/code/Spanish/docs/spanish-fundamentals/structured/NN-<slug>.json` — has rules, examples, conjugation tables.
## Exercise format (output)
Each exercise is a JSON object:
```json
{"sentence": "Ella _____ doctora.", "correct": "es", "distractor": "está", "explanation": "Ser for professions."}
```
Rules:
- `sentence` must contain `_____` (exactly 5 underscores) where the blank goes.
- **Exception:** items that test presence vs. absence of a word (personal-*a* style) may use a dash as the "no word" option, e.g. `{"sentence": "Veo _____ mi hermana.", "correct": "a", "distractor": "—"}`. The sentence must still contain the blank.
- `correct` and `distractor` are short tokens (1–3 words typically).
- `distractor` must be plausibly wrong, same part-of-speech/tense as correct. Never a silly answer.
- `explanation` is ONE short sentence (≤ 70 chars preferred) stating WHY the correct answer is right.
- Use proper Spanish accents (é, í, ó, ú, ñ, ¿, ¡).
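The mechanical parts of these rules can be checked automatically. The following is a minimal sketch only: `validate_exercise` and `max_words` are hypothetical names, and treating "short token" as a hard word limit and requiring exactly one blank per sentence are assumptions rather than part of the spec above.

```python
import re

BLANK = "_____"  # exactly 5 underscores, per the format rules


def validate_exercise(item: dict, max_words: int = 3) -> list[str]:
    """Return a list of problems found in one exercise (empty list = OK).

    Assumptions (not stated in the instructions): a sentence has exactly
    one blank, and answers longer than `max_words` words are rejected.
    """
    problems = []

    # Every run of underscores must be the single 5-underscore blank.
    runs = re.findall(r"_+", item.get("sentence", ""))
    if runs != [BLANK]:
        problems.append("sentence must contain exactly one 5-underscore blank")

    for key in ("correct", "distractor"):
        value = item.get(key, "").strip()
        if not value:
            problems.append(f"{key} is missing or empty")
        elif len(value.split()) > max_words:
            problems.append(f"{key} is longer than {max_words} words")

    # A distractor identical to the answer makes the item unanswerable.
    if item.get("correct") and item.get("correct") == item.get("distractor"):
        problems.append("distractor is identical to correct")

    return problems
```

Plausibility of the distractor and correctness of accents still need a human (or model) review; this only catches structural slips.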
## Quality rules
1. **Variety across the set:** spread pronouns (yo / tú / él/ella / nosotros / ellos), time contexts (present, past, future), and noun domains (family, food, work, school, travel, weather, etc.). Don't have 10 items all about food.
2. **No duplicates:** no two items in the same note's final array should have the same `(sentence, correct)` pair.
3. **Teaching value:** each item should test a concept the note actually covers. Don't invent rules not in the body/notes.
4. **Difficulty mix:** most items at intermediate level. A handful can be easy, a handful slightly tricky — but never ambiguous.
5. **Short sentences:** usually 5–10 words. Keep them crisp.
6. **No proper-noun soup:** common first names like María, Juan, Ana are fine. Avoid obscure names.
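Rule 2 is easy to verify in code. A minimal sketch, assuming exercises are dicts in the format above; `find_duplicates` is a hypothetical helper, and stripping surrounding whitespace before comparing is an assumption.

```python
def find_duplicates(exercises: list[dict]) -> list[int]:
    """Return indices of items whose (sentence, correct) pair was already seen."""
    seen = set()
    duplicate_indices = []
    for index, item in enumerate(exercises):
        # Assumption: surrounding whitespace is ignored when comparing.
        key = (item["sentence"].strip(), item["correct"].strip())
        if key in seen:
            duplicate_indices.append(index)
        else:
            seen.add(key)
    return duplicate_indices
```

Because seed items come first in the final array, any index reported here points at the later copy, which is usually a generated item that can be replaced.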
## Output
Write `/Users/treyt/Desktop/code/Spanish/docs/spanish-fundamentals/exercises/final/<noteId>.json` for each noteId you're assigned.
Shape of each file:
```json
{
"noteId": "ser-vs-estar",
"target_count": 37,
"seed_count": 22,
"generated_count": 15,
"exercises": [
{"sentence": "...", "correct": "...", "distractor": "...", "explanation": "..."},
...
]
}
```
`exercises` = seed items (unchanged, first) + your newly generated items. Total length should equal `target_count` (it's OK to go slightly over — up to +5). If seed already exceeds target, just include all seed items and add 0 new.
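The assembly rule above can be sketched as follows. This is an illustration under assumptions, not a required implementation: `items_still_needed` and `build_final_payload` are hypothetical names, and the counts are derived from the lists rather than passed in.

```python
import json


def items_still_needed(target_count: int, seed_count: int) -> int:
    """How many new items to generate; 0 if the seed already meets the target."""
    return max(0, target_count - seed_count)


def build_final_payload(note_id: str, target_count: int,
                        seed: list[dict], generated: list[dict]) -> dict:
    """Assemble the output object: seed items first and unchanged, then new ones."""
    return {
        "noteId": note_id,
        "target_count": target_count,
        "seed_count": len(seed),
        "generated_count": len(generated),
        "exercises": list(seed) + list(generated),
    }


def to_json(payload: dict) -> str:
    # ensure_ascii=False keeps accented characters (é, ñ, ¿) readable in the file.
    return json.dumps(payload, ensure_ascii=False, indent=2)
```

Deriving `seed_count` and `generated_count` from the lists keeps them consistent with the actual length of `exercises`.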
## Report back
When done, report in under 100 words: which notes you completed, the final count per note, and any notes where you fell short of target and why.