Guide — enrich present-subjunctive entry with WEIRDO + ESCAPA + plan

The present-subjunctive guide was surface-level: two numbered usages and a handful of examples, no mnemonic and no structural trigger cue. That's the recurring problem with the tense guides — they're reference cards, not teaching materials. This commit fixes the immediate gap and lays out a plan to fix the rest:

- Conjuga/conjuga_data.json — subj_presente body expanded from 794 to 3670 chars. Adds the WEIRDO mnemonic with per-letter triggers and examples (Wishes, Emotions, Impersonal, Recommendations, Doubt, Ojalá), the ESCAPA adverbial-conjunction set, the "que + change of subject" structural rule, adjectival clauses with unknown antecedents, and the future-time-clause rule (cuando / hasta que / en cuanto).
- Scripts/guide-enrichment/PLAN.md (new) — audit of all 20 tense guides and 36 grammar notes, tier-1/2/3 prioritisation, "thorough" checklist (TL;DR, usages, conjugation, irregulars, mnemonic, pitfalls, contrast, dialogue example), research sources, per-topic workflow, effort estimate.
- DataLoader.swift — courseDataVersion 7 → 8 so existing installs re-seed the new body on next launch.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-11 23:33:09 -05:00
parent a416233a2d
commit de446b2301
3 changed files with 121 additions and 2 deletions
@@ -3,7 +3,7 @@ import SharedModels
 import Foundation

 actor DataLoader {
-    static let courseDataVersion = 7
+    static let courseDataVersion = 8  // bump: present-subjunctive guide enriched with WEIRDO + ESCAPA
    static let courseDataKey = "courseDataVersion"

    static let textbookDataVersion = 14
@@ -0,0 +1,119 @@
+# Guide enrichment plan
+
+**Trigger**: WEIRDO was missing from the present-subjunctive guide. That's a perfect example of a deeper problem — most tense guides are surface-level reference cards (2–3 usages + examples), missing the mnemonics, contrast tables, and exception lists a real Spanish teacher would hand out.
+
+**Goal**: bring every tense guide and grammar note up to "teacher-handout" depth — enough that a learner could study from it alone and pass a quiz.
+
+## Current state (audit, 2026-05-11)
+
+| Surface | Items | Source of truth | Typical body length | Verdict |
+|---|---|---|---|---|
+| Tense guides | 20 | `Conjuga/Conjuga/conjuga_data.json` → `tenseGuides[]` | 500–1500 chars | **Shallow** — bare *Usages* + examples |
+| Grammar notes | ~36 | `Conjuga/Conjuga/Models/GrammarNote.swift` (`GrammarNote.allNotes`, `generatedNotes`) | 1500–3000 chars | **Decent** — most have mnemonics and contrast examples |
+| Reference store | — | `Conjuga/Conjuga/Services/ReferenceStore.swift` | varies | Not in scope for this pass |
+
+Tense guides are the bulk of the work. Grammar notes need a smaller audit-and-fill pass.
+
+## What "thorough" looks like
+
+Every tense guide should include, at minimum:
+
+1. **Quick TL;DR** — one sentence: what is this tense for?
+2. **When to use it** — numbered usages, each with 2 contrast examples (a clear case and a borderline / common-mistake case).
+3. **How to form it** — conjugation pattern for regular verbs (one table per AR/ER/IR if it differs), plus the irregular pattern callout if applicable. Cross-reference the conjugator screens if relevant.
+4. **Common irregulars** — the 5–10 irregular verbs that learners will hit immediately in this tense, picked from the usual candidates (ser, estar, ir, tener, haber, dar, ver, decir, hacer, querer, poder, poner, saber, salir, traer, venir).
+5. **Triggers / mnemonics** — words and structures that signal this tense. WEIRDO and ESCAPA for subjunctive; "yesterday / last X / specific time" for preterite; "used to / when I was a kid" for imperfect; etc.
+6. **Pitfalls** — the top 3–5 mistakes English speakers make. e.g. preterite vs imperfect mixups, ir vs venir, ser vs estar overlap.
+7. **Tense-vs-tense contrast** — pair with the closest neighbour and show 2 minimal pairs (preterite ↔ imperfect, present ↔ present-progressive, future ↔ ir-a + infinitive, subjunctive-presente ↔ subjunctive-imperfecto).
+8. **Real-world feel** — 2–3 dialogue-style examples showing the tense in natural use, not just isolated sentences.
+
+Every grammar note should include, at minimum:
+1. The core distinction in one line.
+2. Each side of the distinction with 4–6 clear examples covering different positions in a sentence.
+3. A mnemonic if one is standard in the language (DOCTOR/PLACE, WEIRDO, ESCAPA, etc.).
+4. Edge cases / verbs that change meaning (e.g. ser/estar adjectives, conocer/saber overlap).
+5. A practice prompt: "Try translating these 3 sentences, then check below."
+
+## Priority order
+
+Triaged by learner impact (frequency of use × typical confusion):
+
+**Tier 1 — most-used, most-confused** (do first):
+1. `ind_presente` (Present indicative) — already 1324 chars, the longest tense guide. Audit for gaps; probably needs irregular tables.
+2. `ind_preterito` (Preterite) — currently 492 chars, the shortest. **Highest priority** — every learner hits this and gets it wrong.
+3. `ind_imperfecto` (Imperfect) — 774 chars. Always taught alongside preterite; the contrast is the entire game.
+4. `subj_presente` (Present subjunctive) — ✅ done in this pass.
+5. `imp_afirmativo` + `imp_negativo` (Imperative pair) — combined 2037 chars. Needs the tú/usted/nosotros/vosotros table and the negative-flips-to-subjunctive rule highlighted.
+
+**Tier 2 — common but often skimped**:
+6. `ind_futuro` (Simple future) — needs contrast with ir-a + infinitive (already covered in grammar notes; cross-link).
+7. `cond_presente` (Conditional) — needs the "if-clause" patterns and the "softening request" usage ("¿Podrías…?").
+8. `ind_perfecto` (Present perfect) — needs the haber + past participle conjugation table and the "ya / todavía / alguna vez" trigger words.
+9. `subj_imperfecto_1` + `subj_imperfecto_2` (Past subjunctive -ra / -se) — needs the if-clause + condicional pairing.
+
+**Tier 3 — compound and less-frequent** (still must be thorough):
+10. `ind_pluscuamperfecto`, `ind_futuro_perfecto`, `ind_preterito_anterior` (literary)
+11. `cond_perfecto`, `subj_perfecto`, `subj_pluscuamperfecto_1`, `subj_pluscuamperfecto_2`
+12. `subj_futuro`, `subj_futuro_perfecto` (largely archaic — note they're rare but explain why they exist)
+
+**Grammar notes audit**:
+- Pass through all 36, score each on the "thorough" criteria above.
+- Fill the gaps. Most already have mnemonics; some don't.
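
The tier lists above can be cross-checked against the audit's count of 20 tense guides. The sketch below simply transcribes the topic ids from the three tiers; the `TIERS` dict and both helper names are illustrative and not part of the repo.

```python
# Topic ids transcribed from the Tier 1-3 lists in this plan.
TIERS = {
    1: ["ind_presente", "ind_preterito", "ind_imperfecto", "subj_presente",
        "imp_afirmativo", "imp_negativo"],
    2: ["ind_futuro", "cond_presente", "ind_perfecto",
        "subj_imperfecto_1", "subj_imperfecto_2"],
    3: ["ind_pluscuamperfecto", "ind_futuro_perfecto", "ind_preterito_anterior",
        "cond_perfecto", "subj_perfecto", "subj_pluscuamperfecto_1",
        "subj_pluscuamperfecto_2", "subj_futuro", "subj_futuro_perfecto"],
}


def tier_counts(tiers: dict) -> dict:
    """Number of topic ids listed under each tier."""
    return {tier: len(ids) for tier, ids in tiers.items()}


def all_topic_ids(tiers: dict) -> list:
    """Flatten the tiers into one list, refusing duplicate ids."""
    ids = [topic for group in tiers.values() for topic in group]
    if len(ids) != len(set(ids)):
        raise ValueError("a topic id appears in more than one tier")
    return ids
```

Counting ids rather than list items matters here: paired guides such as `imp_afirmativo` + `imp_negativo` occupy one list item but are two entries in `tenseGuides[]`.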
+
+## Research sources
+
+Cite explicitly in each draft so reviewers can verify. Order of trust:
+
+1. **Real Academia Española (RAE) — Nueva gramática de la lengua española** — authoritative reference. Free online: `rae.es`.
+2. **Studyspanish.com** and **SpanishDict.com** grammar references — best free per-topic explanations, well-curated example sentences.
+3. **Practice Makes Perfect: Complete Spanish Grammar** (Dorothy Richmond, McGraw-Hill) — standard teaching reference. The PDF is already at the repo root for cross-reference.
+4. **Lawless Spanish** (Laura Lawless) — accurate, concise, good on subjunctive nuances.
+5. **The user's existing textbook** — *Complete Spanish Step-by-Step* (Bregstein) is already bundled. Cross-reference its chapter on each tense to keep voice consistent.
+6. **YouTube — Butterfly Spanish (Ana), Spring Spanish, Dreaming Spanish (Pablo)** — for natural-use examples and the "feel" of when a native reaches for the tense. The repo already has a curated YouTube list at `Conjuga/Conjuga/youtube_videos.json` — pull from there when a topic has a matching video.
+
+For mnemonics specifically: WEIRDO, ESCAPA, DOCTOR, PLACE are standard. Don't invent new ones unless we can't find a known one.
+
+## Workflow per topic
+
+This is what an enrichment "unit of work" looks like:
+
+1. **Draft** — A research agent (Claude Code subagent, no API key, same pattern as the book translation pipeline) reads the current guide body, consults the sources listed above, and drafts a new body following the "thorough" structure. It writes the draft to `Conjuga/Scripts/guide-enrichment/drafts/<topicId>.md`.
+2. **Self-review** — same agent re-reads its own draft against the checklist (TL;DR present? mnemonic present? contrast pair? top 3 pitfalls?). Notes anything it couldn't find a source for.
+3. **Integrate** — a script reads the draft, swaps it into `conjuga_data.json` (for tense guides) or `GrammarNote.swift` (for grammar notes), bumps `courseDataVersion`, and runs a build to verify.
+4. **Spot-check** — user opens the topic in the app on device, reads it, flags anything that feels wrong or missing.
+5. **Commit** — one commit per topic, message: "Guide enrichment — <topic> (tier N)".
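
A minimal Python sketch of the Integrate step for tense guides, assuming `conjuga_data.json` parses to an object whose `tenseGuides` array holds entries with `id` and `body` keys. The `tenseGuides` name comes from the audit table; the `id` and `body` key names and the `swap_guide_body` helper are hypothetical and would need checking against the real file.

```python
def swap_guide_body(course_data: dict, topic_id: str, new_body: str) -> int:
    """Replace one tense guide's body in place.

    Returns the change in body length (in characters) so the caller can
    log something like "794 -> 3670 chars".
    """
    for guide in course_data["tenseGuides"]:
        if guide.get("id") == topic_id:          # "id" key is an assumption
            old_length = len(guide.get("body", ""))
            guide["body"] = new_body             # "body" key is an assumption
            return len(new_body) - old_length
    # Fail loudly rather than silently appending a new guide.
    raise KeyError(f"no tense guide with id {topic_id!r}")
```

If the result is written back with Python's `json` module, passing `ensure_ascii=False` to `json.dump` keeps accented text such as `Ojalá` readable instead of escaping it.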
+
+Batching: do tier-1 topics one at a time so the user can review and shape what "thorough enough" looks like. Tiers 2 and 3 can batch 3–5 topics per session once the format is dialed in.
+
+## Tooling
+
+Two small scripts will speed this up:
+
+- **`enrich_topic.py <topicId>`** — opens the current body, writes a Markdown template at `drafts/<topicId>.md` with the section headers pre-filled, and prints a research prompt the user can hand to a subagent.
+- **`apply_draft.py <topicId>`** — reads `drafts/<topicId>.md`, validates the section structure, swaps it into `conjuga_data.json` (or `GrammarNote.swift` for grammar notes), bumps `courseDataVersion`.
+
+Build both when starting tier 1. Don't build them speculatively now.
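
As a rough illustration of what "validates the section structure" could mean for `apply_draft.py`, the sketch below checks a draft for one `## ` heading per item in the "thorough" checklist. The heading names and the `missing_sections` helper are assumptions; the plan does not fix the exact template headings.

```python
import re

# Hypothetical heading names, one per item in the tense-guide checklist.
REQUIRED_SECTIONS = [
    "TL;DR",
    "When to use it",
    "How to form it",
    "Common irregulars",
    "Triggers / mnemonics",
    "Pitfalls",
    "Tense-vs-tense contrast",
    "Real-world feel",
]

# Matches level-2 Markdown headings only ("## Title"), one per line.
HEADING_RE = re.compile(r"^##\s+(.+?)\s*$", flags=re.MULTILINE)


def missing_sections(draft_markdown: str) -> list[str]:
    """Return the required sections that have no matching heading."""
    found = {m.group(1).lower() for m in HEADING_RE.finditer(draft_markdown)}
    return [name for name in REQUIRED_SECTIONS if name.lower() not in found]
```

An empty return value means the draft has every section; anything else can be printed as a to-do list before the swap is attempted.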
+
+## Effort estimate
+
+- Tier 1 (5 topics): ~30 min research + 30 min draft + 15 min integrate = **~75 min per topic, ~6 hours total**.
+- Tier 2 (4 topics): faster once the format is dialed in. ~45 min each, ~3 hours.
+- Tier 3 (9 topics): ~30 min each (most are compound tenses with similar structure), ~4.5 hours.
+- Grammar notes audit + fill: ~10 min audit each × 36 = 6 hours; ~30 min fill on the ~10 that need it = 5 hours. Total ~11 hours.
+
+**Total scoped at ~25 hours.** Spread across sessions: maybe one tier-1 topic per session, two tier-2 or three tier-3 per session once the format's locked in.
+
+## Ship plan
+
+- Each commit is one topic enriched. Small, reviewable diffs.
+- `courseDataVersion` bumps per commit so the change propagates on next launch.
+- The user can preview new bodies via the in-app Guide tab as soon as the commit hits gitea; no separate deploy step is needed, just a rebuild + reinstall.
+- The plan doc itself lives here so future sessions can pick up where this one left off without needing to re-derive the structure.
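
The per-commit version bump could be automated along these lines. This is a sketch, assuming the declaration keeps the `static let courseDataVersion = <integer>` shape shown in the `DataLoader.swift` hunk; `bump_course_data_version` is a hypothetical helper, not an existing script.

```python
import re

# Captures the declaration prefix and the current integer literal.
VERSION_RE = re.compile(r"(static let courseDataVersion = )(\d+)")


def bump_course_data_version(swift_source: str) -> tuple[str, int]:
    """Increment courseDataVersion by one; return (new source, new version)."""
    match = VERSION_RE.search(swift_source)
    if match is None:
        raise ValueError("courseDataVersion declaration not found")
    new_version = int(match.group(2)) + 1
    bumped = VERSION_RE.sub(
        lambda m: f"{m.group(1)}{new_version}", swift_source, count=1
    )
    return bumped, new_version
```

Anchoring the pattern on the full `static let courseDataVersion = ` prefix leaves `textbookDataVersion` and the `"courseDataVersion"` key string untouched, and any trailing comment on the line is preserved.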
+
+## Out of scope (intentional)
+
+- Audio recordings of example sentences (could be a future TTS pre-bake).
+- Per-region variants (Latin American vs Castilian usage notes) — flag when they matter (vosotros, leísmo), don't comprehensively document.
+- Interactive exercises tied to each guide (separate Tests/Quiz infrastructure exists; cross-link instead of duplicate).
+- Translation of the guides into Spanish (current guides are English-explanation, Spanish-examples; keep that asymmetry).
+- A complete grammar-textbook rewrite. Stop at "depth a teacher would hand out as supplementary material."