Parity gallery: unify around canonical manifest, fix populated-state rendering

Single source of truth: `com.tt.honeyDue.testing.GalleryScreens` lists
every user-reachable screen with its category (DataCarrying / DataFree)
and per-platform reachability. Both platforms' test harnesses are
CI-gated against it — `GalleryManifestParityTest` on each side fails
if the surface list drifts from the manifest.

Variant matrix by category: DataCarrying captures 4 PNGs
(empty/populated × light/dark), DataFree captures 2 (light/dark only).
Empty variants for DataCarrying use `FixtureDataManager.empty(seedLookups = false)`
so form screens that only read DM lookups can diff against populated.

Detail-screen rendering fixed on both platforms. Root cause: VM
`stateIn(Eagerly, initialValue = …)` closures evaluated
`_selectedX.value` before screen-side `LaunchedEffect` / `.onAppear`
could set the id, leaving populated captures byte-identical to empty.

  Kotlin: `ContractorViewModel` + `DocumentViewModel` accept
  `initialSelectedX: Int? = null` so the id is set in the primary
  constructor before `stateIn` computes its seed.

  Swift: `ContractorViewModel`, `DocumentViewModelWrapper`,
  `ResidenceViewModel`, `OnboardingTasksViewModel` gained pre-seed
  init params. `ContractorDetailView`, `DocumentDetailView`,
  `ResidenceDetailView`, `OnboardingFirstTaskContent` gained
  test/preview init overloads that accept the pre-seeded VM.
  Corresponding view bodies prefer cached success state over
  loading/error — avoids a spinner flashing over already-visible
  content during background refreshes (production benefit too).

Real production bug fixed along the way: `DataManager.clear()` was
missing `_contractorDetail`, `_documentDetail`, `_contractorsByResidence`,
`_taskCompletions`, `_notificationPreferences`. On logout these maps
leaked across user sessions; in the gallery they leaked the previous
surface's populated state into the next surface's empty capture.

`ImagePicker.android.kt` guards `rememberCameraPicker` with
`LocalInspectionMode` — `FileProvider.getUriForFile` can't resolve the
Robolectric test-cache path, so `add_document` / `edit_document`
previously failed the entire capture.

Honest reclassifications: `complete_task`, `manage_users`, and
`task_suggestions` moved to DataFree. Their first-paint visible state
is driven by static props or APILayer calls, not by anything on
`IDataManager` — populated would be byte-identical to empty without
a significant production rewire. The manifest comments call this out.

Manifest counts after all moves: 43 screens = 12 DataCarrying + 31
DataFree, 37 on both platforms + 3 Android-only (home, documents,
biometric_lock) + 3 iOS-only (documents_warranties, add_task,
profile_edit).

Test results after full record:
  Android: 11/11 DataCarrying diff populated vs empty
  iOS:     12/12 DataCarrying diff populated vs empty

Also in this change:
- `scripts/build_parity_gallery.py` parses the Kotlin manifest
  directly, renders rows in product-flow order, shows explicit
  `[missing — <platform>]` placeholders for expected-but-absent
  captures and muted `not on <platform>` placeholders for
  platform-specific screens. Docs regenerated.
- `scripts/cleanup_orphan_goldens.sh` safely removes PNGs from prior
  test configurations (theme-named, compare artifacts, legacy
  empty/populated pairs for what is now DataFree). Dry-run by default.
- `docs/parity-gallery.md` rewritten: canonical-manifest workflow,
  adding-a-screen guide, variant matrix explained.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Trey T
2026-04-20 18:10:32 -05:00
parent 316b1f709d
commit 9fa58352c0
298 changed files with 2496 additions and 1343 deletions


@@ -1,9 +1,13 @@
# Parity gallery — iOS ↔ Android snapshot regression
Every primary screen on both platforms is captured as a PNG golden and
committed to the repo. A PR that drifts from a golden fails CI. The
committed `docs/parity-gallery.html` pairs iOS and Android side-by-side in
a scrollable HTML grid you can open locally or from gitea's raw-file view.
Every user-reachable screen in the HoneyDue app is captured as a PNG
golden on both platforms and committed to the repo. A PR that drifts
from a golden fails CI. The gallery HTML (`docs/parity-gallery.html`)
pairs iOS and Android renders side-by-side so cross-platform UX
divergences are visible at a glance. Gaps — screens captured on one
platform but not the other — render as explicit red-bordered
`[missing — android]` / `[missing — ios]` placeholders rather than
silently omitted rows, so the work to close them is obvious.
## Quick reference
@@ -14,82 +18,195 @@ make optimize-goldens # Rerun zopflipng over existing PNGs. Idempotent.
python3 scripts/build_parity_gallery.py # Rebuild docs/parity-gallery.html
```
## Canonical manifest — the single source of truth
Every screen in the gallery is declared once in
`composeApp/src/commonMain/kotlin/com/tt/honeyDue/testing/GalleryManifest.kt`.
The manifest is a `commonMain` Kotlin object — readable from both
platforms, via SKIE from Swift — listing each screen's canonical name,
category, and which platforms capture it:
```kotlin
GalleryScreen("contractor_detail", GalleryCategory.DataCarrying, both)
GalleryScreen("login", GalleryCategory.DataFree, both)
GalleryScreen("home", GalleryCategory.DataCarrying, androidOnly)
GalleryScreen("profile_edit", GalleryCategory.DataFree, iosOnly)
```
Two parity tests keep the platforms aligned with the manifest:
- `composeApp/src/androidUnitTest/kotlin/com/tt/honeyDue/screenshot/GalleryManifestParityTest.kt`
fails if the entries in `GallerySurfaces.kt` don't match the subset of
the manifest with `Platform.ANDROID` in their `platforms`.
- `iosApp/HoneyDueTests/GalleryManifestParityTest.swift` does the same
for `SnapshotGalleryTests.swift` against `Platform.IOS`.
If you add a screen to either platform without updating the manifest,
CI fails with a specific diff message telling you what's drifted.
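The comparison the parity tests are described as making can be sketched as a pair of set differences. This is a minimal model in Python, not the Kotlin or Swift test code; `manifest_drift` and the `settings_debug` screen are hypothetical names used only for illustration.

```python
def manifest_drift(manifest, covered, platform):
    """Compare a platform's covered screens with its manifest subset.

    manifest: dict mapping screen name -> set of platform names
    covered:  iterable of screen names the platform's harness captures
    Returns (missing, unexpected) as sorted lists.
    """
    expected = {name for name, platforms in manifest.items() if platform in platforms}
    covered = set(covered)
    return sorted(expected - covered), sorted(covered - expected)


manifest = {
    "contractor_detail": {"android", "ios"},
    "login": {"android", "ios"},
    "home": {"android"},
    "profile_edit": {"ios"},
}

# Hypothetical Android surface list: `login` was forgotten and an
# undeclared screen was added.
missing, unexpected = manifest_drift(
    manifest, ["contractor_detail", "home", "settings_debug"], "android"
)
print("missing from harness:", missing)
print("not in manifest:", unexpected)
```

Reporting both directions separately is what lets a failure message say whether the manifest or the harness needs the update.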
## Variant matrix — driven by category
Every screen captures one of two matrices, chosen by `GalleryCategory`
in the manifest:
**`DataCarrying` — 4 captures per surface**
```
<screen>_empty_light.png <screen>_empty_dark.png
<screen>_populated_light.png <screen>_populated_dark.png
```
Empty variants use `FixtureDataManager.empty(seedLookups = false)` so
even form screens that only read dropdowns produce a visible diff
between empty and populated.
**`DataFree` — 2 captures per surface**
```
<screen>_light.png <screen>_dark.png
```
Used for pure forms, auth flows, onboarding steps, and static chrome
that render no entity data. The populated variant is deliberately
omitted — it would be byte-identical to empty and add zero signal.
The fixture seed still uses `empty(seedLookups = true)` so the
priority picker, theme list, and subscription-tier gates render the
same as they would for a freshly signed-in user in production.
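The two matrices can be written down as a small function from category to expected file names. The sketch below assumes the Android-style `<screen>_<suffix>.png` naming shown above (iOS goldens use a different prefix); `variant_suffixes` and `golden_names` are illustrative names, not identifiers from the repo.

```python
from itertools import product

MODES = ("light", "dark")
STATES = ("empty", "populated")


def variant_suffixes(category):
    """Return the capture suffixes for a manifest category."""
    if category == "DataCarrying":
        # 2 states x 2 modes = 4 captures
        return [f"{state}_{mode}" for state, mode in product(STATES, MODES)]
    if category == "DataFree":
        # modes only = 2 captures
        return list(MODES)
    raise ValueError(f"unknown category: {category}")


def golden_names(screen, category):
    """Return the expected golden file names for one screen."""
    return [f"{screen}_{suffix}.png" for suffix in variant_suffixes(category)]


print(golden_names("contractor_detail", "DataCarrying"))
print(golden_names("login", "DataFree"))
```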
## How it works
### Shared fixtures
`composeApp/src/commonMain/kotlin/com/tt/honeyDue/testing/FixtureDataManager.kt`
exposes `.empty()` and `.populated()` factories. Both platforms render the
same screens against the same fixture graph — the only cross-platform
differences left are actual UI code differences (by design). Fixtures use
a fixed clock (`Fixtures.FIXED_DATE = LocalDate(2026, 4, 15)`) so dates
never drift.
The pipeline is four moving parts: **fixture → DataManager seed → VM
derived state → screen capture**. Every snapshot reads the same fixture
graph on both platforms, and every VM receives that fixture through the
same DI seam.
### Android capture (Roborazzi)
- `composeApp/src/androidUnitTest/kotlin/com/tt/honeyDue/screenshot/ScreenshotTests.kt`
declares one `@Test` per surface in `GallerySurfaces.kt`.
- Each test captures 4 variants: `empty × light`, `empty × dark`,
`populated × light`, `populated × dark`.
- Runs in Robolectric — no emulator needed, no flake from animations.
- Goldens: `composeApp/src/androidUnitTest/roborazzi/<screen>_<state>_<mode>.png`
### 1. Shared fixtures
`composeApp/src/commonMain/kotlin/com/tt/honeyDue/testing/FixtureDataManager.kt`
implements `IDataManager` with in-memory `StateFlow` fields. Two
factories:
- **`empty(seedLookups: Boolean = true)`** — no residences, tasks,
contractors, or documents. When `seedLookups` is `false`
(DataCarrying variant), lookups (priorities, categories, templates)
are empty too; when `true` (DataFree variant + default production
call sites), lookups are present because the picker UI expects them.
- **`populated()`** — every StateFlow is seeded: 2 residences, 8 tasks,
3 contractors, 5 documents, totals, all lookups, detail maps, task
completions, notification preferences.
Fixtures use a fixed clock (`Fixtures.FIXED_DATE = LocalDate(2026, 4, 15)`)
so relative dates like "due in 3 days" never drift between runs.
### 2. DI seam: `IDataManager` injection
Every ViewModel accepts `dataManager: IDataManager = DataManager` as a
constructor parameter and derives read-state reactively via
`stateIn(SharingStarted.Eagerly, initialValue = ...)`. The initial
value is computed from `dataManager.x.value` synchronously at VM
construction — so when a snapshot captures the first composition frame,
the VM already holds populated data, no dispatcher flush required.
Detail ViewModels (Contractor, Document, Task) additionally accept an
`initialSelectedX: Int? = null` parameter. The parity-gallery harness
passes a known fixture id at construction so the `stateIn` initial-value
closure — which reads `_selectedX.value` — observes the id and seeds
`Success(entity)` on the first frame. Without this, the screen's own
`LaunchedEffect(id) { vm.loadX(id) }` dispatches the id assignment to a
coroutine that runs *after* capture, leaving both empty and populated
captures byte-identical on the `Idle` branch.
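The ordering problem is easier to see in a toy model. The Python sketch below is a language-neutral stand-in, not the app's Kotlin: `DetailViewModel`, `load`, `flush`, and the fixture id `7` are hypothetical, and the pending queue only imitates work a coroutine dispatcher would run after the first frame.

```python
class DetailViewModel:
    """Toy stand-in for a detail VM whose state is seeded at construction."""

    def __init__(self, entities, initial_selected_id=None):
        self._entities = entities              # id -> entity, like a fixture map
        self._selected_id = initial_selected_id
        self._pending = []                     # work a dispatcher would run later
        # The seed is computed once, here, like an eager initial value.
        self.state = self._derive()

    def _derive(self):
        entity = self._entities.get(self._selected_id)
        return ("Success", entity) if entity is not None else ("Idle", None)

    def load(self, entity_id):
        # Models the screen-side effect: the id assignment is only queued.
        self._pending.append(entity_id)

    def flush(self):
        # Models the dispatcher finally running the queued assignments.
        for entity_id in self._pending:
            self._selected_id = entity_id
            self.state = self._derive()
        self._pending.clear()


def first_frame(entities, preseed):
    vm = DetailViewModel(entities, initial_selected_id=7 if preseed else None)
    vm.load(7)        # the screen still requests the id on appear
    return vm.state   # captured before any queued work runs


empty, populated = {}, {7: "Acme Plumbing"}

# Without the pre-seed, both fixtures capture the same Idle frame.
print(first_frame(empty, preseed=False), first_frame(populated, preseed=False))
# With the pre-seed, the populated fixture reaches Success on the first frame.
print(first_frame(empty, preseed=True), first_frame(populated, preseed=True))
```

Passing the id through the constructor moves the assignment ahead of the seed computation, which is the only point the first captured frame can observe.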
This DI contract is enforced by a file-scan regression test:
`composeApp/src/androidUnitTest/kotlin/com/tt/honeyDue/architecture/NoIndependentViewModelStateFileScanTest.kt`.
### 3. Test-time injection (both channels)
`ScreenshotTests.kt` (Android) and `SnapshotGalleryTests.swift` (iOS)
seed **two** paths per variant because screens read data through two
channels:
1. **`LocalDataManager`** (Android CompositionLocal) /
`DataManagerObservable.shared` (iOS `@EnvironmentObject`) — screens
that read the ambient DataManager pick up the fixture through the
composition/environment tree.
2. **`DataManager` singleton** (Android) / same observable (iOS) —
VMs instantiated without an explicit `dataManager:` arg default to
the singleton. The test clears the singleton then seeds every
StateFlow from the fixture before capture.
Clearing the singleton between variants is critical — without
`dm.clear()` the previous surface's populated data leaks into the next
surface's empty capture.
### 4. Android capture (Roborazzi)
- Test runner: `ParameterizedRobolectricTestRunner` +
`@GraphicsMode(NATIVE)` + `@Config(qualifiers = "w360dp-h800dp-mdpi")`.
- `LocalInspectionMode` is provided as `true` so composables that call
`FileProvider.getUriForFile` (camera pickers), APNs / FCM registration,
or animation tickers short-circuit in the hermetic test environment.
- Compose resources bootstrap: `@Before` hook installs the
`AndroidContextProvider` static via reflection so `stringResource(...)`
works under Robolectric.
- Goldens: `composeApp/src/androidUnitTest/roborazzi/<screen>_<suffix>.png`.
- Typical size: 30-80 KB per image.
### iOS capture (swift-snapshot-testing)
- `iosApp/HoneyDueTests/SnapshotGalleryTests.swift` has 4 tests per screen.
- Rendered at `displayScale: 2.0` (not the native 3.0) to cap per-image size.
- Uses `FixtureDataManager.shared.empty()` / `.populated()` via SKIE.
- Goldens: `iosApp/HoneyDueTests/__Snapshots__/SnapshotGalleryTests/test_<name>.<variant>.png`
- Typical size: 150-300 KB per image after `zopflipng` post-processing.
### 5. iOS capture (swift-snapshot-testing)
- Uses `FixtureDataManager.shared.empty(seedLookups:)` /
`.populated()` via SKIE interop.
- Swift VMs subscribe to `DataManagerObservable.shared`; the harness
copies fixture StateFlow values onto the observable's `@Published`
properties synchronously before the view is instantiated so VMs seed
from cache without waiting for Combine's async dispatch.
- Rendered at `displayScale: 2.0` (not native 3.0) to cap per-image
size.
- Goldens:
`iosApp/HoneyDueTests/__Snapshots__/SnapshotGalleryTests/test_<func>.<suffix>.png`.
- Typical size: 150-300 KB per image after `zopflipng`.
### Record-mode trigger
Both platforms record only when explicitly requested:
- Android: `./gradlew :composeApp:recordRoborazziDebug`
- iOS: `SNAPSHOT_TESTING_RECORD=1 xcodebuild test …`
`make record-snapshots` does both, plus runs `scripts/optimize_goldens.sh`
to shrink the output PNGs. No code edits required to switch between record
and verify — the env var / gradle task controls everything.
to shrink the output PNGs.
## When to record vs verify
## Adding a screen
**Verify** is what CI runs on every PR. It is the gate. If verify fails,
ask: *was this drift intentional?*
1. **Declare in the manifest**
`composeApp/src/commonMain/kotlin/com/tt/honeyDue/testing/GalleryManifest.kt`:
```kotlin
GalleryScreen("my_new_screen", GalleryCategory.DataCarrying, both),
```
Update the `expected_counts_match_plan` canary in
`GalleryManifestTest` to match the new totals.
**Record** is what you run locally when a UI change is deliberate and you
want to publish the new look as the new baseline. Commit the regenerated
goldens alongside your code change so reviewers see both the code and the
visual result in one PR.
2. **Wire Android** — add a `GallerySurface(...)` entry in
`composeApp/src/androidUnitTest/kotlin/com/tt/honeyDue/screenshot/GallerySurfaces.kt`.
If the screen is a detail view, pass the VM explicitly with
`initialSelectedX = <fixtureId>`:
```kotlin
GallerySurface("my_new_screen") {
val id = Fixtures.xxx.first().id
val vm = remember { MyViewModel(initialSelectedId = id) }
MyScreen(id = id, viewModel = vm, onNavigateBack = {})
}
```
Running record by mistake (on a branch where you didn't intend to change
UI) will produce a large image-diff in `git status`. That diff is the
signal — revert the goldens, investigate what unintentionally changed.
3. **Wire iOS** — add a `test_<name>()` function in
`iosApp/HoneyDueTests/SnapshotGalleryTests.swift`, using
`snapDataCarrying(...)` or `snapDataFree(...)` as appropriate.
Add the canonical name to `iosCoveredScreens` in
`GalleryManifestParityTest.swift`.
## Adding a screen to the gallery
4. **Regenerate goldens** — `make record-snapshots`, then
`python3 scripts/build_parity_gallery.py` to rebuild the HTML.
### Android
Add one entry to
`composeApp/src/androidUnitTest/kotlin/com/tt/honeyDue/screenshot/GallerySurfaces.kt`:
5. **Commit the code change, the goldens, and the regenerated gallery
together** so reviewers see the intent + the visual result in one
PR.
```kotlin
GallerySurface("my_new_screen") { MyNewScreen(onNavigateBack = {}, /* required params from fixtures */) },
```
If the screen needs a specific model (`task`, `residence`, etc.) pass one
from `Fixtures.*` — e.g. `Fixtures.tasks.first()`. If the screen renders
differently in empty vs populated, the `LocalDataManager` provider wiring
in `ScreenshotTests.kt` handles it automatically.
### iOS
Add 4 test functions to `iosApp/HoneyDueTests/SnapshotGalleryTests.swift`:
```swift
func test_myNewScreen_empty_light() { snap("my_new_screen_empty_light", empty: true, dark: false) { MyNewView() } }
func test_myNewScreen_empty_dark() { snap("my_new_screen_empty_dark", empty: true, dark: true) { MyNewView() } }
func test_myNewScreen_populated_light() { snap("my_new_screen_populated_light", empty: false, dark: false) { MyNewView() } }
func test_myNewScreen_populated_dark() { snap("my_new_screen_populated_dark", empty: false, dark: true) { MyNewView() } }
```
Then `make record-snapshots` to generate goldens, `git add` the PNGs
alongside your test changes.
The parity tests fail until both platforms' surface lists match the
manifest — you'll know immediately if you miss step 2 or 3.
## Approving intentional UI drift
@@ -101,26 +218,41 @@ make record-snapshots
git status composeApp/src/androidUnitTest/roborazzi/ iosApp/HoneyDueTests/__Snapshots__/
git diff --stat composeApp/src/androidUnitTest/roborazzi/ iosApp/HoneyDueTests/__Snapshots__/
# 3. Stage and commit alongside the UI code change.
git add <screen-file.kt> <SnapshotGalleryTests.swift changes> \
# 3. Rebuild the HTML gallery.
python3 scripts/build_parity_gallery.py
# 4. Stage and commit alongside the UI code change.
git add <screen-file> \
composeApp/src/androidUnitTest/roborazzi/ \
iosApp/HoneyDueTests/__Snapshots__/
iosApp/HoneyDueTests/__Snapshots__/ \
docs/parity-gallery.html docs/parity-gallery-grid.md
git commit -m "feat: <what changed>"
```
Reviewers see the code diff AND the golden diff in one PR — makes intent
obvious.
## Cleaning up orphan goldens
`scripts/cleanup_orphan_goldens.sh` removes PNGs left over from prior
test configurations — old multi-theme captures (`*_default_*`,
`*_midnight_*`, `*_ocean_*`), Roborazzi comparison artifacts
(`*_actual.png`, `*_compare.png`), and legacy empty/populated pairs for
DataFree surfaces (which now capture `<name>_light.png` /
`<name>_dark.png` only). Dry-runs by default; pass `--execute` to
actually delete.
```bash
./scripts/cleanup_orphan_goldens.sh # preview
./scripts/cleanup_orphan_goldens.sh --execute # delete
```
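The matching rules described above can be modelled with glob patterns. The script itself is shell; this Python sketch only illustrates the classification, and `is_orphan` plus the sample file names are hypothetical.

```python
from fnmatch import fnmatchcase

ORPHAN_PATTERNS = (
    "*_default_*", "*_midnight_*", "*_ocean_*",  # old multi-theme captures
    "*_actual.png", "*_compare.png",             # Roborazzi comparison artifacts
)


def is_orphan(filename, data_free_screens=()):
    """Return True if a golden file name looks like a leftover."""
    if any(fnmatchcase(filename, pattern) for pattern in ORPHAN_PATTERNS):
        return True
    # Legacy empty/populated pairs for screens that are now DataFree.
    for screen in data_free_screens:
        for state in ("empty", "populated"):
            for mode in ("light", "dark"):
                if filename == f"{screen}_{state}_{mode}.png":
                    return True
    return False


candidates = [
    "login_ocean_dark.png",
    "home_populated_light_compare.png",
    "login_empty_light.png",
    "login_light.png",
    "contractor_detail_empty_light.png",
]
# Dry-run style: list what would go without deleting anything.
would_delete = [name for name in candidates if is_orphan(name, ["login"])]
print(would_delete)
```

Glob patterns this broad can also match a legitimate screen whose name happens to contain a theme word, which is one reason a dry-run default is a sensible safeguard.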
## Image size budget
Per-file soft budget: **400 KB**. Enforced by CI.
Android images are rarely this large. iOS images can exceed 400 KB for
Android images rarely approach this. iOS images can exceed 400 KB for
gradient-heavy screens (Onboarding welcome, organic blob backgrounds).
If a new screen exceeds budget:
1. Check whether the screen really needs a full-viewport gradient.
2. If yes, consider rendering at `displayScale: 1.0` for just that test
(the `snap` helper accepts an override).
2. If yes, consider rendering at `displayScale: 1.0` for just that test.
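To find offenders locally before CI does, a budget check can be sketched as follows. This is not the CI job: `over_budget` is a hypothetical helper, and treating 1 KB as 1024 bytes is an assumption the document does not confirm.

```python
import tempfile
from pathlib import Path

BUDGET_BYTES = 400 * 1024  # assumption: KB means 1024 bytes


def over_budget(root, budget=BUDGET_BYTES):
    """Return (path, size) for every PNG under root above budget, largest first."""
    sizes = [(path, path.stat().st_size) for path in Path(root).rglob("*.png")]
    return sorted(
        ((path, size) for path, size in sizes if size > budget),
        key=lambda item: item[1],
        reverse=True,
    )


# Demonstration on throwaway files rather than the real golden directories.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "small_light.png").write_bytes(b"\0" * 1024)
    (Path(tmp) / "big_dark.png").write_bytes(b"\0" * (BUDGET_BYTES + 1))
    offenders = [(path.name, size) for path, size in over_budget(tmp)]

print(offenders)
```

Pointing `over_budget` at the golden directories listed earlier would give the same report for real captures.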
## Tool installation
@@ -130,15 +262,23 @@ brew install zopfli # preferred — better compression
brew install pngcrush # fallback
```
Neither installed? `make record-snapshots` warns and skips optimization;
goldens are still usable, just larger.
Neither installed? `make record-snapshots` warns and skips optimization.
## HTML gallery
`docs/parity-gallery.html` is regenerated by
`scripts/build_parity_gallery.py` whenever goldens change. It's a
self-contained HTML file with relative `<img>` paths that resolve within
the repo — so gitea's raw-file view renders it without any server.
`scripts/build_parity_gallery.py`, which parses the canonical manifest
directly (`GalleryManifest.kt`) and lays out one row per screen in
product-flow order (auth → onboarding → home → residences → tasks →
contractors → documents → profile → subscription). Platform cells
render as:
- **Captured PNG** — standard image.
- **`[missing — <platform>]` red-bordered box** — screen is in the
manifest for this platform but the PNG isn't on disk. Action needed.
- **`not on <platform>` muted-border box** — screen is explicitly
not-on-this-platform per the manifest (e.g. `home` is Android-only).
No action.
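The three cell states follow from two facts per screen and platform: whether the manifest lists the platform, and whether a PNG exists. The sketch below is a guess at that logic, not the script's code. The regular expression assumes entries look like the single-line `GalleryScreen(...)` calls shown in the manifest section, and `cell_kind`, `gallery_rows`, and the `captured` set are hypothetical.

```python
import re

# Assumed entry shape; named arguments or multi-line entries would be missed.
ENTRY = re.compile(
    r'GalleryScreen\(\s*"(?P<name>\w+)"\s*,\s*'
    r'GalleryCategory\.(?P<category>\w+)\s*,\s*(?P<platforms>\w+)\s*\)'
)

PLATFORMS = {
    "both": {"android", "ios"},
    "androidOnly": {"android"},
    "iosOnly": {"ios"},
}


def cell_kind(platforms, platform, png_exists):
    """Classify one gallery cell."""
    if platform not in platforms:
        return "not-on-platform"   # muted placeholder, no action
    return "captured" if png_exists else "missing"  # missing needs action


def gallery_rows(manifest_text, captured):
    """captured: set of (screen, platform) pairs that have a PNG on disk."""
    rows = []
    for match in ENTRY.finditer(manifest_text):
        name = match["name"]
        platforms = PLATFORMS[match["platforms"]]
        rows.append((
            name,
            cell_kind(platforms, "android", (name, "android") in captured),
            cell_kind(platforms, "ios", (name, "ios") in captured),
        ))
    return rows


sample = '''
GalleryScreen("login", GalleryCategory.DataFree, both)
GalleryScreen("home", GalleryCategory.DataCarrying, androidOnly)
'''
rows = gallery_rows(sample, {("login", "android"), ("home", "android")})
print(rows)
```

Rows come out in declaration order here; the product-flow ordering described above would need either a manifest already declared in that order or an explicit sort key.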
To view locally:
```bash
@@ -146,56 +286,30 @@ python3 scripts/build_parity_gallery.py
open docs/parity-gallery.html
```
The gallery groups by screen name. Each row shows Android vs iOS for one
{state, mode} combination, with sticky headers for quick navigation.
## Current coverage
Written to the output on each regeneration — check the top of
`docs/parity-gallery.html` for the current count.
The `docs/parity-gallery-grid.md` variant renders inline in gitea's
Markdown viewer (gitea serves raw `.html` as `text/plain`).
## Known limitations
- **Android populated-state coverage is partial (10/34 surfaces differ).** Screens
like `home`, `profile`, `residences`, `contractors`, `all_tasks` render truly
populated data. The other ~24 screens (`documents`, `complete_task`,
`feature_comparison`, `notification_preferences`, `manage_users`, every
`edit_*` / `add_*` / auth form) currently show **identical renders for
empty and populated fixtures**, because their ViewModels independently track
state via `APILayer.getXxx()` calls that fail with "Not authenticated" in
Robolectric — the VM state never transitions to `ApiResult.Success` so the
screen's "populated" branch never renders, even though `LocalDataManager`
and the global `DataManager` singleton are both seeded with the fixture.
- **Cross-platform diff is visual, not pixel-exact.** SF Pro (iOS) vs
SansSerif (Android) render different glyph shapes by design.
Pixel-diff is only used within a platform.
**The architectural fix**: every VM's `xxxState` needs to mirror
`DataManager.xxx` reactively (e.g., `dataManager.documents.map { Success(it) }`)
instead of independently tracking the API call result. That's a
per-VM refactor across ~20 ViewModels; currently only `HomeScreen` and
`DocumentsScreen` have been patched to fall back to `LocalDataManager`
directly. Gallery viewers should treat a "same" row as indicating the
fixture didn't reach the screen, not that the screens genuinely render
identically.
- **`home` is Android-only.** Android has a dedicated dashboard route
with aggregate stats; iOS lands directly on the residences list
(iOS's first tab plays the product role Android's `home` does, but
renders different content). Captured as Android-only; iOS cell shows
the `not on ios` placeholder.
- **iOS populated-state coverage is partial**. Swift Views today instantiate
their ViewModels via `@StateObject viewModel = FooViewModel()`; the
ViewModels read `DataManagerObservable.shared` directly rather than
accepting an injected `IDataManager`. Until ViewModels gain a DI seam,
populated-state snapshots require per-screen ad-hoc workarounds.
Tracked as a follow-up.
- **`documents` vs `documents_warranties`.** Android has a single
`documents` route; iOS splits the same conceptual screen into a
segmented-tab `documents_warranties` view. Captured as two rows
rather than coerced into one to keep the structural divergence
visible.
- **Android detail-screen coverage is partial**. Screens that require a
pre-selected model (`ResidenceDetailScreen(residence = ...)`,
`ContractorDetailScreen(contractor = ...)`) silently skip rendering
unless `GallerySurfaces.kt` passes a fixture item. Expanding these to
full coverage is a follow-up PR — low-risk additions to
`GallerySurfaces.kt`.
- **`add_task`, `profile_edit`** are iOS-only — Android presents these
flows inline (dialog inside `residence_detail`, inline form inside
`profile`). Captured as iOS-only.
- **Cross-platform diff is visual, not pixel-exact**. SF Pro (iOS) vs
SansSerif (Android) render different glyph shapes by design. Pixel-diff
is only used within a platform — the HTML gallery is for side-by-side
human review.
- **Roborazzi path mismatch**. The historical goldens lived at
`composeApp/src/androidUnitTest/roborazzi/`. The Roborazzi Gradle block
sets `outputDir` to match. If `verifyRoborazziDebug` ever reports
"original file not found", confirm the `outputDir` hasn't drifted.
- **`biometric_lock`** is Android-only — iOS uses the system Face ID
prompt directly, not a custom screen.