test(harness): fix T10 by driving daemon respawn from a main-side eipc call

T10 was passing on older bundles where the cowork client retried the
VM-service connection on a polling cadence: every retry tick was an
implicit trigger for the patched cooldown-gated auto-launch.
Post-1.5354.0 the client opens a persistent socket at boot (zI/E\$i
happy path → KSt) and routes every subsequent RPC through it, so steady
state has no traffic. After SIGKILL the persistent socket goes dead but
no client code is in flight, so kUe()'s catch branch never enters and
the daemon stays gone.

The case-doc claim is upheld by the production code; the patch is
correctly applied (`_lastSpawn` × 3 in installed asar, `_svcLaunched`
× 0). Only the test's trigger model was stale.

Three changes:

1. Wait for `userLoaded`, not `mainVisible`. The post-kill RPC has to
   land in a webContents whose URL matches `claude.ai`; pre-login
   `/login/...` URLs aren't reachable via that filter.

2. Phase 3 fires a daemon RPC each iteration. The renderer wrapper
   (`window['claude.web'].ClaudeVM.getRunningStatus`) was the obvious
   first try but was unreliable: 29/30 calls threw
   `Cannot find context with specified id` because the dead-daemon
   state forces a renderer re-render that invalidates the cached
   execution context. Switched to invoking the eipc handler from MAIN
   directly via `wc.ipc._invokeHandlers.get(channel)(fakeEvent)` with
   `senderFrame.url = 'https://claude.ai/'`. The handler still goes
   through zI/VsA/kUe, the dead socket still throws, the cooldown gate
   still opens, and the patched fork still fires, just without any
   renderer dependency. Three consecutive runs at 21.0s.

3. Budget bumped 20s → 30s. The 10s cooldown is a hard floor, and the
   daemon needs another second or two to bind the socket; 20s was on
   the edge.

Telemetry now reports `rpcAttempts` / `rpcFailures` /
`globalDaemonPidFinal` (the patched `__coworkDaemonPid` global) so
future regressions can be diagnosed from the failure attachment alone.

Co-Authored-By: Claude <claude@anthropic.com>
test(harness): fix S25 by routing require through process.mainModule
2026-05-17 08:36:35 +03:00 · 2026-05-04 07:29:57 -04:00 · 2026-05-04 07:29:35 -04:00 · 2026-05-04 00:30:52 -04:00 · 2026-05-04 00:23:16 -04:00 · 2026-05-04 00:23:05 -04:00
162 changed files with 38105 additions and 0 deletions
--- a/.gitignore
+++ b/.gitignore
@@ -33,3 +33,7 @@ result-*

 # Wrangler (Cloudflare Worker dev/deploy cache)
 worker/.wrangler/
+
+# UI snapshots — captured renderer state, intentionally ignored to avoid
+# diff churn. See docs/testing/ui-snapshots/README.md.
+docs/testing/ui-snapshots/*.json
--- a/CLAUDE.md
+++ b/CLAUDE.md
@@ -15,6 +15,8 @@ The [`docs/learnings/`](docs/learnings/) directory contains hard-won technical k
 - [`tray-rebuild-race.md`](docs/learnings/tray-rebuild-race.md) — why destroy + recreate on `nativeTheme` updates briefly duplicates the tray icon on KDE Plasma, and the in-place `setImage` + `setContextMenu` fast-path that avoids the SNI re-registration race
 - [`mcp-double-spawn.md`](docs/learnings/mcp-double-spawn.md) — Stdio MCPs spawn 2× when chat and Code/Agent panels are both active, root cause in upstream session managers, MCP-author workaround
 - [`linux-topbar-shim.md`](docs/learnings/linux-topbar-shim.md) — why claude.ai's in-app topbar is missing on Linux, the four gates that hide it, why the upstream `frame:false` + WCO config has unclickable buttons on X11 (Chromium-level implicit drag region), and the resolution: hybrid mode (system frame + UA-spoof shim → stacked layout, full button functionality)
+- [`test-harness-electron-hooks.md`](docs/learnings/test-harness-electron-hooks.md) — why constructor-level `BrowserWindow` wraps are silently bypassed by `frame-fix-wrapper`'s Proxy, and the prototype-method hook pattern that works (used by the Quick Entry test runners)
+- [`test-harness-ax-tree-walker.md`](docs/learnings/test-harness-ax-tree-walker.md) — five non-obvious traps in the v7 fingerprint walker after the AX-tree migration: AX-enable async lag, navigateTo-to-same-URL no-op, claude.ai's flat `dialog>button[]` lists, the `more options for X` per-row shape, and sidebar virtualization vs the lookup-failure threshold

 ## Code Style

--- a/docs/learnings/test-harness-ax-tree-walker.md
+++ b/docs/learnings/test-harness-ax-tree-walker.md
@@ -0,0 +1,134 @@
+# Test-harness AX-tree walker — non-obvious traps
+
+Notes from the v6 → v7 fingerprint migration that switched
+`tools/test-harness/explore/walker.ts` from a renderer-side
+`document.querySelectorAll` IIFE to Chromium's accessibility tree
+(`Accessibility.getFullAXTree` over CDP). All five gotchas below cost
+a wasted live-walk to find; capturing them here so the next person
+debugging a 0-entry inventory or a redrive cascade can skip the
+discovery loop.
+
+## 1. `Accessibility.enable` is async; the first `getFullAXTree` lies
+
+Inspector clients call `target.debugger.sendCommand('Accessibility.enable')`
+before the first `getFullAXTree`. Both calls return immediately, but
+Chromium populates the AX tree asynchronously — the very first
+read can return a tree containing only the `RootWebArea` and a
+generic shell (4 nodes total) even when the DOM has hundreds of
+interactive elements. The walker's existing `waitForStable` is a
+DOM-mutation-quiescence observer with a 1.5s ceiling; on claude.ai's
+SPA the DOM mutates constantly so `waitForStable` returns at the
+ceiling without the AX tree ever catching up.
+
+**Fix:** `waitForAxTreeStable` polls `getFullAXTree` until two
+consecutive reads return the same node count. Called once before the
+seed snapshot (with `minNodes: 20` to gate against the 4-node "still
+loading" case), once after each `navigateTo` in `redrivePath`, and
+baked into every `snapshotSurface` call (with `minNodes: 1` for the
+post-click case where the tree is already populated).
+
+**Symptom you'll see:** seed entries: 0. Walker exits with no
+inventory. Stderr says `walker: AX tree settled at 4 nodes` (or
+similar small number).
+
+## 2. `navigateTo(sameUrl)` is a no-op; redrives carry prior state
+
+The walker's `navigateTo(url)` short-circuits when `currentUrl === url`
+(per the original v6 implementation). Every BFS pop re-navigates
+to `startUrl` to replay the recorded path against a clean state, but
+when `currentUrl` already matches `startUrl` the navigation is
+skipped. Anything a prior drill left behind — open dialog, expanded
+sidebar, scrolled focus, route params — carries into the next
+redrive's snapshots. `clickById` then suffix-matches the requested
+fingerprint against a contaminated surface and silently fails to find
+elements that were absolutely on the seed surface.
+
+**Fix:** `redrivePath` uses `reloadPage(inspector)` (which evals
+`location.reload()` in the renderer) instead of
+`navigateTo(startUrl)`. The reload discards the React tree and forces
+a fresh mount even when the URL matches.
+
+**Symptom you'll see:** the first one or two BFS items succeed, then
+every subsequent redrive fails with
+`clickById: no element matches "<seed-id>" on current surface`, even
+though the DevTools console confirms that the `<seed-id>` button is
+visibly present.
+
+## 3. claude.ai uses flat `dialog>button[]` and `complementary>button[]`, not `role=list`
+
+The v7 plan's `isListRowChild` check assumes list rows use ARIA list
+semantics (`option/listitem` inside `listbox/list`). claude.ai
+exposes the connect-apps marketplace as a `dialog` with ~80 plain
+`button` children (no `list` wrapper) and the cowork sidebar as a
+`complementary` landmark with ~70 plain `button` children. Without
+the heuristic those buttons literal-match by name → each gets a
+unique stable entry → the BFS queues each individually for drilling
+→ inventory bloats from 32 to 442+ entries and most drills fail
+because the per-row buttons are virtualized.
+
+**Fix:** `isListRowChild` extended in two ways. (a) `LIST_ROW_ROLES`
+includes `button`, `LIST_ANCESTOR_ROLES` includes `group`. (b) A
+sibling-count fallback fires when `siblingTotal >= 15` regardless of
+ancestor role — sits well above realistic toolbar sizes (≤10) and
+well below the smallest claude.ai marketplace (~80). Step 3
+(positional fallback) also gates on `!isListRowChild` so list rows
+fall through to step 4's `instance` collapse instead of fragmenting
+into per-index positionals that can't fold.
+
+**Symptom you'll see:** dialog kind count balloons (>200). One surface
+dominates the `surfaceBreakdown` query in the inventory. Each
+marketplace card or sidebar row gets its own `kind: structural`
+entry with a slugified product name in the id-tail.
+
+## 4. The `more options for X` per-row trigger needs its own shape
+
+Cowork sidebar rows have a "⋮" menu next to each session whose
+aria-label is `More options for <session title>`. These don't match
+the `cowork-session` shape (which gates on status prefix), so even
+after `cowork-session` collapsed the session list, the sibling
+"More options for" buttons still emitted individually. Same for any
+future per-row action button claude.ai adds.
+
+**Fix:** new `INSTANCE_SHAPES` entry `row-more-options` with regex
+`/^More options for /` and matching pattern. Generic enough to cover
+any per-row trigger that follows the `<verb> for <row title>` shape.
+
+**Symptom you'll see:** after fixing (1)-(3), a fresh wave of
+redrive failures all matching `more-options-for-X` slugs.
+
+## 5. Sidebar virtualization causes structural redrive misses; bump the threshold
+
+claude.ai's cowork sidebar appears to virtualize the session list:
+each fresh page load exposes a slightly different subset of sessions
+in the AX tree (subset, not just ordering — actually different
+membership). The walker captures session N at seed time but on
+redrive after `reloadPage` session N may not be in the tree. Each
+miss counts toward `MAX_CONSECUTIVE_LOOKUP_FAILURES`, and a stretch
+of 25+ consecutive cowork-row redrives can blow through the original
+threshold without the renderer being meaningfully wedged.
+
+**Fix:** threshold bumped 25 → 75. The timeout counter (still 5
+strikes) gates against actual renderer hangs; the lookup-failure
+counter is more about "discovered DOM has drifted from seed", and on
+a virtualized list a generous threshold is correct. Subtree pruning
+(already in place) keeps the bursts from compounding by dropping
+queue items whose path shares the failed step's prefix.
+
+**Symptom you'll see:** the walker aborts mid-walk with
+`25 consecutive redrive lookup failures` and the failed ids all
+share a common ariaPath prefix (`root.complementary.button-by-name.X`).
+
+## Driver: prefer `walk-isolated.ts` over `explore walk`
+
+`npm run explore:walk` connects to whatever Node inspector is on
+:9229 — i.e. the host Claude Desktop the user is currently using.
+That mutates the host profile (visited surfaces, navigation history,
+route changes) and races with the human at the keyboard.
+
+`tools/test-harness/explore/walk-isolated.ts` mirrors what H05 / U01
+do: kills any running host instance, copies auth into a tmpdir
+(`createIsolation({ seedFromHost: true })`), spawns a fresh Electron
+with isolated `XDG_CONFIG_HOME`, attaches the inspector via
+`SIGUSR1`, runs the walk, tears down. Same flag set as
+`explore walk` plus `--no-seed` for the rare case you want a
+fresh-sign-in run. Use it.
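Note on section 1 of `docs/learnings/test-harness-ax-tree-walker.md` above: the `waitForAxTreeStable` fix can be sketched roughly as below. This is a minimal sketch under stated assumptions, not the walker's actual code. Only `minNodes` and the "two consecutive reads return the same node count" rule come from the doc. The injected `readNodeCount` callback (standing in for a `getFullAXTree` read), the `intervalMs` / `timeoutMs` options with their defaults, and the throw-on-timeout behaviour are all hypothetical.

```typescript
// Hypothetical option bag; only `minNodes` is named in the doc.
interface StableOptions {
  minNodes: number;    // reject "settled" reads below this size (e.g. the 4-node shell)
  intervalMs?: number; // assumed delay between polls
  timeoutMs?: number;  // assumed overall ceiling
}

const sleep = (ms: number): Promise<void> =>
  new Promise((resolve) => setTimeout(resolve, ms));

// Polls until two consecutive reads agree on a node count of at least `minNodes`.
async function waitForAxTreeStable(
  readNodeCount: () => Promise<number>,
  { minNodes, intervalMs = 100, timeoutMs = 5000 }: StableOptions,
): Promise<number> {
  const deadline = Date.now() + timeoutMs;
  let previous = -1; // sentinel: no read has happened yet
  while (Date.now() < deadline) {
    const current = await readNodeCount();
    if (current >= minNodes && current === previous) {
      return current; // same count twice in a row, and above the floor
    }
    previous = current;
    await sleep(intervalMs);
  }
  // Assumed failure mode: the doc does not say what happens on timeout.
  throw new Error(
    `AX tree did not settle at >= ${minNodes} nodes within ${timeoutMs}ms`,
  );
}
```

Injecting the reader keeps the loop exercisable without a live inspector. In the harness the callback would presumably wrap `Accessibility.getFullAXTree` and return the length of its `nodes` array. The `minNodes` floor is what keeps a stable-but-empty 4-node tree from counting as settled.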
--- a/docs/learnings/test-harness-electron-hooks.md
+++ b/docs/learnings/test-harness-electron-hooks.md
@@ -0,0 +1,99 @@
+# Hooking Electron from the test harness
+
+Why constructor-level `BrowserWindow` wraps don't work in this
+codebase, and the prototype-method hook that does.
+
+## TL;DR
+
+The test harness attaches a Node inspector at runtime (see
+[`docs/testing/automation.md`](../testing/automation.md#the-cdp-auth-gate-and-the-runtime-attach-workaround-that-beats-it))
+and from there can evaluate arbitrary JS in the main process. To
+observe BrowserWindow construction (e.g. find the Quick Entry popup
+ref, capture construction-time options), the natural-feeling
+approach is to wrap `electron.BrowserWindow`:
+
+```js
+const electron = process.mainModule.require('electron');
+const Orig = electron.BrowserWindow;
+electron.BrowserWindow = function(opts) {
+  // record opts...
+  return new Orig(opts);
+};
+```
+
+**This is silently bypassed.** `scripts/frame-fix-wrapper.js`
+returns the electron module wrapped in a `Proxy`; the Proxy's
+`get` trap returns a closure-captured `PatchedBrowserWindow`
+class. Reads of `electron.BrowserWindow` go through the trap and
+always return `PatchedBrowserWindow`, regardless of what was
+written to the underlying module. Writes succeed (Reflect.set on
+the target) but reads ignore them. Upstream code calling
+`new hA.BrowserWindow(opts)` constructs from `PatchedBrowserWindow`,
+so your wrap is never invoked and your registry stays empty.
+
+The reliable hook is at the **prototype-method level**:
+
+```js
+const proto = electron.BrowserWindow.prototype;
+const origLoadFile = proto.loadFile;
+proto.loadFile = function(filePath, ...rest) {
+  // every BrowserWindow instance reaches this, regardless of
+  // which subclass constructed it
+  return origLoadFile.call(this, filePath, ...rest);
+};
+```
+
+This is what `tools/test-harness/src/lib/quickentry.ts:installInterceptor`
+does.
+
+## Why prototype-level works through the Proxy
+
+`electron.BrowserWindow` returns `PatchedBrowserWindow`, which
+`extends` the original `BrowserWindow` class. Both share the
+underlying Electron-native prototype chain via `extends`. Setting
+`PatchedBrowserWindow.prototype.loadFile = wrappedFn` shadows the
+inherited method on every instance — `Patched`-constructed,
+frame-fix-constructed, plain. There's no Proxy in front of
+`PatchedBrowserWindow.prototype`, so the assignment sticks and is
+visible to all subsequent `instance.loadFile(...)` calls.
+
+`loadFile` and `loadURL` are reasonable identification points
+because every BrowserWindow that displays content calls one of
+them shortly after construction. The file path / URL is a stable
+upstream-controlled string (no minification — these are file paths
+to bundle assets), making it a durable identifier across releases.
+
+## Why constructor-level *can* work elsewhere
+
+If frame-fix-wrapper is removed (or stops returning a Proxy), the
+naïve constructor wrap would work. Watch for this: an upstream
+fork that adopts `BaseWindow` over `BrowserWindow`, or a
+build-time replacement of frame-fix-wrapper, would change the
+hook surface. The prototype-method approach survives both.
+
+## What can't be observed at the prototype level
+
+Construction-time options (`transparent: true`, `frame: false`,
+`skipTaskbar: true`, etc.) are consumed by the native side
+during `super(options)` and not stored on the instance in a
+reflective form. The harness reads runtime equivalents instead:
+
+- `transparent` → `getBackgroundColor() === '#00000000'`
+- `frame: false` → `getBounds().width === getContentBounds().width`
+  (frameless windows have equal frame and content bounds)
+- `alwaysOnTop` → `isAlwaysOnTop()` (note: the popup sets this
+  via `setAlwaysOnTop()` *after* construction at
+  `index.js:515399`, so this is the only viable read regardless of
+  hook approach)
+
+`skipTaskbar` has no public getter; if a test needs it, capture
+it at the prototype level by hooking a method that takes the same
+options shape, or accept that this signal is unobservable
+post-construction.
+
+## See also
+
+- [`tools/test-harness/src/lib/quickentry.ts`](../../tools/test-harness/src/lib/quickentry.ts) — `installInterceptor()` worked example
+- [`scripts/frame-fix-wrapper.js`](../../scripts/frame-fix-wrapper.js) — the Proxy + closure
+- [`tools/test-harness/src/lib/inspector.ts`](../../tools/test-harness/src/lib/inspector.ts) — how the harness gets main-process JS access in the first place
+- [`docs/testing/automation.md`](../testing/automation.md) — overall harness architecture
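Note on `docs/learnings/test-harness-electron-hooks.md` above: the bypass it describes can be reproduced without Electron. The sketch below is a stand-alone simulation, assuming the wrapper behaves as the doc says (a `Proxy` whose `get` trap always returns a closure-captured subclass). `OrigWindow`, `PatchedWindow`, `fakeElectron`, and the `quick-entry.html` path are hypothetical stand-ins, not names from `frame-fix-wrapper.js` or the upstream bundle.

```typescript
// Stand-ins for Electron's classes; nothing here touches the real `electron` module.
class OrigWindow {
  loadFile(filePath: string): string {
    return `loaded:${filePath}`;
  }
}
class PatchedWindow extends OrigWindow {}

// Mimics the wrapper: the `get` trap always answers with the captured class.
const target: Record<string, unknown> = { BrowserWindow: OrigWindow };
const fakeElectron = new Proxy(target, {
  get(obj, prop, receiver) {
    return prop === "BrowserWindow"
      ? PatchedWindow
      : Reflect.get(obj, prop, receiver);
  },
});

// Constructor-level wrap: the write lands on the target, but reads never see it.
const naiveWrap = function () { /* would record opts here */ };
fakeElectron.BrowserWindow = naiveWrap;
const wrapBypassed = fakeElectron.BrowserWindow === PatchedWindow;
const writeReachedTarget = target.BrowserWindow === naiveWrap;

// Prototype-level hook: no Proxy sits in front of the prototype, so it sticks.
const seenPaths: string[] = [];
const proto = PatchedWindow.prototype;
const origLoadFile = proto.loadFile;
proto.loadFile = function (this: OrigWindow, filePath: string): string {
  seenPaths.push(filePath); // record, then delegate to the original method
  return origLoadFile.call(this, filePath);
};

const result = new PatchedWindow().loadFile("quick-entry.html");
console.log({ wrapBypassed, writeReachedTarget, seenPaths, result });
```

Both flags come out true: the assignment succeeds on the underlying object, yet every read through the Proxy still yields `PatchedWindow`. The prototype hook records the path because instances look methods up on the prototype chain, not on the module object.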
--- a/docs/testing/README.md
+++ b/docs/testing/README.md
@@ -0,0 +1,112 @@
+# Linux Compatibility Testing
+
+*Last updated: 2026-05-03*
+
+This directory holds the manual test plan for the Linux fork of Claude Desktop. The structure is designed for human readers today and scripted runners tomorrow.
+
+## Layout
+
+| Folder / file | Purpose |
+|---------------|---------|
+| [`matrix.md`](./matrix.md) | **The dashboard.** Cross-environment results table + per-section env-specific status snapshots. Single source of truth for test status. |
+| [`runbook.md`](./runbook.md) | How to run a sweep: VM setup, diagnostic capture, status update workflow, severity guidance. |
+| [`cases/`](./cases/) | Functional test specs grouped by feature surface. Stable IDs: `T###` cross-env, `S###` env-specific. |
+| [`ui/`](./ui/) | UI element inventory. Per-surface checklists — every interactive element with expected state. |
+
+## Environment key
+
+| Abbrev | Distro | DE | Display server |
+|--------|--------|-----|----------------|
+| KDE-W  | Fedora 43 | KDE Plasma | Wayland |
+| KDE-X  | Fedora 43 | KDE Plasma | X11 |
+| GNOME  | Fedora 43 | GNOME | Wayland |
+| Ubu    | Ubuntu 24.04 | GNOME | Wayland |
+| Sway   | Fedora 43 | Sway | Wayland (wlroots) |
+| i3     | Fedora 43 | i3 | X11 |
+| Niri   | Fedora 43 | Niri | Wayland (wlroots) |
+| Hypr-O | OmarchyOS | Hyprland | Wayland (wlroots) |
+| Hypr-N | NixOS | Hyprland | Wayland (wlroots) |
+
+Status legend: `✓` pass · `✗` fail · `🔧` mitigated · `?` untested · `-` N/A
+
+Cells include linked issue/PR numbers when relevant — e.g. `✗ #404` or `🔧 #406`. A bare `✗` means the failure is verified but no tracking issue is filed yet.
+
+## Severity tiers
+
+Each test is tagged with one of:
+
+| Tier | Meaning | Sweep cadence |
+|------|---------|---------------|
+| **Smoke** | Release-gate. Must pass before any tag is cut. | Every release tag, on KDE-W + one wlroots row |
+| **Critical** | Regression-blocker. Failure on any supported environment blocks the release. | Every release tag, on every active row |
+| **Should** | Important but not blocking. Track as bugs, fix before next stable. | Quarterly + on demand |
+| **Could** | Edge cases, nice-to-have. | On demand only |
+
+## Smoke set
+
+The minimum set that gates a release. Run on **KDE-W** (daily-driver) plus **Hypr-N** (clean wlroots). Sweep target: ~20 minutes.
+
+| ID | Surface | One-line check |
+|----|---------|----------------|
+| [T01](./cases/launch.md#t01--app-launch) | Launch | App opens; main window renders within ~10s |
+| [T03](./cases/tray-and-window-chrome.md#t03--tray-icon-present) | Tray | Tray icon appears; click toggles window |
+| [T04](./cases/tray-and-window-chrome.md#t04--window-decorations-draw) | Window | OS-native frame draws and responds |
+| [T05](./cases/shortcuts-and-input.md#t05--url-handler-opens-claudeai-links-in-app) | Input | `xdg-open https://claude.ai/...` opens in-app |
+| [T07](./cases/tray-and-window-chrome.md#t07--in-app-topbar-renders--clickable) | Window | Hybrid topbar renders, every button clicks |
+| [T08](./cases/tray-and-window-chrome.md#t08--hide-to-tray-on-close) | Window | Close button hides to tray, doesn't quit |
+| [T11](./cases/extensibility.md#t11--plugin-install-anthropic--partners) | Extensibility | Anthropic & Partners plugin install completes |
+| [T15](./cases/code-tab-foundations.md#t15--sign-in-completes-via-browser-handoff) | Auth | Sign-in completes via `xdg-open` browser handoff |
+| [T16](./cases/code-tab-foundations.md#t16--code-tab-loads) | Code tab | Code tab loads (no 403, no blank screen) |
+| [T17](./cases/code-tab-foundations.md#t17--folder-picker-opens) | Code tab | Folder picker opens via portal/native chooser |
+
+## Test corpus snapshot
+
+| Bucket | Count |
+|--------|-------|
+| Cross-environment functional (`T###`) | 39 |
+| Environment-specific functional (`S###`) | 37 |
+| UI surfaces inventoried | 10 |
+| Total functional tests | 76 |
+
+For detailed status by ID, see [`matrix.md`](./matrix.md).
+
+## Automation status
+
+Automation is partially landed. The harness lives at
+[`tools/test-harness/`](../../tools/test-harness/) — twenty Playwright
+specs wired (T01, T03, T04, T17, S09, S12, S29-S37, plus four H-prefix
+self-tests), thirteen passing on KDE-W and six skipping cleanly per
+spec intent. See [`tools/test-harness/README.md`](../../tools/test-harness/README.md)
+for the live status table, [`automation.md`](./automation.md) for
+architectural decisions, and the SIGUSR1 / runtime-attach pattern that
+bypasses the app's CDP auth gate.
+
+### Grounding sweep + probe
+
+Separate from the test sweep:
+[`runbook.md` "Grounding sweep"](./runbook.md#grounding-sweep) covers
+the workflow for verifying case docs themselves against the live
+build on every upstream version bump — static anchor pass plus a
+runtime probe ([`tools/test-harness/grounding-probe.ts`](../../tools/test-harness/grounding-probe.ts))
+that captures IPC handler registry, accelerator state, autoUpdater
+gate, AX-tree fingerprint, and other claims static analysis can't
+disambiguate. Anchor and drift conventions live in
+[`cases/README.md`](./cases/README.md#anchor-scope).
+
+The structure remains automation-friendly for new tests:
+
+1. **Stable test IDs.** `T01`-`T39` and `S01`-`S37` won't move. New tests append. Sequential, not semantic.
+2. **Standardized test bodies.** Every functional test has `Severity`, `Steps`, `Expected`, `Diagnostics on failure`, and `References` sections. The Steps and Diagnostics fields are scripted-runner-shaped.
+3. **Per-element UI checklists.** Each UI surface file lists interactive elements in a table — every row is a candidate `webContents.executeJavaScript` / `xprop` / DBus assertion.
+4. **Severity-driven sweeps.** Tests with a `runner:` field execute via [`tools/test-harness/orchestrator/sweep.sh`](../../tools/test-harness/orchestrator/sweep.sh); JUnit XML lands in `results/results-${ROW}-${DATE}/junit.xml`. Tests without a `runner:` continue to run manually.
+
+For tests that don't have a runner yet, status updates land in [`matrix.md`](./matrix.md) by hand after each manual sweep. For tests that do, the automation invocation is the source of truth — see [`runbook.md`](./runbook.md#automated-runs).
+
+## Conventions
+
+- **One PR per sweep result, not per cell change.** Bundle a full row update into a single commit titled `test: KDE-W sweep $(date +%F)`. Reduces matrix-merge noise.
+- **Tested-version pin.** Every status update should mention the `claude-desktop` upstream version + the project version (`v1.3.x+claude...`) in the commit. Otherwise a `✓` from six months ago looks current.
+- **Diagnostics on failure are mandatory.** Don't file `✗` without the captures listed in the test's `Diagnostics on failure` block. The runbook covers how to capture each.
+- **Issue links go inline.** Status cells link directly to the relevant issue/PR.
+
+See [`runbook.md`](./runbook.md) for the full mechanics.
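Note on `docs/testing/README.md` above: the status legend and the "linked issue/PR numbers" convention amount to a small cell grammar, which a matrix-regeneration step would need to read. The sketch below is one possible parser, assuming a cell is a single legend glyph optionally followed by whitespace-separated `#NNN` references. `parseStatusCell`, `StatusCell`, and the status names are hypothetical and are not part of the harness.

```typescript
type Status = "pass" | "fail" | "mitigated" | "untested" | "n/a";

// Glyph-to-status mapping taken from the README's status legend.
const STATUS_BY_GLYPH: Record<string, Status> = {
  "✓": "pass",
  "✗": "fail",
  "🔧": "mitigated",
  "?": "untested",
  "-": "n/a",
};

interface StatusCell {
  status: Status;
  issues: number[]; // linked issue/PR numbers; empty for a bare glyph
}

// Parses cells such as "✓", "✗ #404" or "🔧 #406".
function parseStatusCell(cell: string): StatusCell {
  const [glyph, ...rest] = cell.trim().split(/\s+/);
  const status = STATUS_BY_GLYPH[glyph];
  if (status === undefined) {
    throw new Error(`unknown status glyph in cell: "${cell}"`);
  }
  const issues: number[] = [];
  for (const token of rest) {
    const match = /^#(\d+)$/.exec(token);
    if (match) issues.push(Number(match[1])); // tokens that are not #NNN are ignored
  }
  return { status, issues };
}

console.log(parseStatusCell("✗ #404"), parseStatusCell("🔧 #406"), parseStatusCell("✗"));
```

A bare `✗` parses to a failure with an empty `issues` list, which matches the README's reading of "verified but no tracking issue is filed yet".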
--- a/docs/testing/automation.md
+++ b/docs/testing/automation.md
@@ -0,0 +1,440 @@
+# Automation Plan
+
+*Last updated: 2026-04-30*
+
+> **Status:** Direction agreed; first vertical slice scaffolded at
+> [`tools/test-harness/`](../../tools/test-harness/) covering T01, T03, T04,
+> T17 on KDE-W. The [Decisions](#decisions) table captures the calls
+> already made; [Still open](#still-open) is the short list of things
+> genuinely undecided. This file will fold into [`README.md`](./README.md)
+> and [`runbook.md`](./runbook.md) once the harness has run a few real
+> sweeps.
+
+The [`README.md`](./README.md) automation roadmap is one paragraph. This file
+is the longer version — what shape the harness takes, which tools fit which
+tests, which anti-patterns to design against, and what to build first.
+
+## Why this exists
+
+The 67 tests in [`cases/`](./cases/) plus the 10 surfaces in [`ui/`](./ui/)
+already have stable IDs, standardized bodies, and per-element checklists. That
+structure is unusually friendly to automation — but only if the harness is
+shaped to match the corpus, rather than the other way around. Three things
+make that non-trivial:
+
+1. The tests aren't homogeneous. Some are pure-renderer (Code tab), some are
+   native-OS-level (tray, autostart, URL handler), some are visual/UX checks
+   that probably stay manual forever.
+2. The matrix is nine environments, four display servers, and two package
+   formats. Input injection on Wayland is genuinely different from X11, and
+   X11 is the project's default backend (Wayland-native is opt-in until
+   portal coverage matures across compositors).
+3. Many failures are environment-specific by construction (mutter XWayland
+   key-grab, BindShortcuts on Niri, Omarchy Ozone-Wayland env exports). A
+   single "run everything everywhere" harness will mis-skip those.
+
+## Decisions
+
+| # | Decision | Rationale |
+|---|----------|-----------|
+| 1 | **Single language: TypeScript.** Every runner is `.ts`; OS tools are shelled out via `child_process` and wrapped as TS helpers. Python only as a last-resort escape hatch for AT-SPI cases that resist portal mocking. | Playwright Electron is JS-native (post-Spectron); `dbus-next` covers DBus end-to-end; portal mocking removes the dogtail dependency for most native-dialog tests. Three-language overhead doesn't pay back. |
+| 2 | **Harness location: `tools/test-harness/`.** Sibling to `scripts/`. | Keeps `docs/testing/` documentation-only; matches the project's existing `tools/` / `scripts/` split. |
+| 3 | **VM images: Packer for imperative distros + Nix flake for `Hypr-N`.** | Packer builds golden snapshots that boot fast and rebuild as code; Nix flake handles NixOS natively without a second wrapper. Vagrant's per-boot provisioning model is the wrong tradeoff for hermetic per-test snapshots. |
+| 4 | **No CI infrastructure initially.** Harness is invokable from CI (orchestrator is a bash script with `ROW`, `ARTIFACT`, `OUTPUT_DIR` env vars), but sweeps run manually from the dev box for the first ~20 tests. CI wrapper comes after there's signal on which tests are stable enough to run unattended. | Avoids weeks of GHA / nested-KVM debugging for tests that aren't ready to be unattended. The bash orchestrator is the same code either way. |
+| 5 | **Selectors: semantic locators only (`getByRole`, `getByLabel`, `getByText`).** No CSS classes against minified renderer output. No proactive `data-testid` injection patch. Escalate per-test only when a specific test proves unstable: first ask upstream for a stable `data-testid`; only carry an `app-asar.sh` patch if upstream declines. | Building selector-injection infrastructure up front is a guess at where rot will happen. Modern React apps usually have enough ARIA roles and visible text for `getByRole`/`getByText` to be durable. Measure before patching. |
+| 6 | **X11-default verification is Smoke. Wayland-native characterization is Should.** Add a Smoke test asserting the launcher log shows X11/XWayland selected on each row (the project's release-gate behavior). Add per-row Should tests characterizing what happens if Electron's default Wayland selection is allowed — these are informational, not release-gating. | The project chose X11 default because portal `GlobalShortcuts` coverage is patchy. The new Wayland-default tests exist to map that landscape, not to gate releases on it. |
+| 7 | **Diagnostic retention: last 10 greens + all reds, on `main` only.** Captures `--doctor`, launcher log, screenshot every run. Reds retained indefinitely; greens rotate. | Cheap regression-bisect baseline; bounded storage; reds are the things you actually need to look at six weeks later. |
+| 8 | **JUnit XML lives as workflow-run artifacts.** Each sweep run uploads `results-${ROW}-${DATE}.tar.zst` containing JUnit + diagnostic bundle. Default 90-day retention, extend to 365 if needed. The matrix-regen step downloads the latest run's artifacts and updates `matrix.md` in a PR. | Zero new infrastructure; GH provides storage, lifecycle, auth. If cross-run analytics later require longer history, promote to a separate `claude-desktop-debian-test-history` repo *then* — not before there's signal on what to keep. |
+
+## The three layers
+
+Looking at the corpus, every test falls into one of three buckets, and each
+bucket maps to a different shape of TS code (not a different language):
+
+| Layer | What it covers | Implementation |
+|-------|----------------|----------------|
+| **L1 — Renderer** | Code tab, plugin install, settings, prompt area, slash menu, side chat, most of `ui/code-tab-panes.md`, `prompt-area.md`, `settings.md` | `playwright-electron` (`_electron.launch()`) directly |
+| **L2 — Native / OS** | Tray (DBus), window decorations, URL handler (`xdg-open`), autostart, `--doctor`, multi-instance, hide-to-tray, native file picker (T17) | TS + `dbus-next` for DBus; `child_process` shell-outs wrapped as TS helpers (`xprop`, `wlr-randr`, `swaymsg`, `niri msg`, `pgrep`, `ydotool`); `dbus-next`-driven portal mocking for native-dialog tests |
+| **L3 — Manual** | "Icon is crisp on HiDPI", drag-and-drop feel, T28 catch-up after suspend (real wall-clock), subjective UX checks | Human eyes; capture in [`runbook.md`](./runbook.md) sweep loop |
+
+The `runner:` field that [`README.md`](./README.md) hints at is the right unit.
+One TS file per test under `tools/test-harness/runners/`, free to mix L1 and
+L2 calls within a single test file. Tests without a `runner:` field stay
+manual indefinitely — that's a feature, not a TODO.
+
+## Architecture
+
+```
+host (orchestrator)              per-row VM (or Nobara host for KDE-W)
+─────────────────────            ──────────────────────────────────────
+tools/sweep.sh         ssh →     tools/test-harness/run.ts
+                                   ├── L1 runners  (playwright-electron)
+                                   ├── L2 runners  (dbus-next + shell-outs)
+                                   └── junit.xml + diagnostic bundle
+tools/render-matrix.sh ← scp     /tmp/results-${ROW}-${DATE}.tar.zst
+matrix.md (regenerated)
+```
+
+The orchestrator is dumb: copy artifact in, kick the harness, copy results
+out. Per-row variation lives in `tools/test-images/${ROW}/` (Packer recipe +
+cloud-init / autoinstall, or a Nix flake for `Hypr-N`). The harness inside
+each VM is the same checked-in TS code, branched on `XDG_CURRENT_DESKTOP` /
+`XDG_SESSION_TYPE` for env-specific helpers.
+
+Result format pivots on **JUnit XML** — well-trodden ground. Several actions
+already exist that turn JUnit into Markdown summaries
+([`junit-to-md`](https://github.com/davidahouse/junit-to-md), the
+[Test Summary Action](https://github.com/marketplace/actions/junit-test-dashboard)).
+The matrix-regen step is just "download artifact, merge per-row JUnit, render
+cells, commit a PR."
+
+### Why not drive Playwright over the wire?
+
+The obvious sketch is "orchestrator on the host opens a CDP / DevTools port
+on each VM and runs the whole suite from one place." It looks clean but has
+real costs:
+
+- CDP over network is fragile; port forwards are a constant footgun on
+  flaky links.
+- Doesn't help with L2 at all — DBus calls, `xprop`, `pgrep`, file-system
+  probes still have to run in-VM.
+- You'd end up maintaining two transports anyway, so the centralization
+  win evaporates.
+
+In-VM Playwright via `_electron.launch()` is the [official Electron
+recommendation](https://www.electronjs.org/docs/latest/tutorial/automated-testing)
+since Spectron was archived in Feb 2022. No remote debug port needed; it
+spawns Electron directly and gives you a context.
+
+## Toolchain choices per layer
+
+### L1 — `playwright-electron`
+
+- Spawn via `_electron.launch({ args: ['main.js'] })` — no `--remote-debugging-port`.
+- Gate `nodeIntegration: true` and `contextIsolation: false` behind
+  `process.env.CI === '1'` so tests get full main-process access without
+  weakening production security. (Electron docs explicitly recommend this
+  pattern.)
+- **Locator policy: semantic only.** `getByRole`, `getByLabel`,
+  `getByText`, `getByPlaceholder`. No CSS selectors against minified class
+  names — they rot every upstream release. No `data-testid` infrastructure
+  built up front; if a specific test proves unstable, first ask upstream
+  for a stable `data-testid`, only carry an `app-asar.sh` patch as a last
+  resort.
+- Use Playwright auto-wait. No fixed `sleep`s anywhere in the harness.
+
+### L2 — `dbus-next` + wrapped shell-outs
+
+The unifying observation: most of L2 is either DBus (which `dbus-next`
+handles natively from TS) or short subprocess invocations of OS tools
+(which `child_process.exec()` handles, wrapped as a typed TS helper). No
+parallel bash test scripts; the test code reads as TS.
+
+- **DBus everywhere it applies.**
+  [`dbus-next`](https://github.com/dbusjs/node-dbus-next) is actively
+  maintained, has TypeScript typings, and is designed for Linux desktop
+  integration. Replaces `gdbus call ...` invocations:
+  - Tray / SNI state queries (`org.kde.StatusNotifierWatcher`,
+    `org.freedesktop.DBus`).
+  - Portal availability checks (`org.freedesktop.portal.Desktop`).
+  - Suspend inhibitor inspection (`org.freedesktop.login1`).
+  - AT-SPI introspection where actually needed
+    (`org.a11y.atspi.*`).
+- **Compositor / window-manager state via shell-out helpers.** No good
+  Node bindings exist for `xprop`, `wlr-randr`, `swaymsg`, `niri msg` —
+  but invoking them from `child_process.exec()` inside a TS helper is
+  perfectly fine, and the test code stays unified:
+  ```ts
+  // tools/test-harness/lib/wm.ts
+  export async function listToplevels(): Promise<Toplevel[]> { ... }
+  ```
+  Each helper is a thin typed wrapper; the test reads as TS, not
+  bash-with-extra-steps.
+- **Native dialogs (T17 folder picker, etc.) via portal mocking.** The
+  `org.freedesktop.portal.FileChooser` interface is just DBus. For tests
+  that exercise the *integration* (does Claude make the right portal call
+  and handle the result?) — which is what T17 actually tests — register
+  a mock backend over `dbus-next`, intercept the call, return a canned
+  path. No real dialog ever renders. This is both faster and a more
+  honest unit of test than driving a real chooser.
+- **AT-SPI escape hatch.** For the rare test where portal mocking isn't
+  enough (driving an *actual* GTK/Qt dialog tree), the fallback is a
+  small Python [`dogtail`](https://pypi.org/project/dogtail/) script
+  invoked via `child_process.exec()` — same shape as the other shell-out
+  helpers, just Python on the other end. Today, T17 is the only test
+  that might need this; portal mocking probably covers it. We adopt
+  Python only when a specific test forces it, not speculatively.
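+
+A minimal sketch of the shell-out helper shape described above, assuming
+an X11 session where `xprop -root _NET_CLIENT_LIST` is available. The
+names `parseClientList` and `listX11WindowIds` are hypothetical, not the
+harness's actual API, and the sketch uses `execFile` rather than `exec`
+so that no shell quoting is involved:
+
+```ts
+import { execFile } from 'node:child_process';
+import { promisify } from 'node:util';
+
+const execFileAsync = promisify(execFile);
+
+// Parse output shaped like:
+//   _NET_CLIENT_LIST(WINDOW): window id # 0x1400003, 0x3a00007
+// Returns the window ids, or an empty list if the property is absent.
+export function parseClientList(stdout: string): string[] {
+  const hashIndex = stdout.indexOf('#');
+  if (hashIndex === -1) return [];
+  return stdout
+    .slice(hashIndex + 1)
+    .split(',')
+    .map((id) => id.trim())
+    .filter((id) => /^0x[0-9a-f]+$/i.test(id));
+}
+
+// Hypothetical helper: one subprocess call in, typed data out.
+export async function listX11WindowIds(): Promise<string[]> {
+  const { stdout } = await execFileAsync('xprop', ['-root', '_NET_CLIENT_LIST']);
+  return parseClientList(stdout);
+}
+```
+
+Keeping the parser separate from the subprocess call means the parsing
+logic can be unit-tested without a display server.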
+
+### Input injection — `ydotool` now, `libei` next
+
+- [`ydotool`](https://github.com/ReimuNotMoe/ydotool) goes through
+  `/dev/uinput`, so it works on both X11 and Wayland. Needs root or a
+  `uinput` group; not a problem inside a test VM. Invoked via the same
+  `child_process` shell-out pattern — `tools/test-harness/lib/input.ts`.
+- `ydotool` **cannot** trigger portal-grabbed shortcuts (T06, S11, S14).
+  That's a kernel-vs-compositor boundary issue, not a tool gap. Those
+  tests stay manual until libei is widely available.
+- The future-correct path is
+  [`libei`](https://www.phoronix.com/news/LIBEI-Emulated-Input-Wayland) +
+  the `RemoteDesktop` portal via `libportal`. KDE, GNOME, and wlroots
+  are all moving there. Worth a roadmap note that the shortcut tests
+  have a path to automation — just not today.
+
+### VM lifecycle
+
+- One image-build recipe per row in `tools/test-images/${ROW}/`. Packer
+  for the imperative distros (Fedora 43, Ubuntu 24.04, OmarchyOS, and
+  manual-install rows like i3 / Niri); Nix flake for `Hypr-N`.
+- Rebuild nightly or per release-tag sweep — don't `apt update` /
+  `dnf update` inside a test run; mirrors hiccup, tests go red for the
+  wrong reason.
+- Each test gets a hermetic `XDG_CONFIG_HOME` / `CLAUDE_CONFIG_DIR`
+  (S19 is already the test-isolation primitive). No shared state
+  between tests.
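+
+The per-test isolation above could look roughly like the following. This
+is a sketch under assumptions: `createIsolatedEnv` is a hypothetical
+name, and placing `CLAUDE_CONFIG_DIR` at `<root>/Claude` is an
+illustrative choice, not something the harness or S19 specifies.
+
+```ts
+import { mkdtempSync, rmSync } from 'node:fs';
+import { tmpdir } from 'node:os';
+import { join } from 'node:path';
+
+export interface IsolatedEnv {
+  env: NodeJS.ProcessEnv;
+  cleanup: () => void;
+}
+
+// Create a fresh temp directory per call and point both config
+// variables into it, so no two tests share on-disk state.
+export function createIsolatedEnv(testId: string): IsolatedEnv {
+  const root = mkdtempSync(join(tmpdir(), `harness-${testId}-`));
+  return {
+    env: {
+      ...process.env,
+      XDG_CONFIG_HOME: root,
+      // Assumed layout; adjust to wherever the app really looks.
+      CLAUDE_CONFIG_DIR: join(root, 'Claude'),
+    },
+    cleanup: () => rmSync(root, { recursive: true, force: true }),
+  };
+}
+```
+
+The returned `env` would be passed to whatever spawns the app, and
+`cleanup` called from the test's teardown hook.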
+
+## The CDP auth gate (and the runtime-attach workaround that beats it)
+
+*Discovered during the first KDE-W run-through; resolved by routing
+through the in-app debugger menu's code path.*
+
+The shipped `index.pre.js` contains an authenticated-CDP gate:
+
+```js
+uF(process.argv) && !qL() && process.exit(1);
+```
+
+`uF(argv)` matches **`--remote-debugging-port`** or
+**`--remote-debugging-pipe`** on argv. `qL()` validates an ed25519-signed
+token in `CLAUDE_CDP_AUTH` (signed payload
+`${timestamp_ms}.${base64(userDataDir)}`, 5-minute TTL) against a hardcoded
+public key. If the gate flag is on argv and a valid token isn't in env,
+the app exits with code 1 right after `frame-fix-wrapper` completes. Both
+Playwright's `_electron.launch()` and `chromium.connectOverCDP()` inject
+`--remote-debugging-port=0` and trigger the gate. The signing key is held
+upstream; we can't forge tokens.
+
+**Crucially, the gate doesn't check `--inspect` or runtime SIGUSR1.** Those
+trigger the **Node inspector**, not the Chrome remote-debugging port —
+different surface. Notably, the in-app `Developer → Enable Main Process
+Debugger` menu item *also* opens the Node inspector at runtime; that
+menu's existence is the hint that this path is tolerated by upstream.
+
+The harness uses this:
+
+1. Spawn Electron with no debug-port flags. Gate stays asleep.
+2. Wait for the X11 window to appear (signal that the app is up).
+3. Send `SIGUSR1` to the main process pid. Same code path as the menu —
+   `inspector.open()` runs at runtime and the Node inspector starts on
+   port 9229.
+4. Fetch `http://127.0.0.1:9229/json/list` and connect a WebSocket to the
+   first entry's `webSocketDebuggerUrl`.
+5. Use `Runtime.evaluate` to run JS in the main process. From there:
+   - `webContents.getAllWebContents()` lists all live web contents
+     (including `https://claude.ai/...` once it loads into the
+     BrowserView).
+   - `webContents.executeJavaScript(...)` drives renderer-side DOM /
+     state queries.
+   - Main-process mocks (e.g. `dialog.showOpenDialog = ...` for T17) are
+     installed by direct assignment.
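+
+Steps 3 and 4 above can be sketched as follows. This is not the code in
+`inspector.ts`; `pickDebuggerUrl` and `openInspector` are hypothetical
+names, and the sketch assumes a Node version with a global `fetch` and
+that the inspector lands on the default port 9229:
+
+```ts
+interface InspectorTarget {
+  id: string;
+  webSocketDebuggerUrl?: string;
+}
+
+// Pick the first target that advertises a debugger WebSocket URL.
+export function pickDebuggerUrl(targets: InspectorTarget[]): string {
+  const target = targets.find((t) => typeof t.webSocketDebuggerUrl === 'string');
+  if (!target?.webSocketDebuggerUrl) {
+    throw new Error('no inspector target exposes webSocketDebuggerUrl');
+  }
+  return target.webSocketDebuggerUrl;
+}
+
+// Signal the main process, then poll /json/list until the inspector
+// answers. Polls a condition rather than sleeping a fixed duration.
+export async function openInspector(
+  pid: number,
+  port = 9229,
+  attempts = 50,
+  intervalMs = 100,
+): Promise<string> {
+  process.kill(pid, 'SIGUSR1');
+  for (let i = 0; i < attempts; i++) {
+    try {
+      const res = await fetch(`http://127.0.0.1:${port}/json/list`);
+      return pickDebuggerUrl((await res.json()) as InspectorTarget[]);
+    } catch {
+      // Inspector not listening yet (or no target listed); retry.
+      await new Promise((resolve) => setTimeout(resolve, intervalMs));
+    }
+  }
+  throw new Error(`inspector did not come up on port ${port}`);
+}
+```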
+
+[`tools/test-harness/src/lib/inspector.ts`](../../tools/test-harness/src/lib/inspector.ts)
+wraps this; [`tools/test-harness/src/lib/electron.ts`](../../tools/test-harness/src/lib/electron.ts)
+exposes `app.attachInspector()` on the launched-app handle.
+
+**Two implementation gotchas worth recording:**
+
+- **`BrowserWindow.getAllWindows()` returns an empty array** because frame-fix-wrapper
+  substitutes the `BrowserWindow` class and the substitution breaks the
+  static registry. Use `webContents.getAllWebContents()` instead — that
+  registry stays intact and includes both the shell window and the
+  embedded claude.ai BrowserView.
+- **`Runtime.evaluate` with `awaitPromise: true` + `returnByValue: true`
+  returns empty objects** for awaited Promise resolutions on this build's
+  V8. Workaround: have the IIFE return a `JSON.stringify(value)` and
+  `JSON.parse` on the caller side. `inspector.evalInMain<T>()` does this
+  internally so callers don't think about it.
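+
+The JSON round-trip workaround in the second gotcha might be sketched
+like this. The helper names are hypothetical and this is not the body of
+`inspector.evalInMain<T>()`; only the `expression`, `awaitPromise`, and
+`returnByValue` parameter names come from the CDP `Runtime.evaluate`
+method:
+
+```ts
+// Wrap an expression so the awaited value is serialized inside the
+// target process and comes back as a plain string.
+export function wrapForEval(expression: string): string {
+  return `(async () => JSON.stringify(await (${expression})))()`;
+}
+
+// Params for a Runtime.evaluate call carrying the wrapped expression.
+export function buildEvaluateParams(expression: string) {
+  return {
+    expression: wrapForEval(expression),
+    awaitPromise: true,
+    returnByValue: true,
+  };
+}
+
+// Decode the string on the caller side. A non-string result (for
+// example when the expression resolved to undefined) is an error.
+export function decodeEvalResult<T>(value: unknown): T {
+  if (typeof value !== 'string') {
+    throw new Error('expected a JSON string from Runtime.evaluate');
+  }
+  return JSON.parse(value) as T;
+}
+```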
+
+**Status of the harness today:**
+
+- **L2** — fully working (DBus, xprop). T03 / T04 pass.
+- **L1 — T01** — passes via X11 window probe (no inspector needed).
+- **L1 — T17 / similar** — framework works end-to-end (verified inspector
+  attach + dialog mock + webContents detection + Code-tab navigation
+  click). Selector tuning to match claude.ai's actual Code-tab UI is
+  ordinary iterate-as-needed work, not a blocker.
+- **No `app-asar.sh` patch needed** to neutralize the gate. The
+  `dogtail`/AT-SPI escape hatch (Decision 1) is also no longer the
+  fallback for L1 — it's only relevant for native dialogs that the
+  inspector pattern can't reach.
+
+## Notable shifts since the existing roadmap was written
+
+These three shifts changed the landscape, and the existing
+[`README.md`](./README.md) Automation roadmap section predates them:
+
+1. **Electron 38+ defaults to native Wayland.** [Electron 38 release
+   notes](https://www.electronjs.org/blog/electron-38-0) and the
+   [Wayland tech talk](https://www.electronjs.org/blog/tech-talk-wayland)
+   document this. Electron now has a Wayland CI job upstream. The project
+   keeps X11 as the default backend (Decision 6) because portal coverage
+   for `GlobalShortcuts` is uneven across compositors — the new tests
+   characterize what works where, not what to ship by default.
+2. **Spectron is dead.** Archived Feb 2022; Playwright is the
+   [official recommendation](https://www.electronjs.org/blog/spectron-deprecation-notice).
+   No discussion needed about which framework — that's settled.
+3. **`libei` is real and shipping.** KWin, mutter, and wlroots have all
+   moved. The shortcut-test gap (T06 / S11 / S14) is automatable in the
+   medium term, not "manual forever."
+
+## Anti-patterns to design against
+
+Pulled from the [Playwright flaky-test
+checklist](https://testdino.com/blog/playwright-automation-checklist/),
+the [Codepipes anti-patterns
+catalogue](https://blog.codepipes.com/testing/software-testing-antipatterns.html),
+and the [TestDevLab top 5
+list](https://www.testdevlab.com/blog/5-test-automation-anti-patterns-and-how-to-avoid-them).
+Designing the harness with these in mind from day one is much cheaper than
+backing them out later:
+
+| Anti-pattern | What it looks like | How to avoid in this project |
+|---|---|---|
+| Silent retry | Test passes on attempt 2; dashboard shows green; flake hidden | Log retry count to JUnit; `matrix.md` shows `✓*` for retried-pass; treat retried-pass as a Should-fix bug |
+| Async-wait by `sleep` | `sleep 5` instead of `waitFor`; ICSE 2021 found ~45% of UI flakes here | No fixed sleeps in `tools/test-harness/`. Always poll a condition (window exists, log line, DBus name owned). Lint for `\bsleep\b` and `setTimeout` with literal numbers in test code |
+| Mixing orchestration with verification | One test installs the package, launches, checks tray, asserts URL handler — five failure modes, one red cell | One test, one assertion class. Setup goes in shared fixtures, not test bodies |
+| End-to-end as the only layer | All regressions caught at full-stack UI level | Keep `scripts/patches/*.sh` independently testable; add unit-level tests on patcher logic separately from the full-app sweep |
+| Implementation-coupled selectors | `div.css-7xz92q` deep selectors against minified renderer classes | Decision 5: semantic locators only. If a selector proves unstable, first ask upstream for a stable `data-testid`; only carry an `app-asar.sh` patch as a last resort, per-test |
+| Timing-sensitive assertions | "Within 500ms after click, X appears" | Time bounds are upper-bound sanity only. Use Playwright's auto-wait with a generous `timeout`; don't fight the framework |
+| Hidden global state across tests | Test 4 fails because test 2 left `~/.config/Claude/SingletonLock` behind | Hermetic per-test `XDG_CONFIG_HOME` / `CLAUDE_CONFIG_DIR` (S19). Treat shared state as an isolation bug, not a known quirk |
+| Long-lived VM state drift | Six-month-old snapshot has stale package mirrors; tests fail with 404s | Image rebuild as code (Packer / Nix flake); rebuild nightly or per release-tag. Never `apt update` mid-test |
+| Treating skip as fail | wlroots-only test fails on KDE because it can't be skipped properly | `?` and `-` are first-class in [`matrix.md`](./matrix.md). Map JUnit `<skipped>` → `-`, `<error>` (harness broke) → `?`, only `<failure>` → `✗` |
+| Diagnostics only on failure | Test goes red; capture fires; previous green run had no baseline to diff against | Decision 7: capture `--doctor`, launcher log, screenshot **on every run**. Last 10 greens + all reds on `main` |
+| Network coupling | "Tray icon present" fails because Cloudflare hiccupped during sign-in | Tests that don't *need* network shouldn't touch it. Sign-in is one fixture; tray test runs on a pre-signed-in profile snapshot |
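+
+The cell-mapping rules from the "Silent retry" and "Treating skip as
+fail" rows can be written down as one small function. This is a sketch:
+the `CaseResult` shape and the `toMatrixCell` name are assumptions, and
+a real regen step would first have to extract these fields from the
+JUnit XML.
+
+```ts
+type JUnitOutcome = 'passed' | 'skipped' | 'error' | 'failure';
+
+interface CaseResult {
+  outcome: JUnitOutcome;
+  retries: number; // attempts beyond the first
+}
+
+// Map one test-case result to its matrix.md cell.
+export function toMatrixCell({ outcome, retries }: CaseResult): string {
+  switch (outcome) {
+    case 'skipped':
+      return '-'; // not applicable on this row
+    case 'error':
+      return '?'; // harness broke, result unknown
+    case 'failure':
+      return '✗';
+    case 'passed':
+      return retries > 0 ? '✓*' : '✓'; // retried-pass stays visible
+  }
+}
+```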
+
+## What stays manual (for now)
+
+These have no automation path that's worth the cost today. It's more
+honest to say so in the roadmap than to pretend they'll be automated
+"soon":
+
+- **T06 / S11 / S14** — global shortcut tests behind portal grabs. Path
+  exists (libei + RemoteDesktop portal) but compositor-side support is
+  patchy. Revisit when libei adoption broadens.
+- **T15** — sign-in browser handoff. Needs a fixture account and an
+  upstream auth flow that won't necessarily welcome scripted login.
+- **T28** — scheduled task catch-up after suspend. Real wall-clock event;
+  not worth simulating.
+- **Anything in `ui/` tagged "looks right"** — HiDPI sharpness, theme
+  rendering, drag-feel. AT-SPI sees the tree, not the pixels.
+
+T17 (folder picker) was previously in this list. Portal mocking via
+`dbus-next` moves it into L2. If real-dialog testing turns out to be
+necessary anyway, the dogtail escape hatch covers it.
+
+The matrix already supports leaving these manual via the `?` / `-` /
+existing-cell semantics — no schema change needed.
+
+## Suggested first vertical slice
+
+The smallest end-to-end that proves every architectural decision:
+
+- **One row:** KDE-W (daily-driver host, no VM startup tax).
+- **One test:** T01 — App launch.
+- **Full pipeline:** orchestrator glue → harness entry → Playwright
+  `_electron.launch()` → JUnit XML → matrix-regen step → cell flips
+  from `?` to `✓` automatically.
+
+That single slice forces every decision out into the open: harness
+language (TS), JUnit emission, results-bundle layout, matrix-regen
+rules, diagnostic-capture format. Resist building the orchestrator
+before there's a passing test it can orchestrate. Once the slice is
+real, adding tests 2–10 is mostly mechanical.
+
+After T01: the next sensible additions are T03 (tray — exercises
+`dbus-next` end-to-end), T04 (window decorations — exercises the
+shell-out helper pattern), and T17 (folder picker — exercises portal
+mocking). Those four runners cover every distinct shape of TS code in
+the harness; everything else after them is a recombination.
+
+## Still open
+
+Most of the framing decisions are settled in the [Decisions](#decisions)
+table. What remains:
+
+1. **Owner assignments per row.** [`MEMORY.md`](https://github.com/aaddrick/claude-desktop-debian/blob/main/.claude/projects/-home-aaddrick-source-claude-desktop-debian/memory/MEMORY.md)
+   notes cowork → @RayCharlizard, nix → @typedrat. Hypr-N row is the
+   natural fit for @typedrat once the Nix flake exists. The other eight
+   rows: aaddrick by default, but worth asking the contributor base in a
+   discussion thread.
+2. **AT-SPI escape-hatch trigger.** Decision 1 punts on Python until a
+   specific test forces it. T17 is the only candidate today, and portal
+   mocking probably covers it. If T17 actually needs real-dialog
+   automation, that's the first reopen.
+3. **Selector rot rate.** Decision 5 starts with semantic locators and
+   measures. After ~20 tests on the renderer, revisit whether
+   `getByRole`/`getByText` is holding up or whether per-test
+   `data-testid` patches are warranted. No prediction; this is a
+   measure-and-decide.
+4. **CI execution model.** Decision 4 punts on this entirely until the
+   harness has signal on which tests are stable. Reopen after the first
+   ~20 tests have run from the dev box for a few weeks.
+5. **Smoke-set Wayland-default test wording.** Decision 6 calls for a
+   Smoke test asserting X11/XWayland selection on each row, plus
+   per-row Should tests for Wayland characterization. The exact T-IDs
+   and case-file homes for those tests need to be drafted next time
+   `cases/` is touched.
+
+## Sources
+
+Background reading the recommendations draw on. Linked here so the
+calls have receipts:
+
+### Electron testing & Playwright
+- [Electron — Automated Testing](https://www.electronjs.org/docs/latest/tutorial/automated-testing) — official tutorial, recommends Playwright
+- [Electron — Spectron Deprecation Notice](https://www.electronjs.org/blog/spectron-deprecation-notice) — Feb 2022 archive
+- [Playwright — Electron class](https://playwright.dev/docs/api/class-electron)
+- [Playwright — ElectronApplication class](https://playwright.dev/docs/api/class-electronapplication)
+- [Testing Electron apps with Playwright and GitHub Actions (Simon Willison)](https://til.simonwillison.net/electron/testing-electron-playwright)
+- [`spaceagetv/electron-playwright-example`](https://github.com/spaceagetv/electron-playwright-example) — multi-window Playwright + Electron example
+
+### DBus / TypeScript
+- [`dbus-next` — actively-maintained Node DBus library with TS typings](https://github.com/dbusjs/node-dbus-next)
+- [`dbus-next` on npm](https://www.npmjs.com/package/dbus-next)
+
+### Wayland / X11 / input injection
+- [Electron — Tech Talk: How Electron went Wayland-native](https://www.electronjs.org/blog/tech-talk-wayland)
+- [Electron 38.0.0 release notes](https://www.electronjs.org/blog/electron-38-0)
+- [PR #33355: fix calling X11 functions under Wayland](https://github.com/electron/electron/pull/33355)
+- [LIBEI — Phoronix overview](https://www.phoronix.com/news/LIBEI-Emulated-Input-Wayland)
+- [libei + RemoteDesktop portal — RustDesk discussion](https://github.com/rustdesk/rustdesk/discussions/4515)
+- [`ydotool` README](https://github.com/ReimuNotMoe/ydotool)
+- [`kwin-mcp` — KDE Plasma 6 Wayland automation tools](https://github.com/isac322/kwin-mcp)
+
+### Portals / AT-SPI
+- [XDG Desktop Portal — main repo](https://github.com/flatpak/xdg-desktop-portal)
+- [`org.freedesktop.portal.FileChooser` interface XML](https://github.com/flatpak/xdg-desktop-portal/blob/main/data/org.freedesktop.portal.FileChooser.xml)
+- [File Chooser portal documentation](https://flatpak.github.io/xdg-desktop-portal/docs/doc-org.freedesktop.portal.FileChooser.html)
+- [`dogtail` on PyPI](https://pypi.org/project/dogtail/) — fallback only
+- [Automation through Accessibility — Fedora Magazine](https://fedoramagazine.org/automation-through-accessibility/)
+
+### Anti-patterns / flaky tests
+- [Playwright automation checklist to reduce flaky tests (TestDino)](https://testdino.com/blog/playwright-automation-checklist/)
+- [Flaky Tests: The Complete Guide to Detection & Prevention (TestDino)](https://testdino.com/blog/flaky-tests/)
+- [5 Test Automation Anti-Patterns (TestDevLab)](https://www.testdevlab.com/blog/5-test-automation-anti-patterns-and-how-to-avoid-them)
+- [Software Testing Anti-patterns (Codepipes)](https://blog.codepipes.com/testing/software-testing-antipatterns.html)
+
+### JUnit XML reporting
+- [`junit-to-md`](https://github.com/davidahouse/junit-to-md)
+- [Test Summary GitHub Action](https://github.com/marketplace/actions/junit-test-dashboard)
+- [Test Reporter](https://github.com/marketplace/actions/test-reporter)
+
+### CI / VM matrix
+- [Transient — QEMU CI wrapper](https://www.starlab.io/blog/simple-painless-application-testing-on-virtualized-hardwarenbsp)
+- [`cirruslabs/tart` — VMs for CI automation](https://github.com/cirruslabs/tart)
+
+---
+
+*Once the first vertical slice (KDE-W + T01) ships, the relevant pieces of
+this file fold into [`README.md`](./README.md) (Automation roadmap) and
+[`runbook.md`](./runbook.md) (the harness invocation). Until then: working
+notes that have crossed from brainstorm to plan.*
--- a/docs/testing/cases-grounding-prompt.md
+++ b/docs/testing/cases-grounding-prompt.md
@@ -0,0 +1,347 @@
+# docs/testing/cases grounding sweep — implementation prompt
+
+This file is meant to be **copied verbatim into a fresh Claude Code
+session** as the initial user message. Don't paraphrase it; the
+orchestration depends on the exact directives below.
+
+---
+
+## Prompt to paste
+
+You're picking up after the v7 walker, U01 wire-up, and the
+`claudeai.ts` AX-tree migration all landed. The page-objects are
+stable against the live renderer (T17_folder_picker passes on
+KDE-W). The next workstream is **grounding the case docs in
+`docs/testing/cases/` against actual upstream behavior**.
+
+The cases were written from outside-in — observed user-visible
+flows, expected outcomes, diagnostic captures. Many describe
+behavior the test author *believed* exists in upstream Claude
+Desktop, but no one has cross-checked each Step / Expected against
+the actual extracted source. Your job is to spawn one subagent per
+case file, have each one read the case + grep the build-reference
+extract for the relevant feature, and report what's accurate, what's
+stale, and what's missing — then make in-place adjustments to the
+case files so each one is grounded in concrete code anchors before
+the next sweep cycle.
+
+### Authoritative reference
+
+Read these in order. They're the substrate the subagents will pull
+from.
+
+- `docs/testing/cases/README.md` — the case-doc structure (severity,
+  surface, applies-to, steps, expected, diagnostics, references).
+  The "Standard test body" template at the bottom is the contract
+  every case currently follows.
+- `docs/testing/matrix.md` — live Pass/Fail/Pending matrix per row.
+  Tells you which cases have a runner and which are still
+  human-execution-only.
+- `build-reference/app-extracted/.vite/build/` — the extracted +
+  beautified Claude Desktop source. ~14 files; `index.js` is the
+  main process (~546k lines after beautification), `mainView.js` /
+  `mainWindow.js` / `quickWindow.js` are renderer preloads,
+  `coworkArtifact.js` is the cowork BrowserView preload,
+  `buddy.js` is the supervisor, etc. **This is the ground truth.**
+- `tools/test-harness/src/runners/` — existing runners that *do*
+  have working selectors / event hooks. Sometimes the runner has
+  more accurate code anchors than the case doc.
+- `CLAUDE.md` (project root) — project conventions, attribution
+  format, commit style. Don't violate.
+
+### Case files in scope
+
+Eleven files plus the README. One subagent per file:
+
+| File | Tests covered |
+|---|---|
+| `code-tab-foundations.md` | T15-T20 |
+| `code-tab-handoff.md` | T23-T25, T34, T38, T39 |
+| `code-tab-workflow.md` | T21-T22, T29-T32 |
+| `distribution.md` | S01-S05, S15, S16, S26 |
+| `extensibility.md` | T11, T33, T35-T37, S27, S28 |
+| `launch.md` | T01, T02, T13, T14 |
+| `platform-integration.md` | T09, T10, T12, S17, S18, S22-S25 |
+| `routines.md` | T26-T28, S19-S21 |
+| `shortcuts-and-input.md` | T05, T06, S06-S14, S29-S37 |
+| `tray-and-window-chrome.md` | T03, T04, T07, T08, S08, S13 |
+
+### Why this iteration
+
+Several cases have been silently bit-rotting against upstream
+changes — a Step says "click the X menu" but X was renamed two
+upstream versions ago, or an Expected references a behavior the
+team shipped behind a feature flag that's now off by default. When
+the sweep runs against a row that's stale, the failure looks like a
+Linux compatibility issue but is actually a doc-vs-upstream drift.
+Grounding the cases against the actual extracted source closes
+that gap and makes future sweeps interpretable.
+
+This isn't a one-time correctness pass — it's a cycle. After every
+upstream version bump (`CLAUDE_DESKTOP_VERSION` rolls in
+`scripts/setup/detect-host.sh`), the grounding can drift again.
+Optimise for **leaving concrete code-anchor breadcrumbs** in each
+case so the next grounding pass is fast.
+
+### Repo conventions
+
+- Tabs for indentation in code; markdown is space-indented as the
+  existing files do it.
+- Markdown lines wrap at ~80 chars unless they're tables or links
+  that don't break naturally.
+- Don't commit. The user reviews and commits.
+- Don't run the host Claude Desktop. The user runs it. Read from
+  `build-reference/` instead — that's already extracted +
+  beautified specifically so you don't have to attach to a live
+  app to verify behavior.
+
+### Code anchors
+
+- `build-reference/app-extracted/.vite/build/index.js` — main
+  process. Every IPC channel registration, window-management
+  decision, app-lifecycle hook, tray-menu construction, autostart
+  toggle, dialog invocation, and protocol handler lives here.
+- `build-reference/app-extracted/.vite/build/quickWindow.js` —
+  Quick Entry preload + window setup.
+- `build-reference/app-extracted/.vite/build/mainWindow.js` —
+  main shell BrowserWindow preload (claude.ai is loaded into a
+  child BrowserView; this preload runs in the shell frame).
+- `build-reference/app-extracted/.vite/build/mainView.js` —
+  preload running inside the claude.ai BrowserView itself.
+- `build-reference/app-extracted/.vite/build/coworkArtifact.js` —
+  preload running inside cowork's iframe-shaped artifact view.
+- `build-reference/app-extracted/.vite/build/buddy.js` — supervisor
+  process (the daemon that respawns the cowork worker; see
+  `docs/learnings/cowork-vm-daemon.md`).
+- `build-reference/app-extracted/package.json` — declared main /
+  preloads, electron version, native deps. Quick reference for
+  whether a feature is wired up at all.
+
+### Phases
+
+#### Phase 0 — calibration
+
+1. `cd tools/test-harness && npm run typecheck` — should pass; if
+   not, stop and report.
+2. Read `docs/testing/cases/README.md` end-to-end and one full case
+   file (`launch.md` is a good pick: small, four tests, narrow
+   surface area). Confirm you understand the case-doc contract
+   before fanning out.
+3. Pick T01 (App launch) as a calibration case. Manually grep
+   `build-reference/app-extracted/.vite/build/index.js` for the
+   launcher-log / backend-selection logic referenced in T01's
+   Expected. Confirm you can read the beautified source and locate
+   the relevant code. Report the anchor (`index.js:N-M`) so the
+   user knows the workflow is sound before you fan out.
+
+If Phase 0 surfaces a problem (build-reference stale relative to
+the case doc, calibration anchor not findable, README structure
+unclear), stop and report. Don't fan out subagents against an
+unverified workflow.
+
+#### Phase 1 — fan-out
+
+Spawn one subagent per case file (eleven total). Use
+`subagent_type: 'general-purpose'`. Send them in **parallel** —
+they're independent. Keep the prompt to each subagent
+self-contained; the subagent has no context from this conversation.
+
+Per-subagent prompt template (fill in the case file path):
+
+```
+You're grounding ONE test-case file in
+docs/testing/cases/<FILE>.md against the extracted Claude Desktop
+source at build-reference/app-extracted/.vite/build/.
+
+Read these first:
+- docs/testing/cases/README.md (case-doc contract)
+- docs/testing/cases/<FILE>.md (your case file)
+- CLAUDE.md (project conventions)
+
+For each test in the file:
+
+1. Read the test's Steps + Expected.
+2. Identify the load-bearing claim — the upstream behavior the
+   test depends on (an IPC channel, a tray-menu item, a
+   dialog.showOpenDialog call, a globalShortcut.register, a
+   nativeTheme listener, etc.).
+3. Grep build-reference/app-extracted/.vite/build/ for that claim.
+   Use ripgrep / grep -E. The code is beautified, but variable names
+   are still minified, so anchor on string literals, IPC channel
+   names, menu labels, and event names, not variable identifiers.
+4. Classify the result:
+   - **Grounded** — claim verified, anchor found. Append a
+     `**Code anchors:** <file>:<line>` line to the test body
+     directly under the existing References field.
+   - **Drifted** — feature exists but the case's Steps or Expected
+     don't match what's actually shipping. Edit the case to
+     match upstream behavior. Note what changed.
+   - **Missing** — feature isn't in the build at all (deprecated,
+     never shipped, behind unset flag). Mark the test with a
+     prepended block:
+     `> **⚠ Missing in build 1.5354.0** — <one-line note>. Re-verify after next upstream bump.`
+   - **Ambiguous** — claim could be one of several upstream code
+     paths and you can't disambiguate from the case alone. Don't
+     edit; report under "Open questions".
+
+Per-test, prefer concrete code anchors over wordy explanations.
+The next person reading this case should see exactly where
+upstream implements the feature.
+
+Constraints:
+- Don't fabricate anchors. If you can't find it, mark Missing or
+  Ambiguous — never invent a `index.js:12345` reference.
+- Don't restructure the case files. Keep the existing template
+  (Severity / Surface / Applies to / Issues / Steps / Expected /
+  Diagnostics / References). Only add code anchors and edit
+  Steps/Expected for drift.
+- Don't expand scope. If you notice an unrelated bug or missing
+  test, note it under "Open questions" — don't fix it inline.
+- Don't run the host Claude Desktop. Read from build-reference/
+  only.
+
+Report shape (~300-500 words):
+
+## <FILE>.md grounding
+
+- Tests reviewed: N
+- Grounded: N
+- Drifted (edited): N (one-line per: <test-id> — <what changed>)
+- Missing (marked): N (one-line per: <test-id> — <what's gone>)
+- Ambiguous (flagged): N (one-line per: <test-id> — <why>)
+
+### Code anchor highlights
+- <test-id>: <file>:<line> — <what the anchor proves>
+
+### Open questions
+- ...
+
+### Files touched
+- docs/testing/cases/<FILE>.md
+```
+
+Keep the report tight. The orchestrator reads eleven of these and
+synthesizes.
+
+#### Phase 2 — synthesis
+
+Once all eleven subagents return:
+
+1. Aggregate per-classification counts across all files. Big
+   numbers in any column are signals:
+   - Lots of **Drifted** → upstream had a recent feature shuffle;
+     the team should know.
+   - Lots of **Missing** → either the case doc was written
+     speculatively or upstream removed features without notice.
+   - Lots of **Ambiguous** → the case-doc template needs a
+     "Implementation hint" field so future grounding has a
+     starting point.
+2. Cross-check: did any subagent edit the same anchor differently?
+   (Unlikely since each owns one file, but worth a sanity pass.)
+3. Check that `git diff docs/testing/cases/` matches what the
+   subagents reported. If a subagent claimed Drifted but didn't
+   write to disk, surface it.
+4. Build the user-facing summary (see "Final report format" below).
+
+Don't make the user re-read the eleven subagent reports — give
+them the synthesised view + the per-file links.
+
+### Self-correction loop
+
+After Phase 1 returns:
+
+1. If any subagent failed (no report, error, hit token limit),
+   re-spawn just that one with a tighter scope (e.g. "process
+   tests T15-T17 only, not the full file").
+2. If a subagent's report claims edits but `git diff` shows no
+   changes, the subagent silently dropped the writes — re-spawn
+   with explicit instruction to use the Edit tool.
+3. If two subagents flag the same upstream code path with
+   contradictory claims (one says Grounded, one says Missing),
+   re-read the source yourself and adjudicate.
+
+Cap re-spawns at **2 per file** — past that, mark the file as
+"needs human review" in the final report and move on.
+
+### Termination conditions
+
+Stop and write a final report when one of:
+
+1. **All eleven files grounded.** Per-file classification counts +
+   diff stat. Done.
+2. **Hit the re-spawn cap on 3+ files.** Stop, write up which
+   files are blocked, what each blocker looks like.
+3. **Build-reference is stale.** If multiple subagents report
+   "Missing" against features the user knows shipped, the
+   extract may be out of date — verify the version
+   (`build-reference/app-extracted/package.json` `version` field
+   vs `CLAUDE_DESKTOP_VERSION` repo variable) before continuing.
+
+### What you should NOT do
+
+- Don't commit. The user reviews everything.
+- Don't restructure the case-doc template. Eleven files, one
+  shape — keep it that way.
+- Don't add new tests. Grounding is a verify-and-anchor pass, not
+  a coverage expansion.
+- Don't run the host Claude Desktop. The build-reference extract
+  exists specifically so you don't have to attach to a live app.
+- Don't edit anything outside `docs/testing/cases/`. If you find
+  a runner discrepancy (case says "click X", runner clicks Y),
+  flag it under Open questions; don't edit the runner.
+- Don't invent anchors. If the grep doesn't find the literal,
+  classify Missing or Ambiguous — never write a fictional
+  `index.js:12345` reference.
+
+### Final report format
+
+```markdown
+## Cases grounding summary
+
+- Files reviewed:    11 / 11
+- Tests reviewed:    N (sum across all files)
+- Grounded:          N (with code anchors added)
+- Drifted (edited):  N
+- Missing (marked):  N
+- Ambiguous:         N
+- Files needing
+  human review:      N
+
+## Per-file breakdown
+
+| File | Reviewed | Grounded | Drifted | Missing | Ambiguous |
+|---|---|---|---|---|---|
+| code-tab-foundations.md | ... | ... | ... | ... | ... |
+| ... | | | | | |
+
+## Notable findings
+- <test-id>: <one-line significance>
+- ...
+
+## Open questions
+- ...
+
+## Files touched
+git status output (only docs/testing/cases/*.md should appear)
+
+## Diff summary
+git diff --stat docs/testing/cases/
+```
+
+### Operational notes
+
+- Subagents are launched in parallel via a single message with
+  multiple Agent tool calls. Don't serialize them — Phase 1 takes
+  ~15 minutes serial, ~3 minutes parallel.
+- Each subagent's Edit calls land directly in the working tree.
+  No merge conflicts because each owns one file.
+- The build-reference `index.js` is 546k lines. Subagents should
+  use `grep -nE` with anchored string literals, not full reads.
+  Recommended grep pattern style:
+  `grep -nE 'globalShortcut\.register\([^)]*' build-reference/app-extracted/.vite/build/index.js`
+- If a subagent needs to verify a renderer-side claim (DOM event
+  flow, React component shape), the relevant preload is in
+  `mainView.js` / `mainWindow.js`. Don't grep `index.js` for
+  renderer-only behavior.
+
+Begin with Phase 0. Don't fan out until calibration succeeds.
--- a/docs/testing/cases/README.md
+++ b/docs/testing/cases/README.md
@@ -0,0 +1,94 @@
+# Functional Test Cases
+
+Test specifications grouped by feature surface. For live status, see [`../matrix.md`](../matrix.md). For sweep workflow, see [`../runbook.md`](../runbook.md). For the UI element inventory, see [`../ui/`](../ui/).
+
+## Files
+
+| File | Surfaces covered | Tests |
+|------|------------------|-------|
+| [`launch.md`](./launch.md) | App startup, doctor, package detection, multi-instance | T01, T02, T13, T14 |
+| [`tray-and-window-chrome.md`](./tray-and-window-chrome.md) | Tray icon, window decorations, hybrid topbar, hide-to-tray | T03, T04, T07, T08, S08, S13 |
+| [`shortcuts-and-input.md`](./shortcuts-and-input.md) | URL handler, Quick Entry, global shortcuts | T05, T06, S06, S07, S09, S10, S11, S12, S14, S29, S30, S31, S32, S33, S34, S35, S36, S37 |
+| [`code-tab-foundations.md`](./code-tab-foundations.md) | Sign-in, Code tab load, folder picker, drag-drop, terminal, file pane | T15, T16, T17, T18, T19, T20 |
+| [`code-tab-workflow.md`](./code-tab-workflow.md) | Preview, PR monitor, worktrees, auto-archive, side chat, slash menu | T21, T22, T29, T30, T31, T32 |
+| [`code-tab-handoff.md`](./code-tab-handoff.md) | Notifications, external editor, file manager, connector OAuth, IDE handoff | T23, T24, T25, T34, T38, T39 |
+| [`routines.md`](./routines.md) | Scheduled tasks, catch-up runs, suspend inhibit, config dir | T26, T27, T28, S19, S20, S21 |
+| [`extensibility.md`](./extensibility.md) | Plugins, MCP, hooks, CLAUDE.md memory, worktree storage | T11, T33, T35, T36, T37, S27, S28 |
+| [`distribution.md`](./distribution.md) | DEB, RPM, AppImage, dependency pulls, auto-update | S01, S02, S03, S04, S05, S15, S16, S26 |
+| [`platform-integration.md`](./platform-integration.md) | Autostart, Cowork, WebGL, PATH inheritance, Computer Use, Dispatch | T09, T10, T12, S17, S18, S22, S23, S24, S25 |
+
+## Standard test body
+
+Every test in this directory follows this structure:
+
+```markdown
+### T## — Title
+
+**Severity:** Smoke | Critical | Should | Could
+**Surface:** human-readable surface tag (e.g. "Code tab → Environment")
+**Applies to:** All | <subset of rows>
+**Issues:** linked issue/PR list, or `—`
+
+**Steps:**
+1. ...
+2. ...
+
+**Expected:** what should happen.
+
+**Diagnostics on failure:** which captures to attach when filing. See [`../runbook.md#diagnostic-capture`](../runbook.md#diagnostic-capture).
+
+**References:** docs links, learnings, related issues.
+
+**Code anchors:** `<file>:<line>` pointers to the upstream code or
+wrapper script that backs the load-bearing claim above. Added during
+the grounding sweep — see "Anchor scope" for guidance on where
+anchors can and can't land.
+
+**Inventory anchor:** (optional) `<element-id>` from
+[`../ui-inventory.json`](../ui-inventory.json) — only if the surface
+shows up in the v7 walker's idle capture. For surfaces inside modals
+or popups, append a sentence noting which click-chain opens them so
+the next inventory regeneration can grab them.
+```
+
+The Steps and Diagnostics fields are written so they can later become
+script entry points without a rewrite.
+
+### Anchor scope
+
+Where the load-bearing claim lives determines where the anchor goes:
+
+- **Upstream code** — any file under
+  `build-reference/app-extracted/.vite/build/` (most often `index.js`,
+  the main process). Use `index.js:N` style anchors.
+- **Our wrapper code** — `scripts/launcher-common.sh`, `scripts/doctor.sh`,
+  `scripts/patches/*.sh`, `scripts/frame-fix-wrapper.js`,
+  `scripts/wco-shim.js`. Use `<repo-relative-path>:N` style anchors.
+- **Server-rendered (claude.ai SPA)** — anchorable only via the v7
+  walker inventory (`docs/testing/ui-inventory.json`) or a runtime
+  capture from `tools/test-harness/grounding-probe.ts`. Idle-state
+  inventory misses contextual surfaces (modals, popups, slash menus,
+  context menus, side panels) — note that explicitly.
+- **Upstream `claude` CLI binary** — out of scope for this matrix
+  (e.g. T39 `/desktop` is a CLI slash-command, not in the Electron
+  asar). Mark as Ambiguous and link to a separate CLI matrix if one
+  exists.
+
+If a claim spans multiple scopes (a wrapper script triggering
+upstream behavior, e.g. T01's launcher-log + main-window-opens),
+list all the anchors. The whole point is making the next sweep
+faster: over-anchoring is fine, missing anchors are not.
+
+### Drift markers
+
+When a sweep finds upstream behavior no longer matches the case:
+
+- **Edited Steps/Expected** — fix the case in place, mention what
+  changed in the commit message. The case is the spec.
+- **Missing in build X.Y.Z** — prepend a blockquote under the test
+  heading: `> **⚠ Missing in build 1.5354.0** — <one-line note>.
+  Re-verify after next upstream bump.` Use when the feature isn't
+  in the build at all (deprecated, behind unset flag, never shipped).
+- **Ambiguous** — don't edit; flag in the sweep report. Use when
+  the load-bearing claim could be one of several candidate code
+  paths and static analysis can't disambiguate.
--- a/docs/testing/cases/code-tab-foundations.md
+++ b/docs/testing/cases/code-tab-foundations.md
@@ -0,0 +1,197 @@
+# Code Tab — Foundations
+
+Tests covering Code-tab availability on Linux (officially unsupported per upstream docs), sign-in flow, folder picker, drag-and-drop, and the basic editing surfaces (terminal, file pane). See [`../matrix.md`](../matrix.md) for status.
+
+## T15 — Sign-in completes in the embedded webview
+
+> **Drift in build 1.5354.0** — Sign-in is an in-app `mainView.webContents.loadURL` flow, not an `xdg-open` browser handoff. Claude.ai/login renders inside the embedded BrowserView; the resulting `sessionKey` cookie is then exchanged at `${apiHost}/v1/oauth/${org}/authorize` with redirect URI `https://claude.ai/desktop/callback`. No system browser is involved.
+
+**Severity:** Smoke
+**Surface:** Auth / embedded webview
+**Applies to:** All rows
+**Issues:** —
+
+**Steps:**
+1. Launch a fresh app instance (signed-out state).
+2. Click **Sign in**. Observe claude.ai/login rendering inside the app.
+3. Authenticate. Observe the in-app navigation completing back to the
+   workspace.
+
+**Expected:** Sign-in stays inside the embedded webview (`will-navigate`
+handler `Ihr` keeps `/login/` paths in-app). After auth the
+`sessionKey` cookie is captured and silently exchanged for an OAuth
+token via the `desktop/callback` redirect. Account dropdown populates;
+no auth banner remains.
+
+**Diagnostics on failure:** DevTools console for the `mainView`
+BrowserView, network captures of the `/v1/oauth/{org}/authorize` and
+`/v1/oauth/token` calls, launcher log, cookie jar inspection
+(`sessionKey` on `.claude.ai`).
+
+**References:** [Code tab auth troubleshooting](https://code.claude.com/docs/en/desktop#403-or-authentication-errors-in-the-code-tab)
+
+**Code anchors:**
+- `build-reference/app-extracted/.vite/build/index.js:141996` — desktop
+  OAuth redirect URI `https://claude.ai/desktop/callback`
+- `build-reference/app-extracted/.vite/build/index.js:142431` — POST to
+  `${apiHost}/v1/oauth/${org}/authorize` with `Bearer ${sessionKey}`
+- `build-reference/app-extracted/.vite/build/index.js:216565` — `Ihr`
+  treats `/login/` paths as in-app (not external)
+- `build-reference/app-extracted/.vite/build/index.js:141316` —
+  `mainView.webContents.loadURL(...)` drives the embedded sign-in
+
+## T16 — Code tab loads
+
+**Severity:** Smoke
+**Surface:** Code tab — top-level UI
+**Applies to:** All rows
+**Issues:** —
+
+**Steps:**
+1. After sign-in, click the **Code** tab at the top center.
+2. Wait a few seconds.
+
+**Expected:** Code tab renders the session UI (sidebar, prompt area, environment dropdown). Per upstream docs the Code tab is "not supported" on Linux — the patched build under this project should render the UI normally or surface a clear, actionable message. Not a blank screen, infinite spinner, or `Error 403: Forbidden`.
+
+**Diagnostics on failure:** Screenshot, DevTools console, network captures (auth/feature-flag responses), launcher log, the active patch set in `scripts/patches/`.
+
+**References:** [Use Claude Code Desktop](https://code.claude.com/docs/en/desktop), [Get started with the desktop app](https://code.claude.com/docs/en/desktop-quickstart)
+
+**Code anchors:**
+- `build-reference/app-extracted/.vite/build/index.js:525066` —
+  `sidebarMode === "code"` rewrites the BrowserView path to `/epitaxy`
+- `build-reference/app-extracted/.vite/build/index.js:496066` — Code
+  deeplinks (`claude://code?...`) navigate to `/epitaxy?...`
+- `build-reference/app-extracted/.vite/build/index.js:105273` — `IHi`
+  recognises `/epitaxy` and `/epitaxy/...` as the Code-tab path
+- `build-reference/app-extracted/.vite/build/index.js:105346` —
+  `sidebarMode` enum contains `"code"`
+
+**Inventory anchor:** `…tablist.tab-by-name.code` (role `tab`, label
+`Code`) — confirms the Code tab is reachable from the new-chat tablist
+in the captured idle state.
+
+## T17 — Folder picker opens
+
+**Severity:** Smoke
+**Surface:** Code tab → Environment selection
+**Applies to:** All rows
+**Issues:** —
+**Runner:** [`tools/test-harness/src/runners/T17_folder_picker.spec.ts`](../../../tools/test-harness/src/runners/T17_folder_picker.spec.ts). Runtime-attaches via SIGUSR1, mocks the main-process `dialog.showOpenDialog`, and drives the renderer through `webContents.executeJavaScript`. The click chain to reach the folder-picker button still awaits selector tuning.
+
+**Steps:**
+1. In the Code tab, click the environment pill → **Local** → **Select folder**.
+2. Choose a project directory.
+
+**Expected:** Native file chooser opens. On Wayland sessions the chooser is `xdg-desktop-portal`-backed (verify with `busctl --user tree org.freedesktop.portal.Desktop`). On X11 sessions the GTK/Qt native picker fires. Selected path appears in the env pill.
+
+**Diagnostics on failure:** `systemctl --user status xdg-desktop-portal`, `XDG_SESSION_TYPE`, the portal backend in use (`xdg-desktop-portal-kde`, `xdg-desktop-portal-gnome`, `xdg-desktop-portal-wlr`), launcher log.
+
+**References:** [Local sessions](https://code.claude.com/docs/en/desktop#local-sessions)
+
+**Code anchors:**
+- `build-reference/app-extracted/.vite/build/index.js:66403` — IPC
+  channel `claude.web_FileSystem_browseFolder` (renderer → main)
+- `build-reference/app-extracted/.vite/build/index.js:509188` —
+  `browseFolder` impl calls `dialog.showOpenDialog` with
+  `properties: ["openDirectory", "createDirectory"]`
+- `build-reference/app-extracted/.vite/build/index.js:450534` —
+  `grantViaPicker` (Operon host-access folder grant) uses the same
+  `["openDirectory"]` shape
+- `tools/test-harness/src/lib/claudeai.ts:122` — `installOpenDialogMock`
+  intercepts both `(opts)` and `(window, opts)` arities, matching the
+  call sites at index.js:509196 and :450534
+
+**Inventory anchor:** `root.main.region.button-by-name.select-folder`
+(role `button`, label `Select folder…`) — the persistent button the
+T17 runner clicks before the dialog mock fires.
+
+## T18 — Drag-and-drop files into prompt
+
+**Severity:** Critical
+**Surface:** Code tab → Prompt area
+**Applies to:** All rows
+**Issues:** —
+
+**Steps:**
+1. Open a Code-tab session.
+2. From the system file manager, drag a single file into the prompt area.
+3. Repeat with multiple files at once.
+
+**Expected:** Files attach to the prompt. The renderer resolves dropped
+`File` objects to absolute paths via the preload-bridged
+`claudeAppSettings.filePickers.getPathForFile` (Electron's
+`webUtils.getPathForFile`). Multi-file drops attach each file. Works on
+both Wayland and X11.
+
+**Diagnostics on failure:** Screen recording, `wl-paste --list-types` (Wayland) or `xclip -selection clipboard -t TARGETS -o` (X11) during drag, DevTools console, launcher log.
+
+**References:** [Add files and context](https://code.claude.com/docs/en/desktop#add-files-and-context-to-prompts)
+
+**Code anchors:**
+- `build-reference/app-extracted/.vite/build/mainView.js:9267` —
+  `filePickers.getPathForFile` wraps `webUtils.getPathForFile`
+- `build-reference/app-extracted/.vite/build/mainView.js:9552` —
+  exposed to the renderer as `window.claudeAppSettings`
+
+## T19 — Integrated terminal
+
+**Severity:** Critical
+**Surface:** Code tab → Terminal pane
+**Applies to:** All rows
+**Issues:** —
+
+**Steps:**
+1. In a Code-tab session, press `` Ctrl+` `` (or open via the Views menu).
+2. Confirm the terminal opens in the session's working directory.
+3. Run `git status`, `npm --version`, `gh auth status`.
+
+**Expected:** Terminal pane opens in the session's working directory and inherits the same `PATH` Claude sees. Standard commands run cleanly. Terminal pane is local-session-only per docs.
+
+**Diagnostics on failure:** Terminal pane content, `echo $PATH` from inside the pane, `pwd`, the shell binary in use, launcher log.
+
+**References:** [Run commands in the terminal](https://code.claude.com/docs/en/desktop#run-commands-in-the-terminal)
+
+**Code anchors:**
+- `build-reference/app-extracted/.vite/build/index.js:69135` — IPC
+  channel `claude.web_LocalSessions_startShellPty` (also
+  `resizeShellPty`, `writeShellPty` at :69184, :69210)
+- `build-reference/app-extracted/.vite/build/index.js:486438` —
+  `startShellPty` body: spawns `node-pty` in
+  `n.worktreePath ?? n.cwd` with `TERM=xterm-256color`
+- `build-reference/app-extracted/.vite/build/index.js:486463` —
+  `node-pty` dynamic import (optional dep, `package.json` line 100)
+- `build-reference/app-extracted/.vite/build/index.js:259306` —
+  `shell-path-worker/shellPathWorker.js` resolves the user's interactive
+  PATH; `FX()` (line 259311) returns it for the spawned PTY env
+
+## T20 — File pane opens and saves
+
+**Severity:** Critical
+**Surface:** Code tab → File pane
+**Applies to:** All rows
+**Issues:** —
+
+**Steps:**
+1. In a Code-tab session, click a file path in chat or diff to open it in the file pane.
+2. Make a small edit. Click **Save**.
+3. Modify the file externally (e.g. `echo >> file`). Re-edit in the pane. Observe the on-disk-changed warning.
+
+**Expected:** File opens in the editor pane. Edits write back to disk on Save. If the file changed on disk since opening, the pane shows the on-disk-changed warning and offers override or discard. (The conflict check is sha256-based, not mtime-based — `writeSessionFile` reads the current bytes, hashes them, and rejects with `Conflict` if the renderer-supplied `expectedHash` doesn't match.)
+
+**Diagnostics on failure:** `sha256sum <file>` output (and stat mtime for cross-checking), launcher log, DevTools console, screen recording of the warning state.
+
+**References:** [Open and edit files](https://code.claude.com/docs/en/desktop#open-and-edit-files)
+
+**Code anchors:**
+- `build-reference/app-extracted/.vite/build/index.js:68922` — IPC
+  channel `claude.web_LocalSessions_readSessionFile`
+- `build-reference/app-extracted/.vite/build/index.js:69003` — IPC
+  channel `claude.web_LocalSessions_writeSessionFile` with
+  `expectedHash` argument at position 3
+- `build-reference/app-extracted/.vite/build/index.js:492874` —
+  `readSessionFile` impl
+- `build-reference/app-extracted/.vite/build/index.js:492954` —
+  `writeSessionFile` impl: sha256-hashes current on-disk bytes,
+  returns `{ status: nW.Conflict, currentHash }` when `expectedHash`
+  mismatches
--- a/docs/testing/cases/code-tab-handoff.md
+++ b/docs/testing/cases/code-tab-handoff.md
@@ -0,0 +1,163 @@
+# Code Tab — Handoffs to Other Apps
+
+Tests covering desktop notifications, "Open in" external editor, "Show in Files" file manager, connector OAuth round-trips, IDE handoff, and graceful failure of the macOS/Windows-only `/desktop` CLI command. See [`../matrix.md`](../matrix.md) for status.
+
+## T23 — Desktop notifications fire
+
+**Severity:** Critical
+**Surface:** Notifications (libnotify / XDG Notifications)
+**Applies to:** All rows
+**Issues:** —
+
+**Steps:**
+1. Trigger each notification source: scheduled-task fire ([T27](./routines.md#t27--scheduled-task-fires-and-notifies)), CI completion ([T22](./code-tab-workflow.md#t22--pr-monitoring-via-gh)), Dispatch handoff ([S24](./platform-integration.md#s24--dispatch-spawned-code-session-appears-with-badge-and-notification)).
+2. Observe each notification appears.
+3. Click each — confirm it focuses the relevant session.
+
+**Expected:** Notifications appear in the active DE's notification area (Plasma's notification daemon, Mako on wlroots, gnome-shell, etc.) and are clickable to focus the relevant session.
+
+**Diagnostics on failure:** `gdbus call --session --dest=org.freedesktop.Notifications --object-path=/org/freedesktop/Notifications --method=org.freedesktop.DBus.Introspectable.Introspect`, `notify-send "test"` (sanity check daemon), launcher log, DE-specific notification logs.
+
+**References:** [Scheduled tasks](https://code.claude.com/docs/en/desktop-scheduled-tasks), [Monitor pull request status](https://code.claude.com/docs/en/desktop#monitor-pull-request-status)
+
+**Code anchors:** `build-reference/app-extracted/.vite/build/index.js:494456` (`new hA.Notification(r)`, which Electron backs with libnotify on Linux); `:495110` (`showNotification(title, body, tag, navigateTo)` dispatches to Swift on macOS and to Electron elsewhere); `:511174`, `:512738` (cu-lock / tool-permission notifications wire a click callback that navigates to `/local_sessions/{sessionId}` to focus the session).
+
+## T24 — Open in external editor
+
+**Severity:** Should
+**Surface:** Code tab → Right-click → Open in
+**Applies to:** All rows
+**Issues:** —
+
+**Steps:**
+1. Install at least one of: VS Code, Cursor, Zed, Windsurf (any install method —
+   flatpak, AppImage, distro package). Xcode is darwin-only and absent on Linux.
+2. In the Code tab, right-click a file path → **Open in** → choose the editor.
+3. Confirm the editor opens at that file.
+
+**Expected:** Right-click → **Open in** launches the chosen editor with the file
+path. Editor is invoked by URL scheme (`vscode://file/<path>`,
+`cursor://file/<path>`, `zed://file/<path>`, `windsurf://file/<path>`) via
+`shell.openExternal`, which delegates to `xdg-open`'s
+`x-scheme-handler/<editor>` resolution rather than hard-coded paths.
+
+**Diagnostics on failure:** `xdg-mime query default x-scheme-handler/vscode` (or
+`cursor`/`zed`/`windsurf`), `desktop-file-validate` on the editor's `.desktop`
+file, `xdg-open vscode://file/<path>` from terminal (sanity check), launcher
+log.
+
+**References:** [Open files in other apps](https://code.claude.com/docs/en/desktop#open-files-in-other-apps)
+
+**Code anchors:** `build-reference/app-extracted/.vite/build/index.js:59076`
+(editor enum: VSCode, Cursor, Zed, Windsurf, Xcode); `:463902` (`Mtt`
+registry — `vscode://`, `cursor://`, `zed://`, `windsurf://`, `xcode://` with
+darwin-only flag on Xcode); `:463956` (`getInstalledEditors` probes via
+`app.getApplicationInfoForProtocol`); `:464011`
+(`shell.openExternal('<scheme>://file/<encoded-path>:<line>')` — path is
+URL-encoded but `/` separators are preserved); `:68816` IPC handler
+`LocalSessions.openInEditor(path, editor, sshConfig, line)`.
+
+## T25 — Show in Files / file manager
+
+**Severity:** Should
+**Surface:** Code tab → Right-click → Show in Files
+**Applies to:** All rows
+**Issues:** —
+
+**Steps:**
+1. In the Code tab, right-click a file path → "Show in Files" (Linux equivalent of macOS "Show in Finder" / Windows "Show in Explorer").
+2. Confirm the system file manager opens the containing folder with the file selected.
+
+**Expected:** System file manager (Nautilus on GNOME, Dolphin on KDE, Thunar on Xfce, etc.) opens the containing folder with the file pre-selected. On DEs without a DBus FileManager1 service only the parent folder opens (see Code anchors). Resolution respects `xdg-mime` defaults.
+
+**Diagnostics on failure:** `xdg-mime query default inode/directory`, `xdg-open <dir>` from terminal, the menu label rendered (was it Linux-specific or stuck on "Show in Finder"?), launcher log.
+
+**References:** [Open files in other apps](https://code.claude.com/docs/en/desktop#open-files-in-other-apps)
+
+**Code anchors:** `build-reference/app-extracted/.vite/build/index.js:66652` IPC
+handler `FileSystem.showInFolder(path)`; `:509431` impl thin-wraps
+`hA.shell.showItemInFolder(Tc(path))`. Electron's `showItemInFolder` on Linux
+falls back to `xdg-open` on the parent directory when no DBus FileManager1
+service is present, so the file is rarely pre-selected on minimal DEs — only
+the parent folder opens.
+
+## T34 — Connector OAuth round-trip
+
+**Severity:** Critical
+**Surface:** Connectors → OAuth handoff
+**Applies to:** All rows
+**Issues:** —
+
+**Steps:**
+1. In a Code-tab session, click **+** → **Connectors** → choose a service (Slack, GitHub, Linear, Notion, Google Calendar).
+2. Step through the OAuth flow in the system browser.
+3. Return to Claude Desktop and verify the connector appears in **Settings → Connectors**.
+4. Use the connector in a prompt (e.g. "list my Slack channels").
+
+**Expected:** Adding a connector launches the browser via `xdg-open`, OAuth callback hands control back to Claude Desktop, connector appears in Settings, and is usable in subsequent prompts.
+
+**Diagnostics on failure:** `xdg-mime query default x-scheme-handler/https`, the callback URL scheme, network captures of OAuth redirect, launcher log, DevTools console.
+
+**References:** [Connect external tools](https://code.claude.com/docs/en/desktop#connect-external-tools), [Connectors for everyday life](https://claude.com/blog/connectors-for-everyday-life)
+
+**Code anchors:**
+`build-reference/app-extracted/.vite/build/index.js:524819`
+(`hA.app.setAsDefaultProtocolClient("claude")` — registers the `claude://`
+deep-link scheme used by the OAuth callback); `:525026` mainWindow
+`setWindowOpenHandler` routes external URLs through `MAA(url)` →
+`:525102`–`:525135` (only `http:`/`https:`/`mailto:`/`tel:`/`sms:`/
+`ms-(excel|powerpoint|word):` are forwarded to system handlers; everything
+else is dropped); `:136233` `$a(url)` thin-wraps `hA.shell.openExternal(url)`
+(this is the single egress point for browser handoff); `:159634`
+`mcpSubmitOAuthCallbackUrl(serverName, callbackUrl)` and `:159651`
+`claudeOAuthCallback(authorizationCode, state)` — IPC bridges that consume
+the deep-link callback. See [`docs/learnings/plugin-install.md`](../../learnings/plugin-install.md)
+for the orgId/sessionKey cookie chain that gates connector listing.
+
+## T38 — Continue in IDE
+
+**Severity:** Should
+**Surface:** Code tab → Continue in menu
+**Applies to:** All rows
+**Issues:** —
+
+**Steps:**
+1. In a Code-tab session, click the IDE icon (bottom right of session toolbar) → **Continue in** → choose an IDE.
+2. Confirm the IDE opens at the working directory.
+
+**Expected:** Selected IDE opens the project at the current working directory. The IDE is resolved through `xdg-open` and its `.desktop` file.
+
+**Diagnostics on failure:** `xdg-open <project-dir>` sanity check, `xdg-mime query default x-scheme-handler/vscode` (or matching scheme for the chosen IDE), launcher log, the IDE's `.desktop` file.
+
+**References:** [Continue in another surface](https://code.claude.com/docs/en/desktop#continue-in-another-surface)
+
+**Code anchors:** Same IPC surface as [T24](#t24--open-in-external-editor) —
+`build-reference/app-extracted/.vite/build/index.js:68816`
+(`LocalSessions.openInEditor(path, editor, sshConfig, line)` accepts a
+directory path the same way as a file path); `:463902` editor registry;
+`:464011` `shell.openExternal('<scheme>://file/<cwd>')`. The "Continue in"
+chooser UI is rendered server-side by claude.ai and not present in the local
+asar — only the IPC bridge can be code-anchored.
+
+## T39 — `/desktop` CLI handoff (graceful N/A)
+
+> **Note** — This test exercises the upstream `claude` CLI binary, not the
+> Electron app. The CLI ships separately from this packaging (outside
+> `build-reference/`), so no anchor in `app-extracted/.vite/build/` exists for
+> the slash-command handler. Re-verify behaviour against the CLI binary that
+> ships with the upstream version under test (currently 1.5354.0).
+
+**Severity:** Could
+**Surface:** CLI `/desktop` command
+**Applies to:** All rows (Linux equally)
+**Issues:** —
+
+**Steps:**
+1. In a CLI session, run `/desktop`.
+2. Inspect exit code and output.
+
+**Expected:** `/desktop` is documented as macOS/Windows-only. On Linux it must fail gracefully — print a clear "not supported on Linux" message and exit cleanly. No partial state transition, no panic, no corrupted session file.
+
+**Diagnostics on failure:** Full CLI output, exit code, the session file before/after (`~/.claude/sessions/...`), strace if the CLI hangs.
+
+**References:** [Coming from the CLI](https://code.claude.com/docs/en/desktop#coming-from-the-cli)
--- a/docs/testing/cases/code-tab-workflow.md
+++ b/docs/testing/cases/code-tab-workflow.md
@@ -0,0 +1,151 @@
+# Code Tab — Workflow Surfaces
+
+Tests covering the dev-server preview pane, PR monitoring, worktree isolation, auto-archive, side chat, and the slash command menu. See [`../matrix.md`](../matrix.md) for status.
+
+## T21 — Dev server preview pane
+
+**Severity:** Should
+**Surface:** Code tab → Preview pane
+**Applies to:** All rows
+**Issues:** —
+
+**Steps:**
+1. In a Code-tab session, ensure `.claude/launch.json` is configured (or let auto-detect populate it).
+2. Click **Preview** dropdown → **Start**.
+3. Interact with the embedded browser. Verify auto-verify takes screenshots.
+4. Stop the server from the dropdown.
+
+**Expected:** Configured dev server starts. Embedded browser renders the running app. Auto-verify takes screenshots and inspects DOM. Stopping from the dropdown actually stops the process.
+
+**Diagnostics on failure:** `lsof -i :<port>` to see the server, screenshot of preview pane state, `.claude/launch.json` content, launcher log, DevTools console.
+
+**References:** [Preview your app](https://code.claude.com/docs/en/desktop#preview-your-app)
+
+**Code anchors:**
+- `build-reference/app-extracted/.vite/build/index.js:262175` — `Pae = "Claude Preview"` + `preview_*` MCP tool table (`preview_start`, `preview_stop`, `preview_list`, `preview_screenshot`, `preview_snapshot`, `preview_inspect`, `preview_click`, `preview_fill`, `preview_eval`, `preview_network`, `preview_resize`).
+- `build-reference/app-extracted/.vite/build/index.js:259604` — `setAutoVerify()` and `parseLaunchJson()` (reads `.claude/launch.json`, honours `autoVerify` flag default-on).
+- `build-reference/app-extracted/.vite/build/index.js:260015` — `capturePage()` / `captureViaCDP()` drive `preview_screenshot` against the embedded preview WebContents.
+
+## T22 — PR monitoring via `gh`
+
+**Severity:** Critical
+**Surface:** Code tab → CI status bar
+**Applies to:** All rows
+**Issues:** —
+
+**Steps:**
+1. Ensure `gh` is installed and authenticated (`gh auth status`).
+2. In a Code-tab session, ask Claude to open a PR for a small change.
+3. Observe the CI status bar. Toggle **Auto-fix** and **Auto-merge**.
+4. Run a separate test on a row where `gh` is **not** installed — confirm the missing-`gh` prompt appears the first time a PR action is taken.
+
+**Expected:** With `gh` present and authenticated, CI status bar surfaces in the session toolbar. Auto-fix and Auto-merge toggles work (auto-merge requires the corresponding GitHub repo setting). If `gh` is missing, the app surfaces a prompt directing the user to https://cli.github.com (auto-install via `installGh` only runs on macOS/brew; Linux returns an error string with the install URL).
+
+**Diagnostics on failure:** `gh auth status`, `which gh`, launcher log, DevTools console, screenshot of status bar, the GitHub repo's "Allow auto-merge" setting.
+
+**References:** [Monitor pull request status](https://code.claude.com/docs/en/desktop#monitor-pull-request-status)
+
+**Code anchors:**
+- `build-reference/app-extracted/.vite/build/index.js:464281` — `GitHubPrManager` (`prStateCache`, `prChecksCache`); `getPrChecks` at line 464964 fans out to `gh pr view`.
+- `build-reference/app-extracted/.vite/build/index.js:464368` — `"gh CLI not found in PATH"` throw site that backs the missing-`gh` prompt.
+- `build-reference/app-extracted/.vite/build/index.js:464480` — `installGh()`: macOS-only `brew install gh`; Linux/Windows return error pointing to https://cli.github.com.
+- `build-reference/app-extracted/.vite/build/index.js:465019` — `autoMergeRequest { enabledAt }` GraphQL fragment; `enableAutoMerge` / `disableAutoMerge` at lines 465531 / 465556.
+- `build-reference/app-extracted/.vite/build/index.js:534033` — `AutoFixEngine.handleSessionEvent` toggles on `autoFixEnabled` per session.
+
+## T29 — Worktree isolation
+
+**Severity:** Critical
+**Surface:** Code tab → Sidebar (parallel sessions)
+**Applies to:** All rows
+**Issues:** —
+
+**Steps:**
+1. In a Code-tab session against a Git project, open two new sessions in parallel via **+ New session**.
+2. Make different edits in each session.
+3. Confirm `<project-root>/.claude/worktrees/<branch>` exists for each.
+4. Archive one session via the sidebar archive icon.
+
+**Expected:** Each session creates an isolated worktree at `<project-root>/.claude/worktrees/<branch>` (or the dir configured in Settings → Claude Code → "Worktree location"). Edits in one session do not appear in another until committed. Archiving removes the worktree.
+
+**Diagnostics on failure:** `git worktree list` from project root, `ls -la <project-root>/.claude/worktrees/`, launcher log.
+
+**References:** [Work in parallel with sessions](https://code.claude.com/docs/en/desktop#work-in-parallel-with-sessions)
+
+**Code anchors:**
+- `build-reference/app-extracted/.vite/build/index.js:462835` — `getWorktreeParentDir()`: returns `<baseRepo>/.claude/worktrees`, or `<chillingSlothLocation.customPath>/<basename>` when overridden in Settings.
+- `build-reference/app-extracted/.vite/build/index.js:462843` — `createWorktree()`: runs `git worktree add` with `core.longpaths=true` under the parent dir.
+- `build-reference/app-extracted/.vite/build/index.js:463290` — `git worktree remove --force` invoked on archive (cleanup path).
+- `build-reference/app-extracted/.vite/build/index.js:55231` — `chillingSlothLocation: "default"` settings key (Settings → "Worktree location").
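+
+The step 3 check can be scripted. The following is a minimal sketch, assuming the default worktree location and that `git worktree list --porcelain` output is piped in. The helper name `claude_worktrees` is hypothetical and exists neither in the app nor in this repo; a custom "Worktree location" would need a different pattern.
+
+```shell
+# Hypothetical helper: print worktree paths that live under
+# <project-root>/.claude/worktrees/, reading porcelain output on stdin.
+claude_worktrees() {
+    # Each porcelain record starts with "worktree <absolute-path>".
+    # "|| true" keeps the exit status 0 when nothing matches.
+    sed -n 's/^worktree //p' | grep '/\.claude/worktrees/' || true
+}
+```
+
+Run it from the project root as `git worktree list --porcelain | claude_worktrees`. Two parallel sessions should yield two lines, and archiving one session should drop its line.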
+
+## T30 — Auto-archive on PR merge
+
+**Severity:** Should
+**Surface:** Code tab → Sidebar
+**Applies to:** All rows
+**Issues:** —
+
+**Steps:**
+1. In Settings → Claude Code, enable **Auto-archive on PR close** (`ccAutoArchiveOnPrClose`).
+2. Open a PR from a local session. Merge or close it on GitHub.
+3. Wait up to ~5–6 minutes (sweep runs every 5 minutes, with a 30s startup delay). Observe the sidebar.
+
+**Expected:** A local session whose PR is `merged` or `closed` is archived from the sidebar on the next sweep tick (≤ ~5 min) after the merge/close event. Cached PR-state lookups have a 1-hour cooldown for sessions whose state isn't yet terminal. Remote and SSH sessions are not affected.
+
+**Diagnostics on failure:** Screenshot of sidebar, `gh pr view <num>` output (confirming merge state), launcher log, settings file content (`ccAutoArchiveOnPrClose`).
+
+**References:** [Work in parallel with sessions](https://code.claude.com/docs/en/desktop#work-in-parallel-with-sessions)
+
+**Code anchors:**
+- `build-reference/app-extracted/.vite/build/index.js:55269` — default `ccAutoArchiveOnPrClose: !1` setting.
+- `build-reference/app-extracted/.vite/build/index.js:533517` — sweep cadence constants: `$3n = 300_000` ms (5 min interval), `W3n = 3_600_000` ms (1 h recheck cooldown), `Fst = 10` (concurrent batch size).
+- `build-reference/app-extracted/.vite/build/index.js:533520` — `AutoArchiveEngine.start()` schedules the 5-min interval + 30s initial delay.
+- `build-reference/app-extracted/.vite/build/index.js:533537` — `sweep()` gates on `Qi("ccAutoArchiveOnPrClose")` and archives sessions whose `prState` lowercases to `merged` or `closed` (`D3A` predicate at line 533607).
+- `build-reference/app-extracted/.vite/build/index.js:533571` — `archiveSession(..., { cleanupWorktree: true })` removes the worktree alongside the archive.
+
+## T31 — Side chat opens
+
+**Severity:** Should
+**Surface:** Code tab → Side chat overlay
+**Applies to:** All rows
+**Issues:** —
+
+**Steps:**
+1. In a Code-tab session, press `Ctrl+;` (or type `/btw` in the prompt).
+2. Ask a question in the side chat. Confirm the side chat sees the main thread context.
+3. Close the side chat. Confirm focus returns to the main session and the side chat content is not in the main thread.
+
+**Expected:** Side chat opens, has access to main-thread context, but its replies do not appear in the main conversation. Closing returns focus.
+
+**Diagnostics on failure:** Screenshot, launcher log, DevTools console.
+
+**References:** [Ask a side question](https://code.claude.com/docs/en/desktop#ask-a-side-question-without-derailing-the-session)
+
+**Code anchors:**
+- `build-reference/app-extracted/.vite/build/index.js:487025` — side-chat system-prompt suffix: "You are running in a side chat — a lightweight fork… nothing you say here lands in the main transcript."
+- `build-reference/app-extracted/.vite/build/index.js:487265` — `this.sideChats = new Map()` per-session fork registry.
+- `build-reference/app-extracted/.vite/build/index.js:491658` — `startSideChat()` implementation; emits `side_chat_ready` / `side_chat_assistant` / `side_chat_turn_end` / `side_chat_closed` / `side_chat_error` events.
+- `build-reference/app-extracted/.vite/build/mainView.js:7506` — preload IPC bridges: `startSideChat`, `sendSideChatMessage`, `stopSideChat` (the renderer SPA wires `Ctrl+;` / `/btw` to these — UI lives in claude.ai's remote bundle, not build-reference).
+
+## T32 — Slash command menu
+
+**Severity:** Should
+**Surface:** Code tab → Prompt slash menu
+**Applies to:** All rows
+**Issues:** —
+
+**Steps:**
+1. In a Code-tab session, type `/` in the prompt box.
+2. Verify built-in commands, custom skills under `~/.claude/skills/`, project skills, and skills from installed plugins all appear.
+3. Select an entry — confirm it inserts as a highlighted token.
+
+**Expected:** Slash menu lists every available command/skill. Selection inserts the token correctly.
+
+**Diagnostics on failure:** Screenshot of slash menu, `ls ~/.claude/skills/`, project `.claude/skills/`, installed plugin manifest, launcher log.
+
+**References:** [Use skills](https://code.claude.com/docs/en/desktop#use-skills)
+
+**Code anchors:**
+- `build-reference/app-extracted/.vite/build/index.js:459463` — `getSupportedCommands({sessionId})` aggregates per-session `slashCommands` + cowork command registry (`p2()`) + built-ins (`Q_t`).
+- `build-reference/app-extracted/.vite/build/index.js:332711` — `slashCommands: Di.array(Di.string()).optional()` schema field on the session record.
+- `build-reference/app-extracted/.vite/build/index.js:377670` — `SkillManager` constructor: `skillDir = <agentDir>/.claude/skills`, `_discoverSkills()` walks project skills.
+- `build-reference/app-extracted/.vite/build/index.js:444678` — private/public skill split under `<skillsRoot>/skills/{private,public}` for plugin-supplied skills.
--- a/docs/testing/cases/distribution.md
+++ b/docs/testing/cases/distribution.md
@@ -0,0 +1,168 @@
+# Distribution — DEB, RPM, AppImage
+
+Tests covering Ubuntu/DEB-specific install behavior, Fedora/RPM-specific install behavior, AppImage fallback paths, and the auto-update interaction with system package managers. See [`../matrix.md`](../matrix.md) for status.
+
+## S01 — AppImage launches without manual `libfuse2t64` install
+
+**Severity:** Critical (for Ubuntu users)
+**Surface:** AppImage runtime / FUSE
+**Applies to:** Ubu (and any Ubuntu 24.04+ host)
+**Issues:** —
+
+**Steps:**
+1. Fresh Ubuntu 24.04 install with default packages only.
+2. Download the project AppImage.
+3. Make executable and run it.
+
+**Expected:** AppImage runs without first installing `libfuse2t64`. Either the AppImage bundles its own FUSE shim, the `.desktop`/postinst declares the dep, or the launcher gives a clear error pointing at the package name.
+
+**Currently:** Fails on Ubuntu 24.04 with `dlopen(): error loading libfuse.so.2`. Workaround: `sudo apt install libfuse2t64`. Not yet filed.
+
+**Diagnostics on failure:** Full stderr from the AppImage launch, `ldd ./claude-desktop-*.AppImage`, `dpkg -l | grep -i fuse`.
+
+**References:** —
+
+**Code anchors:** `scripts/packaging/appimage.sh:226` (downloads the upstream `appimagetool` AppImage as-is — no FUSE shim or static-mksquashfs bundling), `scripts/launcher-common.sh:64` (AppImage forces `--no-sandbox` "due to FUSE constraints"), `.github/workflows/test-artifacts.yml:47` (CI installs `libfuse2` before running the AppImage — i.e. the runtime hard-depends on libfuse2/libfuse2t64). No postinst dep declaration or user-facing FUSE error message exists.
+
+## S02 — `XDG_CURRENT_DESKTOP=ubuntu:GNOME` doesn't break DE detection
+
+**Severity:** Critical
+**Surface:** DE detection / patch gate
+**Applies to:** Ubu
+**Issues:** —
+
+**Steps:**
+1. On Ubuntu 24.04 (where `XDG_CURRENT_DESKTOP=ubuntu:GNOME`), launch the app.
+2. Inspect launcher log for any DE-detection branches that should fire as GNOME.
+3. Audit `scripts/launcher-common.sh` and any DE-gated patches for string-equality checks against `XDG_CURRENT_DESKTOP`.
+
+**Expected:** DE-detection logic handles Ubuntu's colon-separated value. `contains "GNOME"` or splitting on `:` is the safe pattern; `== "GNOME"` would miss Ubuntu.
+
+**Diagnostics on failure:** `echo $XDG_CURRENT_DESKTOP`, the relevant launcher.sh code path, launcher log, the patches that ran or didn't.
+
+**References:** Surfaced via session-capture review.
+
+**Code anchors:** `scripts/launcher-common.sh:35-44` (Niri auto-detect lowercases `XDG_CURRENT_DESKTOP` and uses `*niri*` glob — handles colon-separated values), `scripts/patches/quick-window.sh:34-35` and `:117-118` (KDE gate uses `.toLowerCase().includes("kde")` — substring, not equality), `scripts/doctor.sh:304` (purely informational `_info "Desktop: $desktop"`, no branching). No `==` equality checks against `XDG_CURRENT_DESKTOP` exist anywhere in shell or patched JS.
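+
+The "split on `:`" pattern named under Expected can be sketched as follows. This is an illustration only: `desktop_matches` is a hypothetical helper, not a function in `scripts/launcher-common.sh`, and the existing code uses substring and glob checks rather than this exact per-entry match.
+
+```shell
+# Hypothetical helper: succeed if any colon-separated entry of the
+# desktop string equals the wanted name, compared case-insensitively.
+# $1 = wanted name, $2 = desktop string (defaults to XDG_CURRENT_DESKTOP).
+desktop_matches() {
+    local want current
+    want="$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')"
+    current="$(printf '%s' "${2-${XDG_CURRENT_DESKTOP-}}" | tr '[:upper:]' '[:lower:]')"
+    # Wrapping both sides in ":" makes each entry individually matchable.
+    case ":$current:" in
+        *":$want:"*) return 0 ;;
+        *) return 1 ;;
+    esac
+}
+```
+
+With this shape, `desktop_matches GNOME 'ubuntu:GNOME'` succeeds, while a plain `[ "$XDG_CURRENT_DESKTOP" = "GNOME" ]` test would fail on Ubuntu.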
+
+## S03 — DEB install via APT pulls all required runtime deps
+
+**Severity:** Critical
+**Surface:** APT repository / dependency declarations
+**Applies to:** Ubu (any DEB-based distro)
+**Issues:** [`docs/learnings/apt-worker-architecture.md`](../../learnings/apt-worker-architecture.md)
+
+**Steps:**
+1. Add the project's APT repo per the README install instructions.
+2. `sudo apt install claude-desktop` on a fresh container/VM.
+3. Run `claude-desktop` — first launch should succeed with no further package installs.
+
+**Expected:** All transitive runtime deps are declared in the package and pulled by APT. First launch succeeds without manual `apt install` of any extra package.
+
+**Diagnostics on failure:** `apt-cache depends claude-desktop`, missing-library errors from the launcher, `ldd` against the binary.
+
+**References:** [`docs/learnings/apt-worker-architecture.md`](../../learnings/apt-worker-architecture.md)
+
+**Code anchors:** `scripts/packaging/deb.sh:185-197` (DEBIAN/control file — no `Depends:` field is emitted; relies on bundled Electron + the comment "No external dependencies are required at runtime" at line 183), `scripts/packaging/deb.sh:202-230` (postinst only sets chrome-sandbox suid, no dep-pull). Worker chain serving the package: `worker/src/worker.js:22-31` (`DEB_RE`) and `:33-43` (302 → GitHub Releases).
+
+## S04 — RPM install via DNF pulls all required runtime deps
+
+**Severity:** Critical
+**Surface:** DNF repository / dependency declarations
+**Applies to:** KDE-W, KDE-X, GNOME, Sway, i3, Niri (any RPM-based distro)
+**Issues:** [`docs/learnings/apt-worker-architecture.md`](../../learnings/apt-worker-architecture.md) *(covers both APT and DNF)*
+
+**Steps:**
+1. Add the project's DNF repo per the README.
+2. `sudo dnf install claude-desktop` on a fresh container/VM.
+3. Run `claude-desktop` — first launch should succeed.
+
+**Expected:** All transitive runtime deps are declared in the RPM and pulled by DNF. First launch succeeds with no further package installs.
+
+**Diagnostics on failure:** `dnf repoquery --requires claude-desktop`, `rpm -qR claude-desktop`, launcher missing-library errors.
+
+**References:** [`docs/learnings/apt-worker-architecture.md`](../../learnings/apt-worker-architecture.md)
+
+**Code anchors:** `scripts/packaging/rpm.sh:188` (`AutoReqProv: no` — explicitly disables RPM's auto-dep generation; spec declares no `Requires:`), `scripts/packaging/rpm.sh:194-198` (strip + build-id disabled because Electron binaries don't tolerate them — bundled approach). Worker chain: `worker/src/worker.js:28-31` (`RPM_RE`).
+
+## S05 — Doctor recognises dnf-installed package, doesn't false-flag as AppImage
+
+**Severity:** Should
+**Surface:** Doctor package-format detection
+**Applies to:** KDE-W, KDE-X, GNOME, Sway, i3, Niri
+**Issues:** —
+
+**Steps:**
+1. On a Fedora/Nobara/RPM-based distro with claude-desktop installed via dnf, run `claude-desktop --doctor`.
+2. Look for the install-method line.
+
+**Expected:** Doctor detects rpm install (e.g. via `rpm -qf` against the binary path) and reports it cleanly. No `not found via dpkg (AppImage?)` warning.
+
+**Currently:** Doctor's install-method check is gated on `command -v dpkg-query`, so on RPM-only hosts (no dpkg installed) the block is skipped entirely — no install-method line is printed. On hosts that have *both* `dpkg-query` and an rpm-installed `claude-desktop` (uncommon, e.g. mixed Debian + dnf), the misleading `claude-desktop not found via dpkg (AppImage?)` WARN does fire. Either way, no `rpm -qf` branch exists. Affects KDE-W, KDE-X, GNOME, Sway, i3, Niri rows ([T13](./launch.md#t13--doctor-reports-correct-package-format)). Not yet filed.
+
+**Diagnostics on failure:** Full `--doctor` output, `rpm -qf $(which claude-desktop)`, the doctor source line that decides the format.
+
+**References:** [T13](./launch.md#t13--doctor-reports-correct-package-format)
+
+**Code anchors:** `scripts/doctor.sh:353-362` — install-method check is gated on `command -v dpkg-query`; only runs on Debian-family hosts. Falls through to `_warn 'claude-desktop not found via dpkg (AppImage?)'` only if `dpkg-query` is present but returns empty. On Fedora/RPM hosts (`dpkg-query` absent), the entire block is skipped and **no install-method line is printed at all** — neither the misleading WARN nor a correct `rpm -qf` PASS. The drift is "no detection" rather than "false-flag as AppImage" on dpkg-less systems.
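+
+One possible shape for the missing `rpm -qf` branch is sketched below. It is not the current `scripts/doctor.sh` logic. The function name `detect_install_method` is hypothetical, and treating a non-empty `APPIMAGE` environment variable as the AppImage signal is an assumption about how the check would be wired in.
+
+```shell
+# Hypothetical helper: classify how the binary at $1 was installed.
+detect_install_method() {
+    local bin_path="$1"
+    # Debian family: dpkg-query -S maps a file path to its owning package.
+    if command -v dpkg-query >/dev/null 2>&1 \
+        && dpkg-query -S "$bin_path" >/dev/null 2>&1; then
+        echo 'deb'
+        return 0
+    fi
+    # RPM family: rpm -qf does the same lookup against the RPM database.
+    if command -v rpm >/dev/null 2>&1 \
+        && rpm -qf "$bin_path" >/dev/null 2>&1; then
+        echo 'rpm'
+        return 0
+    fi
+    # Assumption: the AppImage runtime exports APPIMAGE to the app.
+    if [ -n "${APPIMAGE-}" ]; then
+        echo 'appimage'
+        return 0
+    fi
+    echo 'unknown'
+}
+```
+
+Probing each package database by file ownership, rather than gating the whole block on `dpkg-query` being present, would cover both the dpkg-less Fedora host and the mixed host described above.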
+
+## S15 — AppImage extraction (`--appimage-extract`) works as documented fallback
+
+**Severity:** Could
+**Surface:** AppImage runtime / FUSE-less fallback
+**Applies to:** Any AppImage row
+**Issues:** —
+
+**Steps:**
+1. On a host without FUSE, run `./claude-desktop-*.AppImage --appimage-extract`.
+2. Inspect `squashfs-root/`.
+3. Run `squashfs-root/AppRun`.
+
+**Expected:** Extraction completes. `squashfs-root/AppRun` launches the app cleanly without FUSE.
+
+**Diagnostics on failure:** Extraction stderr, `ls squashfs-root/`, AppRun stderr.
+
+**References:** Suggested by the upstream AppImage runtime's own error output when FUSE is missing; the project adds no FUSE message of its own (see the S01 code anchors).
+
+**Code anchors:** `scripts/packaging/appimage.sh:282` and `:312` (built with stock `appimagetool`, which always supports `--appimage-extract`), `scripts/packaging/appimage.sh:70-118` (`AppRun` script that lives at `squashfs-root/AppRun` after extraction). CI exercises this path: `tests/test-artifact-appimage.sh:36-44` and `.github/workflows/ci.yml:388` both run `--appimage-extract` and assert `squashfs-root/` exists.
+
+## S16 — AppImage mount cleans up on app exit
+
+**Severity:** Should
+**Surface:** AppImage mount lifecycle
+**Applies to:** Any AppImage row
+**Issues:** [CLAUDE.md "Common Gotchas"](https://github.com/aaddrick/claude-desktop-debian/blob/main/CLAUDE.md)
+
+**Steps:**
+1. Launch the AppImage. Confirm `mount | grep claude` shows the mount.
+2. Quit the app cleanly via tray → Quit (or `Ctrl+Q`).
+3. Re-run `mount | grep claude` — mount should be gone.
+
+**Expected:** AppImage's mount at `/tmp/.mount_claude*` is unmounted and the directory removed when all child Electron processes exit. Stale mounts after force-quit are handled by `pkill -9 -f "mount_claude"` per CLAUDE.md but should not be the common case.
+
+**Diagnostics on failure:** `mount | grep claude` after exit, `ls -la /tmp/.mount_claude*`, `pgrep -af claude`, `journalctl -k -n 50` for mount errors.
+
+**References:** [CLAUDE.md "Common Gotchas"](https://github.com/aaddrick/claude-desktop-debian/blob/main/CLAUDE.md)
+
+**Code anchors:** Mount lifecycle is owned by upstream `appimagetool`'s runtime, not this repo — `scripts/packaging/appimage.sh:282`/`:312` invokes the stock tool with no custom AppRun-side cleanup. `CLAUDE.md:179-183` documents `pkill -9 -f "mount_claude"` as the manual recovery for stale mounts after force-quit. No project-side unmount handler exists; the test asserts upstream behavior, not ours.
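+
+The before/after `mount | grep claude` comparison in the steps can be made scriptable. A minimal sketch follows, assuming the usual Linux `mount` output layout. The helper name `stale_claude_mounts` is hypothetical, and mount sources containing spaces are not handled.
+
+```shell
+# Hypothetical helper: read a mount table on stdin (the output of `mount`)
+# and print any mount points under /tmp/.mount_claude*.
+stale_claude_mounts() {
+    # `mount` prints "<source> on <mountpoint> type <fstype> (<options>)".
+    awk '$2 == "on" && $3 ~ /^\/tmp\/\.mount_claude/ { print $3 }'
+}
+```
+
+After a clean quit, `mount | stale_claude_mounts` should print nothing. Any output names the mount point to include in the failure attachment.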
+
+## S26 — Auto-update is disabled when installed via `apt` / `dnf`
+
+> **⚠ Missing in build 1.5354.0** — No project-side suppression of upstream auto-update exists; the launcher exports `ELECTRON_FORCE_IS_PACKAGED=true`, which causes upstream's `lii()` gate to return true on Linux and the auto-update tick loop to start. Suppression is "accidental" — it relies on Electron's built-in `autoUpdater` module being unimplemented on Linux (so `setFeedURL`/`checkForUpdates` throw, the `error` listener logs, and no download happens). Tracked at [#567](https://github.com/aaddrick/claude-desktop-debian/issues/567); re-verify after next upstream bump.
+
+**Severity:** Critical
+**Surface:** Auto-update path
+**Applies to:** All DEB/RPM rows
+**Issues:** [#567](https://github.com/aaddrick/claude-desktop-debian/issues/567)
+
+**Steps:**
+1. Install via APT or DNF.
+2. Launch the app and let it sit for ~5 minutes.
+3. Inspect launcher log + filesystem for any auto-update download attempt.
+
+**Expected:** When installed via the project's APT or DNF repo, the in-app auto-update path is suppressed. The app does not download replacement binaries (which would race the package manager). Updates flow through `apt upgrade` / `dnf upgrade` only. AppImage installs may continue to self-update or punt to the user.
+
+**Diagnostics on failure:** Launcher log, network captures (look for downloads from `releases.anthropic.com` or `api.anthropic.com/api/desktop/linux/...`), filesystem changes under `~/.config/Claude/`.
+
+**References:** [`docs/learnings/apt-worker-architecture.md`](../../learnings/apt-worker-architecture.md)
+
+**Code anchors:** `scripts/launcher-common.sh:249` (`export ELECTRON_FORCE_IS_PACKAGED=true` — makes upstream think it's installed); `build-reference/app-extracted/.vite/build/index.js:508761-508769` (upstream `lii()` returns `hA.app.isPackaged` on Linux — passes the gate); `:508554-508559` (only suppression hook is enterprise-policy `disableAutoUpdates`, no Linux/distro carve-out); `:508770-508774` (feed URL `https://api.anthropic.com/api/desktop/linux/<arch>/squirrel/update?...`); `:508800-508803` (calls `hA.autoUpdater.setFeedURL` + `.checkForUpdates()` unconditionally on Linux). No patch in `scripts/patches/*.sh` neutralizes the autoUpdater module or sets `disableAutoUpdates`. AppImage continues to ship update info: `scripts/packaging/appimage.sh:308-309` (`gh-releases-zsync` zsync metadata embedded for releases).
--- a/docs/testing/cases/extensibility.md
+++ b/docs/testing/cases/extensibility.md
@@ -0,0 +1,153 @@
+# Extensibility — Plugins, MCP, Hooks, Memory
+
+Tests covering the Anthropic & Partners plugin install flow, the plugin browser, MCP server config, hooks, `CLAUDE.md` memory loading, and per-user storage of plugins/worktrees. See [`../matrix.md`](../matrix.md) for status.
+
+## T11 — Plugin install (Anthropic & Partners)
+
+**Severity:** Smoke
+**Surface:** Plugin browser → install flow
+**Applies to:** All rows
+**Issues:** [`docs/learnings/plugin-install.md`](../../learnings/plugin-install.md)
+
+**Steps:**
+1. In a Code-tab session, click **+** → **Plugins** → **Add plugin**.
+2. Find an Anthropic & Partners plugin. Click **Install**.
+3. Verify it lands in **Manage plugins** and its skills appear in the slash menu.
+4. Re-install the same plugin to verify idempotence.
+
+**Expected:** Install completes end-to-end: gate logic accepts, backend endpoint responds, plugin appears in the plugin list. Re-install is idempotent.
+
+**Diagnostics on failure:** DevTools network panel during install, launcher log, `~/.claude/plugins/` content, the gate-logic code path (see learnings doc).
+
+**References:** [`docs/learnings/plugin-install.md`](../../learnings/plugin-install.md), [Install plugins](https://code.claude.com/docs/en/desktop#install-plugins)
+
+**Code anchors:** `build-reference/app-extracted/.vite/build/index.js:507181` (`installPlugin` IPC + gate, with `pluginSource === "remote"` branch and CLI fallback); `:507193` log `[CustomPlugins] installPlugin: attempting remote API install`; `:465816` `dx()` returns `~/.claude/plugins`; `:465822` `installed_plugins.json` (idempotency record).
+
+**Inventory anchor:** `…customize.main.navigation.button-by-name.add-plugin` (role `button`, label `Add plugin`); sibling `…button-by-name.browse-plugins` (label `Browse plugins`). Both are persistent in the Customize panel — anchors the entry-point click chain.
+
+## T33 — Plugin browser
+
+**Severity:** Should
+**Surface:** Plugin browser UI
+**Applies to:** All rows
+**Issues:** —
+
+**Steps:**
+1. Click **+** → **Plugins** → **Add plugin**.
+2. Confirm entries from the official Anthropic marketplace appear.
+3. Install a non-Anthropic plugin end-to-end.
+4. Verify it shows in **Manage plugins** and contributes its skills to the slash menu.
+
+**Expected:** Plugin browser opens, shows the marketplace, install completes. Installed plugins appear under Manage plugins and contribute to the slash menu.
+
+**Diagnostics on failure:** Screenshot of plugin browser, network captures, launcher log, `~/.claude/plugins/` listing.
+
+**References:** [Install plugins](https://code.claude.com/docs/en/desktop#install-plugins)
+
+**Code anchors:** `build-reference/app-extracted/.vite/build/index.js:71392` (`CustomPlugins.listMarketplaces` IPC); `:71534` (`listAvailablePlugins` IPC); `:507176` (`listMarketplaces` main-process handler); `:496236` deep-link route `plugins/new` opens the browser surface.
+
+**Inventory anchor:** `…customize.main.navigation.button-by-name.browse-plugins` (role `button`, label `Browse plugins`); sibling `…link-by-name.connectors` (role `link`, label `Connectors`). The browser surface itself (marketplace listings, install button) appears under a child dialog not captured at idle — re-capture with the dialog open to anchor those.
+
+## T35 — MCP server config picked up
+
+**Severity:** Critical
+**Surface:** MCP / Code tab
+**Applies to:** All rows
+**Issues:** —
+
+**Steps:**
+1. Add an MCP server to `~/.claude.json` or `<project>/.mcp.json`.
+2. Open a Code-tab session against the project.
+3. Type `/` in the prompt — verify MCP-provided tools appear in the slash menu (or invoke one directly).
+4. Separately, confirm `claude_desktop_config.json` (Chat-tab MCP) is **not** picked up by Code tab.
+
+**Expected:** MCP servers in `~/.claude.json` or `.mcp.json` start when a Code session opens. Tools appear in the slash menu and calls succeed end-to-end. `claude_desktop_config.json` is separate per upstream docs.
+
+**Diagnostics on failure:** Server stderr (MCP servers log to stderr), `~/.claude.json` and `.mcp.json` content, launcher log, DevTools console for MCP wire errors.
+
+**References:** [MCP servers: desktop chat app vs Claude Code](https://code.claude.com/docs/en/desktop#shared-configuration), [`docs/learnings/plugin-install.md`](../../learnings/plugin-install.md)
+
+**Code anchors:** `build-reference/app-extracted/.vite/build/index.js:215418` (Code-tab loads `<project>/.mcp.json` per scanned dir); `:176766` reads `~/.claude.json`; `:489098` Code-session passes `settingSources: ["user", "project", "local"]` to the agent SDK; `:130821` `claude_desktop_config.json` is the chat-tab path constant (separate userData dir at `:130829` `kee()`), confirming the two trees do not overlap.
+
+## T36 — Hooks fire
+
+**Severity:** Critical
+**Surface:** Hooks runtime
+**Applies to:** All rows
+**Issues:** —
+
+**Steps:**
+1. Add a `SessionStart` hook in `~/.claude/settings.json` that writes a marker file.
+2. Open a new Code-tab session.
+3. Confirm the marker file exists.
+4. Repeat with `PreToolUse` / `PostToolUse` hooks. Switch transcript view to Verbose to see the hook output.
+
+**Expected:** Hooks defined in `~/.claude/settings.json` execute at the documented points. Hook output is visible in Verbose transcript mode. A failing hook surfaces a clear error rather than silently breaking the session.
+
+**Diagnostics on failure:** Hook script stderr, marker file presence, launcher log, settings file content, Verbose transcript output.
+
+**References:** [Shared configuration](https://code.claude.com/docs/en/desktop#shared-configuration)
+
+**Code anchors:** `build-reference/app-extracted/.vite/build/index.js:489098` Code-session sets `settingSources: ["user", "project", "local"]` (agent SDK reads `~/.claude/settings.json` hooks from this); `:455717` built-in `PreToolUse` hooks registry the runtime extends; `:455819` `UserPromptSubmit`; `:465680` `PostToolUse`; `:465754` `Stop`; `:493411` runtime emits `hook_started` / `hook_progress` / `hook_response` for `SessionStart` (Verbose transcript path).
+
+## T37 — `CLAUDE.md` memory loads
+
+**Severity:** Critical
+**Surface:** Memory / Code tab session prompt
+**Applies to:** All rows
+**Issues:** —
+
+**Steps:**
+1. Confirm a project `CLAUDE.md` exists at the working folder.
+2. Confirm `~/.claude/CLAUDE.md` exists with at least one identifying token.
+3. Open a Code-tab session against the project.
+4. Ask Claude "what's in your CLAUDE.md" — verify the response matches on-disk content.
+5. Edit `CLAUDE.md`. Start a new session — verify the new content is loaded.
+
+**Expected:** Project `CLAUDE.md` and `CLAUDE.local.md` at the working folder, plus `~/.claude/CLAUDE.md`, are loaded into the session's system prompt. Edits to these files take effect at the next session start.
+
+**Diagnostics on failure:** `cat CLAUDE.md` and `cat ~/.claude/CLAUDE.md` outputs, launcher log, system-prompt dump if accessible (Verbose transcript may show it).
+
+**References:** [Shared configuration](https://code.claude.com/docs/en/desktop#shared-configuration)
+
+**Code anchors:** `build-reference/app-extracted/.vite/build/index.js:259691` working-dir scan reads `CLAUDE.md` and `.claude/CLAUDE.md`; `:455188` global account memory `zhA(accountId, orgId)` is copied to the per-session `.claude/CLAUDE.md` at session start (`[GlobalMemory] Copied CLAUDE.md`); `:283107` `cE()` resolves `CLAUDE_CONFIG_DIR` or `~/.claude`, the dir whose `CLAUDE.md` the agent SDK loads via `settingSources: ["user", ...]` (see T36 anchor at `:489098`).
+
+## S27 — Plugins install per-user, not into system paths
+
+**Severity:** Should
+**Surface:** Plugin storage
+**Applies to:** All rows
+**Issues:** —
+
+**Steps:**
+1. As a non-root user, install a plugin via the desktop plugin browser.
+2. Inspect `~/.claude/plugins/` for the install.
+3. Verify nothing was written under `/usr` or other system-managed trees: run `touch /tmp/marker` before the install in step 1, then run `find /usr -newer /tmp/marker -name '*claude*' 2>/dev/null` after the install finishes.
+
+**Expected:** Plugins land under `~/.claude/plugins/` (or the equivalent per-user dir). Never under `/usr`. Non-root install/enable/disable works without `sudo`.
+
+**Diagnostics on failure:** `find / -name '*<plugin-name>*' 2>/dev/null`, install logs, launcher log.
+
+**References:** [Install plugins](https://code.claude.com/docs/en/desktop#install-plugins)
+
+**Code anchors:** `build-reference/app-extracted/.vite/build/index.js:283107` `cE()` resolves the config root to `CLAUDE_CONFIG_DIR` or `~/.claude` — never `/usr`; `:465815` `dx()` returns `<cE()>/plugins`; `:465821`/`:465824`/`:465827` `installed_plugins.json`, `known_marketplaces.json`, `marketplaces/` all sit under `dx()`. No system-path writes in the install path.
+
+## S28 — Worktree creation surfaces clear error on read-only mounts
+
+**Severity:** Could
+**Surface:** Worktree creation on read-only filesystem
+**Applies to:** All rows (NixOS users hit this most often)
+**Issues:** —
+
+**Steps:**
+1. Place a project on a read-only mount (e.g. squashfs, NFS read-only export, `mount -o ro` bind).
+2. Open a Code-tab session against it.
+3. Try to start a parallel session that needs a worktree.
+
+**Expected:** Worktree creation fails with a clear error pointing at the read-only mount. No silent loss of work, no writes to a wrong directory, no parent-repo corruption.
+
+**Diagnostics on failure:** `mount | grep <project-path>`, `git worktree add` direct invocation (does it fail the same way?), launcher log, screenshot of error dialog.
+
+**References:** [Work in parallel with sessions](https://code.claude.com/docs/en/desktop#work-in-parallel-with-sessions)
+
+**Code anchors:** `build-reference/app-extracted/.vite/build/index.js:462841` worktree parent dir is `<repo>/.claude/worktrees` (or `chillingSlothLocation.customPath` override at `:462836`); `:462928` `git worktree add` failure path returns `null` after `R.error("Failed to create git worktree: …")`; `:462760` `Sbn()` classifies "Permission denied" / "Access is denied" / "could not lock config file" as `"permission-denied"` (the read-only-mount taxonomy bucket).
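+
+When triaging a failure, stderr from a direct `git worktree add` can be compared against the three substrings listed for `Sbn()`. The sketch below mirrors only those substrings in shell. The function name `classify_worktree_error`, the `unclassified` fallback label, and the case-sensitive matching are assumptions, not upstream behavior.
+
+```shell
+# Hypothetical shell mirror of the three substrings the anchor lists.
+classify_worktree_error() {
+    case "$1" in
+        *'Permission denied'* | *'Access is denied'* | *'could not lock config file'*)
+            echo 'permission-denied' ;;
+        *)
+            # Placeholder label; upstream bucket names are not shown here.
+            echo 'unclassified' ;;
+    esac
+}
+```
+
+A message that lands in `unclassified` on a read-only mount suggests the upstream taxonomy would not bucket it as a permission problem either, which is worth noting in the failure report.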
--- a/docs/testing/cases/launch.md
+++ b/docs/testing/cases/launch.md
@@ -0,0 +1,77 @@
+# Launch & Process Lifecycle
+
+Tests covering app startup, the `--doctor` health check, package-format detection, and multi-instance behavior. See [`../matrix.md`](../matrix.md) for status.
+
+## T01 — App launch
+
+**Severity:** Smoke
+**Surface:** App startup
+**Applies to:** All rows
+**Issues:** —
+**Runner:** [`tools/test-harness/src/runners/T01_app_launch.spec.ts`](../../../tools/test-harness/src/runners/T01_app_launch.spec.ts)
+
+**Steps:**
+1. From a clean session, run `claude-desktop` (deb/rpm) or launch the AppImage.
+2. Wait up to 10 seconds.
+
+**Expected:** Main window opens within ~10s. No error toast, no crash. The launcher log at `~/.cache/claude-desktop-debian/launcher.log` shows the expected backend selection (`Using X11 backend via XWayland` on Wayland sessions, or native Wayland when forced).
+
+**Diagnostics on failure:** Launcher log, `--doctor` output, session env (`XDG_SESSION_TYPE`, `XDG_CURRENT_DESKTOP`), `dmesg | tail -50`, any crash report under `~/.config/Claude/logs/`.
+
+**References:** —
+**Code anchors:** `scripts/launcher-common.sh:98` (X11-via-XWayland log line), `scripts/launcher-common.sh:102` (native-Wayland log line), `build-reference/app-extracted/.vite/build/index.js:524875` (`app.on("ready")` registration), `build-reference/app-extracted/.vite/build/index.js:524881-524931` (main `BrowserWindow` factory `Ori()` — `titleBarStyle`, mainWindow.js preload, initial `show`).
+
+## T02 — Doctor health check
+
+**Severity:** Critical
+**Surface:** CLI / `--doctor`
+**Applies to:** All rows
+**Issues:** [PR #538](https://github.com/aaddrick/claude-desktop-debian/pull/538)
+
+**Steps:**
+1. Run `claude-desktop --doctor`.
+2. Inspect exit code (`echo $?`) and stdout/stderr.
+
+**Expected:** Exits 0. All checks PASS or report expected WARN. No FAIL checks. Doctor currently reports display-server, menu-bar mode, Electron path/version, Chrome sandbox perms, SingletonLock, MCP config, Node.js, desktop entry, disk space, and a Cowork section — it does **not** surface the resolved titlebar style. See also [T13](#t13--doctor-reports-correct-package-format) for the package-format detection slice.
+
+**Diagnostics on failure:** Full `--doctor` output, the install path being inspected (`which claude-desktop`), package metadata (`dpkg -S` / `rpm -qf` against the binary).
+
+**References:** [PR #538](https://github.com/aaddrick/claude-desktop-debian/pull/538)
+**Code anchors:** `scripts/doctor.sh:280` (`run_doctor` entry point), `scripts/doctor.sh:301-319` (display-server check), `scripts/doctor.sh:401-417` (SingletonLock check), `scripts/doctor.sh:744-753` (exit-code summary).
+
+## T13 — Doctor reports correct package format
+
+**Severity:** Should
+**Surface:** CLI / `--doctor`
+**Applies to:** All rows (currently `✗` on every Fedora row — see [S05](./distribution.md#s05--doctor-recognises-dnf-installed-package-doesnt-false-flag-as-appimage))
+**Issues:** — *(no issue filed; surfaced via session-capture review)*
+
+**Steps:**
+1. Install via the relevant package manager (`apt` / `dnf`) or AppImage.
+2. Run `claude-desktop --doctor` and look for the install-method line.
+
+**Expected:** Doctor identifies the install method correctly. On RPM-based distros (Fedora, Nobara) it does **not** report `not found via dpkg (AppImage?)`. That warning currently fires for a dnf install only when `dpkg-query` is also present on the host; on dpkg-less hosts the check is skipped and no install-method line is printed at all (see S05). On DEB-based distros it does not assume AppImage when dpkg returns the package metadata.
+
+**Diagnostics on failure:** `dpkg -S $(which claude-desktop)`, `rpm -qf $(which claude-desktop)`, full `--doctor` output, the line of doctor source that decides the format.
+
+**References:** [S05](./distribution.md#s05--doctor-recognises-dnf-installed-package-doesnt-false-flag-as-appimage)
+**Code anchors:** `scripts/doctor.sh:353-362` — version probe is dpkg-only (`dpkg-query -W -f='${Version}' claude-desktop`); on RPM/AppImage hosts that lack `dpkg-query` the block is skipped, but on a Fedora host that *does* have `dpkg-query` installed (e.g. for cross-distro tooling) the `_warn 'claude-desktop not found via dpkg (AppImage?)'` branch fires for any dnf-installed copy. There is no corresponding `rpm -qf` / `rpm -q claude-desktop` branch.
+
+## T14 — Multi-instance behavior
+
+**Severity:** Critical
+**Surface:** App lifecycle
+**Applies to:** All rows
+**Issues:** [PR #536](https://github.com/aaddrick/claude-desktop-debian/pull/536) (closed, docs-only — no in-tree opt-in flag)
+
+**Steps:**
+1. Launch `claude-desktop`. Wait for the main window.
+2. Launch `claude-desktop` again from another terminal or `.desktop` invocation.
+3. Optionally: follow the manual `--user-data-dir` recipe sketched in PR #536 (separate Electron `userData` per profile so each gets its own `SingletonLock`). The PR was closed, so the recipe is not shipped in-tree.
+
+**Expected:** Second invocation focuses the existing window — no new process. The launcher's `cleanup_stale_lock` removes a `SingletonLock` whose owning PID is no longer running. With separate `--user-data-dir` per profile (manual workaround, not an in-tree feature), each profile runs an independent Electron instance.
+
+**Diagnostics on failure:** `pgrep -af claude-desktop`, `ls -la ~/.config/Claude/SingletonLock`, launcher log, any "another instance is running" dialog text.
+
+**References:** [PR #536](https://github.com/aaddrick/claude-desktop-debian/pull/536)
+**Code anchors:** `build-reference/app-extracted/.vite/build/index.js:525162-525173` (`requestSingleInstanceLock()` + `app.on("second-instance", ...)` — shows existing window, restores if minimized, focuses), `build-reference/app-extracted/.vite/build/index.js:525204-525207` (early-return on lost lock at `app.on("ready")`), `scripts/launcher-common.sh:187-208` (`cleanup_stale_lock` — drops a `SingletonLock` symlink whose `hostname-PID` target points at a dead PID).
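+
+The stale-lock rule can be sketched as follows, assuming the `hostname-PID` symlink target described above. `lock_is_stale` is a hypothetical name, the sketch ignores the hostname part and relies on Linux `/proc`, and the real `cleanup_stale_lock` may differ in detail.
+
+```shell
+#!/usr/bin/env bash
+# Hypothetical helper: returns 0 when the lock symlink points at a dead PID.
+lock_is_stale() {
+    local lock="$1" target pid
+    [ -L "$lock" ] || return 1        # no symlink, nothing to judge
+    target="$(readlink "$lock")"      # e.g. myhost-12345
+    pid="${target##*-}"               # text after the last hyphen
+    case "$pid" in
+        '' | *[!0-9]*) return 1 ;;    # malformed target, leave it alone
+    esac
+    [ ! -d "/proc/$pid" ]             # stale when no such process exists
+}
+```
+
+`lock_is_stale ~/.config/Claude/SingletonLock && echo stale` only reports the state; it deliberately does not delete anything.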
--- a/docs/testing/cases/platform-integration.md
+++ b/docs/testing/cases/platform-integration.md
@@ -0,0 +1,282 @@
+# Platform Integration
+
+Tests covering autostart, Cowork integration, WebGL graceful degradation, `.desktop`-launch env inheritance, encrypted env-var storage, the macOS/Windows-only Computer Use feature, and Dispatch session pairing. See [`../matrix.md`](../matrix.md) for status.
+
+## T09 — AutoStart via XDG
+
+**Severity:** Critical
+**Surface:** XDG Autostart
+**Applies to:** All rows
+**Issues:** [PR #450](https://github.com/aaddrick/claude-desktop-debian/pull/450)
+
+**Steps:**
+1. In Settings, toggle "Open at Login" / "Start at boot" ON.
+2. Inspect `~/.config/autostart/` for a `.desktop` entry.
+3. Log out and back in. Verify the app launches automatically.
+4. Toggle OFF. Verify the autostart entry is removed.
+
+**Expected:** Toggling ON creates a `~/.config/autostart/*.desktop` entry that is XDG-spec compliant (not a custom systemd unit or shell hook). After login, app launches automatically. Toggling OFF removes the entry.
+
+**Diagnostics on failure:** `ls -la ~/.config/autostart/`, contents of the `.desktop` file, `desktop-file-validate` on it, launcher log.
+
+**References:** [PR #450](https://github.com/aaddrick/claude-desktop-debian/pull/450)
+
+**Code anchors:**
+- `scripts/frame-fix-wrapper.js:376` — XDG Autostart shim
+  intercepting `app.{get,set}LoginItemSettings` (writes/removes
+  `$XDG_CONFIG_HOME/autostart/claude-desktop.desktop`).
+- `scripts/frame-fix-wrapper.js:429` — `buildAutostartContent()`
+  emits the spec-compliant `[Desktop Entry]` block.
+- `build-reference/app-extracted/.vite/build/index.js:524205` —
+  upstream `isStartupOnLoginEnabled` / `setStartupOnLoginEnabled` IPC
+  surface that the wrapper interposes on.
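+
+For orientation, a minimal sketch of what an XDG autostart entry looks like. The helper `build_autostart_entry`, the `Name` value, and the exact key set are assumptions; the keys that `buildAutostartContent()` actually writes are not reproduced here.
+
+```shell
+#!/usr/bin/env bash
+# Hypothetical helper: prints a minimal autostart entry for the given Exec path.
+build_autostart_entry() {
+    local exec_path="${1:-claude-desktop}"
+    printf '%s\n' \
+        '[Desktop Entry]' \
+        'Type=Application' \
+        'Name=Claude' \
+        "Exec=$exec_path"
+}
+```
+
+Redirecting the output to a scratch file and running `desktop-file-validate` on it gives a baseline to compare against the entry the app writes.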
+
+## T10 — Cowork integration
+
+**Severity:** Should
+**Surface:** Cowork tab + VM daemon
+**Applies to:** All rows
+**Issues:** [`docs/learnings/cowork-vm-daemon.md`](../../learnings/cowork-vm-daemon.md)
+
+**Steps:**
+1. Sign into the app. Open the Cowork tab.
+2. Confirm Cowork-specific UI renders (ghost icon in topbar, Cowork menus).
+3. Trigger a Cowork action that needs the VM daemon.
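+4. Kill the VM daemon process, then trigger another Cowork action so an RPC hits the dead socket; verify the daemon respawns within the documented timeout (the respawn is rate-limited to once per 10 s, and the automated harness budgets 30 s).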
+
+**Expected:** Cowork features render. VM daemon spawns when needed, files are visible, daemon respawns within the documented timeout if it crashes.
+
+**Diagnostics on failure:** `pgrep -af cowork`, daemon logs, launcher log, the respawn-logic code path (see learnings doc).
+
+**References:** [`docs/learnings/cowork-vm-daemon.md`](../../learnings/cowork-vm-daemon.md)
+
+**Code anchors:**
+- `build-reference/app-extracted/.vite/build/index.js:143371` —
+  upstream's Windows named-pipe path (`\\.\pipe\cowork-vm-service`)
+  that `scripts/patches/cowork.sh` Patch 1 rewrites to
+  `$XDG_RUNTIME_DIR/cowork-vm-service.sock`.
+- `build-reference/app-extracted/.vite/build/index.js:143453` —
+  `kUe()` retry loop (5 attempts, 1 s gap) that the auto-launch
+  injection from Patch 6 piggybacks on after the rewrite.
+- `scripts/patches/cowork.sh:244` — Patch 6 (auto-launch + stdio
+  pipe + 10 s rate-limited respawn — issue #408).
+- `scripts/patches/cowork.sh:365` — Patch 6b (extends the
+  reinstall-delete list with `sessiondata.img` / `rootfs.img.zst`
+  so a wedged daemon can self-recover).
+
+## T12 — WebGL warn-only
+
+**Severity:** Could
+**Surface:** Chromium GPU diagnostics
+**Applies to:** All rows (especially VM rows and hybrid-GPU laptops)
+**Issues:** —
+
+**Steps:**
+1. Launch the app. Open DevTools → navigate to `chrome://gpu`.
+2. Inspect WebGL1/WebGL2 status.
+3. Use the app for ~5 minutes — exercise UI, sidebar, settings.
+
+**Expected:** WebGL1/2 may report as blocklisted (typical on virtio-gpu in VMs and on hybrid GPU laptops). This is informational. UI continues to render without graphical glitches; no feature is broken by the blocklist.
+
+**Diagnostics on failure:** `chrome://gpu` full content, screenshot of any visual glitch, `glxinfo | head -20` (X11) or `eglinfo` (Wayland), `lspci -k | grep -A2 VGA`.
+
+**References:** —
+
+**Code anchors:**
+- `build-reference/app-extracted/.vite/build/index.js:524809` —
+  `app.disableHardwareAcceleration()` is gated on the user-toggleable
+  `isHardwareAccelerationDisabled` setting; upstream does not pass
+  `--ignore-gpu-blocklist` or `--use-gl=*`, so chrome://gpu reflects
+  Chromium's stock blocklist behaviour.
+- `build-reference/app-extracted/.vite/build/index.js:500571` —
+  the only `webgl:!1` override is scoped to the feedback popup
+  (`in-memory-feedback` partition); main UI does not disable WebGL.
+
+## S17 — App launched from `.desktop` inherits shell `PATH`
+
+**Severity:** Critical
+**Surface:** `.desktop`-launch env handling
+**Applies to:** All rows
+**Issues:** —
+
+**Steps:**
+1. Configure `~/.bashrc` (or `~/.zshrc`) with `export PATH="$HOME/.custom-bin:$PATH"` and a custom binary in that dir.
+2. Launch the app via dmenu/krunner/GNOME Activities/Plasma launcher (i.e. **not** from a terminal).
+3. Open a Code-tab terminal pane. Run `which <custom-binary>`.
+4. Repeat for `npm`, `node`, `git`, `gh`.
+
+**Expected:** Code session can find tools defined in the user's shell profile, even when the app was launched non-interactively. Either the launcher script sources the user's shell profile, or the app reads `~/.bashrc` / `~/.zshrc` to extract `PATH` the way macOS does.
+
+**Diagnostics on failure:** `echo $PATH` from inside the integrated terminal, the env passed to the app process (`tr '\0' '\n' < /proc/$(pgrep -o -f electron)/environ | grep '^PATH='`; `-o` picks the oldest match so the substitution yields a single PID), launcher log.
+
+**References:** [Local sessions](https://code.claude.com/docs/en/desktop#local-sessions), [Session not finding installed tools](https://code.claude.com/docs/en/desktop#session-not-finding-installed-tools)
+
+**Code anchors:**
+- `build-reference/app-extracted/.vite/build/index.js:259300` —
+  `SLr()` resolves the bundled `shell-path-worker/shellPathWorker.js`.
+- `build-reference/app-extracted/.vite/build/index.js:259349` —
+  `NLr()` forks it via `utilityProcess.fork`; on success
+  `FX()` (line 259311) merges the extracted env into `process.env`.
+- `build-reference/app-extracted/.vite/build/shell-path-worker/shellPathWorker.js:205`
+  — `extractPathFromShell()` runs the user's login shell (`-l -i`)
+  and parses the printed `$PATH` between sentinels (mac-style env
+  inheritance now applied on Linux too).
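+
+The sentinel technique can be sketched in shell. This is an illustration only: `path_from_login_shell` is a hypothetical function and the marker string is an assumption; the bundled worker is JavaScript and its actual sentinels are not shown in this document.
+
+```shell
+#!/usr/bin/env bash
+# Hypothetical shell analogue of the sentinel technique; the marker string
+# is an assumption, not the one the bundled worker uses.
+path_from_login_shell() {
+    local shell_bin="${1:-${SHELL:-/bin/bash}}"
+    local mark='__PATH_SENTINEL__'
+    # Run a login + interactive shell so profile and rc files are sourced,
+    # then print PATH wrapped in markers so rc-file noise can be discarded.
+    "$shell_bin" -l -i -c "printf '%s%s%s\n' '$mark' \"\$PATH\" '$mark'" 2>/dev/null |
+        sed -n "s/.*$mark\(.*\)$mark.*/\1/p"
+}
+```
+
+Comparing `path_from_login_shell "$SHELL"` with the `PATH` seen inside a Code-tab terminal shows whether the app picked up the profile-defined entries.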
+
+## S18 — Local environment editor persists across reboot
+
+**Severity:** Should
+**Surface:** Local env editor / encrypted store
+**Applies to:** All rows
+**Issues:** —
+
+**Steps:**
+1. Open the local environment editor. Add `TEST_VAR=hello`.
+2. Restart the app — verify variable is still there.
+3. Reboot the host. Sign back in. Verify variable is still there.
+
+**Expected:** Variables saved via the local environment editor (per-app, encrypted) survive a logout/login cycle and a full reboot. On Linux this implies the encrypted store is wired to libsecret / kwallet / gnome-keyring and unlocks at session start.
+
+**Diagnostics on failure:** `secret-tool search` (libsecret), `kwallet-query` (KDE), `seahorse` UI inspection (GNOME), launcher log, the env-editor IPC call.
+
+**References:** [Local sessions](https://code.claude.com/docs/en/desktop#local-sessions)
+
+**Code anchors:**
+- `build-reference/app-extracted/.vite/build/index.js:259251` —
+  `I2t = new K_({ name: "ccd-environment-config", ... })` electron-store
+  backing file (`~/.config/Claude/ccd-environment-config.json`).
+- `build-reference/app-extracted/.vite/build/index.js:259253` —
+  `hLr()` writes via `safeStorage.encryptString` (libsecret on Linux).
+- `build-reference/app-extracted/.vite/build/index.js:259268` —
+  `J1()` decrypts on read; bails to `{}` if `safeStorage` reports
+  encryption unavailable (no keyring backend running).
+- `build-reference/app-extracted/.vite/build/index.js:70782` —
+  `LocalSessionEnvironment.save` IPC entry that calls into `hLr`.
+
+## S22 — Computer-use toggle is absent or visibly disabled on Linux
+
+**Severity:** Should
+**Surface:** Settings → Desktop app → General
+**Applies to:** All rows
+**Issues:** —
+
+**Steps:**
+1. Open Settings → Desktop app → General.
+2. Look for the "Computer use" toggle.
+
+**Expected:** Toggle either does not render on Linux, or renders as a disabled control with a clear "not supported on Linux" hint. Must not appear functional and silently fail (e.g. flip on but never produce screen-control behavior).
+
+**Diagnostics on failure:** Screenshot of the Settings page, DevTools inspection of the toggle DOM (is it conditionally hidden? disabled? always-rendered?), launcher log.
+
+**References:** [Let Claude use your computer](https://code.claude.com/docs/en/desktop#let-claude-use-your-computer), [Dispatch and computer use](https://claude.com/blog/dispatch-and-computer-use)
+
+**Code anchors:**
+- `build-reference/app-extracted/.vite/build/index.js:240557` —
+  `qDA = new Set(["darwin", "win32"])` excludes Linux from the
+  computer-use platform set.
+- `build-reference/app-extracted/.vite/build/index.js:241190` —
+  `TF()` (the master enable check) short-circuits to `false` when
+  `qDA.has(process.platform)` is false, so toggling
+  `chicagoEnabled` on Linux can't activate the feature.
+- `build-reference/app-extracted/.vite/build/index.js:242387` —
+  `tvr()` returns `{ status: "unsupported", reason: "Computer use
+  is not available on this platform", unsupportedCode:
+  "unsupported_platform" }` for the Settings UI — confirms the
+  toggle should render with a platform-unavailable hint, not silent
+  failure.
+
+## S23 — Dispatch-spawned sessions don't soft-lock on a never-approvable computer-use prompt
+
+**Severity:** Critical (for Dispatch users)
+**Surface:** Dispatch session lifecycle on Linux
+**Applies to:** All rows with Dispatch enabled
+**Issues:** —
+
+**Steps:**
+1. From a paired phone, dispatch a task that would invoke computer use.
+2. Observe the Code-tab session that spawns on the desktop.
+3. Try to interact with other parts of the app.
+
+**Expected:** Permission prompt times out or denies cleanly rather than hanging the session indefinitely. User can continue interacting with the rest of the app.
+
+**Diagnostics on failure:** Screenshot of session state, launcher log, sidebar state (is the Dispatch session blocking the whole sidebar?), `pgrep -af claude`.
+
+**References:** [Sessions from Dispatch](https://code.claude.com/docs/en/desktop#sessions-from-dispatch)
+
+**Code anchors:**
+- `build-reference/app-extracted/.vite/build/index.js:512789` —
+  `tool_permission_request` notification handler explicitly skips
+  `toolName.startsWith("computer:")`, so the desktop never queues a
+  user-facing prompt for computer-use tool calls (which couldn't run
+  on Linux anyway — see S22).
+- `build-reference/app-extracted/.vite/build/index.js:241190` —
+  `TF()` gates computer-use execution off entirely on Linux, so a
+  Dispatch-spawned session that requests it should hit the upstream
+  "Set up computer use" remote-client setup card
+  (`index.js:330114`) rather than block on a desktop prompt.
+
+## S24 — Dispatch-spawned Code session appears with badge and notification
+
+**Severity:** Critical
+**Surface:** Dispatch handoff
+**Applies to:** All rows with Dispatch enabled
+**Issues:** —
+
+**Steps:**
+1. From a paired phone, dispatch a task that routes to Code (e.g. "fix this bug").
+2. Observe the desktop sidebar.
+3. Confirm a desktop notification fires.
+4. Open the session and confirm 30-min approval expiry per upstream docs.
+
+**Expected:** Dispatch task creates a sidebar entry tagged **Dispatch**, posts a desktop notification, and lands ready for review. App-permission approvals on this session expire after 30 minutes per upstream docs.
+
+**Diagnostics on failure:** Screenshot of sidebar (badge present?), notification daemon state, launcher log, the Dispatch pairing config under `~/.config/Claude/`.
+
+**References:** [Sessions from Dispatch](https://code.claude.com/docs/en/desktop#sessions-from-dispatch), [Dispatch and computer use](https://claude.com/blog/dispatch-and-computer-use)
+
+**Code anchors:**
+- `build-reference/app-extracted/.vite/build/index.js:144561` —
+  `Sd = "dispatch_child"` session-type constant.
+- `build-reference/app-extracted/.vite/build/index.js:512200` —
+  `onRemoteSessionStart` IPC routes a Dispatch-initiated child
+  session into the local sidebar via `dispatchOnRemoteSessionStart`.
+- `build-reference/app-extracted/.vite/build/index.js:285621` —
+  `notifyDispatchParentIfNeeded()` posts the
+  `Task "<title>" <state>` meta-notification when the dispatch
+  child finishes (lands the result in the parent thread's
+  notification queue).
+- `build-reference/app-extracted/.vite/build/index.js:285954` —
+  `kind:"dispatch_child"` is the sidebar badge tag.
+
+## S25 — Mobile pairing survives Linux session restart
+
+**Severity:** Should
+**Surface:** Dispatch pairing persistence
+**Applies to:** All rows with Dispatch enabled
+**Issues:** —
+
+**Steps:**
+1. Pair the desktop with a phone.
+2. Quit the app fully. Re-launch.
+3. Try a Dispatch task. Verify pairing still works without re-pairing.
+4. Log out of the desktop session and back in. Re-test.
+
+**Expected:** Pairing remains active across app restart and logout/login. Pairing token is stored under `~/.config/Claude/` (or wherever the secure store lives) and survives.
+
+**Diagnostics on failure:** `ls -la ~/.config/Claude/`, secret-store inspection, launcher log, pairing-flow IPC.
+
+**References:** [Sessions from Dispatch](https://code.claude.com/docs/en/desktop#sessions-from-dispatch)
+
+**Code anchors:**
+- `build-reference/app-extracted/.vite/build/index.js:511984` —
+  `ZEe = "coworkTrustedDeviceToken"` electron-store key for the
+  trusted-device token.
+- `build-reference/app-extracted/.vite/build/index.js:511989` —
+  `oYn()` writes the token via `safeStorage.encryptString` (libsecret
+  on Linux); `aYn()` (`:512003`) decrypts on read.
+- `build-reference/app-extracted/.vite/build/index.js:512022` —
+  `gYn()` re-enrolls via `POST /api/auth/trusted_devices` only when
+  there's no cached token, so a successful pair survives restart.
+- `build-reference/app-extracted/.vite/build/index.js:330229` —
+  `_5r = "bridge-state.json"` (per-org/account bridge state under
+  `~/.config/Claude/bridge-state.json`); `JF()`/`X0A()` at `:330230`
+  read/locate it.
--- a/docs/testing/cases/routines.md
+++ b/docs/testing/cases/routines.md
@@ -0,0 +1,125 @@
+# Routines & Scheduled Tasks
+
+Tests covering the Routines page, scheduled task firing, catch-up runs after suspend, and the suspend-inhibit toggle. See [`../matrix.md`](../matrix.md) for status.
+
+## T26 — Routines page renders
+
+**Severity:** Critical
+**Surface:** Routines page
+**Applies to:** All rows
+**Issues:** —
+
+**Steps:**
+1. Sign into the app, open the Code tab.
+2. Click **Routines** in the sidebar.
+3. Click **New routine** → **Local**.
+
+**Expected:** Routines list opens. New-routine form shows all schedule presets (Manual, Hourly, Daily, Weekdays, Weekly), permission-mode picker, model picker, working-folder picker, and worktree toggle.
+
+**Diagnostics on failure:** Screenshot of the Routines page (or the failure state), DevTools console output, launcher log, network captures of the routines API call (`mitmproxy` or DevTools network panel).
+
+**References:** [Schedule recurring tasks](https://code.claude.com/docs/en/desktop-scheduled-tasks)
+
+**Code anchors:** `build-reference/app-extracted/.vite/build/index.js:507710` (create payload — `permissionMode`, `model`, `userSelectedFolders`, `useWorktree`, `cronExpression`, `fireAt`); `build-reference/app-extracted/.vite/build/index.js:280299` (`@hourly: "0 * * * *"` preset)
+
+**Inventory anchors:** `root.complementary.button-by-name.routines` (sidebar entry); `root.complementary.button-by-name.routines.main.region.button-by-name.new-routine` (form trigger); siblings `…button-by-name.all`, `…button-by-name.calendar` (list-view tabs). Preset list (Hourly/Daily/etc.) lives inside the New-routine modal and is not in the idle-state inventory — re-capture with the modal open to anchor.
+
+## T27 — Scheduled task fires and notifies
+
+**Severity:** Critical
+**Surface:** Routines runtime + libnotify
+**Applies to:** All rows
+**Issues:** —
+
+**Steps:**
+1. Create a Manual task with a simple instruction (e.g. "echo hello").
+2. Click **Run now**. Observe.
+3. Optionally: create an Hourly task and verify across the next hour boundary.
+
+**Expected:** A fresh session starts, appears in the **Scheduled** section of the sidebar, and posts a desktop notification when it begins. Subsequent runs respect the deterministic offset described in upstream docs.
+
+**Diagnostics on failure:** Launcher log, screenshot of sidebar, `gdbus call --session --dest=org.freedesktop.Notifications --object-path=/org/freedesktop/Notifications --method=org.freedesktop.DBus.Introspectable.Introspect` (verify daemon present), task SKILL.md content under `~/.claude/scheduled-tasks/<task-name>/`.
+
+**References:** [How scheduled tasks run](https://code.claude.com/docs/en/desktop-scheduled-tasks#how-scheduled-tasks-run)
+
+**Code anchors:** `build-reference/app-extracted/.vite/build/index.js:282332` (`runNow(A)` — manual dispatch); `build-reference/app-extracted/.vite/build/index.js:512837` (`Rc.showNotification(...,scheduled-${l},...)` — desktop notification on completion); `build-reference/app-extracted/.vite/build/index.js:282654` (`getJitterSecondsForTask` — deterministic per-task offset via `v2r(A, n*60)`, capped by `dispatchJitterMaxMinutes` default 10)
+
+## T28 — Scheduled task catch-up after suspend
+
+**Severity:** Should
+**Surface:** Routines runtime / wake-from-suspend
+**Applies to:** All rows
+**Issues:** —
+
+**Steps:**
+1. Create an Hourly task.
+2. Suspend the host (`systemctl suspend`).
+3. Wait past at least one hourly slot. Wake the host.
+4. Observe whether a catch-up run starts.
+
+**Expected:** Exactly one catch-up run for the most recently missed slot (older missed slots are discarded). Notification announces the catch-up. Missed runs older than seven days are not retried.
+
+**Diagnostics on failure:** Task history in the routines detail page, launcher log, `journalctl --since="-1 day" | grep -i suspend`.
+
+**References:** [Missed runs](https://code.claude.com/docs/en/desktop-scheduled-tasks#missed-runs)
+
+**Code anchors:** `build-reference/app-extracted/.vite/build/index.js:281695` (`R2r` — walks back from now, capped at `10080 * 60 * 1e3` ms = 7 days, returns at most one missed slot, dedupes by `IfA` bucket-key); `build-reference/app-extracted/.vite/build/index.js:281942` (`scheduledTaskPostWakeDelayMs` default 60000 ms — gates dispatch after `powerMonitor.on("resume")`); `build-reference/app-extracted/.vite/build/index.js:282569` (catch-up branch: `c ? 0 : this.getJitterSecondsForTask(o.id)` — missed-slot dispatch skips jitter)
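+
+The "at most one missed slot, capped at seven days" rule can be sketched with plain arithmetic. This is a simplification under stated assumptions: slots are taken to be multiples of a fixed interval since the epoch (the real scheduler works from cron expressions), times are in seconds, and `latest_missed_slot` is a hypothetical name.
+
+```shell
+#!/usr/bin/env bash
+# Hypothetical helper: prints the single most recent missed slot, if any.
+# Arguments: last successful run (epoch s), now (epoch s), interval (s).
+latest_missed_slot() {
+    local last_run="$1" now="$2" interval="${3:-3600}"
+    local cap=$((10080 * 60))               # 10080 minutes = 7 days
+    local slot=$((now - now % interval))    # latest boundary at or before now
+    # Report the slot only if it is newer than the last run and inside the
+    # seven-day window; older boundaries are never considered.
+    if [ "$slot" -gt "$last_run" ] && [ $((now - slot)) -le "$cap" ]; then
+        echo "$slot"
+    fi
+}
+```
+
+Because only the latest boundary is examined, a host that slept through several hourly slots still yields a single catch-up candidate.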
+
+## S19 — `CLAUDE_CONFIG_DIR` redirects scheduled-task storage
+
+**Severity:** Could
+**Surface:** Config dir env var
+**Applies to:** All rows
+**Issues:** —
+
+**Steps:**
+1. In the local environment editor, set `CLAUDE_CONFIG_DIR=/some/other/path`.
+2. Restart the app.
+3. Create a scheduled task. Inspect filesystem.
+
+**Expected:** Tasks resolve under `${CLAUDE_CONFIG_DIR}/scheduled-tasks/<task-name>/SKILL.md` rather than `~/.claude/scheduled-tasks/`. Pre-existing tasks under the old path are not silently dropped.
+
+**Diagnostics on failure:** `ls -la ${CLAUDE_CONFIG_DIR}/scheduled-tasks/` and `~/.claude/scheduled-tasks/`, launcher log, env dump.
+
+**References:** [Manage scheduled tasks](https://code.claude.com/docs/en/desktop-scheduled-tasks#manage-scheduled-tasks)
+
+**Code anchors:** `build-reference/app-extracted/.vite/build/index.js:283108` (`cE()` — resolves `process.env.CLAUDE_CONFIG_DIR ?? ~/.claude`, handles `~` prefix); `build-reference/app-extracted/.vite/build/index.js:283118` (`Tce()` — returns `${cE()}/scheduled-tasks`); `build-reference/app-extracted/.vite/build/index.js:488317` and `:509032` (call sites passing `taskFilesDir: Tce()` into the scheduled-tasks substrate)
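+
+A shell rendering of the lookup described in the anchors, useful for predicting where tasks should land before inspecting the filesystem. `scheduled_tasks_dir` is a hypothetical name, and the exact way upstream expands a leading `~` is an assumption here.
+
+```shell
+#!/usr/bin/env bash
+# Hypothetical shell rendering of the lookup; not the upstream code.
+scheduled_tasks_dir() {
+    # "-" (not ":-") falls back only when the variable is unset, which is
+    # the closest shell analogue of the JavaScript "??" operator.
+    local base="${CLAUDE_CONFIG_DIR-$HOME/.claude}"
+    case "$base" in
+        '~')   base="$HOME" ;;
+        '~/'*) base="$HOME/${base:2}" ;;   # drop the leading "~/"
+    esac
+    printf '%s\n' "$base/scheduled-tasks"
+}
+```
+
+With `CLAUDE_CONFIG_DIR=/some/other/path` the function prints `/some/other/path/scheduled-tasks`, which is the directory step 3 should find populated.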
+
+## S20 — "Keep computer awake" inhibits idle suspend
+
+**Severity:** Should
+**Surface:** Suspend inhibitor
+**Applies to:** All rows
+**Issues:** —
+
+**Steps:**
+1. Open Settings → Desktop app → General → "Keep computer awake". Toggle ON.
+2. Run `systemd-inhibit --list`. Look for a Claude-owned lock whose `WHAT` column reads `idle:sleep`.
+3. Toggle OFF. Re-run `systemd-inhibit --list` — lock should be gone.
+
+**Expected:** Toggling ON registers an inhibitor lock equivalent to `systemd-inhibit --what=idle:sleep` (or issues the `org.freedesktop.PowerManagement.Inhibit` D-Bus call). Toggling OFF releases the lock.
+
+**Diagnostics on failure:** `systemd-inhibit --list` before/after, `busctl --user tree org.freedesktop.PowerManagement` (if the path uses that backend), launcher log, the relevant settings IPC call.
+
+**References:** [How scheduled tasks run](https://code.claude.com/docs/en/desktop-scheduled-tasks#how-scheduled-tasks-run)
+
+**Code anchors:** `build-reference/app-extracted/.vite/build/index.js:241897` (`hA.powerSaveBlocker.start("prevent-app-suspension")` — single block call, ref-counted by `PhA` Set); `build-reference/app-extracted/.vite/build/index.js:241905` (`hA.powerSaveBlocker.stop(BP)` when last claim drops); `build-reference/app-extracted/.vite/build/index.js:241909` (settings binding: `PHe = "keepAwakeEnabled"`); `build-reference/app-extracted/.vite/build/index.js:241914` (`vy.on("keepAwakeEnabled", YHe)` — toggle observer)
+
+## S21 — Lid-close still suspends per OS policy
+
+**Severity:** Critical
+**Surface:** Suspend inhibitor scope
+**Applies to:** All rows (laptop hosts)
+**Issues:** —
+
+**Steps:**
+1. With "Keep computer awake" ON, close the laptop lid.
+2. Observe whether the machine suspends.
+
+**Expected:** Machine still suspends per logind's `HandleLidSwitch=suspend`. The inhibit lock taken in [S20](#s20--keep-computer-awake-inhibits-idle-suspend) targets `idle:sleep`, not `handle-lid-switch`, so lid-close behavior is unaffected.
+
+**Diagnostics on failure:** `loginctl show-session --property=HandleLidSwitch`, `journalctl --since="-5 minutes"`, the actual `--what=` flags on the Claude-owned inhibitor.
+
+**References:** [How scheduled tasks run](https://code.claude.com/docs/en/desktop-scheduled-tasks#how-scheduled-tasks-run)
+
+**Code anchors:** `build-reference/app-extracted/.vite/build/index.js:241897` (only `"prevent-app-suspension"` is passed to `powerSaveBlocker.start` — Electron maps this to `idle:sleep`); no `handle-lid-switch` / `HandleLidSwitch` token anywhere in `index.js` (verified via `grep -nE 'lid|HandleLidSwitch|handle-lid' index.js`)
--- a/docs/testing/cases/shortcuts-and-input.md
+++ b/docs/testing/cases/shortcuts-and-input.md
@@ -0,0 +1,365 @@
+# Shortcuts & Input
+
+Tests covering URL handling, the Quick Entry global shortcut, and DE-specific shortcut/input failure modes. See [`../matrix.md`](../matrix.md) for status.
+
+## T05 — `claude://` URL handler opens links in-app
+
+**Severity:** Smoke
+**Surface:** URL handler / xdg-open
+**Applies to:** All rows
+**Issues:** —
+
+**Steps:**
+1. With Claude Desktop running, in another app run `xdg-open 'claude://chat/new?q=hello'` (or click a `claude://` link in a browser/terminal).
+2. Observe.
+
+**Expected:** Link is delivered to the running Claude Desktop process — no new browser tab, no crash, no error dialog. (Upstream's `claudeURLHandler` only accepts the `claude:`, `claude-dev:`, `claude-nest:`, `claude-nest-dev:`, `claude-nest-prod:` schemes; bare `https://claude.ai/...` clicks route through the user's default browser, not Claude Desktop. The `.desktop` file registers `MimeType=x-scheme-handler/claude` only, matching the upstream contract.)
+
+**Diagnostics on failure:** `xdg-mime query default x-scheme-handler/claude`, the registered `.desktop` file content, launcher log, app crash report (if any), `coredumpctl list claude-desktop` (if subprocess died — see [S06](#s06--url-handler-doesnt-segfault-on-native-wayland)).
+
+**References:** upstream `index.js:495996-496009` (`bEe()` protocol filter), `index.js:524819` (`setAsDefaultProtocolClient("claude")`), `index.js:525140-525148` (macOS `open-url`), `index.js:525162-525172` (Linux/Win `second-instance` argv path), project `scripts/packaging/{deb,rpm,appimage}.sh` (MimeType registration).
+**Code anchors:** build-reference/app-extracted/.vite/build/index.js:495996, 524819, 525140, 525162
+
+## T06 — Quick Entry global shortcut (unfocused)
+
+**Severity:** Critical
+**Surface:** Global shortcut / Electron globalShortcut
+**Applies to:** All rows
+**Issues:** [#404](https://github.com/aaddrick/claude-desktop-debian/issues/404), [#393](https://github.com/aaddrick/claude-desktop-debian/issues/393), [PR #406](https://github.com/aaddrick/claude-desktop-debian/pull/406), [PR #102](https://github.com/aaddrick/claude-desktop-debian/pull/102), [PR #153](https://github.com/aaddrick/claude-desktop-debian/pull/153)
+
+**Steps:**
+1. Launch app, focus another application (browser, terminal).
+2. Press the configured Quick Entry shortcut (default `Ctrl+Alt+Space`).
+3. Type a prompt and submit.
+4. Repeat from a different virtual desktop / workspace.
+
+**Expected:** Quick Entry prompt opens regardless of focused app or workspace. Shortcut is globally registered, not focus-bound. Submitting creates a new session and shows it in the main window.
+
+**Diagnostics on failure:** Launcher log (look for `Using X11 backend via XWayland (for global hotkey support)` or portal-shortcut markers), `XDG_SESSION_TYPE`, `XDG_CURRENT_DESKTOP`, output of `gdbus call --session --dest=org.freedesktop.portal.Desktop --object-path=/org/freedesktop/portal/desktop --method=org.freedesktop.DBus.Introspectable.Introspect`, the active patch set in `scripts/patches/`.
+
+**References:** [#404](https://github.com/aaddrick/claude-desktop-debian/issues/404), [#393](https://github.com/aaddrick/claude-desktop-debian/issues/393), [PR #406](https://github.com/aaddrick/claude-desktop-debian/pull/406)
+**Code anchors:** build-reference/app-extracted/.vite/build/index.js:499376 (`ort` default accelerator: `"Ctrl+Alt+Space"` non-mac, `"Alt+Space"` on mac), 499416 (`globalShortcut.register`), 525287-525290 (Quick Entry trigger callback registered against `Pw.QUICK_ENTRY`).
+
+## S06 — URL handler doesn't segfault on native Wayland
+
+**Severity:** Critical (for wlroots rows)
+**Surface:** URL handler subprocess
+**Applies to:** Sway, Niri, Hypr-O, Hypr-N (any native-Wayland session)
+**Issues:** —
+
+**Steps:**
+1. Launch the app on a native Wayland session (no XWayland forcing).
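+2. From another app, click a `claude://` link or run `xdg-open 'claude://chat/new?q=hello'` (per T05, bare `https://claude.ai/...` links go to the default browser and never reach the handler).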
+
+**Expected:** Link opens in-app cleanly. No `Failed to connect to Wayland display` errors followed by a SIGSEGV from the URL handler subprocess.
+
+**Diagnostics on failure:** `coredumpctl info claude-desktop`, `WAYLAND_DISPLAY` env in the subprocess (if capturable via `strace -f -e execve`), launcher log, full env dump.
+
+**Currently:** Sway capture shows `Failed to connect to Wayland display: No such file or directory (2)` followed by `Segmentation fault` from the URL handler subprocess. The main app process keeps running; the URL handler dies. Not yet filed.
+
+**References:** —
+**Code anchors:** build-reference/app-extracted/.vite/build/index.js:495996 (`bEe()` URL handler), 525140-525148 (`open-url` macOS), 525162-525172 (`second-instance` argv path on Linux); project `scripts/launcher-common.sh:96-99` (`--ozone-platform=x11` default), `scripts/launcher-common.sh:41-44` (Niri force-native-Wayland).
+
+## S07 — `CLAUDE_USE_WAYLAND=1` opt-in path works without crashing
+
+**Severity:** Should
+**Surface:** Native Wayland mode
+**Applies to:** Sway, Niri, Hypr-O, Hypr-N
+**Issues:** [PR #228](https://github.com/aaddrick/claude-desktop-debian/pull/228), [PR #232](https://github.com/aaddrick/claude-desktop-debian/pull/232)
+
+**Steps:**
+1. Set `CLAUDE_USE_WAYLAND=1`. Launch the app.
+2. Use the app for ~5 minutes — open chats, switch tabs, exercise basic flows.
+
+**Expected:** App forces native Wayland (no XWayland), continues to render and respond. The paths that were broken before PR #228 remain fixed.
+
+**Diagnostics on failure:** Launcher log (confirm Wayland mode active), `--doctor`, full env dump, screenshot of any crash dialog.
+
+**References:** [PR #228](https://github.com/aaddrick/claude-desktop-debian/pull/228), [PR #232](https://github.com/aaddrick/claude-desktop-debian/pull/232)
+**Code anchors:** project `scripts/launcher-common.sh:28-29` (`CLAUDE_USE_WAYLAND=1` opt-out of XWayland), 100-111 (native-Wayland Electron flags: `UseOzonePlatform,WaylandWindowDecorations`, `--ozone-platform=wayland`, `--enable-wayland-ime`, `--wayland-text-input-version=3`, `GDK_BACKEND=wayland`).
+
+## S09 — Quick window patch runs only on KDE (post-#406 gate)
+
+**Severity:** Critical
+**Surface:** Patch gate
+**Applies to:** All rows (verifies the gate, not the feature)
+**Issues:** [PR #406](https://github.com/aaddrick/claude-desktop-debian/pull/406), [#393](https://github.com/aaddrick/claude-desktop-debian/issues/393)
+
+**Steps:**
+1. On a KDE row, launch the app. Inspect launcher log for quick-window-patch markers.
+2. On a non-KDE row, launch the app. Inspect launcher log — the markers should be absent.
+
+**Expected:** On KDE sessions the quick-window patch is applied (Quick Entry uses the patched code path). On non-KDE sessions the patch is **not** applied, preventing the [#393](https://github.com/aaddrick/claude-desktop-debian/issues/393) regression on GNOME etc.
+
+**Diagnostics on failure:** Launcher log, `XDG_CURRENT_DESKTOP`, the patch-gate code path in `scripts/patches/`.
+
+**References:** [PR #406](https://github.com/aaddrick/claude-desktop-debian/pull/406), [#393](https://github.com/aaddrick/claude-desktop-debian/issues/393)
+**Code anchors:** project `scripts/patches/quick-window.sh:32-42` (KDE-gated `blur()` insertion), 115-125 (KDE-gated focus/visibility check replacement); upstream sites the patch rewrites are around `index.js:515374-515471` (Quick Entry popup construction + handlers).
+
+## S10 — Quick Entry popup is transparent (no opaque square frame)
+
+**Severity:** Should
+**Surface:** Quick Entry window (KDE Wayland)
+**Applies to:** KDE-W
+**Issues:** [#370](https://github.com/aaddrick/claude-desktop-debian/issues/370), [#223](https://github.com/aaddrick/claude-desktop-debian/issues/223), [PR #244](https://github.com/aaddrick/claude-desktop-debian/pull/244)
+
+**Steps:**
+1. On KDE Plasma Wayland, invoke Quick Entry.
+2. Observe the popup background.
+
+**Expected:** Quick Entry popup renders with a transparent background — no opaque square frame visible behind the rounded prompt UI.
+
+**Diagnostics on failure:** Screenshot, KDE compositor settings (`kreadconfig5 --file kwinrc --group Compositing --key Backend`; use `kreadconfig6` on Plasma 6), launcher log, BrowserWindow construction args.
+
+**References:** [#370](https://github.com/aaddrick/claude-desktop-debian/issues/370) (current open report), [#223](https://github.com/aaddrick/claude-desktop-debian/issues/223) (closed predecessor), [PR #244](https://github.com/aaddrick/claude-desktop-debian/pull/244)
+**Code anchors:** build-reference/app-extracted/.vite/build/index.js:515380 (`transparent: !0`), 515383 (`backgroundColor: "#00000000"`), 515381 (`frame: !1`), 515377 (`skipTaskbar: !0`).
+
+## S11 — Quick Entry shortcut fires from any focus on Wayland (mutter XWayland key-grab)
+
+**Severity:** Critical (for GNOME users)
+**Surface:** Global shortcut on GNOME mutter
+**Applies to:** GNOME, Ubu
+**Issues:** [#404](https://github.com/aaddrick/claude-desktop-debian/issues/404), [PR #406](https://github.com/aaddrick/claude-desktop-debian/pull/406)
+
+**Steps:**
+1. On GNOME/mutter Wayland, launch the app.
+2. Focus another application; press the Quick Entry shortcut.
+3. Repeat from another virtual desktop.
+
+**Expected:** Shortcut fires regardless of focused app or workspace.
+
+**Diagnostics on failure:** Launcher log (note `Using X11 backend via XWayland (for global hotkey support)`), `XDG_CURRENT_DESKTOP`, mutter version (`gnome-shell --version`), the active patch set.
+
+**Currently:** Fedora 43 GNOME Wayland reproduces [#404](https://github.com/aaddrick/claude-desktop-debian/issues/404) — mutter doesn't honour the XWayland-side key grab, so the shortcut is focus-bound. On Ubuntu 24.04 GNOME, the [PR #406](https://github.com/aaddrick/claude-desktop-debian/pull/406) KDE-only gate prevents the regressing patch from running, leaving the older (working) code path active — hence `🔧` on Ubu. The unsolved fix path is [S12](#s12----enable-featuresglobalshortcutsportal-launcher-flag-wired-up-for-gnome-wayland).
+
+**References:** [#404](https://github.com/aaddrick/claude-desktop-debian/issues/404), [PR #406](https://github.com/aaddrick/claude-desktop-debian/pull/406)
+**Code anchors:** project `scripts/launcher-common.sh:96-99` (XWayland-default `--ozone-platform=x11`); upstream `index.js:499416` (`globalShortcut.register`).
+
+## S12 — `--enable-features=GlobalShortcutsPortal` launcher flag wired up for GNOME Wayland
+
+**Severity:** Critical
+**Surface:** Launcher flag wiring
+**Applies to:** GNOME, Ubu (any GNOME Wayland)
+**Issues:** [#404](https://github.com/aaddrick/claude-desktop-debian/issues/404)
+
+**Steps:**
+1. On GNOME Wayland, launch the app.
+2. Inspect the Electron command line via `pgrep -af claude-desktop` — look for `--enable-features=GlobalShortcutsPortal`.
+3. Test Quick Entry shortcut from unfocused state (see [T06](#t06--quick-entry-global-shortcut-unfocused)).
+
+**Expected:** Launcher detects GNOME Wayland and appends `--enable-features=GlobalShortcutsPortal` to Electron's argv, routing global shortcuts through XDG Desktop Portal instead of X11 key grabs. Once wired, [#404](https://github.com/aaddrick/claude-desktop-debian/issues/404) is closeable.
+
+**Diagnostics on failure:** Full process argv (`tr '\0' ' ' < /proc/$(pgrep -o -f electron)/cmdline`; `-o` picks the oldest match so the path resolves to a single PID), launcher log, `XDG_CURRENT_DESKTOP`.
+
+**Currently:** Not yet implemented. Tracking under [#404](https://github.com/aaddrick/claude-desktop-debian/issues/404).
+
+> **⚠ Missing in build 1.5354.0** — `--enable-features=GlobalShortcutsPortal` is not appended by `scripts/launcher-common.sh` for any GNOME Wayland variant. Re-verify after next upstream bump and after #404 lands.
+
+**References:** [#404](https://github.com/aaddrick/claude-desktop-debian/issues/404)
+**Code anchors:** project `scripts/launcher-common.sh:59-112` (`build_electron_args` — no `GlobalShortcutsPortal` branch present).
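+
+Since the flag is not wired up yet, the following is only a sketch of what the argv construction could look like, written in TypeScript for illustration rather than in the launcher's shell. Every name here (`isGnomeWayland`, `withFeature`, `buildElectronArgs`) is hypothetical, and the detection via `XDG_CURRENT_DESKTOP` plus `XDG_SESSION_TYPE` is an assumption. The sketch merges into an existing `--enable-features=` switch on the assumption that repeated switches are not combined by Chromium.
+
+```typescript
+const FEATURE_SWITCH = "--enable-features=";
+
+type Env = { [name: string]: string | undefined };
+
+// Hypothetical detection; the real launcher may use different signals.
+function isGnomeWayland(env: Env): boolean {
+  const desktops = (env["XDG_CURRENT_DESKTOP"] || "").toLowerCase().split(":");
+  return desktops.indexOf("gnome") !== -1 && env["XDG_SESSION_TYPE"] === "wayland";
+}
+
+// Adds a feature to an existing --enable-features switch, or appends a new one.
+function withFeature(args: readonly string[], feature: string): string[] {
+  const out = args.slice();
+  for (let i = 0; i < out.length; i++) {
+    if (out[i].indexOf(FEATURE_SWITCH) !== 0) continue;
+    const features = out[i]
+      .slice(FEATURE_SWITCH.length)
+      .split(",")
+      .filter((f) => f !== "");
+    if (features.indexOf(feature) === -1) features.push(feature);
+    out[i] = FEATURE_SWITCH + features.join(",");
+    return out;
+  }
+  out.push(FEATURE_SWITCH + feature);
+  return out;
+}
+
+// Returns a new argv; the input array is never mutated.
+function buildElectronArgs(baseArgs: readonly string[], env: Env): string[] {
+  return isGnomeWayland(env)
+    ? withFeature(baseArgs, "GlobalShortcutsPortal")
+    : baseArgs.slice();
+}
+```
+
+Step 2 of this case would then find `GlobalShortcutsPortal` inside the single `--enable-features=` value on GNOME Wayland rows, and no change on other rows.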
+
+## S14 — Global shortcuts via XDG portal work on Niri
+
+**Severity:** Critical (for Niri users)
+**Surface:** XDG Desktop Portal `BindShortcuts`
+**Applies to:** Niri
+**Issues:** —
+
+**Steps:**
+1. On Niri, launch the app (the launcher special-cases Niri to native Wayland + portal).
+2. Configure the Quick Entry shortcut.
+3. Observe portal interaction in launcher log.
+
+**Expected:** `BindShortcuts` succeeds. Configured Quick Entry shortcut is registered and fires.
+
+**Diagnostics on failure:** Launcher log capture of the `BindShortcuts` call, `busctl --user tree org.freedesktop.portal.Desktop`, Niri version, full env.
+
+**Currently:** `Failed to call BindShortcuts (error code 5)` — portal global shortcuts fail on Niri. Different root cause from [#404](https://github.com/aaddrick/claude-desktop-debian/issues/404), same user-visible symptom (Quick Entry shortcut doesn't fire). Not yet filed.
+
+**References:** —
+**Code anchors:** project `scripts/launcher-common.sh:41-44` (Niri force-native-Wayland branch); upstream `index.js:499416` (`globalShortcut.register`, which on native Wayland routes through Electron's `xdg-desktop-portal` `BindShortcuts` path inside Chromium).
+
+## S29 — Quick Entry popup is created lazily on first shortcut press (closed-to-tray sanity)
+
+**Severity:** Critical
+**Surface:** Quick Entry popup lifecycle
+**Applies to:** All rows
+**Issues:** [#393](https://github.com/aaddrick/claude-desktop-debian/issues/393)
+
+**Steps:**
+1. Launch app, wait for main window to appear, hide-to-tray (close via X — see [T08](./tray-and-window-chrome.md#t08--hide-to-tray-on-close)).
+2. Confirm no Claude window is mapped (e.g. `wmctrl -l | grep -i claude` returns empty on X11; on Wayland use the compositor's own query, such as `swaymsg -t get_tree` on sway).
+3. Press the Quick Entry shortcut.
+4. Type `hello`, press Enter.
+
+**Expected:** Popup appears even though no Claude window was mapped before the keypress. Upstream constructs the popup `BrowserWindow` lazily on first shortcut invocation (`if (!Ko || ...) Ko = new BrowserWindow(...)` near `index.js:515375`), so the popup does not need a pre-existing main window. New chat session is created and reachable on submit.
+
+**Diagnostics on failure:** Launcher log, `~/.config/Claude/logs/`, `XDG_CURRENT_DESKTOP`, screenshot of empty desktop after shortcut press.
+
+**References:** [#393](https://github.com/aaddrick/claude-desktop-debian/issues/393), upstream `index.js:515375-515397`
+**Code anchors:** build-reference/app-extracted/.vite/build/index.js:515374 (`if (!Ko ...) Ko = new BrowserWindow(...)` lazy construction guard), 515394 (`preload: ".vite/build/quickWindow.js"`), 515438 (`Ko.loadFile(".vite/renderer/quick_window/quick-window.html")`).
+
+## S30 — Quick Entry shortcut becomes a no-op after full app exit
+
+**Severity:** Should
+**Surface:** Global shortcut unregistration
+**Applies to:** All rows
+**Issues:** —
+
+**Steps:**
+1. Launch app. Confirm Quick Entry shortcut works (popup opens).
+2. Quit Claude Desktop fully via tray → Quit (or `pkill -f app.asar`). Confirm no `electron` processes for the app remain.
+3. Press the Quick Entry shortcut.
+
+**Expected:** No popup appears. No error dialog. No zombie process. Electron unregisters the global shortcut on app exit; the shortcut becomes a system-level no-op.
+
+**Diagnostics on failure:** `pgrep -af app.asar` output, `journalctl --user -e -n 100`, OS-level shortcut bindings (`gsettings list-recursively | grep -i shortcut`).
+
+**References:** upstream `index.js:499416` (registration site)
+**Code anchors:** build-reference/app-extracted/.vite/build/index.js:499398-499428 (`nG()` register/unregister wrapper — passing `null` accelerator unregisters), 499416 (`hA.globalShortcut.register`), 499403 (`hA.globalShortcut.unregister`).
+
+## S31 — Quick Entry submit makes the new chat reachable from any main-window state
+
+**Severity:** Critical
+**Surface:** Submit → main window show
+**Applies to:** All rows
+**Issues:** [#393](https://github.com/aaddrick/claude-desktop-debian/issues/393), [PR #406](https://github.com/aaddrick/claude-desktop-debian/pull/406)
+
+**Steps:**
+1. For each main-window state: (a) visible-and-focused, (b) minimized, (c) hidden-to-tray, (d) on a different workspace, (e) closed via X (project's hide-to-tray override).
+2. Set the state, then invoke Quick Entry, type `hello`, submit.
+3. Record what happens to the main window: auto-restored, requires tray click, came to current workspace, stayed on its own workspace.
+
+**Expected:** The new chat session is **reachable** from each starting state. Acceptance is "user can reach the new chat" — not "main window auto-restored." Upstream calls `mainWin.show()` + `mainWin.focus()` only (`index.js:515566, 515599`), with no `restore()`, no `setVisibleOnAllWorkspaces()`, no `moveTop()`. Whether `show()` un-minimizes or migrates workspaces is purely compositor-dependent. The failure case is "new chat created but the user has no way to surface it" — that's a regression. Anything that reaches the chat (even via a tray click) is upstream-acceptable.
+
+**Diagnostics on failure:** `~/.config/Claude/logs/`, screenshot at each state, output of `wmctrl -l` (X11) or `swaymsg -t get_tree` (sway), launcher log.
+
+**Currently:** On non-KDE rows, the post-#406 KDE-only patch gate leaves the upstream code path (`isFocused()` short-circuit) active. Andrej730's #393 GNOME repro shows the stale-`isFocused()` bug can still suppress `show()` in tray-only state. See [S32](#s32--quick-entry-submit-on-gnome-mutter-doesnt-trip-electron-stale-isfocused).
+
+**References:** [#393](https://github.com/aaddrick/claude-desktop-debian/issues/393), upstream `index.js:515566, 515599, 105164-171`
+**Code anchors:** build-reference/app-extracted/.vite/build/index.js:515567 (`h1() || ut.show(), ut.focus()` in `gHn()` existing-chat path), 515598-515599 (`h1() || ut.show(), ut.focus()` in `ynt()` new-chat path), 105164-105171 (`h1()` returns `ut.isFocused() || mainView.webContents.isFocused()`).
+
+## S32 — Quick Entry submit on GNOME mutter doesn't trip Electron stale-`isFocused()`
+
+**Severity:** Critical (for GNOME users)
+**Surface:** Electron `BrowserWindow.isFocused()` on Linux
+**Applies to:** GNOME, Ubu
+**Issues:** [#393](https://github.com/aaddrick/claude-desktop-debian/issues/393)
+
+**Steps:**
+1. On GNOME Wayland, launch the app, then close to tray.
+2. Confirm the app is in tray-only state (no window mapped, no Dash entry, no taskbar entry).
+3. Invoke Quick Entry, type `hello`, submit.
+4. Repeat after re-pinning the app to the Dash and reproducing the tray-only state from there.
+
+**Expected:** Submit produces a reachable new chat session in both Dash-pinned and not-pinned cases. **The Dash distinction is empirical, not code-driven** — upstream has no notion of Dash presence. The underlying failure mode is Electron's `BrowserWindow.isFocused()` returning stale-true on Linux mutter, which causes upstream's `h1() || ut.show()` short-circuit (`index.js:515566`) to skip `show()`. Andrej730 traced this on #393.
+
+**Diagnostics on failure:** Bundled `index.js` `h1()` body (extract via `npx @electron/asar extract <app.asar> <dest>`); add temporary logging in `h1()` per Andrej730's diff in #393 if reproducing locally; `gnome-shell --version`; `~/.config/Claude/logs/`.
+
+**Currently:** Open. The KDE-only gate from PR #406 leaves this path unfixed on GNOME. Resolution requires either (a) widening the patch to all DEs by dropping the `isFocused()` fallback in the patched code, or (b) waiting for an upstream Electron fix to `isFocused()` on Linux.
+
+**References:** [#393](https://github.com/aaddrick/claude-desktop-debian/issues/393) (Andrej730's diagnosis with `eU()` logging output)
+**Code anchors:** build-reference/app-extracted/.vite/build/index.js:105164-105171 (`h1()` body — the exact short-circuit Andrej730 instrumented), 515567 + 515598 (the two `h1() || ut.show()` call sites the suppression hits).
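+
+To make the failure mode concrete, here is a small TypeScript model of the `h1() || ut.show(), ut.focus()` control flow with a fake window. It is a simplified sketch: `WindowLike`, `surfaceMainWindow`, and `makeFakeWindow` are hypothetical names, and the focus check is reduced to `isFocused()` alone (upstream's `h1()` also consults `mainView.webContents.isFocused()`).
+
+```typescript
+interface WindowLike {
+  isFocused(): boolean;
+  show(): void;
+  focus(): void;
+}
+
+// Same control flow as `h1() || ut.show(), ut.focus()`: show() runs only
+// when the focus check reports false; focus() always runs.
+function surfaceMainWindow(win: WindowLike): void {
+  if (!win.isFocused()) win.show();
+  win.focus();
+}
+
+// Fake window that records calls. `staleFocus` models the reported bug:
+// isFocused() keeps returning true although the window is hidden.
+function makeFakeWindow(staleFocus: boolean): { win: WindowLike; calls: string[] } {
+  const calls: string[] = [];
+  const win: WindowLike = {
+    isFocused: () => staleFocus,
+    show: () => {
+      calls.push("show");
+    },
+    focus: () => {
+      calls.push("focus");
+    },
+  };
+  return { win, calls };
+}
+```
+
+With `staleFocus` set to `true` the recorded calls contain only `focus`, so a hidden window is never shown. That matches the tray-only symptom this case tracks.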
+
+## S33 — Quick Entry transparent rendering tracked against bundled Electron version
+
+**Severity:** Should
+**Surface:** Bundled Electron version
+**Applies to:** All rows (relevant where #370 reproduces)
+**Issues:** [#370](https://github.com/aaddrick/claude-desktop-debian/issues/370)
+
+**Steps:**
+1. After install, capture the Electron version bundled with the app: extract `app.asar.unpacked` and run the bundled Electron with `--version`, or read it from the bundled binary's metadata.
+2. Record the version in [`../matrix.md`](../matrix.md) per row, alongside the [S10](#s10--quick-entry-popup-is-transparent-no-opaque-square-frame) status.
+
+**Expected:** Captured version is recorded. If the version is **41.0.4 through 41.x.y** and S10 fails, the upstream electron/electron#50213 regression hypothesis (per @noctuum's bisect on #370) holds and the issue is blocked on upstream. If the version is **41.0.3 or earlier** and S10 fails, the bisect is wrong — investigate. If the version is **a later release that includes a CSD-rendering fix** and S10 still fails, the upstream-regression hypothesis is also wrong.
+
+**Diagnostics on failure:** Output of the version capture command, link to electron/electron#50213, the BrowserWindow construction args from the bundled `index.js`.
+
+**Currently:** Per @noctuum's bisect, 41.0.4 introduced the regression. No upstream fix shipped as of last check.
+
+**References:** [#370](https://github.com/aaddrick/claude-desktop-debian/issues/370), upstream `index.js:515380, 515383` (already sets `transparent: true` and `backgroundColor: "#00000000"`)
+**Code anchors:** build-reference/app-extracted/.vite/build/index.js:515380 (`transparent: !0`), 515383 (`backgroundColor: "#00000000"`), 515374-515397 (popup `BrowserWindow` construction args block, including `frame: !1`, `hasShadow: Zr`, `type: Zr ? "panel" : void 0`).
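+
+The decision table in **Expected** can be written down as a small classifier. This is a hedged TypeScript sketch: the verdict labels, the function names, and the optional `fixedIn` parameter (the first release carrying an upstream fix, unknown at the time of writing) are all assumptions introduced for illustration. Only the `41.0.4` boundary comes from the bisect cited above.
+
+```typescript
+type Verdict =
+  | "s10-passes"
+  | "blocked-on-upstream"
+  | "bisect-suspect"
+  | "hypothesis-suspect"
+  | "undetermined";
+
+// Accepts "41.0.4" or "v41.0.4" (Electron prints the latter for --version).
+function parseVersion(v: string): number[] {
+  return v
+    .replace(/^v/, "")
+    .split(".")
+    .map((part) => parseInt(part, 10));
+}
+
+// Returns -1, 0, or 1 by comparing major, minor, patch in order.
+function compareVersions(a: string, b: string): number {
+  const pa = parseVersion(a);
+  const pb = parseVersion(b);
+  for (let i = 0; i < 3; i++) {
+    const diff = (pa[i] || 0) - (pb[i] || 0);
+    if (diff !== 0) return diff < 0 ? -1 : 1;
+  }
+  return 0;
+}
+
+function classify(version: string, s10Fails: boolean, fixedIn?: string): Verdict {
+  if (!s10Fails) return "s10-passes";
+  // A release that should contain the fix but still fails contradicts the hypothesis.
+  if (fixedIn !== undefined && compareVersions(version, fixedIn) >= 0) {
+    return "hypothesis-suspect";
+  }
+  // Failing before the bisected release means the bisect is wrong.
+  if (compareVersions(version, "41.0.4") < 0) return "bisect-suspect";
+  // 41.0.4 through 41.x.y: consistent with the upstream regression.
+  if (parseVersion(version)[0] === 41) return "blocked-on-upstream";
+  return "undetermined";
+}
+```
+
+Recording the verdict next to the version in the matrix would keep the hypothesis check mechanical across rows.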
+
+## S34 — Quick Entry shortcut focuses fullscreen main window instead of showing popup
+
+**Severity:** Should
+**Surface:** Shortcut behavior on fullscreen main
+**Applies to:** All rows
+**Issues:** —
+
+**Steps:**
+1. Launch app. Put the main window into native fullscreen (F11 or platform equivalent).
+2. Press the Quick Entry shortcut.
+
+**Expected:** Popup does **not** appear. Main window receives focus and `ide()` runs (upstream behavior at `index.js:525287-525290`). This is intentional upstream UX: it assumes the user wants to interact with the existing fullscreen Claude rather than overlay a popup on it.
+
+**Diagnostics on failure:** Screenshot, launcher log, confirm fullscreen state via `wmctrl -l -G` / Wayland equivalent.
+
+**References:** upstream `index.js:525287-525290`
+**Code anchors:** build-reference/app-extracted/.vite/build/index.js:525287-525290 (Quick Entry callback: `ut && !ut.isDestroyed() && ut.isFullScreen() ? (ut.focus(), ide()) : Yri()`), 515234-515241 (`ide()` — `show()` + `focus()` + `webContents.send(TEe.cmdK)` for the cmd-K dispatch).
+
+## S35 — Quick Entry popup position is persisted across invocations and across app restarts
+
+**Severity:** Should
+**Surface:** Popup placement memory
+**Applies to:** All rows
+**Issues:** —
+
+**Steps:**
+1. Launch app. Invoke Quick Entry. Note the popup position (record monitor + coordinates if possible — e.g. `xdotool getactivewindow getwindowgeometry` on X11).
+2. Dismiss (Esc). Re-invoke. Position should be unchanged across this dismiss/re-invoke cycle.
+3. Quit Claude Desktop fully (`pkill -f app.asar`). Re-launch. Invoke Quick Entry.
+4. Confirm position matches the pre-restart capture.
+
+**Expected:** Popup reappears at the same monitor + position before and after a full app restart. Upstream persists position via `an.get("quickWindowPosition")` (`index.js:515491-515526`), keyed on monitor label + resolution.
+
+**Diagnostics on failure:** Captured coordinates pre/post-restart, content of any persisted settings file (project's settings storage location varies by OS).
+
+**References:** upstream `index.js:515491-515526`
+**Code anchors:** build-reference/app-extracted/.vite/build/index.js:515444-515461 (`Ko.on("hide", …)` persists `quickWindowPosition` via `an.set(...)`), 515491-515521 (`aHn()` resolves saved monitor by `label + bounds.width + bounds.height`, falling back to label-only or proportional placement), 515489 (`Ko.setPosition(...)` after show).
+
+## S36 — Quick Entry popup falls back to primary display when saved monitor is gone
+
+**Severity:** Smoke
+**Surface:** Multi-monitor placement
+**Applies to:** All rows with a multi-monitor capable host
+**Issues:** —
+
+**Steps:**
+1. **Multi-monitor required.** With an external monitor connected, invoke Quick Entry on the external monitor. Trigger position persistence (per [S35](#s35--quick-entry-popup-position-is-persisted-across-invocations-and-across-app-restarts)).
+2. Disconnect the external monitor (libvirt: detach the second display device; bare metal: unplug).
+3. Invoke Quick Entry.
+
+**Expected:** Popup appears on the primary display, not at off-screen coordinates. Upstream falls back to `cHn()` when the saved monitor is no longer present (`index.js:515502`).
+
+**Diagnostics on failure:** `xrandr` (X11) / `wlr-randr` (wlroots) output before and after disconnect, captured popup coordinates, screenshot.
+
+**Skip when:** Single-monitor VM or host. Not part of the [§ Mandatory matrix](../quick-entry-closeout.md#mandatory-matrix); skip with `-` in the dashboard.
+
+**References:** upstream `index.js:515502`
+**Code anchors:** build-reference/app-extracted/.vite/build/index.js:515502 (`return cHn();` early-return when no saved position), 515523-515527 (`cHn()` centres popup on `screen.getPrimaryDisplay()` workArea), 515514-515515 (`label`-only match fallback before primary-display fallback).
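+
+The fallback order described in the anchors (exact match on label and resolution, then label only, then primary display) can be sketched as follows. This is an illustrative TypeScript model, not the upstream code: `resolveTargetDisplay` and the `SavedPosition` shape are hypothetical, and the proportional-placement step that upstream performs after picking a display is left out.
+
+```typescript
+interface Display {
+  label: string;
+  bounds: { x: number; y: number; width: number; height: number };
+}
+
+// Assumed shape of the persisted value; the real schema is not shown here.
+interface SavedPosition {
+  label: string;
+  width: number;
+  height: number;
+}
+
+function resolveTargetDisplay(
+  saved: SavedPosition | undefined,
+  displays: readonly Display[],
+  primary: Display,
+): Display {
+  if (!saved) return primary; // nothing persisted yet
+  const target: SavedPosition = saved;
+  // 1. Same monitor at the same resolution.
+  const exact = displays.filter(
+    (d) =>
+      d.label === target.label &&
+      d.bounds.width === target.width &&
+      d.bounds.height === target.height,
+  )[0];
+  if (exact) return exact;
+  // 2. Same monitor label, resolution changed.
+  const byLabel = displays.filter((d) => d.label === target.label)[0];
+  // 3. Saved monitor is gone: fall back to the primary display.
+  return byLabel || primary;
+}
+```
+
+For this case the interesting branch is the last one: after the external monitor is disconnected, neither match succeeds and the popup should land on the primary display.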
+
+## S37 — Quick Entry popup remains functional after main window destroy
+
+**Severity:** Should
+**Surface:** Popup lifecycle independence from main window
+**Applies to:** All rows (where reachable)
+**Issues:** —
+
+**Steps:**
+1. Launch app, focus main window.
+2. **Trigger main window destroy without quitting the app.** On this project, the X-button hide-to-tray override means the standard close path does **not** destroy `ut`. Reach the destroy path via one of:
+   - DevTools console on the main window: `require('electron').remote.getCurrentWindow().destroy()` (only if a `remote` shim is exposed; Electron removed the built-in `remote` module in v14, so this is not guaranteed).
+   - A debug build with the hide-to-tray override removed.
+   - Skip and mark `-` if unreachable.
+3. After destroy: invoke Quick Entry, type `hello`, submit.
+
+**Expected:** Popup appears and accepts input. Upstream's `!ut || ut.isDestroyed()` guard at `index.js:515595` skips the show/focus block without crashing. The new chat is created in the data layer; whether it has a window to surface in is a separate question (upstream contract is "popup itself does not crash").
+
+**Diagnostics on failure:** Crash dump, `~/.config/Claude/logs/`, sequence of actions taken to reach the destroy path.
+
+**Currently:** Likely unreachable on Linux without a debug build, due to project's hide-to-tray override of the X button. Mark `-` (N/A) on rows where the destroy path can't be triggered.
+
+**References:** upstream `index.js:515595`
+**Code anchors:** build-reference/app-extracted/.vite/build/index.js:515595-515602 (`setTimeout(() => { !ut || ut.isDestroyed() || (h1() || ut.show(), ut.focus(), Qe == null || Qe.webContents.focus(), iri()); }, 0)` — guard skips show/focus block on destroy without throwing); 515547 (companion guard in `nde()` chat-id submit path: `else if (ut && !ut.isDestroyed())`).
--- a/docs/testing/cases/tray-and-window-chrome.md
+++ b/docs/testing/cases/tray-and-window-chrome.md
@@ -0,0 +1,123 @@
+# Tray & Window Chrome
+
+Tests covering the tray icon, OS-native window decorations, the hybrid in-app topbar (PR #538), and hide-to-tray on close. See [`../matrix.md`](../matrix.md) for status.
+
+## T03 — Tray icon present
+
+**Severity:** Smoke
+**Surface:** System tray / SNI
+**Applies to:** All rows
+**Issues:** —
+**Runner:** [`tools/test-harness/src/runners/T03_tray_icon_present.spec.ts`](../../../tools/test-harness/src/runners/T03_tray_icon_present.spec.ts) — registration only (left-click toggle + theme-switch in-place rebuild are v2)
+
+**Steps:**
+1. Launch the app. Wait a few seconds.
+2. Locate the tray icon in the system tray / status area.
+3. Right-click → confirm standard menu (Show, Quit, etc.). Left-click → confirm window toggles.
+4. Switch the system theme between light and dark; observe the tray icon update.
+
+**Expected:** Tray icon appears within a few seconds of app launch. Right-click exposes the standard menu. Left-click toggles main window visibility. Theme changes update the icon in place without spawning a duplicate.
+
+**Diagnostics on failure:** `RegisteredStatusNotifierItems` from the SNI watcher (see [runbook](../runbook.md#tray--dbus-state-kde)), the tray daemon process for the DE (Plasma's `plasmashell`, GNOME's `gnome-shell` + AppIndicator extension state, etc.), launcher log.
+
+**References:** [`docs/learnings/tray-rebuild-race.md`](../../learnings/tray-rebuild-race.md)
+**Code anchors:** `build-reference/app-extracted/.vite/build/index.js:525627` (`vy.on("menuBarEnabled", () => { Sde() })` — re-entry), `index.js:525631-525673` (`function Sde()` — tray construction), `index.js:525645` (`new hA.Tray(hA.nativeImage.createFromPath(t))`), `index.js:525646` (`qh.on("click", () => void Yri())` — left-click handler), `index.js:525653` (`qh.setContextMenu(mnt())` — Linux right-click via context menu), `index.js:515150-515169` (`function mnt()` — Show App + Quit menu items), `index.js:525623` (`hA.nativeTheme.on("updated", ...)` — theme-change re-entry).
+
+## T04 — Window decorations draw
+
+**Severity:** Smoke
+**Surface:** Window chrome
+**Applies to:** All rows
+**Issues:** [PR #127](https://github.com/aaddrick/claude-desktop-debian/pull/127), [PR #538](https://github.com/aaddrick/claude-desktop-debian/pull/538)
+**Runner:** [`tools/test-harness/src/runners/T04_window_decorations.spec.ts`](../../../tools/test-harness/src/runners/T04_window_decorations.spec.ts) — X11 / XWayland only (checks `_NET_FRAME_EXTENTS`); native-Wayland window-state queries are deferred
+
+**Steps:**
+1. Launch the app.
+2. Confirm window has a working OS-native frame: close, minimize, maximize render and respond.
+3. Resize via window edges.
+
+**Expected:** Frame is drawn by the DE/compositor (not the app). All controls render and respond. Resize works.
+
+**Diagnostics on failure:** `xprop _NET_WM_WINDOW_TYPE` (X11) / `swaymsg -t get_tree` or compositor-equivalent (Wayland), launcher log line for `frame:` setting, screenshot.
+
+**References:** [PR #127](https://github.com/aaddrick/claude-desktop-debian/pull/127), [PR #538](https://github.com/aaddrick/claude-desktop-debian/pull/538) (hybrid mode keeps native frame), [`docs/learnings/linux-topbar-shim.md`](../../learnings/linux-topbar-shim.md)
+**Code anchors:** Upstream factory passes `titleBarStyle: "hidden"` and `titleBarOverlay: ys` (Windows-only flag) to `BrowserWindow` at `build-reference/app-extracted/.vite/build/index.js:524892-524909` (`Ori()`). On Linux the wrapper at `scripts/frame-fix-wrapper.js:122` overrides to `options.frame = true` and at `scripts/frame-fix-wrapper.js:129-130` deletes `titleBarStyle` / `titleBarOverlay` so the DE draws the frame. (Hybrid-mode plumbing, i.e. `CLAUDE_TITLEBAR_STYLE` resolution and the `native`/`hybrid`/`hidden` branches, lives on `main` per PR #538; the docs/compat-matrix branch's `frame-fix-wrapper.js` carries only the unconditional `frame:true` patch, which is sufficient for T04's "frame draws" assertion.)
+
+## T07 — In-app topbar renders + clickable
+
+**Severity:** Smoke
+**Surface:** In-app topbar (hybrid mode)
+**Applies to:** All rows on PR #538 builds
+**Issues:** [PR #538](https://github.com/aaddrick/claude-desktop-debian/pull/538), [PR #127](https://github.com/aaddrick/claude-desktop-debian/pull/127)
+
+**Steps:**
+1. Launch a PR #538 build.
+2. Observe the in-app topbar below the OS frame.
+3. Click each of: hamburger menu, sidebar toggle, search, back, forward, Cowork ghost.
+
+**Expected:** All five topbar buttons render below the native frame. Each responds to mouse clicks (no implicit drag region capturing the events). If any single button fails to render or click, the test is `✗` — note which one in the linked issue.
+
+**Diagnostics on failure:** Screenshot, env (`OZONE_PLATFORM`, `ELECTRON_OZONE_PLATFORM_HINT`, `GDK_BACKEND`, `QT_QPA_PLATFORM`, `MOZ_ENABLE_WAYLAND`, `SDL_VIDEODRIVER`), launcher log, DevTools `document.querySelector('.topbar')` HTML if accessible.
+
+**References:** [PR #538](https://github.com/aaddrick/claude-desktop-debian/pull/538), [PR #127](https://github.com/aaddrick/claude-desktop-debian/pull/127), [`docs/learnings/linux-topbar-shim.md`](../../learnings/linux-topbar-shim.md)
+**Code anchors:** UA-spoof shim source `scripts/wco-shim.js` (lines 1-30 module guard / `CLAUDE_TITLEBAR_STYLE != 'native'` gate, lines 184-191 `navigator.userAgent` redefinition matching `/(win32|win64|windows|wince)/i`, lines 52-53 `CONTROLS_WIDTH=140` / `TITLEBAR_HEIGHT=40`); injection orchestrator `scripts/patches/wco-shim.sh` (`patch_wco_shim()` prepends shim source to `mainView.js`); hybrid-mode wrapper branch `scripts/frame-fix-wrapper.js:62-70` (`VALID_TITLEBAR_STYLES`, default `hybrid`) and `:152-240` (per-mode `frame` / `titleBarStyle` handling).
+
+## T08 — Hide-to-tray on close
+
+**Severity:** Smoke
+**Surface:** Window lifecycle
+**Applies to:** All rows
+**Issues:** [PR #451](https://github.com/aaddrick/claude-desktop-debian/pull/451)
+
+**Steps:**
+1. Launch the app. Click the window close (X) button.
+2. Confirm app process is still running (`pgrep -af claude-desktop`).
+3. Click the tray icon (or invoke Quick Entry) → window restores.
+4. Quit explicitly via tray menu or `Ctrl+Q`.
+
+**Expected:** Close button hides main window to tray, doesn't quit. App keeps running. Tray-click restores. Explicit Quit ends the process.
+
+**Diagnostics on failure:** `pgrep -af claude-desktop` after close, launcher log, screenshot of any dialog.
+
+**References:** [PR #451](https://github.com/aaddrick/claude-desktop-debian/pull/451)
+**Code anchors:** Upstream Linux quit-on-last-close at `build-reference/app-extracted/.vite/build/index.js:525550-525552` (`hA.app.on("window-all-closed", () => { Zr || Ap() })` — `Zr` is darwin). Wrapper interception at `scripts/frame-fix-wrapper.js:178-185` (`this.on('close', e => { if (!result.app._quittingIntentionally && !this.isDestroyed()) { e.preventDefault(); this.hide() } })`) and `scripts/frame-fix-wrapper.js:370-374` (`app.on('before-quit', () => { app._quittingIntentionally = true })` — arms the bypass for tray-Quit / `Ctrl+Q` / SIGTERM). `CLOSE_TO_TRAY` gate (Linux + `CLAUDE_QUIT_ON_CLOSE !== '1'`) at `scripts/frame-fix-wrapper.js:49-51`. Tray Quit menu item `mnt()` `click: rde` at `index.js:515166`; `function rde()` at `index.js:515306-515308` calls `Ap(!1)`.
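+
+The interception logic in the anchors can be modelled without Electron to show which closes are hidden and which go through. This is a simplified TypeScript sketch with hypothetical names (`closeToTrayEnabled`, `handleClose`, and the small interfaces); it mirrors the quoted wrapper conditions but is not the wrapper itself.
+
+```typescript
+interface ClosableWindow {
+  isDestroyed(): boolean;
+  hide(): void;
+}
+
+interface AppState {
+  _quittingIntentionally: boolean;
+}
+
+// Gate as described above: Linux, unless CLAUDE_QUIT_ON_CLOSE is "1".
+function closeToTrayEnabled(
+  platform: string,
+  env: { CLAUDE_QUIT_ON_CLOSE?: string },
+): boolean {
+  return platform === "linux" && env.CLAUDE_QUIT_ON_CLOSE !== "1";
+}
+
+// Returns true when the close was intercepted and the window hidden.
+function handleClose(
+  event: { preventDefault(): void },
+  win: ClosableWindow,
+  app: AppState,
+): boolean {
+  // An explicit quit (the flag armed by 'before-quit') must not be blocked.
+  if (app._quittingIntentionally || win.isDestroyed()) return false;
+  event.preventDefault();
+  win.hide();
+  return true;
+}
+```
+
+Step 1 of this case exercises the intercepted branch and step 4 exercises the pass-through branch after `before-quit` has set the flag.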
+
+## S08 — Tray icon doesn't duplicate after `nativeTheme` update
+
+**Severity:** Should
+**Surface:** Tray (KDE)
+**Applies to:** KDE-W, KDE-X
+**Issues:** [`docs/learnings/tray-rebuild-race.md`](../../learnings/tray-rebuild-race.md)
+
+**Steps:**
+1. Launch the app on KDE.
+2. Toggle system theme (light ↔ dark).
+3. Observe the tray for ~10 seconds.
+
+**Expected:** Tray icon updates in place via `setImage` + `setContextMenu`. SNI service stays registered — no de-register / re-register churn that would leave a duplicate icon visible until KDE garbage-collects.
+
+**Diagnostics on failure:** SNI watcher state before/after theme switch (see [runbook](../runbook.md#tray--dbus-state-kde)), launcher log, `journalctl --user -u plasma-plasmashell -n 50`.
+
+**References:** [`docs/learnings/tray-rebuild-race.md`](../../learnings/tray-rebuild-race.md). Mitigated upstream — the in-place fast-path is the current behavior.
+**Code anchors:** Upstream destroy+recreate slow-path at `build-reference/app-extracted/.vite/build/index.js:525643` (`qh && (qh.destroy(), (qh = null))`) followed immediately by `new hA.Tray(...)` at `:525645` and `setContextMenu(mnt())` at `:525653` — the SNI re-register that races on KDE. Fast-path injection in `scripts/patches/tray.sh` `patch_tray_inplace_update()` (lines 95-231): extracts `tray_var` / `menu_func` / `path_var` / `enabled_var` dynamically, then injects `if (TRAY && ENABLED !== false) { TRAY.setImage(EL.nativeImage.createFromPath(PATH)); process.platform !== "darwin" && TRAY.setContextMenu(MENU()); return }` before the destroy block. Idempotency marker at `tray.sh:174-180` keys on the post-rename `setImage(...nativeImage.createFromPath(PATH_VAR))` literal. Mutex + 250 ms DBus settle delay (the prior mitigation, kept for the legitimate slow-path entries) at `tray.sh:48-60`.
+
+## S13 — Hybrid topbar shim survives Omarchy's Ozone-Wayland env exports
+
+**Severity:** Critical (for Omarchy users)
+**Surface:** In-app topbar (hybrid mode) under Omarchy env
+**Applies to:** Hypr-O
+**Issues:** [PR #538](https://github.com/aaddrick/claude-desktop-debian/pull/538)
+
+**Steps:**
+1. On OmarchyOS, export Omarchy's session-wide env (`ELECTRON_OZONE_PLATFORM_HINT=wayland`, `OZONE_PLATFORM=wayland`, `GDK_BACKEND=wayland,x11,*`, `QT_QPA_PLATFORM=wayland;xcb`, `MOZ_ENABLE_WAYLAND=1`, `SDL_VIDEODRIVER=wayland,x11`).
+2. Launch a PR #538 build.
+3. Click each of the five topbar buttons.
+
+**Expected:** The hybrid-mode topbar shim (`scripts/wco-shim.js`) loads in time to spoof the UA before claude.ai's `isWindows()` check fires. All five topbar buttons render and click.
+
+**Diagnostics on failure:** Full session env, launcher log, `--doctor`, screenshot, video (per @lukedev45's bug report on PR #538), DevTools console for shim-load errors.
+
+**Currently:** Reproduces partial render on OmarchyOS Hyprland per [@lukedev45](https://github.com/lukedev45)'s video on [PR #538](https://github.com/aaddrick/claude-desktop-debian/pull/538). @aaddrick attempted local repro on KDE Plasma + Wayland with the same env vars and could not reproduce; root cause TBD pending diagnostic capture from a broken run.
+
+**References:** [PR #538](https://github.com/aaddrick/claude-desktop-debian/pull/538), [`docs/learnings/linux-topbar-shim.md`](../../learnings/linux-topbar-shim.md)
+**Code anchors:** Shim is inlined at the top of `mainView.js` (the BrowserView preload), not loaded via `require` — see the rationale at `scripts/patches/wco-shim.sh:23-40` ("Sandboxed preloads can only require a fixed allowlist of modules…"). The injection prepends `scripts/wco-shim.js` source at the start of `app.asar.contents/.vite/build/mainView.js` so the UA override fires before the bundle's `isWindows()` regex (`/(win32|win64|windows|wince)/i`) ever runs in the page main world (`scripts/wco-shim.js:184-191`). The shim's IIFE no-ops on non-Linux at `wco-shim.js:29` and on `CLAUDE_TITLEBAR_STYLE === 'native'` at `wco-shim.js:30-32`, so the only env-export interaction with `OZONE_PLATFORM` etc. is via Chromium's own platform plumbing — none of those exports are read by the shim itself, which makes the partial-render repro on Omarchy mysterious to static analysis.
--- a/docs/testing/claudeai-lib-ax-migration-prompt.md
+++ b/docs/testing/claudeai-lib-ax-migration-prompt.md
@@ -0,0 +1,322 @@
+# lib/claudeai.ts AX-tree migration — implementation prompt
+
+This file is meant to be **copied verbatim into a fresh Claude Code
+session** as the initial user message. Don't paraphrase it; the
+self-correction loop depends on the exact directives below.
+
+---
+
+## Prompt to paste
+
+You're picking up after the v7 fingerprint walker + U01 wire-up
+landed. Walker, resolver, and U01 are all on the AX-tree substrate.
+The page-object library `tools/test-harness/src/lib/claudeai.ts` is
+still on the old substrate — `document.querySelector` against
+minified-tailwind class shapes (`button[aria-haspopup="menu"]` +
+`span.truncate.max-w-[Npx]`) — and that's where every claude.ai UI
+spec couples to upstream's React DOM. Your job is to migrate the
+brittle CSS-shape walks in `claudeai.ts` to AX-tree resolution using
+the v7 walker primitives, run the H/S spec families that consume
+them, and iterate until those specs pass without DOM-shape coupling.
+
+### Authoritative reference
+
+Read these in order. They contain the design, the gotchas, and the
+runtime contract — the prompt below assumes them as background.
+
+- `docs/testing/fingerprint-v7-plan.md` — design contract for the v7
+  fingerprint, kind-strictness matrix, resolver fallback chain. Skim
+  the "Capture algorithm" and "Resolver / fallback chain" sections;
+  the migration consumes the same primitives.
+- `docs/learnings/test-harness-ax-tree-walker.md` — the five
+  non-obvious AX-tree traps (AX-enable async lag, navigateTo no-op,
+  flat dialog>button[] lists, more-options shape, sidebar
+  virtualization). All apply here too — `lib/claudeai.ts` calls run
+  inside the same renderer the walker drives.
+- `tools/test-harness/src/lib/claudeai.ts` — the migration target.
+  ~340 lines, eight functions plus two classes (`CodeTab`,
+  `LocalEnvPill`). Every public function is a discovery walk against
+  `evalInRenderer` with `document.querySelectorAll`.
+
+### Why this iteration
+
+Per the v7 plan's design goal §2 "Resilient to cosmetic drift" —
+upstream regenerates tailwind class signatures on rebuild
+(`max-w-[Npx]`, `df-pill`-style atoms), so `claudeai.ts`'s CSS-shape
+walks break on any minor UI rebuild even when the AX-computed role
+and accessible name are stable. The U01 wire-up confirmed the AX
+tree is a usable substrate end-to-end (~7s/test, 89/90 stable across
+two consecutive sweeps). Pulling `claudeai.ts` onto the same
+substrate eliminates the recurring "tailwind regen breaks H05/S31
+again" failure mode.
+
+Acceptance per the plan: H05 + S29-S37 + T-prefix specs that consume
+`claudeai.ts` keep passing on the same account, with zero new
+flakes. Migration is mechanical (replace the eval-string walks with
+AX-tree queries) and the existing tests are the contract.
+
+### Repo conventions
+
+- Tabs for indentation, lines under 80 chars, single quotes for
+  literals, TypeScript strict mode (`tools/test-harness/tsconfig.json`
+  enforces it).
+- Comments only when the WHY is non-obvious — write the `because:`
+  clause, not the `that:` clause.
+- No backward-compatibility shims. If a function's signature needs
+  to change, change every caller. Don't keep both code paths.
+- Don't commit. The user reviews and commits.
+
+### Code anchors
+
+- `tools/test-harness/explore/walker.ts` — exports the primitives
+  you'll consume:
+  - `findByFingerprint(inspector, fingerprint, kind)` — full
+    resolver with strictness gating + relaxed-scope fallback.
+    Overkill for one-shot lookups against the live renderer.
+  - `queryAccessibleTree(elements, query)` — pure filter, used at
+    capture and resolve time. Takes a `RawElement[]` snapshot and
+    an `AxQuery` (ariaPath + leaf criteria). What you'll likely
+    wrap.
+  - `axTreeToSnapshot(nodes)` — converts CDP `AxNode[]` to the
+    walker's `RawElement[]` shape. Drops ignored nodes.
+  - `walkLandmarkAncestors(raw)` — emits the AriaStep[] for an
+    element. Useful if a method needs to disambiguate by landmark.
+  - `waitForAxTreeStable(inspector, opts)` — gating primitive used
+    by walker + U01. Use `{ minNodes: 1, timeoutMs: 10000 }` for
+    post-click reads (matches `snapshotSurface`'s default).
+- `tools/test-harness/src/lib/inspector.ts` — `getAccessibleTree`
+  fetches the raw CDP tree filtered to the claude.ai webContents.
+- `tools/test-harness/src/lib/claudeai.ts` — the migration target.
+  Read the file-header comment first; it documents the discovery
+  strategy you're replacing.
+- `tools/test-harness/src/runners/H05_ui_drift_check.spec.ts`,
+  `S31_quick_entry_submit_reaches_new_chat.spec.ts`,
+  `S32_quick_entry_submit_gnome_stale_isfocused.spec.ts` — primary
+  consumers of the methods being migrated.
+
+### Phases
+
+#### Phase A — spike on one method
+
+1. `cd tools/test-harness && npm run typecheck` — must pass before
+   doing anything.
+2. Pick `openPill(inspector, labelPattern, opts)` as the spike.
+   It's the most CSS-shape-coupled method and exercises the
+   menu-render polling pattern the rest of `claudeai.ts` reuses.
+3. Replace its body with an AX-tree query:
+   - Fetch the AX tree (`inspector.getAccessibleTree('claude.ai')`),
+     convert via `axTreeToSnapshot`.
+   - Filter to elements with `computedRole === 'button'` and
+     accessibleName matching `labelPattern`.
+   - For each candidate, compute its parent landmark via
+     `walkLandmarkAncestors`. The compact-pill discriminator —
+     "has a `span.truncate.max-w-[Npx]` child" — needs an AX
+     analogue. Most likely: parent is `toolbar` / `group` and the
+     element has `aria-haspopup === 'menu'` (exposed in AX as
+     `hasPopup` property; check whether `RawElement` carries it
+     and extend if needed).
+   - Click via `inspector.clickByBackendNodeId(raw.backendDOMNodeId)`.
+   - Poll for menu items via AX role match (`menuitem`,
+     `menuitemradio`, `menuitemcheckbox`).
+4. Run H05 against your branch (`./node_modules/.bin/playwright
+   test src/runners/H05_ui_drift_check.spec.ts`). H05 doesn't
+   directly call `openPill` but exercises the same renderer state;
+   if H05 regresses, your AX walk is wrong.
+5. Run S31 (`./node_modules/.bin/playwright test
+   src/runners/S31_quick_entry_submit_reaches_new_chat.spec.ts`).
+   This calls `openPill` indirectly via `CodeTab.activate` →
+   `findCompactPills`.
+6. If both pass, the AX substrate works for at least one method.
+   Commit the shape mentally (don't `git commit` — the user does
+   that). If either fails, the spike is in trouble; re-read the
+   AX-tree learnings doc for traps you missed and fix the
+   primitive before expanding.
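+
+A minimal sketch of the candidate filter from step 3, assuming a
+hypothetical `AxElement` shape. `RawElement` may not carry
+`hasPopup` today, so both the type and the `findPillCandidates`
+name are illustrative and not the walker's actual API:
+
+```typescript
+// Hypothetical snapshot element; not the walker's real RawElement.
+interface AxElement {
+	computedRole: string;
+	accessibleName: string;
+	hasPopup?: string;
+	backendDOMNodeId: number;
+}
+
+// Keeps buttons that open a menu and whose name matches the label.
+function findPillCandidates(
+	elements: AxElement[],
+	labelPattern: RegExp,
+): AxElement[] {
+	return elements.filter(
+		(el) =>
+			el.computedRole === 'button' &&
+			el.hasPopup === 'menu' &&
+			labelPattern.test(el.accessibleName),
+	);
+}
+```
+
+Pass a pattern without the `g` flag: `RegExp.prototype.test` keeps
+`lastIndex` state on global patterns, which would make repeated
+matches inside the filter unreliable.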
+
+#### Phase B — migrate the rest
+
+For each remaining function in `claudeai.ts`, port the discovery
+walk to AX:
+
+- `activateTab(inspector, name)` — `button` with
+  `accessibleName === name` under root or banner landmark. Existing
+  `aria-label="X"` selector → AX `name` literal match.
+- `findCompactPills(inspector)` — list of buttons with
+  `hasPopup === 'menu'` AND inner `span.truncate.max-w-[…]` text
+  child. AX equivalent: button role + hasPopup + a child
+  `genericContainer` (or whatever AX exposes for `<span>`) carrying
+  the visible text. Returns `{text, maxW, expanded}` today.
+  `maxW` is a tailwind artifact with no AX analogue; callers use
+  it only for diagnostics, never for matching, so stop populating
+  it and keep the field optional/null in the type (see failure
+  class 4 below).
+- `clickMenuItem(inspector, textPattern, opts)` — element with
+  role in `{menuitem, menuitemradio, menuitemcheckbox}` and
+  accessibleName matching `textPattern`. The CSS attribute selector
+  has an AX direct equivalent.
+- `pressEscape(inspector)` — keep as-is. It's a keydown dispatch,
+  not a discovery walk.
+- `CodeTab.activate(opts)` — calls `activateTab` + polls
+  `findCompactPills`. Migrates by transitivity.
+- `LocalEnvPill` — read its body to enumerate callers.
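+
+The `clickMenuItem` match described above could be sketched as a
+pure filter. The `AxMenuCandidate` type and the `matchMenuItems`
+name are assumptions made for illustration; the real `RawElement`
+fields may differ:
+
+```typescript
+// Hypothetical element shape; the real RawElement may differ.
+interface AxMenuCandidate {
+	computedRole: string;
+	accessibleName: string;
+}
+
+// The three ARIA menu-item roles named in the list above.
+const MENU_ITEM_ROLES: ReadonlySet<string> = new Set([
+	'menuitem',
+	'menuitemradio',
+	'menuitemcheckbox',
+]);
+
+// Keeps menu items whose accessible name matches the pattern.
+function matchMenuItems(
+	elements: AxMenuCandidate[],
+	textPattern: RegExp,
+): AxMenuCandidate[] {
+	return elements.filter(
+		(el) =>
+			MENU_ITEM_ROLES.has(el.computedRole) &&
+			textPattern.test(el.accessibleName),
+	);
+}
+```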
+
+After each migration:
+1. `npm run typecheck` — must pass.
+2. `npx tsx explore/walker.ts` — selfTest must pass (you may have
+   touched walker.ts to expose new primitives).
+3. Run the affected spec(s).
+
+#### Phase C — full sweep
+
+1. Run all H/S/T runners that consume `claudeai.ts`:
+   - H05 (UI drift)
+   - S31 (Code-tab submit)
+   - S32 (GNOME stale isFocused)
+   - any T-prefix that uses `installOpenDialogMock` or `pressEscape`
+2. Tally pass/fail. The post-migration baseline must equal the
+   pre-migration baseline, modulo flakes characterized in
+   `docs/learnings/test-harness-ax-tree-walker.md`.
+
+Cap iterations at **5 sweep cycles** total (spike + 4 fix-rerun
+cycles) — past that, stop and report.
+
+##### Failure classes
+
+1. **AX-shape mismatch.** Element has the CSS shape the old code
+   relied on but a different AX role/name than expected. Fix:
+   probe the AX tree for the actual shape (use
+   `inspector.getAccessibleTree('claude.ai')` interactively from a
+   one-shot script), update the AX query.
+2. **Missing AX property exposure.** `hasPopup`, `expanded`, etc.
+   may not be in `RawElement` today (the walker only reads role,
+   name, ancestors, sibling info). Extend `RawElement` and
+   `axTreeToSnapshot` to expose what the migration needs. Update
+   walker.ts selfTest if you change the snapshot shape.
+3. **Race against menu render.** Old code polled
+   `document.querySelectorAll('[role=menuitem]')` every 50ms. AX
+   tree updates lag the DOM by hundreds of ms, so call
+   `waitForAxTreeStable(inspector, { minNodes: 1 })` between the
+   click and the menuitem fetch instead of a short DOM poll.
+4. **Tailwind-class diagnostic loss.** `findCompactPills` returns
+   `maxW` which callers use only in error messages. If the
+   AX-only return shape drops `maxW`, error messages get less
+   informative — accept it, don't reintroduce DOM walks just for
+   diagnostics. Keep the `maxW` field optional/null in the type.
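+
+For failure class 2, a sketch of reading one AX property off a raw
+node. The interfaces below are trimmed local declarations modelled
+on the CDP Accessibility domain, and `readAxProperty` is a
+hypothetical helper name; check the harness's own `AxNode` type
+before reusing any of it:
+
+```typescript
+// Trimmed shapes modelled on CDP Accessibility; field set assumed.
+interface AxValue {
+	type: string;
+	value?: unknown;
+}
+
+interface AxProperty {
+	name: string;
+	value: AxValue;
+}
+
+interface AxNodeLike {
+	ignored: boolean;
+	properties?: AxProperty[];
+}
+
+// Returns the property's string form, or null when it is absent.
+function readAxProperty(
+	node: AxNodeLike,
+	name: string,
+): string | null {
+	const prop = node.properties?.find((p) => p.name === name);
+	if (prop === undefined || prop.value.value === undefined) {
+		return null;
+	}
+	return String(prop.value.value);
+}
+```
+
+A snapshot converter could call this once per field the migration
+consumes (for example `hasPopup`), which keeps `RawElement` from
+growing speculatively.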
+
+##### What "fix" means
+
+A fix is one of:
+- A code change in `claudeai.ts`, `walker.ts`, or `inspector.ts`.
+- A targeted extension of `RawElement` / `axTreeToSnapshot` to
+  expose an AX property the migration needs.
+
+Not a fix:
+- `// eslint-disable-next-line` / `// @ts-ignore` / `as unknown as ...`.
+- Keeping the old `document.querySelector` walk as a fallback.
+- Adding an AX walk that wraps a CSS walk that wraps an AX walk.
+
+### Self-correction loop (general protocol)
+
+After each phase's specific loop:
+
+1. If `npm run typecheck` reports errors, fix root causes — no
+   `// @ts-ignore`, no `any`, no `as unknown as ...`.
+2. If `npx tsx explore/walker.ts` (selfTest) fails, the change broke
+   an algorithmic invariant. Don't relax the test; fix the change.
+3. **Cap fix attempts per problem class at 3.** After 3 attempts
+   on the same class without progress, stop and report.
+4. Mark a phase complete only when every step in that phase passes
+   cleanly.
+
+### Termination conditions
+
+Stop and write a final report when one of the following holds:
+
+1. **Migration is clean.** All `claudeai.ts` methods on AX
+   substrate, all consuming specs pass at the pre-migration
+   baseline. Report final pass tallies + diff stat.
+2. **Hit the 5-sweep cap.** Report what's done, what's blocked,
+   and what each remaining failure looks like.
+3. **Hit the 3-attempt cap on a non-trivial issue.** Report
+   attempts, why each failed, what's blocked.
+4. **AX exposure gap.** A claude.ai surface uses a property the AX
+   tree doesn't expose (e.g., custom `data-state` attributes
+   without a corresponding ARIA reflection). Stop, document the
+   gap, ask the user before adding a hybrid AX+DOM walk.
+
+### What you should NOT do
+
+- Don't commit. The user reviews everything.
+- Don't keep both substrates. The migration is atomic per method:
+  CSS walk out, AX walk in. No fallback chains.
+- Don't add new abstractions in `claudeai.ts` that aren't required
+  by the migration. The file's shape (one function per UI verb) is
+  load-bearing for callers — don't introduce a `PageObject` base
+  class or a generic AX builder.
+- Don't run the host Claude Desktop. The user runs it. The H/S
+  specs use `launchClaude` with `seedFromHost` or `null` isolation
+  per spec — confirm with the user before any sweep.
+- Don't widen `RawElement` speculatively. Only add fields the
+  migration consumes. Each new field bloats every snapshot.
+- Don't drill into a single-method workaround that other methods
+  would have to duplicate. If a fix wants to live in a helper,
+  put it next to `queryAccessibleTree` in `walker.ts`.
+
+### Final report format
+
+```markdown
+## Migration summary
+
+- Functions migrated:    N / N
+- Walker.ts changes:     <one-line summary>
+- Inspector.ts changes:  <one-line summary or none>
+- H/S/T specs run:       N
+- H/S/T specs passed:    N
+- New flakes introduced: N (description)
+
+## Iteration log
+
+### Spike — openPill
+- Result: ...
+- AX shape used: ...
+- Issues hit: ...
+
+### Phase B — remaining methods
+- One block per method ...
+
+### Phase C — full sweep
+- Per-spec pass/fail tally
+- Diff against pre-migration baseline
+
+## Open issues
+- ...
+
+## Files touched
+git status output
+
+## Diff for review
+git diff --stat output
+```
+
+### Operational notes
+
+- Background runs: use `Bash run_in_background: true` for any
+  multi-spec sweep, and `Monitor` with a tight grep filter
+  (`✓|✘|Error|FAIL|EXIT=`) to stream events. Stop the monitor when
+  the run completes.
+- Check for leftover Electron processes between runs
+  (`pgrep -af '/usr/lib/claude-desktop/node_modules/electron'`)
+  and stale tmpdirs (`ls /tmp/claude-test-*`) — clean both up if
+  the prior run errored before teardown.
+- The U01 wire-up landed two `walker.ts` fixes that are part of
+  the substrate you're inheriting:
+  1. `findByFingerprint`: strictness gate also defers to
+     `fingerprint.classification === 'instance'` for degenerate
+     fingerprints.
+  2. `redrivePath`: navigates to startUrl when current URL drifted;
+     reloads only when already at startUrl.
+  Both are live in the working tree (or just-merged main,
+  depending on when this prompt fires).
+
+Begin with Phase A. Read `claudeai.ts` end-to-end first — in
+particular the file-header discovery comment (lines 1-31) and the
+`openPill` body (lines 162-202) — so you understand what the
+existing CSS-shape walks are anchoring on before you replace them.
--- a/docs/testing/claudeai-ui-map.md
+++ b/docs/testing/claudeai-ui-map.md
@@ -0,0 +1,218 @@
+# claude.ai UI Map
+
+*Last updated: 2026-05-02*
+
+This file is the index from "UI surface" → "test-harness abstraction." It
+answers: *which renderer surface does each Layer-2 helper cover, and where
+are the gaps?* For human-readable behavior and visual specs of each surface
+(what each button looks like, what each menu does), see [`ui/`](./ui/).
+For the architectural rationale and growth strategy of the wrapper, see
+[`claudeai-ui-mapping-plan.md`](./claudeai-ui-mapping-plan.md).
+
+A `✓` marker means the helper exists today, with a `file:line` reference
+into [`tools/test-harness/src/lib/claudeai.ts`](../../tools/test-harness/src/lib/claudeai.ts).
+A `TODO` marker is a planned helper — when a third test needs the same
+shape, promote it from inline `evalInRenderer` to a top-level helper or
+page-object method (see plan Phase 3).
+
+## Top-level routes
+
+- `/new` — chat composer page (default landing for signed-in users)
+- `/chat/<uuid>` — open chat session
+- `/epitaxy` — Code tab landing
+- `/projects/<id>` — project view
+- `/login`, `/auth/*` — pre-login routes (test harness skips here)
+
+The Code df-pill click does **not** change the URL — the router rerenders
+the tab body inline. Helpers must poll for body-mount signals (e.g. a
+compact pill rendering) rather than waiting on navigation.
+
+## Surfaces by tab
+
+### Chat (df-pill "Chat", route /new)
+
+UI reference: [`ui/prompt-area.md`](./ui/prompt-area.md),
+[`ui/window-chrome-and-tabs.md`](./ui/window-chrome-and-tabs.md).
+
+- df-pill activation — `lib/claudeai.ts:activateTab` (:44) ✓
+- Composer textarea — TODO `ChatTab.composer()`
+- "+" submenu (Add files / Add to project / Skills / Connectors / ...)
+  — TODO `ChatTab.openAttachMenu()`
+- Slash menu (triggered by typing `/`) — TODO `ChatTab.openSlashMenu()`
+- Model picker — TODO `ChatTab.openModelPicker()`
+- Permission mode picker — TODO `ChatTab.openPermissionPicker()`
+- Effort picker — TODO
+- Send button — TODO `ChatTab.send()`
+- Stop button (replaces Send while responding) — TODO `ChatTab.stop()`
+- Attachment chip / drag-drop overlay — TODO
+- Usage ring — TODO
+
+### Cowork (df-pill "Cowork")
+
+UI reference: see ghost-icon row in
+[`ui/window-chrome-and-tabs.md`](./ui/window-chrome-and-tabs.md). No
+dedicated surface doc yet — the ghost icon is the canonical "topbar shim
+alive" indicator and the tab body itself is largely undocumented at the
+time of writing.
+
+- df-pill activation — `lib/claudeai.ts:activateTab` (:44) ✓
+- Workspace list — TODO `CoworkTab.listWorkspaces()`
+- Environment switcher — TODO `CoworkTab.switchEnvironment()`
+- Dispatch state indicator — TODO
+
+### Code (df-pill "Code", route /epitaxy)
+
+UI reference: [`ui/code-tab-panes.md`](./ui/code-tab-panes.md),
+[`ui/sidebar.md`](./ui/sidebar.md),
+[`ui/prompt-area.md`](./ui/prompt-area.md).
+
+- df-pill activation — `lib/claudeai.ts:activateTab` (:44) ✓
+- Tab activation + body-mount wait — `lib/claudeai.ts:CodeTab.activate` (:285) ✓
+- Env pill (Local / Cloud / SSH) — `lib/claudeai.ts:CodeTab.openEnvPill` (:317) ✓
+- Local env selection — `lib/claudeai.ts:CodeTab.selectLocal` (:350) ✓
+- Select-folder pill (rendered after Local) — used internally by
+  `lib/claudeai.ts:CodeTab.openFolderPicker` (:368) ✓
+- Folder picker dialog (full chain) — `lib/claudeai.ts:CodeTab.openFolderPicker` (:368) ✓
+- Folder picker dialog mock + assertion — `lib/claudeai.ts:installOpenDialogMock`
+  (:70) ✓ + `lib/claudeai.ts:getOpenDialogCalls` (:113) ✓
+- File tree (left panel) — TODO `CodeTab.fileTree()`
+- Editor pane — TODO `CodeTab.editor()`
+- Diff pane — TODO `CodeTab.openDiff()`
+- Preview pane — TODO `CodeTab.openPreview()`
+- Integrated terminal — TODO `CodeTab.openTerminal()`
+- Tasks / subagent / plan panes — TODO
+- Side-chat — TODO `CodeTab.openSideChat()`
+- Recent-folder selection (radio in Select-folder menu) — TODO
+
+## Surfaces independent of tab
+
+### Sidebar
+
+UI reference: [`ui/sidebar.md`](./ui/sidebar.md).
+
+- Search overlay (topbar Search icon) — TODO `SidebarNav.search()`
+- Recent conversations — TODO `SidebarNav.openRecent(idx | uuid)`
+- "More options" per row — TODO `SidebarNav.rowContextMenu(uuid)`
+- "+ New session" button — TODO `SidebarNav.newSession()`
+- Routines link — TODO `SidebarNav.openRoutines()`
+- Customize link — TODO `SidebarNav.openCustomize()`
+- Status / project / environment filters — TODO
+- Group-by control — TODO
+- Collapse toggle — TODO
+
+### Window chrome / topbar (in-app hybrid)
+
+UI reference: [`ui/window-chrome-and-tabs.md`](./ui/window-chrome-and-tabs.md).
+
+- Hamburger menu — TODO `Topbar.openHamburger()`
+- Sidebar toggle — TODO `Topbar.toggleSidebar()`
+- Back / forward arrows — TODO
+- Cowork ghost icon (topbar-alive sentinel) — TODO `Topbar.coworkGhostPresent()`
+
+### Native dialogs
+
+- File / folder picker mock — `lib/claudeai.ts:installOpenDialogMock` (:70) ✓
+- File / folder picker call inspection — `lib/claudeai.ts:getOpenDialogCalls` (:113) ✓
+- Message box / confirm — TODO `installShowMessageBoxMock`
+- Save dialog — TODO `installShowSaveDialogMock`
+
+### Menus / popovers
+
+- Compact-pill discovery — `lib/claudeai.ts:findCompactPills` (:130) ✓
+- Compact-pill open + menu read — `lib/claudeai.ts:openPill` (:162) ✓
+- Click any menuitem by text regex — `lib/claudeai.ts:clickMenuItem` (:210) ✓
+- Dismiss popover via Escape — `lib/claudeai.ts:pressEscape` (:256) ✓
+- Modal dismiss / confirm — TODO `Modal.dismiss()` / `Modal.confirm()`
+- Toast / status — TODO `waitForToast(regex)`
+- Right-click context menus (sidebar row, etc.) — TODO `openContextMenu(target)`
+
+### Settings
+
+UI reference: [`ui/settings.md`](./ui/settings.md).
+
+- Open Settings — TODO `Settings.open()`
+- Hotkey rebind — TODO `Settings.rebindHotkey(action, chord)`
+- Theme toggle — TODO `Settings.setTheme('dark' | 'light' | 'auto')`
+- Account / sign-out — TODO `Settings.signOut()`
+- Computer-use toggle (absent on Linux per S22) — TODO
+- Keep-computer-awake toggle (per S20) — TODO
+
+### Routines page
+
+UI reference: [`ui/routines-page.md`](./ui/routines-page.md).
+
+- Routines list — TODO `RoutinesPage.list()`
+- New-routine form — TODO `RoutinesPage.create(spec)`
+- Routine detail page — TODO `RoutinesPage.open(id)`
+
+### Connectors and plugins
+
+UI reference: [`ui/connectors-and-plugins.md`](./ui/connectors-and-plugins.md).
+
+- Connector picker — TODO `ConnectorPicker.open()`
+- Connector list / status — TODO
+- Plugin browser — TODO `PluginBrowser.open()`
+- Plugin install (Anthropic & Partners flow) — TODO `PluginBrowser.install(slug)`
+- Plugin manager (installed list) — TODO
+
+### Quick Entry popup
+
+UI reference: [`ui/quick-entry.md`](./ui/quick-entry.md). Note: the
+Quick Entry harness lives in [`quickentry.ts`](../../tools/test-harness/src/lib/quickentry.ts),
+not `claudeai.ts`. The `installOpenDialogMock` shape here intentionally
+mirrors `QuickEntry.installInterceptor` (quickentry.ts:86) — keep them
+aligned when extending either.
+
+- Open Quick Entry (global shortcut) — covered by `lib/quickentry.ts`
+- Compose + send — covered by `lib/quickentry.ts`
+- Closeout cases (S29–S37) — covered by `lib/quickentry.ts`
+
+### Notifications
+
+UI reference: [`ui/notifications.md`](./ui/notifications.md). libnotify
+rendering is environmental — likely stays a manual checklist rather than
+a renderer-side helper. No `claudeai.ts` coverage planned.
+
+### Tray
+
+UI reference: [`ui/tray.md`](./ui/tray.md). Tray is owned by the main
+process / native bindings, not the renderer DOM — outside the scope of
+`claudeai.ts`. Covered by separate tests (T03, S08).
+
+## Atoms inventory
+
+Stable structural patterns the lib already anchors on. See the
+discovery comment at the top of
+[`tools/test-harness/src/lib/claudeai.ts`](../../tools/test-harness/src/lib/claudeai.ts)
+for why each is shape-matched rather than class-matched.
+
+| Atom | Fingerprint | Helper |
+|---|---|---|
+| df-pill | `button[aria-label][class*="df-pill"]` | `activateTab(name)` (:44) |
+| compact-pill | `button[aria-haspopup=menu] > span.truncate.max-w-[*]` | `findCompactPills` (:130), `openPill` (:162) |
+| menu / menuitem | `[role=menu] [role=menuitem*]` | `clickMenuItem(regex)` (:210) |
+| Escape dismiss | `document.dispatchEvent(new KeyboardEvent('keydown', { key: 'Escape' }))` | `pressEscape` (:256) |
+| Electron `dialog.showOpenDialog` | main-process IPC | `installOpenDialogMock` (:70), `getOpenDialogCalls` (:113) |
+
+Atoms not yet abstracted (when a third test needs the same shape,
+promote to a top-level helper):
+
+| Atom | Probable fingerprint | Status |
+|---|---|---|
+| modal | `[role=dialog]` | not seen yet |
+| toast | `[role=status][aria-live]` | not seen yet |
+| sidebar nav row | `[class*="df-row"] [aria-label]` | seen, not abstracted |
+| chat composer | textarea / contenteditable in composer container | not abstracted |
+| right-click context menu | `[role=menu]` triggered by `contextmenu` event | not abstracted |
+| Electron `dialog.showMessageBox` | main-process IPC | not abstracted |
+| Electron `dialog.showSaveDialog` | main-process IPC | not abstracted |
+| settings panel section | route-anchored container in Settings tab | not abstracted |
+
+## See also
+
+- [`claudeai-ui-mapping-plan.md`](./claudeai-ui-mapping-plan.md) —
+  governing plan and phase rollout
+- [`automation.md`](./automation.md) — harness architecture and the
+  SIGUSR1 / runtime-attach pattern
+- [`ui/`](./ui/) — per-surface visual / behavior specs
+- [`cases/`](./cases/) — functional test specs (T## / S##)
--- a/docs/testing/claudeai-ui-mapping-plan.md
+++ b/docs/testing/claudeai-ui-mapping-plan.md
@@ -0,0 +1,415 @@
+# claude.ai UI Mapping Plan
+
+This is an executable plan for systematically mapping claude.ai's
+renderer UI into reusable test-harness abstractions. It can be picked
+up by a fresh session — start at "Phase 1" and walk down.
+
+## Where we are
+
+The harness already has one worked example: `tools/test-harness/src/lib/claudeai.ts`
+exports a `CodeTab` class plus atom helpers (`activateTab`,
+`installOpenDialogMock`, `findCompactPills`, `openPill`, `clickMenuItem`,
+`pressEscape`). `T17_folder_picker.spec.ts` is its only consumer
+today. It drives the chain `Code df-pill → env pill → Local → Select
+folder → Open folder` and asserts `dialog.showOpenDialog` fires.
+
+Discovery evidence captured by `tools/test-harness/probe.ts` (run
+against a live debugger on port 9229):
+
+- df-pill is a stable atom — exactly 3 instances on the Code-tab
+  page (`Chat`, `Cowork`, `Code`), all with `class*="df-pill"` and
+  a matching `aria-label`.
+- compact-pill is a stable atom — `button[aria-haspopup=menu]` with
+  a `span.truncate.max-w-[Npx]` child. Env pill uses 200px,
+  Select-folder pill uses 160px. Same Tailwind class signature; we
+  anchor on structure, not classes.
+- 80 `button[aria-haspopup=menu]` total on a Code-tab page; only the
+  2 with the truncate fingerprint are pills, the other 78 are sidebar
+  "More options" buttons.
+
+Pattern proven: discovery-by-shape in the lib layer, page-object
+classes per major UI surface, specs use the lib. This doc covers
+how to extend that pattern across the rest of claude.ai.
+
+## Strategy: three layers
+
+**Layer 1 — atoms.** Generic helpers around stable structural
+patterns. Live in `lib/claudeai.ts`. Built once, reused everywhere.
+Examples already there: compact-pill, df-pill, menu, dialog mock.
+
+**Layer 2 — page objects.** Domain classes per major UI surface
+(CodeTab, ChatTab, Settings, etc.). Compose atoms. Built per test
+demand — premature otherwise. CodeTab is the template.
+
+**Layer 3 — discovery tooling.** Standalone scripts that connect to
+a running debugger and let humans + agents explore the renderer.
+`probe.ts` is the seed; this doc grows it into a small CLI.
+
+The thing to avoid: comprehensively mapping the UI upfront. Even
+with a recording tool, that burns time on surfaces no test will
+exercise for months. Lazy + bookmark-the-shape wins.
+
+## Phase 1 — Tooling foundation
+
+**Goal:** turn `probe.ts` into a proper exploration CLI under
+`tools/test-harness/explore/`, with snapshot + diff capability that
+catches UI drift before tests do.
+
+**Deliverables:**
+
+- `tools/test-harness/explore/explore.ts` — entry point with
+  subcommands.
+- `tools/test-harness/explore/snapshot.ts` — capture renderer state.
+- `tools/test-harness/explore/diff.ts` — compare two snapshots.
+- `tools/test-harness/explore/find.ts` — search for elements.
+- `docs/testing/ui-snapshots/` — directory for captured snapshots
+  (gitignore the file contents but commit the directory + a README).
+- `tools/test-harness/package.json` — add scripts:
+  `npm run explore`, `npm run explore:snapshot <name>`, etc.
+
+**Subcommand spec:**
+
+```
+npx tsx explore/explore.ts                  # full snapshot to stdout
+npx tsx explore/explore.ts pills            # df-pills + compact-pills + state
+npx tsx explore/explore.ts menu             # currently-open menu structure
+npx tsx explore/explore.ts snapshot <name>  # write to docs/testing/ui-snapshots/<name>.json
+npx tsx explore/explore.ts diff <a> <b>     # diff two snapshots — flags renamed/removed
+npx tsx explore/explore.ts find <regex>     # search renderer for matching text/aria-label
+```
+
+Snapshot shape (per file):
+
+```json
+{
+  "capturedAt": "2026-05-02T17:30:00Z",
+  "claudeAiUrl": "https://claude.ai/epitaxy",
+  "appVersion": "1.1.7714",
+  "dfPills": [...],
+  "compactPills": [...],
+  "ariaLabeledButtons": [...],
+  "openMenu": null,
+  "modals": [...]
+}
+```
+
+`diff` should flag: removed elements (selector → no match), changed
+text/aria-label, new elements (informational, not a failure). Output
+is human-readable by default, with a `--json` flag for machine
+consumption.
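+
+One way the `diff` classification could look, as a sketch only. It
+assumes each snapshot has been flattened to hypothetical
+`{ selector, text }` records; `SnapshotEntry`, `diffSnapshots`, and
+`isDrift` are illustrative names, not part of the planned CLI:
+
+```typescript
+// Hypothetical flattened element record, keyed by selector.
+interface SnapshotEntry {
+	selector: string;
+	text: string;
+}
+
+interface SnapshotDiff {
+	removed: string[];
+	changed: string[];
+	added: string[];
+}
+
+function indexBySelector(
+	entries: SnapshotEntry[],
+): Map<string, string> {
+	const out = new Map<string, string>();
+	for (const e of entries) out.set(e.selector, e.text);
+	return out;
+}
+
+function diffSnapshots(
+	before: SnapshotEntry[],
+	after: SnapshotEntry[],
+): SnapshotDiff {
+	const prev = indexBySelector(before);
+	const next = indexBySelector(after);
+	const diff: SnapshotDiff = { removed: [], changed: [], added: [] };
+	prev.forEach((text, selector) => {
+		if (!next.has(selector)) diff.removed.push(selector);
+		else if (next.get(selector) !== text) diff.changed.push(selector);
+	});
+	next.forEach((_text, selector) => {
+		if (!prev.has(selector)) diff.added.push(selector);
+	});
+	return diff;
+}
+
+// Removals and changes count as drift; additions are informational.
+function isDrift(diff: SnapshotDiff): boolean {
+	return diff.removed.length > 0 || diff.changed.length > 0;
+}
+```
+
+Diffing a snapshot against itself yields three empty lists, which
+lines up with the zero-diff exit criterion for this phase.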
+
+**How to dispatch this work:**
+
+Single agent, `general-purpose`. Brief:
+
+> Build the explore CLI under `tools/test-harness/explore/`. Read
+> `tools/test-harness/probe.ts` as the seed implementation. Match the
+> existing project style (tabs, multi-line `//` why-blocks, terse).
+> Reuse `src/lib/inspector.ts` (`InspectorClient.connect(9229)`) for
+> the debugger connection. Subcommands as specified in
+> `docs/testing/claudeai-ui-mapping-plan.md` Phase 1. Do not delete
+> probe.ts — leave it as a one-off; it can be removed in a follow-up.
+> Typecheck with `npx tsc --noEmit` (no test runs). Add npm scripts
+> to `package.json`. Add a thin README in
+> `docs/testing/ui-snapshots/README.md` explaining how to capture +
+> compare snapshots.
+
+**Exit criteria:**
+
+- `npx tsx explore/explore.ts pills` against a running debugger lists
+  the 3 df-pills and 2 compact-pills (or whatever's on screen).
+- `explore/explore.ts snapshot baseline-code-tab` writes a JSON file.
+- `explore/explore.ts diff baseline-code-tab baseline-code-tab`
+  reports zero diffs.
+- Typecheck green.
+
+## Phase 2 — UI map document
+
+**Goal:** maintain a living markdown index of every reachable UI
+surface, the navigation path to reach it, and which Layer-2 class
+covers it (or `TODO` if none yet).
+
+**Deliverable:** `docs/testing/claudeai-ui-map.md`.
+
+**Initial content** (populate from what's known today, leave gaps
+marked TODO):
+
+```markdown
+# claude.ai UI Map
+
+Source of truth for "where does each UI surface live, and which
+test-harness abstraction covers it." Update as new abstractions are
+added.
+
+## Top-level routes
+
+- `/new` — chat composer page (default landing for signed-in users)
+- `/chat/<uuid>` — open chat session
+- `/epitaxy` — Code tab landing
+- `/projects/<id>` — project view
+- `/login`, `/auth/*` — pre-login routes (test harness skips here)
+
+## Surfaces by tab
+
+### Chat (df-pill "Chat", route /new)
+- Composer textarea — TODO `ChatTab.composer()`
+- "+" submenu (Add files / Add to project / Skills / Connectors / ...)
+  — TODO `ChatTab.openAttachMenu()`
+- Model selector — TODO
+- Stop / regenerate — TODO
+
+### Cowork (df-pill "Cowork")
+- Workspace list — TODO
+- Environment switcher — TODO
+
+### Code (df-pill "Code", route /epitaxy)
+- Env pill (Local / Cloud / SSH) — `lib/claudeai.ts:CodeTab.openEnvPill()` ✓
+- Select folder pill — `lib/claudeai.ts:CodeTab` (used internally by
+  `openFolderPicker`) ✓
+- Folder picker dialog — `lib/claudeai.ts:installOpenDialogMock` ✓
+- File tree (left panel) — TODO
+- Editor pane — TODO
+
+## Surfaces independent of tab
+
+### Sidebar
+- Search — TODO `SidebarNav.search()`
+- Recent conversations — TODO `SidebarNav.openRecent(idx | uuid)`
+- "More options" per row — TODO
+- New session button — TODO
+
+### Native dialogs
+- File / folder picker — `lib/claudeai.ts:installOpenDialogMock` ✓
+- Message box / confirm — TODO `installShowMessageBoxMock`
+- Save dialog — TODO `installShowSaveDialogMock`
+
+### Menus / popovers
+- Generic menu open + click — `lib/claudeai.ts:openPill` /
+  `clickMenuItem` ✓
+- Modal — TODO `Modal.dismiss() / Modal.confirm()`
+- Toast / status — TODO `waitForToast(regex)`
+
+### Settings
+- Hotkey rebind — TODO
+- Theme toggle — TODO
+- Account / sign-out — TODO
+
+## Atoms inventory
+
+Stable structural patterns the lib already anchors on:
+
+| Atom | Fingerprint | Helper |
+|---|---|---|
+| df-pill | `button[aria-label][class*="df-pill"]` | `activateTab(name)` |
+| compact-pill | `button[aria-haspopup=menu] > span.truncate.max-w-[*]` | `findCompactPills`, `openPill` |
+| menu / menuitem | `[role=menu] [role=menuitem*]` | `clickMenuItem(regex)` |
+
+Atoms not yet abstracted (when a third test needs the same shape,
+promote to a top-level helper):
+
+| Atom | Probable fingerprint | Status |
+|---|---|---|
+| modal | `[role=dialog]` | not seen yet |
+| toast | `[role=status][aria-live]` | not seen yet |
+| sidebar nav row | `[class*="df-row"] [aria-label]` | seen, not abstracted |
+| chat composer | textarea/contenteditable in composer container | not abstracted |
+```
+
+**How to dispatch this work:**
+
+A claude-code-guide or general-purpose agent can write the initial
+file. Single message:
+
+> Create `docs/testing/claudeai-ui-map.md` matching the structure in
+> `docs/testing/claudeai-ui-mapping-plan.md` Phase 2. Pull TODO
+> entries from the planned ChatTab/Settings/etc. surfaces. Mark
+> existing helpers from `tools/test-harness/src/lib/claudeai.ts`
+> with ✓ and the file:line. Don't run any tests.
+
+**Exit criteria:**
+
+- File exists with all top-level routes documented.
+- Every existing `lib/claudeai.ts` export is referenced ✓.
+- Every planned surface from this plan has a TODO entry.
+
+## Phase 3 — Page objects per test demand
+
+**Goal:** add new Layer-2 classes (ChatTab, Settings, etc.) when the
+first test needs them. Don't speculate.
+
+**Template:** `tools/test-harness/src/lib/claudeai.ts:CodeTab`. Match
+its shape:
+
+- Instance class taking `inspector: InspectorClient` in constructor.
+- Public methods are either single-step (`openEnvPill`,
+  `selectLocal`) or multi-step convenience (`openFolderPicker`).
+- Discovery by shape, not Tailwind classes.
+- Multi-line `//` why-block at top of class explaining what UI
+  surface it covers and the discovery strategy.
+- Failures throw with enough context for the spec to attach to
+  `testInfo.attach()`.
+
+**Workflow per new page object:**
+
+1. Identify which test motivates the new class. Don't build
+   speculatively.
+2. Run `explore.ts snapshot <name>` against a live debugger on the
+   target UI surface. Commit the snapshot under
+   `docs/testing/ui-snapshots/`.
+3. Inspect the snapshot — pick stable structural fingerprints, not
+   Tailwind classes.
+4. Write the class in `lib/claudeai.ts`. If the file gets large
+   (>1500 lines), split per-tab into separate files
+   (`lib/claudeai/code-tab.ts`, `lib/claudeai/chat-tab.ts`, with
+   `lib/claudeai.ts` as the barrel).
+5. Update `docs/testing/claudeai-ui-map.md` — replace the TODO with
+   the class name + ✓.
+6. Add the spec that uses it.
+7. Run typecheck. Don't run tests until everything's wired.
+
+**Don't pull out yet:**
+
+- Single-consumer methods. If only one spec calls
+  `Settings.toggleDarkMode()`, the inline implementation is fine.
+  Promote to its own method when a second consumer arrives.
+- Generic primitives that haven't repeated three times. Three is
+  the threshold for "this is an atom" — two could still be
+  coincidence.
+
+## Phase 4 — Atom promotion
+
+**Goal:** keep the atom layer (Layer 1) growing in step with the
+page-object layer (Layer 2).
+
+**Rule:** when a discovery pattern (CSS selector + JS predicate)
+appears in 3 different page objects, promote it to a top-level
+helper in `lib/claudeai.ts`.
+
+**Examples of likely promotions in the next 6 months:**
+
+- `findModal()` / `dismissModal()` — every page object that opens a
+  confirmation modal will need this.
+- `waitForToast(regex, timeout)` — error and success toasts are
+  pervasive.
+- `installShowMessageBoxMock(inspector, response)` — for native
+  confirm dialogs.
+- `clickNavRow(label)` — sidebar interactions.
+
+**Process:**
+
+1. Notice the third occurrence of the same pattern.
+2. Move the inline implementation up to a top-level export.
+3. Replace the three call sites with calls to the new export.
+4. Add an entry to the atoms inventory in `claudeai-ui-map.md`.
+
+## Phase 5 — Drift detection
+
+**Goal:** catch UI changes that break selectors *before* a sweep
+fails — fast, automatic, runs on every harness invocation.
+
+**Deliverable:** `tools/test-harness/src/runners/H05_ui_drift_check.spec.ts`.
+
+**Design:**
+
+- Loads each `*.json` file from `docs/testing/ui-snapshots/`.
+- Launches the app and attaches via the existing `launchClaude` +
+  `attachInspector` flow (NOT against an externally-running app;
+  the harness must be self-contained).
+- For each snapshot, navigates to the captured URL (if not already
+  there), then asserts each captured selector still resolves to an
+  element with the same text/aria-label.
+- Mismatches are reported as *attachments*, not hard failures: the
+  spec passes if ≥80% of snapshots match and surfaces the diffs as
+  warnings. The threshold can be tightened later. Goal is "tell me
+  what drifted," not "block CI on every minor renderer change."
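+
+The pass rule above can be sketched as a small pure function. This is
+a minimal sketch, not the spec's implementation: `SnapshotResult` and
+`summarizeDrift` are hypothetical names, and counting a snapshot as
+matched only when every captured selector resolves is an assumption.
+
+```ts
+// Hypothetical per-snapshot result; the plan does not define this shape.
+interface SnapshotResult {
+  snapshot: string; // snapshot file name
+  total: number;    // selectors captured in the snapshot
+  resolved: number; // selectors that still resolve with the same text/aria-label
+}
+
+// Assumption: a snapshot matches only if all of its selectors resolved.
+function summarizeDrift(results: SnapshotResult[], threshold = 0.8) {
+  const drifted = results.filter((r) => r.resolved < r.total);
+  // Assumption: an empty snapshot set counts as a full match.
+  const matchRatio =
+    results.length === 0
+      ? 1
+      : (results.length - drifted.length) / results.length;
+  return { matchRatio, drifted, pass: matchRatio >= threshold };
+}
+```
+
+Keeping the threshold as a parameter lets it be tightened later
+without touching the comparison logic; `drifted` is what would be
+handed to `testInfo.attach`.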
+
+**How to dispatch:**
+
+Single agent, after Phases 1–2 are done. Brief:
+
+> Create `tools/test-harness/src/runners/H05_ui_drift_check.spec.ts`
+> per the design in `docs/testing/claudeai-ui-mapping-plan.md`
+> Phase 5. Read each `*.json` under `docs/testing/ui-snapshots/`,
+> drive the renderer to the captured URL, assert each captured
+> element selector still matches. Surface diffs via
+> `testInfo.attach`. Pass if ≥80% match. Severity Should, surface
+> "claude.ai UI drift detection". Typecheck only.
+
+**Exit criteria:**
+
+- Runs cleanly against current renderer state (all snapshots match).
+- Completes in ≤200 ms per snapshot.
+- Skip with a clear message when no signed-in host config available
+  (most snapshots will be of post-login surfaces).
+
+## Recommended order
+
+1. **Phase 1 (tooling)** — ~2 hours, single agent. Foundation for
+   everything else.
+2. **Phase 2 (UI map doc)** — ~30 min, single agent. Cheap,
+   self-documenting.
+3. **Phase 3 (page objects)** — incremental, per test need.
+4. **Phase 4 (atom promotion)** — opportunistic, no scheduled work.
+5. **Phase 5 (drift detection)** — once Phase 1 is done and a few
+   snapshots exist.
+
+Phases 1 and 2 are independent and can run in parallel.
+
+## Today's starting state (reference)
+
+What's already in place as of session-end:
+
+```
+tools/test-harness/
+├── probe.ts                              # one-off probe (Phase 1 seed)
+├── src/
+│   ├── lib/
+│   │   ├── claudeai.ts                   # CodeTab + atoms (NEW today)
+│   │   ├── electron.ts                   # SIGINT cleanup, lastExitInfo
+│   │   ├── inspector.ts                  # idempotent close()
+│   │   ├── quickentry.ts                 # disk-read getStoredPosition
+│   │   └── ... (unchanged)
+│   └── runners/
+│       ├── H01_cdp_gate_canary.spec.ts          # NEW
+│       ├── H02_frame_fix_wrapper_present.spec.ts # NEW
+│       ├── H03_patch_fingerprints.spec.ts        # NEW
+│       ├── H04_cowork_daemon_lifecycle.spec.ts   # NEW
+│       ├── T17_folder_picker.spec.ts             # refactored to lib/claudeai.ts
+│       ├── _investigate_t17_urls.spec.ts         # one-off, can be deleted
+│       └── ... (T01/T03/T04, S09/S12, S29-S37)
+├── orchestrator/sweep.sh                  # multi-suite JUnit parser
+└── playwright.config.ts                   # CI-gated retries + forbidOnly
+```
+
+**Pending cleanup** (covered in a final commit, not part of this plan):
+
+- Delete `_investigate_t17_urls.spec.ts` — investigation served.
+- Delete `probe.ts` once `explore/` lands and supersedes it.
+- Update `tools/test-harness/README.md` Status table — T17 from
+  "selector-tuning pending" to passing on KDE-W.
+
+**Useful commands for a fresh session:**
+
+```sh
+cd /home/aaddrick/source/claude-desktop-debian/tools/test-harness
+
+# Typecheck (must pass after every edit)
+npx tsc --noEmit
+
+# Run a single spec
+ROW=KDE-W CLAUDE_TEST_USE_HOST_CONFIG=1 npx playwright test \
+  src/runners/T17_folder_picker.spec.ts --reporter=list
+
+# Full sweep
+ROW=KDE-W CLAUDE_TEST_USE_HOST_CONFIG=1 ./orchestrator/sweep.sh
+
+# Probe a running app (requires main process debugger enabled)
+npx tsx probe.ts
+
+# Kill stale instances before launch
+pkill -9 -f claude-desktop; pkill -9 -f mount_claude
+```
+
+**Before starting Phase 1:** open Claude Desktop, enable
+`Developer → Enable Main Process Debugger` from the menu, navigate
+to a known UI state. Then run `npx tsx probe.ts` to confirm the
+inspector is reachable on port 9229.
--- a/docs/testing/fingerprint-v7-plan.md
+++ b/docs/testing/fingerprint-v7-plan.md
@@ -0,0 +1,490 @@
+# Fingerprint v7 Plan — Contextual, Account-Portable Identification
+
+This is an executable plan for the v6 → v7 migration of the inventory
+fingerprint shape used by `tools/test-harness/explore/walker.ts` and
+`tools/test-harness/src/runners/U01_ui_visibility.spec.ts`. It can be
+picked up by a fresh session — start at "Phase 1" and walk down.
+
+## Where we are
+
+`docs/testing/ui-inventory.json` v6 (captured 2026-05-03 against app
+1.5354.0, 383 entries) records each interactive element with a
+fingerprint of this shape:
+
+```ts
+fingerprint: {
+  selector: 'button[aria-label="Search"]',
+  ariaLabel: 'Search',
+  role: null,
+  tagName: 'BUTTON',
+  textContent: null,
+}
+```
+
+`U01` resolves entries by handing the `selector` field to Playwright.
+The current scheme has three load-bearing failure modes:
+
+1. **Account-specific names baked into selectors and IDs.** Entries
+   like `root.button.awaaddrick-max` (the user's plan badge,
+   `button:has-text("AWAaddrick·Max")`) hardcode the walker-author's
+   username + plan tier. Any contributor running U01 against their
+   own auth fails this entry on selector match — the element is
+   structurally present, just labeled differently.
+2. **Instance text in selectors of "stable" entries.** Search-result
+   options, recent-conversations buttons, and pinned conversations
+   carry titles like "Fine-tuning diffusion models with reinforcement
+   learning" in their selectors. These are inherently per-account; the
+   `kind: instance` taxonomy already exists to handle them, but the
+   selector still encodes the literal title, so the v6 capture
+   couldn't actually leverage `instance` semantics.
+3. **Selector brittleness under cosmetic redesigns.** `button:has-text(...)`
+   selectors break under any label change. `button[aria-label="..."]`
+   selectors break under any aria-label rewrite (which the upstream
+   team does for accessibility audits without warning). Neither
+   strategy carries enough redundancy to recover when one signal drifts.
+
+The reconciliation doc (`ui-inventory-reconciliation.md`) flags these
+as "Walker coverage gap" and "Account-state-dependent" categories,
+and the U01 brief lists per-user inventory regeneration as "a
+separate workstream." This is that workstream.
+
+## Design goals
+
+In priority order:
+
+1. **Account-portable.** A v7 inventory walked against User A's
+   account matches against User B's renderer for any entry whose
+   target element is structurally present in both accounts. Entries
+   that genuinely don't exist in B's account fall back to the existing
+   "skip if absent" semantics (`kind: instance` + ancestor-presence
+   check).
+2. **Resilient to cosmetic drift.** Label changes, aria-label
+   rewrites, minified-class churn, and CSS rewrites must not
+   invalidate the fingerprint when the element's semantic role and
+   structural position survive.
+3. **Surface drift before failure.** Soft drift (primary aria-path
+   missed, relaxed-scope match recovered) attaches a warning to the
+   test rather than passing silently. Hard drift (no strategy
+   resolves) fails as today. The sweep gains a third state:
+   `passed-with-drift`.
+4. **Atomic cutover, not gradual migration.** v7 walker, v7 inventory
+   schema, and v7 resolver land together. The committed v6 inventory
+   gets invalidated the moment v7 walker ships; no parallel-emit
+   compatibility window, no `legacy` selector fallback in the
+   resolver. Two systems are worse than one.
+
+Non-goals:
+
+- Pixel-level visual diff. Separate concern; H05 is the right shape.
+- AI / embedding-based matching. Out of scope for a Linux repackager.
+- Behavioral fingerprints (click-and-verify-effect). Too expensive at
+  383 entries.
+
+## v7 schema
+
+```ts
+interface FingerprintV7 {
+  // Primary: accessibility-tree path from nearest landmark down to
+  // the leaf. Each step carries (role, optional name).
+  ariaPath: AriaStep[];
+
+  // The element itself. Drops `name` entirely when role + ariaPath
+  // suffice for uniqueness on the captured surface.
+  leaf: {
+    role: string;                     // "button", "link", "menuitem", ...
+    name: NameMatcher | null;
+    siblingIndex: SiblingIndex | null;
+  };
+
+  // Stability classification, assigned at capture time. Distinct
+  // from the existing `kind` field (persistent / structural / menu /
+  // instance), which captures *lifecycle* and selects the resolver's
+  // runtime strictness. See "Kind-strictness matrix" below.
+  classification: 'stable' | 'positional' | 'instance';
+}
+
+interface AriaStep {
+  role: string;          // landmark / region / grouping role
+  name: NameMatcher | null;  // optional — only included when needed
+}
+
+type NameMatcher =
+  | { kind: 'literal'; value: string }       // "Search", "Cowork"
+  | { kind: 'pattern'; regex: string };      // "\\w+·(Free|Pro|Max|...)"
+
+interface SiblingIndex {
+  role: string;          // role of siblings being indexed
+  position: number;      // 0-based
+  total: number;         // total siblings of that role at capture
+}
+```
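+
+One way a resolver might evaluate a `NameMatcher` against an
+accessible name is sketched below. The type is repeated from the
+schema so the block stands alone; `matchesName` is a hypothetical
+helper, and anchoring patterns to the whole name is an assumption
+the schema does not state.
+
+```ts
+type NameMatcher =
+  | { kind: 'literal'; value: string }
+  | { kind: 'pattern'; regex: string };
+
+// A null matcher means the name is not part of the fingerprint,
+// so any accessible name is accepted.
+function matchesName(matcher: NameMatcher | null, name: string): boolean {
+  if (matcher === null) return true;
+  if (matcher.kind === 'literal') return matcher.value === name;
+  // Assumption: patterns must match the whole name, not a substring.
+  return new RegExp(`^(?:${matcher.regex})$`, 'u').test(name);
+}
+```
+
+With whole-name anchoring, a plan-badge pattern accepts any
+`<user>·<plan>` label but rejects unrelated names that merely contain
+a plan word.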
+
+## Capture algorithm
+
+Run during walker.ts's element emission, after the surface has settled.
+
+```text
+captureFingerprint(element, surface):
+  ariaPath = walkLandmarkAncestors(element)
+    // Stop at <body>; emit a step for each role in
+    // {banner, main, navigation, region, complementary,
+    //  contentinfo, search, form, toolbar, menu, menubar,
+    //  listbox, list, dialog, tablist, tabpanel, group}
+    // with grouping role plus optional accessible name.
+
+  role = element.role
+  name = element.accessibleName
+
+  // Step 1: try uniqueness without the name.
+  matches = surface.queryAccessibleTree({
+    ariaPath,
+    leaf: { role }
+  })
+  if matches.length == 1:
+    return { ariaPath, leaf: { role, name: null, siblingIndex: null },
+             classification: 'stable' }
+
+  // Step 2: still too broad — try the name as a discriminator,
+  // shaping it if it looks instance-specific.
+  classification = classifyName(name, surface)
+  if classification != 'instance':
+    nameMatcher = (classification == 'positional')
+      ? null
+      : (looksInstanceShaped(name)
+          ? { kind: 'pattern', regex: shapeOfName(name) }
+          : { kind: 'literal', value: name })
+    matches = surface.queryAccessibleTree({
+      ariaPath, leaf: { role, name: nameMatcher }
+    })
+    if matches.length == 1:
+      return { ariaPath, leaf: { role, name: nameMatcher,
+               siblingIndex: null },
+               classification }
+
+  // Step 3: still ambiguous — fall through to sibling position.
+  siblings = element.parent.childrenWithRole(role)
+  if siblings.length > 1:
+    siblingIndex = {
+      role,
+      position: siblings.indexOf(element),
+      total: siblings.length
+    }
+    return { ariaPath, leaf: { role, name: null, siblingIndex },
+             classification: 'positional' }
+
+  // Step 4: instance — assert ≥1 match within ariaPath.
+  return { ariaPath, leaf: { role, name: null, siblingIndex: null },
+           classification: 'instance' }
+```
+
+`queryAccessibleTree` should hit `Accessibility.getFullAXTree` over
+CDP, not the DOM. The accessibility tree is what screen readers see
+and what the platform APIs query — it's the substrate that aria
+roles and accessible names actually live in.
+
+## Name classifier
+
+`classifyName(name, surface)` decides whether a name is `stable`,
+`instance`, or `positional` (no usable name). Heuristics in priority
+order:
+
+```text
+1. Empty / whitespace name      → 'positional'
+2. Element is a list-row child  → 'instance'  (handled by ancestor
+   role: option/listitem inside listbox/list)
+3. Name matches a known
+   instance-shape regex          → 'instance'  (record as pattern)
+4. Name is in the corpus of
+   "stable UI vocabulary"        → 'stable'
+5. Default                       → 'stable' but flag for review
+```
+
+### Known instance-shape regexes
+
+| Regex | Example match | Shape recorded |
+|---|---|---|
+| `/^.+·(Free\|Pro\|Max\|Team\|Enterprise)$/` | `AWAaddrick·Max` | `\w+·<PLAN>` |
+| `/^Opus \d/` `/^Sonnet \d/` `/^Haiku \d/` | `Opus 4.7Adaptive` | model-name passthrough (stable across users, just versioned) |
+| `/\d{1,3}%$/` | `Usage: plan 11%` | `Usage: plan \d+%` |
+| `/Today\|Yesterday\|\d+ (day\|hour\|minute)s? ago/` | `Today+12` | `<RELATIVE-DATE>(\+\d+)?` |
+| `/^\d+\.\d+ \w+/` | `1.5 GB` | `\d+\.\d+ \w+` |
+| `/@\w+/` | `@aaddrick` | `@\w+` (treat as user-handle) |
+| `/[A-Z][a-z]+ [A-Z][a-z]+ [a-z]/` (3+ word title-case) | `Fine-tuning diffusion models...` | treat as `'instance'`, no pattern |
+
+These regexes live in a registry that's part of the v7 capture
+config. Adding a new shape is a one-file change; the registry should
+be ordered (first match wins) so specific patterns take precedence
+over general ones.
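+
+A first-match-wins registry could look roughly like the sketch below.
+The regexes and recorded shapes are copied from three rows of the
+table; the `InstanceShape` interface, the `id` values, and
+`firstShape` are hypothetical names, not the shipped config.
+
+```ts
+interface InstanceShape {
+  id: string;           // hypothetical identifier for the shape
+  test: RegExp;         // detection regex from the table above
+  shape: string | null; // shape to record; null means "instance, no pattern"
+}
+
+// Ordered: specific shapes come before general ones.
+const registry: InstanceShape[] = [
+  { id: 'plan-badge', test: /^.+·(Free|Pro|Max|Team|Enterprise)$/, shape: '\\w+·<PLAN>' },
+  { id: 'percentage', test: /\d{1,3}%$/, shape: 'Usage: plan \\d+%' },
+  { id: 'user-handle', test: /@\w+/, shape: '@\\w+' },
+];
+
+// Returns the first registry entry whose regex matches, or null.
+function firstShape(name: string): InstanceShape | null {
+  for (const entry of registry) {
+    if (entry.test.test(name)) return entry;
+  }
+  return null;
+}
+```
+
+Because lookup stops at the first hit, adding a new shape is a matter
+of choosing where in the array it belongs.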
+
+### Building the stable UI vocabulary
+
+After the walker finishes the BFS, run a second pass:
+
+1. Collect every `accessibleName` from every captured element.
+2. Bucket by `kind` (existing taxonomy).
+3. Names appearing in 3+ entries with `kind: persistent` or
+   `kind: structural`, across 2+ surfaces, are **stable**.
+4. Names appearing in only 1 entry with `kind: persistent`/`structural`
+   are **suspect** — flag for human triage during reconciliation.
+5. Names in `kind: instance` entries are excluded from the corpus
+   entirely.
+
+Commit the resulting vocabulary list to
+`docs/testing/ui-vocabulary.json` so future walks can use it without
+re-deriving. Refresh the vocabulary on each major upstream release.
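+
+The second pass above might be sketched as follows. This is an
+illustration under assumptions: the `Entry` shape and
+`deriveVocabulary` are hypothetical, a name seen on any
+`kind: instance` entry is dropped entirely (one reading of step 5),
+and names that meet neither the stable nor the suspect rule are left
+unclassified because the plan does not say how to treat them.
+
+```ts
+type Kind = 'persistent' | 'structural' | 'menu' | 'instance';
+
+// Hypothetical flattened view of an inventory entry.
+interface Entry {
+  name: string;    // accessibleName
+  kind: Kind;
+  surface: string; // id of the surface the entry was captured on
+}
+
+function deriveVocabulary(entries: Entry[]): { stable: string[]; suspect: string[] } {
+  const instanceNames = new Set<string>();
+  const stats = new Map<string, { count: number; surfaces: Set<string> }>();
+
+  for (const e of entries) {
+    if (e.kind === 'instance') {
+      instanceNames.add(e.name);
+      continue;
+    }
+    // Only persistent / structural entries feed the corpus.
+    if (e.kind !== 'persistent' && e.kind !== 'structural') continue;
+    const s = stats.get(e.name) ?? { count: 0, surfaces: new Set<string>() };
+    s.count += 1;
+    s.surfaces.add(e.surface);
+    stats.set(e.name, s);
+  }
+
+  const stable: string[] = [];
+  const suspect: string[] = [];
+  stats.forEach((s, name) => {
+    if (instanceNames.has(name)) return; // excluded from the corpus
+    if (s.count >= 3 && s.surfaces.size >= 2) stable.push(name);
+    else if (s.count === 1) suspect.push(name);
+  });
+  return { stable: stable.sort(), suspect: suspect.sort() };
+}
+```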
+
+## Kind-strictness matrix
+
+The existing `kind` field (`persistent` / `structural` / `menu` /
+`instance`) tunes how strictly the resolver matches at runtime,
+independently from the capture-time `classification`:
+
+| kind | aria-path required | name required | siblingIndex strict | assertion |
+|---|---|---|---|---|
+| `persistent` | yes (deepest scope) | matcher must hit if present | yes | exactly 1 match |
+| `structural` | yes (or 1 step shallower) | matcher OR position | flexible (±1 ok) | exactly 1 match |
+| `menu` | yes, scoped to transient menu surface | literal text fallback ok | n/a | ≥1 match |
+| `instance` | yes (closest list/listbox ancestor) | ignored | ignored | ≥1 match within scope |
+
+Examples:
+
+- `root.button.search` → `kind: persistent`, `classification: stable`,
+  `name: null` (unique by ariaPath alone). Strict 1-match assertion.
+- `root.button.awaaddrick-max` → `kind: persistent`, `classification: stable`,
+  `name: { kind: 'pattern', regex: '\\w+·(Free|Pro|Max|...)' }`.
+  Plan-shape pattern; user-portable.
+- `root.button.search.option.untitled-conversationtoday+12` →
+  `kind: instance`, `classification: instance`, no name, scoped to
+  search-results listbox. Assert ≥1 option in listbox.
+- `root.button.fine-tuning-diffusion-models-with-reinforcement-learning` →
+  `kind: instance`, scoped to pinned-conversations list. Assert ≥1
+  button in pinned list.
+
+## Resolver / fallback chain
+
+In `findByFingerprint`:
+
+```text
+resolve(fp, kind):
+  // `kind` comes from the inventory entry; FingerprintV7 itself
+  // carries no `kind` field.
+  // Strategy 1 (primary): full aria-tree path
+  result = tryAriaTreeMatch(fp.ariaPath, fp.leaf, kind)
+  if result.matched: return { found: true, strategy: 'aria-tree' }
+
+  // Strategy 2 (relaxed aria scope): drop the deepest landmark step
+  // in the path; keep the rest. Catches the common case where the
+  // upstream team adds or removes one container layer.
+  if fp.ariaPath.length > 1:
+    result = tryAriaTreeMatch(fp.ariaPath.slice(0, -1), fp.leaf, kind)
+    if result.matched: return {
+      found: true, strategy: 'aria-tree-relaxed', drift: 'scope-shifted'
+    }
+
+  return { found: false, strategy: null }
+```
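+
+In TypeScript the chain might be shaped like the sketch below. It is
+not the `findByFingerprint` body: the `match` callback is a
+hypothetical stand-in for `tryAriaTreeMatch`, injected so the control
+flow can be exercised without a renderer, and `Resolution` is an
+assumed return type.
+
+```ts
+type Resolution =
+  | { found: true; strategy: 'aria-tree' }
+  | { found: true; strategy: 'aria-tree-relaxed'; drift: 'scope-shifted' }
+  | { found: false; strategy: null };
+
+// `match` reports whether the leaf resolves under the given path.
+function resolve<Step>(
+  ariaPath: Step[],
+  match: (path: Step[]) => boolean,
+): Resolution {
+  // Strategy 1: the full captured path.
+  if (match(ariaPath)) return { found: true, strategy: 'aria-tree' };
+
+  // Strategy 2: drop the deepest step and retry once.
+  if (ariaPath.length > 1 && match(ariaPath.slice(0, -1))) {
+    return { found: true, strategy: 'aria-tree-relaxed', drift: 'scope-shifted' };
+  }
+  return { found: false, strategy: null };
+}
+```
+
+Encoding `drift` only on the relaxed variant means a caller cannot
+read a drift value from a primary match by mistake.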
+
+When `drift` is set, attach a soft warning to the Playwright test
+without failing it:
+
+```ts
+testInfo.attach('drift-warning', {
+  body: JSON.stringify({
+    entryId: entry.id,
+    expected: fp.ariaPath,
+    matchedVia: result.strategy,
+    drift: result.drift,
+    note: 'primary aria-tree match failed; recovered via fallback. ' +
+          'Re-walk inventory before drift compounds.',
+  }, null, 2),
+  contentType: 'application/json',
+});
+```
+
+CI exposes `drift-warning` as a separate counter alongside pass /
+fail. Sweep summary becomes `383 passed, 12 with drift, 0 failed`.
+
+## Migration plan
+
+The cutover is atomic — no parallel-emit window. Walker, schema, and
+resolver all flip from v6 to v7 in the same merge. The committed v6
+inventory becomes invalid; first action after merge is a re-walk.
+
+### Phase 1 — vocabulary scaffold (pre-walker)
+
+The name classifier needs a stable-UI vocabulary corpus to
+disambiguate suspect names from known-stable copy. Build it from the
+existing v6 inventory before the walker rewrite:
+
+1. Iterate `docs/testing/ui-inventory.json` v6.
+2. Names appearing in 3+ entries with `kind: persistent` or
+   `kind: structural`, across 2+ surfaces, are **stable**.
+3. Names matching any registry regex (plan badge, model version,
+   percentage, relative date, user handle) are **instance-shaped**.
+4. Names appearing in only 1 entry, not matching a regex, not in
+   `kind: instance` — flag for human triage.
+5. Commit the resulting corpus to `docs/testing/ui-vocabulary.json`.
+
+The corpus survives the walker rewrite — it's keyed on names, not on
+v6 schema specifics.
+
+### Phase 2 — walker rewrite
+
+1. Add `Accessibility.getFullAXTree` query to walker's surface-settle
+   step (or AX subtree at target node if full-tree latency is
+   unacceptable; see open questions).
+2. Implement `walkLandmarkAncestors`, `queryAccessibleTree`,
+   `captureFingerprint` per the algorithm above.
+3. Implement the name classifier consuming `ui-vocabulary.json` and
+   the instance-shape registry.
+4. Replace v6 fingerprint emit with v7. Inventory schema header bumps
+   to `walkerVersion: 7`; v6 readers will fail loudly rather than
+   silently mis-resolve.
+5. Walker passes that fail to compute a v7 fingerprint (AX query
+   error, accessible-name-computation failure) emit the entry with
+   `classification: 'positional'` and `name: null`, scoped to its
+   ariaPath. Uncaptured fingerprints are not silently dropped — they
+   become positional entries with explicit looseness.
+
+Acceptance: a walk against the v6-author's account produces v7
+fingerprints for ≥98% of the surfaces v6 captured. ≥80% have
+`classification: 'stable'`; the rest split between `'positional'` and
+`'instance'`.
+
+#### Live-walk shakedown (post-Phase 2)
+
+The first end-to-end walks against the running renderer surfaced five
+real bugs the synthetic selfTest couldn't see. All landed in
+`walker.ts` / `name-classifier.ts` / `inspector.ts`:
+
+1. **AX-tree settle gate.** `Accessibility.enable` populates the tree
+   asynchronously; the existing `waitForStable` (1.5s ceiling on
+   DOM-mutation quiescence) returned long before claude.ai's React
+   tree mounted. Seed snapshots came back with 4 AX nodes (just the
+   `RootWebArea` + a generic shell) and the walker emitted zero
+   entries. Fix: `waitForAxTreeStable(inspector, { minNodes: 20 })`
+   polls `getFullAXTree` until two consecutive reads return the same
+   node count. Called once before the seed snapshot and once after
+   each `navigateTo` in `redrivePath`. Baked into every
+   `snapshotSurface` call too (with `minNodes: 1`) so post-click
+   reads don't race the React update.
+2. **`reloadPage` in `redrivePath`.** `navigateTo(url)` short-circuits
+   when `currentUrl === url`, but every BFS pop re-navigates to
+   `startUrl`, so any state a prior drill left behind (open dialog,
+   expanded sidebar, scrolled focus) carried into the next redrive
+   and contaminated `clickById`'s snapshot. Replaced the redrive's
+   initial `navigateTo` with `location.reload()` to discard the
+   React tree.
+3. **List-row sibling-count heuristic.** The plan's `isListRowChild`
+   check requires `option/listitem` inside `listbox/list`. claude.ai
+   exposes the marketplace dialog as `dialog > button[]` with no
+   list role at all (~80 cards) and the cowork sidebar as
+   `complementary > button[]` (72 sessions). Without a heuristic,
+   each row literal-matches by name and emits as a separate stable
+   entry. Extension: `LIST_ROW_ROLES` includes `button`,
+   `LIST_ANCESTOR_ROLES` includes `group`, AND `siblingTotal >= 15`
+   on its own qualifies regardless of ancestor role. Step 3
+   (positional fallback) also gates on `!isListRowChild` so list
+   rows fall through to step 4's `instance` collapse instead of
+   fragmenting into per-index positionals.
+4. **Two new instance shapes** in `name-classifier.ts`:
+   `cowork-session` matches status-prefixed session titles
+   (`^(Idle|Ready|Working|Awaiting input|Pull request merged|Done|Failed|Cancelled)\s`)
+   and `row-more-options` matches per-row triggers
+   (`^More options for `). Both ordered before `long-title` so the
+   pattern wins over the no-pattern instance fallback.
+5. **Lookup-failure threshold bump** 25 → 75. Sidebar virtualization
+   means the AX tree exposes a slightly different subset of cowork
+   sessions on each fresh load, so redrives accumulate consecutive
+   "no element matches" misses that don't indicate a real wedge.
+   The timeout counter (5 strikes) still gates against actual
+   renderer hangs.
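+
+The settle gate from item 1 reduces to "poll until two consecutive
+reads agree and the count clears a floor." A minimal sketch of that
+loop, assuming a hypothetical `readNodeCount` callback in place of
+the real `getFullAXTree` call; the interval and timeout defaults are
+illustrative, not the values used in `inspector.ts`:
+
+```ts
+interface StableOptions {
+  minNodes: number;    // floor that rules out the near-empty shell tree
+  intervalMs?: number; // delay between reads (assumed default)
+  timeoutMs?: number;  // overall ceiling (assumed default)
+}
+
+async function waitForStableCount(
+  readNodeCount: () => Promise<number>,
+  { minNodes, intervalMs = 100, timeoutMs = 10_000 }: StableOptions,
+): Promise<number> {
+  const deadline = Date.now() + timeoutMs;
+  let previous = -1;
+  while (Date.now() < deadline) {
+    const current = await readNodeCount();
+    // Settled: same count twice in a row, and above the floor.
+    if (current >= minNodes && current === previous) return current;
+    previous = current;
+    await new Promise<void>((done) => setTimeout(done, intervalMs));
+  }
+  throw new Error(
+    `AX tree did not settle at >= ${minNodes} nodes within ${timeoutMs}ms`,
+  );
+}
+```
+
+The `minNodes` floor is what keeps two identical reads of the 4-node
+shell from counting as settled.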
+
+Result on the AX migration's first clean walk
+(`startUrl: claude.ai/epitaxy`, account: aaddrick, app 1.5354.0):
+**90 entries** (37 persistent / 37 structural / 8 dialog / 8
+instance), 6 denylisted, 23 non-fatal lookup misses. The marketplace
+dialog folded to a single `button-instance+704`; the cowork sidebar
+to `button-instance+72`; search history to `option-instance+25`.
+Acceptance criteria from §Phase 2 met (≥98% structural overlap is
+trivially true on a re-walk; ≥80% stable hit at 75/90 ≈ 83%).
+
+### Phase 3 — resolver rewrite (U01 + walker.ts findByFingerprint)
+
+1. Replace `findByFingerprint` body with the two-strategy chain
+   (primary aria-tree, relaxed-scope fallback). Drop the v6
+   selector code path entirely.
+2. `gen-render-specs.ts` regenerates U01 from the v7 inventory; per-
+   entry test bodies consume `entry.fingerprint` (now v7-shaped)
+   directly.
+3. Add the `drift-warning` attachment shape to U01's test runner.
+4. Run U01 against the v7 inventory captured in Phase 2; baseline
+   drift counts.
+
+Acceptance: U01 against a fresh walker pass produces 0 drift
+warnings on the same account, fails 0 entries. Drift warnings only
+appear when actually-drifted elements are encountered.
+
+### Phase 4 — account-portability validation
+
+1. A second contributor walks their own v7 inventory.
+2. Diff against the v6-author's v7 inventory: structural overlap
+   should be ≥80% on `kind: persistent` and `kind: structural`
+   entries (the cross-user-stable subset).
+3. Run U01, generated from the v6-author's inventory, against the
+   second contributor's renderer (with `seedFromHost` lifting their
+   auth).
+4. Expect ≥80% pass on the cross-user-stable subset; `kind: instance`
+   entries pass via the ancestor-presence check.
+
+This is the actual goal. If account-portability holds, the inventory
+is no longer a "my-account snapshot" but a true render contract.
+
+## Open questions
+
+### Resolved
+
+- **CDP `Accessibility.getFullAXTree` cost.** Not a bottleneck. The
+  signed-in `claude.ai/epitaxy` surface returns an 817-node tree;
+  `waitForAxTreeStable` settles in <1s once Chromium has populated
+  it. The cold-load gate dominates total latency, not per-call
+  overhead. Plan B (subtree queries at the target node) is unused.
+- **Role overrides.** Confirmed working. `Skip to content` on
+  claude.ai is captured as `link` (its AX-computed role) regardless
+  of the underlying tag — a class of mismatch the v6 DOM walker
+  silently got wrong.
+- **`account-bound` kind.** Not needed. The combination of
+  shape-patterned name matchers (plan badge, cowork session) +
+  the sibling-count list heuristic + persistent collapse handles
+  every account-shaped element observed in the first clean walk.
+  Re-evaluate if a future surface exposes account state without
+  one of those signals.
+
+### Open
+
+- **Accessible-name computation parity.** Chrome's AX-tree-computed
+  name should match what Playwright's `getByRole({ name })` matches
+  at resolution time, but they're independent implementations of
+  the ARIA name-computation spec. Validate at Phase 3 acceptance
+  with a sample of 50 entries — capture vs resolve should agree.
+- **Stale vocabulary across releases.** When upstream renames
+  "Cowork" to "Workspaces" (hypothetical), the corpus needs to
+  update. Should vocabulary be re-derived automatically on each walk
+  (cheap, drift-following) or pinned to a committed version (stable,
+  manual updates)? Provisionally: re-derive on walk, commit the
+  derived corpus alongside the inventory so reconciliation can diff
+  vocabulary changes.
+
+## Cross-references
+
+- `tools/test-harness/explore/walker.ts` — capture site
+- `tools/test-harness/explore/walk-isolated.ts` — driver that runs
+  the walk inside the test-harness `launchClaude` + `seedFromHost`
+  isolation path (use this rather than `explore walk` to avoid
+  mutating the host profile)
+- `tools/test-harness/explore/gen-render-specs.ts` — emits U01 from
+  inventory; needs to consume v7 fingerprints
+- `tools/test-harness/src/runners/U01_ui_visibility.spec.ts` —
+  resolver consumer
+- `tools/test-harness/src/lib/inspector.ts` — `getAccessibleTree`
+  + `clickByBackendNodeId` for the AX-driven capture/click pair
+- `docs/testing/ui-inventory-reconciliation.md` — current v6 reconciliation
+- `docs/testing/claudeai-ui-mapping-plan.md` — broader UI mapping
+  strategy this fits inside
--- a/docs/testing/matrix.md
+++ b/docs/testing/matrix.md
@@ -0,0 +1,187 @@
+# Test Status Matrix
+
+*Last updated: 2026-04-30 · Tested against: claude-desktop 1.4758.0 (project varies per row)*
+
+This is the live dashboard. Update this file (and only this file) when status changes. For the test specs themselves, see [`cases/`](./cases/). For orientation, see [`README.md`](./README.md).
+
+Status legend: `✓` pass · `✗` fail · `🔧` mitigated · `?` untested · `-` N/A. Cells include linked issue/PR numbers when relevant.
+
+## Cross-environment matrix (T-series)
+
+| Test | KDE-W | KDE-X | GNOME | Ubu | Sway | i3 | Niri | Hypr-O | Hypr-N |
+|------|-------|-------|-------|-----|------|----|------|--------|--------|
+| [T01](./cases/launch.md#t01--app-launch) | ✓ | ? | ? | ? | ? | ? | ? | ? | ✓ |
+| [T02](./cases/launch.md#t02--doctor-health-check) | ? | ? | ? | ? | ? | ? | ? | ? | ? |
+| [T03](./cases/tray-and-window-chrome.md#t03--tray-icon-present) | ✓ | ? | ? | ? | ? | ? | ? | ? | ? |
+| [T04](./cases/tray-and-window-chrome.md#t04--window-decorations-draw) | ✓ | ? | ? | ? | ? | ? | ? | ? | ✓ |
+| [T05](./cases/shortcuts-and-input.md#t05--url-handler-opens-claudeai-links-in-app) | ? | ? | ? | ? | ✗ | ? | ? | ? | ? |
+| [T06](./cases/shortcuts-and-input.md#t06--quick-entry-global-shortcut-unfocused) | ✓ | ✓ | ✗ [#404](https://github.com/aaddrick/claude-desktop-debian/issues/404) | 🔧 [#406](https://github.com/aaddrick/claude-desktop-debian/pull/406) | ? | ? | ✗ | ? | ? |
+| [T07](./cases/tray-and-window-chrome.md#t07--in-app-topbar-renders--clickable) | ? | ? | ? | ? | ? | ? | ? | ✗ [#538](https://github.com/aaddrick/claude-desktop-debian/pull/538) | ✓ |
+| [T08](./cases/tray-and-window-chrome.md#t08--hide-to-tray-on-close) | ✓ | ? | ? | ? | ? | ? | ? | ? | ? |
+| [T09](./cases/platform-integration.md#t09--autostart-via-xdg) | ✓ | ? | ? | ? | ? | ? | ? | ? | ? |
+| [T10](./cases/platform-integration.md#t10--cowork-integration) | ? | ? | ? | ? | ? | ? | ? | ? | ? |
+| [T11](./cases/extensibility.md#t11--plugin-install-anthropic--partners) | ? | ? | ? | ? | ? | ? | ? | ? | ? |
+| [T12](./cases/platform-integration.md#t12--webgl-warn-only) | ? | ? | ? | ? | ? | ? | ? | ? | ? |
+| [T13](./cases/launch.md#t13--doctor-reports-correct-package-format) | ✗ | ✗ | ✗ | ? | ✗ | ✗ | ✗ | ? | ? |
+| [T14](./cases/launch.md#t14--multi-instance-behavior) | ? | ? | ? | ? | ? | ? | ? | ? | ? |
+| [T15](./cases/code-tab-foundations.md#t15--sign-in-completes-via-browser-handoff) | ? | ? | ? | ? | ? | ? | ? | ? | ? |
+| [T16](./cases/code-tab-foundations.md#t16--code-tab-loads) | ? | ? | ? | ? | ? | ? | ? | ? | ? |
+| [T17](./cases/code-tab-foundations.md#t17--folder-picker-opens) | ? | ? | ? | ? | ? | ? | ? | ? | ? |
+| [T18](./cases/code-tab-foundations.md#t18--drag-and-drop-files-into-prompt) | ? | ? | ? | ? | ? | ? | ? | ? | ? |
+| [T19](./cases/code-tab-foundations.md#t19--integrated-terminal) | ? | ? | ? | ? | ? | ? | ? | ? | ? |
+| [T20](./cases/code-tab-foundations.md#t20--file-pane-opens-and-saves) | ? | ? | ? | ? | ? | ? | ? | ? | ? |
+| [T21](./cases/code-tab-workflow.md#t21--dev-server-preview-pane) | ? | ? | ? | ? | ? | ? | ? | ? | ? |
+| [T22](./cases/code-tab-workflow.md#t22--pr-monitoring-via-gh) | ? | ? | ? | ? | ? | ? | ? | ? | ? |
+| [T23](./cases/code-tab-handoff.md#t23--desktop-notifications-fire) | ? | ? | ? | ? | ? | ? | ? | ? | ? |
+| [T24](./cases/code-tab-handoff.md#t24--open-in-external-editor) | ? | ? | ? | ? | ? | ? | ? | ? | ? |
+| [T25](./cases/code-tab-handoff.md#t25--show-in-files-file-manager) | ? | ? | ? | ? | ? | ? | ? | ? | ? |
+| [T26](./cases/routines.md#t26--routines-page-renders) | ? | ? | ? | ? | ? | ? | ? | ? | ? |
+| [T27](./cases/routines.md#t27--scheduled-task-fires-and-notifies) | ? | ? | ? | ? | ? | ? | ? | ? | ? |
+| [T28](./cases/routines.md#t28--scheduled-task-catch-up-after-suspend) | ? | ? | ? | ? | ? | ? | ? | ? | ? |
+| [T29](./cases/code-tab-workflow.md#t29--worktree-isolation) | ? | ? | ? | ? | ? | ? | ? | ? | ? |
+| [T30](./cases/code-tab-workflow.md#t30--auto-archive-on-pr-merge) | ? | ? | ? | ? | ? | ? | ? | ? | ? |
+| [T31](./cases/code-tab-workflow.md#t31--side-chat-opens) | ? | ? | ? | ? | ? | ? | ? | ? | ? |
+| [T32](./cases/code-tab-workflow.md#t32--slash-command-menu) | ? | ? | ? | ? | ? | ? | ? | ? | ? |
+| [T33](./cases/extensibility.md#t33--plugin-browser) | ? | ? | ? | ? | ? | ? | ? | ? | ? |
+| [T34](./cases/code-tab-handoff.md#t34--connector-oauth-round-trip) | ? | ? | ? | ? | ? | ? | ? | ? | ? |
+| [T35](./cases/extensibility.md#t35--mcp-server-config-picked-up) | ? | ? | ? | ? | ? | ? | ? | ? | ? |
+| [T36](./cases/extensibility.md#t36--hooks-fire) | ? | ? | ? | ? | ? | ? | ? | ? | ? |
+| [T37](./cases/extensibility.md#t37--claudemd-memory-loads) | ? | ? | ? | ? | ? | ? | ? | ? | ? |
+| [T38](./cases/code-tab-handoff.md#t38--continue-in-ide) | ? | ? | ? | ? | ? | ? | ? | ? | ? |
+| [T39](./cases/code-tab-handoff.md#t39--desktop-cli-handoff-graceful-na) | ? | ? | ? | ? | ? | ? | ? | ? | ? |
+
+## UI visibility (U-series)
+
+Auto-generated render attestation: each entry in [`ui-inventory.json`](./ui-inventory.json) is asserted to mount with its recorded fingerprint on each platform. The single matrix cell aggregates every inventory entry — pass means every entry rendered, fail means at least one didn't (per-entry diagnostics in the JUnit attachments). Regenerate the spec with `npm run gen:render-specs` after re-walking. See [`claudeai-ui-mapping-plan.md`](./claudeai-ui-mapping-plan.md) for the discovery + walker design.
+
+| Test | KDE-W | KDE-X | GNOME | Ubu | Sway | i3 | Niri | Hypr-O | Hypr-N |
+|------|-------|-------|-------|-----|------|----|------|--------|--------|
+| [U01](../tools/test-harness/src/runners/U01_ui_visibility.spec.ts) — UI visibility | ? | ? | ? | ? | ? | ? | ? | ? | ? |
+
+## Environment-specific status
+
+### Ubuntu / DEB
+
+| ID | Test | Status | Notes |
+|----|------|--------|-------|
+| [S01](./cases/distribution.md#s01--appimage-launches-without-manual-libfuse2t64-install) | AppImage launches without manual `libfuse2t64` install | ✗ | Workaround documented; not yet filed |
+| [S02](./cases/distribution.md#s02--xdg_current_desktopubuntu-gnome-doesnt-break-de-detection) | `XDG_CURRENT_DESKTOP=ubuntu:GNOME` doesn't break DE detection | ? | — |
+| [S03](./cases/distribution.md#s03--deb-install-via-apt-pulls-all-required-runtime-deps) | DEB install via APT pulls all required runtime deps | ? | — |
+
+### Fedora / RPM
+
+| ID | Test | Status | Notes |
+|----|------|--------|-------|
+| [S04](./cases/distribution.md#s04--rpm-install-via-dnf-pulls-all-required-runtime-deps) | RPM install via DNF pulls all required runtime deps | ? | — |
+| [S05](./cases/distribution.md#s05--doctor-recognises-dnf-installed-package-doesnt-false-flag-as-appimage) | Doctor recognises dnf-installed package (no AppImage false-flag) | ✗ | Affects KDE-W, KDE-X, GNOME, Sway, i3, Niri (T13) |
+
+### Wayland-native (wlroots)
+
+Applies to: Sway, Niri, Hypr-O, Hypr-N (any session running native Wayland rather than XWayland).
+
+| ID | Test | Status | Notes |
+|----|------|--------|-------|
+| [S06](./cases/shortcuts-and-input.md#s06--url-handler-doesnt-segfault-on-native-wayland) | URL handler doesn't segfault on native Wayland | ✗ on Sway | Captured; not yet filed |
+| [S07](./cases/shortcuts-and-input.md#s07--claude_use_wayland1-opt-in-path-works-without-crashing) | `CLAUDE_USE_WAYLAND=1` opt-in path works | ? | [#228](https://github.com/aaddrick/claude-desktop-debian/pull/228), [#232](https://github.com/aaddrick/claude-desktop-debian/pull/232) |
+
+### KDE
+
+Applies to: KDE-W, KDE-X.
+
+| ID | Test | Status | Notes |
+|----|------|--------|-------|
+| [S08](./cases/tray-and-window-chrome.md#s08--tray-icon-doesnt-duplicate-after-nativetheme-update) | Tray icon doesn't duplicate after `nativeTheme` update | 🔧 | [`tray-rebuild-race.md`](../learnings/tray-rebuild-race.md) |
+| [S09](./cases/shortcuts-and-input.md#s09--quick-window-patch-runs-only-on-kde-post-406-gate) | Quick window patch runs only on KDE | ✓ | [PR #406](https://github.com/aaddrick/claude-desktop-debian/pull/406) |
+| [S10](./cases/shortcuts-and-input.md#s10--quick-entry-popup-is-transparent-no-opaque-square-frame) | Quick Entry popup is transparent | ? | [#370](https://github.com/aaddrick/claude-desktop-debian/issues/370), [#223](https://github.com/aaddrick/claude-desktop-debian/issues/223) |
+
+### GNOME
+
+Applies to: GNOME, Ubu (Ubuntu's GNOME), and any other mutter session.
+
+| ID | Test | Status | Notes |
+|----|------|--------|-------|
+| [S11](./cases/shortcuts-and-input.md#s11--quick-entry-shortcut-fires-from-any-focus-on-wayland-mutter-xwayland-key-grab) | Quick Entry shortcut fires from any focus | ✗ on GNOME, 🔧 on Ubu | [#404](https://github.com/aaddrick/claude-desktop-debian/issues/404), [PR #406](https://github.com/aaddrick/claude-desktop-debian/pull/406) |
+| [S12](./cases/shortcuts-and-input.md#s12----enable-featuresglobalshortcutsportal-launcher-flag-wired-up-for-gnome-wayland) | `--enable-features=GlobalShortcutsPortal` wired up | ? | [#404](https://github.com/aaddrick/claude-desktop-debian/issues/404) |
+
+### Omarchy
+
+| ID | Test | Status | Notes |
+|----|------|--------|-------|
+| [S13](./cases/tray-and-window-chrome.md#s13--hybrid-topbar-shim-survives-omarchys-ozone-wayland-env-exports) | Hybrid topbar shim survives Omarchy's Ozone-Wayland env exports | ✗ | [PR #538](https://github.com/aaddrick/claude-desktop-debian/pull/538) |
+
+### Niri
+
+| ID | Test | Status | Notes |
+|----|------|--------|-------|
+| [S14](./cases/shortcuts-and-input.md#s14--global-shortcuts-via-xdg-portal-work-on-niri) | Global shortcuts via XDG portal work on Niri | ✗ | Captured; not yet filed |
+
+### AppImage
+
+| ID | Test | Status | Notes |
+|----|------|--------|-------|
+| [S15](./cases/distribution.md#s15--appimage-extraction---appimage-extract-works-as-documented-fallback) | AppImage extraction (`--appimage-extract`) works as fallback | ? | — |
+| [S16](./cases/distribution.md#s16--appimage-mount-cleans-up-on-app-exit) | AppImage mount cleans up on app exit | ? | — |
+
+### Linux launcher / `.desktop` env handling
+
+| ID | Test | Status | Notes |
+|----|------|--------|-------|
+| [S17](./cases/platform-integration.md#s17--app-launched-from-desktop-inherits-shell-path) | App launched from `.desktop` inherits shell `PATH` | ? | — |
+| [S18](./cases/platform-integration.md#s18--local-environment-editor-persists-across-reboot) | Local environment editor persists across reboot | ? | — |
+| [S19](./cases/routines.md#s19--claude_config_dir-redirects-scheduled-task-storage) | `CLAUDE_CONFIG_DIR` redirects scheduled-task storage | ? | — |
+
+### Idle-sleep / suspend
+
+| ID | Test | Status | Notes |
+|----|------|--------|-------|
+| [S20](./cases/routines.md#s20--keep-computer-awake-inhibits-idle-suspend) | "Keep computer awake" inhibits idle suspend | ? | — |
+| [S21](./cases/routines.md#s21--lid-close-still-suspends-per-os-policy) | Lid-close still suspends per OS policy | ? | — |
+
+### Computer Use (Linux: out-of-scope per upstream)
+
+| ID | Test | Status | Notes |
+|----|------|--------|-------|
+| [S22](./cases/platform-integration.md#s22--computer-use-toggle-is-absent-or-visibly-disabled-on-linux) | Computer-use toggle is absent or visibly disabled | ? | — |
+| [S23](./cases/platform-integration.md#s23--dispatch-spawned-sessions-dont-soft-lock-on-a-never-approvable-computer-use-prompt) | Dispatch sessions don't soft-lock on never-approvable prompt | ? | — |
+
+### Dispatch
+
+| ID | Test | Status | Notes |
+|----|------|--------|-------|
+| [S24](./cases/platform-integration.md#s24--dispatch-spawned-code-session-appears-with-badge-and-notification) | Dispatch-spawned Code session appears with badge + notification | ? | — |
+| [S25](./cases/platform-integration.md#s25--mobile-pairing-survives-linux-session-restart) | Mobile pairing survives Linux session restart | ? | — |
+
+### Auto-update vs. system package manager
+
+| ID | Test | Status | Notes |
+|----|------|--------|-------|
+| [S26](./cases/distribution.md#s26--auto-update-is-disabled-when-installed-via-apt--dnf) | Auto-update is disabled when installed via `apt` / `dnf` | ? | — |
+
+### Plugin / worktree storage
+
+| ID | Test | Status | Notes |
+|----|------|--------|-------|
+| [S27](./cases/extensibility.md#s27--plugins-install-per-user-not-into-system-paths) | Plugins install per-user, not into system paths | ? | — |
+| [S28](./cases/extensibility.md#s28--worktree-creation-surfaces-clear-error-on-read-only-mounts) | Worktree creation surfaces clear error on read-only mounts | ? | — |
+
+## Known failures rollup
+
+Tests currently marked `✗` in at least one environment, in investigation priority order:
+
+| Test | Failing on | Root cause |
+|------|------------|------------|
+| [T05 / S06](./cases/shortcuts-and-input.md#s06--url-handler-doesnt-segfault-on-native-wayland) | Sway | URL handler subprocess SIGSEGV on native Wayland — `Failed to connect to Wayland display` |
+| [T06 / S11](./cases/shortcuts-and-input.md#s11--quick-entry-shortcut-fires-from-any-focus-on-wayland-mutter-xwayland-key-grab) | GNOME | mutter doesn't honour XWayland-side key grab |
+| [T06 / S14](./cases/shortcuts-and-input.md#s14--global-shortcuts-via-xdg-portal-work-on-niri) | Niri | `BindShortcuts` returns error code 5 |
+| [T07 / S13](./cases/tray-and-window-chrome.md#s13--hybrid-topbar-shim-survives-omarchys-ozone-wayland-env-exports) | Hypr-O | Hybrid topbar shim partial render under Omarchy's Ozone-Wayland env exports |
+| [T13 / S05](./cases/launch.md#t13--doctor-reports-correct-package-format) | every Fedora row | Doctor only checks dpkg, false-flags every dnf install as AppImage |
+| [S01](./cases/distribution.md#s01--appimage-launches-without-manual-libfuse2t64-install) | Ubuntu 24.04 | AppImage requires `libfuse2t64`; not auto-pulled |
+
+## Notes on the current state
+
+- Most cells are `?` because every captured VM in the recent test session ran the **released** build (`dnf install` / `apt install` / current AppImage), which predates [PR #538](https://github.com/aaddrick/claude-desktop-debian/pull/538). Topbar verification (T07) on the VM rows specifically requires a branch build deployed before any cell can flip from `?`.
+- KDE-W status reflects @aaddrick's daily-driver host (Nobara KDE Plasma Wayland) where multiple features have been in continuous use.
+- Hypr-N status reflects @typedrat's report on [PR #538](https://github.com/aaddrick/claude-desktop-debian/pull/538) ("Working great on NixOS with Hyprland").
+- Hypr-O status reflects @lukedev45's broken-case report on [PR #538](https://github.com/aaddrick/claude-desktop-debian/pull/538) (partial render, root cause unconfirmed but Omarchy-env-specific — see [S13](./cases/tray-and-window-chrome.md#s13--hybrid-topbar-shim-survives-omarchys-ozone-wayland-env-exports)).
+- T13 is `✗` on every Fedora row because the dpkg false-flag is a deterministic property of the doctor script, not a per-environment failure mode. It will flip to `✓` everywhere once the doctor learns to detect rpm/dnf installs.
+- T15–T39 are derived from upstream Claude Code Desktop docs (`code.claude.com/docs/en/desktop*`) — features whose Linux behavior is officially undocumented (the docs explicitly state "Linux is not supported" for the Code tab). All cells start as `?` because the upstream Code-tab feature surface has not been systematically exercised on the patched Linux build.
--- a/docs/testing/quick-entry-closeout.md
+++ b/docs/testing/quick-entry-closeout.md
@@ -0,0 +1,225 @@
+# Quick Entry Closeout — Test Plan
+
+Focused sweep plan for closing the three open Quick Entry issues:
+
+- [#393](https://github.com/aaddrick/claude-desktop-debian/issues/393) — Submit doesn't open the main window (Ubuntu 24.04 GNOME and friends). Mitigated by [PR #406](https://github.com/aaddrick/claude-desktop-debian/pull/406)'s KDE-only gate; root cause is `BrowserWindow.isFocused()` returning stale-true on Linux Electron.
+- [#404](https://github.com/aaddrick/claude-desktop-debian/issues/404) — Shortcut doesn't fire from unfocused state on Fedora 43 GNOME. mutter no longer honours XWayland-side key grabs. Fix path: wire `--enable-features=GlobalShortcutsPortal` into the launcher on GNOME Wayland.
+- [#370](https://github.com/aaddrick/claude-desktop-debian/issues/370) — Opaque square frame behind the transparent Quick Entry popup on KDE Wayland. Bisected to Electron 41.0.4 (electron/electron#50213); upstream regression. Workarounds in `frame-fix-wrapper.js` not yet attempted.
+
+This doc is a **sweep plan**, not a test catalog. Test bodies and diagnostics live in [`cases/`](./cases/); the live status dashboard lives in [`matrix.md`](./matrix.md). The 23 `QE-*` items below map to existing `T*` / `S*` IDs where possible, and call out gaps to add as new `S*` cases.
+
+## Goal
+
+Pass all `QE-*` items in [§ Test list](#test-list) on every row in [§ Mandatory matrix](#mandatory-matrix). When that holds, all three issues are closeable (or, for #370, demonstrably blocked on upstream Electron with reproducible evidence).
+
+## Upstream design intent
+
+Read this before reading the test list. Several `QE-*` rows test things upstream does not actually promise — those tests are still valuable as black-box behavior checks, but the calibration of "expected" matters.
+
+Source for everything below: `build-reference/app-extracted/.vite/build/index.js`. Symbol names (`h1`, `ut`, `Ko`, `ynt`, `nde`, `g3A`, `u7A`) drift between releases — anchor on shape, not name.
+
+### What upstream promises
+
+- **Global shortcut** registered via Electron `globalShortcut.register()` (`:499416`). No app-focus gate — fires regardless of which app is focused.
+- **Popup is lazily created** on first shortcut press (`if (!Ko || ...) Ko = new BrowserWindow(...)` near `:515375`). The popup `BrowserWindow` is constructed on demand, not at app startup. This is what makes QE-4 (closed-to-tray) work.
+- **Position memory:** popup position persists across invocations via `an.get("quickWindowPosition")` (`:515491-515526`), keyed on monitor label + resolution. If the original monitor is gone, falls back to primary display.
+- **Submit always creates a NEW chat session** when no `chatId` is provided (`ynt(e)` at `:515546`). Quick Entry never appends to an existing conversation.
+- **Click-outside dismiss** is wired in the main process via the popup `blur` handler (`Ko.on("blur", () => g3A(null))` at `:515465`).
+- **Popup survives main-window close.** If the user closes the main window via the X button (not full quit), `!ut || ut.isDestroyed()` guards at `:515595` skip the `show()/focus()` calls; the popup itself remains functional.
+- **Window construction** sets `transparent: true`, `backgroundColor: "#00000000"`, `frame: false`, `alwaysOnTop: true` (level `"pop-up-menu"`), `skipTaskbar: true`, `resizable: false`, `show: false` (`:515375-515397`). `hasShadow: Zr` and `type: Zr ? "panel" : void 0` are macOS-only (`Zr` is `process.platform === "darwin"`).
+
+### What upstream does NOT promise
+
+- **Workspace migration.** No `setVisibleOnAllWorkspaces()`, no `moveTop()`, no `setWorkspace()` is called anywhere in the Quick Entry submit path. Whether the main window comes to the user's current workspace or stays on its own is purely a compositor decision driven by `mainWin.show()` + `mainWin.focus()`. **Linux/Wayland behavior here is not part of the upstream feature spec.**
+- **Restore from minimized.** No `restore()` call in the submit path. `show()` un-minimizes on most WMs; whether it does on a given Wayland compositor is up to that compositor.
+- **Multi-monitor placement on cursor / focused display.** Upstream uses last-saved position or primary display, never "where the user is right now."
+- **Multi-window targeting.** All `show`/`focus` calls go through `ut` (the main window). If the user has multiple windows, behavior is undefined.
+- **Popup re-creation if its `BrowserWindow` is destroyed.** Upstream does not re-construct `Ko` after destroy — it's only created on first shortcut press.
+- **Compositor-aware behavior.** Upstream has no concept of "GNOME vs KDE vs wlroots." Anywhere our patches branch on `XDG_CURRENT_DESKTOP`, that's our project compensating for compositor-specific Electron breakage, not implementing an upstream-defined contract.
+
+### Edge case: fullscreen main window
+
+`:525287-525290` reads (paraphrased): *"if `ut` exists and `ut.isFullScreen()` is true, focus `ut` and call `ide()`; else show the Quick Entry popup."* So if the main window is fullscreen when the shortcut fires, **the popup does not appear** — the shortcut focuses the main window instead. QE-1 needs this caveat.
+
+### Edge case: `h1()` is a *don't-show-if-already-focused* optimization
+
+The visibility-check function (`h1()` at `:105164-105171`) is upstream's mechanism for "don't redundantly call `show()` if the main window is already focused." Sound design. The reason it's broken on Linux is Electron's `BrowserWindow.isFocused()` returning stale-true after `hide()` on Linux backends — i.e., **the patch we apply is fixing a Linux-Electron bug, not diverging from upstream intent.** Once `isFocused()` returns honest values on Linux, the patch could be retired.
+
+## Test list
+
+Each item is a single check. Severity tier matches the existing scaffolding (Critical / Should / Smoke). Existing test ID in parentheses — `(new)` means this item should be added to [`cases/shortcuts-and-input.md`](./cases/shortcuts-and-input.md) before this sweep is reproducible by anyone else.
+
+### Shortcut activation — covers #404
+
+| ID | Severity | Step | Expected | Existing |
+|----|----------|------|----------|----------|
+| QE-1 | Smoke | App focused (not fullscreen), press shortcut | Popup appears. **Edge case from upstream design:** if main window is fullscreen, the shortcut focuses main and runs `ide()` instead of showing the popup (`:525287-525290`). Test this fullscreen variant separately as QE-1b — popup should *not* appear. | [S34](./cases/shortcuts-and-input.md#s34--quick-entry-shortcut-focuses-fullscreen-main-window-instead-of-showing-popup) (QE-1b only) |
+| QE-2 | Critical | Other app focused, press shortcut | Popup appears | [T06](./cases/shortcuts-and-input.md#t06--quick-entry-global-shortcut-unfocused), [S11](./cases/shortcuts-and-input.md#s11--quick-entry-shortcut-fires-from-any-focus-on-wayland-mutter-xwayland-key-grab) |
+| QE-3 | Critical | App on a different workspace, press shortcut | Popup appears on current workspace | [T06](./cases/shortcuts-and-input.md#t06--quick-entry-global-shortcut-unfocused) |
+| QE-4 | Critical | App closed-to-tray (no window mapped), press shortcut | Popup appears | [S29](./cases/shortcuts-and-input.md#s29--quick-entry-popup-is-created-lazily-on-first-shortcut-press-closed-to-tray-sanity) |
+| QE-5 | Should | App quit entirely, press shortcut | No popup, no error, no zombie process | [S30](./cases/shortcuts-and-input.md#s30--quick-entry-shortcut-becomes-a-no-op-after-full-app-exit) |
+| QE-6 | Should | Inspect Electron argv via `cat /proc/$(pgrep -of 'app\.asar')/cmdline \| tr '\0' ' '` (the launcher script also matches `claude-desktop`, so anchor on `app.asar` to hit the Electron process; `-o` picks the oldest match, because Electron child processes may also carry `app.asar` in their argv and multiple PIDs would break the `/proc` path). Cross-check launcher log line `Using X11 backend via XWayland (for global hotkey support)` vs `Using native Wayland backend (global hotkeys may not work)` (verbatim from `scripts/launcher-common.sh:98, 102`). | **Pre-S12 fix:** flag absent; shortcut fails on GNOME Wayland (this is the #404 repro). **Post-S12 fix:** `--enable-features=GlobalShortcutsPortal` present in argv on GNOME Wayland; QE-2 / QE-3 begin to pass. | [S12](./cases/shortcuts-and-input.md#s12----enable-featuresglobalshortcutsportal-launcher-flag-wired-up-for-gnome-wayland) |
+
+### Submit → main window — covers #393
+
+| ID | Severity | Step | Expected | Existing |
+|----|----------|------|----------|----------|
+| QE-7 | Smoke | Main window visible, submit prompt from QE | Popup closes; main window navigates to a **new** chat session (not appended to current chat — `ynt(e)` at `:515546` always creates new). | [S31](./cases/shortcuts-and-input.md#s31--quick-entry-submit-makes-the-new-chat-reachable-from-any-main-window-state) |
+| QE-8 | Critical | Main window minimized, submit | **Upstream calls `show() + focus()` only — no `restore()`.** Whether the WM un-minimizes is compositor-dependent. Test as black-box: record whether the new chat is reachable to the user (window comes back to view, OR user has to click tray/dock to see it). Both outcomes are upstream-acceptable; only "new chat created but unreachable" is a regression. | [S31](./cases/shortcuts-and-input.md#s31--quick-entry-submit-makes-the-new-chat-reachable-from-any-main-window-state) |
+| QE-9 | Critical | Main window hidden-to-tray (after [T08](./cases/tray-and-window-chrome.md#t08--hide-to-tray-on-close)), submit | Same as QE-8 — `show()` should re-map a hidden window on most compositors, but upstream doesn't guarantee it. The new chat must be reachable; the path to reach it (auto vs tray-click) is compositor-dependent. | [S31](./cases/shortcuts-and-input.md#s31--quick-entry-submit-makes-the-new-chat-reachable-from-any-main-window-state) |
+| QE-10 | Should | Main window on different workspace, submit | **Upstream has no workspace logic** (no `setVisibleOnAllWorkspaces`, no `moveTop`). Outcome is whatever the compositor decides on `show()` + `focus()`. Record observed behavior per row; do not treat any single outcome as the "right" one. | [S31](./cases/shortcuts-and-input.md#s31--quick-entry-submit-makes-the-new-chat-reachable-from-any-main-window-state) |
+| QE-11 | Critical | **GNOME-specific (Andrej730 repro):** App in tray, *not* present in Dash/dock, submit | Main window opens. The codebase doesn't reason about Dash presence — this is purely a compositor-observed state. The underlying failure is `BrowserWindow.isFocused()` returning stale-true on GNOME mutter, which causes the patched (KDE) code path's `h1() || ut.show()` chain to short-circuit before `show()`. Test as a black-box repro. | [S32](./cases/shortcuts-and-input.md#s32--quick-entry-submit-on-gnome-mutter-doesnt-trip-electron-stale-isfocused) |
+| QE-12 | Should | App in tray, *also* present in Dash/dock, submit | Main window opens (this state should not trip the stale-focus bug, but verify) | [S32](./cases/shortcuts-and-input.md#s32--quick-entry-submit-on-gnome-mutter-doesnt-trip-electron-stale-isfocused) |
+| QE-13 | Smoke | Submit prompt with 1-2 chars (`hi`) | Upstream silently drops. The actual gate is `> 2` chars at `index.js:515530, 515533` — anything 3+ submits. So `hi` (2) drops, `hel` (3) submits. Document, do not fix. | — |
+
+### Visual / window appearance — covers #370
+
+| ID | Severity | Step | Expected | Existing |
+|----|----------|------|----------|----------|
+| QE-14 | Should | Inspect popup background | Transparent; no opaque square frame visible behind the rounded UI. **Note:** upstream already sets `transparent: true` and `backgroundColor: "#00000000"` (`:515380, :515383`), so the #370 triage-bot suggestion to "try setting backgroundColor to transparent" is moot — those are already in place. The Electron 41.0.4 regression is at the CSD/shadow rendering layer below those flags, not at the option-passing layer. | [S10](./cases/shortcuts-and-input.md#s10--quick-entry-popup-is-transparent-no-opaque-square-frame) |
+| QE-15 | Smoke | Inspect popup chrome | No titlebar, no close/min/max buttons (frameless) | [`ui/quick-entry.md`](./ui/quick-entry.md) |
+| QE-16 | Smoke | Inspect popup edges | Drop shadow + rounded corners render (compositor-dependent — note where missing) | [`ui/quick-entry.md`](./ui/quick-entry.md) |
+| QE-17 | Smoke | Open popup, then click on another window | Popup stays above (always-on-top) | [`ui/quick-entry.md`](./ui/quick-entry.md) |
+| QE-18 | Should | Run `electron --version` against the running app's bundled binary; record version in matrix | If an Electron release newer than 41.0.4 ships and #370 still reproduces, the upstream-regression hypothesis is wrong | [S33](./cases/shortcuts-and-input.md#s33--quick-entry-transparent-rendering-tracked-against-bundled-electron-version) |
+
+### Patch-application sanity — regression prevention
+
+| ID | Severity | Step | Expected | Existing |
+|----|----------|------|----------|----------|
+| QE-19 | Critical | **All rows.** Extract the installed `app.asar` (`npx asar extract /usr/lib/claude-desktop/app.asar /tmp/inspect-installed`) and grep the bundled JS for the KDE gate string injected by the patch: `grep -c 'XDG_CURRENT_DESKTOP' /tmp/inspect-installed/.vite/build/index.js`. The patch (`scripts/patches/quick-window.sh:34-35, 117-118`) injects `(process.env.XDG_CURRENT_DESKTOP\|\|"").toLowerCase().includes("kde")` — that string is the runtime fingerprint. Note: the `Patched quick window` / `WARNING: No quick entry show() calls patched` lines from the patch are **build-time stdout** (not in `launcher.log`); check the build log if you built locally. | Bundled JS contains the KDE gate string (patch ran at build time). The patch ships in every build; the KDE-vs-non-KDE branch is decided at runtime by the env-var check. **Runtime gate effectiveness is verified implicitly by QE-7 through QE-12 passing on KDE and the unpatched-equivalent path running on non-KDE.** | [S09](./cases/shortcuts-and-input.md#s09--quick-window-patch-runs-only-on-kde-post-406-gate) |
+
+### Input behavior smoke — catches collateral breakage
+
+| ID | Severity | Step | Expected | Existing |
+|----|----------|------|----------|----------|
+| QE-21 | Smoke | In popup: `Esc` dismisses; click-outside dismisses; `Shift+Enter` inserts newline; `Enter` submits | All four behave as labelled. **Implementation notes for diagnostics:** click-outside is wired in the **main process** via the popup's `blur` handler (`:515465`). `Esc` / `Enter` / `Shift+Enter` are **renderer-side** (not visible in `index.js`); they go through IPC to `requestDismiss()` (`:515409`) and `requestDismissWithPayload()`. If a dismiss key fails, isolate which side is broken before reporting. | [`ui/quick-entry.md`](./ui/quick-entry.md) |
+
+### Popup placement & lifecycle — upstream contract sanity
+
+These verify upstream-promised behaviors that aren't directly broken by #393/#404/#370 but live in the same surface area. Failures here would indicate a separate regression — file a new issue rather than folding it into the close-out trio.
+
+| ID | Severity | Step | Expected | Existing |
+|----|----------|------|----------|----------|
+| QE-22 | Should | Invoke Quick Entry. Note popup position. Dismiss (Esc). Quit Claude Desktop entirely (`pkill -f app.asar` after closing the main window, or via tray → Quit). Re-launch. Invoke Quick Entry. | Popup reappears at the same monitor + position as before the restart. Upstream persists position via `an.get("quickWindowPosition")` (`:515491-515526`), keyed on monitor label + resolution. Position must survive a full app restart, not just dismiss/re-invoke. | [S35](./cases/shortcuts-and-input.md#s35--quick-entry-popup-position-is-persisted-across-invocations-and-across-app-restarts) |
+| QE-23 | Smoke | **Multi-monitor required.** With an external monitor connected, invoke Quick Entry on the external monitor — let the position be saved (trigger QE-22's persistence path). Disconnect the external monitor (libvirt: `virsh detach-device` for the second display, or unplug the host monitor passing through). Invoke Quick Entry. | Popup falls back to the primary display via `cHn()` (`:515502`). Does **not** appear at off-screen coordinates. Skip this row in single-monitor VMs. | [S36](./cases/shortcuts-and-input.md#s36--quick-entry-popup-falls-back-to-primary-display-when-saved-monitor-is-gone) |
+| QE-24 | Should | Launch app, focus main window, then **destroy** the main window without quitting the app. On this project the X button hide-to-tray override means the standard close path won't destroy `ut`; force the destroy via a) DevTools console (`Cmd+Opt+I` / `Ctrl+Shift+I` → `require('electron').remote.getCurrentWindow().destroy()` if exposed), or b) accept that this case is unreachable on Linux without a code change and skip. After destroy, invoke Quick Entry, type, submit. | Popup remains functional (lazy-recreation on shortcut press; the `!ut \|\| ut.isDestroyed()` guard at `:515595` skips the show/focus block but does not crash). New chat creation may not have a window to surface in — if app remains running with no main window, this is the "popup outlives main" path upstream guarantees. **If unreachable on Linux, mark this row N/A and document why.** | [S37](./cases/shortcuts-and-input.md#s37--quick-entry-popup-remains-functional-after-main-window-destroy) |
+
+## Mandatory matrix
+
+The five rows below are the must-pass set to close all three issues. Display server is the **session selected at login** — KDE and GNOME both let you choose Wayland vs Xorg from the greeter.
+
+| Row | Distro | DE | Display server | Closes / verifies | Reporter |
+|-----|--------|----|--------------:|-------------------|----------|
+| **GNOME-W** | Fedora 43 Workstation | GNOME 49.x | Wayland | #404 (S11/S12), #393 (QE-11/QE-12) | @gianluca-peri (#404), @Andrej730 (#393 root cause) |
+| **Ubu-W** | Ubuntu 24.04 LTS | GNOME (Ubuntu) | Wayland | #393 close-out (post-#406 gate). Also catches the `XDG_CURRENT_DESKTOP=ubuntu:GNOME` quirk (S02) | @Andrej730 |
+| **KDE-W** | Fedora 43 KDE *or* Nobara 43 KDE | Plasma 6 | Wayland | #370 (S10), QE-19 patch sanity, daily-driver regression baseline | @noctuum (#370), @aaddrick |
+| **GNOME-X** | Ubuntu 24.04 (GNOME on Xorg session at greeter) | GNOME | Xorg | Differentiates whether #404 is mutter-as-compositor or mutter-XWayland-grabs specifically. **Note:** Fedora 43 GNOME may not ship an X11 session anymore (GNOME 49 deprecation); use Ubuntu's GNOME-on-Xorg session instead. | — |
+| **KDE-X** | Fedora 43 KDE (Plasma X11 session at greeter) | Plasma 6 | Xorg | Catches kwin-X11 specifics; regression baseline for the historic working path | — |
+
+## Strongly recommended
+
+These rows catch generalization gaps but do not block close-out.
+
+| Row | Distro | DE | Display server | Why |
+|-----|--------|----|--------------:|------|
+| **COSMIC** | Pop!_OS 24.04 (COSMIC alpha) | COSMIC | Wayland | @davidsmorais reported #393 there; not covered by KDE or GNOME branches |
+| **Ubu-X** | Ubuntu 24.04 (GNOME on Xorg) | GNOME | Xorg | Already counted under GNOME-X above. Listed here too because the Ubuntu install base is large — counts as its own row in the dashboard |
+
+## Optional
+
+Tracked under different bugs ([S06](./cases/shortcuts-and-input.md#s06--url-handler-doesnt-segfault-on-native-wayland), [S14](./cases/shortcuts-and-input.md#s14--global-shortcuts-via-xdg-portal-work-on-niri)) — skip unless closing those in the same sweep.
+
+| Row | DE | Tracked under |
+|-----|----|--------------:|
+| Sway | wlroots | S06 |
+| Niri | wlroots | S14 |
+| Hypr-N (Omarchy) | wlroots | per @typedrat |
+| Hypr-O | Hyprland Xorg | per @typedrat |
+| i3 | Xorg | matrix |
+
+## VM inventory
+
+Existing host: `~/vms/` (libvirt, qcow2 images on a separate root-owned dir). Per-VM creation scripts in `~/vms/scripts/`. Per-VM test protocol in [`~/vms/README.md`](file:///home/aaddrick/vms/README.md).
+
+### Have
+
+| Row | VM image | Status |
+|-----|----------|--------|
+| GNOME-W | `claude-fedora43-gnome.qcow2` | Ready |
+| Ubu-W | `claude-ubuntu-2404.qcow2` | Ready |
+| KDE-W | `claude-fedora43-kde.qcow2` | Ready (Nobara KDE on the bare-metal host is the alternative) |
+| GNOME-X | `claude-ubuntu-2404.qcow2` | Ready (use the GNOME-on-Xorg session at the greeter — same VM as Ubu-W) |
+| KDE-X | `claude-fedora43-kde.qcow2` | Ready (use the Plasma X11 session at the greeter — same VM as KDE-W) |
+
+### Need to add for full mandatory + recommended coverage
+
+| Row | What | Why |
+|-----|------|-----|
+| **COSMIC** | Pop!_OS 24.04 (COSMIC alpha) ISO + `~/vms/scripts/create-popos-cosmic.sh` | @davidsmorais's #393 environment; otherwise unrepresented |
+
+### Need to add only if closing optional rows in the same sweep
+
+| Row | What | Use existing | Why |
+|-----|------|--------------|-----|
+| Niri | Fedora-Niri-Live ISO + `~/vms/scripts/create-fedora-niri.sh` | — | S14 (`BindShortcuts` error 5) |
+| Hypr-N | Possibly already covered by `claude-omarchy` | `claude-omarchy.qcow2` | Omarchy is a Hypr-N variant; may not exercise stock Hyprland |
+| Sway | `claude-fedora43-sway.qcow2` | Existing | S06 URL handler segfault |
+| i3 | `claude-fedora43-i3.qcow2` | Existing | Coverage only |
+
+## Minimum viable kill-set
+
+If the goal is the smallest pass that justifies closing all three issues:
+
+- **GNOME-W** — must pass QE-2/3/4/6/7/8/9/11 → closes #404, half of #393.
+- **Ubu-W** — must pass QE-7/8/9/11 → closes other half of #393.
+- **KDE-W** — must pass QE-7/8/9 + QE-14 + QE-19 → closes #370 (or punts upstream with QE-18 evidence) and confirms the gated patch path still works.
+
+(QE-20 has been folded into QE-19 — the patch ships in every build, so a single bundled-JS check covers both KDE and non-KDE rows.)
+
+Three VMs, ~21 items per row, one full sweep ≈ 90 minutes if the visual checks are batched.
+
+## Per-row pass criteria
+
+| Issue | Closeable when |
+|-------|----------------|
+| #393 | QE-7 through QE-12 pass on **GNOME-W**, **Ubu-W**, and **KDE-W**. QE-19 confirms the patch was applied at build (KDE gate string present). If QE-11 fails on GNOME-W, the KDE-only gate is preserved as a permanent fix; otherwise the patch can be widened. |
+| #404 | QE-2 and QE-3 pass on **GNOME-W**. QE-6 confirms the launcher actually appended `--enable-features=GlobalShortcutsPortal` on GNOME Wayland (S12). |
+| #370 | QE-14 passes on **KDE-W**. **OR** QE-18 records an Electron version > 41.0.4 in the bundled binary and QE-14 still fails — at that point the upstream-regression hypothesis is wrong and we re-investigate. |
+
+## Scaffold integration
+
+This sweep is fully wired into the existing test scaffold. The `QE-*` items in [§ Test list](#test-list) map onto formal `S##` test cases in [`cases/shortcuts-and-input.md`](./cases/shortcuts-and-input.md):
+
+| Case | Title | Backs |
+|------|-------|-------|
+| [S29](./cases/shortcuts-and-input.md#s29--quick-entry-popup-is-created-lazily-on-first-shortcut-press-closed-to-tray-sanity) | Popup created lazily on first shortcut press (closed-to-tray sanity) | QE-4 |
+| [S30](./cases/shortcuts-and-input.md#s30--quick-entry-shortcut-becomes-a-no-op-after-full-app-exit) | Shortcut becomes no-op after full app exit | QE-5 |
+| [S31](./cases/shortcuts-and-input.md#s31--quick-entry-submit-makes-the-new-chat-reachable-from-any-main-window-state) | Submit makes the new chat reachable from any main-window state | QE-7 through QE-10 |
+| [S32](./cases/shortcuts-and-input.md#s32--quick-entry-submit-on-gnome-mutter-doesnt-trip-electron-stale-isfocused) | Submit on GNOME mutter doesn't trip Electron stale-`isFocused()` | QE-11, QE-12 |
+| [S33](./cases/shortcuts-and-input.md#s33--quick-entry-transparent-rendering-tracked-against-bundled-electron-version) | Transparent rendering tracked against bundled Electron version | QE-18 |
+| [S34](./cases/shortcuts-and-input.md#s34--quick-entry-shortcut-focuses-fullscreen-main-window-instead-of-showing-popup) | Shortcut focuses fullscreen main instead of showing popup | QE-1b |
+| [S35](./cases/shortcuts-and-input.md#s35--quick-entry-popup-position-is-persisted-across-invocations-and-across-app-restarts) | Popup position persisted across invocations and across app restarts | QE-22 |
+| [S36](./cases/shortcuts-and-input.md#s36--quick-entry-popup-falls-back-to-primary-display-when-saved-monitor-is-gone) | Popup falls back to primary display when saved monitor is gone | QE-23 |
+| [S37](./cases/shortcuts-and-input.md#s37--quick-entry-popup-remains-functional-after-main-window-destroy) | Popup remains functional after main window destroy | QE-24 |
+
+UI-element-level checks for QE-14 through QE-17 and QE-21 live in [`ui/quick-entry.md`](./ui/quick-entry.md), which has been refined against the upstream evidence captured in [§ Upstream design intent](#upstream-design-intent).
+
+(QE-13, QE-21 don't need their own S-IDs — they're documentation items / already covered by `ui/quick-entry.md`.)
+
+## Sweep mechanics
+
+Per-row procedure (one full pass):
+
+1. Boot VM. Confirm session at greeter matches the row (Wayland vs Xorg, correct DE).
+2. Install the latest build:
+   - DEB: `sudo apt install ./claude-desktop_*.deb`
+   - RPM: `sudo dnf install ./claude-desktop-*.rpm`
+3. Capture environment baseline: `XDG_SESSION_TYPE`, `XDG_CURRENT_DESKTOP`, `gnome-shell --version` or `kwin --version`, `electron --version` (for QE-18).
+4. Launch app. Wait for main window. Run QE-21 input smoke first to catch obvious breakage early.
+5. Run shortcut tests (QE-1 → QE-6) in order. After each run, capture `~/.cache/claude-desktop-debian/launcher.log` and the argv reported by `pgrep -af claude-desktop`.
+6. Run submit tests (QE-7 → QE-13). For each window-state precondition, set the state, then trigger Quick Entry, then submit.
+7. Run visual checks (QE-14 → QE-18). Screenshot QE-14 to attach to #370 if still failing.
+8. Run patch sanity (QE-19, which now also covers the former QE-20).
+9. Update [`matrix.md`](./matrix.md) status cells. Save logs under a row-tagged subdirectory: `~/vms/collected/<row>-<date>/`.
+
+For the deeper #393 bisect (isolating which half of PR #390 regresses GNOME), see the two-variant build instructions in [`~/vms/README.md`](file:///home/aaddrick/vms/README.md) — build a blur-only and a vis-only variant, run QE-7 through QE-11 on each on **Ubu-W** and **GNOME-W**, gate the offending half rather than the whole patch.
--- a/docs/testing/runbook.md
+++ b/docs/testing/runbook.md
@@ -0,0 +1,343 @@
+# Testing Runbook
+
+*Last updated: 2026-05-03*
+
+How to run a test sweep, capture diagnostics, file failures, and update [`matrix.md`](./matrix.md). For the test specs themselves, see [`cases/`](./cases/) and [`ui/`](./ui/). For the automation harness, see [`automation.md`](./automation.md) and [`tools/test-harness/`](../../tools/test-harness/). For the grounding sweep workflow (verify case docs against the live build), see [Grounding sweep](#grounding-sweep) below.
+
+## When to sweep
+
+| Trigger | Scope | Rows |
+|---------|-------|------|
+| Release tag (`vX.Y.Z+claude...`) | Smoke set | KDE-W + Hypr-N (or Sway) |
+| Release tag, monthly | Smoke + Critical | All active rows |
+| Upstream Claude Desktop bump | Smoke set + [grounding sweep](#grounding-sweep) | KDE-W + one wlroots row |
+| PR touching `scripts/patches/*.sh` | Tests in the affected surface (use surface tags in cases files) | KDE-W minimum |
+| Bug report citing an env | The relevant test on the reporter's row | Just that row |
+
+## Setup: VM matrix
+
+Each non-host row in [`matrix.md`](./matrix.md) is a QEMU/KVM guest. Standard config:
+
+- 4 GB RAM, 2 vCPU minimum
+- virtio-gpu **with** `gl=on` (3D acceleration). On hybrid GPU hosts, pin `rendernode=/dev/dri/renderD129` (AMD); avoid renderD128 (NVIDIA, EGL init fails on aaddrick's laptop)
+- 32 GB qcow2 disk
+- Bridged networking
+- Virgil 3D enabled where possible (helps WebGL detection in T12)
+
+ISOs / images per row:
+
+| Row | Source |
+|-----|--------|
+| Fedora 43 (KDE-W, KDE-X, GNOME, Sway, i3, Niri) | https://fedoraproject.org/spins/ for KDE/GNOME, https://fedoraproject.org/sericea/ for Sway, manual install for i3/Niri |
+| Ubuntu 24.04 (Ubu) | https://ubuntu.com/download/desktop |
+| OmarchyOS (Hypr-O) | https://omarchy.org |
+| NixOS (Hypr-N) | https://nixos.org/download with Hyprland module |
+
+For the host (KDE-W), test against Nobara directly — no VM needed.
+
+## Setup: building the install candidate
+
+```bash
+# Build from the branch under test
+./build.sh --build appimage --clean no
+./build.sh --build deb --clean no
+./build.sh --build rpm --clean no
+
+# Or pull from CI artifacts for a tagged release
+gh run download <RUN_ID> -n claude-desktop-deb-amd64
+gh run download <RUN_ID> -n claude-desktop-rpm-amd64
+gh run download <RUN_ID> -n claude-desktop-appimage-amd64
+```
+
+Drop the resulting `.deb` / `.rpm` / `.AppImage` into a shared folder mounted into each guest, or `scp` per-guest.
+
+## Running a sweep: the standard loop
+
+For each test in scope:
+
+1. **Read the test spec** in `cases/<surface>.md` (or `ui/<surface>.md` for UI checklists). Note the `Severity`, `Steps`, and `Expected` sections.
+2. **Execute the steps** as described.
+3. **Compare against Expected.** Mark internally as `✓`, `✗`, `🔧`, `?` (untested, e.g. you couldn't run it for env reasons), or `-` (N/A).
+4. **On `✗`**: capture the diagnostics from the test's `Diagnostics on failure` block (see [diagnostic capture](#diagnostic-capture) below). File an issue if one isn't already linked.
+5. **Update [`matrix.md`](./matrix.md)** in a single PR per row per sweep, titled `test: <ROW> sweep YYYY-MM-DD`.
+
+## Diagnostic capture
+
+Standard captures referenced from test `Diagnostics on failure` blocks:
+
+### `--doctor` output
+
+```bash
+claude-desktop --doctor 2>&1 | tee /tmp/doctor.txt
+```
+
+Or for AppImage:
+
+```bash
+./claude-desktop-*.AppImage --doctor 2>&1 | tee /tmp/doctor.txt
+```
+
+### Launcher log
+
+```bash
+cat ~/.cache/claude-desktop-debian/launcher.log
+```
+
+Truncate and re-run if the file is stale:
+
+```bash
+: > ~/.cache/claude-desktop-debian/launcher.log
+claude-desktop 2>&1 | tee -a ~/.cache/claude-desktop-debian/launcher.log
+```
+
+### Session env
+
+```bash
+echo "XDG_SESSION_TYPE=$XDG_SESSION_TYPE"
+echo "XDG_CURRENT_DESKTOP=$XDG_CURRENT_DESKTOP"
+echo "WAYLAND_DISPLAY=$WAYLAND_DISPLAY"
+echo "DISPLAY=$DISPLAY"
+echo "GDK_BACKEND=$GDK_BACKEND"
+echo "QT_QPA_PLATFORM=$QT_QPA_PLATFORM"
+echo "OZONE_PLATFORM=$OZONE_PLATFORM"
+echo "ELECTRON_OZONE_PLATFORM_HINT=$ELECTRON_OZONE_PLATFORM_HINT"
+```
+
+### Tray / DBus state (KDE)
+
+```bash
+# List registered tray icons
+gdbus call --session --dest=org.kde.StatusNotifierWatcher \
+  --object-path=/StatusNotifierWatcher \
+  --method=org.freedesktop.DBus.Properties.Get \
+  org.kde.StatusNotifierWatcher RegisteredStatusNotifierItems
+
+# Find which process owns a connection
+gdbus call --session --dest=org.freedesktop.DBus \
+  --object-path=/org/freedesktop/DBus \
+  --method=org.freedesktop.DBus.GetConnectionUnixProcessID ":1.XXXX"
+```
+
+### Portal availability (Wayland)
+
+```bash
+systemctl --user status xdg-desktop-portal
+busctl --user tree org.freedesktop.portal.Desktop
+```
+
+### Suspend inhibitors
+
+```bash
+systemd-inhibit --list
+```
+
+### App version
+
+```bash
+claude-desktop --version
+gh variable get CLAUDE_DESKTOP_VERSION
+gh variable get REPO_VERSION
+```
+
+Always include the upstream version + project version in the issue body and the matrix-update commit message.
+
+## Filing failures
+
+Issue title format: `[<row>] <T## or S##>: <one-line symptom>`
+
+Issue body template:
+
+```markdown
+**Test:** [T17 — Folder picker opens](./docs/testing/cases/code-tab-foundations.md#t17--folder-picker-opens)
+**Environment:** GNOME (Fedora 43, Wayland)
+**Project version:** v1.3.23+claude1.4758.0
+**Upstream version:** 1.4758.0
+
+## Steps
+<paste from test spec>
+
+## Expected
+<paste from test spec>
+
+## Actual
+<observed behavior>
+
+## Diagnostics
+<--doctor output, launcher log, session env, anything else from the test's Diagnostics block>
+
+## Notes
+<any hypotheses, related PRs, recent regressions>
+```
+
+Link the issue back into [`matrix.md`](./matrix.md) on the affected cell using the standard format: `✗ #NNN`.
+
+## Updating the matrix
+
+One PR per sweep per row. Bundle every status change for that row into a single commit so the matrix history reads as a sequence of sweep events, not individual cell flips.
+
+Commit message template:
+
+```
+test(<row>): sweep <YYYY-MM-DD> — <project_version>+claude<upstream_version>
+
+- T01 ? → ✓
+- T03 ? → ✓
+- T05 ? → ✗ (filed #NNN)
+- T17 ? → ✓
+- ...
+```
+
+If the same sweep also turned up new tests worth adding, those go in a separate commit before the status update so the diff stays focused.
+
+## Severity guidance for new tests
+
+When adding a test to `cases/` or `ui/`, pick severity using these heuristics:
+
+| Tier | Pick when | Example |
+|------|-----------|---------|
+| Smoke | First-launch experience; if this fails the app is unusable for normal users | T01 (app launch), T03 (tray), T16 (Code tab loads) |
+| Critical | Feature is documented in upstream docs **and** breaks core workflows when broken | T22 (PR monitoring), T34 (connector OAuth), T17 (folder picker) |
+| Should | Quality-of-life or documented edge case; users hit it but have a workaround | T28 (catch-up after suspend), S26 (auto-update vs apt) |
+| Could | Niche, env-specific, or graceful-degradation checks | T39 (`/desktop` CLI N/A), S22 (computer-use toggle absent on Linux) |
+
+When in doubt, file as **Should**. Smoke and Critical mean release gates — be conservative about adding gates.
+
+## Adding a new test
+
+1. Pick the right surface file in `cases/` (or create one with prior buy-in if no existing surface fits — don't sprinkle new files lightly).
+2. Use the next free ID: highest `T##` + 1 for cross-env, highest `S##` + 1 for env-specific. Don't reuse retired IDs.
+3. Follow the standard structure: `**Severity:**`, `**Surface:**`, `**Applies to:**`, `**Steps:**`, `**Expected:**`, `**Diagnostics on failure:**`, `**References:**`.
+4. Add the row to [`matrix.md`](./matrix.md) with all-`?` initial state.
+5. Mention the new test in the PR description so reviewers know to read the spec.
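+
+The "highest ID + 1" rule in step 2 can be sketched as a small shell
+helper. This is a minimal sketch, not part of the harness: `next_id` is
+a hypothetical name, and it assumes IDs appear in the case files as
+whole words such as `T17` or `S37` and that retired IDs are still
+mentioned somewhere in the files passed in.
+
+```sh
+# Hypothetical helper: print the next free ID for a prefix (T or S).
+# Assumes every ID ever used, including retired ones, still appears
+# in the given files as a standalone word like T17.
+next_id() (
+  prefix=$1
+  shift
+  # -o: print only the match, -h: no filenames, -w: whole words only.
+  max=$(grep -ohwE "${prefix}[0-9]+" "$@" |
+    sed "s/^${prefix}0*//" | sort -n | tail -n 1)
+  printf '%s%02d\n' "$prefix" "$(( ${max:-0} + 1 ))"
+)
+```
+
+Usage would look like `next_id T cases/*.md`.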
+
+For UI checklist additions, append rows to the relevant `ui/<surface>.md` table. UI rows don't need `T##` / `S##` IDs — the surface file + element name is the identity.
+
+## Automated runs
+
+The harness at [`tools/test-harness/`](../../tools/test-harness/) drives any
+test with a `runner:` field. As of 2026-04-30, that's T01, T03, T04, T17.
+
+### Invoking a sweep
+
+```sh
+cd tools/test-harness
+npm install                       # first time only
+ROW=KDE-W ./orchestrator/sweep.sh
+```
+
+Output:
+
+- `results/results-${ROW}-${DATE}/junit.xml` — the JUnit summary (one
+  testsuite per `.spec.ts` file, with the test's annotations preserved as
+  metadata).
+- `results/results-${ROW}-${DATE}/test-output/<test>/` — per-test
+  attachments (screenshots, launcher log, session env, frame extents,
+  click-attempt diagnostics, etc.). Captured on every run, not just on
+  failure (Decision 7).
+- `results/results-${ROW}-${DATE}/html/` — Playwright's HTML report.
+- `results/results-${ROW}-${DATE}.tar.zst` — bundled artifact for
+  off-machine inspection (when `zstd` is available).
+
+`sweep.sh` prints a summary line at the end:
+
+```
+summary: tests=4 failures=0 errors=0 skipped=1
+```
+
+### Translating results to the matrix
+
+JUnit `<failure>` → `✗`, `<error>` (harness broke) → `?`, `<skipped>` →
+`-` (when intentionally not applicable) or stays `?` (when the test
+couldn't reach an assertion — common case for renderer tests that need
+sign-in or selectors that haven't been tuned). For now this mapping is
+manual: open `junit.xml`, update `matrix.md` cells, commit. A
+`render-matrix.sh` to do this automatically is on the to-do list.
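+
+Until `render-matrix.sh` exists, the per-element counts can be pulled
+out of `junit.xml` by hand. The following is a rough sketch only:
+`junit_counts` is a hypothetical helper, not something the harness
+ships, and it counts element openings textually rather than parsing
+the XML.
+
+```sh
+# Hypothetical helper: count <failure>, <error>, and <skipped>
+# elements in a JUnit file by matching their opening tags.
+junit_counts() (
+  file=$1
+  for tag in failure error skipped; do
+    # grep -o emits one line per match, so wc -l counts elements.
+    n=$(grep -o "<${tag}[ />]" "$file" | wc -l | tr -d ' ')
+    printf '%s=%s\n' "$tag" "$n"
+  done
+)
+```
+
+Each `failure` maps to `✗` and each `error` to `?`; `skipped` entries
+still need a human decision between `-` and `?`.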
+
+### Coexistence with manual tests
+
+Tests without a `runner:` continue to flow through the manual loop above.
+The matrix doesn't distinguish automated from manual cells — a `✓` is a
+`✓` regardless of how it was produced. The `runner:` field on each case
+makes the source-of-truth explicit per-test.
+
+### Path through the CDP auth gate (why this works)
+
+The shipped Electron exits if `--remote-debugging-port` is on argv
+without a valid `CLAUDE_CDP_AUTH` token. Both `_electron.launch()` and
+`chromium.connectOverCDP()` inject that flag. The harness sidesteps the
+gate by spawning Electron clean and attaching the Node inspector via
+`SIGUSR1` at runtime — same code path as `Developer → Enable Main
+Process Debugger`. From there, main-process JS evaluation reaches the
+renderer through `webContents.executeJavaScript()`. Full writeup:
+[`automation.md`](./automation.md#the-cdp-auth-gate-and-the-runtime-attach-workaround-that-beats-it).
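+
+The runtime attach can be sketched from a shell. This is an
+illustration under stated assumptions, not the harness's own code:
+`port_listening` is a hypothetical helper, and 9229 is assumed to be
+the inspector port because it is Node's default and the port used
+elsewhere in these docs.
+
+```sh
+# Hypothetical helper: succeed if `ss -tln`-style text on stdin shows
+# a listener on the given TCP port.
+port_listening() (
+  port=$1
+  grep -Eq "[:.]${port}[[:space:]]"
+)
+
+# Usage sketch (the pid lookup is left to the caller):
+#   kill -USR1 "$pid"    # ask the main process to open the inspector
+#   ss -tln | port_listening 9229 && echo "inspector is up"
+```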
+
+### Wayland-mode sweep
+
+Default backend is X11-via-XWayland (matches `launcher-common.sh`'s
+default). To sweep the suite under native Wayland, set
+`CLAUDE_HARNESS_USE_WAYLAND=1`:
+
+```sh
+CLAUDE_HARNESS_USE_WAYLAND=1 ROW=KDE-W ./orchestrator/sweep.sh
+```
+
+Every `launchClaude()` swaps to the Wayland flag set
+(`--ozone-platform=wayland` + WaylandWindowDecorations / IME /
+text-input-version=3, mirroring `scripts/launcher-common.sh:132-139`) and
+exports `CLAUDE_USE_WAYLAND=1` + `GDK_BACKEND=wayland` into the spawn
+env. Per-launch overrides via `launchClaude({ extraEnv })` still win,
+so a single test can opt back to X11 inside a Wayland-mode sweep.
+
+Caveat: T04 (`_NET_FRAME_EXTENTS` xprop check) only works under
+XWayland — native-Wayland sessions have no X11 client list, so T04
+will skip with a "no X11 client list" diagnostic.
+
+## Grounding sweep
+
+Separate from the test sweep. Where the test sweep verifies *upstream
+Linux compat behavior* against case specs, the grounding sweep
+verifies *the specs themselves* against upstream behavior — making
+sure the Steps and Expected fields haven't bit-rotted past what the
+shipped build actually does. Run on every upstream `CLAUDE_DESKTOP_VERSION`
+bump.
+
+### Static pass
+
+For each file under [`cases/`](./cases/), confirm every test's
+`**Code anchors:**` field still resolves and the Steps/Expected match
+behavior. The convention is documented in
+[`cases/README.md`](./cases/README.md#anchor-scope) — anchors are
+either upstream code (`build-reference/app-extracted/.vite/build/`),
+wrapper scripts (`scripts/`), v7 walker inventory, or out-of-scope
+(CLI binary, server-rendered SPA).
+
+When a test drifts, edit Steps/Expected in place. When a feature is
+gone from the build, prepend
+`> **⚠ Missing in build X.Y.Z** — <note>. Re-verify after next
+upstream bump.` under the test heading.
+[`cases-grounding-prompt.md`](./cases-grounding-prompt.md) is the
+fan-out prompt the last sweep used — paste verbatim into a fresh
+session to repeat the workflow.
+
+### Runtime pass
+
+Run [`tools/test-harness/grounding-probe.ts`](../../tools/test-harness/grounding-probe.ts)
+against the live build:
+
+```sh
+cd tools/test-harness
+npm run grounding-probe -- --launch --include-synthetic \
+  --out ../../docs/testing/cases-grounding-runtime.json
+```
+
+Captures runtime state for tests where static greps can't disambiguate
+(IPC handler registry, `globalShortcut.isRegistered()` for known
+accelerators, `app.getLoginItemSettings()`, `safeStorage`,
+`autoUpdater.getFeedURL()`, SNI tray registration, AX-tree fingerprint
+of whatever's on screen). Output is keyed by test ID — diff against
+the previous version's capture to spot drift the static pass missed.
+
+Surfaces inside modals or popups (T22 PR toolbar, T26 preset list,
+T31 side chat, T32 slash menu) need the surface open at probe time.
+Open the relevant view in the running app before re-running with
+`--port 9229` (attach mode).
--- a/docs/testing/runner-implementation-followup-prompt.md
+++ b/docs/testing/runner-implementation-followup-prompt.md
@@ -0,0 +1,238 @@
+# test-harness runner implementation — session 17 prompt
+
+This file is meant to be **copied verbatim into a fresh Claude Code
+session** as the initial user message. Don't paraphrase it; the
+orchestration depends on the exact directives below.
+
+> **ORCHESTRATION STOPPED AFTER SESSION 16.** This prompt is rotated
+> for completeness only. **Session 17 will NOT run automatically** —
+> the autonomous orchestration was halted at the end of session 16
+> after coverage stalled at 74/76 (97%) for four consecutive sessions
+> (13, 14, 15, 16). To resume, the user must manually trigger another
+> orchestration run AND meet at least one of these preconditions:
+>
+> 1. **Real signed-in Claude Desktop running with `--inspect=9229`**
+>    on the dev box (debugger-attached, signed in, NOT a leaked test
+>    isolation). This unblocks Categories A (operon-mode probe) and
+>    B (Tier 3 read-only reframes that need auth-bearing renderer
+>    state).
+> 2. **A real claude.ai account fixture for write-side state.** The
+>    remaining 2 specs (matrix coverage 74/76 → 76/76) need real
+>    write-side state (e.g. an installed plugin to exercise
+>    `LocalPlugins.listSkillFiles`, or a deep-linked deferred install
+>    intent for T11). The Tier 3 destructive constraint
+>    (`Don't run destructive Tier 3 write-side tests`) explicitly
+>    forbids the harness constructing this state itself.
+> 3. **Renderer-drift event** that requires re-anchoring page-objects
+>    (e.g. claude.ai redesign breaks `findCompactPills`,
+>    `clickMenuItem`, etc.). Triggers a defensive-migration session.
+> 4. **New IPC surface** added by upstream that the harness should
+>    cover (e.g. a new `claude.web` interface, a new eipc method
+>    that's case-doc-anchored).
+>
+> If none of those preconditions hold, the orchestration should NOT
+> resume — further sessions will produce documentation-only or
+> marginal output. The structural ceiling of the harness without
+> real-account fixtures is 74/76 (97%); we're already there.
+
+You're picking up after session 16 of the test-harness runner
+implementation work. Session 16 was the final session of the
+sessions-13-to-16 orchestration run. It produced:
+
+- **T17 verification.** The session-15 structural fix is VERIFIED:
+  the bare 60s timeout is gone. The new failure mode at
+  `openFolderPicker` post-`selectLocal` is classified as
+  renderer-state-dependent and deferred.
+- **Schema-rev for `listRemotePluginsPage` / `listSkillFiles`.** Both
+  schemas were resolved by bundle inspection. Neither shipped as a
+  Tier 2 invocation: `listRemotePluginsPage` is not anchored in any
+  case doc, and `listSkillFiles` needs Tier 3 destructive setup.
+
+NO coverage gain. The plan doc was updated, and the followup prompt
+was rotated with the STOP flag (this document).
+
+The plan doc at
+[`docs/testing/runner-implementation-plan.md`](runner-implementation-plan.md)
+captures the tier classification and execution-time reclassifications.
+Its "Status (post-execution)" section is the source of truth for
+what's done and what's deferred — read **session 16** first, then
+**session 15**, **session 14**, **session 13**, **session 12**,
+**session 11**, **session 10**, **session 9**, **session 8**,
+**session 7**, **session 6**, **session 5**, **session 4**, **session
+3**, **session 2**, then **session 1** sub-sections.
+
+This session is a continuation, not a restart. Start by reading the
+plan doc's status sections AND verifying at least one of the
+preconditions above holds. If none hold, STOP and report; don't try
+to fan out.
+
+### Session 16 final findings (key context for any session-17 attempt)
+
+1. **T17's session-15 structural fix VERIFIED.** Bare 60s timeout is
+   gone. `seedFromHost` clones the host's signed-in config,
+   `waitForReady('userLoaded')` resolves to a post-login URL
+   (`https://claude.ai/epitaxy` on the dev box), the dialog mock
+   installs, and `CodeTab.activate({ timeout: 15_000 })` (session 14
+   migration) succeeds first try.
+2. **T17's NEW failure mode is renderer-state-dependent, not AX.**
+   After `selectLocal()` clicks the Local menuitem, the Select-folder
+   pill never appears within 4s. The URL during the run was
+   `/epitaxy` — the user's workspace route. The folder-picker UI
+   may only render on `/new` (or a fresh project), not on a workspace
+   already containing files. To unblock: navigate to `/new`
+   post-userLoaded BEFORE `openFolderPicker()`. NOT shipped in
+   session 16: it needs a careful navigation primitive that doesn't
+   break existing seedFromHost specs.
+3. **`openPill` / `clickMenuItem` migration STILL parked.** Session
+   16's T17 trace confirmed the env-pill open + Local click both
+   succeeded, ruling out the AX-polling-loop hypothesis once and for
+   all. Don't migrate those speculatively.
+4. **Schema-rev resolved both deferred validators.**
+   `CustomPlugins.listRemotePluginsPage(limit: number, offset:
+   number)`. `LocalPlugins.listSkillFiles(pluginId: string,
+   skillName: string, pluginContext?: opaque)`. Neither shipped as a
+   Tier 2 invocation: `listRemotePluginsPage` is not anchored in any
+   case doc; `listSkillFiles` needs Tier 3 destructive setup.
+5. **Coverage stalled at 74/76 (97%) for 4 consecutive sessions.**
+   Sessions 13-16 net deliverables: 1 primitive, 1 AX migration, 1
+   structural fix, 1 verification + 1 schema-rev investigation.
+   Without real-account fixtures, the harness's structural ceiling
+   is 74/76. The remaining 2 specs need real-account write-side
+   state.
+
+### What a future session 17 might attempt (only if preconditions hold)
+
+If precondition 1 (real signed-in debugger-attached Claude) holds:
+
+- **Operon-mode probe** (Category A from sessions 13-16). Run
+  `eipc-registry-probe.ts` against the user's Claude with operon mode
+  toggled on/off, capture the diff in registered channels. May
+  surface a new case-doc-coverable handler.
+- **Schema-rev smoke-test** for the session-16-resolved schemas
+  against the live debugger.
+  `listRemotePluginsPage(limit: 10, offset: 0)` should return an
+  array shape;
+  `listSkillFiles('some-installed-plugin', 'some-skill')` would test
+  the LocalPlugins handler's auth path.
+
+If precondition 2 (real-account write-side fixture) holds:
+
+- **T11 runtime invocation.** With an installed plugin in
+  `~/.claude/plugins/`, the post-install state can be probed via
+  `listSkillFiles`, and a slash-menu check would assert the
+  case-doc claim "skills appear in the slash menu" (T11 step 3).
+- **T17 navigation fix.** Add a `/new` navigation primitive to
+  `claudeai.ts`'s `CodeTab` so `openFolderPicker` works on a fresh
+  project route. Verify T17 reaches the "dialog mock fired" assertion.
+
+If precondition 3 or 4 holds:
+
+- **Defensive page-object refactor.** Re-snapshot the AX tree at the
+  Customize panel and Plugin browser modal, refresh case-doc
+  inventory anchors, migrate any decayed selectors.
+
+### Termination signal interpretation
+
+If session 17 is triggered without any precondition met, the right
+move is the same as session 16's STOP recommendation: write a
+one-paragraph "preconditions not met, no work shipped" plan-doc update
+and terminate. Don't burn a session on documentation-only output.
+
+### Constraints to respect (unchanged from sessions 1-16)
+
+- Use `seedFromHost: true` for any auth-required spec — never
+  `CLAUDE_TEST_USE_HOST_CONFIG=1` / `isolation: null` (legacy shape
+  removed in session 15).
+- eipc handlers register on `webContents.ipc._invokeHandlers`, NOT
+  global `ipcMain._invokeHandlers`. Use `lib/eipc.ts`.
+- For arg validator schema-rev: smoke-test first, fall back to
+  bundle-grep on the rejection literal.
+- For AX-tree consumers: use `lib/ax.ts` (`snapshotAx` /
+  `waitForAxNode` / `waitForAxNodes`).
+- For call-site migrations to `waitForAxNode`: keep per-spec retry
+  budgets matching existing tuning.
+- `lib/input.ts` is X11-only. `lib/input-niri.ts` is Niri-only. CDP
+  auth gate is alive (runtime SIGUSR1 attach, never Playwright
+  `_electron.launch()`). BrowserWindow Proxy gotcha — use
+  `webContents.getAllWebContents()`. `skipUnlessRow()` always first.
+- No fixed sleeps. `retryUntil` from `lib/retry.ts`, Playwright
+  auto-wait, or `waitForAxNode` from `lib/ax.ts`.
+- Diagnostics on every run via `testInfo.attach()`. Tag with
+  `severity:` and `surface:` annotations.
+- Tabs in TS, ~80-char wrap.
+- Don't break existing runners. H01-H05 are the canaries.
+- `npm run typecheck` must stay clean.
+- Don't run destructive Tier 3 write-side tests.
+
+### Authoritative reference
+
+Read these in order before fanning out:
+
+- [`docs/testing/runner-implementation-plan.md`](runner-implementation-plan.md)
+  — tier classification + status sections.
+- [`tools/test-harness/README.md`](../../tools/test-harness/README.md)
+  — runner conventions, the 74-spec inventory, primitives in
+  `lib/`, isolation defaults.
+- [`docs/testing/cases/README.md`](cases/README.md) — case-doc
+  structure and the four anchor scopes.
+- [`tools/test-harness/src/lib/`](../../tools/test-harness/src/lib/)
+  — the existing primitives.
+- [`tools/test-harness/src/runners/`](../../tools/test-harness/src/runners/)
+  — every existing spec is a template.
+
+### Phase 0 — calibration (mandatory before fanning out)
+
+1. `cd tools/test-harness && npm run typecheck` — should pass.
+2. Check debugger ATTACHMENT QUALITY (not just port):
+   `ss -tln | grep ':9229'`. If the port is open, probe webContents
+   via `evalInMain`:
+
+   ```ts
+   import { InspectorClient } from './src/lib/inspector.js';
+   const client = await InspectorClient.connect(9229);
+   const wcs = await client.evalInMain<unknown>(`
+     const { webContents } = process.mainModule.require('electron');
+     return webContents.getAllWebContents().map((w) => ({
+       id: w.id, url: w.getURL(), title: w.getTitle(),
+     }));
+   `);
+   console.log(wcs); client.close();
+   ```
+
+   If every URL is `/login` / `find_in_page` / `main_window`, treat
+   as soft-blocked for auth-required investigations.
+3. Disambiguate running Claude processes:
+   `pgrep -af "ozone-platform=x11.*app.asar"`. For each match, inspect
+   the cmdline for `user-data-dir`. Real Claude has `~/.config/Claude`
+   (or no user-data-dir flag); leaked test isolations have
+   `/tmp/claude-test-*`.
+4. **Verify at least one precondition for resuming the orchestration
+   holds.** If none hold, write a "no preconditions met" plan-doc
+   update and STOP. Don't fan out.
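+
+The real-versus-leaked check in step 3 can be sketched as a classifier
+over a process command line. A minimal sketch only: `classify_claude`
+is a hypothetical name, and it assumes the isolation path shows up
+literally in the command line as `/tmp/claude-test-...`.
+
+```sh
+# Hypothetical helper: classify one Claude command line.
+# Prints "leaked-test" for a test isolation, "real" otherwise
+# (including when no user-data-dir flag is present).
+classify_claude() {
+  case $1 in
+    *user-data-dir*/tmp/claude-test-*) echo leaked-test ;;
+    *) echo real ;;
+  esac
+}
+```
+
+It could be fed from `pgrep -af "ozone-platform=x11.*app.asar"`, one
+output line at a time.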
+
+### Operational notes
+
+- For the bundle-grep schema-rev pattern (sessions 9, 11, 12, 16
+  precedents):
+
+  ```bash
+  cd tools/test-harness && node -e "
+    const {extractFile} = require('@electron/asar');
+    const buf = extractFile(
+      '/usr/lib/claude-desktop/node_modules/electron/dist/resources/app.asar',
+      '.vite/build/index.js'
+    );
+    const s = buf.toString('utf8');
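+    const idx = s.indexOf('<rejection-literal>');
+    // Bail out if the literal is absent; otherwise idx === -1 prints an unrelated slice.
+    if (idx < 0) { console.error('literal not found in bundle'); process.exit(1); }
+    console.log(s.slice(Math.max(0, idx - 1500), idx + 500));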
+  "
+  ```
+
+- For seedFromHost specs: the host MUST have a signed-in Claude.
+  `seedFromHost`'s host-claude-kill semantics will tear down any
+  running Claude process. If the user's real Claude is running, flag
+  this clearly in the report before invoking.
+
+- For AX-tree polling: use `waitForAxNode` / `waitForAxNodes` from
+  `lib/ax.ts`, which poll on a predicate.
+
+- The eipc-registry probe (`tools/test-harness/eipc-registry-probe.ts`)
+  is the dedicated tool for inspecting per-wc IPC handler state.
+
+Begin with Phase 0. Don't fan out until at least one of the
+preconditions for resuming the orchestration is verified to hold.
--- a/docs/testing/runner-implementation-plan.md
+++ b/docs/testing/runner-implementation-plan.md
--- a/docs/testing/ui-inventory-reconciliation.md
+++ b/docs/testing/ui-inventory-reconciliation.md
@@ -0,0 +1,597 @@
+# claude.ai UI Inventory Reconciliation
+
+*Generated against [`ui-inventory.json`](./ui-inventory.json) v6 (captured 2026-05-03, app version 1.5354.0, 383 entries).*
+*Reconciled 2026-05-02.*
+
+This file diffs the human-written claims in [`ui/`](./ui/) against the
+machine-captured ground-truth in [`ui-inventory.json`](./ui-inventory.json).
+
+It is one-shot output meant to drive human cleanup of `ui/*.md` — re-run
+the reconciliation script (TODO: not yet built) after major walker passes.
+
+## Reading this document
+
+Three categories of finding per surface:
+
+- **In docs but not in renderer** — the doc names an element that has no
+  corresponding inventory entry. Possible causes (don't read this as "doc
+  is wrong"; the walker covers a subset of reality):
+  - **OS / window-manager element** — title bar, close/min/max buttons,
+    drop shadow, resize edges. These are drawn by the compositor, not by
+    claude.ai's renderer; the walker can't see them.
+  - **Out of renderer scope** — tray menu, libnotify notifications, IME
+    composition popups, Quick Entry popup window. These are main-process
+    or DE-level surfaces that don't exist in the claude.ai DOM.
+  - **Walker coverage gap** — Settings overlay, dialogs, deep Code-tab
+    panes (terminal, file pane, diff). The walker drilled some surfaces
+    but not others; absence here is "not yet observed" not "not present."
+  - **Account-state-dependent** — features that don't appear on this
+    user's plan (e.g. SSH connections panel, managed-settings rows,
+    specific Code-tab pane types).
+  - **Speculative** — doc was written from upstream behavior, not from a
+    Linux build. May not actually render.
+- **In renderer but not in docs** — inventory captured an element that no
+  doc row mentions. Either the doc is incomplete for that surface, or the
+  element is tangential (search-results recency rows, instance-suffix
+  duplicates with `#2`/`+5` markers).
+- **Fingerprint potentially drifted** — doc and inventory agree on the
+  element but the doc's selector hint disagrees with the inventory's
+  `fingerprint.selector`. Most `ui/*.md` rows use prose ("Top-left of
+  topbar") rather than CSS selectors, so this category is small.
+
+Human triage is what closes any of these. Don't auto-edit `ui/*.md`.
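+
+The three categories above can be sketched as a small diff routine. This
+is a minimal sketch only: the reconciliation script is not yet built, and
+the `DocRow` and `InventoryEntry` shapes, the `inventoryId` link field,
+and the `reconcile` name are all assumptions made for illustration. Only
+`fingerprint.selector` is taken from the inventory as described here.
+
+```typescript
+// Hypothetical shapes; only `fingerprint.selector` is named in this report.
+interface InventoryEntry {
+  id: string;
+  fingerprint: { selector: string };
+}
+
+interface DocRow {
+  element: string;
+  inventoryId?: string; // assumed hand-maintained link to an inventory entry
+  selectorHint?: string; // most rows are prose-only, so this is usually absent
+}
+
+interface Findings {
+  inDocsNotRenderer: DocRow[];
+  inRendererNotDocs: InventoryEntry[];
+  drifted: Array<{ row: DocRow; entry: InventoryEntry }>;
+}
+
+function reconcile(rows: DocRow[], inventory: InventoryEntry[]): Findings {
+  const byId = new Map<string, InventoryEntry>();
+  for (const entry of inventory) byId.set(entry.id, entry);
+
+  const mentioned = new Set<string>();
+  const findings: Findings = { inDocsNotRenderer: [], inRendererNotDocs: [], drifted: [] };
+
+  for (const row of rows) {
+    const entry = row.inventoryId === undefined ? undefined : byId.get(row.inventoryId);
+    if (entry === undefined) {
+      // No inventory match: the doc names something the walker did not capture.
+      findings.inDocsNotRenderer.push(row);
+      continue;
+    }
+    mentioned.add(entry.id);
+    // Drift is only checkable when the doc row carries an explicit selector.
+    if (row.selectorHint !== undefined && row.selectorHint !== entry.fingerprint.selector) {
+      findings.drifted.push({ row, entry });
+    }
+  }
+
+  // Whatever no doc row pointed at is renderer surplus.
+  findings.inRendererNotDocs = inventory.filter((entry) => !mentioned.has(entry.id));
+  return findings;
+}
+```
+
+The routine only sorts rows into buckets. It assigns no reason class
+(OS element, coverage gap, and so on), which matches the intent that a
+human closes each finding.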
+
+## Summary
+
+| Metric | Count |
+|--------|-------|
+| Inventory entries (total) | 383 |
+| Inventory entries by kind | persistent 65 / structural 276 / menu 33 / instance 9 |
+| Inventory entries marked `denylisted: true` | 9 (Send×4, Install×4, Remove×1) |
+| `ui/*.md` files reconciled | 11 (10 surface files + README) |
+| `ui/*.md` rows reconciled (rough — multi-element rows complicate the count) | ~210 element rows across all 10 surface files |
+| Rows with confirmed inventory match | ~70 (~33%) |
+| Rows flagged "in docs but not in renderer" | ~140 (~67%) — heavily skewed by OS-frame, tray, notifications, deep Code panes, Settings, Quick Entry being out-of-renderer or under-walked |
+| Inventory entries with no `ui/*.md` mention | ~190 (~50%) — heavily skewed by per-conversation/per-skill/per-prompt-card structural rows that the docs treat as categories rather than enumerating |
+| Doc rows with explicit selectors that drift from inventory | 0 verified — `ui/*.md` rows almost never carry CSS selectors |
+
+Match counts are approximate. `ui/*.md` rows often describe categories
+("Recent conversations," "Per-history-entry hover") that map to many
+inventory entries; the inventory in turn enumerates structural elements
+the docs intentionally don't list (every project skill button, every
+search result option). The reconciliation is a triage signal, not a
+metric.
+
+## Per-surface breakdown
+
+### `ui/window-chrome-and-tabs.md`
+
+**Inventory surfaces likely covered:** none directly — OS window frame is
+drawn by the compositor; the in-app topbar elements live under `root` as
+`root.button.menu`, `root.button.collapse-sidebar`, `root.button.search`,
+`root.button.back`, `root.button.forward`. The "tab strip" maps to
+`root.button.chat`, `root.button.cowork`, `root.button.code`.
+
+**Doc rows reconciled:** ~22
+
+#### In docs but not in renderer
+
+| Doc element | Reason class |
+|-------------|--------------|
+| Title bar | OS / window-manager |
+| Close button (X) | OS / window-manager |
+| Minimize button | OS / window-manager |
+| Maximize / restore button | OS / window-manager |
+| Resize edges | OS / window-manager |
+| Window menu (right-click titlebar) | OS / window-manager |
+| Cowork ghost icon | Walker captures `root.button.cowork` (the tab) but not the ghost-icon visual within the topbar shim |
+| Drag region (gaps between buttons) | Renders as empty space — not an actionable element |
+| Active tab indicator | Visual styling, not an actionable element |
+| Tab badges (unread / Dispatch) | None observed; user state at capture had no badges |
+| About dialog | Walker did not surface a dialog; About is reachable only from app/tray menu, both out of renderer scope |
+| App menu (macOS-style) | Doc itself notes this is N/A on Linux |
+| Update prompt | Conditional, not present at capture |
+| Crash report dialog | Conditional, not present at capture |
+
+#### In renderer but not in docs
+
+| Inventory entry | Notes |
+|-----------------|-------|
+| `root.button.menu` ("Menu", `aria-label="Menu"`) | This is the doc's "Hamburger menu" — renamed |
+| `root.button.collapse-sidebar` ("Collapse sidebar") | Doc has "Sidebar toggle"; arguably the same |
+| `root.button.search` ("Search") | Doc's "Search icon"; same |
+| `root.button.back` / `root.button.forward` | Doc's back/forward arrows; same |
+| `root.a.skip-to-content` ("Skip to content") | A11y skip link; not in doc |
+| `root.button.new-chat-n` ("New chat⌘N") | Topbar new-chat button; not in doc |
+| `root.button.pinned`, `root.button.recents`, `root.button.projects`, `root.button.artifacts`, `root.button.customize` | Sidebar nav buttons; doc covers some of these in `sidebar.md` not here |
+| `root.button.awaaddrick-max` ("AWAaddrick·Max") | User/plan badge in topbar; not in doc |
+| `root.button.get-apps-and-extensions` | Topbar shortcut to apps page; not in doc |
+| `root.tab.write` / `root.tab.learn` / `root.tab.code` / `root.tab.from-calendar` / `root.tab.from-gmail` | Quick-prompt-template tabs in the prompt area. The doc folds this template strip into the Chat/Cowork/Code tab strip, but the inventory's `root.tab.code` (template tab) is distinct from `root.button.code` (main tab) |
+
+#### Fingerprint potentially drifted
+
+None — doc rows for this surface use Location prose only.
+
+#### Notable cross-cut
+
+The doc's "Chat / Cowork / Code" tab strip maps cleanly to
+`root.button.chat`, `root.button.cowork`, `root.button.code`. But the
+inventory also has `root.tab.code` (a `[role="tab"]`, not a button) which
+is a separate element — the prompt-area template strip — that the doc
+conflates with the main Chat/Cowork/Code switcher. Worth a human note.
+
+---
+
+### `ui/tray.md`
+
+**Inventory surfaces covered:** none — the tray is a main-process Electron
+`Tray` object on the system SNI bus, not part of claude.ai's DOM.
+
+**Doc rows reconciled:** ~17
+
+#### In docs but not in renderer
+
+Every row, by design. Categories:
+
+- Tray icon (light / dark theme) — main-process `Tray.setImage()`
+- Right-click menu items (Show/Hide, Quick Entry, Open at Login,
+  Settings, About, Quit) — main-process `Menu.buildFromTemplate()`
+- Left-click / double-click / middle-click behaviors — main-process
+  event handlers
+- Tooltip on hover, position, icon resolution, theme switch — SNI
+  daemon and DE behavior
+
+This entire file is correctly out of renderer scope; the walker is doing
+the right thing by not capturing any of it.
+
+#### In renderer but not in docs
+
+N/A — surface mismatch.
+
+---
+
+### `ui/sidebar.md`
+
+**Inventory surfaces likely covered:** `root` (sidebar lives in the root
+chrome on claude.ai). Note: the doc opens with "Code Tab Sidebar" but the
+sidebar in the captured renderer is the global claude.ai sidebar, not a
+Code-tab-specific one. The Code-tab-specific session list is captured
+separately under `root.button.code.button.new-session-n` (60 entries).
+
+**Doc rows reconciled:** ~18
+
+#### In docs but not in renderer
+
+| Doc element | Reason class |
+|-------------|--------------|
+| Filter: status / project / environment | Walker did not drill the filter dropdown |
+| Group-by control | Same — within Code-tab session list |
+| Session status indicator (idle/running/...) | Visual decoration on row, not an actionable element |
+| Project / branch label | Same |
+| Diff stats badge `+12 -1` | Conditional — no session at capture had pending diffs |
+| Dispatch badge | Conditional — no Dispatch-spawned session at capture |
+| Scheduled badge | Conditional — same |
+| Hover archive icon | Hover-revealed; walker captures static state |
+| Right-click context menu (Rename / Archive / etc.) | Walker does not synthesise right-clicks |
+| Sidebar resize handle | Visual / draggable, not an aria-labeled element |
+| Sidebar collapse toggle | Inventory has `root.button.collapse-sidebar` but doc treats it as a Code-tab element rather than chrome |
+| Scrollbar | OS / theme-rendered |
+| `Ctrl+Tab` / `Ctrl+Shift+Tab` cycling | Keyboard shortcut, not a UI element |
+
+#### In renderer but not in docs
+
+| Inventory entry | Notes |
+|-----------------|-------|
+| `root.button.fine-tuning-diffusion-models-with-reinforcement-learning` | A pinned recent conversation — sidebar content |
+| `root.button.more-options-for-fine-tuning-diffusion-models-with-reinforce` | Per-row menu trigger — doc mentions "right-click context menu" but inventory shows it's a discoverable button |
+| `root.button.how-to-use-claude` + `root.button.more-options-for-how-to-use-claude` | Same pattern |
+| `root.button.code.button.routines` | "Routines" link in Code-tab nav — doc's "Routines link" is here |
+| `root.button.code.button.more-navigation-items` | Likely the doc's "Customize / Routines" expander — not enumerated |
+| `root.button.code.button.filter` | The doc's "Filter: status" probably maps here |
+| `root.button.code.button.appearance` | Not in doc |
+| `root.button.code.button.show-5-more` | Pagination; not in doc |
+| `root.button.code.button.open-session-*` (5 entries) | Each is a single session row in the Code-tab list — the doc's "Per-session row" category |
+
+#### Fingerprint potentially drifted
+
+None — doc rows for this surface use Location prose only.
+
+---
+
+### `ui/prompt-area.md`
+
+**Inventory surfaces likely covered:** `root` (top-level prompt area
+buttons), `root.button.add-files-connectors-and-more` (the `+` menu),
+`root.button.model-opus-4-7-adaptive` (model picker), and several deep
+sub-surfaces.
+
+**Doc rows reconciled:** ~28
+
+#### In docs but not in renderer
+
+| Doc element | Reason class |
+|-------------|--------------|
+| Input field | The contenteditable / textarea itself isn't captured (no aria-label) |
+| Placeholder text | Not an interactive element |
+| Cursor caret / multi-line autosize / word wrap | Behavior, not element |
+| Paste plain text / paste image | Behavior |
+| `Enter` to send / `Shift+Enter` / `Esc` | Keyboard behavior |
+| IME composition | Not a renderer element |
+| Attachment button (left of input) | Not surfaced — possibly bundled into `root.button.add-files-connectors-and-more` |
+| File-attached chip | Conditional — no attachment at capture |
+| Multiple attachments / image preview / PDF preview | Conditional |
+| Drag-drop overlay | Conditional, only renders during drag |
+| `@filename` autocomplete | Conditional, only renders when typing `@` |
+| `+` button | Likely IS the `root.button.add-files-connectors-and-more` button — see below |
+| Slash menu (all rows: Built-in / Project skills / User skills / Plugin skills / filter / selection / `Esc`) | Walker did not type `/` to trigger the slash menu; no inventory entries |
+| Effort picker (`Cmd+Shift+E`) | Possibly inside `root.button.code.button.opus-4-7-1m-extra-high` — uncertain |
+| Stop button (replaces Send while responding) | Conditional — no in-flight response at capture |
+| Usage ring | Possibly `root.button.code.button.usage-plan-11` ("Usage: plan 11%") |
+
+#### In renderer but not in docs
+
+| Inventory entry | Notes |
+|-----------------|-------|
+| `root.button.press-and-hold-to-record` ("Press and hold to record") | Voice / dictation button in prompt area — doc has no voice input row |
+| `root.button.code.button.dictation-settings` | Dictation settings button |
+| `root.button.code.button.transcript-view-mode` | Transcript view toggle in prompt area |
+| `root.button.code.button.scroll-to-bottom` | Scroll-to-bottom affordance |
+| `root.button.code.button.accept-edits` | Permission-mode-related quick action |
+| `root.button.code.button.add` ("Add") | Likely the doc's `+` button, with a different label |
+| `root.button.code.button.usage-plan-11` ("Usage: plan 11%") | Probably the doc's "Usage ring" |
+| `root.button.code.button.opus-4-7-1m-extra-high` ("Opus 4.7 1M· Extra high") | Probably the doc's "Effort picker" |
+| All `root.button.add-files-connectors-and-more.menuitem.*` entries (Add files or photos / Add to project / Skills / Connectors / Plugins / Research / Web search / Use style) | The `+` menu contents — doc has Slash commands / Skills / Connectors / Plugins / Add plugin; inventory surfaces additional items the doc misses (Add files or photos, Add to project, Web search, Use style) |
+| `root.button.add-files-connectors-and-more.menuitem.use-style.*` (8 entries: Normal / Learning / Concise / Explanatory / Formal / Create & edit styles / Research mode) | Style picker is a whole sub-surface the doc doesn't mention |
+| `root.button.model-opus-4-7-adaptive.menuitemradio.*` (Opus / Sonnet / Haiku / Adaptive thinking / More models) | Doc says "Sonnet, Opus, Haiku" — inventory adds Adaptive thinking + More models |
+
+#### Fingerprint potentially drifted
+
+| Doc claim | Inventory says |
+|-----------|----------------|
+| `+` button → opens menu of "Slash commands / Skills / Connectors / Plugins / Add plugin" | The corresponding inventory button is labeled "Add files, connectors, and more" (`aria-label="Add files, connectors, and more"`). Its menu contents include neither "Slash commands" nor an "Add plugin" sub-entry, so the doc's menu structure is partly speculative |
+
+---
+
+### `ui/code-tab-panes.md`
+
+**Inventory surfaces likely covered:** `root.button.code` (23 entries),
+`root.button.code.button.new-session-n` (60 entries) — but no per-pane
+sub-surfaces (no diff pane, no terminal pane, no preview pane, no file
+pane).
+
+**Doc rows reconciled:** ~50
+
+#### In docs but not in renderer
+
+Almost every Code-tab pane row is missing from the inventory. The walker
+landed in the Code-tab "New session" shell but did not open or drill any
+of the panes. Categories:
+
+| Pane | Doc rows missing | Reason |
+|------|------------------|--------|
+| Pane chrome (header, drag/resize handles, close button, Views menu) | 5 rows | Walker coverage gap — no pane was open |
+| Diff pane | 9 rows (file list, diff content, line click, Cmd+Enter, Accept/Reject, Review code) | Walker coverage gap |
+| Preview pane | 11 rows | Walker coverage gap |
+| Terminal pane | 7 rows | Walker coverage gap (also: only renders for Local sessions) |
+| File pane | 7 rows | Walker coverage gap |
+| Tasks / subagent pane | 5 rows | Walker coverage gap |
+| Side chat overlay | 3 rows (trigger / content / close) | `root.button.code.button.close-side-chat` IS captured — the close button — but content isn't drilled |
+| CI status bar | 5 rows | Conditional — no PR open at capture |
+| View modes (Normal/Verbose/Summary) | 3 rows | Possibly behind `root.button.code.button.transcript-view-mode` — single inventory entry vs. 3 doc rows |
+
+#### In renderer but not in docs
+
+| Inventory entry | Notes |
+|-----------------|-------|
+| `root.button.code.button.local` ("Local") | Environment switcher chip — not in doc |
+| `root.button.code.button.select-folder` ("Select folder…") | Folder-picker entry — doc references this only via T17 cross-reference |
+| `root.button.code.button.send` (and `#2`, both denylisted) | Send button — doc has it under prompt-area, not panes |
+| `root.button.code.button.transcript-view-mode` | The doc's "Transcript view dropdown" — single inventory entry |
+| `root.button.code.button.opus-4-7-1m-extra-high` | Model selector inside Code-tab session shell |
+| `root.button.code.button.usage-plan-11` | Usage ring inside Code-tab session shell |
+| `root.button.code.button.accept-edits` ("Accept edits") | Permission-mode quick action — not in doc |
+| All 60 `root.button.code.button.new-session-n.button.open-session-*` and per-session entries | Doc covers the session list in `sidebar.md`, not here, so this isn't really a gap for `code-tab-panes.md` |
+
+#### Fingerprint potentially drifted
+
+None — doc is prose-only.
+
+---
+
+### `ui/settings.md`
+
+**Inventory surfaces likely covered:** `root.button.settings` (only 1
+entry — "Settings" button itself), `root.button.awaaddrick-max.menuitem.settingsctrl`
+(the menu-item route to Settings, label "SettingsCtrl,").
+
+**Doc rows reconciled:** ~28
+
+#### In docs but not in renderer
+
+The Settings page itself is essentially un-walked. Settings opens as an
+overlay/modal which the walker treated as a single button rather than
+drilling into. Every row in the doc beyond "Settings window opens" lacks
+a matching inventory entry:
+
+| Doc section | Rows missing | Reason |
+|-------------|--------------|--------|
+| Settings root (close button, sidebar nav) | 3 rows | Walker coverage gap |
+| Desktop app → General (Computer use, Keep computer awake, Denied apps, Unhide apps, Theme picker) | 5 rows | Walker coverage gap; some rows account-state-dependent |
+| Desktop app → Account (name/email, plan badge, Sign out) | 3 rows | Walker coverage gap |
+| Claude Code (Worktree location, Branch prefix, Auto-archive toggle, Persist preview, Preview toggle, Bypass-permissions toggle, Auto mode availability) | 7 rows | Walker coverage gap |
+| Connectors page (list, per-connector entry, Manage, Disconnect, Add connector) | 5 rows | Walker coverage gap; partially covered by the in-session connectors menu |
+| SSH connections (list, Add SSH connection button, per-connection entry) | 3 rows | Walker coverage gap; account-state-dependent |
+| Keyboard shortcuts (list, value, Reset, Quick Entry shortcut) | 4 rows | Walker coverage gap |
+| Local environment editor (open, Add variable, Remove variable, Apply to dev servers) | 4 rows | Walker coverage gap; account-state-dependent |
+
+#### In renderer but not in docs
+
+| Inventory entry | Notes |
+|-----------------|-------|
+| `root.button.settings` ("Settings", `aria-label="Settings"`) | The button that opens Settings — confirmed in chrome |
+| `root.button.awaaddrick-max.menuitem.settingsctrl` ("SettingsCtrl,") | Settings menu item under the user/plan menu — alternate path |
+
+#### Fingerprint potentially drifted
+
+None.
+
+#### Walker coverage note
+
+Settings is a known walker coverage gap (see "Reading this document"
+above). This doc is substantively un-reconciled until a Settings drill
+pass lands.
+
+---
+
+### `ui/routines-page.md`
+
+**Inventory surfaces likely covered:** none directly. Routines are
+reachable via `root.button.code.button.routines`, but the page itself
+isn't drilled.
+
+**Doc rows reconciled:** ~26
+
+#### In docs but not in renderer
+
+Every doc row except the "Routines page link" itself is unmatched — the
+walker captured the entry point but did not open the Routines page.
+
+| Doc section | Rows missing | Reason |
+|-------------|--------------|--------|
+| Routines list (header, New routine button, list, per-routine row, Run-now icon, Pause/resume, click row) | 7 rows | Walker coverage gap |
+| New routine form Local (Name, Description, Instructions, permission-mode picker, model picker, Working folder, Worktree toggle, Schedule preset, Time picker, Day picker, Save, Cancel, Folder-trust prompt) | 13 rows | Walker coverage gap |
+| New routine form Remote (Trigger type, Connectors picker, Network access controls) | 3 rows | Walker coverage gap; doc itself is partly speculative ("Per upstream docs") |
+| Routine detail (Run now, Active/Paused toggle, Edit, Delete, Review history, hover tooltip, Show more, Always allowed, Revoke approval) | 9 rows | Walker coverage gap |
+
+#### In renderer but not in docs
+
+| Inventory entry | Notes |
+|-----------------|-------|
+| `root.button.code.button.routines` ("Routines") | The entry-point link — doc's "Routines page link" |
+
+#### Fingerprint potentially drifted
+
+None.
+
+---
+
+### `ui/connectors-and-plugins.md`
+
+**Inventory surfaces likely covered:** `root.button.add-files-connectors-and-more.menuitem.connectors`
+(the in-session connector picker, 5 entries), plus the deeper per-connector
+sub-surfaces under `.connectors.menuitemcheckbox.gmail.*` (15 entries).
+Plugin browser surfaces (`root.button.back.*`) cover Skills, Connectors
+(×2), Add plugin, Typescript lsp, Php lsp, Playwright, etc.
+
+**Doc rows reconciled:** ~24
+
+#### In docs but not in renderer
+
+| Doc element | Reason class |
+|-------------|--------------|
+| Connectors menu — "Per-connector row" with status indicator | Inventory has Gmail and Google Calendar but not status decorations |
+| Empty state | Conditional — user has connectors configured |
+| Connector catalog (modal body, per-connector tile with logo/description) | Walker coverage gap — the Add-connector flow opens a modal that wasn't drilled |
+| OAuth in-app overlay | Conditional, not present at capture |
+| Permission consent screen | External (provider's UI) |
+| Callback completion | Behavior, not an element |
+| Custom connector entry point | Walker coverage gap |
+| Plugin browser modal (browser modal, marketplace selector, per-plugin tile, scope selector, install progress, success state, error state) | Walker captured plugin surfaces under `root.button.back.*` (Add plugin, Typescript lsp, Php lsp, Playwright) but not the modal anatomy |
+| Manage plugins (installed list, per-plugin row, Enable toggle, Plugin skills sub-list) | Walker coverage gap — no Manage-plugins surface drilled |
+
+#### In renderer but not in docs
+
+| Inventory entry | Notes |
+|-----------------|-------|
+| `root.button.add-files-connectors-and-more.menuitem.connectors` ("Connectors", in-session menu) | Doc covers this — the in-session Connectors menu |
+| `root.button.add-files-connectors-and-more.menuitem.connectors.menuitemcheckbox.gmail` ("Gmail") | Per-connector row — doc "Per-connector row" category |
+| `root.button.add-files-connectors-and-more.menuitem.connectors.menuitemcheckbox.google-calendar` ("Google Calendar") | Per-connector row — same |
+| `root.button.add-files-connectors-and-more.menuitem.connectors.menuitem.manage-connectors` ("Manage connectors") | Doc's "Manage connectors entry" |
+| `root.button.add-files-connectors-and-more.menuitem.connectors.menuitem.add-connector` ("Add connector") | Doc has "Add connector button" in Settings; inventory shows it also exists in the in-session menu |
+| `root.button.add-files-connectors-and-more.menuitem.connectors.menuitem.tool-accessload-tools-when-needed` ("Tool accessLoad tools when needed") | Per-connector tool-access setting — not in doc |
+| `root.button.back.a.skills` ("Skills") | Plugin browser — Skills tab |
+| `root.button.back.a.connectors` / `root.button.back.a.connectors#2` (both "Connectors") | Plugin browser — Connectors tab (instance suffix `#2` indicates duplicate detection) |
+| `root.button.back.button.add-plugin` ("Add plugin") | Plugin browser — Add plugin button |
+| `root.button.back.a.typescript-lsp` / `root.button.back.a.php-lsp` / `root.button.back.a.playwright` | Installed plugins — doc treats this as "Manage plugins → Per-plugin row," walker captures the actual plugin names |
+| `root.button.back.button.connect-your-appslet-claude-read-and-write-to-the-tools-you-` ("Connect your appsLet Claude read...") | Plugin browser landing pane CTA — not in doc |
+| `root.button.back.a.create-new-skillsteach-claude-your-processes-team-norms-and-` ("Create new skillsTeach Claude your processes, team norms, and expertise.") | Skills-creation CTA — not in doc |
+| `root.button.back.button.browse-pluginsadd-pre-built-knowledge-for-your-field` ("Browse pluginsAdd pre-built knowledge for your field.") | Browse-plugins CTA — not in doc |
+| `root.button.add-files-connectors-and-more.menuitem.connectors.menuitemcheckbox.gmail.button.develop-storytelling-frameworks` and 9 similar `.option`/`.button` pairs | Connector-suggested prompt cards. Walker captured these as a side-effect of drilling Gmail — they aren't a doc-targeted UI element |
+
+#### Fingerprint potentially drifted
+
+| Doc claim | Inventory says |
+|-----------|----------------|
+| `+` → **Connectors** opens "Connectors menu" | Inventory: button is "Add files, connectors, and more" not "+"; menu item is "Connectors". Functionally the same surface |
+
+---
+
+### `ui/quick-entry.md`
+
+**Inventory surfaces covered:** none — Quick Entry is a separate
+`BrowserWindow` constructed in the main process (`index.js:515375`), not
+part of claude.ai's renderer. The walker started at `https://claude.ai/new`
+which never reaches it.
+
+**Doc rows reconciled:** ~17
+
+#### In docs but not in renderer
+
+Every row, by design. Categories:
+
+- Window appearance (frame, background, rounded corners, drop shadow,
+  position, always-on-top, lifecycle, persistence after main destroy) —
+  main-process BrowserWindow construction
+- Input area (text input, placeholder, multi-line, Enter/Shift+Enter,
+  Esc, click-outside, paste, IME) — popup renderer (separate from
+  claude.ai)
+- Submit feedback (transition, loading, error) — popup renderer + IPC
+  bridge
+
+This entire file is correctly out of renderer scope. Doc rows are
+already heavily annotated with `index.js:515xxx` references to upstream
+main-process source — that's the right substrate.
+
+#### In renderer but not in docs
+
+N/A — surface mismatch.
+
+---
+
+### `ui/notifications.md`
+
+**Inventory surfaces covered:** none — notifications fire via libnotify
+on the `org.freedesktop.Notifications` DBus path; they are not DOM
+elements.
+
+**Doc rows reconciled:** ~17
+
+#### In docs but not in renderer
+
+Every row, by design. Categories:
+
+- Notification sources (Scheduled fires, Catch-up, CI status, PR merged,
+  Dispatch handoff, Permission prompt) — main-process emitters
+- Per-notification anatomy (App identity, icon, title, body, actions,
+  click target) — DBus payload
+- Per-DE rendering (KDE/GNOME/Mako/Dunst/swaync/Niri) — daemon behavior
+- Notification persistence (history, DND) — daemon behavior
+
+This entire file is correctly out of renderer scope.
+
+#### In renderer but not in docs
+
+N/A — surface mismatch.
+
+---
+
+## Top-level findings
+
+### Coverage by source-of-truth axis
+
+- **OS-level / window-manager elements** (window-chrome rows for
+  title bar, close/min/max, resize edges, drop shadow) — never going to
+  appear in the renderer inventory. ~10 doc rows.
+- **Main-process Electron windows** (Quick Entry popup, About dialog,
+  crash dialog, file pickers) — never going to appear in the renderer
+  inventory. ~25 doc rows.
+- **Tray menu** (Show/Hide, Quick Entry, Settings, About, Quit, Open
+  at Login) — main-process `Menu.buildFromTemplate()`. ~12 doc rows.
+- **libnotify notifications** — DBus, not DOM. ~17 doc rows.
+- **Walker coverage gaps** (Settings overlay, Routines page, plugin
+  browser modal, all Code-tab panes, dialogs, slash menu, drag-drop
+  overlay) — would appear if the walker drilled them. ~70 doc rows.
+- **Account-state-dependent surfaces** (CI bar, Dispatch badges, file
+  attachments, SSH connections panel) — would appear in some sessions
+  but didn't at capture. ~15 doc rows.
+- **Conditional / hover / behavior** (right-click context menus, hover
+  archive icons, drag-drop overlays, tooltips) — wouldn't appear in a
+  static walker pass even if the surface was visited. ~10 doc rows.
+
+The combined explanation: roughly half of the "in docs but not in
+renderer" mismatches are unfixable (different source of truth), and
+roughly half are walker coverage gaps that future passes can close.
+
+### Top 3 surfaces with the most "in docs but not in renderer" mismatches
+
+These are likely candidates for speculative claims OR for un-walked
+surfaces. Treat this list as a triage queue:
+
+1. **`ui/code-tab-panes.md`** — ~50 unmatched rows. Almost entirely
+   walker-coverage gap (the walker landed in the Code-tab shell but
+   opened no panes). Until the walker drills diff/preview/terminal/file/
+   tasks panes, this doc is un-reconcilable.
+2. **`ui/settings.md`** — ~28 unmatched rows. Settings opens as an
+   overlay; walker captured only the Settings entry-point button. Needs
+   targeted drill.
+3. **`ui/routines-page.md`** — ~26 unmatched rows. Same shape as
+   Settings — entry-point captured, page contents unwalked.
+
+### Top 3 surfaces with the most "in renderer but not in docs" surplus
+
+These docs are the most incomplete relative to ground truth:
+
+1. **`ui/sidebar.md`** — Inventory has 60+ Code-tab session-list entries
+   under `root.button.code.button.new-session-n`. Doc treats sessions as
+   a single category row. This is intentional doc behavior, but it means
+   the doc doesn't help when reasoning about the actual structural
+   buttons (Filter, Appearance, Routines, More navigation items, Show 5
+   more, etc.) that the walker found.
+2. **`ui/prompt-area.md`** — Inventory has the entire Use-style picker
+   sub-tree (Normal / Learning / Concise / Explanatory / Formal / Create
+   & edit styles + 5 preset cards), the Press-and-hold-to-record voice
+   button, dictation settings, transcript view mode, scroll-to-bottom,
+   and the model picker's "Adaptive thinking" / "More models" entries —
+   none of which the doc enumerates.
+3. **`ui/connectors-and-plugins.md`** — Inventory has the entire plugin
+   browser sub-tree (`root.button.back.*` — 12 entries: Skills, Add
+   plugin, Typescript lsp, Php lsp, Playwright, Browse plugins, Create
+   new skills, Connect your apps, Connectors×2, Back to Claude, Select
+   a folder), and connector-suggested prompt cards (10 entries under
+   `.gmail.button.*`). Doc treats these surfaces at a higher level of
+   abstraction.
+
+## Acknowledged gaps in inventory itself
+
+Not all inventory absences are doc errors. Known walker gaps as of v6:
+
+- **Settings page deep content** — only the entry-point button
+  (`root.button.settings`) and the menu shortcut
+  (`...menuitem.settingsctrl`) captured. Settings opens as an overlay
+  the walker did not drill.
+- **Dialogs** — 0 captured. claude.ai may not use `[role=dialog]` for
+  most modals, or the walker's drill paths didn't reach them.
+- **Code tab panes** — only the Code-tab session shell was drilled;
+  diff, preview, terminal, file, tasks, subagent, plan, side chat, CI
+  bar are uncaptured.
+- **Routines page** — only the entry-point link was captured.
+- **Plugin browser modal anatomy** — surrounding list captured, the
+  per-plugin install modal wasn't.
+- **Slash menu** — walker did not type `/` to trigger.
+- **Hover/right-click/drag-only affordances** — static walker; no
+  context menus or drag-drop overlays.
+- **Quick Entry / Tray / Notifications** — out of renderer scope.
+
+These are walker tickets, not bugs against the v6 capture.
+
+## Triage suggestions for `ui/*.md` cleanup
+
+Aimed at humans editing the docs. Ordered by impact:
+
+1. **Mark out-of-renderer surfaces explicitly.** `ui/tray.md`,
+   `ui/quick-entry.md`, `ui/notifications.md`, and the OS-frame section
+   of `ui/window-chrome-and-tabs.md` already reference main-process
+   source and DE behavior — add a header note that this surface
+   intentionally doesn't appear in `ui-inventory.json`.
+2. **Annotate walker-coverage-gap surfaces.** `ui/code-tab-panes.md`,
+   `ui/settings.md`, `ui/routines-page.md` — header note that the
+   inventory does not yet drill these surfaces; rows reflect upstream
+   behavior and are unverified in the renderer.
+3. **Add missing topbar/prompt-area elements** to `ui/window-chrome-and-tabs.md`
+   and `ui/prompt-area.md` from the "In renderer but not in docs" lists.
+4. **Decide the doc/inventory boundary for sidebar session lists.** Doc
+   treats sessions as a category; inventory enumerates each. Pick one
+   shape and document it.
+5. **Flag speculative Linux-conditional rows** — `ui/settings.md` SSH
+   connections, "Denied apps" / "Unhide apps when Claude finishes" for
+   Computer Use — mark as "may not render on Linux; verify before
+   assuming."
--- a/docs/testing/ui-inventory.json
+++ b/docs/testing/ui-inventory.json
--- a/docs/testing/ui-inventory.meta.json
+++ b/docs/testing/ui-inventory.meta.json
@@ -0,0 +1,12 @@
+{
+  "capturedAt": "2026-05-03T07:13:20.024Z",
+  "appVersion": "1.5354.0",
+  "walkerVersion": "7",
+  "startUrl": "https://claude.ai/epitaxy",
+  "totalElements": 90,
+  "deniedActions": 6,
+  "partial": false,
+  "isolation": "launchClaude (test-harness path)",
+  "seededFromHost": true,
+  "allowlistEntries": []
+}
--- a/docs/testing/ui-snapshots/.gitkeep
+++ b/docs/testing/ui-snapshots/.gitkeep
--- a/docs/testing/ui-snapshots/README.md
+++ b/docs/testing/ui-snapshots/README.md
@@ -0,0 +1,76 @@
+# UI snapshots
+
+Captured renderer state for the `claude.ai` web view, taken via the
+`explore` CLI in [`tools/test-harness/explore/`](../../../tools/test-harness/explore/).
+Use these to detect upstream UI drift before it breaks the harness.
+
+The snapshot JSON files themselves are gitignored
+(`docs/testing/ui-snapshots/*.json`): they produce noisy diffs and are
+specific to the moment of capture. This directory is checked in so the
+path exists; the README + `.gitkeep` are the only tracked files.
+
+## Capture
+
+Requires a running `claude-desktop` build with the main-process
+debugger attached on port 9229 (Developer menu → Enable Main Process
+Debugger). Then, from `tools/test-harness/`:
+
+```sh
+npx tsx explore/explore.ts snapshot baseline-code-tab
+# → wrote /…/docs/testing/ui-snapshots/baseline-code-tab.json
+```
+
+Snapshot names are restricted to `[a-zA-Z0-9._-]`.
+
+## Compare
+
+```sh
+npx tsx explore/explore.ts diff baseline-code-tab after-feature-x
+```
+
+Add `--json` for machine-readable output. Add `--exit-on-diff` to fail
+the process (exit code 3) when there are any entries — useful inside a
+CI guard.
+
+`diff` arguments accept either a bare name (looked up in this dir,
+`.json` appended) or an explicit path.
+
+### What counts as a diff
+
+| Kind      | Meaning                                                 |
+|-----------|---------------------------------------------------------|
+| `removed` | Element keyed in A absent from B (drift signal).        |
+| `changed` | Same key, different visible text or structural detail.  |
+| `added`   | New key in B (informational only — surface gained).     |
+
+## Snapshot shape
+
+```jsonc
+{
+  "capturedAt": "2026-05-02T17:30:00Z",
+  "claudeAiUrl": "https://claude.ai/…",
+  "appVersion": "1.1.7714",        // from app.getVersion(), null on failure
+  "pageState":         { "url", "title", "readyState" },
+  "dfPills":           [ /* Chat / Cowork / Code top-level tabs */ ],
+  "compactPills":      [ /* env pill, Select-folder pill, … */ ],
+  "ariaLabeledButtons":[ /* every <button[aria-label]>, capped at 200 */ ],
+  "openMenu":          { "ariaLabelledBy", "ariaLabel", "items": [...] },
+  "modals":            [ /* role=dialog with heading + buttons */ ]
+}
+```
+
+Discovery is by **structural shape**, never by minified Tailwind class
+names. See the why-block at the top of
+[`tools/test-harness/explore/snapshot.ts`](../../../tools/test-harness/explore/snapshot.ts)
+for the rationale.
+
+## Other subcommands
+
+```sh
+npx tsx explore/explore.ts            # full snapshot to stdout
+npx tsx explore/explore.ts pills      # df-pills + compact-pills + state
+npx tsx explore/explore.ts menu       # currently-open menu (or null)
+npx tsx explore/explore.ts find <re>  # regex search over text + aria-label
+```
+
+`find` regex is case-insensitive by default.
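
The `removed` / `changed` / `added` kinds from the "What counts as a diff" table can be sketched as below. This is a minimal illustration under stated assumptions, not the code in `tools/test-harness/explore/`: the `Elements` map (key to visible text), `diffElements`, `exitCodeFor`, and `isValidSnapshotName` are hypothetical names, and the real snapshot keys elements per section rather than in one flat map. Only the exit code 3 and the `[a-zA-Z0-9._-]` name restriction come from the README.

```typescript
// Hypothetical flat view of a snapshot: element key -> visible text.
type Elements = Record<string, string>;

type DiffKind = "removed" | "changed" | "added";

interface DiffEntry {
  kind: DiffKind;
  key: string;
  before?: string;
  after?: string;
}

// Snapshot names are restricted to [a-zA-Z0-9._-] per the README.
function isValidSnapshotName(name: string): boolean {
  return /^[a-zA-Z0-9._-]+$/.test(name);
}

// Classify every key the way the diff table describes.
function diffElements(a: Elements, b: Elements): DiffEntry[] {
  const entries: DiffEntry[] = [];
  for (const [key, before] of Object.entries(a)) {
    const after = b[key];
    if (after === undefined) {
      entries.push({ kind: "removed", key, before }); // drift signal
    } else if (after !== before) {
      entries.push({ kind: "changed", key, before, after });
    }
  }
  for (const [key, after] of Object.entries(b)) {
    if (a[key] === undefined) {
      entries.push({ kind: "added", key, after }); // informational only
    }
  }
  return entries;
}

// Assumed mapping of --exit-on-diff: exit code 3 when any entry exists.
function exitCodeFor(entries: DiffEntry[], exitOnDiff: boolean): number {
  return exitOnDiff && entries.length > 0 ? 3 : 0;
}
```

A CI guard would then call `process.exit(exitCodeFor(entries, true))` after printing the entries. Whether `added` entries alone should fail the build is a policy choice; the README's "any entries" wording suggests they do.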
--- a/docs/testing/ui-vocabulary.json
+++ b/docs/testing/ui-vocabulary.json
@@ -0,0 +1,360 @@
+{
+  "derivedAt": "2026-05-03T02:51:23.409Z",
+  "sourceInventory": {
+    "capturedAt": "2026-05-03T00:21:38.299Z",
+    "appVersion": "1.5354.0",
+    "walkerVersion": "6",
+    "totalElements": 383
+  },
+  "stable": [
+    "Accept edits",
+    "Add",
+    "Add connector",
+    "Add files",
+    "Add files or photosCtrl+U",
+    "Add files, connectors, and more",
+    "Add from GitHub",
+    "Add to project",
+    "All projects",
+    "Appearance",
+    "Ask",
+    "Back",
+    "Back to Claude",
+    "Chat",
+    "Clear active",
+    "Close",
+    "Close side chat",
+    "Close suggestions",
+    "Code",
+    "Completed: See Claude workTry a quick task — Claude does it, you watch",
+    "ConcisePreset",
+    "Connectors",
+    "Conversation ID reference",
+    "Copy invite",
+    "Cowork",
+    "Create custom style",
+    "Create engaging headlines",
+    "Create presentation scripts",
+    "Develop content templates",
+    "Develop storytelling frameworks",
+    "Dictation settings",
+    "Dismiss checklist",
+    "Dismiss guest pass",
+    "Draft PR visibility on GitHub",
+    "ELKO HRN-33 and HRN-31 manuals",
+    "Edit Instructions",
+    "Electron apps Linux users desperately want but can't have\nDespite Electron's cross-platform promise, several high-profil",
+    "Expand sidebar",
+    "ExplanatoryPreset",
+    "Feedback submission",
+    "Filter",
+    "Fine-tuning diffusion models with reinforcement learning",
+    "FormalPreset",
+    "Forward",
+    "From Calendar",
+    "From Gmail",
+    "Get apps and extensions",
+    "Gmail",
+    "Google Calendar",
+    "How to use ClaudeAaddrick Williams",
+    "Install",
+    "Invalid session description",
+    "Lamination plate position offsetsAaddrick Williams",
+    "Learn",
+    "Learn about styles",
+    "Learn how to use Cowork safely",
+    "Learn more about styles",
+    "Learning",
+    "LearningPreset",
+    "Local",
+    "Manage connectors",
+    "Menu",
+    "Model: Legacy Model",
+    "Model: Opus 4.7 Adaptive",
+    "Model: Sonnet 4.6 Adaptive",
+    "More navigation items",
+    "More options",
+    "More options for Fine-tuning diffusion models with reinforcement learning",
+    "More options for How to use Claude",
+    "New artifact",
+    "New project",
+    "Open session Audit for elementary-data supply chain vulnerability",
+    "Open session Find contact method for Claude Desktop issue",
+    "Open session Plan automated testing strategy for desktop app",
+    "Open session Test DNS query for Claude desktop package",
+    "Open session for PR #552",
+    "Pair your phoneSend tasks from your phone for Claude to run here",
+    "Pin project",
+    "Pinned",
+    "Plugins",
+    "Press and hold to record",
+    "Recents",
+    "Research",
+    "Research mode",
+    "Schedule a recurring taskGreat for reminders, reports, or regular check-ins",
+    "Scroll to bottom",
+    "Search",
+    "Search projects",
+    "Select folder…",
+    "Send",
+    "Settings",
+    "Show 5 more",
+    "Show more",
+    "Skills",
+    "Skip to content",
+    "Sort by",
+    "Start a task in Cowork",
+    "Style: Formal",
+    "Terms apply",
+    "Test",
+    "Testing and Quality Assurance",
+    "Tool accessLoad tools when needed",
+    "Transcript view mode",
+    "Untitled",
+    "Use style",
+    "View all",
+    "Web search",
+    "West Central Schools provincial takeover investigation",
+    "Work in a project",
+    "Write",
+    "Write something in the voice of my favorite historical figure",
+    "Your artifactsYour artifacts",
+    "about_tab.py, py, 60 lines",
+    "New chat⌘N",
+    "New session⌘N",
+    "New task⌘N",
+    "Artifacts",
+    "Live artifacts",
+    "Scheduled",
+    "DispatchBeta",
+    "Routines",
+    "How to use Claude",
+    "Projects",
+    "Customize"
+  ],
+  "instanceShapes": [
+    {
+      "id": "plan-badge",
+      "regex": "^.+·(Free|Pro|Max|Team|Enterprise)[-\\s]*$",
+      "flags": "u",
+      "pattern": "\\w+·(Free|Pro|Max|Team|Enterprise)",
+      "matchedNames": [
+        "AWAaddrick·Max"
+      ]
+    },
+    {
+      "id": "opus-version",
+      "regex": "^Opus \\d",
+      "flags": "",
+      "pattern": "^Opus \\d",
+      "matchedNames": [
+        "Opus 4.7 1M· Extra high",
+        "Opus 4.7Most capable for ambitious work"
+      ]
+    },
+    {
+      "id": "sonnet-version",
+      "regex": "^Sonnet \\d",
+      "flags": "",
+      "pattern": "^Sonnet \\d",
+      "matchedNames": [
+        "Sonnet 4.6Most efficient for everyday tasks"
+      ]
+    },
+    {
+      "id": "haiku-version",
+      "regex": "^Haiku \\d",
+      "flags": "",
+      "pattern": "^Haiku \\d",
+      "matchedNames": [
+        "Haiku 4.5Fastest for quick answers"
+      ]
+    },
+    {
+      "id": "percentage",
+      "regex": "\\d{1,3}%$",
+      "flags": "",
+      "pattern": "\\d{1,3}%",
+      "matchedNames": [
+        "Usage: plan 11%"
+      ]
+    },
+    {
+      "id": "relative-date",
+      "regex": "(Today|Yesterday|\\d+\\s(day|hour|minute|second|week|month|year)s?\\sago)",
+      "flags": "",
+      "pattern": "(Today|Yesterday|\\d+\\s(day|hour|minute|second|week|month|year)s?\\sago)(\\+\\d+)?",
+      "matchedNames": [
+        "Claude Desktop Debian1 year ago",
+        "Draft PR visibility on GitHubYesterday",
+        "ELKO HRN-33 and HRN-31 manualsYesterday",
+        "Feedback submissionYesterday",
+        "Find contact method for Claude Desktop issuePR #552 · Yesterday",
+        "Review PR 555 for issue 558 fixToday",
+        "Review and analyze issue 545Yesterday"
+      ]
+    },
+    {
+      "id": "size-with-unit",
+      "regex": "^\\d+\\.\\d+\\s\\w+",
+      "flags": "",
+      "pattern": "^\\d+\\.\\d+\\s\\w+",
+      "matchedNames": []
+    },
+    {
+      "id": "user-handle",
+      "regex": "@\\w+",
+      "flags": "",
+      "pattern": "@\\w+",
+      "matchedNames": []
+    },
+    {
+      "id": "long-title",
+      "regex": "^[A-Z][a-z]+ [A-Z][a-z]+ [a-z]",
+      "flags": "",
+      "pattern": null,
+      "matchedNames": [
+        "Evaluate Terraform for infrastructure setup",
+        "Host Obsidian library in second database"
+      ]
+    }
+  ],
+  "suspect": [
+    "Adaptive thinkingThinks for more complex tasks",
+    "Add build instructions and patch toggle option",
+    "Add build instructions and quick menu patch toggle",
+    "Add plugin",
+    "Audit for elementary-data supply chain vulnerability",
+    "Automate",
+    "Browse pluginsAdd pre-built knowledge for your field.",
+    "Build adversarial resume review platform MVP",
+    "Change fonts to Lexend",
+    "Check Quad9 DNS resolution for package domain",
+    "Check flight map tile caching history",
+    "Check for Trivy supply chain vulnerability",
+    "Claude Desktop DebianAaddrick Williams",
+    "Claude Desktop DebianEnter",
+    "Claude is AI and can make mistakes. Please double-check responses.",
+    "Claude prompting guide.md, md, 413 lines",
+    "Clawdmartclawdmart.comClaudeCreate a shopping list, go on Chrome, and make an order",
+    "Collapse sidebar",
+    "Compare GPU options for gaming performance",
+    "Concise",
+    "Connect your appsLet Claude read and write to the tools you already use.",
+    "Copy",
+    "Create & edit styles",
+    "Create new skillsTeach Claude your processes, team norms, and expertise.",
+    "Create user documentation",
+    "Customer Email",
+    "Data",
+    "Develop editorial guidelines",
+    "Dispatch background conversation",
+    "Download",
+    "Draw",
+    "Edit",
+    "Educational Content",
+    "Evaluate productization viability of methodology",
+    "Explanatory",
+    "Find contact method for Claude Desktop issue",
+    "Fix Claude Desktop installation on Debian",
+    "Formal",
+    "Formulas",
+    "Give negative feedback",
+    "Give positive feedback",
+    "Help me develop a unique voice for an audience",
+    "Home",
+    "How to use ClaudeAn example project that also doubles as a how-to guide for using Claude. Chat with it to learn more abo",
+    "Identify tools for session start hook",
+    "Insert",
+    "Investigate GitHub Actions workflow failure",
+    "Investigate GitHub issue 394 comment",
+    "Investigate leaked crates.io API key",
+    "Investigate leaked crates.io token in repository",
+    "Lamination plate position offsetsAdjust existing code to just populate a table with original positions, new positions, a",
+    "Marketing Blog Post",
+    "More models",
+    "More options for Claude Desktop Debian",
+    "More options for Lamination plate position offsets",
+    "My downloads folder is a mess! Can you clean it up?",
+    "Normal",
+    "Open",
+    "Options",
+    "Page Layout",
+    "Php lsp",
+    "Plan automated testing strategy for desktop app",
+    "Playwright",
+    "Product Review",
+    "Read health data",
+    "Retry",
+    "Review",
+    "Review PR 555 for issue 558 fix",
+    "Review and address issue 88",
+    "Review and analyze issue 545",
+    "Review and close stale issues",
+    "Review and investigate GitHub issue 445",
+    "Review issue 156",
+    "Review issue 172 and document related history",
+    "Review issue 373",
+    "Review last three repository commits",
+    "Review path resolution issues and pull requests",
+    "Review project issues and pull requests",
+    "Review recent comments, issues, and pull requests",
+    "Select a folder",
+    "Share chat",
+    "Short Story",
+    "Start a new project",
+    "Start return",
+    "Style: Concise",
+    "Style: Explanatory",
+    "Style: Learning",
+    "Test DNS lookup with Quad9 resolver",
+    "Test DNS query for Claude desktop package",
+    "Test path resolution",
+    "Test startsession hook functionality",
+    "Troubleshoot modem downstream connection issue",
+    "Turn these receipts into an expense report",
+    "Typescript lsp",
+    "Unpin project",
+    "Untitled, rename chat",
+    "View",
+    "Write case studies",
+    "Write speech drafts",
+    "analyze_project.py, py, 220 lines",
+    "base_half_sheet.py, py, 32 lines",
+    "changelog_viewer_component.py, py, 113 lines",
+    "colors.py, py, 103 lines",
+    "compensation.py, py, 50 lines",
+    "components.py, py, 118 lines",
+    "components.py, py, 119 lines",
+    "config_reader.py, py, 120 lines",
+    "contraction_tab.py, py, 105 lines",
+    "contraction_tab.py, py, 82 lines",
+    "conversions.py, py, 28 lines",
+    "data_parser.py, py, 87 lines",
+    "dialogs.py, py, 34 lines",
+    "file_operations.py, py, 43 lines",
+    "log.py, py, 140 lines",
+    "log.py, py, 236 lines",
+    "machines.ini, ini, 2 lines",
+    "main.py, py, 203 lines",
+    "main.py, py, 264 lines",
+    "output_tab.py, py, 191 lines",
+    "output_tab.py, py, 246 lines",
+    "process_request.py, py, 632 lines",
+    "processing_format.ini, ini, 2 lines",
+    "setup_tab.py, py, 120 lines",
+    "setup_tab.py, py, 177 lines",
+    "sheet_dimensions.ini, ini, 3 lines",
+    "version 0.1.0.md, md, 42 lines",
+    "version 0.1.1.md, md, 31 lines",
+    "version 0.1.2.md, md, 18 lines",
+    "View all plans",
+    "Get apps and extensions",
+    "Gift Claude",
+    "Language",
+    "Get help",
+    "Learn more",
+    "Log out",
+    "SettingsCtrl,"
+  ]
+}
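
One way a consumer might use `ui-vocabulary.json` is to bucket an accessible name observed in the renderer. The sketch below assumes a lookup order (exact `stable` match, then `instanceShapes` regexes, then `suspect`) that the file itself does not state; `classify`, `Verdict`, and the trimmed `vocab` excerpt are illustrative names and data, with the regexes copied from the entries above.

```typescript
interface InstanceShape {
  id: string;
  regex: string;
  flags: string;
}

interface Vocabulary {
  stable: string[];
  instanceShapes: InstanceShape[];
  suspect: string[];
}

type Verdict =
  | { bucket: "stable" }
  | { bucket: "instance"; shapeId: string }
  | { bucket: "suspect" }
  | { bucket: "unknown" };

// Assumed precedence: stable names win, then the first matching shape.
function classify(name: string, vocab: Vocabulary): Verdict {
  if (vocab.stable.includes(name)) return { bucket: "stable" };
  for (const shape of vocab.instanceShapes) {
    if (new RegExp(shape.regex, shape.flags).test(name)) {
      return { bucket: "instance", shapeId: shape.id };
    }
  }
  if (vocab.suspect.includes(name)) return { bucket: "suspect" };
  return { bucket: "unknown" };
}

// Trimmed excerpt of the vocabulary, for illustration only.
const vocab: Vocabulary = {
  stable: ["Settings", "Send"],
  instanceShapes: [
    { id: "opus-version", regex: "^Opus \\d", flags: "" },
    { id: "percentage", regex: "\\d{1,3}%$", flags: "" },
  ],
  suspect: ["Retry"],
};
```

Compiling each `regex` with its own `flags` field matters for entries such as `plan-badge`, which carries the `u` flag.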
--- a/docs/testing/ui/README.md
+++ b/docs/testing/ui/README.md
@@ -0,0 +1,78 @@
+# UI Element Inventory
+
+This directory holds per-surface UI checklists. Where [`../cases/`](../cases/) tests verify *behavior end-to-end*, files here verify *every UI element renders and responds* on Linux.
+
+## Why a separate directory
+
+A functional test like [T17 — Folder picker opens](../cases/code-tab-foundations.md#t17--folder-picker-opens) verifies the folder picker works. A UI checklist asks the smaller, more granular questions:
+
+- Is the **Select folder** button visually present?
+- Does its hover state render?
+- Is the icon next to it the correct shape on a HiDPI screen?
+- Does it tab-focus correctly?
+- Does it have an accessible name (a11y)?
+
+Functional tests catch "the feature broke." UI checklists catch "the feature works but looks wrong." Both matter on Linux because Electron under different DEs / display servers / GTK theme combinations produces visual artifacts that aren't behavioral failures.
+
+## Layout
+
+| File | Surface | Notes |
+|------|---------|-------|
+| [`window-chrome-and-tabs.md`](./window-chrome-and-tabs.md) | OS window frame + hybrid in-app topbar + Chat/Cowork/Code tabs | Crosses with [T04](../cases/tray-and-window-chrome.md#t04--window-decorations-draw), [T07](../cases/tray-and-window-chrome.md#t07--in-app-topbar-renders--clickable) |
+| [`tray.md`](./tray.md) | System tray icon + menu + theme variants | Crosses with [T03](../cases/tray-and-window-chrome.md#t03--tray-icon-present), [S08](../cases/tray-and-window-chrome.md#s08--tray-icon-doesnt-duplicate-after-nativetheme-update) |
+| [`sidebar.md`](./sidebar.md) | Session sidebar in Code tab | Crosses with [T29](../cases/code-tab-workflow.md#t29--worktree-isolation), [T30](../cases/code-tab-workflow.md#t30--auto-archive-on-pr-merge), [S24](../cases/platform-integration.md#s24--dispatch-spawned-code-session-appears-with-badge-and-notification) |
+| [`prompt-area.md`](./prompt-area.md) | Code-tab prompt input area | Crosses with [T18](../cases/code-tab-foundations.md#t18--drag-and-drop-files-into-prompt), [T32](../cases/code-tab-workflow.md#t32--slash-command-menu) |
+| [`code-tab-panes.md`](./code-tab-panes.md) | Diff, preview, terminal, file, tasks, subagent, plan, side-chat | Crosses with [T19](../cases/code-tab-foundations.md#t19--integrated-terminal), [T20](../cases/code-tab-foundations.md#t20--file-pane-opens-and-saves), [T21](../cases/code-tab-workflow.md#t21--dev-server-preview-pane), [T22](../cases/code-tab-workflow.md#t22--pr-monitoring-via-gh), [T31](../cases/code-tab-workflow.md#t31--side-chat-opens) |
+| [`settings.md`](./settings.md) | All Settings pages | Crosses with [S20](../cases/routines.md#s20--keep-computer-awake-inhibits-idle-suspend), [S22](../cases/platform-integration.md#s22--computer-use-toggle-is-absent-or-visibly-disabled-on-linux), [T30](../cases/code-tab-workflow.md#t30--auto-archive-on-pr-merge) |
+| [`routines-page.md`](./routines-page.md) | Routines list + new-routine form + detail page | Crosses with [T26](../cases/routines.md#t26--routines-page-renders), [T27](../cases/routines.md#t27--scheduled-task-fires-and-notifies) |
+| [`connectors-and-plugins.md`](./connectors-and-plugins.md) | Connector picker, connector list, plugin browser, plugin manager | Crosses with [T11](../cases/extensibility.md#t11--plugin-install-anthropic--partners), [T33](../cases/extensibility.md#t33--plugin-browser), [T34](../cases/code-tab-handoff.md#t34--connector-oauth-round-trip) |
+| [`quick-entry.md`](./quick-entry.md) | Quick Entry popup window | Crosses with [T06](../cases/shortcuts-and-input.md#t06--quick-entry-global-shortcut-unfocused), [S10](../cases/shortcuts-and-input.md#s10--quick-entry-popup-is-transparent-no-opaque-square-frame) |
+| [`notifications.md`](./notifications.md) | libnotify rendering for all notification sources | Crosses with [T23](../cases/code-tab-handoff.md#t23--desktop-notifications-fire), [T27](../cases/routines.md#t27--scheduled-task-fires-and-notifies), [S24](../cases/platform-integration.md#s24--dispatch-spawned-code-session-appears-with-badge-and-notification) |
+
+## Standard checklist row
+
+Each UI file uses tables of the form:
+
+| Element | Selector / location | Expected | Notes |
+|---------|---------------------|----------|-------|
+| Close button | Top-right of titlebar | Renders, hover state visible, click hides to tray (see T08) | KDE-W: ✓ |
+
+Columns:
+
+- **Element** — human-readable name.
+- **Selector / location** — DOM selector if known, otherwise plain-language pointer ("right-click menu, second item from top"). The selector column is what becomes a Playwright/CDP assertion when automation lands.
+- **Expected** — what the user should see / what should happen on click. Concise.
+- **Notes** — known issues, environment caveats, screenshot links.
+
+## Sweep workflow
+
+A UI sweep on one environment row (for example KDE-W):
+
+1. Take a baseline screenshot of each surface (`scrot`, `gnome-screenshot`, `grim`, `flameshot`).
+2. Walk each table top-to-bottom. For each row, look at the element, click/hover/tab to it, compare against Expected.
+3. Mark anomalies in the **Notes** column or file an issue if the deviation is environment-specific.
+4. Save screenshots of any failure to a dated folder; reference them inline.
+
+UI rows don't have stable IDs (`T##` / `S##`) — they're append-only checkpoints. When something becomes a regression candidate worth tracking long-term, promote it to a functional test in [`../cases/`](../cases/).
+
+## Automation roadmap
+
+Each UI checklist row is a candidate Playwright (via [Electron driver](https://playwright.dev/docs/api/class-electron)) or `xdotool` assertion:
+
+```typescript
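+// Playwright shape: `app` is the ElectronApplication, `page` its main window; the selector is illustrative
+await page.locator('[data-testid="close-button"]').click()
+expect(await (await app.browserWindow(page)).evaluate((w) => w.isVisible())).toBe(false)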
+```
+
+Or for pure visual diffing:
+
+```bash
+# scrot + perceptualdiff
+scrot -u baseline.png
+# ... interaction ...
+scrot -u current.png
+perceptualdiff baseline.png current.png
+```
+
+The structure here is intentionally diff-friendly: rows are stable, tables are append-only, selectors live in their own column.
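
Because selectors live in their own column, a checklist row can be lifted into a typed record before it becomes an assertion. The following is a minimal sketch, assuming the four-column layout from "Standard checklist row"; `ChecklistRow` and `parseChecklistRow` are hypothetical names, and escaped pipes (`\|`) inside a cell are not handled.

```typescript
interface ChecklistRow {
  element: string;
  location: string;
  expected: string;
  notes: string;
}

// Parse one Markdown table line into a row, or null when the line is
// not a four-cell table row (or is the |---|---| separator line).
function parseChecklistRow(line: string): ChecklistRow | null {
  const trimmed = line.trim();
  if (!trimmed.startsWith("|") || !trimmed.endsWith("|")) return null;
  const cells = trimmed
    .slice(1, -1)
    .split("|")
    .map((cell) => cell.trim());
  if (cells.length !== 4) return null;
  if (cells.every((cell) => /^:?-+:?$/.test(cell))) return null;
  const [element, location, expected, notes] = cells as [string, string, string, string];
  return { element, location, expected, notes };
}
```

The header line (`| Element | Selector / location | … |`) parses like any other row, so a caller would skip the first row of each table. Tables with three columns, such as the failure-mode tables, return `null` by design.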
--- a/docs/testing/ui/code-tab-panes.md
+++ b/docs/testing/ui/code-tab-panes.md
@@ -0,0 +1,114 @@
+# UI — Code Tab Panes
+
+Drag-and-drop panes inside a Code-tab session: diff, preview, terminal, file editor, tasks, subagent, plan, side chat. Related functional tests: [T19](../cases/code-tab-foundations.md#t19--integrated-terminal), [T20](../cases/code-tab-foundations.md#t20--file-pane-opens-and-saves), [T21](../cases/code-tab-workflow.md#t21--dev-server-preview-pane), [T22](../cases/code-tab-workflow.md#t22--pr-monitoring-via-gh), [T31](../cases/code-tab-workflow.md#t31--side-chat-opens).
+
+## Pane chrome (common)
+
+| Element | Location | Expected | Notes |
+|---------|----------|----------|-------|
+| Pane header | Top of pane | Shows pane title, drag handle, close button | — |
+| Drag handle | Pane header | Drag repositions the pane in the layout | — |
+| Resize handle | Edge between panes | Drag resizes; double-click resets | — |
+| Close pane button | Pane header right | Closes the pane; same as the `Cmd+\` / `Ctrl+\` shortcut | — |
+| Views menu | Session toolbar | Lists all openable panes; click to add | — |
+
+## Diff pane
+
+| Element | Location | Expected | Notes |
+|---------|----------|----------|-------|
+| Diff stats indicator | Chat / sidebar (entry point) | Shows `+12 -1` style. Click opens diff pane | — |
+| File list | Left side of pane | Lists changed files, click to navigate | — |
+| Diff content | Right side | Side-by-side or unified diff renders cleanly | Theme-aware (dark/light) |
+| Line click → comment box | Click any line | Opens inline comment input | — |
+| Comment submit (`Cmd+Enter` / `Ctrl+Enter`) | Press the shortcut after writing | Submits all comments at once | — |
+| Accept button | Per-file or per-hunk | Applies the change to disk | — |
+| Reject button | Per-file or per-hunk | Discards the change | — |
+| **Review code** button | Top-right of pane | Triggers Claude self-review of diff | — |
+
+## Preview pane
+
+| Element | Location | Expected | Notes |
+|---------|----------|----------|-------|
+| Preview dropdown | Session toolbar | Lists configured servers from `.claude/launch.json` | — |
+| **Start** action | Per-server entry | Launches the dev server | — |
+| **Stop** action | Per-server entry | Stops the dev server | — |
+| **Stop all servers** | Dropdown bottom | Stops every running server | — |
+| **Edit configuration** | Dropdown bottom | Opens `.claude/launch.json` in the file pane | — |
+| **Persist sessions** toggle | Dropdown | Persists cookies / localStorage across server restarts | — |
+| Embedded browser frame | Pane content | Renders the running app | Uses Electron `<webview>` or `BrowserView` |
+| URL bar / address | Top of pane | Shows current URL; editable | — |
+| Reload button | Top of pane | Reloads the embedded URL | — |
+| DevTools toggle | Top of pane (right) | Opens Electron DevTools for the embedded view | — |
+| Auto-verify screenshots | When Claude verifies a change | Brief overlay shows screenshot being captured | — |
+
+## Terminal pane
+
+| Element | Location | Expected | Notes |
+|---------|----------|----------|-------|
+| Terminal pane | Opened via ``Ctrl+` `` or Views menu | Bash/zsh/fish session in the working directory ([T19](../cases/code-tab-foundations.md#t19--integrated-terminal)) | Local sessions only |
+| Cursor | Inside terminal | Blinks; cursor shape per shell | — |
+| Resize | Drag pane edges | Terminal cols/rows update; `tput cols` reflects new width | SIGWINCH should fire |
+| Scrollback | Type many lines | Scrollable history; mouse scroll wheel works | — |
+| Color rendering | Run `ls --color=auto`, `tput colors` | 256-color or truecolor support; theme-aware | — |
+| Copy / paste | Select + `Ctrl+Shift+C` / `Ctrl+Shift+V` | Standard terminal-emulator shortcuts | — |
+| Working directory inheritance | Open pane in a session | Opens at the session's project folder | Confirm with `pwd` |
+
+## File pane
+
+| Element | Location | Expected | Notes |
+|---------|----------|----------|-------|
+| File pane | Opened by clicking a file path | Shows file content, syntax-highlighted | — |
+| Save button | Pane toolbar | Writes current content to disk | — |
+| Path label | Pane header | Click copies absolute path | — |
+| On-disk-changed warning | If file changed externally after open | Banner with Override / Discard options ([T20](../cases/code-tab-foundations.md#t20--file-pane-opens-and-saves)) | — |
+| Discard button | When edits unsaved | Reverts to disk content | — |
+| Cursor / selection | Inside content | Renders correctly; multi-cursor not supported | — |
+| Find / replace | `Ctrl+F` | Opens find-in-file overlay | Verify scoped to current pane only |
+
+## Tasks pane / subagent pane
+
+| Element | Location | Expected | Notes |
+|---------|----------|----------|-------|
+| Tasks pane | Opened via Views menu | Lists subagents, background shell commands, workflows | — |
+| Task entry click | Click any task | Opens the subagent pane with output | — |
+| Stop task button | Per-task | Sends interrupt signal | — |
+| Task status indicator | Per-task | Running / Completed / Failed | — |
+| Output stream | Inside subagent pane | Live-updating stdout/stderr | — |
+
+## Side chat overlay
+
+| Element | Location | Expected | Notes |
+|---------|----------|----------|-------|
+| Side chat trigger | `Ctrl+;` or `/btw` in main prompt | Opens overlay attached to current session ([T31](../cases/code-tab-workflow.md#t31--side-chat-opens)) | — |
+| Side chat content | Overlay body | Reads main thread context; replies stay in side chat | — |
+| Close button | Overlay top-right | Closes side chat, returns focus to main session | — |
+
+## CI status bar
+
+| Element | Location | Expected | Notes |
+|---------|----------|----------|-------|
+| CI status row | Below prompt area when PR open | Shows current check states | Crosses with [T22](../cases/code-tab-workflow.md#t22--pr-monitoring-via-gh) |
+| **Auto-fix** toggle | Top of CI bar | Toggles automatic check-failure fixes | — |
+| **Auto-merge** toggle | Top of CI bar | Toggles auto-merge on green | Requires GitHub repo setting |
+| Per-check entries | Each CI check | Shows pass / fail / pending state | Click to see logs |
+| CI completion notification | When all checks resolve | Desktop notification posted ([T23](../cases/code-tab-handoff.md#t23--desktop-notifications-fire)) | — |
+
+## View modes
+
+| Mode | Trigger | Expected | Notes |
+|------|---------|----------|-------|
+| Normal | Default; cycle via `Ctrl+O` | Tool calls collapsed into summaries, full text responses | — |
+| Verbose | Cycle via `Ctrl+O` | Every tool call, file read, intermediate step | Use for debugging |
+| Summary | Cycle via `Ctrl+O` | Only Claude's final responses + changes | Use when scanning many sessions |
+| Transcript view dropdown | Next to send button | Same as `Ctrl+O` | — |
+
+## Failure modes to watch for
+
+| Symptom | Likely cause | Notes |
+|---------|--------------|-------|
+| Pane drag doesn't snap to layout zones | Layout engine state corruption; restart session | — |
+| Terminal cursor doesn't blink | `xterm-256color` not propagated; `TERM` env wrong | `echo $TERM` inside the pane |
+| File pane "Save" silently no-ops | Read-only filesystem ([S28](../cases/extensibility.md#s28--worktree-creation-surfaces-clear-error-on-read-only-mounts)); permissions wrong | `stat <file>` for ownership |
+| Preview pane embedded browser blank | Dev server didn't bind expected port; `autoPort` config | Check launcher log; `lsof -i :<port>` |
+| Auto-verify screenshots fail | Headless screenshot in embedded view broken on Wayland | Test on X11 row; report to upstream |
+| CI bar shows stale state | `gh` polling interval; rate-limited | `gh api rate_limit`; manual `gh pr checks <num>` |
--- a/docs/testing/ui/connectors-and-plugins.md
+++ b/docs/testing/ui/connectors-and-plugins.md
@@ -0,0 +1,70 @@
+# UI — Connectors & Plugins
+
+Connector picker, connectors list, plugin browser, plugin manager. Related functional tests: [T11](../cases/extensibility.md#t11--plugin-install-anthropic--partners), [T33](../cases/extensibility.md#t33--plugin-browser), [T34](../cases/code-tab-handoff.md#t34--connector-oauth-round-trip), [S27](../cases/extensibility.md#s27--plugins-install-per-user-not-into-system-paths).
+
+## Connector picker (in-session)
+
+Triggered by `+` → **Connectors** in the prompt area.
+
+| Element | Location | Expected | Notes |
+|---------|----------|----------|-------|
+| Connectors menu | Opened from `+` button | Lists configured connectors + "Manage connectors" entry | — |
+| Per-connector row | Menu item | Name, status indicator (connected / not configured), action button | — |
+| **Manage connectors** entry | Bottom of menu | Opens Settings → Connectors | Crosses with [`settings.md`](./settings.md#connectors) |
+| Empty state | When no connectors configured | Helpful prompt with "Add connector" call to action | — |
+
+## Connectors list (Settings → Connectors)
+
+See [`settings.md`](./settings.md#connectors) for the surface.
+
+## Add-connector flow
+
+Triggered from the connector picker or Settings.
+
+| Element | Location | Expected | Notes |
+|---------|----------|----------|-------|
+| Connector catalog | Modal body | Searchable list (Slack, GitHub, Linear, Notion, Google Calendar, etc.) | — |
+| Per-connector tile | Catalog entry | Logo, name, short description | — |
+| **Connect** button | Per tile | Initiates OAuth flow ([T34](../cases/code-tab-handoff.md#t34--connector-oauth-round-trip)) | Click → `xdg-open` to provider |
+| OAuth in-app overlay (if used) | Replaces system browser handoff in some flows | Embedded login pane | — |
+| Permission consent screen | OAuth provider side | Provider's UI; not under our control | — |
+| Callback completion | After OAuth completes | Returns to Claude Desktop, connector now in list | If the URL scheme handler is broken, user is stranded in browser |
+| Custom connector entry point | Catalog bottom | "Add custom connector via remote MCP" link | — |
+
+## Plugin browser
+
+Triggered by `+` → **Plugins** → **Add plugin**, or from sidebar **Customize** → **Plugins**.
+
+| Element | Location | Expected | Notes |
+|---------|----------|----------|-------|
+| Plugin browser modal | Opened from menu | Searchable marketplace catalog | — |
+| Marketplace selector | Top of modal | Default: Anthropic official; user-configured marketplaces also visible | — |
+| Per-plugin tile | Catalog body | Name, author, description, install count | — |
+| **Install** button | Per tile | Click installs to `~/.claude/plugins/` ([T11](../cases/extensibility.md#t11--plugin-install-anthropic--partners), [S27](../cases/extensibility.md#s27--plugins-install-per-user-not-into-system-paths)) | — |
+| Plugin scope selector | Per install | User / Project / Local-only | — |
+| Install progress indicator | During install | Spinner + "Installing X..." text | — |
+| Install success state | After install | Confirmation; plugin now in **Manage plugins** | — |
+| Install error state | On failure | Error message identifying the cause (network, signature, conflict) | — |
+
+## Manage plugins
+
+Triggered by `+` → **Plugins** → **Manage plugins**.
+
+| Element | Location | Expected | Notes |
+|---------|----------|----------|-------|
+| Installed plugins list | Modal body | One row per installed plugin | — |
+| Per-plugin row | List item | Name, version, scope (User / Project / Local), enable toggle, uninstall button | — |
+| Enable toggle | Per row | Toggles plugin on/off without uninstall | — |
+| **Uninstall** button | Per row | Removes plugin files from `~/.claude/plugins/` | Confirmation expected |
+| Plugin skills sub-list | Expand row | Lists skills, agents, hooks, MCP servers, LSP configs the plugin contributes | — |
+
+## Failure modes to watch for
+
+| Symptom | Likely cause | Notes |
+|---------|--------------|-------|
+| Connect OAuth doesn't return to app | Custom URI scheme not registered ([T34](../cases/code-tab-handoff.md#t34--connector-oauth-round-trip)) | `xdg-mime query default x-scheme-handler/claude` |
+| Plugin browser empty | Marketplace fetch failed; offline | DevTools network panel |
+| Install progress stalls | Network / signature verification | Launcher log; check `~/.claude/plugins/.partial/` for incomplete downloads |
+| Plugin installed but skills don't appear | Slash menu cache stale | Restart session |
+| Uninstall leaves files | Filesystem permissions; some plugin files owned by root | `find ~/.claude/plugins/ -not -user $USER` |
+| Connector "Connected" but tools fail | Token expired; backend refuses; needs reconnect | Disconnect → reconnect |
--- a/docs/testing/ui/notifications.md
+++ b/docs/testing/ui/notifications.md
@@ -0,0 +1,59 @@
+# UI — Desktop Notifications
+
+Notification rendering across DEs. The app dispatches notifications via `org.freedesktop.Notifications` (the freedesktop.org Desktop Notifications Specification, which libnotify implements on the client side); each DE renders them differently. Related functional tests: [T23](../cases/code-tab-handoff.md#t23--desktop-notifications-fire), [T27](../cases/routines.md#t27--scheduled-task-fires-and-notifies), [S24](../cases/platform-integration.md#s24--dispatch-spawned-code-session-appears-with-badge-and-notification).
+
+## Notification sources
+
+The app posts notifications for the following events. Each should fire reliably on every supported DE.
+
+| Source | Trigger | Expected text | Click action | Notes |
+|--------|---------|---------------|--------------|-------|
+| Scheduled task fires | When a routine starts a run | "Scheduled task `<name>` started" or similar | Focus the new session in sidebar | Crosses with [T27](../cases/routines.md#t27--scheduled-task-fires-and-notifies) |
+| Catch-up run | When a missed run starts after wake | "Catching up on `<name>`" + missed-time hint | Focus the catch-up session | Crosses with [T28](../cases/routines.md#t28--scheduled-task-catch-up-after-suspend) |
+| CI status change | When PR's CI state resolves | "CI passed for `<branch>`" or "CI failed: `<check>`" | Focus the session with CI bar | Crosses with [T22](../cases/code-tab-workflow.md#t22--pr-monitoring-via-gh) |
+| PR merged (auto-archive trigger) | When watched PR merges | "PR `<title>` merged. Session archived" | — | Crosses with [T30](../cases/code-tab-workflow.md#t30--auto-archive-on-pr-merge) |
+| Dispatch handoff | When a Dispatch task creates a Code session | "Dispatch session ready: `<task>`" | Focus the new Dispatch-badged session | Crosses with [S24](../cases/platform-integration.md#s24--dispatch-spawned-code-session-appears-with-badge-and-notification) |
+| Permission prompt awaiting approval | When a session in Ask mode needs user approval | "Claude needs your approval" | Focus the awaiting session | Sessions in Ask mode stall until answered |
+
+## Per-notification anatomy
+
+Each notification should include:
+
+| Element | Expected | Notes |
+|---------|----------|-------|
+| App identity | "Claude" or "Claude Desktop" as the source | DE-specific (Plasma shows the app name and icon prominently) |
+| Notification icon | App icon (theme-aware) | Should match the same icon set as the tray |
+| Title | Short event headline | One line, no truncation issues for typical lengths |
+| Body | One or two short lines of context | Wrap correctly for the DE's notification width |
+| Actions (if any) | Inline buttons (e.g. "Open", "Dismiss") | Some DEs show actions, some require expand |
+| Click target | Activates the relevant session/window | — |
+
+## Per-DE rendering
+
+| DE / daemon | Expected render | Caveats |
+|-------------|-----------------|---------|
+| KDE Plasma | Plasma's built-in notification server (part of plasmashell); popups appear near the system tray by default (bottom-right with the stock panel); inline action buttons supported | — |
+| GNOME Shell | gnome-shell built-in; appears top-center; limited action support | — |
+| Mako (wlroots) | Stacked notifications top-right by default; supports actions if config allows | — |
+| Dunst | Lightweight; respects `~/.config/dunst/dunstrc`; actions via keybinds | — |
+| swaync (Sway) | Notification center + popups | — |
+| Niri | No built-in notification UI; rendering comes from whichever standalone daemon is running (usually mako or dunst) | — |
+
+## Notification persistence
+
+| Element | Expected | Notes |
+|---------|----------|-------|
+| Notification history | DE-dependent (KDE has notification panel; GNOME has Calendar drawer; mako/dunst can be configured) | Don't rely on persistence — assume fire-and-forget |
+| Do-not-disturb mode | Respect DE's DND state | If user has DND on, notifications shouldn't fire — verify the daemon honors this |
+
+## Failure modes to watch for
+
+| Symptom | Likely cause | Diagnose with |
+|---------|--------------|---------------|
+| No notifications appear | No daemon running; service not registered | `gdbus call --session --dest=org.freedesktop.Notifications --object-path=/org/freedesktop/Notifications --method=org.freedesktop.DBus.Introspectable.Introspect`; `notify-send "test"` from terminal |
+| Notification fires but no icon | Icon path resolution failed; theme strip | Inspect the dbus call body for `app_icon` value |
+| Click does nothing | Action handler IPC missed; window already focused | Click while main window is hidden — does it appear? |
+| Title/body cut off | DE truncation policy | Test with shorter strings to confirm content vs. layout |
+| Notifications fire even in DND | Daemon ignoring DND, or our app sets `urgency=critical` inappropriately | Check `urgency` hint in the dbus call |
+| Notification persists indefinitely | `expire_timeout=0` (never expire) used inappropriately; per the spec, `-1` means the server default | Confirm timeout passed in the dbus call |
+| Per-source duplicates | Multiple subscribers to the same event | Isolate one source at a time |
--- a/docs/testing/ui/prompt-area.md
+++ b/docs/testing/ui/prompt-area.md
@@ -0,0 +1,76 @@
+# UI — Code Tab Prompt Area
+
+The prompt input area is where users type messages, attach files, pick model and permission mode, and trigger send/stop. Related functional tests: [T18](../cases/code-tab-foundations.md#t18--drag-and-drop-files-into-prompt), [T32](../cases/code-tab-workflow.md#t32--slash-command-menu).
+
+## Text input
+
+| Element | Location | Expected | Notes |
+|---------|----------|----------|-------|
+| Input field | Bottom center of session pane | Single-line on focus, expands to multi-line as user types | — |
+| Placeholder text | Empty state | Helpful hint ("Type to message Claude...") | — |
+| Cursor caret | Inside input | Blinks; visible against any background | — |
+| Multi-line autosize | Type a long message | Input grows up to a max height, then scrolls | — |
+| Word wrap | Long text | Wraps at field width without horizontal scroll | — |
+| Paste plain text | `Ctrl+V` after copying text | Inserts at cursor | — |
+| Paste image | `Ctrl+V` after copying an image | Attaches as file (see attachments below) | — |
+| `Enter` to send | Press Enter | Submits prompt | — |
+| `Shift+Enter` for newline | Press Shift+Enter | Inserts newline, doesn't submit | — |
+| `Esc` | Press Esc when prompt has content | DE-dependent; typically does nothing in input | — |
+| IME composition | Compose a CJK character | Composition UI renders correctly above the input | Fcitx5/IBus integration |
+
+## Attachments
+
+| Element | Location | Expected | Notes |
+|---------|----------|----------|-------|
+| Attachment button | Left of input (paperclip icon) | Click opens native file chooser | Wayland: portal-backed |
+| File-attached chip | Above or inside input | Shows filename + remove (X) button | — |
+| Multiple attachments | Attach 3+ files | Each shows as a separate chip; stacked if needed | — |
+| Image preview thumbnail | Image attachments | Shows small thumbnail | — |
+| PDF preview | PDF attachments | Shows generic PDF icon + filename | — |
+| Drag-drop overlay | Drag a file from file manager into the prompt | Overlay highlight indicates drop zone; release attaches ([T18](../cases/code-tab-foundations.md#t18--drag-and-drop-files-into-prompt)) | — |
+| `@filename` autocomplete | Type `@` in prompt | Dropdown shows matching project files | Local and SSH only |
+
+## `+` menu (skills, plugins, connectors)
+
+| Element | Position in menu | Expected | Notes |
+|---------|------------------|----------|-------|
+| `+` button | Adjacent to attachment button | Click opens menu | — |
+| **Slash commands** entry | Top of menu | Opens slash command picker (same as typing `/`) | Crosses with [T32](../cases/code-tab-workflow.md#t32--slash-command-menu) |
+| **Skills** entry | Mid-menu | Opens skill browser | — |
+| **Connectors** entry | Mid-menu | Opens connector picker / status | Crosses with [T34](../cases/code-tab-handoff.md#t34--connector-oauth-round-trip) |
+| **Plugins** entry | Mid-menu | Opens the Plugins submenu (**Manage plugins**, **Add plugin**) | Crosses with [T11](../cases/extensibility.md#t11--plugin-install-anthropic--partners), [T33](../cases/extensibility.md#t33--plugin-browser) |
+| **Add plugin** subentry | Under Plugins | Opens plugin browser | — |
+
+## Slash menu (triggered by typing `/`)
+
+| Element | Location | Expected | Notes |
+|---------|----------|----------|-------|
+| Menu container | Above prompt input | Modal-like overlay, scrollable | — |
+| Built-in commands section | Top of list | Lists `/btw`, `/compact`, etc. | — |
+| Project skills section | Mid-list | Lists skills from `.claude/skills/` | — |
+| User skills section | Mid-list | Lists skills from `~/.claude/skills/` | — |
+| Plugin skills section | Bottom-list | Lists skills from installed plugins | — |
+| Filter by typing | Type after `/` | Narrows the list | — |
+| Selected item insertion | `Enter` or click | Inserts highlighted token in prompt | — |
+| `Esc` to dismiss | Press Esc | Closes menu, keeps `/` typed | — |
+
+## Pickers next to send button
+
+| Element | Location | Expected | Notes |
+|---------|----------|----------|-------|
+| Model picker | Right of input | Dropdown of Sonnet, Opus, Haiku (per current plan availability) | `Ctrl+Shift+I` opens (`Cmd+Shift+I` on macOS) |
+| Permission mode picker | Right of input | Dropdown of Ask, Auto accept, Plan, Auto, Bypass | `Ctrl+Shift+M` opens (`Cmd+Shift+M` on macOS) |
+| Effort picker (when applicable) | Right of input | Dropdown of effort levels for adaptive-reasoning models | `Ctrl+Shift+E` opens (`Cmd+Shift+E` on macOS) |
+| Send button | Far right | Click submits prompt | — |
+| Stop button | Replaces Send while Claude responding | Click interrupts current response | `Esc` shortcut equivalent |
+| Usage ring | Adjacent to model picker | Shows context window usage + plan usage | Click for details |
+
+## Failure modes to watch for
+
+| Symptom | Likely cause | Notes |
+|---------|--------------|-------|
+| Drag-drop overlay doesn't appear | Electron drag-drop event not firing on Wayland | Try X11 fallback to isolate |
+| `@filename` autocomplete returns empty | Project-folder access not granted; folder picker [T17](../cases/code-tab-foundations.md#t17--folder-picker-opens) failed silently | Verify env pill shows the right folder |
+| Slash menu shows wrong skills | Settings shared between desktop and CLI ([T36](../cases/extensibility.md#t36--hooks-fire), [T37](../cases/extensibility.md#t37--claudemd-memory-loads)) | Check `~/.claude/skills/` content vs what's listed |
+| Send button greyed out unexpectedly | Permission mode or model not loaded | Refresh; check model dropdown |
+| IME composition broken | Electron IME pipeline regression | Test with simpler Electron app |
--- a/docs/testing/ui/quick-entry.md
+++ b/docs/testing/ui/quick-entry.md
@@ -0,0 +1,49 @@
+# UI — Quick Entry Popup
+
+The Quick Entry popup is the global-shortcut-triggered prompt overlay. Related functional tests: [T06](../cases/shortcuts-and-input.md#t06--quick-entry-global-shortcut-unfocused), [S09](../cases/shortcuts-and-input.md#s09--quick-window-patch-runs-only-on-kde-post-406-gate), [S10](../cases/shortcuts-and-input.md#s10--quick-entry-popup-is-transparent-no-opaque-square-frame), [S29](../cases/shortcuts-and-input.md#s29--quick-entry-popup-is-created-lazily-on-first-shortcut-press-closed-to-tray-sanity), [S33](../cases/shortcuts-and-input.md#s33--quick-entry-transparent-rendering-tracked-against-bundled-electron-version), [S35](../cases/shortcuts-and-input.md#s35--quick-entry-popup-position-is-persisted-across-invocations-and-across-app-restarts), [S36](../cases/shortcuts-and-input.md#s36--quick-entry-popup-falls-back-to-primary-display-when-saved-monitor-is-gone), [S37](../cases/shortcuts-and-input.md#s37--quick-entry-popup-remains-functional-after-main-window-destroy).
+
+## Window appearance
+
+| Element | Location | Expected | Notes |
+|---------|----------|----------|-------|
+| Window frame | None (frameless popup) | No OS-titlebar; no close/min/max buttons | Upstream sets `frame: false` on the BrowserWindow (`index.js:515381`) |
+| Background | Behind prompt UI | Transparent (no opaque square frame visible) on KDE Plasma Wayland ([S10](../cases/shortcuts-and-input.md#s10--quick-entry-popup-is-transparent-no-opaque-square-frame)) | Upstream already sets both `transparent: true` and `backgroundColor: "#00000000"` (`index.js:515380, 515383`). #370 regression is below the option-passing layer (Electron 41.0.4 CSD rework). KDE-W: pending; bug if opaque |
+| Rounded corners | Outer edge of UI | Visible | Compositor must support corner rounding via shaders / clip mask |
+| Drop shadow | Around popup | macOS-only at the Electron level; on Linux/Windows depends entirely on compositor | Upstream sets `hasShadow: Zr`, where `Zr` is defined as `process.platform === "darwin"` (`index.js:515384`). Linux is expected to render via compositor shadow support; wlroots without server-side decorations will not show one |
+| Position | Last-saved position, keyed on monitor; falls back to primary display if monitor is gone | Popup remembers its position across invocations and across app restarts ([S35](../cases/shortcuts-and-input.md#s35--quick-entry-popup-position-is-persisted-across-invocations-and-across-app-restarts), [S36](../cases/shortcuts-and-input.md#s36--quick-entry-popup-falls-back-to-primary-display-when-saved-monitor-is-gone)) | Upstream uses `an.get("quickWindowPosition")` (`index.js:515491-515526`) keyed on monitor label + resolution. Falls back to `cHn()` (`:515502`) when the saved monitor is gone. **Upstream does NOT place on cursor display or focused-window display** — it's last-position or primary, nothing else |
+| Always-on-top | Window manager hint | Stays above other windows | Upstream sets `alwaysOnTop: true` with level `"pop-up-menu"` (`index.js:515399`). On macOS this is per-app; on Linux compositors the level hint is interpreted variably |
+| Lifecycle | Lazy-created on first shortcut press | First shortcut press constructs the BrowserWindow; subsequent presses reuse it ([S29](../cases/shortcuts-and-input.md#s29--quick-entry-popup-is-created-lazily-on-first-shortcut-press-closed-to-tray-sanity)) | Upstream `if (!Ko \|\| ...) Ko = new BrowserWindow(...)` near `index.js:515375`. Means popup works in tray-only state with no main window mapped |
+| Persistence after main window destroy | Popup survives `mainWindow.destroy()` | Popup remains functional; submit guards skip show/focus when `ut` is destroyed ([S37](../cases/shortcuts-and-input.md#s37--quick-entry-popup-remains-functional-after-main-window-destroy)) | Upstream `!ut \|\| ut.isDestroyed()` guard at `index.js:515595`. Likely unreachable on this project due to hide-to-tray override of X button |
+
+## Input area
+
+| Element | Location | Expected | Notes |
+|---------|----------|----------|-------|
+| Text input field | Center of popup | Receives focus immediately on open; cursor blinks | — |
+| Placeholder text | Empty input state | Shows guidance like "Ask Claude anything..." | — |
+| Multi-line autosize | Type a long prompt | Input grows downward as text wraps; popup grows with it | — |
+| `Enter` to submit | Press Enter | Sends prompt, closes popup. Prompt must be > 2 chars trimmed (`index.js:515530, 515533`); 1-2 char prompts are silently dropped | Renderer-side keymap; reaches main process via IPC `requestDismissWithPayload()` (`:515409`) |
+| `Shift+Enter` for newline | Press Shift+Enter | Inserts newline, doesn't submit | Renderer-side |
+| `Esc` to dismiss | Press Esc | Closes popup without submitting | Renderer-side; reaches main process via IPC `requestDismiss()` (`:515409`) |
+| Click outside | Click outside the popup window | Closes popup without submitting | Wired in **main process** via the popup's `blur` handler (`Ko.on("blur", () => g3A(null))` at `index.js:515465`) |
+| Paste behavior | Paste rich text | Text-only paste; no HTML residue | — |
+| IME / dead-key composition | Type composed characters | Composition UI renders correctly above the input | Fcitx5/IBus integration is fragile under Electron |
+
+## Submit feedback
+
+| Element | Trigger | Expected | Notes |
+|---------|---------|----------|-------|
+| Submit transition | Press Enter | Popup closes; main window navigates to a **new** chat session ([S31](../cases/shortcuts-and-input.md#s31--quick-entry-submit-makes-the-new-chat-reachable-from-any-main-window-state)). Quick Entry never appends to existing chats — `ynt(e)` at `index.js:515546` always creates new | Upstream calls `mainWin.show()` + `mainWin.focus()` only — no `restore()`, no workspace migration. Behavior on minimized / hidden / cross-workspace main is compositor-dependent |
+| Loading indicator | While prompt is in flight | Brief spinner or fade-out — popup should not appear frozen | — |
+| Error state | Submit when offline / API error | Inline error message; popup stays open so user can retry | — |
+
+## Failure modes to watch for
+
+| Symptom | Likely cause | Diagnose with |
+|---------|--------------|---------------|
+| Popup doesn't appear when shortcut pressed | Global shortcut not registered ([T06](../cases/shortcuts-and-input.md#t06--quick-entry-global-shortcut-unfocused), [S11](../cases/shortcuts-and-input.md#s11--quick-entry-shortcut-fires-from-any-focus-on-wayland-mutter-xwayland-key-grab), [S14](../cases/shortcuts-and-input.md#s14--global-shortcuts-via-xdg-portal-work-on-niri)) | Launcher log; portal `BindShortcuts` outcome |
+| Opaque square frame visible behind UI | Transparent background not respected ([S10](../cases/shortcuts-and-input.md#s10--quick-entry-popup-is-transparent-no-opaque-square-frame)) | KDE compositor settings; BrowserWindow `transparent: true` arg |
+| Popup appears but input doesn't auto-focus | Focus stealing prevention by compositor; race in BrowserWindow `show()` + `focus()` | Wayland focus-request semantics; mutter is most strict |
+| IME composition cursor renders in wrong place | Electron IME integration bug | Try with simpler GTK app to isolate; report upstream Electron issue if reproducible |
+| Popup persists after submit | Close-on-submit IPC missed | Launcher log; DevTools console (if reachable on the popup window) |
+| Popup appears on wrong monitor / wrong workspace | Compositor places frameless windows differently | Test with `xdotool getactivewindow` (X11) before/after |
--- a/docs/testing/ui/routines-page.md
+++ b/docs/testing/ui/routines-page.md
@@ -0,0 +1,72 @@
+# UI — Routines Page
+
+The Routines page hosts the list of scheduled tasks (local and remote), the new-routine form, and per-routine detail views. Related functional tests: [T26](../cases/routines.md#t26--routines-page-renders), [T27](../cases/routines.md#t27--scheduled-task-fires-and-notifies), [T28](../cases/routines.md#t28--scheduled-task-catch-up-after-suspend).
+
+## Routines list
+
+| Element | Location | Expected | Notes |
+|---------|----------|----------|-------|
+| Routines page link | Code-tab sidebar | Click opens the page ([T26](../cases/routines.md#t26--routines-page-renders)) | — |
+| Page header | Top of page | Title "Routines" + description | — |
+| **New routine** button | Top-right of page | Click shows Local / Remote selector | — |
+| Routines list | Page body | Lists all configured routines | — |
+| Per-routine row | List item | Name, schedule summary, last-run timestamp, status indicator | — |
+| Run-now icon | Per row, hover-revealed | Click triggers immediate run ([T27](../cases/routines.md#t27--scheduled-task-fires-and-notifies)) | — |
+| Pause / resume toggle | Per row | Pauses or resumes scheduled runs without deleting | — |
+| Click row | Per row | Opens routine detail page | — |
+
+## New routine form (Local)
+
+| Element | Location | Expected | Notes |
+|---------|----------|----------|-------|
+| Routine type selector | Top of form | Local / Remote tabs or radio | — |
+| **Name** field | Top of form | Required; converted to lowercase kebab-case for filesystem | — |
+| **Description** field | Below name | Optional one-liner shown in list | — |
+| **Instructions** textarea | Mid-form | Rich textarea for the prompt | — |
+| Permission mode picker | Within Instructions area | Same options as session: Ask, Auto accept, Plan, Auto, Bypass | — |
+| Model picker | Within Instructions area | Sonnet, Opus, Haiku per plan | — |
+| **Working folder** picker | Below Instructions | Required; opens native file chooser | If folder not yet trusted, app prompts to trust |
+| **Worktree** toggle | Below folder | When ON, each run gets its own isolated worktree | — |
+| **Schedule** preset | Bottom of form | Manual / Hourly / Daily / Weekdays / Weekly | — |
+| Time picker | Visible for Daily, Weekdays, Weekly | Defaults to 9:00 AM local | — |
+| Day picker | Visible for Weekly only | Day-of-week selector | — |
+| **Save** button | Bottom-right | Disabled until required fields filled | — |
+| **Cancel** button | Bottom-left | Discards form, returns to list | — |
+| Folder-trust prompt | Triggered when folder not trusted | Modal asking to trust the selected folder | Required before save |
+
+## New routine form (Remote)
+
+Per upstream docs, remote routines run on Anthropic-managed cloud infrastructure. The form has additional fields for connectors and trigger types (cron, API, GitHub event). On Linux, the Remote tab should function identically to other platforms.
+
+| Element | Location | Expected | Notes |
+|---------|----------|----------|-------|
+| Trigger type selector | Top of form | Schedule / API call / GitHub event | — |
+| Connectors picker | Per-routine basis (remote) | Configures connectors at routine creation | — |
+| Network access controls | If applicable | Tied to cloud environment config | — |
+
+## Routine detail page
+
+Per upstream docs.
+
+| Element | Location | Expected | Notes |
+|---------|----------|----------|-------|
+| **Run now** button | Top of page | Starts the task immediately | — |
+| Status toggle (Active / Paused) | Top of page | Pauses or resumes without deleting | — |
+| **Edit** button | Top of page | Opens the same form populated with current values | — |
+| **Delete** button | Top of page (or footer) | Removes routine; archives all sessions it created | Confirmation dialog expected |
+| **Review history** section | Page body | Lists every past run with timestamp and status | — |
+| Per-history-entry hover | Hover skipped runs | Tooltip explains why skipped (asleep, prior run still running, other concurrent task) | — |
+| **Show more** button | Bottom of history | Loads older entries | — |
+| **Always allowed** panel | Page body | Lists tools auto-approved for this routine | — |
+| Revoke approval | Per-tool entry | Removes the auto-approval | — |
+
+## Failure modes to watch for
+
+| Symptom | Likely cause | Notes |
+|---------|--------------|-------|
+| Folder-trust modal doesn't appear | Trust state cached incorrectly | Clear `~/.claude/trusted-folders` (or equivalent) and retry |
+| Save button never enables | Required fields validation regression | DevTools console |
+| Time picker truncates / clips | Form sizing on small viewports | Resize the app window to reproduce |
+| History tooltips don't render | Tooltip component regression | — |
+| Run-now does nothing | Task runner thread not started | Launcher log; `pgrep -af claude` for runner subprocess |
+| Routines page blank | Code-tab failure ([T16](../cases/code-tab-foundations.md#t16--code-tab-loads)) cascading | Confirm Code tab itself loads first |
--- a/docs/testing/ui/settings.md
+++ b/docs/testing/ui/settings.md
@@ -0,0 +1,87 @@
+# UI — Settings
+
+The Settings window holds Desktop app preferences, Claude Code settings, connector management, and account controls. Related functional tests: [S20](../cases/routines.md#s20--keep-computer-awake-inhibits-idle-suspend), [S22](../cases/platform-integration.md#s22--computer-use-toggle-is-absent-or-visibly-disabled-on-linux), [T30](../cases/code-tab-workflow.md#t30--auto-archive-on-pr-merge).
+
+## Settings root
+
+| Element | Location | Expected | Notes |
+|---------|----------|----------|-------|
+| Settings window | Opened via app menu, tray menu, or in-app shortcut | Window opens with sidebar nav and content area | — |
+| Window close button | Top-right by default (position follows the DE's button-layout setting) | Closes settings; main app continues running | — |
+| Sidebar nav | Left of window | Lists every settings page | — |
+
+## Desktop app → General
+
+| Element | Location | Expected | Notes |
+|---------|----------|----------|-------|
+| **Computer use** toggle | Top of page | Either absent on Linux, or rendered disabled with a "not supported on Linux" hint ([S22](../cases/platform-integration.md#s22--computer-use-toggle-is-absent-or-visibly-disabled-on-linux)) | Critical: must not appear functional |
+| **Keep computer awake** toggle | Mid-page | Toggles `systemd-inhibit --what=idle:sleep` lock ([S20](../cases/routines.md#s20--keep-computer-awake-inhibits-idle-suspend)) | Verify with `systemd-inhibit --list` |
+| **Denied apps** list | Computer-use related | Likely absent on Linux (computer use unsupported) | — |
+| **Unhide apps when Claude finishes** toggle | Computer-use related | Likely absent on Linux | — |
+| Theme picker (if exposed) | Mid-page | System / Light / Dark | Tray icon should respond ([S08](../cases/tray-and-window-chrome.md#s08--tray-icon-doesnt-duplicate-after-nativetheme-update)) |
+
+## Desktop app → Account
+
+| Element | Location | Expected | Notes |
+|---------|----------|----------|-------|
+| Account name / email | Top of page | Reflects signed-in identity | — |
+| Plan badge | Below name | Shows Pro / Max / Team / Enterprise | — |
+| Sign out button | Bottom of page | Signs out cleanly; subsequent launches show sign-in screen | — |
+
+## Claude Code
+
+| Element | Location | Expected | Notes |
+|---------|----------|----------|-------|
+| **Worktree location** | Top of page | Default: `<project-root>/.claude/worktrees/`. Editable to a custom directory | Crosses with [T29](../cases/code-tab-workflow.md#t29--worktree-isolation) |
+| **Branch prefix** | Mid-page | Optional prefix prepended to every worktree branch | — |
+| **Auto-archive after PR merge or close** toggle | Mid-page | When ON, sessions archive on PR resolution ([T30](../cases/code-tab-workflow.md#t30--auto-archive-on-pr-merge)) | — |
+| **Persist preview sessions** toggle | Mid-page | Toggles cookies/localStorage persistence in Preview pane | Crosses with [T21](../cases/code-tab-workflow.md#t21--dev-server-preview-pane) |
+| **Preview** toggle | Mid-page | When OFF, preview pane and auto-verify are disabled | — |
+| **Allow bypass permissions mode** toggle | Mid-page | When ON, exposes Bypass mode in mode picker | Enterprise admins can disable |
+| **Auto** mode availability | Mid-page | Research preview; not on Pro plans | Per upstream docs |
+
+## Connectors
+
+| Element | Location | Expected | Notes |
+|---------|----------|----------|-------|
+| Connectors list | Page content | Lists connected services with status | Crosses with [T34](../cases/code-tab-handoff.md#t34--connector-oauth-round-trip) |
+| Per-connector entry | List row | Name, last-connected timestamp, manage / disconnect buttons | — |
+| **Manage** button | Per row | Opens connector-specific settings | — |
+| **Disconnect** button | Per row | Revokes access; connector becomes unusable in subsequent sessions | — |
+| **Add connector** button | Top of page | Opens the connector picker (same surface as `+ → Connectors`) | — |
+
+## SSH connections
+
+| Element | Location | Expected | Notes |
+|---------|----------|----------|-------|
+| SSH connections list | Page content | Lists user-added + managed (read-only) connections | — |
+| **Add SSH connection** button | Top of page | Opens dialog with Name / SSH Host / SSH Port / Identity File fields | — |
+| Per-connection entry | List row | Edit / delete (user-added) or "Managed" badge (admin-distributed) | — |
+
+## Keyboard shortcuts
+
+| Element | Location | Expected | Notes |
+|---------|----------|----------|-------|
+| Shortcut list | Page content | Tabular list of all configurable shortcuts | — |
+| Shortcut value | Per row | Click to rebind; shows current binding | — |
+| Reset to default | Per row | Reverts to upstream default | — |
+| Quick Entry shortcut | Specifically called out | Default `Ctrl+Alt+Space`; rebind here | Crosses with [T06](../cases/shortcuts-and-input.md#t06--quick-entry-global-shortcut-unfocused) |
+
+## Local environment editor
+
+| Element | Location | Expected | Notes |
+|---------|----------|----------|-------|
+| Env editor open | Environment dropdown → Local → gear icon | Opens encrypted env-var editor | Crosses with [S18](../cases/platform-integration.md#s18--local-environment-editor-persists-across-reboot) |
+| Add variable | In editor | Name + value fields; save | — |
+| Remove variable | Per row | Deletes the variable | — |
+| **Apply to dev servers** indicator | Near save | Confirms vars also reach preview servers | — |
+
+## Failure modes to watch for
+
+| Symptom | Likely cause | Notes |
+|---------|--------------|-------|
+| Computer-use toggle visible and toggleable on Linux | [S22](../cases/platform-integration.md#s22--computer-use-toggle-is-absent-or-visibly-disabled-on-linux) regression | File a bug; users will be misled |
+| Keep-computer-awake toggle has no effect | `systemd-inhibit` integration not wired ([S20](../cases/routines.md#s20--keep-computer-awake-inhibits-idle-suspend)) | Verify lock list before/after |
+| Worktree location field rejects valid paths | Path validation too strict; absolute vs `~`-prefixed | Check both forms |
+| SSH connection list missing managed entries | Managed-settings file not loaded; admin distribution failed | Confirm file exists at expected path |
+| Env editor not encrypting | Linux secret-store not wired ([S18](../cases/platform-integration.md#s18--local-environment-editor-persists-across-reboot)) | `secret-tool search`; `kwallet5-query` |
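Reviewer sketch for the "Worktree location field rejects valid paths" row above: a minimal way to check both path forms when reproducing the failure. The helper name `checkWorktreePath` and its result shape are hypothetical; nothing here is taken from the app's own validator, and the home directory is passed in explicitly so the function stays pure.

```typescript
// Hypothetical helper for manual repro; not the app's real validation code.
type PathCheck =
  | { ok: true; resolved: string }
  | { ok: false; reason: string };

function checkWorktreePath(input: string, home: string): PathCheck {
  const trimmed = input.trim();
  if (trimmed === "") {
    return { ok: false, reason: "empty path" };
  }
  // A bare "~" means the home directory itself.
  if (trimmed === "~") {
    return { ok: true, resolved: home };
  }
  // "~/..." expands to a path under the home directory.
  if (trimmed.startsWith("~/")) {
    const base = home.replace(/\/+$/, ""); // drop trailing slashes on home
    return { ok: true, resolved: base + trimmed.slice(1) };
  }
  // Absolute paths pass through unchanged.
  if (trimmed.startsWith("/")) {
    return { ok: true, resolved: trimmed };
  }
  return { ok: false, reason: "expected an absolute or ~-prefixed path" };
}
```

If the settings field accepts one form and rejects the other for the same directory, that points at the validator rather than at the directory itself.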
--- a/docs/testing/ui/sidebar.md
+++ b/docs/testing/ui/sidebar.md
@@ -0,0 +1,55 @@
+# UI — Code Tab Sidebar
+
+The sidebar lists Code-tab sessions and lets you filter, group, archive, and rename them. Related functional tests: [T29](../cases/code-tab-workflow.md#t29--worktree-isolation), [T30](../cases/code-tab-workflow.md#t30--auto-archive-on-pr-merge), [S24](../cases/platform-integration.md#s24--dispatch-spawned-code-session-appears-with-badge-and-notification).
+
+## Top controls
+
+| Element | Location | Expected | Notes |
+|---------|----------|----------|-------|
+| **+ New session** button | Top of sidebar | Click opens a new session against the currently selected env. `Ctrl+N` shortcut equivalent | — |
+| **Routines** link | Top of sidebar | Click opens the Routines page ([T26](../cases/routines.md#t26--routines-page-renders)) | — |
+| **Customize** link | Top of sidebar | Click opens connectors / skills / plugins manager | — |
+| Filter: status | Top of session list | Dropdown / tabs filter by Active / Archived / All | — |
+| Filter: project | Top of session list | Dropdown filters by project (multi-select) | — |
+| Filter: environment | Top of session list | Dropdown filters by Local / Remote / SSH / All | — |
+| Group-by control | Top of session list | Toggle between flat list and grouped-by-project | — |
+
+## Session row
+
+| Element | Location | Expected | Notes |
+|---------|----------|----------|-------|
+| Session title | Row content | Shows session name (auto-generated or user-renamed) | Click row → switches to that session |
+| Session status indicator | Left of title or as colored dot | Reflects state: idle, running, awaiting-approval, errored, archived | — |
+| Project / branch label | Below title | Shows project folder name + branch | — |
+| Diff stats badge (e.g. `+12 -1`) | Right of title | Visible when session has uncommitted changes | Click → opens diff view |
+| **Dispatch** badge | Top-right of row | Visible on Dispatch-spawned sessions ([S24](../cases/platform-integration.md#s24--dispatch-spawned-code-session-appears-with-badge-and-notification)) | — |
+| **Scheduled** badge | Top-right of row | Visible on scheduled-task-spawned sessions ([T27](../cases/routines.md#t27--scheduled-task-fires-and-notifies)) | Sessions group under "Scheduled" header |
+| Hover archive icon | Right side, on row hover | Click archives the session and removes its worktree | — |
+| Right-click context menu | Right-click on row | Standard menu: Rename, Archive, Open in Files, Copy path | — |
+| Active session highlight | Selected row | Visually distinct from inactive rows | — |
+
+## Sidebar layout
+
+| Element | Location | Expected | Notes |
+|---------|----------|----------|-------|
+| Sidebar resize handle | Right edge of sidebar | Drag to resize; double-click to reset width | — |
+| Sidebar collapse toggle | Top of sidebar (hamburger or arrow) | Collapse to icons-only or hide entirely | Crosses with topbar hamburger |
+| Scrollbar | Right edge when content exceeds height | Renders; dragging the thumb scrolls the list | Theme-aware |
+
+## Cycling shortcuts
+
+| Shortcut | Expected | Notes |
+|----------|----------|-------|
+| `Ctrl+Tab` | Cycle to next session | Per upstream docs |
+| `Ctrl+Shift+Tab` | Cycle to previous session | Per upstream docs |
+| `Cmd+Shift+]` / `Cmd+Shift+[` | Same as above on macOS | N/A on Linux unless rebound |
+
+## Failure modes to watch for
+
+| Symptom | Likely cause | Notes |
+|---------|--------------|-------|
+| Sidebar doesn't render | Code tab failed to load ([T16](../cases/code-tab-foundations.md#t16--code-tab-loads)) | Check DevTools console |
+| Sessions appear but clicking does nothing | IPC between sidebar and session pane broken | Launcher log, DevTools console |
+| Hover archive icon never appears | CSS hover state mis-applied; touch device might be assumed | Inspect element; check pointer events |
+| Dispatch / Scheduled badges missing | Feature flag or state not reaching the renderer | Check session metadata in launcher log |
+| Auto-archive doesn't fire | Session-archive logic bug ([T30](../cases/code-tab-workflow.md#t30--auto-archive-on-pr-merge)) | Confirm setting enabled; check PR state via `gh pr view` |
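Reviewer sketch for the diff stats badge row in the sidebar doc above: one way a check could turn badge text such as `+12 -1` into numbers before comparing it with the session's real diff. The name `parseDiffBadge` is hypothetical, and the exact badge text format is an assumption based only on the `+12 -1` example; the pattern also tolerates a Unicode minus sign in case the UI renders one.

```typescript
// Hypothetical parser; the badge format is assumed from the "+12 -1" example.
interface DiffStats {
  added: number;
  removed: number;
}

function parseDiffBadge(text: string): DiffStats | null {
  // "+<digits>", whitespace, then "-" (ASCII hyphen or U+2212) and "<digits>".
  const match = /^\+(\d+)\s+[-\u2212](\d+)$/.exec(text.trim());
  if (match === null) {
    return null; // not a diff badge, or a format this sketch does not know
  }
  return { added: Number(match[1]), removed: Number(match[2]) };
}
```

Returning `null` instead of throwing lets a caller distinguish "badge absent or reformatted" from "badge present with wrong counts".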
--- a/docs/testing/ui/tray.md
+++ b/docs/testing/ui/tray.md
@@ -0,0 +1,44 @@
+# UI — System Tray
+
+Tray icon, menu, and theme variants. See [`../cases/tray-and-window-chrome.md`](../cases/tray-and-window-chrome.md) for related functional tests ([T03](../cases/tray-and-window-chrome.md#t03--tray-icon-present), [S08](../cases/tray-and-window-chrome.md#s08--tray-icon-doesnt-duplicate-after-nativetheme-update)).
+
+## Tray icon
+
+| Element | Location | Expected | Notes |
+|---------|----------|----------|-------|
+| Tray icon (light theme) | System tray / status area | Black icon (the "Template" variant) renders cleanly on a light tray | — |
+| Tray icon (dark theme) | System tray / status area | White icon (the "Template-Dark" variant) renders cleanly on a dark tray | — |
+| Theme switch | Trigger system theme change | Icon updates in place — no duplicate icons spawned ([S08](../cases/tray-and-window-chrome.md#s08--tray-icon-doesnt-duplicate-after-nativetheme-update)) | KDE-W ✓ via in-place fast-path |
+| Icon resolution / sharpness | Inspect at native scale | Icon is crisp, not pixelated. Check on HiDPI screens | — |
+| Position | Tray area | Appears among other SNI/tray icons | KDE Plasma sorts alphabetically by ID; adjusting position requires user config |
+| Tooltip on hover | Hover over icon | Shows "Claude" or app name | — |
+
+## Right-click menu
+
+| Element | Position in menu | Expected | Notes |
+|---------|------------------|----------|-------|
+| Show / Hide window | Top item | Toggles main window visibility | Label may change between "Show" and "Hide" based on state |
+| Quick Entry | Mid-menu | Opens Quick Entry popup ([T06](../cases/shortcuts-and-input.md#t06--quick-entry-global-shortcut-unfocused)) | — |
+| Open at Login (toggle) | Mid-menu | Reflects current XDG autostart state ([T09](../cases/platform-integration.md#t09--autostart-via-xdg)) | Toggle should write `~/.config/autostart/*.desktop` |
+| Settings | Mid-menu | Opens Settings window | — |
+| About | Bottom area | Opens About dialog | — |
+| Quit | Bottom item | Fully exits the app (no hide-to-tray) | — |
+| Menu separators | Between item groups | Render cleanly | — |
+
+## Left-click behavior
+
+| Element | Trigger | Expected | Notes |
+|---------|---------|----------|-------|
+| Single left-click | Click tray icon once | Toggles main window visibility | KDE-W ✓ |
+| Double left-click | Click twice quickly | DE-dependent; should not spawn duplicate windows | — |
+| Middle-click | Middle mouse button on tray icon | DE-dependent (no documented behavior); should not crash | — |
+
+## Failure modes to watch for
+
+| Symptom | Likely cause | Diagnose with |
+|---------|--------------|---------------|
+| Tray icon never appears | No SNI watcher (e.g. GNOME without AppIndicator extension); Electron fallback to legacy XEmbed not registered | `gdbus call ... org.kde.StatusNotifierWatcher` — see [runbook](../runbook.md#tray--dbus-state-kde) |
+| Two tray icons after theme switch | Tray rebuild race ([S08](../cases/tray-and-window-chrome.md#s08--tray-icon-doesnt-duplicate-after-nativetheme-update)) | SNI watcher state before/after; [`docs/learnings/tray-rebuild-race.md`](../../learnings/tray-rebuild-race.md) |
+| Icon renders as a generic placeholder | Icon path resolution failed; theme mismatch | Check Electron `Tray` constructor args; check `~/.cache/claude-desktop-debian/launcher.log` |
+| Menu items don't respond | IPC bridge to tray menu broken; main process busy | Click main window — does the rest of the app respond? `pgrep -af claude`; main process state |
+| Tray icon disappears after some time | Tray daemon restarted; Claude didn't re-register | KDE Plasma: restart `plasmashell`; observe whether icon comes back without restarting Claude |
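Reviewer sketch for the "Open at Login" row in the tray doc above: a minimal resolver for the file the toggle is expected to write. The function name `autostartEntryPath` is hypothetical. The `claude-desktop.desktop` filename follows the T09 description in the harness README, and the fallback to `$HOME/.config` when `XDG_CONFIG_HOME` is unset, empty, or relative follows the XDG Base Directory rules.

```typescript
// Hypothetical helper; env and home are passed in so the function stays pure.
function autostartEntryPath(
  env: Record<string, string | undefined>,
  home: string,
): string {
  const xdg = env["XDG_CONFIG_HOME"];
  // Use XDG_CONFIG_HOME only when it is set, non-empty, and absolute;
  // otherwise fall back to $HOME/.config.
  const configHome =
    xdg !== undefined && xdg.startsWith("/") ? xdg : `${home}/.config`;
  return `${configHome}/autostart/claude-desktop.desktop`;
}
```

A check could call this with `process.env` and the user's home directory, toggle the menu item, and then test whether the returned path exists.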
--- a/docs/testing/ui/window-chrome-and-tabs.md
+++ b/docs/testing/ui/window-chrome-and-tabs.md
@@ -0,0 +1,58 @@
+# UI — Window Chrome & Tabs
+
+OS-level window frame plus the in-app tab strip and (PR #538) hybrid in-app topbar. See [`../cases/tray-and-window-chrome.md`](../cases/tray-and-window-chrome.md) for related functional tests.
+
+## OS window frame
+
+| Element | Location | Expected | Notes |
+|---------|----------|----------|-------|
+| Title bar | Top of window | Drawn by DE/compositor; shows app title; right-click opens window menu | KDE-W ✓; Hypr-N ✓ |
+| Close button (X) | Top-right (or top-left on GNOME) | Renders, hover state visible, click hides-to-tray ([T08](../cases/tray-and-window-chrome.md#t08--hide-to-tray-on-close)) | — |
+| Minimize button | Adjacent to close | Renders, hover state visible, click minimizes | — |
+| Maximize / restore button | Adjacent to minimize | Renders, hover state visible, click toggles maximize | — |
+| Resize edges (left, right, top, bottom, corners) | Window perimeter | Cursor changes to resize affordance on hover; drag resizes | Wlroots compositors may not show cursor change |
+| Window menu (right-click titlebar) | Right-click anywhere on titlebar | Standard window menu (Move, Resize, Close, Always on Top, etc.) | DE-dependent |
+
+## Hybrid in-app topbar (PR #538 builds)
+
+Sits below the OS frame in hybrid mode. Crosses with [T07](../cases/tray-and-window-chrome.md#t07--in-app-topbar-renders--clickable) and [S13](../cases/tray-and-window-chrome.md#s13--hybrid-topbar-shim-survives-omarchys-ozone-wayland-env-exports).
+
+| Element | Location | Expected | Notes |
+|---------|----------|----------|-------|
+| Hamburger menu | Top-left of topbar | Renders, click opens sidebar | — |
+| Sidebar toggle | Adjacent to hamburger | Renders, click collapses/expands sidebar | — |
+| Search icon | Center-left | Renders, click opens search overlay | — |
+| Back arrow | Center | Renders, greyed out when no history; click navigates back | — |
+| Forward arrow | Adjacent to back | Same as back, but for forward history | — |
+| Cowork ghost icon | Right of nav arrows | Renders, click opens Cowork tab | The icon is the canonical "is the topbar shim alive" indicator |
+| Drag region (gaps between buttons) | Empty space between elements | Buttons remain clickable; no implicit drag region captures button clicks | Critical: this is the regression mode in [T07](../cases/tray-and-window-chrome.md#t07--in-app-topbar-renders--clickable) |
+
+## Tab strip (Chat / Cowork / Code)
+
+Sits in the topbar (hybrid) or in the OS-frame area (legacy). Top center.
+
+| Element | Location | Expected | Notes |
+|---------|----------|----------|-------|
+| **Chat** tab | Left tab | Renders, click switches to Chat | — |
+| **Cowork** tab | Center tab | Renders, click switches to Cowork; ghost icon may indicate Dispatch state | — |
+| **Code** tab | Right tab | Renders, click switches to Code; on Linux, may show 403 / sign-in upsell ([T16](../cases/code-tab-foundations.md#t16--code-tab-loads)) | — |
+| Active tab indicator | Underline / fill on active tab | Visually distinct from inactive tabs | — |
+| Tab badges (e.g. unread count, Dispatch badge) | Top-right of each tab | Render when applicable, dismiss when state clears | — |
+
+## Other window-level UI
+
+| Element | Location | Expected | Notes |
+|---------|----------|----------|-------|
+| About dialog | App menu → About | Modal opens with app version, Electron version, license info; close button works | — |
+| App menu (macOS-style) | macOS only — N/A on Linux | Not present on Linux; menu items are in window menu instead | — |
+| Update prompt | Triggered by upstream update detection | On DEB/RPM, auto-update path is suppressed ([S26](../cases/distribution.md#s26--auto-update-is-disabled-when-installed-via-apt--dnf)). On AppImage, may surface a prompt | — |
+| Crash report dialog | Shown after a crash | Dialog explains what happened, offers to file an issue | Capture for Linux specifics — wording may reference macOS Console / Windows Event Viewer paths only |
+
+## Display-server cross-cuts
+
+| Concern | X11 | Wayland (mutter) | Wayland (KWin) | Wayland (wlroots) |
+|---------|-----|-------------------|----------------|---------------------|
+| HiDPI scaling | `--force-device-scale-factor=N` works | Auto via fractional scaling | Auto via fractional scaling | Auto where compositor supports it |
+| Drag-to-snap (Aero-style) | Works under most WMs | mutter snaps | KWin snaps | Compositor-dependent |
+| Always-on-top | Window menu | Window menu | Window menu | Compositor-dependent |
+| Cursor theme | Inherits from `gtk-cursor-theme-name` | Same | Same | Same |
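Reviewer sketch for the "OS window frame" table in the window-chrome doc above: on X11, one way to confirm that decorations are drawn is to read `_NET_FRAME_EXTENTS` from `xprop` output and check that the extents sum to more than zero, which is the shape of the T04 check described in the harness README. The function names are hypothetical, and the input is assumed to be the single property line that `xprop` prints for the window.

```typescript
// Hypothetical helpers; input is one line of xprop output, for example
// "_NET_FRAME_EXTENTS(CARDINAL) = 1, 1, 30, 1" (left, right, top, bottom).
function parseFrameExtents(xpropLine: string): number[] | null {
  const match =
    /_NET_FRAME_EXTENTS\(CARDINAL\)\s*=\s*(\d+),\s*(\d+),\s*(\d+),\s*(\d+)/.exec(
      xpropLine,
    );
  if (match === null) {
    return null; // property missing or printed in an unexpected form
  }
  return match.slice(1, 5).map((value) => Number(value));
}

function hasDecorations(xpropLine: string): boolean {
  const extents = parseFrameExtents(xpropLine);
  if (extents === null) {
    return false;
  }
  // Any non-zero border means the window manager drew a frame.
  return extents.reduce((sum, value) => sum + value, 0) > 0;
}
```

This only covers X11 rows; the Wayland columns in the cross-cut table have no equivalent property to read.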
--- a/tools/test-harness/.gitignore
+++ b/tools/test-harness/.gitignore
@@ -0,0 +1,5 @@
+node_modules/
+results/
+*.log
+.DS_Store
+package-lock.json
--- a/tools/test-harness/README.md
+++ b/tools/test-harness/README.md
@@ -0,0 +1,480 @@
+# Linux Compatibility Test Harness
+
+In-VM (or on-host) Playwright + DBus runner for the test cases under
+[`docs/testing/cases/`](../../docs/testing/cases/). See
+[`docs/testing/automation.md`](../../docs/testing/automation.md) for the
+architecture, decisions, and rationale.
+
+## Status
+
+Seventy-four specs wired (36 cross-env T-tests, 33 env-specific S-tests,
+5 H-prefix harness self-tests). See
+[`docs/testing/runner-implementation-plan.md`](../../docs/testing/runner-implementation-plan.md)
+for the tiered triage of remaining tests and the per-spec rationale
+behind tier classification.
+
+| Test | What it checks | Layer |
+|------|----------------|-------|
+| [T01](../../docs/testing/cases/launch.md#t01--app-launch) | X11 window with our pid appears within 15s; title matches `/claude/i` | L2 (xprop) |
+| [T02](../../docs/testing/cases/launch.md#t02--doctor-health-check) | `claude-desktop --doctor` exits 0 | spawn probe |
+| [T03](../../docs/testing/cases/tray-and-window-chrome.md#t03--tray-icon-present) | Exactly one `StatusNotifierItem` is registered by the claude-desktop pid (no rebuild-race duplicates) | L2 (DBus) |
+| [T04](../../docs/testing/cases/tray-and-window-chrome.md#t04--window-decorations-draw) | Window has `_NET_FRAME_EXTENTS` (sum > 0) and a "Claude" title | L2 (xprop) |
+| [T05](../../docs/testing/cases/shortcuts-and-input.md#t05--claude-url-handler) | `xdg-open 'claude://...'` delivers via `app.on('second-instance')` to the running app | spawn + L1 hook |
+| [T06](../../docs/testing/cases/shortcuts-and-input.md#t06--quick-entry-global-shortcut-unfocused) | `globalShortcut.isRegistered('Ctrl+Alt+Space')` returns true after `mainVisible` | L1 |
+| [T07](../../docs/testing/cases/tray-and-window-chrome.md#t07--in-app-topbar) | Five topbar buttons render with non-zero rects (uses `seedFromHost` for hermetic auth) | L1 + DOM |
+| [T08](../../docs/testing/cases/tray-and-window-chrome.md#t08--close-x-hides-to-tray) | `win.close()` fires the wrapper interceptor; window hidden, proc alive | L1 |
+| [T09](../../docs/testing/cases/platform-integration.md#t09--autostart-via-xdg) | `setLoginItemSettings({ openAtLogin })` writes/removes `$XDG_CONFIG_HOME/autostart/claude-desktop.desktop` | L1 + filesystem |
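+| [T10](../../docs/testing/cases/platform-integration.md#t10--cowork-integration) | After H04-style spawn detection and `userLoaded`, `kill -9` the daemon, fire a daemon RPC from main via the eipc handler on each iteration, and confirm a *different* pid respawns within the 30s budget (Patch 6 cooldown + retry; the 10s cooldown is a hard floor) | pgrep delta + spawn delta |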
+| [T11](../../docs/testing/cases/extensibility.md#t11--plugin-install) | Plugin-install code path fingerprints present in bundled `index.js` | file probe |
+| [T11_runtime](../../docs/testing/cases/extensibility.md#t11--plugin-install) | After `seedFromHost` + `userLoaded`, the install-flow eipc surface (`installPlugin`, `uninstallPlugin`, `updatePlugin`, `listInstalledPlugins`, `LocalPlugins/getPlugins` — five-suffix presence probe) is registered on the claude.ai webContents AND BOTH read-side handlers across the two impl objects are callable through the renderer-side wrapper: `CustomPlugins/listInstalledPlugins([])` returns array shape (drives Manage plugins panel), `LocalPlugins/getPlugins()` returns array shape (reads `~/.claude/plugins/installed_plugins.json` per case-doc :465822) — Tier 2 reframe of T11 (case-doc anchor :507181) | L1 (eipc registry + invoke) |
+| [T12](../../docs/testing/cases/platform-integration.md#t12--webgl-warn-only) | `app.getGPUFeatureStatus()` returns a populated object; renderer reached visible | L1 |
+| [T13](../../docs/testing/cases/launch.md#t13--doctor-reports-correct-package-format) | `--doctor` does not false-flag rpm/deb installs as missing-dpkg AppImage | spawn + stdout grep |
+| [T14a](../../docs/testing/cases/launch.md#t14--multi-instance-behavior) | `requestSingleInstanceLock` + `'second-instance'` strings in bundled `index.js` (file probe) | file probe |
+| [T14b](../../docs/testing/cases/launch.md#t14--multi-instance-behavior) | Second invocation under same isolation exits cleanly; primary pid stays alive (runtime probe) | spawn delta + pgrep |
+| [T16](../../docs/testing/cases/code-tab-foundations.md#t16--code-tab-loads) | After `seedFromHost` + `userLoaded`, `CodeTab.activate()` resolves and ≥1 compact pill renders (env pill = Code-body mounted) | L1 + AX-tree |
+| [T17](../../docs/testing/cases/code-tab-foundations.md#t17--folder-picker-opens) | After `seedFromHost` + `userLoaded`, Code df-pill → env pill → Local → Select folder → Open folder triggers `dialog.showOpenDialog` (mock installed via `installOpenDialogMock`); skips cleanly when host has no signed-in Claude config | L1 + AX-tree |
+| [T18](../../docs/testing/cases/code-tab-foundations.md#t18--drag-and-drop-files-into-prompt) | Bundled `mainView.js` preload contains the path-resolution bridge fingerprints: `getPathForFile` (2× — property key + the `webUtils.getPathForFile(` call, both at case-doc :9267), `webUtils`, `filePickers`, and the `claudeAppSettings` `contextBridge.exposeInMainWorld` namespace (case-doc :9552) — pins the load-bearing wiring without faking OS-level XDND drag (xdotool can't put file URIs on the X11 selection; Wayland needs per-compositor IPC + libei) | file probe |
+| [T19](../../docs/testing/cases/code-tab-foundations.md#t19--integrated-terminal) | After `seedFromHost` + `userLoaded`, the integrated-terminal eipc surface (`startShellPty`, `writeShellPty`, `stopShellPty`, `resizeShellPty`, `getShellPtyBuffer` — five-suffix presence probe) is registered on the claude.ai webContents AND the foundational `LocalSessions/getAll` returns array shape (Tier 2 reframe of the case-doc T19 case; case-doc anchors are write-side `startShellPty` etc. so reframe asserts the FULL terminal IPC surface registers + a stateless read-side surrogate is invocable) | L1 (eipc registry + invoke) |
+| [T20](../../docs/testing/cases/code-tab-foundations.md#t20--file-pane-opens-and-saves) | After `seedFromHost` + `userLoaded`, the file-pane eipc surface (`readSessionFile`, `writeSessionFile`, `pickSessionFile` — three-suffix presence probe) is registered on the claude.ai webContents AND the foundational `LocalSessions/getAll` returns array shape (Tier 2 reframe of the case-doc T20 case; the case-doc's `readSessionFile` anchor is read-side but needs (sessionId, path) args not constructible from a fresh isolation, so the registration probe + foundational `getAll` invocation is the strongest non-destructive Tier 2 layer) | L1 (eipc registry + invoke) |
+| [T21](../../docs/testing/cases/code-tab-workflow.md#t21--dev-server-preview-pane) | After `seedFromHost` + `userLoaded`, the preview-pane eipc surface (`getConfiguredServices`, `startFromConfig`, `stopServer`, `getAutoVerify`, `capturePreviewScreenshot` — five-suffix presence probe) is registered on the claude.ai webContents AND BOTH case-doc-anchored read-side handlers are callable through the renderer-side wrapper: `getConfiguredServices(cwd)` returns array shape, `getAutoVerify(cwd)` returns boolean shape (Tier 2 reframe of the case-doc T21 case; cwd validator is `typeof cwd === 'string'` only, smoke-tested session 11) | L1 (eipc registry + invoke) |
+| [T22](../../docs/testing/cases/code-tab-workflow.md#t22--pr-monitoring-via-gh) | Bundled `index.js` contains `LocalSessions_$_getPrChecks` eipc channel name *and* `gh CLI not found in PATH` Linux-fallthrough throw site (Tier 1 fingerprint) | file probe |
+| [T22b](../../docs/testing/cases/code-tab-workflow.md#t22--pr-monitoring-via-gh) | After `seedFromHost` + `userLoaded`, the `LocalSessions_$_getPrChecks` eipc handler is registered on the claude.ai webContents (`webContents.ipc._invokeHandlers` — Tier 2 runtime probe sibling of T22, strictly stronger than the bundle-string fingerprint) | L1 (eipc registry) |
+| [T23](../../docs/testing/cases/code-tab-handoff.md#t23--desktop-notifications-fire) | Firing `new Notification({title})` from main reaches the session bus's `org.freedesktop.Notifications.Notify` (observed via `dbus-monitor`) | L1 + DBus subprocess |
+| [T24](../../docs/testing/cases/code-tab-handoff.md#t24--open-in-external-editor) | After `installOpenExternalMock` mirroring T25's pattern, `evalInMain` calls `shell.openExternal('vscode://file/...')`; mock records the URL verbatim, no real editor launch | L1 (mocked egress) |
+| [T25](../../docs/testing/cases/code-tab-handoff.md#t25--show-in-files--file-manager) | After `installShowItemInFolderMock` mirroring T17's dialog-mock pattern, `evalInMain` calls `shell.showItemInFolder(<synthetic path>)`; mock records the call verbatim, no throw — no host side effect | L1 (mocked egress) |
+| [T26](../../docs/testing/cases/routines.md#t26--routines-page-renders) | After `seedFromHost` + `userLoaded`, click "Routines" sidebar AX button; assert "New routine" / "All" / "Calendar" anchor renders | L1 + AX-tree |
+| [T27](../../docs/testing/cases/routines.md#t27--scheduled-task-fires-and-notifies) | After `seedFromHost` + `userLoaded`, both Cowork and CCD `getAllScheduledTasks` eipc handlers are registered AND callable through the renderer-side wrapper, returning array shape — Tier 2 reframe of the case-doc T27 case | L1 (eipc invoke) |
+| [T30](../../docs/testing/cases/code-tab-workflow.md#t30--auto-archive-on-pr-merge) | Bundled `index.js` colocates the auto-archive sweep cadence (`300*1e3` ≤ `3600*1e3` ≤ `AutoArchiveEngine`) with the `ccAutoArchiveOnPrClose` gate key (single-regex multi-string fingerprint) | file probe |
+| [T31](../../docs/testing/cases/code-tab-workflow.md#t31--side-chat-opens) | Bundled `index.js` contains all three side-chat eipc channel names (`startSideChat`, `sendSideChatMessage`, `stopSideChat`) — load-bearing trio | file probe |
+| [T31b](../../docs/testing/cases/code-tab-workflow.md#t31--side-chat-opens) | After `seedFromHost` + `userLoaded`, all three side-chat eipc handlers (`startSideChat`, `sendSideChatMessage`, `stopSideChat`) are registered on the claude.ai webContents — load-bearing trio (Tier 2 runtime sibling of T31) | L1 (eipc registry) |
+| [T32](../../docs/testing/cases/code-tab-workflow.md#t32--slash-command-menu) | Bundled `index.js` contains `LocalSessions_$_getSupportedCommands` eipc channel + `slashCommands` schema field | file probe |
+| [T33](../../docs/testing/cases/extensibility.md#t33--plugin-browser) | Bundled `index.js` contains `CustomPlugins_$_listMarketplaces` and `CustomPlugins_$_listAvailablePlugins` eipc channel names (browser populate flow) | file probe |
+| [T33b](../../docs/testing/cases/extensibility.md#t33--plugin-browser) | After `seedFromHost` + `userLoaded`, both plugin-browser eipc handlers (`listMarketplaces`, `listAvailablePlugins`) are registered on the claude.ai webContents — load-bearing pair (Tier 2 runtime sibling of T33) | L1 (eipc registry) |
+| [T33c](../../docs/testing/cases/extensibility.md#t33--plugin-browser) | After `seedFromHost` + `userLoaded`, both plugin-browser eipc handlers (`listMarketplaces`, `listAvailablePlugins`) are callable through the renderer-side wrapper with `args = [[]]` (empty `egressAllowedDomains`), each returning array shape — Tier 2 invocation upgrade of T33b, strictly stronger than registration alone | L1 (eipc invoke) |
+| [T35](../../docs/testing/cases/extensibility.md#t35--mcp-server-config-picked-up) | Bundled `index.js` contains the four-needle MCP-config separation fingerprint: `claude_desktop_config.json` (chat-tab path), `.claude.json` + `.mcp.json` (Code-tab loaders), `"user","project","local"` (settingSources triple Code-session passes to the agent SDK) — pins per-tab separation without launch | file probe |
+| [T35b](../../docs/testing/cases/extensibility.md#t35--mcp-server-config-picked-up) | After `seedFromHost` + `userLoaded`, the `claude.settings/MCP/getMcpServersConfig` eipc handler is registered AND callable through the renderer-side wrapper, returning a non-array object (Tier 2 runtime sibling of T35, strictly stronger than the bundle-string fingerprint) | L1 (eipc invoke) |
+| [T36](../../docs/testing/cases/extensibility.md#t36--hooks-fire) | Bundled `index.js` contains the hooks runtime fingerprint: `hook_started` / `hook_progress` / `hook_response` (single-occurrence Verbose-transcript runtime emits) plus `PreToolUse` / `UserPromptSubmit` registry tokens — pins the runtime hook-fire path the case-doc Verbose-transcript claim hangs on | file probe |
+| [T37](../../docs/testing/cases/extensibility.md#t37--claudemd-memory-loads) | Bundled `index.js` contains `[GlobalMemory] Copied CLAUDE.md` log line + `CLAUDE.md` filename literal + `CLAUDE_CONFIG_DIR` env-var token (memory-loading wiring) | file probe |
+| [T37b](../../docs/testing/cases/extensibility.md#t37--claudemd-memory-loads) | After `seedFromHost` + `userLoaded`, the `claude.web/CoworkMemory/readGlobalMemory` eipc handler is registered AND callable through the renderer-side wrapper, returning the documented `string \| null` shape (Tier 2 runtime sibling of T37) | L1 (eipc invoke) |
+| [T38](../../docs/testing/cases/code-tab-handoff.md#t38--continue-in-ide) | Bundled `index.js` contains `LocalSessions_$_openInEditor` eipc channel name (Tier 1 fingerprint) | file probe |
+| [T38b](../../docs/testing/cases/code-tab-handoff.md#t38--continue-in-ide) | After `seedFromHost` + `userLoaded`, the `LocalSessions_$_openInEditor` eipc handler is registered on the claude.ai webContents (Tier 2 runtime sibling of T38) | L1 (eipc registry) |
+| H01 | CDP auth gate exits with code 1 when spawned with `--remote-debugging-port` and no `CLAUDE_CDP_AUTH` token | spawn probe |
+| H02 | `frame-fix-wrapper.js` + `frame-fix-entry.js` injected into `app.asar` (Proxy + main-field reference) | file probe |
+| H03 | Build-pipeline patch fingerprints all present in `app.asar` (KDE gate, frame-fix inject, tray, cowork, claude-code) | file probe |
+| H04 | cowork daemon spawns under app and exits with app — soft-skips on rows where it isn't gated to spawn | pgrep delta |
+| H05 | UI-drift canary against the AX-tree fingerprint walker (requires `CLAUDE_TEST_USE_HOST_CONFIG=1`) | L1 (AX) |
+| [S01](../../docs/testing/cases/distribution.md#s01--appimage-launches-without-manual-libfuse2t64) | AppImage launches without `libfuse.so.2` complaint (skips on non-AppImage rows) | spawn + stderr grep |
+| [S02](../../docs/testing/cases/distribution.md#s02--xdg_current_desktopubuntugnome-prefix-form-doesnt-break-de-detection) | No strict `==` equality against `XDG_CURRENT_DESKTOP` in launcher / patches (regression detector) | source-tree probe |
+| [S03](../../docs/testing/cases/distribution.md#s03--deb-install-pulls-runtime-deps) | `dpkg-query Depends:` field non-empty (currently fails as upstream-contract regression detector) | dpkg-query |
+| [S04](../../docs/testing/cases/distribution.md#s04--rpm-install-pulls-runtime-deps) | `rpm -qR` has at least one non-`rpmlib(...)` requirement (currently fails per #autoreqprov off) | rpm -qR |
+| [S05](../../docs/testing/cases/distribution.md#s05--doctor-recognises-dnf-installed-package-doesnt-false-flag-as-appimage) | Doctor does not false-flag rpm-installed package (skips when `rpm -qf` doesn't claim the binary) | spawn + stdout grep |
+| [S07](../../docs/testing/cases/shortcuts-and-input.md#s07--claude_use_waylandvar) | Under `CLAUDE_HARNESS_USE_WAYLAND=1`, spawned Electron has `--ozone-platform=wayland` on argv | argv probe |
+| [S08](../../docs/testing/cases/tray-and-window-chrome.md#s08--tray-icon-doesnt-duplicate-after-nativetheme-update) | `setImage`-based in-place fast-path injected by `tray.sh` (KDE-only, file probe) | file probe |
+| [S09](../../docs/testing/cases/shortcuts-and-input.md#s09--quick-window-patch-runs-only-on-kde-post-406-gate) | KDE-gate string present in bundled `index.js` (patch ran at build) | file probe |
+| [S10](../../docs/testing/cases/shortcuts-and-input.md#s10--quick-entry-popup-is-transparent-no-opaque-square-frame) | KDE-W only — popup runtime `getBackgroundColor() === '#00000000'` after Quick Entry opens (regression-detector against electron#50213 if bundled Electron in 41.0.4-bisect-window) | L1 + ydotool |
+| [S11](../../docs/testing/cases/shortcuts-and-input.md#s11--quick-entry-shortcut-fires-from-any-focus-on-wayland-mutter-xwayland-key-grab) | GNOME-X / Ubu-X only (X11-side regression detector) — spawn xterm marker, `xdotool windowfocus` to it, verify `_NET_ACTIVE_WINDOW` shifted, fire `Ctrl+Alt+Space` via ydotool, assert popup visible. Wayland-side mutter regression (#404) is a primitive gap — needs Wayland-native focus injection (libei) | L1 + xdotool focus + ydotool shortcut |
+| S12 | `--enable-features=GlobalShortcutsPortal` in Electron argv (GNOME-W only — currently a known-failing regression detector) | argv probe |
+| [S14](../../docs/testing/cases/shortcuts-and-input.md#s14--global-shortcuts-via-xdg-portal-work-on-niri) | Niri only — spawn `foot` marker, `niri msg action focus-window` to it, verify `niri msg --json focused-window` shifted, fire `Ctrl+Alt+Space` via ydotool, assert popup visible. Currently known-failing detector for the Niri portal `BindShortcuts` path (parallels S12's GNOME-W detector) | L1 + niri msg focus + ydotool shortcut |
+| [S15](../../docs/testing/cases/distribution.md#s15--appimage-extraction---appimage-extract-works-as-documented-fallback) | `--appimage-extract` exits 0; `squashfs-root/AppRun --version` runs without FUSE error | spawn + filesystem |
+| [S16](../../docs/testing/cases/distribution.md#s16--appimage-mount-cleans-up-on-app-exit) | `mount(8)` shows new `.mount_claude` while app is up; gone within 10s of close | mount delta |
+| [S17](../../docs/testing/cases/platform-integration.md#s17--app-launched-from-desktop-inherits-shell-path) | Shell-path-worker overlays user's login-shell PATH onto a deliberately-scrubbed env | L1 + utilityProcess |
+| [S19](../../docs/testing/cases/routines.md#s19--claude_config_dir-redirects-scheduled-task-storage) | `extraEnv: { CLAUDE_CONFIG_DIR }` reaches main-process `process.env`; `cE()`-equivalent resolves under the override path | L1 + extraEnv |
+| [S21](../../docs/testing/cases/routines.md#s21--lid-close-still-suspends-per-os-policy) | No `handle-lid-switch` / `HandleLidSwitch` strings in bundle (lid policy deferred to OS) | asar absence probe |
+| [S22](../../docs/testing/cases/platform-integration.md#s22--computer-use-toggle-absent-or-visibly-disabled-on-linux) | `new Set(["darwin","win32"])` platform gate present; no 2-element Set pairing linux (file-probe form) | asar regex |
+| [S25](../../docs/testing/cases/platform-integration.md#s25--mobile-pairing-survives-linux-session-restart) | `safeStorage.encryptString → file → app restart → file → safeStorage.decryptString` round-trips the same plaintext (skips when `isEncryptionAvailable === false`) | L1 + shared isolation handle |
+| [S26](../../docs/testing/cases/distribution.md#s26--auto-update-is-disabled-when-installed-via-aptdnf) | `setFeedURL` present + project suppression marker present (currently fails — gated on #567) | asar fingerprint |
+| [S27](../../docs/testing/cases/extensibility.md#s27--plugins-install-per-user) | `installed_plugins.json` + homedir resolver present; no `*/plugins` system paths in bundle | asar fingerprint |
+| [S28](../../docs/testing/cases/extensibility.md#s28--worktree-creation-surfaces-clear-error-on-read-only-mounts) | Bundled `index.js` contains the worktree permission classifier expression (`"Permission denied" \|\| "Access is denied" \|\| "could not lock config file" → "permission-denied"`) plus the `Failed to create git worktree:` log line | asar fingerprint |
+| [S29](../../docs/testing/cases/shortcuts-and-input.md#s29--quick-entry-popup-is-created-lazily-on-first-shortcut-press-closed-to-tray-sanity) | Popup opens when main is hidden-to-tray (lazy-create sanity) | L1 |
+| [S30](../../docs/testing/cases/shortcuts-and-input.md#s30--quick-entry-shortcut-becomes-a-no-op-after-full-app-exit) | No new claude-desktop pid spawns after post-exit shortcut press | pgrep delta + ydotool |
+| [S31](../../docs/testing/cases/shortcuts-and-input.md#s31--quick-entry-submit-makes-the-new-chat-reachable-from-any-main-window-state) | Submit reaches new chat from visible / minimized / hidden-to-tray (QE-7/8/9) | L1 + ydotool |
+| S32 | GNOME mutter stale-`isFocused()` regression (GNOME-W/Ubu-W only — known-failing today) | L1 + ydotool |
+| [S33](../../docs/testing/cases/shortcuts-and-input.md#s33--quick-entry-transparent-rendering-tracked-against-bundled-electron-version) | Captures bundled Electron version against the #370 / electron#50213 bisect threshold | file read |
+| [S34](../../docs/testing/cases/shortcuts-and-input.md#s34--quick-entry-shortcut-focuses-fullscreen-main-window-instead-of-showing-popup) | Popup does **not** appear when main is fullscreen (upstream contract) | L1 + ydotool |
+| [S35](../../docs/testing/cases/shortcuts-and-input.md#s35--quick-entry-popup-position-is-persisted-across-invocations-and-across-app-restarts) | Popup position persists across invocations *and* across app restart (two-launch test) | L1 + shared isolation handle + ydotool |
+| S36 | Multi-monitor fallback — skip-on-single-monitor with documented `fixme` for the disconnect orchestration | display probe |
+| S37 | Main-window destroy unreachable on Linux per close-to-tray override — documented skip | — |
+
+These specs exercise the substrate primitives in `lib/`: `xprop`
+shell-outs (T01, T04), `dbus-next` (T03), `dbus-monitor` subprocess
+eavesdrop (T23), Node-inspector runtime-attach
+(T07/T16/T17/T26/S10/S29-S35/T05-T14b L1 specs), `app.asar` content reads
+(S08/S09/S21/S22/S26/S27/S28/T11/T14a/T18/T22/T30/T31/T32/T33/T35/T36/T37/T38/H02/H03/S33 — mostly `index.js`; T18 reads `mainView.js`),
+`/proc/$pid/cmdline` reads (S07/S12), pgrep-based pid deltas
+(T10/T14b/H04/S16/S30), `mount(8)` parsing (S16), source-tree probes
+against `scripts/launcher-common.sh` (S02), `dpkg-query` / `rpm -qR` /
+`rpm -qf` calls (S03/S04/S05/T13), `safeStorage.encryptString`
+round-trip across two launches (S25), `extraEnv` precedence over
+isolation env (S19), the `lib/electron-mocks.ts` mock-then-call
+helpers — `installOpenDialogMock` (T17), `installShowItemInFolderMock`
+(T25), `installOpenExternalMock` (T24) — the `lib/input.ts`
+focus-shifter (`focusOtherWindow` + `spawnMarkerWindow` for S11; X11
+only — `WaylandFocusUnavailable` thrown on native Wayland) and its
+Niri-native sibling `lib/input-niri.ts` (`niri msg --json` for the
+focus-injection + readback chain, `foot --title` for the marker
+window; `NiriIpcUnavailable` thrown off-Niri; consumed by S14), the
+`lib/eipc.ts` registry walker (`getEipcChannels` /
+`waitForEipcChannel` / `waitForEipcChannels` against
+`webContents.ipc._invokeHandlers`; opaque on the UUID, suffix-matched
+against case-doc anchors; consumed by T19 / T20 / T22b / T31b / T33b /
+T38b) plus its session 8 invoke surface (`invokeEipcChannel` — calls
+a registered handler through the renderer-side wrapper at
+`window['claude.<scope>'].<Iface>.<method>`; consumed by T19 / T20 /
+T27 / T33c / T35b / T37b), the `lib/ax.ts` AX-tree substrate
+(`snapshotAx` for one-shot reads + `waitForAxNode` / `waitForAxNodes`
+for predicate-based polling, plus re-exports of `RawElement` /
+`AxNode` / `axTreeToSnapshot` / `waitForAxTreeStable` from
+`explore/walker.ts` so consumers stay inside `lib/`; threshold-
+driven extraction in session 13 once T26 had to duplicate the
+formerly-private `snapshotAx` from `claudeai.ts`; consumed by
+`claudeai.ts` page-objects + T26; session 14 migrated `activateTab`
+from a one-shot snapshot to `waitForAxNode` polling — fixes the
+T16 `no AX-tree button with accessibleName="Code" found` failure
+mode where the Code button hadn't rendered yet at click time —
+and converted `CodeTab.activate`'s post-click `findCompactPills`
+retry loop to `waitForAxNodes`) — and the
+`createIsolation({ seedFromHost: true })` primitive that lets login-
+required tests run hermetically against a copy of the host's signed-
+in auth state (T07, T11_runtime, T16, T17, T19, T20, T21, T22b, T26,
+T27, T31b, T33b, T33c, T35b, T37b, T38b — session 15 migrated T17
+from the legacy `CLAUDE_TEST_USE_HOST_CONFIG=1` / `isolation: null`
+shape to `seedFromHost`, fixing a pre-existing 60s spec-timeout
+flake where the unauth'd default isolation polled `userLoaded` past
+Playwright's spec budget; session 16 verified the migration end-to-
+end — `seedFromHost` clones the host's signed-in config,
+`waitForReady('userLoaded')` resolves to a post-login URL, and the
+session-14 `CodeTab.activate({ timeout: 15_000 })` succeeds; T17
+now reaches a new failure mode at the next chain step: in
+`openFolderPicker` after `selectLocal`, the `Select folder…` pill
+doesn't render on the `/epitaxy` workspace route; it likely needs a
+`/new` context and is deferred to a future session)).
+
+Note on eipc channels: the `LocalSessions_$_*` and `CustomPlugins_$_*`
+channel names referenced in the case-doc Code anchors don't register
+through Electron's *global* `ipcMain.handle()` registry (which only
+carries 3 chat-tab MCP-bridge handlers). They DO register through
+Electron's stdlib `IpcMainImpl` — just on the per-`webContents` IPC
+scope (`webContents.ipc._invokeHandlers`, Electron 17+) rather than
+the global one. The framing is
+`$eipc_message$_<UUID>_$_<scope>_$_<iface>_$_<method>` (UUID stable
+across builds at `c0eed8c9-…`); 117 `LocalSessions_*` + 16
+`CustomPlugins_*` + 50+ other interfaces register on the claude.ai
+webContents. T22 / T31 / T33 / T38 ship as Tier 1 fingerprints
+against the bundled channel-name strings; T22b / T31b / T33b / T38b
+are the runtime registry-presence siblings (strictly stronger,
+require `seedFromHost`). T27 / T33c / T35b / T37b go one step
+further — they invoke the resolved handlers through the renderer-
+side wrapper at `window['claude.<scope>'].<Iface>.<method>`. T19 /
+T20 are first-runtime-probe siblings of case-doc tests whose anchors
+are write-side handlers (`startShellPty` / `writeSessionFile`); they
+ship a five-suffix / three-suffix registration probe over the
+case-doc-anchored write-side surface plus a single foundational
+read-side `LocalSessions/getAll` invocation as the read-side
+surrogate (case-doc connection: integrated terminal and file pane
+both bind to LocalSessions; `getAll` proves the LocalSessions impl
+object is reachable through the renderer wrapper). T21 and
+T11_runtime extend the dual-invocation pattern: when a case-doc has
+read-side anchors with resolvable arg shapes, invoke the case-doc-
+anchored handlers directly rather than through a foundational
+surrogate (T21: `getConfiguredServices` array + `getAutoVerify`
+boolean on a single Launch impl object; T11_runtime: cross-impl-
+object dual invocation (`CustomPlugins/listInstalledPlugins` array +
+`LocalPlugins/getPlugins` array) proves the install plumbing
+crosses both interfaces intact, strictly stronger than single-
+interface coverage). All wrapper
+invocations use the wrapper exposed by `mainView.js` via
+`contextBridge.exposeInMainWorld` after a top-frame + origin gate
+(`Qc()`: claude.ai / claude.com / preview.* / localhost). Calling
+through the wrapper carries an honest `senderFrame` for the inlined
+`le()` / `Vi()` per-handler origin gate, so the test surface matches
+real attack surface. T33c also
+demonstrates the schema-rev path: when invocation rejects with
+`Argument "<name>" at position N ... failed to pass validation`,
+the verbatim rejection string is the cheapest grep target back to
+the inline hand-rolled validator block (bundle bytes 5013601 /
+5018821 for the two CustomPlugins methods). See `lib/eipc.ts` for
+both surfaces, and
+[`runner-implementation-plan.md`](../../docs/testing/runner-implementation-plan.md)
+session 7 / 8 / 9 / 10 status sections for the findings.
+
+Per-row pass/skip counts depend on which sweep runs against the row;
+see `runner-implementation-plan.md` for tier classification and
+matrix-regen for the most-recent per-row outcomes. The Quick Entry
+runners (S29-S35) all share the same primitive set (`installInterceptor()` +
+`openAndWaitReady()` + scenario-specific state setup).
+
+## Prerequisites
+
+On the host or VM running the sweep:
+
+- Node.js ≥ 20
+- `claude-desktop` installed (deb / rpm / AppImage), reachable via `claude-desktop` on `PATH` or `CLAUDE_DESKTOP_LAUNCHER` env var
+- `xprop` (for L2 window queries — `dnf install xorg-x11-utils` on Fedora; `apt install x11-utils` on Debian/Ubuntu)
+- `zstd` (optional — used to bundle results)
+
+### Quick Entry runners (S29–S37, future QE-*)
+
+Quick Entry tests inject the OS-level shortcut via `ydotool` /
+`/dev/uinput`. One-time setup per host or VM:
+
+```sh
+# Install the binary + daemon
+sudo dnf install -y ydotool   # or: sudo apt install ydotool
+
+# Make ydotoold's socket world-writable so the test runner reaches it
+sudo mkdir -p /etc/systemd/system/ydotool.service.d
+sudo tee /etc/systemd/system/ydotool.service.d/override.conf <<'EOF'
+[Service]
+ExecStart=
+ExecStart=/usr/bin/ydotoold --socket-perm=0666
+EOF
+sudo systemctl daemon-reload
+sudo systemctl enable --now ydotool.service
+```
+
+After this, `ydotool key 29:1 29:0` (Ctrl tap) should exit 0. The
+runner sets `YDOTOOL_SOCKET=/tmp/.ydotool_socket` automatically;
+override the env var if your daemon binds elsewhere.
+
+ydotool **cannot** drive portal-grabbed shortcuts (kernel uinput
+events vs compositor portal grabs) — those tests stay manual until
+libei adoption broadens. See [`docs/testing/automation.md`](../../docs/testing/automation.md#input-injection--ydotool-now-libei-next).
+
+## Install
+
+```sh
+cd tools/test-harness
+npm install
+```
+
+`package-lock.json` is gitignored for now; commit it once the dep set is settled.
+
+## Run
+
+```sh
+# Full sweep against the locally installed claude-desktop
+ROW=KDE-W ./orchestrator/sweep.sh
+
+# Single test
+npx playwright test src/runners/T01_app_launch.spec.ts
+
+# Headed (watch the app launch in front of you)
+npx playwright test --headed
+
+# Run the full suite under native Wayland instead of X11/XWayland
+CLAUDE_HARNESS_USE_WAYLAND=1 npm test
+
+# Grounding probe — dump runtime state for the case-doc grounding sweep
+npm run grounding-probe -- --launch --include-synthetic \
+  --out ../../docs/testing/cases-grounding-runtime.json
+```
+
+Results land at `results/results-${ROW}-${DATE}/`:
+
+```
+results/results-KDE-W-20260430T143000Z/
+├── junit.xml             # JUnit summary (matrix-regen input)
+├── html/                 # Playwright HTML report
+└── test-output/          # Per-test attachments (screenshots, logs, etc.)
+```
+
+A bundled `results-${ROW}-${DATE}.tar.zst` sits next to the dir if `zstd`
+is installed.
+
+## Environment variables
+
+| Var | Default | Purpose |
+|-----|---------|---------|
+| `ROW` | `KDE-W` | Matrix row label, propagated into the bundle name and per-test annotations. Drives `skipUnlessRow()` in spec files |
+| `CLAUDE_DESKTOP_LAUNCHER` | `claude-desktop` (PATH lookup) | Path to the launcher / Electron binary Playwright spawns |
+| `CLAUDE_DESKTOP_ELECTRON` | probed | Override the resolved Electron binary path (skips deb/rpm install probing) |
+| `CLAUDE_DESKTOP_APP_ASAR` | probed | Override the resolved `app.asar` path |
+| `CLAUDE_TEST_USE_HOST_CONFIG` | unset | When `1`, opt out of per-test isolation and use the host's real `~/.config/Claude`. Required for tests that need a signed-in claude.ai (S31, future submit-side QE runners). **Side effect:** these tests write to your real account — chats / settings persist |
+| `CLAUDE_HARNESS_USE_WAYLAND` | unset | When `1`, every runner spawns Electron with the native-Wayland backend (`--ozone-platform=wayland` + sibling flags from `launcher-common.sh`) instead of the default X11-via-XWayland. `CLAUDE_USE_WAYLAND=1` is also exported into the spawn env for in-app paths that read it. Per-launch overrides via `launchClaude({ extraEnv })` still win |
+| `YDOTOOL_SOCKET` | `/tmp/.ydotool_socket` | Path to the `ydotoold` socket. Override only if the daemon binds elsewhere |
+| `OUTPUT_DIR` | `./results` | Where bundles land |
+| `RESULTS_DIR` | per-run derived | Single-run output dir (set by `sweep.sh`; usually you don't set this manually) |
+
+### Per-test isolation default
+
+`launchClaude()` creates a fresh `XDG_CONFIG_HOME` / `CLAUDE_CONFIG_DIR`
+under `$TMPDIR/claude-test-*` for every launch and removes it on
+`close()`. This is the default to prevent state leaks between tests
+(SingletonLock collisions, persisted Quick Entry positions, etc. —
+see Decision 1 in [`docs/testing/automation.md`](../../docs/testing/automation.md)).
+Three modes:
+
+- **`launchClaude()`** — default, fresh per-launch isolation.
+- **`launchClaude({ isolation })`** — pass a shared `Isolation` handle
+  to launch the same app twice with persistent state (e.g. S35
+  position-memory across restart).
+- **`launchClaude({ isolation: null })`** — opt out entirely; share
+  the host's `~/.config/Claude`. Used by tests gated on
+  `CLAUDE_TEST_USE_HOST_CONFIG` for signed-in claude.ai access.
+
+## Layout
+
+```
+tools/test-harness/
+├── package.json
+├── tsconfig.json
+├── playwright.config.ts
+├── src/
+│   ├── lib/                       # shared helpers
+│   │   ├── electron.ts            # spawn + isolation + inspector attach
+│   │   ├── inspector.ts           # Node-inspector RPC client (SIGUSR1 path)
+│   │   ├── dbus.ts                # dbus-next session-bus + helpers
+│   │   ├── sni.ts                 # StatusNotifierWatcher / Item
+│   │   ├── wm.ts                  # xprop wrappers (X11 + XWayland)
+│   │   ├── env.ts                 # XDG_CURRENT_DESKTOP / SESSION_TYPE branching
+│   │   ├── row.ts                 # skipUnlessRow / skipOnRow primitives
+│   │   ├── isolation.ts           # per-test XDG_CONFIG_HOME sandbox
+│   │   ├── argv.ts                # /proc/$pid/cmdline reader + flag check
+│   │   ├── asar.ts                # in-place app.asar reads (no temp extract)
+│   │   ├── quickentry.ts          # Quick Entry domain wrapper (popup, MainWindow, ydotool)
+│   │   ├── claudeai.ts            # claude.ai renderer UI domain (CodeTab, dialog mock, atoms)
+│   │   ├── electron-mocks.ts      # mock-then-call helpers (dialog/showItemInFolder/openExternal)
+│   │   ├── input.ts               # focus-shifter primitive (X11 only — xdotool + xprop verify; spawnMarkerWindow xterm)
+│   │   ├── input-niri.ts          # focus-shifter primitive (Niri only — niri msg --json verify; spawnMarkerWindow foot)
+│   │   ├── eipc.ts                # eipc-channel registry walker (per-webContents IPC scope; suffix-matched, UUID-opaque)
+│   │   ├── retry.ts               # poll-until-true with timeout
+│   │   └── diagnostics.ts         # launcher log, --doctor, session env
+│   └── runners/                   # one .spec.ts per test ID
+│       ├── T01_app_launch.spec.ts
+│       ├── T03_tray_icon_present.spec.ts
+│       ├── T04_window_decorations.spec.ts
+│       ├── T17_folder_picker.spec.ts
+│       ├── S09_quick_window_patch_only_kde.spec.ts
+│       ├── S12_global_shortcuts_portal_flag.spec.ts
+│       ├── S29_quick_entry_lazy_create_closed_to_tray.spec.ts
+│       ├── S30_quick_entry_noop_after_app_exit.spec.ts
+│       ├── S31_quick_entry_submit_reaches_new_chat.spec.ts
+│       ├── S32_quick_entry_submit_gnome_stale_isfocused.spec.ts
+│       ├── S33_electron_version_capture.spec.ts
+│       ├── S34_shortcut_focuses_fullscreen_main.spec.ts
+│       ├── S35_quick_entry_position_persisted_across_restarts.spec.ts
+│       ├── S36_quick_entry_fallback_to_primary_display.spec.ts
+│       ├── S37_quick_entry_popup_after_main_destroy.spec.ts
+│       ├── H01_cdp_gate_canary.spec.ts
+│       ├── H02_frame_fix_wrapper_present.spec.ts
+│       ├── H03_patch_fingerprints.spec.ts
+│       └── H04_cowork_daemon_lifecycle.spec.ts
+├── probe.ts                       # one-off renderer-DOM probe (debugger on :9229)
+├── grounding-probe.ts             # case-grounding runtime capture (see "Grounding probe" below)
+└── orchestrator/
+    └── sweep.sh                   # row-aware harness invocation
+```
+
+H-prefix specs are harness self-tests — they validate the harness's
+preconditions and the build pipeline's invariants (CDP gate alive,
+patches landed, daemon lifecycle clean). Cheap, run in <1s each
+except H04 which launches the app.
+
+## How L1 testing works (the SIGUSR1 path)
+
+The shipped Electron has a CDP auth gate that exits the app whenever
+`--remote-debugging-port` or `--remote-debugging-pipe` is on argv and a
+valid `CLAUDE_CDP_AUTH` token isn't in env. Both Playwright's
+`_electron.launch()` and `chromium.connectOverCDP()` inject the gated
+flag, so both are blocked.
+
+The gate doesn't check `--inspect` or runtime `SIGUSR1`, which is the
+same code path as the in-app `Developer → Enable Main Process Debugger`
+menu item. So:
+
+1. `launchClaude()` spawns Electron with no debug-port flags (gate
+   asleep) and waits for the X11 window.
+2. `app.attachInspector()` sends `SIGUSR1` to the pid; Node's inspector
+   opens on port 9229.
+3. `lib/inspector.ts` connects via WebSocket and exposes
+   `evalInMain(body)` and `evalInRenderer(urlFilter, js)` for tests.
+
+From the inspector you can:
+- Drive the renderer via `webContents.executeJavaScript()`
+- Install main-process mocks (e.g. `dialog.showOpenDialog` for T17)
+- Inspect any Electron API state
+
+Two gotchas worth knowing:
+
+- `BrowserWindow.getAllWindows()` returns an empty array because frame-fix-wrapper
+  substitutes the BrowserWindow class. Use `webContents.getAllWebContents()`
+  instead — works correctly and includes both the shell window and the
+  embedded claude.ai BrowserView.
+- `Runtime.evaluate` with `awaitPromise: true` returns empty objects for
+  awaited Promise resolutions. `inspector.evalInMain<T>()` returns
+  `JSON.stringify(value)` from the IIFE and parses on the caller side
+  to dodge this.
+
+Full writeup with rationale and tradeoffs:
+[`docs/testing/automation.md` "The CDP auth gate"](../../docs/testing/automation.md#the-cdp-auth-gate-and-the-runtime-attach-workaround-that-beats-it).
+
+## Grounding probe
+
+`grounding-probe.ts` is a separate entry-point — not a Playwright spec —
+that connects to a live Claude Desktop and dumps the runtime state
+backing the load-bearing claims in
+[`docs/testing/cases/`](../../docs/testing/cases/). It exists because
+static grep against the 546k-line beautified bundle has known blind
+spots (lazy `import()`s, dynamic handler tables, conditional wiring),
+and some claims (S26 autoUpdater gate, S20 powerSaveBlocker path) can
+only be verified at runtime.
+
+```sh
+# Self-contained: launchClaude() + capture + tear down
+npm run grounding-probe -- --launch
+
+# Plus the one synthetic probe (powerSaveBlocker start+stop)
+npm run grounding-probe -- --launch --include-synthetic
+
+# Attach to an already-running app (manual --inspect=9229 setup)
+npm run grounding-probe -- --port 9229 --out /tmp/probe.json
+```
+
+Output is keyed by test ID — see the file's header comment for the
+full table. Diff captures across upstream version bumps to spot
+behavior drift the static sweep would miss. Surfaces inside modals
+or popups (T22 PR toolbar, T26 preset list, T31 side chat, T32 slash
+menu) need the surface open at probe time — the AX-tree fingerprint
+is a snapshot of what's currently on screen.
+
+## Known limitations
+- **T04** uses `xprop` (no `xdotool` dependency — walks `_NET_CLIENT_LIST` + `_NET_WM_PID`). Works on X11 native and KDE Wayland (XWayland), **not** on native-Wayland sessions where the app is running through Ozone-Wayland directly. Per Decision 6, project default is X11; native-Wayland window-state queries are deferred until those tests get added.
+- **T17** is shallow — it intercepts `dialog.showOpenDialog` at the Electron main process level. The integration question "does Claude make the right *portal* call?" is a v2 concern; portal-level mocking via `dbus-next` is sketched in [`docs/testing/automation.md`](../../docs/testing/automation.md) but requires displacing the running portal service or running under `dbus-run-session`.
+- **`render-matrix.sh`** isn't here yet. `sweep.sh` prints a summary; the `matrix.md` regen step from JUnit is the next addition.
+- **No CI wrapper.** Decision 4: the harness is invokable from CI but sweeps run from the dev box for the first ~20 tests.
+
+## Adding a test
+
+1. Pick the `T##` / `S##` from [`docs/testing/cases/`](../../docs/testing/cases/).
+2. Drop `src/runners/T##_short_name.spec.ts`. Use the existing specs as templates; match the layer (L1 / L2) to the test's assertion shape.
+3. First line of the test body: `skipUnlessRow(testInfo, ['KDE-W', ...])`. JUnit `<skipped>` → matrix `-`, never `✗` for a row that doesn't apply.
+4. Tag the test with `severity` and `surface` annotations so the JUnit output carries them.
+5. Capture diagnostics via `testInfo.attach()` — these become Decision 7 "always-on" captures regardless of pass/fail. For tests that need richer state on failure, wrap your scenarios in a results-collector and attach a single JSON dump (S31's pattern).
+6. No fixed `sleep`s. Use `retryUntil` or Playwright's auto-wait.
+
+### Hooking Electron — read this before reaching for `BrowserWindow`
+
+`scripts/frame-fix-wrapper.js` returns the `electron` module wrapped
+in a `Proxy` whose `get` trap returns a closure-captured
+`PatchedBrowserWindow`. **Constructor-level wraps don't work** — your
+`electron.BrowserWindow = WrappedCtor` write lands on the underlying
+module but the Proxy keeps returning `PatchedBrowserWindow` on
+read, so the wrap is bypassed. The reliable hook is at the
+**prototype-method level**:
+
+```ts
+// in inspector.evalInMain(...)
+const proto = electron.BrowserWindow.prototype;
+const orig = proto.loadFile;
+proto.loadFile = function(filePath, ...rest) {
+  // record `this` + filePath; identify popups by filePath suffix
+  return orig.call(this, filePath, ...rest);
+};
+```
+
+This captures every instance regardless of subclass identity.
+Construction-time options (`transparent: true`, `frame: false`,
+etc.) aren't observable through this hook — use runtime
+equivalents instead (`getBackgroundColor()`, `getContentBounds()
+vs getBounds()`, `isAlwaysOnTop()`). `lib/quickentry.ts` is the
+worked example.
--- a/tools/test-harness/eipc-registry-probe.ts
+++ b/tools/test-harness/eipc-registry-probe.ts
@@ -0,0 +1,309 @@
+// Probe to verify whether the eipc channel registry (LocalSessions_$_*,
+// CustomPlugins_$_*) is reachable from main via webContents.ipc._invokeHandlers
+// instead of the empty-on-this-build globalThis.ipcMain._invokeHandlers.
+//
+// Run from tools/test-harness against a running claude-desktop with the
+// main-process debugger enabled (Developer → Enable Main Process Debugger
+// in the app menu, or launch `claude-desktop` with --inspect):
+//   npx tsx eipc-registry-probe.ts
+//
+// Useful states to probe (re-run to compare):
+//   * fresh launch — whichever tab opens by default
+//   * /epitaxy with a Code session open
+//   * /chats with a chat thread open
+//   * cowork tab loaded
+// The per-interface breakdown surfaces which interfaces register lazily
+// vs eagerly — useful for designing the lib/eipc.ts primitive's wait
+// semantics.
+//
+// Non-destructive — read-only enumeration of handler keys. Doesn't invoke
+// anything, doesn't register anything, doesn't mutate state.
+
+import { InspectorClient } from './src/lib/inspector.js';
+import { writeFileSync } from 'node:fs';
+
+interface InterfaceCount {
+	scope: string;
+	iface: string;
+	count: number;
+	sampleMethods: string[];
+}
+
+interface PerWcReport {
+	id: number;
+	url: string;
+	type: string;
+	hasIpc: boolean;
+	hasInvokeHandlers: boolean;
+	totalHandlers: number;
+	framedCount: number;
+	unframedCount: number;
+	scopes: string[];
+	byInterface: InterfaceCount[];
+	unframedSample: string[];
+}
+
+async function main() {
+	const client = await InspectorClient.connect(9229);
+
+	// Confirm globalThis.ipcMain._invokeHandlers is empty (or near-empty)
+	// — that's session 3's finding and we want it on the record alongside
+	// the per-wc reading for contrast.
+	const ipcMainReport = await client.evalInMain<{
+		hasIpcMain: boolean;
+		ipcMainKeys: string[];
+		ipcMainCount: number;
+	}>(`
+		const electron = process.mainModule.require('electron');
+		const ipcMain = electron.ipcMain;
+		const map = ipcMain && ipcMain._invokeHandlers;
+		if (!map) {
+			return { hasIpcMain: !!ipcMain, ipcMainKeys: [], ipcMainCount: 0 };
+		}
+		const keys = (typeof map.keys === 'function')
+			? Array.from(map.keys())
+			: Object.keys(map);
+		return {
+			hasIpcMain: true,
+			ipcMainKeys: keys,
+			ipcMainCount: keys.length,
+		};
+	`);
+
+	// Per-webContents enumeration with full framing parse:
+	//   $eipc_message$_<UUID>_$_<scope>_$_<interface>_$_<method>
+	// Scope examples: claude.settings, claude.web, claude.app_internal.
+	// Interface examples: GlobalShortcut, LocalSessions, CustomPlugins.
+	// We group by scope.iface to show which feature areas are populated
+	// on each webContents — what registers eagerly vs on-tab-load.
+	const perWcReports = await client.evalInMain<PerWcReport[]>(`
+		const { webContents } = process.mainModule.require('electron');
+		const re = /^\\$eipc_message\\$_[0-9a-f-]+_\\$_([^_]+(?:\\.[^_]+)*)_\\$_([^_]+)_\\$_(.+)$/;
+		const all = webContents.getAllWebContents();
+		const out = [];
+		for (const w of all) {
+			const ipc = w.ipc;
+			const invokeMap = ipc && ipc._invokeHandlers;
+			let keys = [];
+			let hasInvokeHandlers = false;
+			if (invokeMap) {
+				hasInvokeHandlers = true;
+				if (typeof invokeMap.keys === 'function') {
+					keys = Array.from(invokeMap.keys());
+				} else {
+					keys = Object.keys(invokeMap);
+				}
+			}
+			const groups = new Map();
+			const scopes = new Set();
+			let framedCount = 0;
+			let unframedCount = 0;
+			const unframedSample = [];
+			for (const k of keys) {
+				const m = re.exec(k);
+				if (!m) {
+					unframedCount++;
+					if (unframedSample.length < 8) unframedSample.push(k);
+					continue;
+				}
+				framedCount++;
+				const scope = m[1];
+				const iface = m[2];
+				const method = m[3];
+				scopes.add(scope);
+				const groupKey = scope + '/' + iface;
+				let g = groups.get(groupKey);
+				if (!g) {
+					g = { scope, iface, count: 0, sampleMethods: [] };
+					groups.set(groupKey, g);
+				}
+				g.count++;
+				if (g.sampleMethods.length < 4) g.sampleMethods.push(method);
+			}
+			const byInterface = Array.from(groups.values())
+				.sort((a, b) => b.count - a.count);
+			out.push({
+				id: w.id,
+				url: w.getURL(),
+				type: w.getType ? w.getType() : 'unknown',
+				hasIpc: !!ipc,
+				hasInvokeHandlers,
+				totalHandlers: keys.length,
+				framedCount,
+				unframedCount,
+				scopes: Array.from(scopes).sort(),
+				byInterface,
+				unframedSample,
+			});
+		}
+		return out;
+	`);
+
+	// For each case-doc anchored channel, find which webContents (if any)
+	// hosts it. The framing prefix `$eipc_message$_<UUID>_$_claude.web_$_`
+	// is build-stable per session 2's T38 finding, so we match by suffix.
+	const expected = [
+		// T22 — gh PR check monitoring
+		'LocalSessions_$_getPrChecks',
+		// T31 — side chat trio
+		'LocalSessions_$_startSideChat',
+		'LocalSessions_$_sendSideChatMessage',
+		'LocalSessions_$_stopSideChat',
+		// T33 — plugin browser
+		'CustomPlugins_$_listMarketplaces',
+		'CustomPlugins_$_listAvailablePlugins',
+		// T38 — Continue in IDE
+		'LocalSessions_$_openInEditor',
+	];
+
+	const expectedReport = await client.evalInMain<
+		Array<{ suffix: string; foundOn: number[]; matchedKeys: string[] }>
+	>(`
+		const { webContents } = process.mainModule.require('electron');
+		const expected = ${JSON.stringify(expected)};
+		const all = webContents.getAllWebContents();
+		const out = [];
+		for (const suffix of expected) {
+			const foundOn = [];
+			const matchedKeys = [];
+			for (const w of all) {
+				const ipc = w.ipc;
+				const invokeMap = ipc && ipc._invokeHandlers;
+				if (!invokeMap) continue;
+				const keys = (typeof invokeMap.keys === 'function')
+					? Array.from(invokeMap.keys())
+					: Object.keys(invokeMap);
+				for (const k of keys) {
+					if (k.endsWith(suffix)) {
+						if (!foundOn.includes(w.id)) foundOn.push(w.id);
+						if (!matchedKeys.includes(k)) matchedKeys.push(k);
+					}
+				}
+			}
+			out.push({ suffix, foundOn, matchedKeys });
+		}
+		return out;
+	`);
+
+	// Snapshot the framing UUID(s) — useful to confirm build-stability
+	// across the per-wc registries (session 2 noted it as build-stable
+	// `c0eed8c9-...`).
+	const framingReport = await client.evalInMain<{
+		uuidsSeen: string[];
+		samplesPerUuid: Record<string, string[]>;
+	}>(`
+		const { webContents } = process.mainModule.require('electron');
+		const re = /^\\$eipc_message\\$_([0-9a-f-]+)_\\$_/;
+		const uuidsSeen = new Set();
+		const samples = {};
+		for (const w of webContents.getAllWebContents()) {
+			const ipc = w.ipc;
+			const invokeMap = ipc && ipc._invokeHandlers;
+			if (!invokeMap) continue;
+			const keys = (typeof invokeMap.keys === 'function')
+				? Array.from(invokeMap.keys())
+				: Object.keys(invokeMap);
+			for (const k of keys) {
+				const m = re.exec(k);
+				if (!m) continue;
+				const uuid = m[1];
+				uuidsSeen.add(uuid);
+				if (!samples[uuid]) samples[uuid] = [];
+				if (samples[uuid].length < 3) samples[uuid].push(k);
+			}
+		}
+		return {
+			uuidsSeen: Array.from(uuidsSeen),
+			samplesPerUuid: samples,
+		};
+	`);
+
+	console.log('=== globalThis.ipcMain._invokeHandlers (session 3 baseline) ===');
+	console.log(JSON.stringify(ipcMainReport, null, 2));
+
+	console.log('\n=== Per-webContents IPC registries ===');
+	console.log(JSON.stringify(perWcReports, null, 2));
+
+	console.log('\n=== Expected case-doc-anchored channel resolution ===');
+	console.log(JSON.stringify(expectedReport, null, 2));
+
+	console.log('\n=== Framing UUID(s) observed ===');
+	console.log(JSON.stringify(framingReport, null, 2));
+
+	// Cross-webContents per-interface deltas — useful when comparing
+	// "fresh launch" vs "after navigating to /epitaxy" vs "after opening
+	// cowork tab". Lists every (scope, iface) seen anywhere with the
+	// per-wc breakdown of which has it.
+	const interfaceAcrossWcs = (() => {
+		const matrix = new Map<string, Map<number, number>>();
+		for (const wc of perWcReports) {
+			for (const g of wc.byInterface) {
+				const key = `${g.scope}/${g.iface}`;
+				let row = matrix.get(key);
+				if (!row) {
+					row = new Map();
+					matrix.set(key, row);
+				}
+				row.set(wc.id, g.count);
+			}
+		}
+		const out: Array<{
+			interfaceKey: string;
+			perWc: Record<string, number>;
+			total: number;
+		}> = [];
+		for (const [key, row] of matrix) {
+			const perWc: Record<string, number> = {};
+			let total = 0;
+			for (const [wcId, count] of row) {
+				perWc[`wc${wcId}`] = count;
+				total += count;
+			}
+			out.push({ interfaceKey: key, perWc, total });
+		}
+		out.sort((a, b) => b.total - a.total);
+		return out;
+	})();
+
+	console.log('\n=== Interface presence across webContents ===');
+	console.log(JSON.stringify(interfaceAcrossWcs, null, 2));
+
+	const totalAll = perWcReports.reduce((a, r) => a + r.totalHandlers, 0);
+	const totalFramed = perWcReports.reduce((a, r) => a + r.framedCount, 0);
+	const totalUnframed = perWcReports.reduce((a, r) => a + r.unframedCount, 0);
+	const expectedFound = expectedReport.filter((e) => e.foundOn.length > 0).length;
+	const totalDistinctInterfaces = new Set(
+		perWcReports.flatMap((r) => r.byInterface.map((g) => `${g.scope}/${g.iface}`)),
+	).size;
+
+	console.log('\n=== Summary ===');
+	console.log(JSON.stringify({
+		webContentsCount: perWcReports.length,
+		webContentsUrls: perWcReports.map((r) => `wc${r.id}: ${r.url}`),
+		ipcMainHandlerCount: ipcMainReport.ipcMainCount,
+		perWcTotalHandlerCount: totalAll,
+		perWcFramedCount: totalFramed,
+		perWcUnframedCount: totalUnframed,
+		distinctInterfacesAcrossAllWcs: totalDistinctInterfaces,
+		expectedSuffixesFound: `${expectedFound} / ${expected.length}`,
+		framingUuidsObserved: framingReport.uuidsSeen.length,
+	}, null, 2));
+
+	const out = {
+		ipcMainReport,
+		perWcReports,
+		expectedReport,
+		framingReport,
+		interfaceAcrossWcs,
+	};
+	writeFileSync('/tmp/eipc-registry-probe.json', JSON.stringify(out, null, 2));
+	console.log('\nFull dump → /tmp/eipc-registry-probe.json');
+
+	client.close();
+	process.exit(0);
+}
+
+main().catch((err) => {
+	console.error('probe failed:', err);
+	process.exit(1);
+});
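
Reviewer sketch (not part of the patch): the probe above resolves case-doc channels by suffix and reads the framing UUID off each key with a prefix regex. Below is a minimal standalone version of that matching. It assumes keys shaped roughly like `$eipc_message$_<uuid>_$_<scope>_$_<Iface>_$_<method>`; the sample keys, the all-zero UUID, and the helper names `framingUuid` and `resolveSuffixes` are hypothetical and only illustrate the idea.

```typescript
// Same prefix pattern the probe evaluates in the main process.
const FRAMING_RE = /^\$eipc_message\$_([0-9a-f-]+)_\$_/;

// Returns the framing UUID for a framed key, or null for an unframed one.
function framingUuid(key: string): string | null {
	const m = FRAMING_RE.exec(key);
	return m?.[1] ?? null;
}

// For each expected suffix, list the registry keys that end with it.
function resolveSuffixes(
	keys: string[],
	suffixes: string[],
): Map<string, string[]> {
	const out = new Map<string, string[]>();
	for (const suffix of suffixes) {
		out.set(suffix, keys.filter((k) => k.endsWith(suffix)));
	}
	return out;
}

// Hypothetical keys; real ones come from webContents.ipc._invokeHandlers.
const sampleKeys = [
	'$eipc_message$_00000000-0000-0000-0000-000000000000_$_claude.web_$_LocalSessions_$_getPrChecks',
	'$eipc_message$_00000000-0000-0000-0000-000000000000_$_claude.web_$_CustomPlugins_$_listMarketplaces',
	'unframed-channel',
];

const resolved = resolveSuffixes(sampleKeys, [
	'LocalSessions_$_getPrChecks',
	'LocalSessions_$_openInEditor',
]);
const firstUuid = framingUuid(sampleKeys[0] ?? '');
console.log(resolved, firstUuid);
```

Matching by suffix keeps the check independent of the UUID and scope segments, which is why an unresolved suffix shows up as an empty match list rather than an error.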
--- a/tools/test-harness/explore/derive-vocabulary.ts
+++ b/tools/test-harness/explore/derive-vocabulary.ts
@@ -0,0 +1,280 @@
+// Derives the stable-UI vocabulary corpus from an existing inventory.
+// Output is committed at docs/testing/ui-vocabulary.json and consumed
+// by the v7 walker (Phase 2) when classifying captured accessible-
+// names. Re-run on each major upstream release.
+//
+// Rules (adapted from the v7 plan to the v6-collapsed inventory shape):
+//   - Persistent entries collapse to one inventory entry with a
+//     `surfaces[]` array recording every surface the element was
+//     observed on. Any persistent label whose surfaces[] has length
+//     >= 2 is stable by definition.
+//   - Structural / menu entries: stable if the label is shared by 3+
+//     entries OR appears on 2+ distinct surfaces. Either signal is
+//     enough — the plan's strict 3-and-2 conjunction over-rejects
+//     against a v6-collapsed inventory where most chrome already
+//     deduped to one entry.
+//   - Names matching any INSTANCE_SHAPES regex go to instanceShapes
+//     and are excluded from stable / suspect even if they would have
+//     qualified — the instance-shape pattern is the canonical
+//     representation for those at resolve time.
+//   - kind: instance entries are excluded from the stable corpus
+//     entirely — those labels by definition vary per session. (A
+//     label that appears in BOTH instance and structural entries
+//     follows the structural / menu rule.)
+//   - Everything else falls through to `suspect`, queued for human
+//     reconciliation.
+
+import {
+	existsSync,
+	readFileSync,
+	renameSync,
+	writeFileSync,
+} from 'node:fs';
+import { dirname, resolve } from 'node:path';
+import { fileURLToPath } from 'node:url';
+
+import { INSTANCE_SHAPES } from '../src/lib/name-classifier.js';
+import type { Inventory, InventoryEntry } from './walker.js';
+
+const HERE = dirname(fileURLToPath(import.meta.url));
+const TESTING_DIR = resolve(HERE, '..', '..', '..', 'docs', 'testing');
+const DEFAULT_INVENTORY = resolve(TESTING_DIR, 'ui-inventory.json');
+const DEFAULT_OUTPUT = resolve(TESTING_DIR, 'ui-vocabulary.json');
+
+interface CliOpts {
+	inventory: string;
+	output: string;
+	help: boolean;
+}
+
+interface InstanceShapeOutput {
+	id: string;
+	regex: string;
+	flags: string;
+	pattern: string | null;
+	matchedNames: string[];
+}
+
+interface VocabularyOutput {
+	derivedAt: string;
+	sourceInventory: {
+		capturedAt: string;
+		appVersion: string;
+		walkerVersion: string;
+		totalElements: number;
+	};
+	stable: string[];
+	instanceShapes: InstanceShapeOutput[];
+	suspect: string[];
+}
+
+function parseCli(argv: string[]): CliOpts {
+	const opts: CliOpts = {
+		inventory: DEFAULT_INVENTORY,
+		output: DEFAULT_OUTPUT,
+		help: false,
+	};
+	for (let i = 0; i < argv.length; i += 1) {
+		const a = argv[i]!;
+		switch (a) {
+			case '-h':
+			case '--help':
+				opts.help = true;
+				break;
+			case '--inventory': {
+				const v = argv[++i];
+				if (!v) {
+					process.stderr.write('--inventory requires a path\n');
+					process.exit(1);
+				}
+				opts.inventory = resolve(v);
+				break;
+			}
+			case '--output': {
+				const v = argv[++i];
+				if (!v) {
+					process.stderr.write('--output requires a path\n');
+					process.exit(1);
+				}
+				opts.output = resolve(v);
+				break;
+			}
+			default:
+				process.stderr.write(
+					`derive-vocabulary: unknown argument: ${a}\n`,
+				);
+				printUsage();
+				process.exit(1);
+		}
+	}
+	return opts;
+}
+
+function printUsage(): void {
+	process.stdout.write(
+		'Usage: tsx explore/derive-vocabulary.ts [options]\n' +
+			'\n' +
+			'Derives docs/testing/ui-vocabulary.json from an existing\n' +
+			'inventory walk. Output records the stable-UI corpus, the\n' +
+			'instance-shape registry hits, and any names flagged for\n' +
+			'human triage.\n' +
+			'\n' +
+			'Options:\n' +
+			'  --inventory <path>  Override default inventory path\n' +
+			'                      (default: docs/testing/ui-inventory.json)\n' +
+			'  --output <path>     Override default vocabulary output path\n' +
+			'                      (default: docs/testing/ui-vocabulary.json)\n' +
+			'  -h, --help          Print this help and exit\n',
+	);
+}
+
+function loadInventory(path: string): Inventory {
+	if (!existsSync(path)) {
+		process.stderr.write(
+			`derive-vocabulary: inventory not found: ${path}\n`,
+		);
+		process.exit(1);
+	}
+	try {
+		return JSON.parse(readFileSync(path, 'utf8')) as Inventory;
+	} catch (err) {
+		const msg = err instanceof Error ? err.message : String(err);
+		process.stderr.write(
+			`derive-vocabulary: failed to parse inventory: ${msg}\n`,
+		);
+		process.exit(1);
+	}
+}
+
+interface LabelStats {
+	kinds: Set<InventoryEntry['kind']>;
+	surfaces: Set<string>;
+	entryCount: number;
+	maxPersistentSpan: number;
+}
+
+function aggregate(inv: Inventory): Map<string, LabelStats> {
+	const stats = new Map<string, LabelStats>();
+	for (const e of inv.entries) {
+		const lbl = e.label;
+		if (!lbl) continue;
+		let s = stats.get(lbl);
+		if (!s) {
+			s = {
+				kinds: new Set(),
+				surfaces: new Set(),
+				entryCount: 0,
+				maxPersistentSpan: 0,
+			};
+			stats.set(lbl, s);
+		}
+		s.kinds.add(e.kind);
+		s.surfaces.add(e.surface);
+		s.entryCount += 1;
+		if (e.kind === 'persistent' && e.surfaces) {
+			s.maxPersistentSpan = Math.max(
+				s.maxPersistentSpan,
+				e.surfaces.length,
+			);
+		}
+	}
+	return stats;
+}
+
+function classify(inv: Inventory): VocabularyOutput {
+	const stats = aggregate(inv);
+	const stable = new Set<string>();
+	const suspect = new Set<string>();
+	const instanceHits = new Map<string, Set<string>>();
+	for (const shape of INSTANCE_SHAPES) {
+		instanceHits.set(shape.id, new Set());
+	}
+
+	for (const [lbl, s] of stats) {
+		// Pure-instance label — exclude entirely.
+		if (s.kinds.size === 1 && s.kinds.has('instance')) {
+			continue;
+		}
+
+		// Instance-shape match, record + skip (search() ignores lastIndex).
+		let shapeMatched = false;
+		for (const shape of INSTANCE_SHAPES) {
+			if (lbl.search(shape.regex) !== -1) {
+				instanceHits.get(shape.id)!.add(lbl);
+				shapeMatched = true;
+				break;
+			}
+		}
+		if (shapeMatched) continue;
+
+		// Persistent: surfaces[] >= 2 carries the proof that the chrome
+		// element actually spans surfaces.
+		if (s.maxPersistentSpan >= 2) {
+			stable.add(lbl);
+			continue;
+		}
+
+		// Structural / menu: 3+ entries OR 2+ distinct surfaces.
+		if (s.entryCount >= 3 || s.surfaces.size >= 2) {
+			stable.add(lbl);
+			continue;
+		}
+
+		suspect.add(lbl);
+	}
+
+	const instanceShapesOut: InstanceShapeOutput[] = INSTANCE_SHAPES.map(
+		(shape) => ({
+			id: shape.id,
+			regex: shape.regex.source,
+			flags: shape.regex.flags,
+			pattern: shape.pattern,
+			matchedNames: [...instanceHits.get(shape.id)!].sort(),
+		}),
+	);
+
+	return {
+		derivedAt: new Date().toISOString(),
+		sourceInventory: {
+			capturedAt: inv.capturedAt,
+			appVersion: inv.appVersion,
+			walkerVersion: inv.walkerVersion,
+			totalElements: inv.totalElements,
+		},
+		stable: [...stable].sort(),
+		instanceShapes: instanceShapesOut,
+		suspect: [...suspect].sort(),
+	};
+}
+
+function atomicWrite(path: string, body: string): void {
+	const tmp = `${path}.tmp`;
+	writeFileSync(tmp, body, 'utf8');
+	renameSync(tmp, path);
+}
+
+function main(): void {
+	const opts = parseCli(process.argv.slice(2));
+	if (opts.help) {
+		printUsage();
+		return;
+	}
+	const inv = loadInventory(opts.inventory);
+	const out = classify(inv);
+	const body = `${JSON.stringify(out, null, 2)}\n`;
+	atomicWrite(opts.output, body);
+
+	const shapeHitTotal = out.instanceShapes.reduce(
+		(n, s) => n + s.matchedNames.length,
+		0,
+	);
+	process.stdout.write(
+		`derive-vocabulary: wrote ${opts.output}\n` +
+			`  source: ${opts.inventory} (${inv.totalElements} entries)\n` +
+			`  stable: ${out.stable.length}, ` +
+			`instance-shaped: ${shapeHitTotal} (${out.instanceShapes.filter((s) => s.matchedNames.length > 0).length} shapes hit), ` +
+			`suspect: ${out.suspect.length}\n`,
+	);
+}
+
+main();
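
Reviewer sketch (not part of the patch): the rule order in `classify` can be read as a single decision function over the per-label stats. The version below is a simplified illustration under stated assumptions: the `Kind` union, the single regex in `SHAPES`, and the helper names `verdictFor` and `stats` are hypothetical stand-ins for `InventoryEntry['kind']`, `INSTANCE_SHAPES`, and the real aggregation.

```typescript
// Assumed kind names; the real union lives on InventoryEntry['kind'].
type Kind = 'persistent' | 'structural' | 'menu' | 'instance';

interface LabelStats {
	kinds: Set<Kind>;
	surfaces: Set<string>;
	entryCount: number;
	maxPersistentSpan: number;
}

type Verdict = 'excluded' | 'instance-shape' | 'stable' | 'suspect';

// Hypothetical shape registry with one non-global pattern.
const SHAPES: RegExp[] = [/^Untitled \d+$/];

function verdictFor(label: string, s: LabelStats): Verdict {
	// Pure-instance labels never enter the corpus.
	if (s.kinds.size === 1 && s.kinds.has('instance')) return 'excluded';
	// Shape hits are recorded separately, even if they would qualify.
	if (SHAPES.some((re) => re.test(label))) return 'instance-shape';
	// Persistent entry observed on two or more surfaces.
	if (s.maxPersistentSpan >= 2) return 'stable';
	// Structural / menu: either signal is enough.
	if (s.entryCount >= 3 || s.surfaces.size >= 2) return 'stable';
	return 'suspect';
}

// Small builder so the examples stay readable.
function stats(
	kinds: Kind[],
	surfaces: string[],
	entryCount: number,
	span = 0,
): LabelStats {
	return {
		kinds: new Set(kinds),
		surfaces: new Set(surfaces),
		entryCount,
		maxPersistentSpan: span,
	};
}

const verdicts = {
	persistentChrome: verdictFor('Settings', stats(['persistent'], ['home'], 1, 3)),
	pureInstance: verdictFor('Trip notes', stats(['instance'], ['home'], 1)),
	shaped: verdictFor('Untitled 3', stats(['structural'], ['home'], 1)),
	twoSurfaces: verdictFor('More', stats(['structural'], ['home', 'chat'], 2)),
	oneOff: verdictFor('Export', stats(['menu'], ['home'], 1)),
};
console.log(verdicts);
```

The ordering matters: a label that matches a shape is diverted before the stable checks, so it cannot end up in both `instanceShapes` and `stable`.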
--- a/tools/test-harness/explore/diff.ts
+++ b/tools/test-harness/explore/diff.ts
@@ -0,0 +1,313 @@
+// Snapshot comparator.
+//
+// Diff semantics, in priority order:
+//   - removed:  an element keyed in A is absent from B  → drift signal.
+//   - changed:  same key, different description (text, count, state)  → drift.
+//   - added:    new key in B  → informational only (UI gained surface).
+//
+// Keys are stable identity tokens chosen per element class:
+//   - df-pill:        aria-label  (Chat / Cowork / Code)
+//   - compactPill:    inner text  (env value, "Select folder…", …)
+//   - ariaButton:     aria-label  (sidebar "more" buttons share labels;
+//                     we de-dup by counting; see compareCounts below)
+//   - modal:          headingText ?? aria-label ?? aria-labelledby
+//   - openMenu:       items diffed by `${role}::${text}`
+//
+// Pure module — no I/O, no process.exit. The dispatcher reads files
+// and prints; this file just produces a Diff value.
+
+import type {
+	AriaButton,
+	CompactPillSnap,
+	DfPill,
+	MenuItem,
+	ModalSnap,
+	OpenMenu,
+	Snapshot,
+} from './snapshot.js';
+
+export interface DiffEntry {
+	kind: 'removed' | 'changed' | 'added';
+	category: string;
+	key: string;
+	before?: string;
+	after?: string;
+}
+
+export interface DiffResult {
+	a: { capturedAt: string; url: string; appVersion: string | null };
+	b: { capturedAt: string; url: string; appVersion: string | null };
+	entries: DiffEntry[];
+	summary: { removed: number; changed: number; added: number };
+}
+
+export function diff(a: Snapshot, b: Snapshot): DiffResult {
+	const entries: DiffEntry[] = [];
+	entries.push(...diffDfPills(a.dfPills, b.dfPills));
+	entries.push(...diffCompactPills(a.compactPills, b.compactPills));
+	entries.push(...diffAriaButtons(a.ariaLabeledButtons, b.ariaLabeledButtons));
+	entries.push(...diffModals(a.modals, b.modals));
+	entries.push(...diffOpenMenu(a.openMenu, b.openMenu));
+	const summary = entries.reduce(
+		(acc, e) => {
+			acc[e.kind] += 1;
+			return acc;
+		},
+		{ removed: 0, changed: 0, added: 0 },
+	);
+	return {
+		a: {
+			capturedAt: a.capturedAt,
+			url: a.claudeAiUrl,
+			appVersion: a.appVersion,
+		},
+		b: {
+			capturedAt: b.capturedAt,
+			url: b.claudeAiUrl,
+			appVersion: b.appVersion,
+		},
+		entries,
+		summary,
+	};
+}
+
+// Human-readable formatter. Removed/changed first (they're failures
+// in spirit), added last (informational). Empty diff prints a single
+// line so CI logs stay tidy.
+export function formatDiff(d: DiffResult): string {
+	const lines: string[] = [];
+	lines.push(`A: ${d.a.capturedAt}  (${d.a.url})  app=${d.a.appVersion}`);
+	lines.push(`B: ${d.b.capturedAt}  (${d.b.url})  app=${d.b.appVersion}`);
+	lines.push('');
+	if (d.entries.length === 0) {
+		lines.push('No differences.');
+		return lines.join('\n');
+	}
+	const order: DiffEntry['kind'][] = ['removed', 'changed', 'added'];
+	for (const kind of order) {
+		const group = d.entries.filter((e) => e.kind === kind);
+		if (group.length === 0) continue;
+		lines.push(`# ${kind.toUpperCase()} (${group.length})`);
+		for (const e of group) {
+			if (e.kind === 'changed') {
+				lines.push(
+					`  [${e.category}] ${e.key}: ${e.before ?? ''} → ${e.after ?? ''}`,
+				);
+			} else if (e.kind === 'removed') {
+				lines.push(`  [${e.category}] ${e.key}: ${e.before ?? ''}`);
+			} else {
+				lines.push(`  [${e.category}] ${e.key}: ${e.after ?? ''}`);
+			}
+		}
+		lines.push('');
+	}
+	lines.push(
+		`Summary: ${d.summary.removed} removed, ` +
+			`${d.summary.changed} changed, ${d.summary.added} added`,
+	);
+	return lines.join('\n');
+}
+
+function diffDfPills(a: DfPill[], b: DfPill[]): DiffEntry[] {
+	const aMap = byKey(a, (p) => p.ariaLabel ?? p.text);
+	const bMap = byKey(b, (p) => p.ariaLabel ?? p.text);
+	return compareMaps(aMap, bMap, 'dfPill', (p) => p.text);
+}
+
+function diffCompactPills(
+	a: CompactPillSnap[],
+	b: CompactPillSnap[],
+): DiffEntry[] {
+	// Compact pills can repeat by text in pathological cases, so we
+	// disambiguate by appending an ordinal when needed. The ordinal is
+	// stable as long as DOM order is — same approach `findCompactPills`
+	// callers rely on.
+	const aMap = byKeyOrdinal(a, (p) => p.text);
+	const bMap = byKeyOrdinal(b, (p) => p.text);
+	return compareMaps(aMap, bMap, 'compactPill', (p) => `maxW=${p.maxW}`);
+}
+
+// Aria-labeled buttons frequently repeat (sidebar's ~80 conversation-row
+// "more" buttons all share a label). We compare by *count per label*
+// instead of per-instance: a delta in count surfaces as a single
+// changed entry, which is far more readable than 80 added/removed
+// rows. Per-label text is omitted since duplicate labels mean text is
+// not a stable identity.
+function diffAriaButtons(a: AriaButton[], b: AriaButton[]): DiffEntry[] {
+	return compareCounts(
+		countBy(a, (x) => x.ariaLabel),
+		countBy(b, (x) => x.ariaLabel),
+		'ariaButton',
+	);
+}
+
+function diffModals(a: ModalSnap[], b: ModalSnap[]): DiffEntry[] {
+	const key = (m: ModalSnap) =>
+		m.headingText ?? m.ariaLabel ?? m.ariaLabelledBy ?? '<unlabeled-modal>';
+	const aMap = byKeyOrdinal(a, key);
+	const bMap = byKeyOrdinal(b, key);
+	return compareMaps(aMap, bMap, 'modal', (m) =>
+		`buttons=${m.buttonLabels.join('|')}`,
+	);
+}
+
+// Menu diff is special: the "key" is the menu identity, but a menu
+// diff is really an item-set diff. We compare item lists, scoped under
+// the menu's labelledBy/ariaLabel for context.
+function diffOpenMenu(
+	a: OpenMenu | null,
+	b: OpenMenu | null,
+): DiffEntry[] {
+	if (!a && !b) return [];
+	const scope =
+		(a?.ariaLabel ?? b?.ariaLabel) ||
+		(a?.ariaLabelledBy ?? b?.ariaLabelledBy) ||
+		'<menu>';
+	if (a && !b) {
+		return [
+			{
+				kind: 'removed',
+				category: 'openMenu',
+				key: scope,
+				before: a.items.map(itemKey).join(' | '),
+			},
+		];
+	}
+	if (!a && b) {
+		return [
+			{
+				kind: 'added',
+				category: 'openMenu',
+				key: scope,
+				after: b.items.map(itemKey).join(' | '),
+			},
+		];
+	}
+	if (!a || !b) return [];
+	const aMap = byKeyOrdinal(a.items, itemKey);
+	const bMap = byKeyOrdinal(b.items, itemKey);
+	return compareMaps(
+		aMap,
+		bMap,
+		`openMenu[${scope}]`,
+		(it) =>
+			`disabled=${it.disabled}` +
+			(it.ariaChecked !== null ? ` checked=${it.ariaChecked}` : ''),
+	);
+}
+
+function itemKey(it: MenuItem): string {
+	return `${it.role}::${it.text}`;
+}
+
+function byKey<T>(arr: T[], k: (t: T) => string): Map<string, T> {
+	const m = new Map<string, T>();
+	for (const it of arr) m.set(k(it), it);
+	return m;
+}
+
+// When keys collide, append `#2`, `#3`, … so the comparator can still
+// detect "we used to have 3, now we have 2" (one #N drops out as
+// removed). Ordinals are local to this snapshot — they don't cross
+// snapshot boundaries.
+function byKeyOrdinal<T>(arr: T[], k: (t: T) => string): Map<string, T> {
+	const m = new Map<string, T>();
+	const counts = new Map<string, number>();
+	for (const it of arr) {
+		const base = k(it);
+		const n = (counts.get(base) ?? 0) + 1;
+		counts.set(base, n);
+		m.set(n === 1 ? base : `${base}#${n}`, it);
+	}
+	return m;
+}
+
+function countBy<T>(arr: T[], k: (t: T) => string): Map<string, number> {
+	const m = new Map<string, number>();
+	for (const it of arr) {
+		const key = k(it);
+		m.set(key, (m.get(key) ?? 0) + 1);
+	}
+	return m;
+}
+
+function compareMaps<T>(
+	a: Map<string, T>,
+	b: Map<string, T>,
+	category: string,
+	describe: (t: T) => string,
+): DiffEntry[] {
+	const out: DiffEntry[] = [];
+	for (const [k, v] of a) {
+		const bv = b.get(k);
+		if (bv === undefined) {
+			out.push({
+				kind: 'removed',
+				category,
+				key: k,
+				before: describe(v),
+			});
+			continue;
+		}
+		const before = describe(v);
+		const after = describe(bv);
+		if (before !== after) {
+			out.push({
+				kind: 'changed',
+				category,
+				key: k,
+				before,
+				after,
+			});
+		}
+	}
+	for (const [k, v] of b) {
+		if (!a.has(k)) {
+			out.push({
+				kind: 'added',
+				category,
+				key: k,
+				after: describe(v),
+			});
+		}
+	}
+	return out;
+}
+
+function compareCounts(
+	a: Map<string, number>,
+	b: Map<string, number>,
+	category: string,
+): DiffEntry[] {
+	const out: DiffEntry[] = [];
+	for (const [k, n] of a) {
+		const m = b.get(k);
+		if (m === undefined) {
+			out.push({
+				kind: 'removed',
+				category,
+				key: k,
+				before: `count=${n}`,
+			});
+		} else if (m !== n) {
+			out.push({
+				kind: 'changed',
+				category,
+				key: k,
+				before: `count=${n}`,
+				after: `count=${m}`,
+			});
+		}
+	}
+	for (const [k, m] of b) {
+		if (!a.has(k)) {
+			out.push({
+				kind: 'added',
+				category,
+				key: k,
+				after: `count=${m}`,
+			});
+		}
+	}
+	return out;
+}
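
Reviewer sketch (not part of the patch): the two duplicate-handling strategies in diff.ts behave differently when one of several identically labeled elements disappears. The standalone example below works on plain label arrays, not the real snapshot types; the labels and the helper name `ordinalKeys` are hypothetical.

```typescript
// Ordinal keying: the first occurrence keeps the bare label, later
// ones get #2, #3, ... in array (DOM) order.
function ordinalKeys(labels: string[]): string[] {
	const counts = new Map<string, number>();
	return labels.map((base) => {
		const n = (counts.get(base) ?? 0) + 1;
		counts.set(base, n);
		return n === 1 ? base : `${base}#${n}`;
	});
}

// Count keying: one number per label, instances are not distinguished.
function countBy(labels: string[]): Map<string, number> {
	const m = new Map<string, number>();
	for (const l of labels) m.set(l, (m.get(l) ?? 0) + 1);
	return m;
}

// Hypothetical snapshots: one of three "Local" pills goes away.
const before = ['Local', 'Local', 'Local', 'Select folder'];
const after = ['Local', 'Local', 'Select folder'];

// Ordinal view: the highest ordinal drops out as a removed key.
const afterKeys = new Set(ordinalKeys(after));
const removed = ordinalKeys(before).filter((k) => !afterKeys.has(k));

// Count view: the same change collapses into a single count delta.
const delta =
	`count=${countBy(before).get('Local')} -> ` +
	`count=${countBy(after).get('Local')}`;

console.log(removed, delta);
```

Ordinal keys report which slot vanished (always the last one, since ordinals follow order), while counts trade that detail for a single readable entry, which is the trade-off the aria-button comment describes.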
--- a/tools/test-harness/explore/explore.ts
+++ b/tools/test-harness/explore/explore.ts
@@ -0,0 +1,640 @@
+// Entry point for the explore CLI.
+//
+// Subcommand surface (matches docs/testing/claudeai-ui-mapping-plan.md
+// Phase 1):
+//
+//   explore                    full snapshot to stdout
+//   explore pills              df-pills + compact-pills + state
+//   explore menu               currently-open menu structure
+//   explore snapshot <name>    write to docs/testing/ui-snapshots/<name>.json
+//   explore diff <a> <b>       diff two snapshots
+//   explore find <regex>       search renderer for matching text/aria-label
+//
+// Why a hand-rolled dispatcher: eight cases (the six above plus `walk`
+// and `collapse`). A flag parser adds a dependency and obscures which
+// command takes which positional. Keep the routing visible.
+//
+// Exit codes:
+//   0  success (including a clean diff)
+//   1  caller error (bad args, missing file)
+//   2  runtime error (no debugger, no claude.ai webContents)
+//   3  diff non-empty AND `--exit-on-diff` was set — opt-in, off by
+//      default so `explore diff` from a script can read entries
+//      without conflating "drift" with "tool blew up".
+
+import {
+	existsSync,
+	mkdirSync,
+	readFileSync,
+	renameSync,
+	writeFileSync,
+} from 'node:fs';
+import { dirname, resolve } from 'node:path';
+import { fileURLToPath } from 'node:url';
+
+import { InspectorClient } from '../src/lib/inspector.js';
+import { capture, capturePills, captureOpenMenu } from './snapshot.js';
+import type { Snapshot } from './snapshot.js';
+import { diff, formatDiff } from './diff.js';
+import { findInRenderer, formatHits } from './find.js';
+import {
+	collapsePersistentEntries,
+	walkRenderer,
+	WALKER_VERSION,
+} from './walker.js';
+import type { Inventory } from './walker.js';
+
+const INSPECTOR_PORT = 9229;
+// Resolve relative to this source file so the CLI works regardless of
+// cwd (npm script vs. ad-hoc tsx invocation from elsewhere).
+const TESTING_DIR = resolve(
+	dirname(fileURLToPath(import.meta.url)),
+	'..',
+	'..',
+	'..',
+	'docs',
+	'testing',
+);
+const SNAPSHOT_DIR = resolve(TESTING_DIR, 'ui-snapshots');
+const INVENTORY_PATH = resolve(TESTING_DIR, 'ui-inventory.json');
+const INVENTORY_META_PATH = resolve(TESTING_DIR, 'ui-inventory.meta.json');
+
+async function main(): Promise<void> {
+	const argv = process.argv.slice(2);
+	const cmd = argv[0];
+	const rest = argv.slice(1);
+	try {
+		switch (cmd) {
+			case undefined:
+				await runFullSnapshot();
+				return;
+			case 'pills':
+				await runPills();
+				return;
+			case 'menu':
+				await runMenu();
+				return;
+			case 'snapshot':
+				await runSnapshot(rest);
+				return;
+			case 'diff':
+				await runDiff(rest);
+				return;
+			case 'find':
+				await runFind(rest);
+				return;
+			case 'walk':
+				await runWalk(rest);
+				return;
+			case 'collapse':
+				await runCollapse(rest);
+				return;
+			case '-h':
+			case '--help':
+			case 'help':
+				printUsage();
+				return;
+			default:
+				console.error(`unknown subcommand: ${cmd}`);
+				printUsage();
+				process.exit(1);
+		}
+	} catch (err) {
+		const msg = err instanceof Error ? err.message : String(err);
+		console.error(`explore: ${msg}`);
+		process.exit(2);
+	}
+}
+
+async function runFullSnapshot(): Promise<void> {
+	const client = await connect();
+	try {
+		const snap = await capture(client);
+		console.log(JSON.stringify(snap, null, 2));
+	} finally {
+		client.close();
+	}
+}
+
+async function runPills(): Promise<void> {
+	const client = await connect();
+	try {
+		const pills = await capturePills(client);
+		console.log(JSON.stringify(pills, null, 2));
+	} finally {
+		client.close();
+	}
+}
+
+async function runMenu(): Promise<void> {
+	const client = await connect();
+	try {
+		const menu = await captureOpenMenu(client);
+		if (!menu) {
+			console.log('null');
+			return;
+		}
+		console.log(JSON.stringify(menu, null, 2));
+	} finally {
+		client.close();
+	}
+}
+
+async function runSnapshot(args: string[]): Promise<void> {
+	const name = args[0];
+	if (!name) {
+		console.error('snapshot: missing <name> argument');
+		console.error('usage: explore snapshot <name>');
+		process.exit(1);
+	}
+	if (!/^[a-zA-Z0-9._-]+$/.test(name)) {
+		console.error(
+			`snapshot: name ${JSON.stringify(name)} contains characters ` +
+				`outside [a-zA-Z0-9._-] — choose a slug-safe name`,
+		);
+		process.exit(1);
+	}
+	const client = await connect();
+	let snap: Snapshot;
+	try {
+		snap = await capture(client);
+	} finally {
+		client.close();
+	}
+	if (!existsSync(SNAPSHOT_DIR)) {
+		mkdirSync(SNAPSHOT_DIR, { recursive: true });
+	}
+	const outPath = resolve(SNAPSHOT_DIR, `${name}.json`);
+	writeFileSync(outPath, JSON.stringify(snap, null, 2) + '\n', 'utf8');
+	console.log(`wrote ${outPath}`);
+}
+
+async function runDiff(args: string[]): Promise<void> {
+	const opts = { json: false, exitOnDiff: false };
+	const positional: string[] = [];
+	for (const a of args) {
+		if (a === '--json') opts.json = true;
+		else if (a === '--exit-on-diff') opts.exitOnDiff = true;
+		else positional.push(a);
+	}
+	if (positional.length !== 2) {
+		console.error('diff: expected exactly two snapshot names or paths');
+		console.error('usage: explore diff <a> <b> [--json] [--exit-on-diff]');
+		process.exit(1);
+	}
+	const a = readSnapshot(positional[0]!);
+	const b = readSnapshot(positional[1]!);
+	const result = diff(a, b);
+	if (opts.json) {
+		console.log(JSON.stringify(result, null, 2));
+	} else {
+		console.log(formatDiff(result));
+	}
+	if (opts.exitOnDiff && result.entries.length > 0) {
+		process.exit(3);
+	}
+}
+
+// `walk` parses its own flags; --max-elements 0 prints usage and exits
+// (a cheap dry-run for "is the CLI loadable" without touching CDP).
+async function runWalk(args: string[]): Promise<void> {
+	const opts: {
+		maxElements: number;
+		maxDrillsPerSurface: number;
+		checkpointEvery: number;
+		allowlist: string | null;
+		output: string;
+		verbose: boolean;
+		help: boolean;
+	} = {
+		maxElements: 1000,
+		maxDrillsPerSurface: 50,
+		checkpointEvery: 100,
+		allowlist: null,
+		output: INVENTORY_PATH,
+		verbose: false,
+		help: false,
+	};
+	for (let i = 0; i < args.length; i += 1) {
+		const a = args[i]!;
+		if (a === '-h' || a === '--help') {
+			opts.help = true;
+		} else if (a === '--max-elements') {
+			const n = Number(args[i + 1]);
+			if (!Number.isInteger(n) || n < 0) {
+				console.error('walk: --max-elements requires a non-negative integer');
+				process.exit(1);
+			}
+			opts.maxElements = n;
+			i += 1;
+		} else if (a === '--checkpoint-every') {
+			const n = Number(args[i + 1]);
+			if (!Number.isFinite(n) || n < 0 || !Number.isInteger(n)) {
+				console.error(
+					'walk: --checkpoint-every requires a non-negative integer (0 disables)',
+				);
+				process.exit(1);
+			}
+			opts.checkpointEvery = n;
+			i += 1;
+		} else if (
+			a === '--max-drills-per-surface' ||
+			a === '--max-elements-per-surface'
+		) {
+			// v4 renamed the flag from --max-elements-per-surface (which
+			// truncated emissions) to --max-drills-per-surface (which only
+			// caps queue pushes; all entries are still emitted). Keep the
+			// old name as a deprecated alias.
+			if (a === '--max-elements-per-surface') {
+				process.stderr.write(
+					'walk: --max-elements-per-surface is deprecated; ' +
+						'use --max-drills-per-surface (semantics changed: now ' +
+						'caps drilling fan-out, not emission count)\n',
+				);
+			}
+			const n = Number(args[i + 1]);
+			if (!Number.isInteger(n) || n < 0) {
+				console.error(`walk: ${a} requires a non-negative integer`);
+				process.exit(1);
+			}
+			opts.maxDrillsPerSurface = n;
+			i += 1;
+		} else if (a === '--allowlist') {
+			const p = args[i + 1];
+			if (!p) {
+				console.error('walk: --allowlist requires a path');
+				process.exit(1);
+			}
+			opts.allowlist = p;
+			i += 1;
+		} else if (a === '--output') {
+			const p = args[i + 1];
+			if (!p) {
+				console.error('walk: --output requires a path');
+				process.exit(1);
+			}
+			opts.output = resolve(p);
+			i += 1;
+		} else if (a === '--verbose') {
+			opts.verbose = true;
+		} else {
+			console.error(`walk: unknown argument: ${a}`);
+			printWalkUsage();
+			process.exit(1);
+		}
+	}
+	if (opts.help || opts.maxElements === 0) {
+		printWalkUsage();
+		return;
+	}
+	let allowlist: string[] = [];
+	if (opts.allowlist) {
+		try {
+			const raw = readFileSync(opts.allowlist, 'utf8');
+			const parsed = JSON.parse(raw) as { exemptions?: string[] };
+			allowlist = parsed.exemptions ?? [];
+		} catch (err) {
+			const msg = err instanceof Error ? err.message : String(err);
+			console.error(`walk: allowlist ${opts.allowlist}: unreadable or invalid JSON: ${msg}`);
+			process.exit(1);
+		}
+	}
+	const outDir = dirname(opts.output);
+	if (!existsSync(outDir)) mkdirSync(outDir, { recursive: true });
+	const metaPath =
+		opts.output === INVENTORY_PATH
+			? INVENTORY_META_PATH
+			: opts.output.replace(/(\.json)?$/, '.meta.json');
+
+	// Atomic writer: write to <path>.tmp, then rename. Survives a kill
+	// between writes — readers always see either the prior complete file
+	// or the new one, never a half-written buffer. Used for both the
+	// in-flight checkpoint writes and the final write. `partial` is
+	// recorded in meta.json (true on intermediate writes, false on the
+	// final write) so downstream readers can tell whether the inventory
+	// is complete; the inventory file itself stays shape-compatible.
+	const writeCheckpoint = (
+		inventory: Inventory,
+		isPartial: boolean,
+	): void => {
+		const invTmp = `${opts.output}${INVENTORY_TMP_SUFFIX}`;
+		writeFileSync(
+			invTmp,
+			JSON.stringify(inventory, null, 2) + '\n',
+			'utf8',
+		);
+		renameSync(invTmp, opts.output);
+		const meta = {
+			capturedAt: inventory.capturedAt,
+			appVersion: inventory.appVersion,
+			walkerVersion: WALKER_VERSION,
+			startUrl: inventory.startUrl,
+			totalElements: inventory.totalElements,
+			deniedActions: inventory.deniedActions,
+			partial: isPartial,
+			denylistDescription:
+				'Default destructive-action labels (see DEFAULT_DENYLIST in walker.ts) ' +
+				'plus optional allowlist exemptions.',
+			allowlistEntries: allowlist,
+		};
+		const metaTmp = `${metaPath}${INVENTORY_TMP_SUFFIX}`;
+		writeFileSync(metaTmp, JSON.stringify(meta, null, 2) + '\n', 'utf8');
+		renameSync(metaTmp, metaPath);
+	};
+
+	const client = await connect();
+	let inventory: Inventory;
+	try {
+		inventory = await walkRenderer(client, {
+			maxElements: opts.maxElements,
+			maxDrillsPerSurface: opts.maxDrillsPerSurface,
+			allowlist,
+			verbose: opts.verbose,
+			checkpointEvery: opts.checkpointEvery,
+			checkpointWriter:
+				opts.checkpointEvery > 0
+					? (inv) => writeCheckpoint(inv, true)
+					: undefined,
+		});
+	} finally {
+		client.close();
+	}
+	writeCheckpoint(inventory, false);
+	console.log(
+		`wrote ${opts.output} (${inventory.totalElements} entries, ` +
+			`${inventory.deniedActions} denylisted)`,
+	);
+	console.log(`wrote ${metaPath}`);
+}
+
+// Suffix used by the atomic-write helper. Kept module-level so any
+// future readers know which dotfile to ignore in tooling/gitignore.
+const INVENTORY_TMP_SUFFIX = '.tmp';
+
+// `collapse [<path>]` re-runs the post-walk persistent-element
+// collapse against an existing inventory file. Use case: a partial
+// checkpoint (walker aborted mid-walk) skipped the in-loop collapse
+// and so has 0 persistent entries — this command salvages it without
+// re-running the walker. Also useful if collapse heuristics change
+// and we want to refresh an existing inventory.
+async function runCollapse(args: string[]): Promise<void> {
+	let path = INVENTORY_PATH;
+	let help = false;
+	for (let i = 0; i < args.length; i += 1) {
+		const a = args[i]!;
+		if (a === '-h' || a === '--help') help = true;
+		else if (!a.startsWith('-')) path = resolve(a);
+		else {
+			console.error(`collapse: unknown argument: ${a}`);
+			printCollapseUsage();
+			process.exit(1);
+		}
+	}
+	if (help) {
+		printCollapseUsage();
+		return;
+	}
+	if (!existsSync(path)) {
+		console.error(`collapse: inventory not found: ${path}`);
+		process.exit(1);
+	}
+	let inventory: Inventory;
+	try {
+		inventory = JSON.parse(readFileSync(path, 'utf8')) as Inventory;
+	} catch (err) {
+		const msg = err instanceof Error ? err.message : String(err);
+		console.error(`collapse: invalid JSON in ${path} — ${msg}`);
+		process.exit(1);
+	}
+	// v7-only gate. The v6 → v7 fingerprint cutover invalidated all
+	// older inventory shapes; re-running the persistent collapse on a
+	// v6 inventory would mint v7-key collisions against v6 selectors
+	// and drop unrelated entries. Re-walk first.
+	const wv = inventory.walkerVersion;
+	if (wv !== '7') {
+		console.error(
+			`collapse: walkerVersion ${wv} is not supported (need v7; ` +
+				`re-walk after the v6 → v7 fingerprint cutover)`,
+		);
+		process.exit(1);
+	}
+	const before = inventory.entries.length;
+	const result = collapsePersistentEntries(inventory.entries);
+	const after = result.entries.length;
+	const dropped = before - after;
+	const collapsedAt = new Date().toISOString();
+	const updated: Inventory = {
+		...inventory,
+		walkerVersion: WALKER_VERSION,
+		totalElements: after,
+		entries: result.entries,
+		capturedAt: inventory.capturedAt,
+	};
+
+	// Atomic write inventory + meta. Mirror the walk subcommand: write
+	// to .tmp, rename. Meta gets `partial: false` (collapse closes out
+	// a partial checkpoint) and `collapsedAt`; everything else carries
+	// through from the existing meta where present.
+	const invTmp = `${path}${INVENTORY_TMP_SUFFIX}`;
+	writeFileSync(invTmp, JSON.stringify(updated, null, 2) + '\n', 'utf8');
+	renameSync(invTmp, path);
+
+	const metaPath =
+		path === INVENTORY_PATH
+			? INVENTORY_META_PATH
+			: `${path.replace(/\.json$/, '')}.meta.json`;
+	let existingMeta: Record<string, unknown> = {};
+	if (existsSync(metaPath)) {
+		try {
+			existingMeta = JSON.parse(readFileSync(metaPath, 'utf8')) as Record<
+				string,
+				unknown
+			>;
+		} catch {
+			// Carry the inventory through even if meta is malformed; meta
+			// is recoverable, the entries are not.
+		}
+	}
+	const meta = {
+		...existingMeta,
+		capturedAt: updated.capturedAt,
+		appVersion: updated.appVersion,
+		walkerVersion: WALKER_VERSION,
+		startUrl: updated.startUrl,
+		totalElements: updated.totalElements,
+		deniedActions: updated.deniedActions,
+		partial: false,
+		collapsedAt,
+	};
+	const metaTmp = `${metaPath}${INVENTORY_TMP_SUFFIX}`;
+	writeFileSync(metaTmp, JSON.stringify(meta, null, 2) + '\n', 'utf8');
+	renameSync(metaTmp, metaPath);
+
+	console.log(
+		`collapse: read ${before} entries → wrote ${after} entries ` +
+			`(${dropped} dropped via persistent collapse, ` +
+			`${result.persistentSurvivors} shells emitted)`,
+	);
+	console.log(`wrote ${path}`);
+	console.log(`wrote ${metaPath}`);
+}
+
+function printCollapseUsage(): void {
+	console.log(
+		[
+			'usage: explore collapse [<path>]',
+			'',
+			'Re-run the post-walk persistent-element collapse against an',
+			'existing inventory file. Useful for salvaging a partial',
+			'checkpoint that aborted before the in-loop collapse step.',
+			'',
+			'  <path>   inventory file to collapse in place (default:',
+			'           docs/testing/ui-inventory.json). Must be v7.',
+			'  -h, --help  print this help',
+			'',
+			'Writes the collapsed inventory and updated meta.json',
+			'atomically (.tmp + rename). Meta gains `collapsedAt` and',
+			'clears `partial` to false.',
+		].join('\n'),
+	);
+}
+
+function printWalkUsage(): void {
+	console.log(
+		[
+			'usage: explore walk [options]',
+			'',
+			'options:',
+			'  --max-elements N              safety cap on total entries',
+			'                                (default 1000; 0 prints this help',
+			'                                and exits)',
+			'  --max-drills-per-surface N    max number of children to drill into',
+			'                                from one surface (default 50). All',
+			'                                children are still emitted to the',
+			'                                inventory; this only bounds the BFS',
+			'                                queue fan-out per surface.',
+			'                                (Alias: --max-elements-per-surface,',
+			'                                 deprecated — v3 truncated emissions,',
+			'                                 v4 only caps drilling.)',
+			'  --checkpoint-every N          atomically write the inventory every N',
+			'                                newly-emitted entries (default 100;',
+			'                                0 disables). Intermediate writes set',
+			'                                meta.json `partial: true`; the final',
+			'                                write clears it to false.',
+			'  --allowlist PATH              JSON file:',
+			'                                {"exemptions": ["entry.id", ...]} to',
+			'                                remove from the default denylist',
+			'  --output PATH                 write inventory to PATH (default',
+			'                                docs/testing/ui-inventory.json)',
+			'  --verbose                     log every click + surface to stderr',
+			'  -h, --help                    print this help',
+		].join('\n'),
+	);
+}
+
+async function runFind(args: string[]): Promise<void> {
+	const opts = { json: false, limit: 100 };
+	const positional: string[] = [];
+	for (let i = 0; i < args.length; i += 1) {
+		const a = args[i]!;
+		if (a === '--json') opts.json = true;
+		else if (a === '--limit') {
+			const n = Number(args[i + 1]);
+			if (!Number.isInteger(n) || n <= 0) {
+				console.error('find: --limit requires a positive integer');
+				process.exit(1);
+			}
+			opts.limit = n;
+			i += 1;
+		} else positional.push(a);
+	}
+	const pat = positional[0];
+	if (!pat) {
+		console.error('find: missing <regex> argument');
+		console.error('usage: explore find <regex> [--json] [--limit N]');
+		process.exit(1);
+	}
+	let re: RegExp;
+	try {
+		re = new RegExp(pat, 'i');
+	} catch (err) {
+		const msg = err instanceof Error ? err.message : String(err);
+		console.error(`find: invalid regex: ${msg}`);
+		process.exit(1);
+	}
+	const client = await connect();
+	try {
+		const hits = await findInRenderer(client, re, { limit: opts.limit });
+		if (opts.json) {
+			console.log(JSON.stringify(hits, null, 2));
+		} else {
+			console.log(formatHits(hits));
+		}
+	} finally {
+		client.close();
+	}
+}
+
+// Snapshot resolver: accept either a bare name (looked up in the
+// snapshot dir, .json appended) or an explicit path. Bare names are
+// the common case from CI / the README; explicit paths help when
+// diffing a snapshot against an out-of-tree fixture.
+function readSnapshot(nameOrPath: string): Snapshot {
+	const candidates = [
+		nameOrPath,
+		resolve(SNAPSHOT_DIR, nameOrPath),
+		resolve(SNAPSHOT_DIR, `${nameOrPath}.json`),
+	];
+	const found = candidates.find((p) => existsSync(p));
+	if (!found) {
+		console.error(`snapshot not found: tried ${candidates.join(', ')}`);
+		process.exit(1);
+	}
+	const raw = readFileSync(found, 'utf8');
+	try {
+		return JSON.parse(raw) as Snapshot;
+	} catch (err) {
+		const msg = err instanceof Error ? err.message : String(err);
+		console.error(`snapshot ${found}: invalid JSON — ${msg}`);
+		process.exit(1);
+	}
+}
+
+async function connect(): Promise<InspectorClient> {
+	try {
+		return await InspectorClient.connect(INSPECTOR_PORT);
+	} catch (err) {
+		const msg = err instanceof Error ? err.message : String(err);
+		throw new Error(
+			`could not attach to debugger on :${INSPECTOR_PORT} — ${msg}. ` +
+				`Enable the main-process debugger via the in-app menu first.`,
+		);
+	}
+}
+
+function printUsage(): void {
+	console.log(
+		[
+			'usage:',
+			'  explore                    full snapshot to stdout',
+			'  explore pills              df-pills + compact-pills + state',
+			'  explore menu               currently-open menu structure',
+			'  explore snapshot <name>    write snapshot to ui-snapshots/<name>.json',
+			'  explore diff <a> <b> [--json] [--exit-on-diff]',
+			'                             compare two snapshots',
+			'  explore find <regex> [--json] [--limit N]',
+			'                             search renderer text + aria-label',
+			'  explore walk [options]     BFS walker → docs/testing/ui-inventory.json',
+			'                             (see `explore walk --help` for options)',
+			'  explore collapse [<path>]  re-run persistent-element collapse against',
+			'                             an existing inventory (salvages partial',
+			'                             checkpoints; see `explore collapse --help`)',
+		].join('\n'),
+	);
+}
+
+main().catch((err) => {
+	const msg = err instanceof Error ? err.message : String(err);
+	console.error(`explore: ${msg}`);
+	process.exit(2);
+});
--- a/tools/test-harness/explore/find.ts
+++ b/tools/test-harness/explore/find.ts
@@ -0,0 +1,86 @@
+// Renderer search by regex over text content + aria-label.
+//
+// Why text+aria together: a "Send" button might have aria-label="Send"
+// but textContent="" (icon child); a heading might be the inverse.
+// Searching both lets the human ask "where does the word X appear?"
+// without first guessing which surface labels it.
+//
+// We restrict the candidate set to interactive + landmark elements
+// (button, [role], a, h1-h6, [aria-label]) rather than walking the
+// entire document — claude.ai's chat history dumps thousands of
+// <span>/<p> nodes that swamp signal. If a future need wants the
+// broader sweep, add a `--all` flag here rather than expanding the
+// default.
+
+import type { InspectorClient } from '../src/lib/inspector.js';
+
+export interface FindHit {
+	tag: string;
+	role: string | null;
+	ariaLabel: string | null;
+	text: string;
+	matchedField: 'text' | 'ariaLabel' | 'both';
+	visible: boolean;
+}
+
+// Regex source + flags travel as JSON strings into the renderer eval —
+// same encoding pattern as openPill / clickMenuItem in lib/claudeai.ts.
+export async function findInRenderer(
+	client: InspectorClient,
+	pattern: RegExp,
+	opts: { limit?: number } = {},
+): Promise<FindHit[]> {
+	const limit = opts.limit ?? 100;
+	const reSrc = JSON.stringify(pattern.source);
+	const reFlags = JSON.stringify(pattern.flags);
+	return await client.evalInRenderer<FindHit[]>(
+		'claude.ai',
+		`(() => {
+			const re = new RegExp(${reSrc}, ${reFlags});
+			const sel = 'button, a, h1, h2, h3, h4, h5, h6, ' +
+				'[role], [aria-label]';
+			const nodes = Array.from(document.querySelectorAll(sel));
+			const hits = [];
+			for (const el of nodes) {
+				const text = (el.textContent || '').trim().slice(0, 200);
+				const aria = el.getAttribute('aria-label');
+				const textHit = text.length > 0 && text.search(re) !== -1;
+				const ariaHit = aria !== null && aria.search(re) !== -1;
+				if (!textHit && !ariaHit) continue;
+				hits.push({
+					tag: el.tagName.toLowerCase(),
+					role: el.getAttribute('role'),
+					ariaLabel: aria,
+					text,
+					matchedField: textHit && ariaHit
+						? 'both'
+						: (textHit ? 'text' : 'ariaLabel'),
+					visible: !!el.getClientRects().length,
+				});
+				if (hits.length >= ${limit}) break;
+			}
+			return hits;
+		})()`,
+	);
+}
+
+export function formatHits(hits: FindHit[]): string {
+	if (hits.length === 0) return 'No matches.';
+	const lines: string[] = [];
+	for (const h of hits) {
+		const vis = h.visible ? '' : ' [hidden]';
+		const role = h.role ? ` role=${h.role}` : '';
+		const aria = h.ariaLabel !== null ? ` aria-label=${q(h.ariaLabel)}` : '';
+		lines.push(
+			`${h.tag}${role}${aria} (${h.matchedField})${vis}` +
+				(h.text ? `\n    text: ${h.text}` : ''),
+		);
+	}
+	lines.push('');
+	lines.push(`${hits.length} match(es).`);
+	return lines.join('\n');
+}
+
+function q(s: string): string {
+	return JSON.stringify(s);
+}
--- a/tools/test-harness/explore/gen-render-specs.ts
+++ b/tools/test-harness/explore/gen-render-specs.ts
@@ -0,0 +1,523 @@
+// Generate the U01 UI-visibility Playwright spec from the captured
+// inventory at docs/testing/ui-inventory.json. Reads the inventory +
+// its meta sidecar offline (no live app needed), groups entries by
+// canonical surface, and emits a single .spec.ts file with one
+// `test()` per inventory entry under one `test.describe()` per
+// surface.
+//
+// The generated spec asserts each entry's recorded fingerprint still
+// resolves to a visible element on the live signed-in renderer. It's
+// the inventory's "do these things still render" sibling — H05
+// detects shape drift across snapshots, U01 detects per-entry render
+// failures across the whole inventory.
+//
+// Pure file in/out: no network, no inspector. The spec it emits is
+// where the live app gets touched. Run via `npm run gen:render-specs`.
+//
+// Refuses to operate on a stale walker version or a partial inventory
+// — generating a passing spec from a half-walked DOM would silently
+// shrink the assertion surface to whatever the walker happened to
+// reach before crashing.
+
+import {
+	existsSync,
+	readFileSync,
+	renameSync,
+	writeFileSync,
+} from 'node:fs';
+import { dirname, resolve } from 'node:path';
+import { fileURLToPath } from 'node:url';
+
+import { WALKER_VERSION } from './walker.js';
+import type { Inventory, InventoryEntry, NavStep } from './walker.js';
+
+const HERE = dirname(fileURLToPath(import.meta.url));
+const TESTING_DIR = resolve(HERE, '..', '..', '..', 'docs', 'testing');
+const DEFAULT_INVENTORY = resolve(TESTING_DIR, 'ui-inventory.json');
+const DEFAULT_META = resolve(TESTING_DIR, 'ui-inventory.meta.json');
+const DEFAULT_OUTPUT = resolve(
+	HERE,
+	'..',
+	'src',
+	'runners',
+	'U01_ui_visibility.spec.ts',
+);
+
+interface MetaSidecar {
+	walkerVersion: string;
+	partial: boolean;
+	capturedAt: string;
+	appVersion: string;
+}
+
+interface CliOpts {
+	inventory: string;
+	output: string;
+	help: boolean;
+}
+
+function parseCli(argv: string[]): CliOpts {
+	const opts: CliOpts = {
+		inventory: DEFAULT_INVENTORY,
+		output: DEFAULT_OUTPUT,
+		help: false,
+	};
+	for (let i = 0; i < argv.length; i += 1) {
+		const a = argv[i]!;
+		switch (a) {
+			case '-h':
+			case '--help':
+				opts.help = true;
+				break;
+			case '--inventory': {
+				const v = argv[++i];
+				if (!v) {
+					process.stderr.write('--inventory requires a path\n');
+					process.exit(1);
+				}
+				opts.inventory = resolve(v);
+				break;
+			}
+			case '--output': {
+				const v = argv[++i];
+				if (!v) {
+					process.stderr.write('--output requires a path\n');
+					process.exit(1);
+				}
+				opts.output = resolve(v);
+				break;
+			}
+			default:
+				process.stderr.write(`gen-render-specs: unknown argument: ${a}\n`);
+				printUsage();
+				process.exit(1);
+		}
+	}
+	return opts;
+}
+
+function printUsage(): void {
+	process.stdout.write(
+		'Usage: tsx explore/gen-render-specs.ts [options]\n' +
+			'\n' +
+			'Generates src/runners/U01_ui_visibility.spec.ts from\n' +
+			'docs/testing/ui-inventory.json. Refuses to run if the inventory\n' +
+			'is partial or was produced by a walker older than v' +
+			WALKER_VERSION +
+			'.\n' +
+			'\n' +
+			'Options:\n' +
+			'  --inventory <path>  Override default inventory path\n' +
+			'                      (default: docs/testing/ui-inventory.json)\n' +
+			'  --output <path>     Override default spec output path\n' +
+			'                      (default: src/runners/U01_ui_visibility.spec.ts)\n' +
+			'  -h, --help          Print this help and exit\n',
+	);
+}
+
+function loadInventory(path: string): Inventory {
+	if (!existsSync(path)) {
+		process.stderr.write(`gen-render-specs: inventory not found: ${path}\n`);
+		process.exit(1);
+	}
+	try {
+		return JSON.parse(readFileSync(path, 'utf8')) as Inventory;
+	} catch (err) {
+		const msg = err instanceof Error ? err.message : String(err);
+		process.stderr.write(`gen-render-specs: failed to parse inventory: ${msg}\n`);
+		process.exit(1);
+	}
+}
+
+function loadMeta(invPath: string): MetaSidecar {
+	const metaPath = `${invPath.replace(/\.json$/, '')}.meta.json`;
+	const fallbackPath =
+		invPath === DEFAULT_INVENTORY ? DEFAULT_META : metaPath;
+	const path = existsSync(metaPath) ? metaPath : fallbackPath;
+	if (!existsSync(path)) {
+		process.stderr.write(
+			`gen-render-specs: meta sidecar not found at ${metaPath} ` +
+				'(needed for partial/walkerVersion gating)\n',
+		);
+		process.exit(1);
+	}
+	try {
+		return JSON.parse(readFileSync(path, 'utf8')) as MetaSidecar;
+	} catch (err) {
+		const msg = err instanceof Error ? err.message : String(err);
+		process.stderr.write(`gen-render-specs: failed to parse meta: ${msg}\n`);
+		process.exit(1);
+	}
+}
+
+// Refuse on stale walker versions or partial inventories. The point of
+// this generator is to emit a spec that asserts the FULL inventory
+// renders; gating on these two flags is what stops a half-walked
+// checkpoint from quietly shrinking the assertion set.
+function validate(inv: Inventory, meta: MetaSidecar): void {
+	const seen = Number.parseInt(inv.walkerVersion, 10);
+	const required = Number.parseInt(WALKER_VERSION, 10);
+	if (Number.isNaN(seen) || seen < required) {
+		process.stderr.write(
+			`gen-render-specs: walkerVersion ${inv.walkerVersion} < ${WALKER_VERSION}; ` +
+				'inventory shape may be incompatible. Re-walk with the current ' +
+				'explore CLI before regenerating the spec.\n',
+		);
+		process.exit(1);
+	}
+	if (meta.partial === true) {
+		process.stderr.write(
+			'gen-render-specs: inventory meta reports partial=true (walk did ' +
+				'not finish). Refusing to generate a spec from a half-walked DOM ' +
+				'— complete the walk first or pass --inventory to a known-good file.\n',
+		);
+		process.exit(1);
+	}
+}
+
+// Deterministic surface→entries grouping. Sort surfaces alphabetically
+// and entries within each surface by id, so a re-run produces an
+// identical spec file when the inventory hasn't changed (the file is
+// checked in; no-op regeneration shouldn't mint diffs).
+function groupBySurface(
+	entries: InventoryEntry[],
+): { surface: string; entries: InventoryEntry[] }[] {
+	const buckets = new Map<string, InventoryEntry[]>();
+	for (const e of entries) {
+		const list = buckets.get(e.surface) ?? [];
+		list.push(e);
+		buckets.set(e.surface, list);
+	}
+	const surfaces = [...buckets.keys()].sort();
+	return surfaces.map((surface) => {
+		const list = buckets.get(surface)!.slice();
+		list.sort((a, b) => (a.id < b.id ? -1 : a.id > b.id ? 1 : 0));
+		return { surface, entries: list };
+	});
+}
+
+// Strip any navigationPath step that would CLICK the entry under
+// test, when that entry is denylisted. Per the spec brief: never click
+// denylisted controls, just assert they exist. In practice the
+// recorded path's last click is the surface-opener (entry's own id is
+// `surface.role.label`, distinct from any path step), so this filter
+// usually no-ops — but it's the safety net the brief calls for.
+function safeNavigationPath(entry: InventoryEntry): NavStep[] {
+	if (!entry.denylisted) return entry.navigationPath;
+	return entry.navigationPath.filter(
+		(s) => !(s.action === 'click' && s.id === entry.id),
+	);
+}
+
+// JS string literal for embedding in generated source. Use JSON.stringify
+// — handles all the escapes (backslash, quotes, newlines, unicode) that
+// hand-rolling would miss on entries with weird labels.
+function js(value: unknown): string {
+	return JSON.stringify(value);
+}
+
+// Sanitize a surface name into a `test.describe()` block label that
+// reads cleanly. Surfaces are dot-separated paths like
+// `root.button.search.option.x`; the raw form is fine for grouping
+// but we annotate the count so the report shows scope at a glance.
+function describeLabel(surface: string, count: number): string {
+	return `surface: ${surface} (${count} ${count === 1 ? 'entry' : 'entries'})`;
+}
+
+function testTitle(entry: InventoryEntry): string {
+	const tags: string[] = [entry.kind];
+	if (entry.denylisted) tags.push('denylist');
+	const tagStr = tags.length ? ` [${tags.join(',')}]` : '';
+	return `${entry.id}${tagStr} — ${entry.role}: ${entry.label}`;
+}
+
+function generateSpec(
+	inv: Inventory,
+	meta: MetaSidecar,
+	groups: { surface: string; entries: InventoryEntry[] }[],
+): string {
+	const out: string[] = [];
+	out.push(
+		'// AUTO-GENERATED FROM docs/testing/ui-inventory.json',
+		'// DO NOT EDIT — regenerate with `npm run gen:render-specs`',
+		`// Source inventory: walker v${inv.walkerVersion} (account-portable ariaPath ` +
+			`fingerprints), captured ${inv.capturedAt}, app ${inv.appVersion}`,
+		`// Entries: ${inv.totalElements} ` +
+			`(${inv.deniedActions} denylisted), ` +
+			`${groups.length} surfaces`,
+		`// Meta: partial=${meta.partial}`,
+		'',
+		"import { test, expect } from '@playwright/test';",
+		'',
+		"import { launchClaude } from '../lib/electron.js';",
+		"import type { ClaudeApp } from '../lib/electron.js';",
+		"import { createIsolation } from '../lib/isolation.js';",
+		"import { InspectorClient } from '../lib/inspector.js';",
+		"import { captureSessionEnv } from '../lib/diagnostics.js';",
+		'import {',
+		'\tcurrentUrl,',
+		'\tfindByFingerprint,',
+		'\tredrivePath,',
+		'\twaitForStable,',
+		"} from '../../explore/walker.js';",
+		"import type { InventoryEntry } from '../../explore/walker.js';",
+		'',
+		'// U01 — UI visibility sweep.',
+		'//',
+		'// One Playwright test per inventory entry. Each test re-drives the',
+		"// entry's recorded navigationPath against the live signed-in",
+		"// renderer, then asserts the entry's fingerprint resolves to a",
+		'// visible element. The full inventory acts as a render contract:',
+		'// any entry that no longer renders (selector drift, route change,',
+		'// permission change) shows up as exactly one failed test, with the',
+		'// triage payload (entry JSON + observed DOM neighbourhood)',
+		'// attached to that test only.',
+		'//',
+		'// Skip semantics mirror H05: the suite skips cleanly if the host',
+		"// isn't signed in (claude.ai webContents never reaches the",
+		"// userLoaded level). Default path: kill any running host Claude,",
+		"// copy the auth-relevant subset of ~/.config/Claude into a",
+		"// hermetic tmpdir, and launch against that copy. Host config is",
+		"// left untouched after the kill+seed. CLAUDE_TEST_USE_HOST_CONFIG=1",
+		"// opts out and shares the host's actual config directory (no",
+		"// kill+seed) — use only when you've manually closed the host first.",
+		'//',
+		"// Denylisted entries: we still assert they render, but the",
+		"// generator strips any navigationPath step that would CLICK the",
+		'// denylisted entry itself. Per the spec brief: never trigger',
+		'// destructive controls from a render check.',
+		'//',
+		'// Persistent entries: each persistent entry is asserted on its',
+		'// canonical surface only (the `surface` field). The cross-surface',
+		'// `surfaces[]` list is intentionally unused here — a strict',
+		'// "renders on every surface it was observed" mode is a future',
+		'// follow-up.',
+		'//',
+		'// Instance entries: assert that AT LEAST ONE element matching the',
+		"// fingerprint exists. We don't assert the recorded instanceCount",
+		'// — list lengths legitimately fluctuate across sessions.',
+		'',
+		"// Per-test budget covers a path redrive (~1 nav + ~N clicks * 1.5s)",
+		'// plus a fingerprint resolve. Generous to ride out a slow first',
+		'// route load; later tests in the same suite reuse the warmed app.',
+		'test.setTimeout(120_000);',
+		'',
+		'const useHostConfig = process.env.CLAUDE_TEST_USE_HOST_CONFIG === \'1\';',
+		'',
+		"// Single shared launch + inspector across the whole suite. N",
+		'// tests at one launch each would burn 30+ minutes on cold-start',
+		'// alone. We pay for setup once, then each test re-drives from the',
+		'// recorded startUrl so prior-test side effects (open menus, route',
+		'// changes) get reset before the next assertion runs.',
+		'let app: ClaudeApp | null = null;',
+		'let sharedInspector: InspectorClient | null = null;',
+		'let sharedStartUrl: string | null = null;',
+		'let suiteSkipReason: string | null = null;',
+		'',
+		"test.describe('U01 — UI visibility sweep (auto-generated)', () => {",
+		'\ttest.beforeAll(async () => {',
+		'\t\t// Default path: kill any host Claude, copy auth-relevant',
+		"\t\t// subset of ~/.config/Claude into a hermetic tmpdir, launch",
+		"\t\t// against that copy. Host config is left untouched after the",
+		"\t\t// kill+seed. CLAUDE_TEST_USE_HOST_CONFIG=1 opts out — shares",
+		"\t\t// the host's actual config directory (no kill+seed); use only",
+		"\t\t// when you've manually closed the host first.",
+		'\t\tif (useHostConfig) {',
+		'\t\t\tapp = await launchClaude({ isolation: null });',
+		'\t\t} else {',
+		'\t\t\tconst seeded = await createIsolation({ seedFromHost: true });',
+		'\t\t\tapp = await launchClaude({ isolation: seeded });',
+		'\t\t}',
+		"\t\tconst ready = await app.waitForReady('userLoaded');",
+		'\t\tif (!ready.postLoginUrl) {',
+		"\t\t\tsuiteSkipReason = 'claude.ai never reached a post-login URL — host ' +",
+		"\t\t\t\t'profile is not signed in. Sign in via the host app first.';",
+		'\t\t\treturn;',
+		'\t\t}',
+		'\t\tsharedInspector = ready.inspector;',
+		'\t\tsharedStartUrl = await currentUrl(sharedInspector);',
+		'\t\tawait waitForStable(sharedInspector);',
+		'\t});',
+		'',
+		'\ttest.afterAll(async () => {',
+		'\t\tif (sharedInspector) {',
+		'\t\t\ttry {',
+		'\t\t\t\tsharedInspector.close();',
+		'\t\t\t} catch {',
+		'\t\t\t\t// inspector may already be closed by app.close()',
+		'\t\t\t}',
+		'\t\t\tsharedInspector = null;',
+		'\t\t}',
+		'\t\tif (app) {',
+		'\t\t\tawait app.close();',
+		'\t\t\tapp = null;',
+		'\t\t}',
+		'\t});',
+		'',
+		'\t// why: shared per-test runner. Each generated `test()` packs the',
+		'\t// entry as a literal and calls this — keeps the file scannable',
+		'\t// (one block per entry) without duplicating the assertion logic',
+		"\t// in every test. Skips on its own when the suite was skipped so",
+		"\t// each test's status reflects the actual render check, not a",
+		'\t// mis-attributed setup failure.',
+		'\tasync function runEntry(',
+		'\t\tentry: InventoryEntry,',
+		"\t\ttestInfo: import('@playwright/test').TestInfo,",
+		'\t): Promise<void> {',
+		'\t\tif (suiteSkipReason) {',
+		'\t\t\ttestInfo.skip(true, suiteSkipReason);',
+		'\t\t\treturn;',
+		'\t\t}',
+		'\t\tif (!sharedInspector || !sharedStartUrl) {',
+		'\t\t\tthrow new Error(',
+		"\t\t\t\t'U01: beforeAll did not initialize the inspector — check the ' +",
+		"\t\t\t\t\t'session-env attachment for the launch failure.',",
+		'\t\t\t);',
+		'\t\t}',
+		"\t\ttestInfo.annotations.push({ type: 'severity', description: 'Should' });",
+		'\t\ttestInfo.annotations.push({',
+		"\t\t\ttype: 'surface',",
+		'\t\t\tdescription: entry.surface,',
+		'\t\t});',
+		'\t\ttestInfo.annotations.push({',
+		"\t\t\ttype: 'kind',",
+		'\t\t\tdescription: entry.kind,',
+		'\t\t});',
+		'',
+		'\t\ttry {',
+		'\t\t\tawait redrivePath(sharedInspector, sharedStartUrl, entry.navigationPath);',
+		'\t\t} catch (err) {',
+		'\t\t\tconst msg = err instanceof Error ? err.message : String(err);',
+		"\t\t\tawait testInfo.attach('redrive-failure', {",
+		'\t\t\t\tbody: JSON.stringify(',
+		'\t\t\t\t\t{',
+		'\t\t\t\t\t\tentry,',
+		'\t\t\t\t\t\terror: msg,',
+		'\t\t\t\t\t\tnote:',
+		"\t\t\t\t\t\t\t'redrivePath threw before we could assert visibility — ' +",
+		"\t\t\t\t\t\t\t'usually a stale fingerprint along the path. Re-walk the ' +",
+		"\t\t\t\t\t\t\t'inventory and regenerate.',",
+		'\t\t\t\t\t},',
+		'\t\t\t\t\tnull,',
+		'\t\t\t\t\t2,',
+		'\t\t\t\t),',
+		"\t\t\t\tcontentType: 'application/json',",
+		'\t\t\t});',
+		'\t\t\tthrow err;',
+		'\t\t}',
+		'\t\tawait waitForStable(sharedInspector);',
+		'',
+		'\t\tconst result = await findByFingerprint(',
+		'\t\t\tsharedInspector,',
+		'\t\t\tentry.fingerprint,',
+		'\t\t\tentry.kind,',
+		'\t\t);',
+		'\t\tif (!result.found) {',
+		"\t\t\tawait testInfo.attach('fingerprint-miss', {",
+		'\t\t\t\tbody: JSON.stringify(',
+		'\t\t\t\t\t{',
+		'\t\t\t\t\t\tentry,',
+		'\t\t\t\t\t\treason: result.reason,',
+		'\t\t\t\t\t\tobservedOuterHTML: result.outerHTMLSnippet,',
+		'\t\t\t\t\t},',
+		'\t\t\t\t\tnull,',
+		'\t\t\t\t\t2,',
+		'\t\t\t\t),',
+		"\t\t\t\tcontentType: 'application/json',",
+		'\t\t\t});',
+		'\t\t}',
+		"\t\t// Soft drift: primary aria-tree match failed but a relaxed-",
+		"\t\t// scope fallback recovered. Test still passes — but a",
+		"\t\t// drift-warning attachment surfaces it so the sweep summary",
+		"\t\t// can flag re-walk before drift compounds.",
+		'\t\tif (result.found && result.drift) {',
+		"\t\t\tawait testInfo.attach('drift-warning', {",
+		'\t\t\t\tbody: JSON.stringify(',
+		'\t\t\t\t\t{',
+		'\t\t\t\t\t\tentryId: entry.id,',
+		'\t\t\t\t\t\texpected: entry.fingerprint.ariaPath,',
+		'\t\t\t\t\t\tmatchedVia: result.strategy,',
+		'\t\t\t\t\t\tdrift: result.drift,',
+		"\t\t\t\t\t\tnote:",
+		"\t\t\t\t\t\t\t'primary aria-tree match failed; recovered via fallback. ' +",
+		"\t\t\t\t\t\t\t'Re-walk inventory before drift compounds.',",
+		'\t\t\t\t\t},',
+		'\t\t\t\t\tnull,',
+		'\t\t\t\t\t2,',
+		'\t\t\t\t),',
+		"\t\t\t\tcontentType: 'application/json',",
+		'\t\t\t});',
+		"\t\t\ttestInfo.annotations.push({",
+		"\t\t\t\ttype: 'drift',",
+		'\t\t\t\tdescription: result.strategy ?? \'unknown\',',
+		'\t\t\t});',
+		'\t\t}',
+		'\t\texpect(',
+		'\t\t\tresult.found,',
+		'\t\t\t`fingerprint did not resolve: ${result.reason ?? \'unknown\'}`,',
+		'\t\t).toBe(true);',
+		'\t}',
+		'',
+		'\ttest.beforeAll(async ({}, testInfo) => {',
+		"\t\tawait testInfo.attach('session-env', {",
+		'\t\t\tbody: JSON.stringify(captureSessionEnv(), null, 2),',
+		"\t\t\tcontentType: 'application/json',",
+		'\t\t});',
+		'\t});',
+		'',
+	);
+
+	// One describe per surface, one test per entry. Strings are
+	// JSON-encoded so labels with quotes/backticks/unicode survive.
+	for (const group of groups) {
+		out.push(
+			`\ttest.describe(${js(describeLabel(group.surface, group.entries.length))}, () => {`,
+		);
+		for (const entry of group.entries) {
+			const safe: InventoryEntry = {
+				...entry,
+				navigationPath: safeNavigationPath(entry),
+			};
+			out.push(
+				`\t\ttest(${js(testTitle(entry))}, async ({}, testInfo) => {`,
+				`\t\t\tconst entry: InventoryEntry = ${js(safe)};`,
+				'\t\t\tawait runEntry(entry, testInfo);',
+				'\t\t});',
+			);
+		}
+		out.push('\t});', '');
+	}
+
+	out.push('});', '');
+	return out.join('\n');
+}
+
+function atomicWrite(path: string, body: string): void {
+	const tmp = `${path}.tmp`;
+	writeFileSync(tmp, body, 'utf8');
+	renameSync(tmp, path);
+}
+
+function main(): void {
+	const opts = parseCli(process.argv.slice(2));
+	if (opts.help) {
+		printUsage();
+		return;
+	}
+	const inv = loadInventory(opts.inventory);
+	const meta = loadMeta(opts.inventory);
+	validate(inv, meta);
+
+	const groups = groupBySurface(inv.entries);
+	const body = generateSpec(inv, meta, groups);
+	atomicWrite(opts.output, body);
+
+	const testCount = inv.entries.length;
+	process.stdout.write(
+		`gen-render-specs: wrote ${opts.output}\n` +
+			`  ${testCount} test() across ${groups.length} test.describe() ` +
+			`(${inv.deniedActions} denylisted)\n`,
+	);
+}
+
+main();
--- a/tools/test-harness/explore/probe-claudeai-ax.ts
+++ b/tools/test-harness/explore/probe-claudeai-ax.ts
@@ -0,0 +1,202 @@
+// Live AX-tree probe for the claudeai.ts migration. Connects to the
+// host's main-process Node inspector on :9229 (must be enabled via
+// "Developer → Enable Main Process Debugger"), pulls the claude.ai
+// AX tree, and reports what the page-object discrimination shapes
+// will actually see.
+//
+// Read-only — no clicks, no state mutation.
+//
+// Run: cd tools/test-harness && npx tsx explore/probe-claudeai-ax.ts
+
+import { InspectorClient } from '../src/lib/inspector.js';
+import { axTreeToSnapshot, type RawElement } from './walker.js';
+
+const INSPECTOR_PORT = 9229;
+const ROW_MORE_OPTIONS_RE = /^More options for /;
+const MENU_ITEM_ROLES = new Set([
+	'menuitem',
+	'menuitemradio',
+	'menuitemcheckbox',
+]);
+
+function landmarkTrail(el: RawElement): string {
+	const trail = el.ancestors
+		.filter((a) => a.role !== null)
+		.map((a) => (a.name ? `${a.role}[${a.name}]` : (a.role as string)));
+	return trail.join(' › ') || '<no ancestors>';
+}
+
+function fmtElement(el: RawElement): string {
+	const name = el.accessibleName ?? '<no-name>';
+	const popup = el.hasPopup ?? '-';
+	return (
+		`  • role=${el.computedRole} hasPopup=${popup} ` +
+		`name=${JSON.stringify(name).slice(0, 90)}\n` +
+		`    landmarks: ${landmarkTrail(el)}`
+	);
+}
+
+async function main(): Promise<void> {
+	const inspector = await InspectorClient.connect(INSPECTOR_PORT);
+	try {
+		// What URL is the renderer on right now?
+		const url = await inspector.evalInRenderer<string>(
+			'claude.ai',
+			'(() => location.href)()',
+		);
+		process.stdout.write(`renderer URL: ${url}\n\n`);
+
+		const nodes = await inspector.getAccessibleTree('claude.ai');
+		process.stdout.write(`raw AX nodes: ${nodes.length}\n`);
+		const elements = axTreeToSnapshot(nodes);
+		process.stdout.write(
+			`interactive elements (post-filter): ${elements.length}\n\n`,
+		);
+
+		// Bucket by role for a quick overall shape.
+		const byRole = new Map<string, number>();
+		for (const el of elements) {
+			byRole.set(el.computedRole, (byRole.get(el.computedRole) ?? 0) + 1);
+		}
+		process.stdout.write('role histogram:\n');
+		for (const [role, n] of [...byRole.entries()].sort()) {
+			process.stdout.write(`  ${role}: ${n}\n`);
+		}
+		process.stdout.write('\n');
+
+		// THE KEY QUESTION: do any buttons report hasPopup === 'menu'?
+		// If yes, the migration's discrimination shape is sound. If no,
+		// claude.ai exposes the popover trigger via a different AX
+		// signal and we need a different filter.
+		const buttonsWithPopup = elements.filter(
+			(el) => el.computedRole === 'button' && el.hasPopup !== null,
+		);
+		process.stdout.write(
+			`buttons with hasPopup set (any value): ${buttonsWithPopup.length}\n`,
+		);
+		const popupValues = new Map<string, number>();
+		for (const b of buttonsWithPopup) {
+			const v = b.hasPopup ?? '<null>';
+			popupValues.set(v, (popupValues.get(v) ?? 0) + 1);
+		}
+		for (const [v, n] of [...popupValues.entries()].sort()) {
+			process.stdout.write(`  hasPopup="${v}": ${n}\n`);
+		}
+		process.stdout.write('\n');
+
+		// What findCompactPills() would return.
+		const compactPills = elements.filter(
+			(el) =>
+				el.computedRole === 'button' &&
+				el.hasPopup === 'menu' &&
+				el.accessibleName !== null &&
+				el.accessibleName.length > 0 &&
+				!ROW_MORE_OPTIONS_RE.test(el.accessibleName),
+		);
+		process.stdout.write(
+			`findCompactPills() would return ${compactPills.length} candidate(s):\n`,
+		);
+		for (const el of compactPills) process.stdout.write(`${fmtElement(el)}\n`);
+		process.stdout.write('\n');
+
+		// What the row-more-options filter is dropping.
+		const rowMore = elements.filter(
+			(el) =>
+				el.computedRole === 'button' &&
+				el.hasPopup === 'menu' &&
+				el.accessibleName !== null &&
+				ROW_MORE_OPTIONS_RE.test(el.accessibleName),
+		);
+		process.stdout.write(
+			`row-more-options filter dropped ${rowMore.length} button(s) ` +
+				`(showing first 5):\n`,
+		);
+		for (const el of rowMore.slice(0, 5)) {
+			process.stdout.write(`${fmtElement(el)}\n`);
+		}
+		process.stdout.write('\n');
+
+		// Top-level tabs: activateTab() looks for `role: 'button'` with
+		// accessibleName === 'Chat' | 'Cowork' | 'Code'. Probe each one.
+		process.stdout.write('top-level tab probe:\n');
+		for (const name of ['Chat', 'Cowork', 'Code']) {
+			const matches = elements.filter(
+				(el) =>
+					el.computedRole === 'button' && el.accessibleName === name,
+			);
+			process.stdout.write(`  "${name}": ${matches.length} match(es)\n`);
+			for (const el of matches) {
+				process.stdout.write(
+					`    landmarks: ${landmarkTrail(el)} hasPopup=${el.hasPopup ?? '-'}\n`,
+				);
+			}
+		}
+		process.stdout.write('\n');
+
+		// Open menu? Anything in MENU_ITEM_ROLES right now would mean a
+		// menu happens to be open at probe time — useful context for
+		// callers reading the output.
+		const items = elements.filter((el) =>
+			MENU_ITEM_ROLES.has(el.computedRole),
+		);
+		process.stdout.write(
+			`menuitem* elements currently in tree: ${items.length}` +
+				(items.length > 0 ? ' (a menu is open — surprise context)' : '') +
+				'\n\n',
+		);
+
+		// Diagnostic: is `properties[]` even being returned? Dump the
+		// raw shape of the first button node and any node that has a
+		// non-empty properties array, so we can tell whether
+		// (a) Chromium isn't surfacing aria-haspopup, or
+		// (b) properties[] is just absent from the response.
+		const firstButton = nodes.find((n) => n.role?.value === 'button');
+		if (firstButton) {
+			process.stdout.write('first raw button AxNode (full JSON):\n');
+			process.stdout.write(`${JSON.stringify(firstButton, null, 2)}\n\n`);
+		}
+
+		const nodesWithProps = nodes.filter(
+			(n) => Array.isArray(n.properties) && n.properties.length > 0,
+		);
+		process.stdout.write(
+			`raw nodes with non-empty properties[]: ${nodesWithProps.length}\n`,
+		);
+		// Histogram of property names actually present.
+		const propNames = new Map<string, number>();
+		for (const n of nodesWithProps) {
+			const props = n.properties as { name?: string }[];
+			for (const p of props) {
+				if (typeof p.name === 'string') {
+					propNames.set(p.name, (propNames.get(p.name) ?? 0) + 1);
+				}
+			}
+		}
+		for (const [name, n] of [...propNames.entries()].sort()) {
+			process.stdout.write(`  property "${name}": ${n}\n`);
+		}
+		process.stdout.write('\n');
+
+		// Spot-check the model picker if visible — it should be the
+		// canonical "menu trigger" on every surface.
+		const modelLikely = elements.filter(
+			(el) =>
+				el.accessibleName !== null &&
+				/^(Opus|Sonnet|Haiku|Claude)\b/i.test(el.accessibleName),
+		);
+		process.stdout.write(
+			`model-picker-like elements (name starts with Opus/Sonnet/Haiku/Claude): ` +
+				`${modelLikely.length}\n`,
+		);
+		for (const el of modelLikely.slice(0, 5)) {
+			process.stdout.write(`${fmtElement(el)}\n`);
+		}
+	} finally {
+		inspector.close();
+	}
+}
+
+main().catch((err) => {
+	process.stderr.write(`probe failed: ${err}\n`);
+	process.exit(1);
+});
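Reviewer note on the probe's two `hasPopup === 'menu'` filters: they partition the same set of named menu buttons, which a single pass makes easier to see. The sketch below is a minimal illustration under assumptions: `SketchElement` is a cut-down, hypothetical version of `RawElement` (only the three fields the probe reads), and the sample elements are invented.

```typescript
// Hypothetical, reduced element shape. The real RawElement lives in
// walker.ts and carries more fields.
interface SketchElement {
	computedRole: string;
	hasPopup: string | null;
	accessibleName: string | null;
}

const ROW_MORE_OPTIONS_RE = /^More options for /;

// One pass over the elements: named buttons with hasPopup === 'menu'
// go to `rowMore` when the name matches the row prefix, otherwise to
// `compactPills`. Everything else is ignored.
function splitMenuButtons(elements: SketchElement[]): {
	compactPills: SketchElement[];
	rowMore: SketchElement[];
} {
	const compactPills: SketchElement[] = [];
	const rowMore: SketchElement[] = [];
	for (const el of elements) {
		if (el.computedRole !== 'button' || el.hasPopup !== 'menu') continue;
		const name = el.accessibleName;
		if (name === null || name.length === 0) continue;
		if (ROW_MORE_OPTIONS_RE.test(name)) rowMore.push(el);
		else compactPills.push(el);
	}
	return { compactPills, rowMore };
}

// Invented sample data, not captured from a real AX tree.
const sample: SketchElement[] = [
	{ computedRole: 'button', hasPopup: 'menu', accessibleName: 'Local environment' },
	{ computedRole: 'button', hasPopup: 'menu', accessibleName: 'More options for Project notes' },
	{ computedRole: 'button', hasPopup: null, accessibleName: 'Chat' },
	{ computedRole: 'button', hasPopup: 'menu', accessibleName: '' },
	{ computedRole: 'link', hasPopup: 'menu', accessibleName: 'Docs' },
];
const split = splitMenuButtons(sample);
```

The probe keeps the two filters separate so each report section reads on its own; the single pass is only meant to show that the two result sets cannot overlap.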
--- a/tools/test-harness/explore/snapshot.ts
+++ b/tools/test-harness/explore/snapshot.ts
@@ -0,0 +1,276 @@
+// Renderer-state capture for the explore CLI.
+//
+// Why a separate module: the snapshot shape is the contract diff.ts
+// reads against. Keeping the capture here (rather than inline in the
+// dispatcher) means a future format bump only touches two files and
+// the schema lives next to its sole producer.
+//
+// All discovery is by structural shape — never by minified Tailwind
+// class names. We anchor on:
+//   - df-pills: button.df-pill[aria-label] (3 expected: Chat/Cowork/Code)
+//   - compact pills: button[aria-haspopup="menu"] containing
+//     span.truncate.max-w-[Npx] (env pill, Select-folder pill, …)
+//   - aria-labeled buttons: any <button[aria-label]> for general drift
+//     visibility (sidebar "more" buttons, header actions, modals).
+//   - open menu: the role=menu currently in the DOM, plus its items.
+//   - modals: role=dialog elements with aria-label/aria-labelledby.
+//
+// All renderer evals run in a single round-trip to keep snapshots
+// deterministic — async work between probes can shift the DOM.
+
+import type { InspectorClient } from '../src/lib/inspector.js';
+
+export interface DfPill {
+	ariaLabel: string | null;
+	text: string;
+	visible: boolean;
+}
+
+export interface CompactPillSnap {
+	ariaLabel: string | null;
+	text: string;
+	maxW: string;
+	expanded: boolean;
+}
+
+export interface AriaButton {
+	ariaLabel: string;
+	text: string;
+	expanded: boolean | null;
+	hasPopup: string | null;
+	visible: boolean;
+}
+
+export interface MenuItem {
+	role: string;
+	text: string;
+	ariaChecked: string | null;
+	disabled: boolean;
+}
+
+export interface OpenMenu {
+	ariaLabelledBy: string | null;
+	ariaLabel: string | null;
+	items: MenuItem[];
+}
+
+export interface ModalSnap {
+	ariaLabel: string | null;
+	ariaLabelledBy: string | null;
+	headingText: string | null;
+	buttonLabels: string[];
+}
+
+export interface PageState {
+	url: string;
+	title: string;
+	readyState: string;
+}
+
+export interface Snapshot {
+	capturedAt: string;
+	claudeAiUrl: string;
+	appVersion: string | null;
+	pageState: PageState;
+	dfPills: DfPill[];
+	compactPills: CompactPillSnap[];
+	ariaLabeledButtons: AriaButton[];
+	openMenu: OpenMenu | null;
+	modals: ModalSnap[];
+}
+
+// Capture the renderer DOM into the canonical snapshot shape.
+// `claudeAiUrl` is recorded separately from pageState.url because the
+// pageState reflects the moment of capture and is useful for diff
+// triage; the top-level url anchors which webContents we hit.
+export async function capture(client: InspectorClient): Promise<Snapshot> {
+	const target = await pickClaudeAiWebContents(client);
+	const appVersion = await readAppVersion(client);
+	const dom = await client.evalInRenderer<{
+		pageState: PageState;
+		dfPills: DfPill[];
+		compactPills: CompactPillSnap[];
+		ariaLabeledButtons: AriaButton[];
+		openMenu: OpenMenu | null;
+		modals: ModalSnap[];
+	}>('claude.ai', RENDERER_CAPTURE_BODY);
+	return {
+		capturedAt: new Date().toISOString(),
+		claudeAiUrl: target,
+		appVersion,
+		pageState: dom.pageState,
+		dfPills: dom.dfPills,
+		compactPills: dom.compactPills,
+		ariaLabeledButtons: dom.ariaLabeledButtons,
+		openMenu: dom.openMenu,
+		modals: dom.modals,
+	};
+}
+
+// Just the pills slice — used by `explore pills`. Reuses the same eval
+// body to avoid drift between subcommands.
+export async function capturePills(
+	client: InspectorClient,
+): Promise<{
+	dfPills: DfPill[];
+	compactPills: CompactPillSnap[];
+	pageState: PageState;
+}> {
+	const dom = await client.evalInRenderer<{
+		pageState: PageState;
+		dfPills: DfPill[];
+		compactPills: CompactPillSnap[];
+		ariaLabeledButtons: AriaButton[];
+		openMenu: OpenMenu | null;
+		modals: ModalSnap[];
+	}>('claude.ai', RENDERER_CAPTURE_BODY);
+	return {
+		dfPills: dom.dfPills,
+		compactPills: dom.compactPills,
+		pageState: dom.pageState,
+	};
+}
+
+// Just the open menu — used by `explore menu`.
+export async function captureOpenMenu(
+	client: InspectorClient,
+): Promise<OpenMenu | null> {
+	const dom = await client.evalInRenderer<{ openMenu: OpenMenu | null }>(
+		'claude.ai',
+		`(() => { ${OPEN_MENU_FN} return { openMenu: openMenu() }; })()`,
+	);
+	return dom.openMenu;
+}
+
+async function pickClaudeAiWebContents(
+	client: InspectorClient,
+): Promise<string> {
+	const list = await client.evalInMain<Array<{ url: string }>>(`
+		const { webContents } = process.mainModule.require('electron');
+		return webContents.getAllWebContents().map(w => ({ url: w.getURL() }));
+	`);
+	const target = list.find((w) => w.url.includes('claude.ai'));
+	if (!target) {
+		throw new Error(
+			'snapshot: no claude.ai webContents — open the app to a ' +
+				'logged-in state first',
+		);
+	}
+	return target.url;
+}
+
+// app.getVersion() is the cleanest source of truth — same value the
+// app.asar serves at runtime. Returns null if the call shape ever
+// changes upstream rather than failing the whole snapshot.
+async function readAppVersion(
+	client: InspectorClient,
+): Promise<string | null> {
+	try {
+		return await client.evalInMain<string>(`
+			const { app } = process.mainModule.require('electron');
+			return app.getVersion();
+		`);
+	} catch {
+		return null;
+	}
+}
+
+// Shared eval source: RENDERER_CAPTURE_BODY interpolates OPEN_MENU_FN
+// into one IIFE, so a capture is one round-trip. Limits (text 200,
+// list 200; modals 20, 50 buttons at 80 chars) bound the JSON if a
+// future infinite-scroll regression floods the DOM.
+const OPEN_MENU_FN = `
+	function openMenu() {
+		const menu = document.querySelector('[role=menu][data-open]')
+			|| document.querySelector('[role=menu]');
+		if (!menu) return null;
+		const items = Array.from(menu.querySelectorAll(
+			'[role=menuitem], [role=menuitemradio], [role=menuitemcheckbox]'
+		)).slice(0, 200).map(el => ({
+			role: el.getAttribute('role') || '',
+			text: (el.textContent || '').trim().slice(0, 200),
+			ariaChecked: el.getAttribute('aria-checked'),
+			disabled: el.hasAttribute('data-disabled')
+				|| el.getAttribute('aria-disabled') === 'true',
+		}));
+		return {
+			ariaLabelledBy: menu.getAttribute('aria-labelledby'),
+			ariaLabel: menu.getAttribute('aria-label'),
+			items,
+		};
+	}
+`;
+
+const RENDERER_CAPTURE_BODY = `
+	(() => {
+		${OPEN_MENU_FN}
+		const buttons = Array.from(document.querySelectorAll('button'));
+		const dfPills = buttons
+			.filter(b => /\\bdf-pill\\b/.test(b.className))
+			.map(b => ({
+				ariaLabel: b.getAttribute('aria-label'),
+				text: (b.textContent || '').trim().slice(0, 200),
+				visible: !!b.getClientRects().length,
+			}));
+		const compactPills = buttons.flatMap(b => {
+			if (b.getAttribute('aria-haspopup') !== 'menu') return [];
+			const span = b.querySelector('span.truncate');
+			if (!span) return [];
+			const m = span.className.match(/max-w-\\[[^\\]]+\\]/);
+			if (!m) return [];
+			return [{
+				ariaLabel: b.getAttribute('aria-label'),
+				text: (span.textContent || '').trim().slice(0, 200),
+				maxW: m[0],
+				expanded: b.getAttribute('aria-expanded') === 'true',
+			}];
+		});
+		const ariaLabeledButtons = buttons
+			.filter(b => b.hasAttribute('aria-label'))
+			.slice(0, 200)
+			.map(b => ({
+				ariaLabel: b.getAttribute('aria-label') || '',
+				text: (b.textContent || '').trim().slice(0, 200),
+				expanded: b.hasAttribute('aria-expanded')
+					? b.getAttribute('aria-expanded') === 'true'
+					: null,
+				hasPopup: b.getAttribute('aria-haspopup'),
+				visible: !!b.getClientRects().length,
+			}));
+		const modals = Array.from(
+			document.querySelectorAll('[role=dialog]')
+		).slice(0, 20).map(d => {
+			const heading = d.querySelector(
+				'h1, h2, h3, [role=heading]'
+			);
+			const btnLabels = Array.from(d.querySelectorAll('button'))
+				.slice(0, 50)
+				.map(b => {
+					const al = b.getAttribute('aria-label');
+					if (al) return al;
+					return (b.textContent || '').trim().slice(0, 80);
+				})
+				.filter(s => s.length > 0);
+			return {
+				ariaLabel: d.getAttribute('aria-label'),
+				ariaLabelledBy: d.getAttribute('aria-labelledby'),
+				headingText: heading
+					? (heading.textContent || '').trim().slice(0, 200)
+					: null,
+				buttonLabels: btnLabels,
+			};
+		});
+		return {
+			pageState: {
+				url: location.href,
+				title: document.title,
+				readyState: document.readyState,
+			},
+			dfPills,
+			compactPills,
+			ariaLabeledButtons,
+			openMenu: openMenu(),
+			modals,
+		};
+	})()
+`;
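Reviewer note on the compact-pill match in the capture body: because the body is a template string, the regex there is written with doubled backslashes. The sketch below shows the same pattern as a plain regex literal. `maxWidthToken` and the sample class strings are hypothetical and only illustrate which class names the pattern accepts.

```typescript
// The capture body's pattern, written as a regex literal. Inside the
// template string it appears as /max-w-\\[[^\\]]+\\]/ because each
// backslash has to survive template-literal unescaping first.
const MAX_W_RE = /max-w-\[[^\]]+\]/;

// Hypothetical helper: return the arbitrary-value max-width token from
// a class string, or null when there is none.
function maxWidthToken(className: string): string | null {
	const m = className.match(MAX_W_RE);
	return m ? m[0] : null;
}

// Invented class strings for illustration.
const hit = maxWidthToken('truncate max-w-[120px] text-sm');
const miss = maxWidthToken('truncate max-w-full text-sm');
```

Only the bracketed arbitrary-value form matches, so a span that uses a named width utility such as `max-w-full` is not treated as a compact pill.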
--- a/tools/test-harness/explore/walk-isolated.ts
+++ b/tools/test-harness/explore/walk-isolated.ts
@@ -0,0 +1,240 @@
+// Drive a v7 walk inside the test harness's launch-with-isolation
+// path so the run lives in a per-launch tmpdir (auth seeded from the
+// host config) rather than the running host app's own profile.
+//
+// Why a separate driver instead of `explore walk`: the standalone CLI
+// connects to whatever Node inspector is already on :9229 — i.e. the
+// running host Claude Desktop. That path mutates the host profile
+// (visited surfaces, navigation history, route changes) and races
+// with the human at the keyboard. The launchClaude path here mirrors
+// what H05 / U01 do: kill any running host instance, copy auth into
+// a tmpdir, spawn a fresh Electron with isolated XDG_CONFIG_HOME,
+// attach the inspector via SIGUSR1, and tear everything down on
+// exit.
+//
+// Usage (matches `explore walk` flag set):
+//   npx tsx explore/walk-isolated.ts --verbose --max-elements 2000
+//
+// Flags:
+//   --max-elements N            global cap (default 1000)
+//   --max-drills-per-surface N  per-surface drilling fan-out cap (default 50)
+//   --checkpoint-every N        write every N entries (default 100; 0 off)
+//   --output PATH               inventory output
+//                               (default docs/testing/ui-inventory.json)
+//   --allowlist PATH            JSON file with `exemptions: string[]`
+//   --no-seed                   don't copy host auth — fresh sign-in
+//                               required (rare; default seeds from host)
+//   --verbose                   walker chatter to stderr
+
+import {
+	existsSync,
+	mkdirSync,
+	readFileSync,
+	renameSync,
+	writeFileSync,
+} from 'node:fs';
+import { dirname, resolve } from 'node:path';
+import { fileURLToPath } from 'node:url';
+
+import { launchClaude } from '../src/lib/electron.js';
+import { createIsolation } from '../src/lib/isolation.js';
+import { walkRenderer, WALKER_VERSION } from './walker.js';
+import type { Inventory } from './walker.js';
+
+const TESTING_DIR = resolve(
+	dirname(fileURLToPath(import.meta.url)),
+	'..',
+	'..',
+	'..',
+	'docs',
+	'testing',
+);
+const INVENTORY_PATH = resolve(TESTING_DIR, 'ui-inventory.json');
+const INVENTORY_META_PATH = resolve(TESTING_DIR, 'ui-inventory.meta.json');
+const INVENTORY_TMP_SUFFIX = '.tmp';
+
+interface Options {
+	maxElements: number;
+	maxDrillsPerSurface: number;
+	checkpointEvery: number;
+	allowlist: string | null;
+	output: string;
+	verbose: boolean;
+	seed: boolean;
+	help: boolean;
+}
+
+function parseArgs(args: string[]): Options {
+	const opts: Options = {
+		maxElements: 1000,
+		maxDrillsPerSurface: 50,
+		checkpointEvery: 100,
+		allowlist: null,
+		output: INVENTORY_PATH,
+		verbose: false,
+		seed: true,
+		help: false,
+	};
+	for (let i = 0; i < args.length; i += 1) {
+		const a = args[i]!;
+		if (a === '-h' || a === '--help') opts.help = true;
+		else if (a === '--verbose') opts.verbose = true;
+		else if (a === '--no-seed') opts.seed = false;
+		else if (a === '--max-elements') {
+			const n = Number(args[++i]);
+			if (!Number.isInteger(n) || n < 0) die('--max-elements N (N≥0)');
+			opts.maxElements = n;
+		} else if (a === '--max-drills-per-surface') {
+			const n = Number(args[++i]);
+			if (!Number.isInteger(n) || n < 0) die('--max-drills-per-surface N');
+			opts.maxDrillsPerSurface = n;
+		} else if (a === '--checkpoint-every') {
+			const n = Number(args[++i]);
+			if (!Number.isInteger(n) || n < 0) die('--checkpoint-every N');
+			opts.checkpointEvery = n;
+		} else if (a === '--allowlist') {
+			const p = args[++i];
+			if (!p) die('--allowlist PATH');
+			opts.allowlist = p;
+		} else if (a === '--output') {
+			const p = args[++i];
+			if (!p) die('--output PATH');
+			opts.output = resolve(p);
+		} else {
+			die(`unknown flag: ${a}`);
+		}
+	}
+	return opts;
+}
+
+function die(msg: string): never {
+	process.stderr.write(`walk-isolated: ${msg}\n`);
+	process.exit(1);
+}
+
+function printUsage(): void {
+	process.stdout.write(
+		[
+			'usage: npx tsx explore/walk-isolated.ts [flags]',
+			'',
+			'flags:',
+			'  --max-elements N             global cap (default 1000)',
+			'  --max-drills-per-surface N   drilling fan-out cap (default 50)',
+			'  --checkpoint-every N         partial-write cadence (default 100; 0 disables)',
+			'  --output PATH                inventory output path',
+			'  --allowlist PATH             JSON { exemptions: string[] }',
+			'  --no-seed                    skip host-config auth seeding',
+			'  --verbose                    walker chatter on stderr',
+			'',
+		].join('\n'),
+	);
+}
+
+async function main(): Promise<void> {
+	const opts = parseArgs(process.argv.slice(2));
+	if (opts.help) {
+		printUsage();
+		return;
+	}
+
+	let allowlist: string[] = [];
+	if (opts.allowlist) {
+		const raw = readFileSync(opts.allowlist, 'utf8');
+		const parsed = JSON.parse(raw) as { exemptions?: string[] };
+		allowlist = parsed.exemptions ?? [];
+	}
+
+	const outDir = dirname(opts.output);
+	if (!existsSync(outDir)) mkdirSync(outDir, { recursive: true });
+	const metaPath =
+		opts.output === INVENTORY_PATH
+			? INVENTORY_META_PATH
+			: opts.output.replace(/(\.json)?$/, '.meta.json');
+
+	const writeCheckpoint = (inventory: Inventory, isPartial: boolean): void => {
+		const invTmp = `${opts.output}${INVENTORY_TMP_SUFFIX}`;
+		writeFileSync(invTmp, JSON.stringify(inventory, null, 2) + '\n', 'utf8');
+		renameSync(invTmp, opts.output);
+		const meta = {
+			capturedAt: inventory.capturedAt,
+			appVersion: inventory.appVersion,
+			walkerVersion: WALKER_VERSION,
+			startUrl: inventory.startUrl,
+			totalElements: inventory.totalElements,
+			deniedActions: inventory.deniedActions,
+			partial: isPartial,
+			isolation: 'launchClaude (test-harness path)',
+			seededFromHost: opts.seed,
+			allowlistEntries: allowlist,
+		};
+		const metaTmp = `${metaPath}${INVENTORY_TMP_SUFFIX}`;
+		writeFileSync(metaTmp, JSON.stringify(meta, null, 2) + '\n', 'utf8');
+		renameSync(metaTmp, metaPath);
+	};
+
+	process.stderr.write(
+		`walk-isolated: creating isolation (seedFromHost=${opts.seed})\n`,
+	);
+	const isolation = await createIsolation({ seedFromHost: opts.seed });
+	let app: Awaited<ReturnType<typeof launchClaude>> | null = null;
+	try {
+		process.stderr.write('walk-isolated: spawning Claude Desktop\n');
+		app = await launchClaude({ isolation });
+		process.stderr.write(
+			'walk-isolated: waiting for claude.ai webContents (90s budget)\n',
+		);
+		const { inspector, claudeAiUrl } = await app.waitForReady('claudeAi');
+		if (!claudeAiUrl) {
+			throw new Error(
+				'claude.ai webContents never loaded — host likely not signed in. ' +
+					'Open Claude Desktop, sign in, fully close, and re-run.',
+			);
+		}
+		process.stderr.write(`walk-isolated: at ${claudeAiUrl}\n`);
+
+		const inventory = await walkRenderer(inspector, {
+			maxElements: opts.maxElements,
+			maxDrillsPerSurface: opts.maxDrillsPerSurface,
+			allowlist,
+			verbose: opts.verbose,
+			checkpointEvery: opts.checkpointEvery,
+			checkpointWriter:
+				opts.checkpointEvery > 0
+					? (inv) => writeCheckpoint(inv, true)
+					: undefined,
+		});
+		writeCheckpoint(inventory, false);
+		process.stdout.write(
+			`wrote ${opts.output} (${inventory.totalElements} entries, ` +
+				`${inventory.deniedActions} denylisted)\n`,
+		);
+		process.stdout.write(`wrote ${metaPath}\n`);
+	} finally {
+		if (app) {
+			try {
+				await app.close();
+			} catch (err) {
+				process.stderr.write(
+					`walk-isolated: app.close() failed: ${
+						err instanceof Error ? err.message : String(err)
+					}\n`,
+				);
+			}
+		}
+		try {
+			await isolation.cleanup();
+		} catch (err) {
+			process.stderr.write(
+				`walk-isolated: isolation.cleanup() failed: ${
+					err instanceof Error ? err.message : String(err)
+				}\n`,
+			);
+		}
+	}
+}
+
+main().catch((err) => {
+	const msg = err instanceof Error ? err.message : String(err);
+	process.stderr.write(`walk-isolated: ${msg}\n`);
+	process.exit(2);
+});
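Reviewer note on the inventory/meta pair written by `writeCheckpoint`: the two files are renamed into place independently, so the meta path has to differ from the inventory path for every possible `--output` value. The sketch below is one way to derive such a sidecar path. `metaPathFor` and the sample paths are hypothetical and not part of the harness.

```typescript
const JSON_EXT = '.json';

// Hypothetical helper: derive a sidecar path that is always distinct
// from `output`. A trailing ".json" is swapped for ".meta.json"; any
// other name gets ".meta.json" appended.
function metaPathFor(output: string): string {
	return output.endsWith(JSON_EXT)
		? `${output.slice(0, -JSON_EXT.length)}.meta.json`
		: `${output}.meta.json`;
}

// Invented paths for illustration.
const withExt = metaPathFor('/tmp/walk/inventory.json');
const withoutExt = metaPathFor('/tmp/walk/inventory');
const otherExt = metaPathFor('/tmp/walk/inventory.txt');
```

Whatever derivation is used, the property worth keeping is that the result never equals the inventory path, since the second rename would otherwise replace the inventory with the meta record.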
--- a/tools/test-harness/explore/walker.ts
+++ b/tools/test-harness/explore/walker.ts
--- a/tools/test-harness/grounding-probe.ts
+++ b/tools/test-harness/grounding-probe.ts
@@ -0,0 +1,468 @@
+// Grounding probe — dumps Claude Desktop runtime state that backs the
+// load-bearing claims in docs/testing/cases/. Output is keyed by
+// test-ID so the next grounding sweep can diff captures across
+// upstream versions.
+//
+// Two modes:
+//   - attach (default): connect to an already-running app on port 9229
+//     (manual `--inspect=9229` run, or a launchClaude() instance that
+//     called attachInspector()).
+//   - --launch: spin up a fresh isolated instance via launchClaude(),
+//     capture, tear down. Self-contained — usable in CI.
+//
+// Mostly read-only; --include-synthetic enables short-lived state
+// changes (powerSaveBlocker start+stop) to close API-only gaps.
+//
+// Captures, keyed by test ID:
+//   T01  app metadata, webContents count
+//   T03  SNI / tray registration via DBus (KDE StatusNotifierWatcher)
+//   T06  globalShortcut.isRegistered() for known accelerators
+//   T09  app.getLoginItemSettings()
+//   T22  AX fingerprint (PR toolbar — open the surface before probing)
+//   T23  Notification.isSupported()
+//   T24  IPC channels matching /external|editor|openIn/i
+//   T26  AX fingerprint (Routines page — open before probing)
+//   T31  AX fingerprint (side chat — open before probing)
+//   T32  AX fingerprint (slash menu — type "/" before probing)
+//   T38  IPC channels matching /external|editor|openIn/i (editor handoff)
+//   S18  safeStorage.isEncryptionAvailable() + backend
+//   S20  powerSaveBlocker (gated by --include-synthetic)
+//   S22  process.platform (Computer Use gate)
+//   S25  safeStorage (cowork trusted-device token)
+//   S26  autoUpdater.getFeedURL() — empirical answer to the structural-
+//        open claim that static analysis couldn't resolve
+//
+// Usage:
+//   cd tools/test-harness
+//   npx tsx grounding-probe.ts                                          # attach :9229
+//   npx tsx grounding-probe.ts --launch                                 # self-contained
+//   npx tsx grounding-probe.ts --launch --include-synthetic
+//   npx tsx grounding-probe.ts --out ../../docs/testing/cases-grounding-runtime.json
+//   npx tsx grounding-probe.ts --port 9229 --out path/to/file.json
+//
+// Extending: add a section in capture() with a `client.evalInMain`
+// dump targeting whatever runtime state your new test cares about,
+// then map the result into `tests[<id>]`.
+
+import { writeFileSync } from 'node:fs';
+import { InspectorClient } from './src/lib/inspector.js';
+import { launchClaude } from './src/lib/electron.js';
+// dbus-next is loaded lazily inside captureSni() — importing here would
+// pull in a session-bus connection on environments without one (CI
+// containers, sshfs, etc.) and break the probe before it ever runs.
+
+// Accelerators we expect to be registered on Linux. T06 = Quick Entry
+// default. S31/S32 — fullscreen + cmd-K dispatch. Extend per case docs.
+const KNOWN_ACCELERATORS = [
+	'Alt+Space',
+	'Ctrl+Alt+Space',
+	'CommandOrControl+Shift+L',
+];
+
+interface AxFingerprintNode {
+	role: string;
+	name: string;
+	hasPopup: boolean;
+}
+
+interface GroundingCapture {
+	capturedAt: string;
+	appVersion: string;
+	appPath: string;
+	isPackaged: boolean;
+	platform: string;
+	// Cross-test corpus — useful as a denormalized source the per-test
+	// entries reference by index/key. Keep these flat so jq queries
+	// don't need to walk a nested tree.
+	ipcInvokeChannels: string[];
+	ipcOnChannels: string[];
+	webContents: Array<{ id: number; url: string; type: string }>;
+	// Reduced AX tree of the current claude.ai webContents, shared by
+	// every test entry that names a renderer-side surface. Stored once
+	// at the top level rather than copied per-test — diff stability
+	// matters more than per-test isolation here.
+	axFingerprint: AxFingerprintNode[];
+	// Per-test bag — extend as new probes land. Each entry is the
+	// runtime state the test's load-bearing claim depends on, in a
+	// shape that's easy to diff across captures. Renderer-side tests
+	// reference $.axFingerprint via { axFingerprintRef: true }.
+	tests: Record<string, unknown>;
+	// Probe-level diagnostics — what we tried and couldn't capture.
+	// Surfaced so the grounding sweep can flag uncovered surfaces.
+	gaps: string[];
+}
+
+interface CaptureOptions {
+	includeSynthetic: boolean;
+}
+
+async function capture(
+	client: InspectorClient,
+	opts: CaptureOptions,
+): Promise<GroundingCapture> {
+	const gaps: string[] = [];
+
+	// App metadata — every test references at least one of these.
+	const appMeta = await client.evalInMain<{
+		appVersion: string;
+		appPath: string;
+		isPackaged: boolean;
+		appReady: boolean;
+		platform: string;
+	}>(`
+		const { app } = process.mainModule.require('electron');
+		return {
+			appVersion: app.getVersion(),
+			appPath: app.getAppPath(),
+			isPackaged: app.isPackaged,
+			appReady: app.isReady(),
+			platform: process.platform,
+		};
+	`);
+
+	// IPC handler registry. Every claude.web_* channel registers via
+	// ipcMain.handle() (invoke side) or ipcMain.on() (fire-and-forget).
+	// Private API — surfaces shift across Electron versions; tolerate
+	// both shapes.
+	const ipc = await client.evalInMain<{ invoke: string[]; on: string[] }>(`
+		const { ipcMain } = process.mainModule.require('electron');
+		const invoke = ipcMain._invokeHandlers
+			? Array.from(ipcMain._invokeHandlers.keys())
+			: [];
+		const on = ipcMain.eventNames ? ipcMain.eventNames().map(String) : [];
+		return { invoke, on };
+	`);
+
+	// WebContents inventory — proves which BrowserViews / BrowserWindows
+	// exist at probe time. Note: BrowserWindow.getAllWindows() returns
+	// 0 because frame-fix-wrapper substitutes the class (see
+	// inspector.ts header comment) — webContents registry stays intact.
+	const webContents = await client.evalInMain<
+		Array<{ id: number; url: string; type: string }>
+	>(`
+		const { webContents } = process.mainModule.require('electron');
+		return webContents.getAllWebContents().map(w => ({
+			id: w.id,
+			url: w.getURL(),
+			type: w.getType ? w.getType() : 'unknown',
+		}));
+	`);
+
+	// Global shortcuts — T06, S31/S32 reference these. isRegistered()
+	// is the canonical runtime probe; matches the case-doc claim about
+	// what's bound at startup.
+	const accelerators = await client.evalInMain<
+		Array<{ accelerator: string; registered: boolean }>
+	>(`
+		const { globalShortcut } = process.mainModule.require('electron');
+		const list = ${JSON.stringify(KNOWN_ACCELERATORS)};
+		return list.map(a => ({
+			accelerator: a,
+			registered: globalShortcut.isRegistered(a),
+		}));
+	`);
+
+	// Autostart resolution — T09. On Linux Electron's openAtLogin is a
+	// documented no-op; our wrapper installs an XDG Autostart shim
+	// (frame-fix-wrapper.js:376). The empirical check confirms which
+	// path is active.
+	const loginItems = await client.evalInMain<{
+		openAtLogin: boolean;
+		wasOpenedAtLogin?: boolean;
+		executableWillLaunchAtLogin?: boolean;
+	}>(`
+		const { app } = process.mainModule.require('electron');
+		return app.getLoginItemSettings();
+	`);
+
+	// safeStorage — S18 (env-config encryption) + S25 (cowork trusted-
+	// device token). Linux backend is libsecret; availability gates
+	// whether tokens persist or stall.
+	const safeStorage = await client.evalInMain<{
+		available: boolean;
+		backend: string;
+	}>(`
+		const { safeStorage } = process.mainModule.require('electron');
+		let backend = 'unknown';
+		try {
+			if (safeStorage.getSelectedStorageBackend) {
+				backend = safeStorage.getSelectedStorageBackend();
+			}
+		} catch (_) { /* older Electron — backend not exposed */ }
+		return {
+			available: safeStorage.isEncryptionAvailable(),
+			backend,
+		};
+	`);
+
+	// autoUpdater feedURL — S26. The case doc claims the gate is open
+	// by construction (lii() returns true on Linux when packaged).
+	// Accidental coverage from Electron's Linux autoUpdater being
+	// unimplemented saves us from real download attempts. This probe
+	// puts that on the record empirically.
+	const autoUpdater = await client.evalInMain<{
+		feedURL: string | null;
+		feedURLError: string | null;
+	}>(`
+		const { autoUpdater } = process.mainModule.require('electron');
+		let feedURL = null, feedURLError = null;
+		try {
+			feedURL = autoUpdater.getFeedURL ? autoUpdater.getFeedURL() : null;
+		} catch (e) {
+			feedURLError = String(e && e.message);
+		}
+		return { feedURL, feedURLError };
+	`);
+
+	// Tray — T03. We can't enumerate Tray instances via public API,
+	// but we can confirm Notification support is alive (T23 prerequisite).
+	const notifications = await client.evalInMain<{ supported: boolean }>(`
+		const { Notification } = process.mainModule.require('electron');
+		return { supported: Notification.isSupported() };
+	`);
+
+	// Powermonitor / suspend inhibit — S20. powerSaveBlocker has no
+	// public enumeration API. Synthetic probe (gated behind
+	// --include-synthetic) starts a blocker, reads isStarted, stops
+	// immediately. Brief inhibit (~ms) is harmless; what we get back
+	// is empirical proof the API path is alive on this host. Doesn't
+	// verify the case-doc claim that `keepAwakeEnabled` setting toggles
+	// trigger this — that requires correlating settings IO with the
+	// `PhA` Set at index.js:241897, which depends on minified-name
+	// stability and is left to the next sweep.
+	let powerSaveBlocker: {
+		apiAvailable: boolean;
+		startWorks: boolean;
+		idType: string;
+		probeError: string | null;
+	} | null = null;
+	if (opts.includeSynthetic) {
+		powerSaveBlocker = await client.evalInMain(`
+			const { powerSaveBlocker } = process.mainModule.require('electron');
+			let id = null, started = false, probeError = null;
+			try {
+				id = powerSaveBlocker.start('prevent-app-suspension');
+				started = powerSaveBlocker.isStarted(id);
+			} catch (e) {
+				probeError = String(e && e.message);
+			} finally {
+				if (id !== null) {
+					try { powerSaveBlocker.stop(id); } catch (_) {}
+				}
+			}
+			return {
+				apiAvailable: true,
+				startWorks: started,
+				idType: typeof id,
+				probeError,
+			};
+		`);
+	} else {
+		gaps.push(
+			'S20: powerSaveBlocker not probed (skip-synthetic). ' +
+				'Re-run with --include-synthetic to confirm API path.',
+		);
+	}
+
+	// Editor handoff scheme registry — T24/T38. Static case anchor
+	// (`Mtt` at index.js:463902) names the registry; variable is
+	// minified, so we identify by IPC handler name pattern instead.
+	// The case doc claims schemes vscode/cursor/zed/windsurf are wired
+	// up on Linux (xcode is darwin-only). The IPC channel that calls
+	// `shell.openExternal('<scheme>://file/<encoded-path>:<line>')`
+	// will be one of these matches.
+	const editorIpcChannels = [
+		...ipc.invoke.filter((c) => /external|editor|openIn/i.test(c)),
+		...ipc.on.filter((c) => /external|editor|openIn/i.test(c)),
+	];
+
+	// Renderer AX fingerprint — T22/T26/T31/T32. `getAccessibleTree`
+	// snapshots whatever's *currently on screen*. To anchor surfaces
+	// inside modals/popups (preset list, slash menu, side chat, PR
+	// toolbar), open the surface in the running app before probe time.
+	// Reduced form (role+name+hasPopup) keeps the output grep-able and
+	// avoids re-shipping ui-inventory.json's full schema.
+	const claudeAi = webContents.find((w) => w.url.includes('claude.ai'));
+	let axFingerprint: AxFingerprintNode[] = [];
+	if (claudeAi) {
+		try {
+			const tree = await client.getAccessibleTree('claude.ai');
+			axFingerprint = tree
+				.filter((n) => !n.ignored && n.role && n.name)
+				.map((n) => ({
+					role: n.role!.value,
+					name: n.name!.value,
+					hasPopup: !!n.properties?.find((p) => p.name === 'haspopup'),
+				}))
+				.filter((n) => n.name.length > 0);
+		} catch (e) {
+			gaps.push(
+				`renderer-ax: getAccessibleTree threw: ${e instanceof Error ? e.message : String(e)}`,
+			);
+		}
+	} else {
+		gaps.push(
+			'renderer-ax: no claude.ai webContents at probe time. ' +
+				'Sign in to the app before re-running to capture renderer state.',
+		);
+	}
+
+	// Tray / SNI registration — T03. Linux tray icons register against
+	// org.kde.StatusNotifierWatcher (KDE protocol used by GNOME's
+	// AppIndicator extension too). We can attribute an SNI item to the
+	// app's pid via `findItemByPid`. Lazily imported because dbus-next
+	// connects on first call to getSessionBus(), and we want
+	// non-DBus environments to still get a partial probe rather than
+	// hard-fail.
+	const ourPid = await client.evalInMain<number>('return process.pid;');
+	let sni: {
+		ourPid: number;
+		registeredItem: { service: string; objectPath: string } | null;
+		probeError: string | null;
+	} = { ourPid, registeredItem: null, probeError: null };
+	try {
+		const sniLib = await import('./src/lib/sni.js');
+		const dbusLib = await import('./src/lib/dbus.js');
+		try {
+			sni.registeredItem = await sniLib.findItemByPid(ourPid);
+		} finally {
+			await dbusLib.disconnectBus();
+		}
+	} catch (e) {
+		sni.probeError = e instanceof Error ? e.message : String(e);
+	}
+
+	// T22 PR toolbar / T31 side chat / T32 slash menu — these surfaces
+	// are now captured if the user has the relevant view open at probe
+	// time (see `axFingerprint` above). Empty fingerprint at idle is
+	// expected; flag here only if the renderer was reachable but the
+	// captured tree was empty (which would suggest the AX walker hit
+	// a permission gate or was disabled).
+	if (claudeAi && axFingerprint.length === 0) {
+		gaps.push(
+			'renderer-ax: claude.ai webContents present but AX tree empty. ' +
+				'Either Accessibility was not enabled or the page is mid-load.',
+		);
+	}
+	gaps.push(
+		'T39 /desktop: lives in the upstream `claude` CLI binary, not the ' +
+			'Electron asar — not reachable from this probe.',
+	);
+
+	return {
+		capturedAt: new Date().toISOString(),
+		appVersion: appMeta.appVersion,
+		appPath: appMeta.appPath,
+		isPackaged: appMeta.isPackaged,
+		platform: appMeta.platform,
+		ipcInvokeChannels: ipc.invoke,
+		ipcOnChannels: ipc.on,
+		webContents,
+		axFingerprint,
+		tests: {
+			T01: { appReady: appMeta.appReady, webContentsCount: webContents.length },
+			T03: sni,
+			T06: { accelerators },
+			T09: loginItems,
+			T22: { axFingerprintRef: true, count: axFingerprint.length },
+			T23: notifications,
+			T24: { editorIpcChannels },
+			T26: { axFingerprintRef: true, count: axFingerprint.length },
+			T31: { axFingerprintRef: true, count: axFingerprint.length },
+			T32: { axFingerprintRef: true, count: axFingerprint.length },
+			T38: { editorIpcChannels },
+			S18: safeStorage,
+			S20: powerSaveBlocker,
+			S22: {
+				platform: appMeta.platform,
+				expectedDisabledOnLinux: appMeta.platform === 'linux',
+			},
+			S25: safeStorage,
+			S26: {
+				...autoUpdater,
+				isPackaged: appMeta.isPackaged,
+				platform: appMeta.platform,
+				note: 'Gate is structurally open; saved by Electron autoUpdater being unimplemented on Linux.',
+			},
+		},
+		gaps,
+	};
+}
+
+interface ParsedArgs {
+	port: number;
+	out: string;
+	launch: boolean;
+	includeSynthetic: boolean;
+}
+
+function parseArgs(argv: string[]): ParsedArgs {
+	const flags = new Set<string>();
+	const args = new Map<string, string>();
+	for (let i = 2; i < argv.length; i++) {
+		const tok = argv[i];
+		if (!tok || !tok.startsWith('--')) continue;
+		const key = tok.replace(/^--/, '');
+		const next = argv[i + 1];
+		if (next && !next.startsWith('--')) {
+			args.set(key, next);
+			i++;
+		} else {
+			flags.add(key);
+		}
+	}
+	return {
+		port: Number(args.get('port') ?? 9229),
+		out: args.get('out') ?? '/tmp/grounding-probe.json',
+		launch: flags.has('launch'),
+		includeSynthetic: flags.has('include-synthetic'),
+	};
+}
+
+async function main() {
+	const parsed = parseArgs(process.argv);
+	const { out, launch, includeSynthetic } = parsed;
+
+	let client: InspectorClient;
+	let cleanup: () => Promise<void>;
+
+	if (launch) {
+		// Self-contained: fresh isolation per run, tear down on exit.
+		// 'mainVisible' is the lowest level that gives us the inspector
+		// without waiting on claude.ai network load. Enough for the
+		// main-process probes; the AX fingerprint logs a gap if unloaded.
+		const app = await launchClaude();
+		const ready = await app.waitForReady('mainVisible');
+		client = ready.inspector;
+		cleanup = async () => {
+			client.close();
+			await app.close();
+		};
+	} else {
+		client = await InspectorClient.connect(parsed.port);
+		cleanup = async () => {
+			client.close();
+		};
+	}
+
+	try {
+		const result = await capture(client, { includeSynthetic });
+		writeFileSync(out, JSON.stringify(result, null, 2));
+		console.log(
+			`grounding-probe: wrote ${out} ` +
+				`(${result.ipcInvokeChannels.length} invoke channels, ` +
+				`${result.webContents.length} webContents, ` +
+				`${result.axFingerprint.length} ax nodes, ` +
+				`${result.gaps.length} gaps` +
+				`${launch ? ', --launch' : ''}` +
+				`${includeSynthetic ? ', synthetic' : ''})`,
+		);
+	} finally {
+		await cleanup();
+	}
+}
+
+main().catch((err) => {
+	console.error('grounding-probe failed:', err);
+	process.exit(1);
+});
--- a/tools/test-harness/orchestrator/sweep.sh
+++ b/tools/test-harness/orchestrator/sweep.sh
@@ -0,0 +1,108 @@
+#!/usr/bin/env bash
+# sweep.sh — run a test sweep for a row.
+#
+# Usage:
+#   ROW=KDE-W ./orchestrator/sweep.sh
+#   CLAUDE_DESKTOP_LAUNCHER=/usr/bin/claude-desktop ROW=KDE-W ./orchestrator/sweep.sh
+#
+# Output bundle layout:
+#   results/results-${ROW}-${DATE}/
+#     ├── junit.xml
+#     ├── html/                   (Playwright HTML report)
+#     └── test-output/            (per-test attachments)
+
+set -uo pipefail
+
+script_dir="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
+readonly script_dir
+harness_dir="$(dirname "$script_dir")"
+readonly harness_dir
+
+readonly row="${ROW:-KDE-W}"
+date_str="$(date -u +%Y%m%dT%H%M%SZ)"
+readonly date_str
+readonly bundle_id="results-${row}-${date_str}"
+readonly results_root="${OUTPUT_DIR:-${harness_dir}/results}"
+readonly bundle_dir="${results_root}/${bundle_id}"
+
+mkdir -p "$bundle_dir"
+
+cd "$harness_dir" || exit 1
+
+# Backend banner. CLAUDE_HARNESS_USE_WAYLAND=1 flips every runner from
+# the default X11/XWayland backend to native Wayland — see the
+# "Environment variables" table in tools/test-harness/README.md.
+if [[ "${CLAUDE_HARNESS_USE_WAYLAND:-}" == '1' ]]; then
+	printf 'sweep: native Wayland backend (CLAUDE_HARNESS_USE_WAYLAND=1)\n' >&2
+fi
+
+# Fast-fail prereq checks — only matter when the sweep includes
+# Quick Entry runners (S31, future S29/S30/S32/S34/S35/S37 +
+# T06 / QE-* additions). Skip with QE_PREREQ_CHECK=0 if running
+# a sweep that excludes those.
+if [[ "${QE_PREREQ_CHECK:-1}" == "1" ]]; then
+	if ! command -v ydotool >/dev/null 2>&1; then
+		printf 'sweep: ydotool not on PATH — Quick Entry runners will skip.\n' >&2
+		printf '  install: dnf install ydotool / apt install ydotool\n' >&2
+		printf '  to suppress this check: QE_PREREQ_CHECK=0\n' >&2
+	fi
+	socket="${YDOTOOL_SOCKET:-/tmp/.ydotool_socket}"
+	if [[ ! -S "$socket" ]]; then
+		printf 'sweep: ydotoold socket missing at %s — daemon not running.\n' \
+			"$socket" >&2
+		printf '  start: sudo systemctl start ydotool.service\n' >&2
+		printf '  see tools/test-harness/README.md "Quick Entry runners" for one-time setup\n' >&2
+	fi
+fi
+
+ROW="$row" \
+RESULTS_DIR="$bundle_dir" \
+	npx playwright test
+rc=$?
+
+# Bundle into tar.zst for orchestrator pickup. Best-effort — keep the
+# uncompressed dir even if zstd is unavailable.
+if command -v zstd >/dev/null 2>&1; then
+	tar --zstd -cf "${results_root}/${bundle_id}.tar.zst" \
+		-C "$results_root" "$bundle_id" 2>/dev/null \
+		&& printf 'bundle: %s/%s.tar.zst\n' "$results_root" "$bundle_id"
+fi
+
+printf 'row=%s exit=%d dir=%s\n' "$row" "$rc" "$bundle_dir"
+
+# Quick summary if junit.xml landed. Prefer Node so we sum across all
+# <testsuite> elements; the \s after "<testsuite" keeps the root
+# <testsuites> totals from being counted a second time. Fall back to
+# grep (first match only) when node isn't on PATH, for minimal images.
+if [[ -f "${bundle_dir}/junit.xml" ]]; then
+	if command -v node >/dev/null 2>&1; then
+		read -r tests failures errors skipped \
+			< <(node -e "$(cat <<'EOF'
+const fs = require('fs');
+const xml = fs.readFileSync(process.argv[1], 'utf8');
+const sumAttr = (a) => Array.from(
+	xml.matchAll(new RegExp(`<testsuite\\s[^>]*\\b${a}="(\\d+)"`, 'g'))
+).reduce((s, m) => s + parseInt(m[1], 10), 0);
+console.log([
+	sumAttr('tests'), sumAttr('failures'),
+	sumAttr('errors'), sumAttr('skipped'),
+].join(' '));
+EOF
+)" "${bundle_dir}/junit.xml")
+		printf 'summary: tests=%s failures=%s errors=%s skipped=%s\n' \
+			"$tests" "$failures" "$errors" "$skipped"
+	elif command -v grep >/dev/null 2>&1; then
+		tests="$(grep -oP 'tests="\K\d+' "${bundle_dir}/junit.xml" \
+			| head -1 || printf '?')"
+		failures="$(grep -oP 'failures="\K\d+' "${bundle_dir}/junit.xml" \
+			| head -1 || printf '?')"
+		errors="$(grep -oP 'errors="\K\d+' "${bundle_dir}/junit.xml" \
+			| head -1 || printf '?')"
+		skipped="$(grep -oP 'skipped="\K\d+' "${bundle_dir}/junit.xml" \
+			| head -1 || printf '?')"
+		printf 'summary: tests=%s failures=%s errors=%s skipped=%s\n' \
+			"$tests" "$failures" "$errors" "$skipped"
+	fi
+fi
+
+exit "$rc"
--- a/tools/test-harness/package.json
+++ b/tools/test-harness/package.json
@@ -0,0 +1,32 @@
+{
+	"name": "claude-desktop-debian-test-harness",
+	"version": "0.0.1",
+	"private": true,
+	"description": "Linux compatibility test harness for claude-desktop-debian",
+	"type": "module",
+	"engines": {
+		"node": ">=20"
+	},
+	"scripts": {
+		"test": "playwright test",
+		"sweep": "bash orchestrator/sweep.sh",
+		"typecheck": "tsc --noEmit",
+		"explore": "npx tsx explore/explore.ts",
+		"grounding-probe": "npx tsx grounding-probe.ts",
+		"explore:snapshot": "npx tsx explore/explore.ts snapshot",
+		"explore:diff": "npx tsx explore/explore.ts diff",
+		"explore:walk": "npx tsx explore/explore.ts walk",
+		"derive:vocabulary": "npx tsx explore/derive-vocabulary.ts",
+		"gen:render-specs": "npx tsx explore/gen-render-specs.ts"
+	},
+	"devDependencies": {
+		"@playwright/test": "^1.48.0",
+		"@types/node": "^20.16.0",
+		"playwright": "^1.48.0",
+		"typescript": "^5.6.0"
+	},
+	"dependencies": {
+		"@electron/asar": "^3.2.10",
+		"dbus-next": "^0.10.2"
+	}
+}
--- a/tools/test-harness/playwright.config.ts
+++ b/tools/test-harness/playwright.config.ts
@@ -0,0 +1,25 @@
+/// <reference types="node" />
+import { defineConfig } from '@playwright/test';
+
+const resultsDir = process.env.RESULTS_DIR ?? './results/local';
+
+export default defineConfig({
+	testDir: './src/runners',
+	testMatch: /.*\.spec\.ts$/,
+	fullyParallel: false,
+	workers: 1,
+	retries: process.env.CI ? 1 : 0,
+	forbidOnly: !!process.env.CI,
+	timeout: 60_000,
+	expect: { timeout: 10_000 },
+	outputDir: `${resultsDir}/test-output`,
+	reporter: [
+		['list'],
+		['junit', { outputFile: `${resultsDir}/junit.xml` }],
+		['html', { outputFolder: `${resultsDir}/html`, open: 'never' }],
+	],
+	use: {
+		trace: 'retain-on-failure',
+		screenshot: 'only-on-failure',
+	},
+});
--- a/tools/test-harness/probe.ts
+++ b/tools/test-harness/probe.ts
@@ -0,0 +1,163 @@
+// Standalone probe that connects to a running claude-desktop with the
+// main process debugger enabled (port 9229) and dumps renderer-DOM
+// shapes useful for designing reusable abstractions in lib/claudeai.ts.
+//
+// Run from tools/test-harness:
+//   npx tsx probe.ts
+//
+// Non-destructive — observes only, doesn't click anything.
+
+import { InspectorClient } from './src/lib/inspector.js';
+import { writeFileSync } from 'node:fs';
+
+async function main() {
+	const client = await InspectorClient.connect(9229);
+
+	const webContentsList = await client.evalInMain<
+		Array<{ id: number; url: string; type: string }>
+	>(`
+		const { webContents } = process.mainModule.require('electron');
+		return webContents.getAllWebContents().map(w => ({
+			id: w.id,
+			url: w.getURL(),
+			type: w.getType ? w.getType() : 'unknown',
+		}));
+	`);
+
+	const target = webContentsList.find((w) => w.url.includes('claude.ai'));
+	if (!target) {
+		console.error('No claude.ai webContents — open the app to a logged-in state first.');
+		console.error('webContents observed:', webContentsList);
+		process.exit(1);
+	}
+
+	console.log('=== webContents ===');
+	console.log(JSON.stringify(webContentsList, null, 2));
+	console.log('Targeting:', target.url, `(id=${target.id})`);
+
+	// All "pill"-shape buttons on the page.
+	const pills = await client.evalInRenderer<{
+		dfPills: Array<{ ariaLabel: string | null; text: string; visible: boolean; classSig: string }>;
+		menuButtons: Array<{
+			ariaLabel: string | null;
+			text: string;
+			expanded: boolean;
+			truncateMaxW: string | null;
+			classSig: string;
+		}>;
+		summary: { totalButtons: number; ariaHaspopupMenu: number; dfPills: number };
+	}>(
+		'claude.ai',
+		`
+		(() => {
+			const buttons = Array.from(document.querySelectorAll('button'));
+			const dfPills = buttons
+				.filter(b => /\\bdf-pill\\b/.test(b.className))
+				.map(b => ({
+					ariaLabel: b.getAttribute('aria-label'),
+					text: (b.textContent || '').trim().slice(0, 80),
+					visible: !!b.getClientRects().length,
+					classSig: b.className.slice(0, 120),
+				}));
+			const menuButtons = buttons
+				.filter(b => b.getAttribute('aria-haspopup') === 'menu')
+				.map(b => {
+					const truncSpan = b.querySelector('span.truncate');
+					const maxW = truncSpan
+						? (truncSpan.className.match(/max-w-\\[[^\\]]+\\]/) || [null])[0]
+						: null;
+					return {
+						ariaLabel: b.getAttribute('aria-label'),
+						text: (b.textContent || '').trim().slice(0, 80),
+						expanded: b.getAttribute('aria-expanded') === 'true',
+						truncateMaxW: maxW,
+						classSig: b.className.slice(0, 120),
+					};
+				});
+			return {
+				dfPills,
+				menuButtons,
+				summary: {
+					totalButtons: buttons.length,
+					ariaHaspopupMenu: menuButtons.length,
+					dfPills: dfPills.length,
+				},
+			};
+		})()
+	`,
+	);
+
+	console.log('\n=== Pills summary ===');
+	console.log(JSON.stringify(pills.summary, null, 2));
+
+	console.log('\n=== df-pill buttons ===');
+	console.log(JSON.stringify(pills.dfPills, null, 2));
+
+	console.log('\n=== aria-haspopup=menu buttons (sample) ===');
+	console.log(JSON.stringify(pills.menuButtons.slice(0, 10), null, 2));
+
+	// Currently open menu (if any) — items, structure.
+	const openMenu = await client.evalInRenderer<{
+		menuPresent: boolean;
+		ariaLabelledBy: string | null;
+		items: Array<{ role: string; text: string; ariaChecked: string | null; disabled: boolean }>;
+	} | null>(
+		'claude.ai',
+		`
+		(() => {
+			const menu = document.querySelector('[role=menu][data-open]') || document.querySelector('[role=menu]');
+			if (!menu) return null;
+			const items = Array.from(menu.querySelectorAll('[role=menuitem], [role=menuitemradio], [role=menuitemcheckbox]'))
+				.map(el => ({
+					role: el.getAttribute('role') || '',
+					text: (el.textContent || '').trim().slice(0, 80),
+					ariaChecked: el.getAttribute('aria-checked'),
+					disabled: el.hasAttribute('data-disabled') || el.getAttribute('aria-disabled') === 'true',
+				}));
+			return {
+				menuPresent: true,
+				ariaLabelledBy: menu.getAttribute('aria-labelledby'),
+				items,
+			};
+		})()
+	`,
+	);
+
+	console.log('\n=== Currently open menu ===');
+	console.log(openMenu ? JSON.stringify(openMenu, null, 2) : 'no menu open');
+
+	// URL and basic page state.
+	const pageState = await client.evalInRenderer<{
+		url: string;
+		title: string;
+		readyState: string;
+		hasComposer: boolean;
+		hasSidebar: boolean;
+	}>(
+		'claude.ai',
+		`
+		(() => ({
+			url: location.href,
+			title: document.title,
+			readyState: document.readyState,
+			hasComposer: !!document.querySelector('[data-testid*=composer], textarea[placeholder*=Reply], textarea[placeholder*=Message]'),
+			hasSidebar: !!document.querySelector('nav, [role=navigation]'),
+		}))()
+	`,
+	);
+
+	console.log('\n=== Page state ===');
+	console.log(JSON.stringify(pageState, null, 2));
+
+	const out = { webContentsList, pills, openMenu, pageState };
+	writeFileSync('/tmp/claude-probe.json', JSON.stringify(out, null, 2));
+	console.log('\nFull dump → /tmp/claude-probe.json');
+
+	client.close();
+	process.exit(0);
+}
+
+main().catch((err) => {
+	console.error('probe failed:', err);
+	process.exit(1);
+});
--- a/tools/test-harness/src/lib/argv.ts
+++ b/tools/test-harness/src/lib/argv.ts
@@ -0,0 +1,44 @@
+// Read a process's argv from /proc/<pid>/cmdline.
+//
+// /proc/<pid>/cmdline is a single string of NUL-terminated args (so
+// usually one trailing NUL; strip it before splitting). Used by QE-6 / S12
+// to verify the launcher appended the right Electron flags, and by
+// future flag-presence tests (Decision 6 Wayland-default Smoke, S07
+// CLAUDE_USE_WAYLAND, etc.).
+//
+// readPidArgv returns null if the process is gone — callers usually
+// want to retry until the pid stabilizes.
+
+import { readFile } from 'node:fs/promises';
+
+export async function readPidArgv(pid: number): Promise<string[] | null> {
+	try {
+		const raw = await readFile(`/proc/${pid}/cmdline`, 'utf8');
+		// Strip trailing NUL if present, then split. Empty argv is
+		// theoretically possible (kernel threads); preserve it.
+		const trimmed = raw.endsWith('\0') ? raw.slice(0, -1) : raw;
+		return trimmed.length === 0 ? [] : trimmed.split('\0');
+	} catch {
+		return null;
+	}
+}
+
+export function argvHasFlag(argv: string[], flag: string): boolean {
+	// Matches `--enable-features=GlobalShortcutsPortal` (full equality),
+	// a bare `--enable-features` (value in next argv slot or after `=`),
+	// and, by splitting the value on commas, `--enable-features=Foo,Bar`
+	// when flag is `--enable-features=Foo`.
+	for (const arg of argv) {
+		if (arg === flag) return true;
+		if (arg.startsWith(`${flag}=`)) return true;
+		// Comma-separated --enable-features value: match any subkey.
+		if (flag.includes('=')) {
+			const [key, val] = flag.split('=', 2);
+			if (arg.startsWith(`${key}=`)) {
+				const values = arg.slice(key!.length + 1).split(',');
+				if (values.includes(val!)) return true;
+			}
+		}
+	}
+	return false;
+}
--- a/tools/test-harness/src/lib/asar.ts
+++ b/tools/test-harness/src/lib/asar.ts
@@ -0,0 +1,55 @@
+// Read files out of the installed app.asar without on-disk extraction.
+//
+// Used by QE-19 / S09 (verify the KDE-gate string is in the bundled
+// JS) and by future patch-sanity tests for tray.sh / cowork.sh /
+// claude-code.sh patches. Reading via @electron/asar avoids the
+// `npx asar extract /tmp/inspect-installed` dance — same outcome, no
+// temp tree, JSON-grepable from inside a TS spec.
+//
+// Path resolution mirrors lib/electron.ts:resolveInstall(): respect
+// CLAUDE_DESKTOP_APP_ASAR if set, otherwise probe the deb and rpm
+// install locations.
+
+import { extractFile, listPackage } from '@electron/asar';
+import { existsSync } from 'node:fs';
+
+const DEFAULT_ASAR_PATHS = [
+	'/usr/lib/claude-desktop/app.asar',
+	'/opt/Claude/resources/app.asar',
+	'/usr/lib/claude-desktop/node_modules/electron/dist/resources/app.asar',
+	'/opt/Claude/node_modules/electron/dist/resources/app.asar',
+];
+
+export function resolveAsarPath(): string {
+	const env = process.env.CLAUDE_DESKTOP_APP_ASAR;
+	if (env) return env;
+	for (const candidate of DEFAULT_ASAR_PATHS) {
+		if (existsSync(candidate)) return candidate;
+	}
+	throw new Error(
+		'Could not locate app.asar. Set CLAUDE_DESKTOP_APP_ASAR or install ' +
+			'the deb/rpm package.',
+	);
+}
+
+export function readAsarFile(filename: string, asarPath?: string): string {
+	const archive = asarPath ?? resolveAsarPath();
+	const buf = extractFile(archive, filename);
+	return buf.toString('utf8');
+}
+
+export function asarContains(
+	filename: string,
+	needle: string | RegExp,
+	asarPath?: string,
+): boolean {
+	const contents = readAsarFile(filename, asarPath);
+	return typeof needle === 'string'
+		? contents.includes(needle)
+		: needle.test(contents);
+}
+
+export function listAsar(asarPath?: string): string[] {
+	const archive = asarPath ?? resolveAsarPath();
+	return listPackage(archive, { isPack: false });
+}
--- a/tools/test-harness/src/lib/ax.ts
+++ b/tools/test-harness/src/lib/ax.ts
@@ -0,0 +1,255 @@
+// AX-tree loading + traversal primitives — shared substrate for any
+// test that reads from Chromium's accessibility tree.
+//
+// Why this exists
+// ---------------
+// Sessions 1-12 grew two parallel AX consumers without consolidating
+// the loading shape:
+//
+//   1. `lib/claudeai.ts` page-objects (CodeTab.activate, openPill,
+//      clickMenuItem, findCompactPills) carry a private `snapshotAx`
+//      that gates on `waitForAxTreeStable` then calls
+//      `inspector.getAccessibleTree('claude.ai')` and converts via
+//      `axTreeToSnapshot`. Every page-object that polls for a node
+//      rolls its own retryUntil/while loop around that helper.
+//
+//   2. `src/runners/T26_routines_page_renders.spec.ts` re-implemented
+//      the same `snapshotAx` shape inline because the claudeai.ts
+//      version isn't exported. Its leading comment explicitly noted
+//      this was "premature abstraction" at 1 consumer; with 2 it is
+//      threshold-driven extraction.
+//
+// Plus the user reports recurring flake in tests that use the AX tree:
+// queries fire before the relevant subtree is mounted, and individual
+// specs each pick their own retryUntil budget. The proposed
+// `waitForAxNode` primitive collapses the snapshot+find+retry shape
+// into one helper with a single tunable budget per consumer, reducing
+// both the surface area for budget drift and the duplication.
+//
+// What this primitive does
+// ------------------------
+// - `snapshotAx(inspector, opts)` — single AX tree read with the
+//   stability gate. Replaces the duplicated implementations in
+//   `claudeai.ts` (private) and `T26_routines_page_renders.spec.ts`
+//   (inlined). `opts.fast` skips the stability gate for inside-poll
+//   callers (matches the existing claudeai.ts contract).
+// - `waitForAxNode(inspector, predicate, opts)` — repeatedly snapshot
+//   the AX tree and return the first element matching `predicate`,
+//   subject to a timeout. Built against the loops in `CodeTab.activate`
+//   (poll for compact pills), `openPill` (poll for menu items),
+//   `clickMenuItem` (poll for matching menuitem), and T26's pre/post-
+//   click anchor scans. The predicate carries the discrimination
+//   logic the caller already had inline; the primitive owns the
+//   stability-gate + retry loop.
+// - Re-exports `RawElement`, `axTreeToSnapshot`, `waitForAxTreeStable`
+//   from `explore/walker.ts` so consumers don't need to reach across
+//   the lib/explore boundary themselves. The walker stays the source
+//   of truth for the AX-snapshot shape; this file is the runner-
+//   facing surface.
+//
+// Scope boundaries
+// ----------------
+// This is NOT a "wait for surface rendered" registry. The plan-doc
+// proposal mentioned `waitForRenderedSurface(client, surfaceKey)`
+// with a registry of named surface anchors — that's still
+// speculative (no consumer asks for it). When a third consumer
+// emerges that already knows it wants a named surface anchor (e.g.
+// "the Code tab body has mounted"), promote the relevant claudeai.ts
+// page-object into a registry entry. Today, `waitForAxNode` with a
+// predicate covers every observed callsite.
+//
+// This is also NOT a CSS-querySelector primitive. T07 polls the DOM
+// via `document.querySelector('[data-testid=...]')` for the topbar;
+// that's a different abstraction (DOM, not AX) with no extraction
+// signal yet — leave it inline in T07 until a second consumer
+// surfaces.
+
+import type { AxNode, InspectorClient } from './inspector.js';
+import {
+	type RawElement,
+	axTreeToSnapshot,
+	waitForAxTreeStable,
+} from '../../explore/walker.js';
+import { retryUntil } from './retry.js';
+
+// Re-exports for consumer convenience. Anything that today imports
+// `RawElement` / `axTreeToSnapshot` / `waitForAxTreeStable` from
+// `../../explore/walker.js` can switch to this file as the import
+// path. Keeping the walker as the source of truth — these are the
+// runner-facing aliases.
+export type { AxNode } from './inspector.js';
+export {
+	type RawElement,
+	axTreeToSnapshot,
+	waitForAxTreeStable,
+} from '../../explore/walker.js';
+
+// Options for `snapshotAx`, the single import point for the AxNode ->
+// RawElement[] read. (Kept distinct from `axTreeToSnapshot`'s
+// walker-side export so future renames in `explore/walker.ts` don't
+// churn the runner-facing API.)
+export interface SnapshotAxOptions {
+	// Skip the upfront `waitForAxTreeStable` gate. Default false —
+	// i.e. callers gate by default. Pass true inside polling loops
+	// where the gate fights the loop: each iteration would block
+	// waiting for "no node-count change" even when the change we're
+	// polling for is exactly the AX tree updating.
+	//
+	// `waitForAxNode` itself uses fast=true on every iteration after
+	// gating once at the start; consumers calling `snapshotAx` from
+	// inside a hand-rolled loop should do the same.
+	fast?: boolean;
+	// AX-stability gate budget when `fast` is false. Default 10000ms
+	// — matches the existing claudeai.ts/T26 inline implementations.
+	// Increase for cold-cache cases on slow machines.
+	stabilityTimeoutMs?: number;
+	// Renderer URL filter for `inspector.getAccessibleTree`. Default
+	// 'claude.ai'. Tests against a different webContents (find_in_page,
+	// main_window) can override but the AX tree on those is much
+	// simpler — `claude.ai` is the only one current consumers care
+	// about.
+	urlFilter?: string;
+}
+
+// Single AX-tree read, returning the walker's flat RawElement[]
+// snapshot. Identical contract to the private `snapshotAx` formerly in
+// `claudeai.ts` and the inlined one formerly in T26 — extracted here
+// so both consumers share an implementation.
+//
+// Cost: ~800ms when the stability gate hits "stable" on the first
+// pair of reads (interior-loop fast=true callers skip this); a few
+// seconds on cold-cache. The AX tree itself is comparatively cheap
+// to fetch and convert (~50-100ms).
+export async function snapshotAx(
+	inspector: InspectorClient,
+	opts: SnapshotAxOptions = {},
+): Promise<RawElement[]> {
+	if (!opts.fast) {
+		await waitForAxTreeStable(inspector, {
+			minNodes: 1,
+			timeoutMs: opts.stabilityTimeoutMs ?? 10_000,
+		});
+	}
+	const url = opts.urlFilter ?? 'claude.ai';
+	const nodes: AxNode[] = await inspector.getAccessibleTree(url);
+	return axTreeToSnapshot(nodes);
+}
+
+export interface WaitForAxNodeOptions {
+	// Total budget for the polling loop. Default 5000ms — matches the
+	// claudeai.ts / T26 callsites that the primitive replaces. Override
+	// upward for cold-cache or post-click cases (T26 uses 10s post-
+	// click; CodeTab.activate uses 5s default but T16 passes 15s).
+	timeoutMs?: number;
+	// Per-iteration interval. Default 200ms — matches the existing
+	// inline retryUntil({ interval: 200 }) calls. The AX tree fetch
+	// itself dominates the loop cost; a shorter interval gives no
+	// throughput benefit and a longer one delays the resolution.
+	intervalMs?: number;
+	// Renderer URL filter passed through to `snapshotAx`. Default
+	// 'claude.ai'.
+	urlFilter?: string;
+	// Whether to gate on `waitForAxTreeStable` once before entering
+	// the poll loop. Default true. When the caller has just mutated
+	// the page (e.g. clicked a button and is waiting for the
+	// resulting menu to render) the upfront stability gate is what
+	// keeps the first iteration from racing the in-flight render.
+	// After the upfront gate, every iteration uses fast=true so the
+	// loop iterates without re-blocking on stability.
+	stabilityGate?: boolean;
+	// AX-stability gate budget for the upfront `waitForAxTreeStable`
+	// when `stabilityGate` is true. Default 5000ms. Independent from
+	// the outer poll budget — the gate is a hard precondition, not
+	// part of the find loop.
+	stabilityTimeoutMs?: number;
+}
+
+// Poll the AX tree until the predicate matches a node, or the budget
+// runs out. Returns the matched RawElement on success, null on
+// timeout.
+//
+// The predicate runs over RawElement (the walker-snapshot shape) so
+// callers can use the same `el.computedRole === 'button' &&
+// el.accessibleName === 'Code'` form they already have inline. The
+// helper does NOT click the matched node — callers receive the
+// RawElement and can pass `el.backendDOMNodeId` to
+// `inspector.clickByBackendNodeId` if a click follows. Keeping click
+// out of the find primitive lets composite consumers (e.g. "find then
+// click then poll for the menu") chain cleanly.
+//
+// On timeout, returns null. Callers that want a hard fail with a
+// diagnostic should pattern-match `if (!found) throw new Error(...)`
+// — the primitive doesn't throw because some specs surface
+// missing-node as a clean fail with a JSON snapshot attachment
+// rather than an uncaught timeout.
+//
+// There is no diagnostic-label parameter: the signature is just
+// (inspector, predicate, opts). A consumer that wraps a throw around
+// the null return should describe the predicate in its own error
+// message (role, accessible name) so failure logs read meaningfully.
+export async function waitForAxNode(
+	inspector: InspectorClient,
+	predicate: (el: RawElement) => boolean,
+	opts: WaitForAxNodeOptions = {},
+): Promise<RawElement | null> {
+	const stabilityGate = opts.stabilityGate ?? true;
+	if (stabilityGate) {
+		await waitForAxTreeStable(inspector, {
+			minNodes: 1,
+			timeoutMs: opts.stabilityTimeoutMs ?? 5_000,
+		});
+	}
+	return retryUntil(
+		async () => {
+			const elements = await snapshotAx(inspector, {
+				fast: true,
+				urlFilter: opts.urlFilter,
+			});
+			return elements.find(predicate) ?? null;
+		},
+		{
+			timeout: opts.timeoutMs ?? 5_000,
+			interval: opts.intervalMs ?? 200,
+		},
+	);
+}
+
+// Same shape as `waitForAxNode` but returns every match rather than
+// the first. Useful for consumers that want to enumerate all menu
+// items or all compact pills after a stability point — the
+// findCompactPills caller in claudeai.ts is a one-shot snapshot
+// today, but if a consumer needs to wait for "at least one compact
+// pill" plus enumerate the resulting set, this avoids a second
+// round-trip.
+//
+// Returns the non-empty array of matches on success, null on timeout
+// when no element ever matched. A successful call with zero matches
+// is impossible by construction: the loop only resolves once the
+// post-filter array is non-empty.
+export async function waitForAxNodes(
+	inspector: InspectorClient,
+	predicate: (el: RawElement) => boolean,
+	opts: WaitForAxNodeOptions = {},
+): Promise<RawElement[] | null> {
+	const stabilityGate = opts.stabilityGate ?? true;
+	if (stabilityGate) {
+		await waitForAxTreeStable(inspector, {
+			minNodes: 1,
+			timeoutMs: opts.stabilityTimeoutMs ?? 5_000,
+		});
+	}
+	return retryUntil(
+		async () => {
+			const elements = await snapshotAx(inspector, {
+				fast: true,
+				urlFilter: opts.urlFilter,
+			});
+			const matches = elements.filter(predicate);
+			return matches.length > 0 ? matches : null;
+		},
+		{
+			timeout: opts.timeoutMs ?? 5_000,
+			interval: opts.intervalMs ?? 200,
+		},
+	);
+}
--- a/tools/test-harness/src/lib/claudeai.ts
+++ b/tools/test-harness/src/lib/claudeai.ts
@@ -0,0 +1,397 @@
+// claude.ai renderer-UI domain wrapper — single point of coupling to
+// upstream's accessibility tree for tests that drive the renderer.
+//
+// Why centralize: claude.ai's UI ships from a different release train
+// than the Electron shell, so any cross-spec drift would be an N-file
+// fix. Confining the discovery here means the rest of the harness can
+// speak in domain verbs (`activate('Code')`, `openEnvPill()`, …) and
+// we only retune one file when upstream drifts.
+//
+// Discovery substrate is Chromium's accessibility tree
+// (`Accessibility.getFullAXTree` over CDP), shared with the v7 walker.
+// Reading from AX rather than the DOM means the page-objects survive
+// tailwind class regeneration and React-tree restructuring as long as
+// the platform-computed role + accessible name + ancestor landmarks
+// stay stable. See docs/learnings/test-harness-ax-tree-walker.md for
+// the gotchas (AX-enable async lag, post-click stability gating, list
+// virtualization).
+//
+// Discrimination shapes used:
+//   - Top-level tabs: `role: 'button'` whose accessibleName matches
+//     the literal tab label ('Chat' | 'Cowork' | 'Code'). The
+//     `df-pill` tailwind anchor and `aria-label` selector are gone —
+//     the AX-computed name is the durable contract.
+//   - Compact pills (the env pill on Code, the "Select folder…" pill
+//     after Local is chosen): `role: 'button'` with
+//     `hasPopup === 'menu'`, scoped away from the cowork sidebar by
+//     filtering out per-row `^More options for ` triggers. The visible
+//     label is the button's accessibleName.
+//   - Menu items: any of `menuitem` / `menuitemradio` /
+//     `menuitemcheckbox` (collected as MENU_ITEM_ROLES below).
+
+import type { InspectorClient } from './inspector.js';
+import {
+	snapshotAx,
+	waitForAxNode,
+	waitForAxNodes,
+	waitForAxTreeStable,
+} from './ax.js';
+import { retryUntil, sleep } from './retry.js';
+
+// All three CDP-exposed menu-item variants. Caller code wants to treat
+// them uniformly — radios and checkboxes are still "items in an open
+// menu the user can pick".
+const MENU_ITEM_ROLES = new Set<string>([
+	'menuitem',
+	'menuitemradio',
+	'menuitemcheckbox',
+]);
+
+// AccessibleName patterns that indicate a per-row trigger button on
+// the cowork sidebar (~70+ of them on a busy account). They share the
+// same `hasPopup: 'menu'` signal as the compact pills we actually
+// want, so excluding them by name is the load-bearing discriminator.
+const ROW_MORE_OPTIONS_RE = /^More options for /;
+
+// `snapshotAx` and the stability gate are now in `lib/ax.ts` —
+// extracted there in session 13 once T26 had to redefine the same
+// helper inline (two consumers = threshold-driven extraction). Page-
+// objects below import via the lib aliases; consumers outside this
+// file should reach for `lib/ax.ts` directly rather than re-importing
+// through `lib/claudeai.ts`.
+
+// One of the three top-level pills. Click is fire-and-forget — the
+// router rerenders the tab body inline (no URL change on Code), so
+// callers must poll for whatever signal indicates *their* next step is
+// ready (e.g. CodeTab.activate polls for the env pill).
+//
+// AX-tree match: `role: 'button'` with the literal tab name as the
+// accessible name. The visible label and aria-label happen to coincide
+// today, and the AX-computed name follows the same cascade — pinning
+// to the name keeps the page-object durable across the tailwind
+// regenerations that motivated the migration.
+//
+// Pre-click polling budget. Up to session 13, this was a one-shot
+// snapshot — if the tab button hadn't rendered yet when activateTab
+// was called, the function returned `{ clicked: false }` immediately.
+// Session 13's `waitForAxNode` substrate makes "wait for the button to
+// appear" a one-line shape-only change. Default 5000ms matches the
+// `lib/ax.ts` defaults; a caller that wants the old no-retry shape
+// can pass `timeout: 0` (forwarded as `waitForAxNode`'s timeoutMs),
+// though no caller currently does so. T16 passes 15s through
+// `CodeTab.activate({ timeout })`; that value is forwarded here as
+// the pre-click budget and reused for the post-click pill poll. The
+// two phases run separate timers, so one does not eat into the other.
+export async function activateTab(
+	inspector: InspectorClient,
+	name: 'Chat' | 'Cowork' | 'Code',
+	opts: { timeout?: number } = {},
+): Promise<{ clicked: boolean }> {
+	const target = await waitForAxNode(
+		inspector,
+		(el) =>
+			el.computedRole === 'button' && el.accessibleName === name,
+		{ timeoutMs: opts.timeout ?? 5_000 },
+	);
+	if (!target || target.backendDOMNodeId === null) {
+		return { clicked: false };
+	}
+	await inspector.clickByBackendNodeId('claude.ai', target.backendDOMNodeId);
+	return { clicked: true };
+}
+
+// A "compact pill" — the React component used by both the env pill and
+// the "Select folder…" pill. AX shape: `role: 'button'` with
+// `hasPopup === 'menu'`, scoped away from cowork sidebar row triggers
+// (`/^More options for /`). The tailwind `max-w-[Npx]` field used to
+// be carried as a diagnostic in v6; that signal isn't in the AX tree
+// (and it was tailwind-specific, exactly the kind of thing the
+// migration was meant to drop), so it's gone — callers only used it
+// in error messages.
+export interface CompactPill {
+	text: string;
+}
+
+export async function findCompactPills(
+	inspector: InspectorClient,
+): Promise<CompactPill[]> {
+	const elements = await snapshotAx(inspector);
+	return elements
+		.filter(
+			(el) =>
+				el.computedRole === 'button' &&
+				el.hasPopup === 'menu' &&
+				el.accessibleName !== null &&
+				el.accessibleName.length > 0 &&
+				!ROW_MORE_OPTIONS_RE.test(el.accessibleName),
+		)
+		.map((el) => ({ text: el.accessibleName as string }));
+}
+
+// Open a compact pill whose accessibleName matches `labelPattern`.
+// Discrimination: `role: 'button'` AND `hasPopup === 'menu'` AND the
+// AX-computed name passes the regex. The hasPopup gate is what stops
+// us trial-clicking action buttons that happen to share text with a
+// pill — the pill always carries an aria-haspopup contract (it opens
+// a popover) while a same-named action button does not.
+//
+// Polls the AX tree post-click for the menu to render (any role in
+// MENU_ITEM_ROLES). Returns the rendered menu item names so the caller
+// can validate without a second snapshot round-trip.
+export async function openPill(
+	inspector: InspectorClient,
+	labelPattern: RegExp,
+	opts: { timeout?: number } = {},
+): Promise<{ opened: boolean; items: string[] }> {
+	const timeout = opts.timeout ?? 5000;
+	const elements = await snapshotAx(inspector);
+	const target = elements.find(
+		(el) =>
+			el.computedRole === 'button' &&
+			el.hasPopup === 'menu' &&
+			el.accessibleName !== null &&
+			labelPattern.test(el.accessibleName),
+	);
+	if (!target || target.backendDOMNodeId === null) {
+		return { opened: false, items: [] };
+	}
+	await inspector.clickByBackendNodeId('claude.ai', target.backendDOMNodeId);
+	// Menu render is async and the AX tree lags DOM by hundreds of ms
+	// (see docs/learnings/test-harness-ax-tree-walker.md §1). Gate
+	// once on stability post-click, then poll fast — re-gating on every
+	// iteration would burn 800ms+ each cycle waiting for "no change"
+	// when what we want is "menuitems appear".
+	await waitForAxTreeStable(inspector, { minNodes: 1, timeoutMs: 5_000 });
+	const deadline = Date.now() + timeout;
+	while (Date.now() < deadline) {
+		const post = await snapshotAx(inspector, { fast: true });
+		const items = post.filter((el) => MENU_ITEM_ROLES.has(el.computedRole));
+		if (items.length > 0) {
+			return {
+				opened: true,
+				items: items.map((el) => (el.accessibleName ?? '').slice(0, 80)),
+			};
+		}
+		await sleep(100);
+	}
+	return { opened: false, items: [] };
+}
+
+// Click any menuitem (any of MENU_ITEM_ROLES) whose accessibleName
+// matches `textPattern`. Caller opens the menu first. Polls the AX
+// snapshot — menu render is async and the AX tree lags DOM by
+// hundreds of ms.
+//
+// Returns the matched item's text and the full item list at the time
+// of the match — the second is useful for diagnostics when `clicked`
+// is null.
+export async function clickMenuItem(
+	inspector: InspectorClient,
+	textPattern: RegExp,
+	opts: { timeout?: number } = {},
+): Promise<{ clicked: string | null; items: string[] }> {
+	const timeout = opts.timeout ?? 1500;
+	// Caller has just opened a menu — gate once on stability so the
+	// first iteration sees the populated tree, then poll fast for the
+	// match. Same shape as openPill's post-click handling.
+	await waitForAxTreeStable(inspector, { minNodes: 1, timeoutMs: 5_000 });
+	const deadline = Date.now() + timeout;
+	let lastItemNames: string[] = [];
+	while (Date.now() < deadline) {
+		const elements = await snapshotAx(inspector, { fast: true });
+		const items = elements.filter((el) =>
+			MENU_ITEM_ROLES.has(el.computedRole),
+		);
+		lastItemNames = items.map((el) => (el.accessibleName ?? '').slice(0, 80));
+		const match = items.find(
+			(el) =>
+				el.accessibleName !== null && textPattern.test(el.accessibleName),
+		);
+		if (match && match.backendDOMNodeId !== null) {
+			const text = (match.accessibleName ?? '').slice(0, 80);
+			await inspector.clickByBackendNodeId(
+				'claude.ai',
+				match.backendDOMNodeId,
+			);
+			return { clicked: text, items: lastItemNames };
+		}
+		await sleep(100);
+	}
+	return { clicked: null, items: lastItemNames };
+}
+
+// Dispatch an Escape keydown to the document. Used by openEnvPill's
+// trial-click loop to dismiss the menu when the wrong pill was hit.
+// We dispatch on document because the popover trigger may not have
+// retained focus.
+export async function pressEscape(inspector: InspectorClient): Promise<void> {
+	await inspector.evalInRenderer<null>(
+		'claude.ai',
+		`(() => {
+			document.dispatchEvent(new KeyboardEvent('keydown', {
+				key: 'Escape', code: 'Escape', keyCode: 27, which: 27,
+				bubbles: true, cancelable: true,
+			}));
+			return null;
+		})()`,
+	);
+}
+
+// Code tab domain operations. Instance-shaped (carries the inspector)
+// to match QuickEntry / MainWindow in quickentry.ts.
+//
+// Only valid after the renderer has loaded a logged-in claude.ai page;
+// callers should `app.waitForReady('userLoaded')` first. activate()
+// itself doesn't repeat that check — it would just fail to find the
+// Code button on /login, which surfaces as a clear error.
+export class CodeTab {
+	constructor(private readonly inspector: InspectorClient) {}
+
+	// Click the Code tab, then poll up to `timeout` for at least one
+	// compact pill to render. The env pill rendering is the cheapest
+	// signal that the Code-tab body has mounted and is interactive —
+	// the URL doesn't change (route stays `/new` etc.), so we can't
+	// anchor on navigation. Throws on miss with the candidate count for
+	// triage.
+	//
+	// Session 14 migration: the pre-click `activateTab` call now polls
+	// up to `opts.timeout` for the Code button itself to appear (was a
+	// one-shot snapshot prior, the T16 failure mode). The same value
+	// is passed to both phases, each with its own timer, so the worst
+	// case is roughly 2x `timeout`; in practice the click resolves in
+	// well under a second when the Code button is present.
+	async activate(opts: { timeout?: number } = {}): Promise<void> {
+		const timeout = opts.timeout ?? 5000;
+		const result = await activateTab(this.inspector, 'Code', { timeout });
+		if (!result.clicked) {
+			throw new Error(
+				'CodeTab.activate: no AX-tree button with accessibleName="Code" found',
+			);
+		}
+		// Post-click: poll the AX tree for at least one compact pill.
+		// `waitForAxNodes` carries the snapshot+filter+sleep loop
+		// formerly hand-rolled here, with the same per-iteration cadence
+		// (200ms) and overall budget. Predicate matches `findCompactPills`
+		// — `role: 'button'` + `hasPopup: 'menu'` + non-empty
+		// accessibleName + not a per-row "More options for X" trigger.
+		const ready = await waitForAxNodes(
+			this.inspector,
+			(el) =>
+				el.computedRole === 'button' &&
+				el.hasPopup === 'menu' &&
+				el.accessibleName !== null &&
+				el.accessibleName.length > 0 &&
+				!ROW_MORE_OPTIONS_RE.test(el.accessibleName),
+			{ timeoutMs: timeout, intervalMs: 200 },
+		);
+		if (!ready) {
+			throw new Error(
+				`CodeTab.activate: no compact pill rendered within ${timeout}ms ` +
+					`after clicking Code — tab body may not have mounted`,
+			);
+		}
+	}
+
+	// Open the env pill (the compact pill whose menu contains a `^Local`
+	// menuitemradio). Trial-click strategy: for each compact pill, try
+	// opening it and check for the Local item. If absent, dismiss with
+	// Escape and try the next. Necessary because nothing in the DOM
+	// distinguishes the env pill from a future second compact pill at
+	// rest — only the menu contents disambiguate.
+	//
+	// Returns the matched pill's label text and the rendered menu
+	// items. Throws if no candidate yields a Local-bearing menu.
+	async openEnvPill(): Promise<{ pillText: string; items: string[] }> {
+		const pills = await findCompactPills(this.inspector);
+		if (pills.length === 0) {
+			throw new Error(
+				'CodeTab.openEnvPill: no compact pills on the page — ' +
+					'did you call activate() first?',
+			);
+		}
+		// Iterate by label rather than DOM index so we can use openPill
+		// with an exact-text anchor — avoids re-querying ordinals after
+		// each Escape (the DOM may shift).
+		for (const pill of pills) {
+			const labelRe = new RegExp(`^${escapeRegExp(pill.text)}$`);
+			const opened = await openPill(this.inspector, labelRe, { timeout: 1500 });
+			if (!opened.opened) continue;
+			const hasLocal = opened.items.some((t) => /^Local\b/.test(t));
+			if (hasLocal) {
+				return { pillText: pill.text, items: opened.items };
+			}
+			await pressEscape(this.inspector);
+			// Brief settle so the next openPill doesn't race the popover
+			// teardown. 150ms matches the original T17 implementation.
+			await sleep(150);
+		}
+		throw new Error(
+			`CodeTab.openEnvPill: probed ${pills.length} compact pill(s), ` +
+				`none yielded a menu containing /^Local\\b/`,
+		);
+	}
+
+	// Click the `^Local` menuitemradio inside the (already-open) env-pill
+	// menu. The item's name reads "Local, environment settings, right
+	// arrow" because of the SR-only suffix; we anchor on /^Local\b/.
+	async selectLocal(): Promise<void> {
+		const result = await clickMenuItem(this.inspector, /^Local\b/);
+		if (!result.clicked) {
+			throw new Error(
+				`CodeTab.selectLocal: no /^Local\\b/ item in the open menu. ` +
+					`Items: ${JSON.stringify(result.items)}`,
+			);
+		}
+	}
+
+	// Full chain: open env pill → Local → wait for the "Select folder…"
+	// pill to render → open it → click "Open folder…". After this
+	// resolves, dialog.showOpenDialog has been invoked (the caller
+	// installs the mock first and polls getOpenDialogCalls to confirm).
+	//
+	// Each step throws on its own miss with enough metadata to tell
+	// which selector decayed; the caller can wrap the whole chain in
+	// try/catch for partial-state attachment.
+	async openFolderPicker(): Promise<void> {
+		await this.openEnvPill();
+		await this.selectLocal();
+		// The Select-folder pill renders after Local is chosen. Same
+		// CompactPill shape — anchor on the leading "Select folder"
+		// text. 4s budget matches the T17 wait that proved sufficient
+		// in practice on KDE-W.
+		const selectOpened = await retryUntil(
+			async () => {
+				const r = await openPill(this.inspector, /^Select folder/, {
+					timeout: 1000,
+				});
+				return r.opened ? r : null;
+			},
+			{ timeout: 4000, interval: 200 },
+		);
+		if (!selectOpened) {
+			throw new Error(
+				'CodeTab.openFolderPicker: "Select folder…" pill did not ' +
+					'open within 4s after Local was clicked',
+			);
+		}
+		// The Select-folder menu has a "Recent" group (radios — clicking
+		// reuses the past path silently, no dialog) followed by
+		// "Open folder…" (menuitem — fires the picker). Click the
+		// menuitem variant explicitly; clickMenuItem matches all
+		// menuitem* roles, so the leading-text anchor is what
+		// disambiguates here.
+		const openClicked = await clickMenuItem(this.inspector, /^Open folder/);
+		if (!openClicked.clicked) {
+			throw new Error(
+				`CodeTab.openFolderPicker: no /^Open folder/ menuitem in ` +
+					`the Select-folder menu. Items: ${JSON.stringify(openClicked.items)}`,
+			);
+		}
+	}
+}
+
+// Standard "escape regex special chars in a literal string" helper.
+// Used to build an exact-match RegExp from a captured pill label.
+function escapeRegExp(s: string): string {
+	return s.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
+}
--- a/tools/test-harness/src/lib/dbus.ts
+++ b/tools/test-harness/src/lib/dbus.ts
@@ -0,0 +1,40 @@
+import { sessionBus, type MessageBus, type ClientInterface } from 'dbus-next';
+
+let cached: MessageBus | null = null;
+
+export function getSessionBus(): MessageBus {
+	if (!cached) {
+		cached = sessionBus();
+	}
+	return cached;
+}
+
+export async function disconnectBus(): Promise<void> {
+	if (cached) {
+		cached.disconnect();
+		cached = null;
+	}
+}
+
+// dbus-next exposes interface methods as dynamic properties typed loosely. Cast
+// at the call site rather than re-typing every D-Bus interface we touch.
+type DynamicMethod = (...args: unknown[]) => Promise<unknown>;
+
+export function method(iface: ClientInterface, name: string): DynamicMethod {
+	const fn = (iface as unknown as Record<string, DynamicMethod | undefined>)[name];
+	if (typeof fn !== 'function') {
+		throw new Error(`D-Bus method ${name} not found on interface`);
+	}
+	return fn.bind(iface);
+}
+
+export async function getConnectionPid(connectionName: string): Promise<number> {
+	const bus = getSessionBus();
+	const proxy = await bus.getProxyObject(
+		'org.freedesktop.DBus',
+		'/org/freedesktop/DBus',
+	);
+	const iface = proxy.getInterface('org.freedesktop.DBus');
+	const result = await method(iface, 'GetConnectionUnixProcessID')(connectionName);
+	return result as number;
+}
--- a/tools/test-harness/src/lib/diagnostics.ts
+++ b/tools/test-harness/src/lib/diagnostics.ts
@@ -0,0 +1,65 @@
+import { readFile } from 'node:fs/promises';
+import { homedir } from 'node:os';
+import { join } from 'node:path';
+import { execFile } from 'node:child_process';
+import { promisify } from 'node:util';
+
+const exec = promisify(execFile);
+
+const LAUNCHER_LOG = join(
+	homedir(),
+	'.cache/claude-desktop-debian/launcher.log',
+);
+
+export async function readLauncherLog(): Promise<string | null> {
+	try {
+		return await readFile(LAUNCHER_LOG, 'utf8');
+	} catch {
+		return null;
+	}
+}
+
+export interface DoctorResult {
+	output: string;
+	exitCode: number | null;
+}
+
+export async function runDoctor(launcher?: string): Promise<DoctorResult> {
+	const bin = launcher ?? process.env.CLAUDE_DESKTOP_LAUNCHER ?? 'claude-desktop';
+	try {
+		const { stdout, stderr } = await exec(bin, ['--doctor'], { timeout: 15_000 });
+		return {
+			output: `${stdout}\n${stderr}`.trim(),
+			exitCode: 0,
+		};
+	} catch (err) {
+		// --doctor may exit non-zero if checks fail; still return the output
+		// and the actual exit code so T02/T13/S05 can assert against it.
+		const e = err as { stdout?: string; stderr?: string; code?: number };
+		const combined = `${e.stdout ?? ''}\n${e.stderr ?? ''}`.trim();
+		return {
+			output: combined,
+			exitCode: typeof e.code === 'number' ? e.code : null,
+		};
+	}
+}
+
+export function captureSessionEnv(): Record<string, string> {
+	const keys = [
+		'XDG_SESSION_TYPE',
+		'XDG_CURRENT_DESKTOP',
+		'WAYLAND_DISPLAY',
+		'DISPLAY',
+		'GDK_BACKEND',
+		'QT_QPA_PLATFORM',
+		'OZONE_PLATFORM',
+		'ELECTRON_OZONE_PLATFORM_HINT',
+		'CLAUDE_DESKTOP_LAUNCHER',
+	];
+	const out: Record<string, string> = {};
+	for (const k of keys) {
+		const v = process.env[k];
+		if (v !== undefined) out[k] = v;
+	}
+	return out;
+}
--- a/tools/test-harness/src/lib/eipc.ts
+++ b/tools/test-harness/src/lib/eipc.ts
@@ -0,0 +1,413 @@
+// "eipc" channel-registry primitive — runtime discovery of the custom
+// `$eipc_message$_<UUID>_$_<scope>_$_<iface>_$_<method>` handlers
+// registered on each per-webContents IPC scope.
+//
+// Why this exists
+// ---------------
+// Sessions 2-6 of the runner-implementation work treated the eipc
+// registry as unreachable from main: the standard Electron
+// `ipcMain._invokeHandlers` map only carries 3 chat-tab MCP-bridge
+// handlers (`list-mcp-servers`, `connect-to-mcp-server`,
+// `request-open-mcp-settings`); the 700+ `claude.web_$_*` /
+// `claude.settings_$_*` etc. channels were assumed to be closure-
+// local. Session 3's `globalThis` walk came up empty, which kept
+// T22/T31/T33/T38 stuck as Tier 1 asar fingerprints rather than
+// runtime registry probes.
+//
+// Session 7 found the missing piece: handlers DO go through
+// Electron's stdlib `IpcMainImpl` — just not the GLOBAL `ipcMain`
+// instance. Each `webContents` has its own `webContents.ipc` (per-
+// `WebContents` IPC scope, introduced in Electron 21), and that's
+// where every `e.ipc.handle("$eipc_message$_..._$_<scope>_$_<iface>_$_<method>", fn)`
+// call lands. Verified empirically against a debugger-attached
+// running Claude:
+//   - find_in_page wc:    78 handlers (settings/find-in-page only)
+//   - main_window wc:     79 handlers (settings/title-bar only)
+//   - claude.ai wc:      490 handlers (full surface — including
+//                                       117 LocalSessions, 16 CustomPlugins)
+//   - global ipcMain:      3 handlers (the chat-tab MCP-bridge trio)
+//
+// All `claude.web_$_*` interfaces (LocalSessions, CustomPlugins,
+// CoworkSpaces, CoworkArtifacts, CoworkMemory, ClaudeCode, etc.)
+// register on the claude.ai webContents. They're sticky across route
+// changes — once registered (during webContents init), they don't
+// deregister when the user navigates between /chats and /epitaxy.
+// So the wait-for-channel poll just needs claude.ai to be alive +
+// finished initial handler registration, NOT a specific route.
+//
+// What this primitive does
+// ------------------------
+// Read-only enumeration via `getEipcChannels` / `findEipcChannel` /
+// `waitForEipcChannel(s)`. These back the handler PRESENCE checks
+// (T22b / T31b / T33b / T38b), which are strictly stronger than the
+// asar fingerprint (a handler registered at runtime is a handler that
+// actually wired up, not just a string in the bundle).
+//
+// Plus `invokeEipcChannel` (session 8 addition) — calls a registered
+// handler through the renderer-side wrapper at `window['claude.<scope>']
+// .<Iface>.<method>(...args)`. The wrapper is exposed by `mainView.js`
+// preload via `contextBridge.exposeInMainWorld` after a frame + origin
+// gate (top-level frame, origin in `{claude.ai, claude.com,
+// preview.claude.ai, preview.claude.com, localhost}`). Because the
+// `inspector.evalInRenderer('claude.ai', ...)` path runs inside the
+// claude.ai renderer, the wrapper is present and the resulting
+// `IpcMainInvokeEvent` carries an honest `senderFrame`. The alternative
+// of pulling the function out of `_invokeHandlers` and synthesizing a
+// fake event with `senderFrame.url = 'https://claude.ai/'` works (the
+// gates are duck-typed structural checks) but spoofs a security-relevant
+// claim. Going through the wrapper keeps the test surface aligned with
+// the real attack surface.
+//
+// `invokeEipcChannel` is read-by-default but doesn't enforce a
+// read-only allowlist — the safety property is that consumers pass
+// case-doc-anchored suffixes verbatim, which limits the blast radius
+// to whatever the case doc said the test should poke. Don't pass
+// `start*` / `set*` / `write*` / `run*` / `openIn*` suffixes; those
+// mutate user state.
+//
+// Framing opacity
+// ---------------
+// The `$eipc_message$_<UUID>_$_<scope>_$_<iface>_$_<method>` framing
+// has been UUID-stable across builds (session 2 noted
+// `c0eed8c9-c94a-4931-8cc3-3a08694e9863`; session 7 confirmed it's
+// still that, single UUID across all 647 per-wc handlers). The
+// primitive does not pin the UUID — match by suffix so a future
+// build that rotates the UUID doesn't silently break every consuming
+// spec. Suffix matching is also what the case-doc anchors use
+// (`LocalSessions_$_getPrChecks` etc.), so consumers can pass the
+// case-doc string verbatim.
+
+import { retryUntil } from './retry.js';
+import type { InspectorClient } from './inspector.js';
+
+// One handler entry on a webContents. `suffix` is the part after the
+// UUID — `<scope>_$_<iface>_$_<method>` — useful for dedup / display.
+// `fullKey` is the full registry key including the framing prefix and
+// UUID, kept for diagnostic attachments where the raw form matters
+// (drift detection, regression triage). `webContentsId` lets a caller
+// disambiguate when a future scope registers the same suffix on
+// multiple webContents (today only `claude.settings/*` does this and
+// every wc gets the same set; non-issue for current consumers).
+export interface EipcChannel {
+	suffix: string;
+	fullKey: string;
+	webContentsId: number;
+	webContentsUrl: string;
+}
+
+export interface GetEipcChannelsOptions {
+	// Substring match on `webContents.getURL()`. Default: 'claude.ai'.
+	// Pass an empty string to enumerate every webContents.
+	urlFilter?: string;
+	// Optional scope filter — e.g. 'claude.web' to drop settings-
+	// scope handlers. Matched against the segment immediately after
+	// the UUID. Empty / undefined returns all scopes.
+	scope?: string;
+	// Optional interface filter — e.g. 'LocalSessions'. Matched
+	// against the segment after the scope. Empty / undefined returns
+	// all interfaces.
+	iface?: string;
+}
+
+// Internal: shape returned by the inspector eval below. Kept private
+// so the `EipcChannel` interface above is the public type contract.
+interface RawEntry {
+	wcId: number;
+	wcUrl: string;
+	fullKey: string;
+}
+
+// Enumerate every eipc-framed handler key registered on every matching
+// webContents. The UUID is opaque to the caller — only the suffix
+// (`<scope>_$_<iface>_$_<method>`) is exposed via the EipcChannel
+// type. Filtering by `scope` / `iface` happens after the inspector
+// eval (the eval keeps its filter set minimal so a single eval call
+// covers every consumer's needs).
+//
+// Returns an empty array when no matching webContents exists (e.g.
+// the spec called this before claude.ai loaded). Callers that need
+// a "wait until present" semantic should use `waitForEipcChannel`
+// instead.
+export async function getEipcChannels(
+	inspector: InspectorClient,
+	opts: GetEipcChannelsOptions = {},
+): Promise<EipcChannel[]> {
+	const urlFilter = opts.urlFilter ?? 'claude.ai';
+	const raw = await inspector.evalInMain<RawEntry[]>(`
+		const { webContents } = process.mainModule.require('electron');
+		const urlFilter = ${JSON.stringify(urlFilter)};
+		const out = [];
+		for (const wc of webContents.getAllWebContents()) {
+			const url = wc.getURL();
+			if (urlFilter && !url.includes(urlFilter)) continue;
+			const ipc = wc.ipc;
+			const map = ipc && ipc._invokeHandlers;
+			if (!map) continue;
+			const keys = (typeof map.keys === 'function')
+				? Array.from(map.keys())
+				: Object.keys(map);
+			for (const k of keys) {
+				out.push({ wcId: wc.id, wcUrl: url, fullKey: k });
+			}
+		}
+		return out;
+	`);
+
+	// Match the framing prefix and capture the suffix. Anything that
+	// doesn't match (e.g. a non-eipc handler that snuck onto a wc
+	// scope) gets filtered out — only eipc-framed entries are part of
+	// this primitive's contract.
+	const re = /^\$eipc_message\$_[0-9a-f-]+_\$_(.+)$/;
+	const out: EipcChannel[] = [];
+	for (const entry of raw) {
+		const m = re.exec(entry.fullKey);
+		if (!m) continue;
+		const suffix = m[1]!;
+		if (opts.scope) {
+			// Suffix shape: `<scope>_$_<iface>_$_<method>`. Anchor at
+			// the start and require the trailing separator so
+			// 'claude.web' matches but a bare 'web' or 'claude' doesn't.
+			if (!suffix.startsWith(`${opts.scope}_$_`)) continue;
+		}
+		if (opts.iface) {
+			// Interface segment is after the scope — search for
+			// `_$_<iface>_$_` in the suffix. Anchored separators
+			// avoid accidentally matching a method name that happens
+			// to contain the iface string.
+			if (!suffix.includes(`_$_${opts.iface}_$_`)) continue;
+		}
+		out.push({
+			suffix,
+			fullKey: entry.fullKey,
+			webContentsId: entry.wcId,
+			webContentsUrl: entry.wcUrl,
+		});
+	}
+	return out;
+}
+
+export interface FindEipcChannelOptions {
+	// Substring match on `webContents.getURL()`. Default: 'claude.ai'.
+	urlFilter?: string;
+}
+
+// Locate the first registered handler whose suffix ends with
+// `caseDocSuffix`. Designed so callers can pass the case-doc-anchored
+// string verbatim — e.g. `LocalSessions_$_getPrChecks`. Returns null
+// when no match exists (caller decides whether to fail, skip, or
+// retry).
+//
+// This is a one-shot lookup with no polling; for the populate-on-init
+// wait, use `waitForEipcChannel`, which wraps this in a retryUntil.
+export async function findEipcChannel(
+	inspector: InspectorClient,
+	caseDocSuffix: string,
+	opts: FindEipcChannelOptions = {},
+): Promise<EipcChannel | null> {
+	const channels = await getEipcChannels(inspector, {
+		urlFilter: opts.urlFilter,
+	});
+	for (const ch of channels) {
+		if (ch.suffix.endsWith(caseDocSuffix)) return ch;
+	}
+	return null;
+}
+
+export interface WaitForEipcChannelOptions {
+	urlFilter?: string;
+	// Total budget for the poll. Default 15s — the claude.ai
+	// webContents' initial handler registration completes within a
+	// second of `userLoaded` on the dev box, so 15s leaves wide
+	// margin for slow-cache cases.
+	timeoutMs?: number;
+	intervalMs?: number;
+}
+
+// Poll until the named channel is registered, or the budget runs out.
+// Use this when the spec just reached `waitForReady('userLoaded')` —
+// the claude.ai webContents may exist but its handlers might not have
+// finished registering yet. The poll is cheap (one inspector eval per
+// tick + a string scan) so the default interval can be aggressive.
+//
+// Returns the EipcChannel on success, null on timeout. Callers that
+// want a hard fail on timeout should `expect(channel, '...').not.toBeNull()`
+// — the primitive doesn't throw because some specs want to surface
+// missing-handler as a clean fail with diagnostics rather than an
+// uncaught timeout.
+export async function waitForEipcChannel(
+	inspector: InspectorClient,
+	caseDocSuffix: string,
+	opts: WaitForEipcChannelOptions = {},
+): Promise<EipcChannel | null> {
+	return retryUntil(
+		() => findEipcChannel(inspector, caseDocSuffix, opts),
+		{
+			timeout: opts.timeoutMs ?? 15_000,
+			interval: opts.intervalMs ?? 250,
+		},
+	);
+}
+
+// Convenience: resolve a list of case-doc suffixes in one round-trip.
+// Returns a Map keyed by the input suffix so callers can iterate the
+// expected list and report per-suffix presence. Missing suffixes have
+// `null` values.
+//
+// Single inspector call by design — the `getEipcChannels` cost is
+// dominated by the eval round-trip, not the in-process filtering, so
+// batching is strictly cheaper than N calls to `findEipcChannel`.
+export async function findEipcChannels(
+	inspector: InspectorClient,
+	caseDocSuffixes: readonly string[],
+	opts: FindEipcChannelOptions = {},
+): Promise<Map<string, EipcChannel | null>> {
+	const channels = await getEipcChannels(inspector, {
+		urlFilter: opts.urlFilter,
+	});
+	const out = new Map<string, EipcChannel | null>();
+	for (const suffix of caseDocSuffixes) {
+		const hit = channels.find((c) => c.suffix.endsWith(suffix));
+		out.set(suffix, hit ?? null);
+	}
+	return out;
+}
+
+// Wait until ALL of the listed suffixes are registered, or the budget
+// runs out. Useful for trios like T31's side-chat (start/send/stop) —
+// the trio is load-bearing as a unit; partial registration is a fail.
+//
+// Returns the resolved Map on full success. On timeout, returns the
+// last-observed Map (some entries may be null) so callers can surface
+// the partial state in their diagnostic attachment before failing.
+export async function waitForEipcChannels(
+	inspector: InspectorClient,
+	caseDocSuffixes: readonly string[],
+	opts: WaitForEipcChannelOptions = {},
+): Promise<Map<string, EipcChannel | null>> {
+	let lastSnapshot = new Map<string, EipcChannel | null>();
+	const result = await retryUntil(
+		async () => {
+			const snap = await findEipcChannels(
+				inspector,
+				caseDocSuffixes,
+				opts,
+			);
+			lastSnapshot = snap;
+			for (const v of snap.values()) if (v === null) return null;
+			return snap;
+		},
+		{
+			timeout: opts.timeoutMs ?? 15_000,
+			interval: opts.intervalMs ?? 250,
+		},
+	);
+	return result ?? lastSnapshot;
+}
+
+export interface InvokeEipcChannelOptions {
+	// Renderer URL filter. Default 'claude.ai' — the only webContents
+	// whose origin passes the wrapper-exposure gate (`Qc()` in
+	// `mainView.js`: `https://claude.ai`, `https://claude.com`,
+	// preview.*, localhost). The `find_in_page` and `main_window`
+	// webContents register `claude.settings/*` handlers in their
+	// per-wc IPC scope but their renderers run from `file://`, so
+	// `window['claude.settings']` is never exposed there and invocation
+	// through them would need a different (main-side, fake-event)
+	// approach not implemented in this primitive.
+	urlFilter?: string;
+	// Inspector eval timeout. Default = InspectorClient.defaultTimeoutMs
+	// (30s). Read-only handlers like `getMcpServersConfig` /
+	// `readGlobalMemory` / `getAllScheduledTasks` return well within
+	// 1s on a warm app; the 30s budget is for cold-cache cases.
+	timeoutMs?: number;
+}
+
+// Invoke an eipc handler through the renderer-side wrapper at
+// `window['claude.<scope>'].<Iface>.<method>(...args)`. The suffix is
+// resolved against the per-wc registry first (same matching rules as
+// `findEipcChannel` — accepts both fully-qualified
+// `claude.web_$_LocalSessions_$_getPrChecks` and the more concise
+// `LocalSessions_$_getPrChecks`) and the scope/iface/method triplet is
+// pulled from the resolved full suffix.
+//
+// Why through the renderer wrapper, not a direct main-side call:
+// handlers register via `e.ipc.handle(framedName, async (event, args)
+// => { if (!le(event)) throw ...; return A.<method>(args); })` — the
+// origin gate is inlined at registration time (variants `le`/`Vi`/`mm`
+// in the bundle, all duck-typed structural checks against
+// `event.senderFrame.url` and `event.senderFrame.parent === null`).
+// Pulling the function out of `_invokeHandlers` and calling it with a
+// synthesized event whose `senderFrame.url` is `'https://claude.ai/'`
+// works (the gate is structural, not `instanceof`-checked) but spoofs
+// the gate's security claim. The wrapper IS at claude.ai, so the
+// event Electron builds for it carries an honest senderFrame and the
+// test surface matches the real attack surface.
+//
+// Errors:
+// - "no handler registered with suffix": the registry walk returned
+//   nothing matching. Same shape as `findEipcChannel` returning null;
+//   call `waitForEipcChannel` first if your spec needs the
+//   populate-on-init poll.
+// - "eipc namespace missing in renderer: claude.<scope>": the wrapper
+//   isn't exposed on this renderer. Either the urlFilter selected a
+//   webContents whose origin failed `Qc()`, or the build flipped the
+//   scope's exposure gate. Check `evalInRenderer(urlFilter,
+//   'Object.keys(window).filter(k => k.startsWith("claude."))')`.
+// - String-form rejection from the renderer eval: the gate / arg-
+//   validator / result-validator inside the handler closure rejected.
+//   The framed channel name appears in the error message — use it to
+//   pinpoint which handler rejected.
+//
+// Args are JSON-marshaled into the renderer eval. Return value is
+// JSON-deserialized via `evalInRenderer`'s `executeJavaScript` path.
+// Non-JSON-serializable handler returns (Date, Buffer, circular refs)
+// would be mangled by this primitive. None of the current Tier 2
+// case-doc consumers return such shapes; flag it if a future one does.
+export async function invokeEipcChannel<T = unknown>(
+	inspector: InspectorClient,
+	caseDocSuffix: string,
+	args: readonly unknown[] = [],
+	opts: InvokeEipcChannelOptions = {},
+): Promise<T> {
+	const urlFilter = opts.urlFilter ?? 'claude.ai';
+	const channel = await findEipcChannel(inspector, caseDocSuffix, {
+		urlFilter,
+	});
+	if (!channel) {
+		throw new Error(
+			`invokeEipcChannel: no handler registered with suffix ` +
+				`'${caseDocSuffix}' on a webContents matching ` +
+				`'${urlFilter}'`,
+		);
+	}
+	// Full suffix is `<scope>_$_<iface>_$_<method>`. Scope contains a
+	// dot (e.g. claude.web) but the `_$_` separator is unambiguous —
+	// a 3-part split gives [scope, iface, method] cleanly.
+	const parts = channel.suffix.split('_$_');
+	if (parts.length !== 3) {
+		throw new Error(
+			`invokeEipcChannel: bad suffix shape '${channel.suffix}' ` +
+				`(expected '<scope>_$_<iface>_$_<method>')`,
+		);
+	}
+	const [scope, iface, method] = parts;
+	const argsJson = JSON.stringify(args);
+	const js = `(async () => {
+		const ns = window[${JSON.stringify(scope)}];
+		if (!ns) throw new Error(
+			'eipc namespace missing in renderer: ' + ${JSON.stringify(scope)}
+		);
+		const ifaceObj = ns[${JSON.stringify(iface)}];
+		if (!ifaceObj) throw new Error(
+			'eipc interface missing: ' + ${JSON.stringify(iface)} +
+			' (under ' + ${JSON.stringify(scope)} + ')'
+		);
+		const fn = ifaceObj[${JSON.stringify(method)}];
+		if (typeof fn !== 'function') throw new Error(
+			'eipc method not a function: ' + ${JSON.stringify(method)} +
+			' (under ' + ${JSON.stringify(scope)} + '.' + ${JSON.stringify(iface)} + ')'
+		);
+		return await fn.apply(ifaceObj, ${argsJson});
+	})()`;
+	return inspector.evalInRenderer<T>(urlFilter, js, opts.timeoutMs);
+}
--- a/tools/test-harness/src/lib/electron-mocks.ts
+++ b/tools/test-harness/src/lib/electron-mocks.ts
@@ -0,0 +1,206 @@
+// Mock-then-call helpers for side-effecting Electron module APIs.
+//
+// Tests that exercise an Electron egress whose real invocation would
+// touch the host system (open a file manager, launch an editor, show a
+// dialog) install a recorder mock first, then invoke the API via
+// `inspector.evalInMain` and assert against the recorded calls. The
+// pattern strengthens "didn't throw" probes into "the egress was
+// reached + the args flowed through verbatim", with no host side
+// effect.
+//
+// Each helper:
+//   - is idempotent within an Electron lifecycle (guarded by a
+//     globalThis flag so re-installation in retry loops is a no-op),
+//   - records `{ ts, ...args }` into a globalThis call list,
+//   - returns a value matching the real API's documented contract
+//     (void / Promise<boolean> / canned dialog result).
+//
+// The companion `get*Calls()` reader returns `[]` if the mock was
+// never installed (rather than throwing) so pre-install reads in
+// retry loops are cheap.
+//
+// Extracted from `lib/claudeai.ts` once the third helper landed
+// (T17 dialog → T25 showItemInFolder → T24 openExternal). These
+// helpers are not claude.ai-domain — they're generic Electron module
+// patches — so the extraction keeps `claudeai.ts` focused on the AX-
+// tree page-objects and gives future mock-then-call tests an obvious
+// home to add to.
+//
+// Caller pattern: see `runners/T17_folder_picker.spec.ts`,
+// `runners/T25_show_item_in_folder_no_throw.spec.ts`,
+// `runners/T24_open_in_editor_no_throw.spec.ts`.
+
+import type { InspectorClient } from './inspector.js';
+
+// ----- dialog.showOpenDialog -----------------------------------------
+
+// Replace dialog.showOpenDialog with a mock that records every call
+// and returns a canned result. Idempotent — re-installing within the
+// same Electron lifecycle is a no-op (guarded by
+// globalThis.__claudeAiDialogMockInstalled). Mirrors the shape of
+// QuickEntry.installInterceptor (quickentry.ts:86) so callers across
+// libs feel consistent.
+//
+// The first BrowserWindow positional arg is optional in Electron's
+// API, so the mock handles both `showOpenDialog(opts)` and
+// `showOpenDialog(window, opts)` shapes.
+export async function installOpenDialogMock(
+	inspector: InspectorClient,
+	cannedResult: { canceled: boolean; filePaths: string[] } = {
+		canceled: false,
+		filePaths: ['/tmp/claude-test-folder'],
+	},
+): Promise<void> {
+	const canned = JSON.stringify(cannedResult);
+	await inspector.evalInMain<null>(`
+		if (globalThis.__claudeAiDialogMockInstalled) return null;
+		const { dialog } = process.mainModule.require('electron');
+		globalThis.__claudeAiDialogCalls = [];
+		const original = dialog.showOpenDialog.bind(dialog);
+		dialog.showOpenDialog = async function(...args) {
+			const browserWindowArg = args[0]
+				&& typeof args[0] === 'object'
+				&& args[0].constructor
+				&& args[0].constructor.name === 'BrowserWindow';
+			const opts = browserWindowArg ? args[1] : args[0];
+			globalThis.__claudeAiDialogCalls.push({
+				ts: Date.now(),
+				nargs: args.length,
+				title: opts && opts.title,
+				properties: opts && opts.properties,
+			});
+			return ${canned};
+		};
+		void original;
+		globalThis.__claudeAiDialogMockInstalled = true;
+		return null;
+	`);
+}
+
+export interface OpenDialogCall {
+	ts: number;
+	nargs: number;
+	title?: string;
+	properties?: string[];
+}
+
+// Read the recorded call list. Returns [] if the mock was never
+// installed (rather than throwing) — pre-install reads in retry
+// loops stay cheap.
+export async function getOpenDialogCalls(
+	inspector: InspectorClient,
+): Promise<OpenDialogCall[]> {
+	return await inspector.evalInMain<OpenDialogCall[]>(
+		`return globalThis.__claudeAiDialogCalls || []`,
+	);
+}
+
+// ----- shell.showItemInFolder ----------------------------------------
+
+// Replace electron.shell.showItemInFolder with a mock that records
+// every call without performing the underlying DBus FileManager1 /
+// xdg-open dispatch. Same idempotency-flag pattern as
+// installOpenDialogMock.
+//
+// Why mock vs. invoke real: `showItemInFolder` is fire-and-forget on
+// Linux (returns void, no success signal). Invoking it for real opens
+// the host's actual file manager — fine in a click-chain test, but
+// disruptive when the assertion is just "the JS-level call is
+// reachable + accepts a path arg + the IPC layer terminates here".
+// The mock keeps the same assertion shape with no host side effect.
+export async function installShowItemInFolderMock(
+	inspector: InspectorClient,
+): Promise<void> {
+	await inspector.evalInMain<null>(`
+		if (globalThis.__claudeAiShowItemMockInstalled) return null;
+		const { shell } = process.mainModule.require('electron');
+		globalThis.__claudeAiShowItemCalls = [];
+		const original = shell.showItemInFolder.bind(shell);
+		shell.showItemInFolder = function(fullPath) {
+			globalThis.__claudeAiShowItemCalls.push({
+				ts: Date.now(),
+				path: typeof fullPath === 'string' ? fullPath : String(fullPath),
+			});
+			// Return undefined like the real method — callers don't
+			// inspect the return value.
+		};
+		void original;
+		globalThis.__claudeAiShowItemMockInstalled = true;
+		return null;
+	`);
+}
+
+export interface ShowItemInFolderCall {
+	ts: number;
+	path: string;
+}
+
+export async function getShowItemInFolderCalls(
+	inspector: InspectorClient,
+): Promise<ShowItemInFolderCall[]> {
+	return await inspector.evalInMain<ShowItemInFolderCall[]>(
+		`return globalThis.__claudeAiShowItemCalls || []`,
+	);
+}
+
+// ----- shell.openExternal --------------------------------------------
+
+// Replace electron.shell.openExternal with a mock that records every
+// call without performing the underlying xdg-open / scheme-handler
+// dispatch. Same idempotency-flag pattern as installOpenDialogMock /
+// installShowItemInFolderMock.
+//
+// Why mock vs. invoke real: `shell.openExternal` is the single egress
+// for all URL-scheme handoffs (browser, OAuth callback, editor URL
+// schemes like `vscode://file/<path>`). Invoking it for real on a
+// host with the matching scheme handler installed launches the target
+// app (e.g. a full VS Code window) — fine in a click-chain test,
+// disruptive when the assertion is just "the JS-level call is
+// reachable + the URL flowed through verbatim". The mock keeps the
+// same assertion shape with no host side effect.
+//
+// Unlike `showItemInFolder`, `openExternal` is asynchronous. Current
+// Electron docs declare its result as `Promise<void>`, so the canned
+// boolean is a harness convenience rather than the documented
+// contract. The mock must still be async so that callers which
+// `await` or chain `.then()` on the result keep working.
+export async function installOpenExternalMock(
+	inspector: InspectorClient,
+	cannedResult: boolean = true,
+): Promise<void> {
+	const canned = JSON.stringify(cannedResult);
+	await inspector.evalInMain<null>(`
+		if (globalThis.__claudeAiOpenExternalMockInstalled) return null;
+		const { shell } = process.mainModule.require('electron');
+		globalThis.__claudeAiOpenExternalCalls = [];
+		const original = shell.openExternal.bind(shell);
+		shell.openExternal = async function(url, options) {
+			globalThis.__claudeAiOpenExternalCalls.push({
+				ts: Date.now(),
+				url: typeof url === 'string' ? url : String(url),
+				options: options,
+			});
+			// Async like the real method, so awaiting callers keep
+			// working. Electron documents the result as Promise<void>;
+			// the canned boolean is a harness-side convenience.
+			return ${canned};
+		};
+		void original;
+		globalThis.__claudeAiOpenExternalMockInstalled = true;
+		return null;
+	`);
+}
+
+export interface OpenExternalCall {
+	ts: number;
+	url: string;
+	options?: unknown;
+}
+
+export async function getOpenExternalCalls(
+	inspector: InspectorClient,
+): Promise<OpenExternalCall[]> {
+	return await inspector.evalInMain<OpenExternalCall[]>(
+		`return globalThis.__claudeAiOpenExternalCalls || []`,
+	);
+}
--- a/tools/test-harness/src/lib/electron.ts
+++ b/tools/test-harness/src/lib/electron.ts
@@ -0,0 +1,515 @@
+import { spawn, execFile, type ChildProcess } from 'node:child_process';
+import { existsSync, readlinkSync, rmSync } from 'node:fs';
+import { homedir } from 'node:os';
+import { dirname, join } from 'node:path';
+import { promisify } from 'node:util';
+import { sleep, retryUntil } from './retry.js';
+import { findX11WindowByPid } from './wm.js';
+import { InspectorClient } from './inspector.js';
+import { createIsolation, type Isolation } from './isolation.js';
+import { MainWindow, waitForUserLoaded } from './quickentry.js';
+
+const exec = promisify(execFile);
+
+export interface LaunchOptions {
+	extraEnv?: Record<string, string>;
+	args?: string[];
+	// Pass an existing Isolation to share config across multiple
+	// launches in one test (e.g. S35 position-memory across restart).
+	// Pass `null` to opt out of isolation entirely (legacy: shares
+	// ~/.config/Claude with the host). Default: a fresh isolation per
+	// launch, cleaned up on close().
+	isolation?: Isolation | null;
+}
+
+// Tiered readiness levels for waitForReady(). Higher levels include
+// every check from lower levels. Pick the lowest level a test
+// actually needs:
+//   - 'window'      X11 window mapped (no inspector, no renderer state)
+//   - 'mainVisible' main shell BrowserWindow.isVisible() === true
+//   - 'claudeAi'    any claude.ai webContents reachable (may be /login)
+//   - 'userLoaded'  claude.ai URL past /login (lHn() precondition; the
+//                   tightest gate before exercising QE submit paths)
+export type ReadyLevel = 'window' | 'mainVisible' | 'claudeAi' | 'userLoaded';
+
+export interface WaitForReadyOptions {
+	// Overall budget across all levels. Each step consumes from the
+	// remaining budget. Default 90_000ms covers the userLoaded path
+	// (~5-10s startup + main visible + 30s claude.ai load + login
+	// nav) with margin. Override down for cheaper levels.
+	timeout?: number;
+}
+
+export interface WindowReady {
+	wid: string;
+}
+
+export interface MainVisibleReady extends WindowReady {
+	inspector: InspectorClient;
+}
+
+export interface ClaudeAiReady extends MainVisibleReady {
+	// First claude.ai webContents URL observed. Absent if claude.ai
+	// never loaded within the budget — caller can treat as a skip
+	// (host likely not signed in).
+	claudeAiUrl?: string;
+}
+
+export interface UserLoadedReady extends ClaudeAiReady {
+	// claude.ai URL past /login. Absent if the renderer never
+	// navigated past the login page within the budget.
+	postLoginUrl?: string;
+}
+
+// Maps each level to the precise return shape its callers see.
+// Conditional type rather than overloads because the implementation
+// is a single closure with a union return — overloads would require
+// either an unsafe cast or function-declaration overloads, both
+// noisier than this.
+export type ReadyResultFor<L extends ReadyLevel> =
+	L extends 'window' ? WindowReady :
+	L extends 'mainVisible' ? MainVisibleReady :
+	L extends 'claudeAi' ? ClaudeAiReady :
+	L extends 'userLoaded' ? UserLoadedReady :
+	never;
+
+export interface ClaudeApp {
+	process: ChildProcess;
+	pid: number;
+	isolation: Isolation | null;
+	// Populated on close(). When the spawned Electron exits with
+	// non-zero `code` and was NOT killed by us (`signal === null`),
+	// this carries the data so a runner can `testInfo.attach()` the
+	// crash info without us coupling electron.ts to Playwright APIs
+	// or breaking the existing `await app.close()` sites that ignore
+	// the return value. Stays null while the proc is still running.
+	lastExitInfo: { code: number | null; signal: NodeJS.Signals | null } | null;
+	close(): Promise<void>;
+	waitForX11Window(timeoutMs?: number): Promise<string>;
+	attachInspector(timeoutMs?: number): Promise<InspectorClient>;
+	// Tiered "is the app ready for the kind of work this test does"
+	// helper. See ReadyLevel for what each level checks. Throws on
+	// timeout for 'window' / 'mainVisible' (hard-fail levels). For
+	// 'claudeAi' / 'userLoaded', returns with the corresponding field
+	// (claudeAiUrl, postLoginUrl) absent on timeout so callers can
+	// `testInfo.skip()` rather than fail when the host isn't signed in.
+	waitForReady<L extends ReadyLevel>(
+		level: L,
+		opts?: WaitForReadyOptions,
+	): Promise<ReadyResultFor<L>>;
+}
+
+// CDP auth gate: index.pre.js has
+//   uF(process.argv) && !qL() && process.exit(1);
+// where uF matches --remote-debugging-port / --remote-debugging-pipe on argv
+// and qL validates a token in CLAUDE_CDP_AUTH against a hardcoded ed25519
+// public key (signed payload `${timestamp_ms}.${base64(userDataDir)}`,
+// 5-minute TTL). Both Playwright's _electron.launch() and
+// chromium.connectOverCDP() inject --remote-debugging-port=0 and trip the
+// gate. Signing key is upstream's; we can't forge tokens.
+//
+// Workaround: the gate doesn't check --inspect or runtime SIGUSR1 (the
+// "Developer → Enable Main Process Debugger" menu's code path). So we
+// spawn without any debug-port flags (gate stays asleep), wait for the
+// X11 window to appear, then send SIGUSR1 to attach the Node inspector at
+// runtime. From there lib/inspector.ts gives us main-process JS eval,
+// which reaches the renderer via webContents.executeJavaScript() and
+// supports main-process mocks (e.g. dialog.showOpenDialog for T17).
+
+// Default backend: X11 via XWayland. Mirrors launcher-common.sh's
+// build_electron_args() X11 branch (the launcher itself isn't invoked
+// because we spawn Electron directly to keep CLAUDE_CDP_AUTH out of
+// the picture — see the SIGUSR1 attach comment above).
+const LAUNCHER_INJECTED_FLAGS_X11 = [
+	'--disable-features=CustomTitlebar',
+	'--ozone-platform=x11',
+	'--no-sandbox',
+];
+
+// Native-Wayland backend, opted into by CLAUDE_HARNESS_USE_WAYLAND=1.
+// Mirrors launcher-common.sh's Wayland branch (lines 132-135). Tests
+// that need to drive the app under native Wayland (#226 follow-ups,
+// future S07 sweep) flip the harness-level switch and every runner
+// inherits this without per-spec changes.
+const LAUNCHER_INJECTED_FLAGS_WAYLAND = [
+	'--disable-features=CustomTitlebar',
+	'--enable-features=UseOzonePlatform,WaylandWindowDecorations',
+	'--ozone-platform=wayland',
+	'--enable-wayland-ime',
+	'--wayland-text-input-version=3',
+	'--no-sandbox',
+];
+
+const LAUNCHER_INJECTED_ENV: Record<string, string> = {
+	ELECTRON_FORCE_IS_PACKAGED: 'true',
+	ELECTRON_USE_SYSTEM_TITLE_BAR: '1',
+};
+
+// Top-level opt-in: when CLAUDE_HARNESS_USE_WAYLAND=1, every
+// launchClaude() call swaps the X11 flag set for the Wayland one and
+// also exports CLAUDE_USE_WAYLAND=1 into the spawn env (so any in-app
+// path that reads the launcher var stays consistent). Caller-supplied
+// extraEnv still wins — a single test can override per-launch.
+function harnessUseWayland(): boolean {
+	return process.env.CLAUDE_HARNESS_USE_WAYLAND === '1';
+}
+
+const DEFAULT_INSTALL_PATHS = [
+	{
+		electron: '/usr/lib/claude-desktop/node_modules/electron/dist/electron',
+		asar: '/usr/lib/claude-desktop/node_modules/electron/dist/resources/app.asar',
+	},
+	{
+		electron: '/opt/Claude/node_modules/electron/dist/electron',
+		asar: '/opt/Claude/node_modules/electron/dist/resources/app.asar',
+	},
+];
+
+interface AppPaths {
+	electron: string;
+	asar: string;
+}
+
+// Per-launch state needed by the SIGINT/SIGTERM cleanup. Tracks the
+// child proc + isolation root so a Ctrl-C through Playwright doesn't
+// leak Electron processes or the per-launch tmpdir. Stored separately
+// from ClaudeApp so the signal handler doesn't reach into closure
+// internals — `proc` and `root` are everything cleanup needs.
+interface ActiveLaunch {
+	proc: ChildProcess;
+	// Isolation root to remove on signal. null when caller opted out
+	// (`isolation: null`) or supplied a shared handle (`ownsIsolation`
+	// false — that handle's lifetime is the test's, not ours).
+	root: string | null;
+}
+
+const activeLaunches = new Set<ActiveLaunch>();
+let signalHandlersInstalled = false;
+
+// Install once across every launch in the test process. Handler is
+// synchronous: SIGKILL each spawned proc, rmSync each owned isolation
+// root, then re-emit the signal so Playwright's own teardown still
+// runs and the process actually exits. Registering a listener removes
+// Node's default exit-on-signal behavior, so the re-emit is required.
+//
+// Only owns processes/dirs from this module, not anything Playwright
+// itself spawned, so the cleanup is safe to run in parallel with
+// Playwright's teardown.
+function ensureSignalHandlers(): void {
+	if (signalHandlersInstalled) return;
+	signalHandlersInstalled = true;
+	const cleanup = (signal: NodeJS.Signals) => {
+		for (const launch of activeLaunches) {
+			try {
+				launch.proc.kill('SIGKILL');
+			} catch {
+				// proc may already be dead
+			}
+			if (launch.root) {
+				try {
+					rmSync(launch.root, { recursive: true, force: true });
+				} catch {
+					// best-effort — tmpdir cleanup is not load-bearing
+				}
+			}
+		}
+		activeLaunches.clear();
+		// Re-emit so default disposition runs. Removing our handler
+		// first prevents an infinite loop.
+		process.removeListener('SIGINT', sigintHandler);
+		process.removeListener('SIGTERM', sigtermHandler);
+		process.kill(process.pid, signal);
+	};
+	const sigintHandler = () => cleanup('SIGINT');
+	const sigtermHandler = () => cleanup('SIGTERM');
+	process.on('SIGINT', sigintHandler);
+	process.on('SIGTERM', sigtermHandler);
+}
+
+function resolveInstall(): AppPaths {
+	const envBin = process.env.CLAUDE_DESKTOP_ELECTRON;
+	const envAsar = process.env.CLAUDE_DESKTOP_APP_ASAR;
+	if (envBin && envAsar) return { electron: envBin, asar: envAsar };
+	for (const candidate of DEFAULT_INSTALL_PATHS) {
+		if (existsSync(candidate.electron) && existsSync(candidate.asar)) {
+			return candidate;
+		}
+	}
+	throw new Error(
+		'Could not locate claude-desktop install. Set CLAUDE_DESKTOP_ELECTRON ' +
+			'and CLAUDE_DESKTOP_APP_ASAR, or install the deb/rpm package.',
+	);
+}
+
+// Mirrors the pre-launch cleanup in launcher-common.sh (cleanup_orphaned_
+// cowork_daemon + cleanup_stale_lock + cleanup_stale_cowork_socket).
+//
+// When `configDir` is provided (isolated test mode), the SingletonLock
+// path is relative to that dir rather than ~/.config/Claude — the host
+// config is left untouched.
+export async function cleanupPreLaunch(configDir?: string): Promise<void> {
+	try {
+		await exec('pkill', ['-f', 'cowork-vm-service\\.js']);
+	} catch {
+		// pkill returns non-zero when no matches; that's fine.
+	}
+
+	const lockPath = configDir
+		? join(configDir, 'SingletonLock')
+		: join(homedir(), '.config/Claude/SingletonLock');
+	try {
+		const target = readlinkSync(lockPath);
+		const pidMatch = target.match(/-(\d+)$/);
+		if (pidMatch && !existsSync(`/proc/${pidMatch[1]}`)) {
+			rmSync(lockPath, { force: true });
+		}
+	} catch {
+		// Lock doesn't exist or isn't a symlink — both fine.
+	}
+
+	const sockPath = join(
+		process.env.XDG_RUNTIME_DIR ?? '/tmp',
+		'cowork-vm-service.sock',
+	);
+	if (existsSync(sockPath)) {
+		try {
+			rmSync(sockPath, { force: true });
+		} catch {
+			// Stale socket may already be gone.
+		}
+	}
+}
+
+export async function launchClaude(opts: LaunchOptions = {}): Promise<ClaudeApp> {
+	// Isolation default: create a fresh per-launch sandbox unless the
+	// caller passed `null` (legacy ~/.config/Claude) or supplied a
+	// pre-existing handle (shared across multiple launches in one test).
+	let isolation: Isolation | null;
+	let ownsIsolation = false;
+	if (opts.isolation === null) {
+		isolation = null;
+	} else if (opts.isolation) {
+		isolation = opts.isolation;
+	} else {
+		isolation = await createIsolation();
+		ownsIsolation = true;
+	}
+
+	await cleanupPreLaunch(isolation?.configDir);
+	const { electron: electronBin, asar } = resolveInstall();
+	const appDir = dirname(dirname(dirname(dirname(electronBin))));
+
+	const useWayland = harnessUseWayland();
+	const launcherFlags = useWayland
+		? LAUNCHER_INJECTED_FLAGS_WAYLAND
+		: LAUNCHER_INJECTED_FLAGS_X11;
+	// CLAUDE_USE_WAYLAND only when the harness-level gate is on.
+	// Spread BEFORE opts.extraEnv so a single test can override.
+	const waylandEnv: Record<string, string> = useWayland
+		? { CLAUDE_USE_WAYLAND: '1', GDK_BACKEND: 'wayland' }
+		: {};
+
+	const proc = spawn(
+		electronBin,
+		[...launcherFlags, asar, ...(opts.args ?? [])],
+		{
+			cwd: appDir,
+			env: {
+				...process.env,
+				...LAUNCHER_INJECTED_ENV,
+				...(isolation?.env ?? {}),
+				...waylandEnv,
+				...opts.extraEnv,
+				CI: '1',
+			} as Record<string, string>,
+			stdio: 'ignore',
+			detached: false,
+		},
+	);
+
+	if (!proc.pid) {
+		if (ownsIsolation && isolation) await isolation.cleanup();
+		throw new Error('Failed to spawn Electron — no pid');
+	}
+
+	// Register signal handlers + add this launch to the active set so a
+	// Ctrl-C through Playwright SIGKILLs the Electron child and (if we
+	// own the tmpdir) rmSync's the isolation root. Owned-isolation
+	// signal cleanup uses dirname(configHome) — Isolation doesn't
+	// expose `root`, but createIsolation builds configHome as
+	// `<root>/config`, so the parent dir is the tmpdir to remove.
+	ensureSignalHandlers();
+	const isolationRoot =
+		ownsIsolation && isolation ? dirname(isolation.configHome) : null;
+	const launchEntry: ActiveLaunch = { proc, root: isolationRoot };
+	activeLaunches.add(launchEntry);
+
+	// Single-slot inspector tracking. Only one inspector ever attaches
+	// per launch (SIGUSR1 opens port 9229; reusing the port across
+	// re-attaches isn't supported). Stored so close() can release the
+	// WebSocket even if the runner forgets — previously every runner
+	// did `inspector.close(); finally app.close();` and the WS leaked
+	// when an `expect()` between those threw.
+	let trackedInspector: InspectorClient | null = null;
+
+	const waitForX11Window = async (timeoutMs = 15_000): Promise<string> => {
+		const wid = await retryUntil(
+			async () => findX11WindowByPid(proc.pid!),
+			{ timeout: timeoutMs, interval: 250 },
+		);
+		if (!wid) {
+			throw new Error(
+				`X11 window for pid ${proc.pid} did not appear within ${timeoutMs}ms`,
+			);
+		}
+		return wid;
+	};
+
+	const attachInspector = async (timeoutMs = 15_000): Promise<InspectorClient> => {
+		// Send SIGUSR1 to open the Node inspector at runtime — same code
+		// path as Developer → Enable Main Process Debugger menu item.
+		// Then poll http://127.0.0.1:9229/json/list until it answers.
+		process.kill(proc.pid!, 'SIGUSR1');
+		const start = Date.now();
+		let lastErr: unknown = null;
+		while (Date.now() - start < timeoutMs) {
+			try {
+				const client = await InspectorClient.connect(9229);
+				trackedInspector = client;
+				return client;
+			} catch (err) {
+				lastErr = err;
+				await sleep(250);
+			}
+		}
+		throw new Error(
+			`Inspector did not become ready on port 9229 within ${timeoutMs}ms: ${
+				lastErr instanceof Error ? lastErr.message : String(lastErr)
+			}`,
+		);
+	};
+
+	const waitForReady = async (
+		level: ReadyLevel,
+		opts: WaitForReadyOptions = {},
+	): Promise<WindowReady | MainVisibleReady | ClaudeAiReady | UserLoadedReady> => {
+		const overall = opts.timeout ?? 90_000;
+		const start = Date.now();
+		// Each step uses the remaining overall budget rather than
+		// a fixed per-step timeout. If startup is slow, downstream
+		// steps still get whatever's left; if startup is fast, the
+		// later steps inherit the unused margin.
+		const remaining = () => Math.max(0, overall - (Date.now() - start));
+
+		const wid = await waitForX11Window(remaining());
+		if (level === 'window') return { wid };
+
+		const inspector = await attachInspector(remaining());
+
+		// 'mainVisible' — the main shell BrowserWindow has been
+		// shown. MainWindow.getState() resolves the window via
+		// claude.ai webContents, so this poll implicitly also
+		// requires that webContents to exist; the explicit
+		// 'claudeAi' step below is for the URL-list signal that
+		// some tests want even when window visibility is incidental.
+		const mainWin = new MainWindow(inspector);
+		const visibleState = await retryUntil(
+			async () => {
+				const s = await mainWin.getState();
+				return s && s.visible ? s : null;
+			},
+			{ timeout: remaining(), interval: 250 },
+		);
+		if (!visibleState) {
+			throw new Error(
+				`waitForReady('${level}'): main window did not become ` +
+					`visible within ${overall}ms`,
+			);
+		}
+		if (level === 'mainVisible') return { wid, inspector };
+
+		// 'claudeAi' — a claude.ai-domain webContents exists in
+		// the registry. May still be on /login. Soft-fails on
+		// timeout: returns without claudeAiUrl so the caller
+		// can skip (host likely not signed in).
+		const claudeAiUrl = await retryUntil(
+			async () => {
+				const all = await inspector.evalInMain<{ url: string }[]>(`
+					const { webContents } = process.mainModule.require('electron');
+					return webContents.getAllWebContents().map(w => ({ url: w.getURL() }));
+				`);
+				return all.find((w) => w.url.includes('claude.ai'))?.url ?? null;
+			},
+			{ timeout: remaining(), interval: 500 },
+		);
+		if (!claudeAiUrl) {
+			return { wid, inspector };
+		}
+		if (level === 'claudeAi') return { wid, inspector, claudeAiUrl };
+
+		// 'userLoaded' — URL past /login. Necessary precondition
+		// for upstream's lHn() (`!user.isLoggedOut`) returning
+		// true, which gates Ko.show() in the shortcut handler.
+		// NOT sufficient on its own — main-process user state
+		// loads on a separate timeline from the renderer URL,
+		// so QE submit paths still need openAndWaitReady's
+		// retry loop on top of this.
+		const postLoginUrl =
+			(await waitForUserLoaded(inspector, remaining())) ?? undefined;
+		return { wid, inspector, claudeAiUrl, postLoginUrl };
+	};
+
+	const app: ClaudeApp = {
+		process: proc,
+		pid: proc.pid,
+		isolation,
+		lastExitInfo: null,
+		async close() {
+			// Drop the inspector first — InspectorClient.close() is now
+			// idempotent (see lib/inspector.ts) so the runner-side
+			// `inspector.close()` calls keep working even when this
+			// fires too. Wrapped in try/catch because a thrown ws.close
+			// shouldn't block the proc/iso cleanup below.
+			if (trackedInspector) {
+				try {
+					trackedInspector.close();
+				} catch {
+					// already closed
+				}
+				trackedInspector = null;
+			}
+
+			if (proc.exitCode === null && proc.signalCode === null) {
+				const exited = new Promise<void>((r) => proc.once('exit', () => r()));
+				proc.kill('SIGTERM');
+				await Promise.race([exited, sleep(5000)]);
+				if (proc.exitCode === null && proc.signalCode === null) {
+					proc.kill('SIGKILL');
+					// Wait for the exit so lastExitInfo below records the signal.
+					await exited;
+				}
+			}
+
+			// Capture exit info BEFORE iso cleanup. Runners can attach
+			// app.lastExitInfo to testInfo when non-null + signal === null
+			// (we didn't kill it, so a non-zero code means a real crash).
+			app.lastExitInfo = {
+				code: proc.exitCode,
+				signal: proc.signalCode,
+			};
+
+			activeLaunches.delete(launchEntry);
+			if (ownsIsolation && isolation) {
+				await isolation.cleanup();
+			}
+		},
+		waitForX11Window,
+		attachInspector,
+		// TS can't verify a closure with a union return matches the
+		// generic conditional signature, even though the runtime
+		// branches do produce the right shape per level. The cast
+		// preserves the public contract.
+		waitForReady: waitForReady as ClaudeApp['waitForReady'],
+	};
+	return app;
+}
--- a/tools/test-harness/src/lib/env.ts
+++ b/tools/test-harness/src/lib/env.ts
@@ -0,0 +1,30 @@
+export interface DesktopEnv {
+	desktop: string;
+	sessionType: string;
+	isWayland: boolean;
+	isX11: boolean;
+	isKDE: boolean;
+	isGNOME: boolean;
+	isSWAY: boolean;
+	isHYPR: boolean;
+	isNIRI: boolean;
+	row: string;
+}
+
+export function getEnv(): DesktopEnv {
+	const desktop = process.env.XDG_CURRENT_DESKTOP ?? '';
+	const sessionType = process.env.XDG_SESSION_TYPE ?? '';
+	const upper = desktop.toUpperCase();
+	return {
+		desktop,
+		sessionType,
+		isWayland: sessionType === 'wayland',
+		isX11: sessionType === 'x11',
+		isKDE: upper.includes('KDE'),
+		isGNOME: upper.includes('GNOME'),
+		isSWAY: upper.includes('SWAY'),
+		isHYPR: upper.includes('HYPRLAND'),
+		isNIRI: upper.includes('NIRI'),
+		row: process.env.ROW ?? 'KDE-W',
+	};
+}
--- a/tools/test-harness/src/lib/host-claude.ts
+++ b/tools/test-harness/src/lib/host-claude.ts
@@ -0,0 +1,111 @@
+// Detect-and-kill any running Claude Desktop process owned by the
+// current user. Used before seeding a hermetic isolation from the
+// host config, because Cookies (SQLite) and Local Storage / IndexedDB
+// (LevelDB) all hold writer locks while the host app is running — a
+// naive cp would either copy a torn page or fail outright on the
+// LevelDB LOCK file.
+//
+// SIGTERM first, wait up to 5s for graceful exit, SIGKILL survivors.
+// Loud stderr output: the user needs to know we're force-quitting
+// their app so they can blame us, not Claude Desktop, when their
+// unsaved chat draft disappears.
+
+import { execFile } from 'node:child_process';
+import { promisify } from 'node:util';
+import { sleep } from './retry.js';
+
+const exec = promisify(execFile);
+
+// Patterns that match host installs (deb, rpm, AppImage, dev tree).
+// argv-based via `pgrep -f`: matches the installed binary path or
+// the mounted AppImage path. The harness's own launches always set
+// XDG_CONFIG_HOME to a tmpdir, so they wouldn't be confused with
+// the host even if the patterns overlapped — but kill runs BEFORE
+// our launch, so at this moment there's nothing of ours to confuse.
+const HOST_PROCESS_PATTERNS = [
+	'/usr/lib/claude-desktop/',
+	'/opt/Claude/',
+	'\\.mount_[Cc]laude',
+	'/usr/bin/claude-desktop',
+];
+
+// Per-pid graceful-exit budget. Electron flushes LevelDB + checkpoints
+// the SQLite WAL on SIGTERM; 5s covers a typical shutdown with margin.
+const SIGTERM_GRACE_MS = 5_000;
+const POLL_INTERVAL_MS = 200;
+
+interface HostProcess {
+	pid: number;
+	argv: string;
+}
+
+async function findHostProcesses(): Promise<HostProcess[]> {
+	const pattern = HOST_PROCESS_PATTERNS.join('|');
+	try {
+		const { stdout } = await exec('pgrep', ['-af', pattern]);
+		return stdout
+			.split('\n')
+			.filter(Boolean)
+			.map((line) => {
+				const space = line.indexOf(' ');
+				const pid = Number(space === -1 ? line : line.slice(0, space));
+				const argv = space === -1 ? '' : line.slice(space + 1);
+				return { pid, argv };
+			})
+			.filter((p) => Number.isInteger(p.pid) && p.pid > 0 && p.pid !== process.pid);
+	} catch {
+		// pgrep exits 1 on no match (happy path); a missing pgrep also lands here.
+		return [];
+	}
+}
+
+function isAlive(pid: number): boolean {
+	try {
+		// Signal 0: existence check, no signal delivered.
+		process.kill(pid, 0);
+		return true;
+	} catch {
+		return false;
+	}
+}
+
+export async function killHostClaude(): Promise<void> {
+	const procs = await findHostProcesses();
+	if (procs.length === 0) return;
+
+	process.stderr.write(
+		`host-claude: ${procs.length} running Claude process(es) found; ` +
+			'sending SIGTERM (auth-state seed needs writer-lock release):\n',
+	);
+	for (const { pid, argv } of procs) {
+		process.stderr.write(`  pid=${pid} ${argv.slice(0, 120)}\n`);
+		try {
+			process.kill(pid, 'SIGTERM');
+		} catch {
+			// Race: already exited between pgrep and now.
+		}
+	}
+
+	const deadline = Date.now() + SIGTERM_GRACE_MS;
+	while (Date.now() < deadline) {
+		if (!procs.some((p) => isAlive(p.pid))) return;
+		await sleep(POLL_INTERVAL_MS);
+	}
+
+	const survivors = procs.filter((p) => isAlive(p.pid));
+	if (survivors.length === 0) return;
+
+	process.stderr.write(
+		`host-claude: ${survivors.length} survived SIGTERM; sending SIGKILL:\n`,
+	);
+	for (const { pid } of survivors) {
+		process.stderr.write(`  pid=${pid}\n`);
+		try {
+			process.kill(pid, 'SIGKILL');
+		} catch {
+			// Race: already exited.
+		}
+	}
+	// Final beat so /proc entries clear before the seed copy starts.
+	await sleep(POLL_INTERVAL_MS);
+}
--- a/tools/test-harness/src/lib/input-niri.ts
+++ b/tools/test-harness/src/lib/input-niri.ts
@@ -0,0 +1,393 @@
+// Focus-shifter primitive for "Quick Entry shortcut fires from any
+// focus" (S14) on Niri sessions — the Wayland-native sibling of
+// lib/input.ts. The runner needs to (a) spawn a sacrificial window
+// with a known title, (b) shove keyboard focus to it, then (c) press
+// the global shortcut and observe whether the QE popup appears
+// regardless of focus.
+//
+// Niri only — by design.
+//   - There is no portable focus-injection on native Wayland. Each
+//     compositor exposes a different IPC: niri msg here, swaymsg for
+//     Sway, hyprctl for Hyprland, riverctl for River. The libei-based
+//     "input emulation" portal is the long-term cross-compositor
+//     answer but isn't widely deployed (KDE/GNOME are getting it,
+//     niri/sway/hypr are not yet). We pay one file per compositor
+//     until a second consumer surfaces the dispatcher need; a
+//     hypothetical lib/input-wayland.ts would just switch on
+//     XDG_CURRENT_DESKTOP and delegate. With only S14 consuming this,
+//     a dispatcher would be ceremony.
+//   - lib/input.ts (X11) and this file are independent: they don't
+//     share a focus-id type — niri window IDs are u64 numerics, X11
+//     WIDs are hex strings. Callers handle one or the other based on
+//     session detection; nothing crosses the boundary.
+//
+// Why niri msg --json over plain text: the niri wiki explicitly
+// contracts the JSON output as stable while the plain-text form is
+// described as unstable / human-readable-only. A test harness that
+// regex-greps human-readable IPC output is one niri release away
+// from a quiet break.
+//
+// Why we verify post-focus via niri msg focused-window: niri msg
+// action focus-window exits 0 even when the focus didn't actually
+// land (the action queues into the compositor and a competing input
+// event or a closing window can race it). The only honest answer is
+// to read focused-window back out and compare IDs. This mirrors
+// lib/input.ts's xprop-readback paragraph but for niri's IPC. ~3s
+// budget covers slow compositor paths; anything beyond is a refusal
+// not a slow ack — surface as an error so S14 sees it.
+//
+// Why foot for the marker terminal: it's the niri-default in many
+// distros (Fedora niri spin, several Arch derivatives), accepts
+// --title <T> verbatim with no de-escaping surprises, and ships in
+// most niri setups so a single binary covers the common case. We
+// deliberately don't fall back to alacritty / kitty — the X11
+// primitive uses xterm-only and the simplicity is worth more than
+// the marginal robustness; an environment without foot can install
+// it the same way an X11 environment without xterm installs xterm.
+//
+// Why detached:false on the marker spawn: keep the foot child in the
+// parent's process group so group-wide signals (Ctrl-C) reach it too.
+// (Session 5 recon sketched detached:true; lib/input.ts uses
+// detached:false and is the safer pattern — a leaked terminal past a
+// crashed test run is worse than a marker that dies cleanly with its
+// parent.)
+//
+// No fixed sleeps. The verification poll uses retryUntil so a fast
+// compositor finishes in ~50ms while a slow one gets the full budget.
+
+import { execFile } from 'node:child_process';
+import { promisify } from 'node:util';
+import { retryUntil } from './retry.js';
+
+const exec = promisify(execFile);
+
+// Caller catches this and calls test.skip() — it's an environment
+// gap (not a Niri session, or niri msg not on PATH), not a
+// regression. Subclassing Error gives consumers a clean
+// `instanceof` check without parsing message strings.
+export class NiriIpcUnavailable extends Error {
+	constructor(message?: string) {
+		super(
+			message ??
+				'niri msg IPC unavailable: either this is not a Niri ' +
+					'session (XDG_CURRENT_DESKTOP !== "niri") or the ' +
+					'`niri` binary is missing from PATH. Install the ' +
+					'`niri-ipc` / `niri` package, or skip on this row.',
+		);
+		this.name = 'NiriIpcUnavailable';
+	}
+}
+
+// Mirrors lib/input.ts's XdotoolUnavailable — the install command is
+// the actually-useful part of the error. Consumers should usually
+// skip rather than fail; the absence of foot is an environment
+// configuration issue, not a Claude Desktop regression.
+export class FootUnavailable extends Error {
+	constructor(message?: string) {
+		super(
+			message ??
+				'foot binary not found on PATH. Install with ' +
+					'`dnf install foot` / `apt install foot`.',
+		);
+		this.name = 'FootUnavailable';
+	}
+}
+
+// Single source of truth for the Niri / not-Niri branch. Pure env
+// check, no process spawn — matches the simplicity of isX11Session()
+// in lib/input.ts. A `niri msg version` probe would be more
+// authoritative (catches the case where someone manually overrides
+// XDG_CURRENT_DESKTOP) but adds a fork-per-call cost that's
+// disproportionate to how rare the override is in practice.
+//
+// The literal string 'niri' is the value niri itself sets in
+// XDG_CURRENT_DESKTOP per its own documentation; we trust that and
+// nothing else (no case-folding, no startswith).
+export function isNiriSession(): boolean {
+	return process.env.XDG_CURRENT_DESKTOP === 'niri';
+}
+
+// Niri's --json output for several IPC calls is wrapped in a
+// Result-style envelope: `{"Ok": <payload>}`. Other niri versions
+// return the bare payload instead. Defensively unwrap one
+// layer of `.Ok` if present, then return the payload as-is. Returns
+// null if the input is null/undefined.
+function unwrapOk(value: unknown): unknown {
+	if (value === null || value === undefined) return null;
+	if (typeof value === 'object' && value !== null && 'Ok' in value) {
+		return (value as { Ok: unknown }).Ok;
+	}
+	return value;
+}
+
+// Shape of a niri window row, restricted to the fields we use. The
+// real schema has more (workspace_id, is_floating, etc.) — we don't
+// commit to those.
+interface NiriWindow {
+	id: number;
+	title: string | null;
+	app_id: string | null;
+	is_focused?: boolean;
+}
+
+// Read the currently-focused niri window via `niri msg --json
+// focused-window`.
+//
+// Returns null on:
+//   - Non-Niri session (gated out by isNiriSession()).
+//   - niri binary missing / spawn ENOENT — analogous to lib/input.ts
+//     returning null on xprop spawn failure rather than throwing.
+//     focusOtherWindow's poll fails through to its own timeout.
+//   - JSON parse failure or unexpected shape (defensive — should
+//     not happen against a healthy niri but the cost of a null
+//     return is one re-poll).
+//   - No focused window (e.g. all workspaces empty).
+export async function getFocusedWindowId(): Promise<number | null> {
+	if (!isNiriSession()) return null;
+	let stdout: string;
+	try {
+		({ stdout } = await exec('niri', [
+			'msg',
+			'--json',
+			'focused-window',
+		]));
+	} catch {
+		return null;
+	}
+	const trimmed = stdout.trim();
+	if (!trimmed) return null;
+	let parsed: unknown;
+	try {
+		parsed = JSON.parse(trimmed);
+	} catch {
+		return null;
+	}
+	// Two known wrappings: `{Ok: {FocusedWindow: <window>}}` (older)
+	// and the bare window object (newer). Try unwrapping in order.
+	const okUnwrapped = unwrapOk(parsed);
+	let candidate: unknown = okUnwrapped;
+	if (
+		typeof okUnwrapped === 'object' &&
+		okUnwrapped !== null &&
+		'FocusedWindow' in okUnwrapped
+	) {
+		candidate = (okUnwrapped as { FocusedWindow: unknown }).FocusedWindow;
+	}
+	if (
+		typeof candidate !== 'object' ||
+		candidate === null ||
+		!('id' in candidate)
+	) {
+		return null;
+	}
+	const id = (candidate as { id: unknown }).id;
+	if (typeof id !== 'number' || !Number.isFinite(id)) return null;
+	return id;
+}
+
+// Resolve a window title to its niri ID via `niri msg --json
+// windows`. The list is `Vec<Window>`; we filter on title match AND
+// app_id !== 'Claude' so we never accidentally pick the test target
+// itself. Returns null on zero matches; returns the first match's
+// ID on multi-match (mirrors xdotool's first-match behavior in
+// lib/input.ts).
+async function resolveWindowIdByTitle(
+	title: string,
+): Promise<number | null> {
+	const { stdout } = await exec('niri', ['msg', '--json', 'windows']);
+	const trimmed = stdout.trim();
+	if (!trimmed) return null;
+	let parsed: unknown;
+	try {
+		parsed = JSON.parse(trimmed);
+	} catch {
+		return null;
+	}
+	// Same Ok-wrapping defense as getFocusedWindowId.
+	const unwrapped = unwrapOk(parsed);
+	if (!Array.isArray(unwrapped)) return null;
+	for (const row of unwrapped as NiriWindow[]) {
+		if (
+			row &&
+			typeof row === 'object' &&
+			typeof row.id === 'number' &&
+			row.title === title &&
+			row.app_id !== 'Claude'
+		) {
+			return row.id;
+		}
+	}
+	return null;
+}
+
+// Shift Niri focus to the first window whose title matches `title`
+// and whose app_id is not 'Claude' (so we never target Claude's own
+// window), then verify the shift actually took.
+//
+// Throws:
+//   - NiriIpcUnavailable when not a Niri session, or niri binary
+//     missing.
+//   - Plain Error when no window matches (caller's bug — forgot to
+//     spawn the marker, or used the wrong title).
+//   - Plain Error when niri msg exits 0 but focused-window never
+//     reflects the focus change within ~3s (compositor refused the
+//     activation; this is the diagnostic path S14 wants surfaced,
+//     not swallowed).
+export async function focusOtherWindow(title: string): Promise<void> {
+	if (!isNiriSession()) {
+		throw new NiriIpcUnavailable();
+	}
+
+	let targetId: number | null;
+	try {
+		targetId = await resolveWindowIdByTitle(title);
+	} catch (err) {
+		const e = err as { code?: string | number };
+		if (e.code === 'ENOENT') throw new NiriIpcUnavailable();
+		throw err;
+	}
+	if (targetId === null) {
+		throw new Error(
+			`focusOtherWindow: no Niri window matches title ${JSON.stringify(title)} ` +
+				'(with app_id != "Claude"). Did the marker window finish ' +
+				'mapping? Caller should await spawnMarkerWindow + a short ' +
+				'readiness poll before calling focusOtherWindow.',
+		);
+	}
+
+	try {
+		await exec('niri', [
+			'msg',
+			'action',
+			'focus-window',
+			'--id',
+			String(targetId),
+		]);
+	} catch (err) {
+		const e = err as { code?: string | number };
+		if (e.code === 'ENOENT') throw new NiriIpcUnavailable();
+		throw err;
+	}
+
+	const matched = await retryUntil(
+		async () => {
+			const active = await getFocusedWindowId();
+			return active === targetId ? true : null;
+		},
+		{ timeout: 3_000, interval: 100 },
+	);
+	if (!matched) {
+		throw new Error(
+			'focusOtherWindow: niri msg action focus-window returned 0 ' +
+				`but focused-window never settled to id=${targetId} ` +
+				`for title ${JSON.stringify(title)}. Compositor may have ` +
+				'refused the activation request.',
+		);
+	}
+}
+
+// Handle returned from spawnMarkerWindow. Lifecycle is owned by the
+// caller — the test that spawned it must kill() in afterEach (or
+// equivalent), otherwise the foot terminal leaks past the test run.
+export interface MarkerWindow {
+	pid: number;
+	title: string;
+	kill(): Promise<void>;
+}
+
+// Spawn a long-lived foot terminal with a known title, suitable as
+// a focus target on a Niri session. Backgrounded with detached:false
+// so it stays in the test's process group: a group-wide signal such
+// as Ctrl-C reaches it, and `sleep 600` bounds any leak after a crash.
+//
+// Throws FootUnavailable if foot isn't on PATH (both at spawn-throw
+// time AND via the 'error' event, mirroring lib/input.ts's redundant
+// ENOENT handling — Node delivers ENOENT through different paths
+// across versions).
+export async function spawnMarkerWindow(
+	title: string,
+): Promise<MarkerWindow> {
+	const { spawn } = await import('node:child_process');
+
+	let child: import('node:child_process').ChildProcess;
+	try {
+		// `sleep 600` keeps the foot terminal alive for 10min — longer
+		// than any reasonable single test, short enough that a leaked
+		// terminal self-cleans within the sweep. foot's --title sets
+		// the window title field that niri's windows list reports.
+		child = spawn('foot', ['--title', title, '-e', 'sleep', '600'], {
+			detached: false,
+			stdio: 'ignore',
+		});
+	} catch (err) {
+		const e = err as { code?: string | number };
+		if (e.code === 'ENOENT') {
+			throw new FootUnavailable();
+		}
+		throw err;
+	}
+
+	const earlyError = await new Promise<Error | null>((resolve) => {
+		const onError = (err: Error) => {
+			child.removeListener('spawn', onSpawn);
+			resolve(err);
+		};
+		const onSpawn = () => {
+			child.removeListener('error', onError);
+			resolve(null);
+		};
+		child.once('error', onError);
+		child.once('spawn', onSpawn);
+	});
+	if (earlyError) {
+		const e = earlyError as Error & { code?: string | number };
+		if (e.code === 'ENOENT') {
+			throw new FootUnavailable();
+		}
+		throw earlyError;
+	}
+
+	const pid = child.pid;
+	if (typeof pid !== 'number') {
+		throw new Error(
+			'spawnMarkerWindow: child.pid was undefined after spawn',
+		);
+	}
+
+	let killed = false;
+	const kill = async (): Promise<void> => {
+		if (killed) return;
+		killed = true;
+		if (child.exitCode !== null || child.signalCode !== null) {
+			return;
+		}
+		// SIGTERM with a short grace period before SIGKILL. foot
+		// honors SIGTERM cleanly; the SIGKILL fallback is for the
+		// pathological "child wedged in a syscall" case.
+		const exited = new Promise<void>((resolve) => {
+			child.once('exit', () => resolve());
+		});
+		try {
+			child.kill('SIGTERM');
+		} catch {
+			// Process may have died between the check and the kill.
+		}
+		const graceMs = 500;
+		const timedOut = await Promise.race([
+			exited.then(() => false),
+			new Promise<boolean>((resolve) =>
+				setTimeout(() => resolve(true), graceMs),
+			),
+		]);
+		if (timedOut) {
+			try {
+				child.kill('SIGKILL');
+			} catch {
+				// Already dead.
+			}
+			await exited;
+		}
+	};
+
+	return { pid, title, kill };
+}
--- a/tools/test-harness/src/lib/input.ts
+++ b/tools/test-harness/src/lib/input.ts
@@ -0,0 +1,346 @@
+// Focus-shifter primitive for "Quick Entry shortcut fires from any focus"
+// (S11, S14). The runner needs to (a) spawn a sacrificial window with
+// a known title, (b) shove keyboard focus to it, then (c) press the
+// global shortcut and observe whether the QE popup appears regardless
+// of focus.
+//
+// X11 only — by design.
+//   - There is no portable focus-injection on native Wayland. Each
+//     compositor exposes its own IPC (swaymsg, riverctl, hyprctl,
+//     niri msg) and the libei-based "input emulation" portal isn't
+//     universally honored. Rather than bake a per-compositor matrix
+//     into the harness, runners on native Wayland rows must skip
+//     this test entirely. WaylandFocusUnavailable is the signal.
+//   - Wayland-with-XWayland (KDE-W default, Ubu-W default, GNOME-W
+//     when XDG_SESSION_TYPE=x11 is forced) is *not* an X11 session
+//     for our purposes — the WAYLAND-SIDE windows xdotool can't see
+//     are exactly the windows S11/S14 care about. The single source
+//     of truth is XDG_SESSION_TYPE === 'x11'. Anything else: skip.
+//
+// Why xdotool over xprop+wmctrl-equivalent: xdotool ships
+// `search --name <regex> windowfocus` as one atomic call. Doing it
+// with raw xprop means walking _NET_CLIENT_LIST, fetching _NET_WM_NAME
+// per WID, picking a match, then sending an _NET_ACTIVE_WINDOW
+// ClientMessage — which xprop can't generate, only read. wmctrl can,
+// but adds a second binary dependency for no win.
+//
+// Why we verify post-focus via xprop: xdotool exits 0 even when
+// focus didn't actually shift. Some compositors (mutter under
+// XWayland-forced mode notably) accept the WM_TAKE_FOCUS / SetInputFocus
+// pair and then quietly refuse the activation. The only honest
+// answer is to read _NET_ACTIVE_WINDOW back out and compare WIDs.
+// xdotool prints decimal WIDs; xprop prints `0x...` hex. We
+// normalize to lowercase 0x-prefixed hex with leading zeros stripped.
+//
+// No fixed sleeps. The verification poll uses retryUntil so a fast
+// compositor finishes in ~50ms while a slow one gets the full budget.
+
+import { execFile } from 'node:child_process';
+import { promisify } from 'node:util';
+import { retryUntil } from './retry.js';
+
+const exec = promisify(execFile);
+
+// Caller catches this and calls test.skip() — it's an environment gap,
+// not a regression. Subclassing Error gives consumers a clean
+// `instanceof` check without parsing message strings.
+export class WaylandFocusUnavailable extends Error {
+	constructor(message?: string) {
+		super(
+			message ??
+				'focusOtherWindow: native Wayland session — no portable ' +
+					'focus-injection path. Skip on this row.',
+		);
+		this.name = 'WaylandFocusUnavailable';
+	}
+}
+
+// Mirrors quickentry.ts's ensureYdotool message style — the install
+// command is the actually-useful part of the error. Consumers should
+// usually skip rather than fail; the absence of xdotool is an
+// environment configuration issue, not a Claude Desktop regression.
+export class XdotoolUnavailable extends Error {
+	constructor(message?: string) {
+		super(
+			message ??
+				'xdotool binary not found on PATH. Install with ' +
+					'`dnf install xdotool` / `apt install xdotool`.',
+		);
+		this.name = 'XdotoolUnavailable';
+	}
+}
+
+// Single source of truth for the X11/Wayland branch. Every other
+// function in this file calls this — do not duplicate the env check.
+//
+// XDG_SESSION_TYPE is set by logind. Possible values per spec are
+// `x11`, `wayland`, `tty`, `mir`, `unspecified`. We only trust the
+// literal string `x11` — anything else, including missing, returns
+// false. That means an unset env var on a real X11 box returns false
+// here; that's the correct conservative default since we can't
+// verify the assumption.
+export function isX11Session(): boolean {
+	return process.env.XDG_SESSION_TYPE === 'x11';
+}
+
+// Normalize a WID to lowercase 0x-prefixed hex with leading zeros
+// stripped after the prefix. Accepts decimal (xdotool stdout) or hex
+// (xprop stdout; must carry the 0x prefix). Returns null on parse failure.
+//
+// Examples:
+//   '94371842'    → '0x5a00002'
+//   '0x05a00002'  → '0x5a00002'
+//   '0X5A00002'   → '0x5a00002'
+function normalizeWid(raw: string): string | null {
+	const s = raw.trim();
+	if (!s) return null;
+	const isHex = /^0x/i.test(s);
+	const n = isHex ? parseInt(s, 16) : parseInt(s, 10);
+	if (!Number.isFinite(n) || n <= 0) return null;
+	return '0x' + n.toString(16);
+}
+
+// Read the currently-focused X11 window via _NET_ACTIVE_WINDOW.
+//
+// Returns null on:
+//   - Native Wayland (xprop may still respond via XWayland but the
+//     value is meaningless for native-Wayland clients — they don't
+//     appear in the X11 active-window list at all). Returning null
+//     here lets focusOtherWindow's poll fail through to its own
+//     timeout, but in practice native-Wayland rows are gated out
+//     earlier by isX11Session().
+//   - xprop missing / spawn failure.
+//   - Output that doesn't match the documented format (defensive —
+//     this should never happen on a real EWMH-compliant WM but the
+//     cost of a null return is one re-poll).
+export async function getFocusedWindowId(): Promise<string | null> {
+	if (!isX11Session()) return null;
+	let stdout: string;
+	try {
+		({ stdout } = await exec('xprop', [
+			'-root',
+			'_NET_ACTIVE_WINDOW',
+		]));
+	} catch {
+		return null;
+	}
+	// Documented format:
+	//   _NET_ACTIVE_WINDOW(WINDOW): window id # 0x5a00002
+	const m = stdout.match(/window id #\s*(0x[0-9a-fA-F]+)/);
+	if (!m || !m[1]) return null;
+	return normalizeWid(m[1]);
+}
+
+// Resolve a window title to its WID via xdotool. xdotool prints one
+// decimal WID per matching line; we take the first. Empty output
+// returns null, which focusOtherWindow turns into an Error.
+// Multi-match is silently resolved to the first, mirroring xdotool's
+// own windowfocus behavior.
+async function resolveWindowIdByTitle(
+	title: string,
+): Promise<string | null> {
+	const { stdout } = await exec('xdotool', ['search', '--name', title]);
+	const lines = stdout
+		.split('\n')
+		.map((l) => l.trim())
+		.filter(Boolean);
+	if (lines.length === 0) return null;
+	const first = lines[0];
+	if (!first) return null;
+	return normalizeWid(first);
+}
+
+// Shift X11 focus to the first window whose title matches `title`,
+// then verify the shift actually took.
+//
+// Throws:
+//   - WaylandFocusUnavailable on native Wayland.
+//   - XdotoolUnavailable when xdotool isn't on PATH.
+//   - Plain Error when no window matches the title (caller's bug —
+//     forgot to spawn the marker, or used the wrong title).
+//   - Plain Error when xdotool reports success but xprop never
+//     reflects the focus change within ~3s (compositor refused the
+//     activation; this is the diagnostic path S11/S14 actually want
+//     to surface, not swallow).
+export async function focusOtherWindow(title: string): Promise<void> {
+	if (!isX11Session()) {
+		throw new WaylandFocusUnavailable();
+	}
+
+	// Resolve target WID first so we know what to verify against.
+	// Combining this with `windowfocus` would save a roundtrip but
+	// would also make the post-focus comparison impossible.
+	let targetWid: string | null;
+	try {
+		targetWid = await resolveWindowIdByTitle(title);
+	} catch (err) {
+		const e = err as { code?: string | number };
+		if (e.code === 'ENOENT') throw new XdotoolUnavailable();
+		throw err;
+	}
+	if (!targetWid) {
+		throw new Error(
+			`focusOtherWindow: no X11 window matches title ${JSON.stringify(title)}. ` +
+				'Did the marker window finish mapping? Caller should ' +
+				'await spawnMarkerWindow + a short readiness poll before ' +
+				'calling focusOtherWindow.',
+		);
+	}
+
+	// Send the focus request. xdotool's windowfocus issues a
+	// SetInputFocus, which is best-effort; the verify-via-xprop
+	// step below is the actual assertion.
+	try {
+		await exec('xdotool', ['search', '--name', title, 'windowfocus']);
+	} catch (err) {
+		const e = err as { code?: string | number };
+		if (e.code === 'ENOENT') throw new XdotoolUnavailable();
+		throw err;
+	}
+
+	// Poll _NET_ACTIVE_WINDOW until it matches the target. ~3s budget
+	// covers slow compositor activation paths (mutter cold-path is
+	// the worst observed, ~800ms). Anything beyond 3s is a refusal,
+	// not a slow ack — surface as an error so S11/S14 see it.
+	const matched = await retryUntil(
+		async () => {
+			const active = await getFocusedWindowId();
+			return active === targetWid ? true : null;
+		},
+		{ timeout: 3_000, interval: 100 },
+	);
+	if (!matched) {
+		throw new Error(
+			`focusOtherWindow: xdotool windowfocus returned 0 but ` +
+				`_NET_ACTIVE_WINDOW never settled to ${targetWid} ` +
+				`for title ${JSON.stringify(title)}. Compositor may ` +
+				'have refused the activation request.',
+		);
+	}
+}
+
+// Handle returned from spawnMarkerWindow. Lifecycle is owned by the
+// caller — the test that spawned it must kill() in afterEach (or
+// equivalent), otherwise the xterm leaks past the test run.
+export interface MarkerWindow {
+	pid: number;
+	title: string;
+	kill(): Promise<void>;
+}
+
+// Spawn a long-lived xterm with a known title, suitable as a focus
+// target. Spawned with detached:false so the child stays in the test
+// process's group and shares its terminal signals. The OS does not
+// reap it if the test crashes; the `sleep 600` cap below covers that.
+//
+// Why xterm: it's the lowest-common-denominator X11 terminal — every
+// X11 row has it (or can install it via the standard package). It
+// honors -title verbatim (no de-escaping surprises) and -e accepts
+// a single command without argv parsing quirks. Alternatives like
+// `xclock` / `xeyes` either don't accept arbitrary titles or are
+// missing on minimal Fedora installs.
+//
+// Throws if xterm isn't on PATH. Caller's responsibility to fall
+// back or skip; we don't carry an `XtermUnavailable` class because
+// the consumer decision tree is identical to "skip on missing
+// xdotool" and the message is self-explanatory.
+export async function spawnMarkerWindow(
+	title: string,
+): Promise<MarkerWindow> {
+	// Lazy import so the module loads cleanly on Wayland rows that
+	// never call this function. (Top-level imports of node:child_process
+	// are already paid for by execFile, so this is mostly stylistic.)
+	const { spawn } = await import('node:child_process');
+
+	let child: import('node:child_process').ChildProcess;
+	try {
+		// `sleep 600` keeps the xterm alive for 10min — longer than
+		// any reasonable single test, short enough that a leaked
+		// xterm self-cleans within the sweep. -hold not used: we
+		// want the window to die when sleep dies.
+		child = spawn('xterm', ['-title', title, '-e', 'sleep', '600'], {
+			detached: false,
+			stdio: 'ignore',
+		});
+	} catch (err) {
+		const e = err as { code?: string | number };
+		if (e.code === 'ENOENT') {
+			throw new Error(
+				'xterm binary not found on PATH. Install with ' +
+					'`dnf install xterm` / `apt install xterm`. ' +
+					'Required by the focus-shift test path; consumers ' +
+					'should skip when this throws.',
+			);
+		}
+		throw err;
+	}
+
+	// Surface asynchronous spawn failures (ENOENT normally arrives
+	// via the 'error' event, not the synchronous throw above).
+	const earlyError = await new Promise<Error | null>((resolve) => {
+		const onError = (err: Error) => {
+			child.removeListener('spawn', onSpawn);
+			resolve(err);
+		};
+		const onSpawn = () => {
+			child.removeListener('error', onError);
+			resolve(null);
+		};
+		child.once('error', onError);
+		child.once('spawn', onSpawn);
+	});
+	if (earlyError) {
+		const e = earlyError as Error & { code?: string | number };
+		if (e.code === 'ENOENT') {
+			throw new Error(
+				'xterm binary not found on PATH. Install with ' +
+					'`dnf install xterm` / `apt install xterm`.',
+			);
+		}
+		throw earlyError;
+	}
+
+	const pid = child.pid;
+	if (typeof pid !== 'number') {
+		// Shouldn't happen after a successful 'spawn' event, but
+		// the type system doesn't know that.
+		throw new Error('spawnMarkerWindow: child.pid was undefined after spawn');
+	}
+
+	let killed = false;
+	const kill = async (): Promise<void> => {
+		if (killed) return;
+		killed = true;
+		if (child.exitCode !== null || child.signalCode !== null) {
+			return; // already exited
+		}
+		// SIGTERM with a short grace period before SIGKILL. xterm
+		// honors SIGTERM cleanly; the SIGKILL fallback is for the
+		// pathological "child wedged in a syscall" case.
+		const exited = new Promise<void>((resolve) => {
+			child.once('exit', () => resolve());
+		});
+		try {
+			child.kill('SIGTERM');
+		} catch {
+			// Process may have died between the check and the kill.
+		}
+		const graceMs = 500;
+		const timedOut = await Promise.race([
+			exited.then(() => false),
+			new Promise<boolean>((resolve) =>
+				setTimeout(() => resolve(true), graceMs),
+			),
+		]);
+		if (timedOut) {
+			try {
+				child.kill('SIGKILL');
+			} catch {
+				// Already dead.
+			}
+			await exited;
+		}
+	};
+
+	return { pid, title, kill };
+}
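The parsing step in `resolveWindowIdByTitle` can be sketched in isolation. This is a minimal sketch under a stated assumption: `normalizeWid` is not shown in this diff, so the `toHexWid` helper below is a hypothetical stand-in that converts xdotool's decimal id into the lowercase `0x…` form, and `parseFirstWid` is likewise an illustrative name.

```typescript
// Hypothetical normalizer: the real normalizeWid is not part of this diff.
// Assumes xdotool prints decimal ids and the comparison side uses 0x-hex.
function toHexWid(decimal: string): string | null {
	if (!/^\d+$/.test(decimal)) return null;
	return '0x' + Number(decimal).toString(16);
}

// Take the first non-empty line of `xdotool search` output, or null
// when the output has no usable line.
function parseFirstWid(stdout: string): string | null {
	const first = stdout
		.split('\n')
		.map((l) => l.trim())
		.find((l) => l.length > 0);
	return first === undefined ? null : toHexWid(first);
}
```

Keeping the parse pure makes the zero-match and multi-match cases checkable without an X11 session.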
--- a/tools/test-harness/src/lib/inspector.ts
+++ b/tools/test-harness/src/lib/inspector.ts
@@ -0,0 +1,328 @@
+// Node-inspector client for Electron's main process.
+//
+// Why this exists: the shipped Electron has an authenticated-CDP gate
+// (see lib/electron.ts) that exits the app whenever
+// --remote-debugging-port is on argv. The gate doesn't check --inspect /
+// SIGUSR1, so we can attach the Node inspector at runtime — same code
+// path as the in-app "Developer → Enable Main Process Debugger" menu.
+//
+// From the inspector we can evaluate arbitrary JS in the main process,
+// which gives us:
+//   - Electron API access (app, webContents, dialog, BrowserView)
+//   - Renderer access via webContents.executeJavaScript()
+//   - Main-process mocks (e.g. dialog.showOpenDialog for T17)
+//
+// Caveat: `BrowserWindow.getAllWindows()` returns an empty array
+// because frame-fix-wrapper substitutes the BrowserWindow class and
+// the substitution breaks the static registry. Use
+// `webContents.getAllWebContents()` instead; that registry stays intact.
+
+interface PendingCall {
+	resolve: (value: unknown) => void;
+	reject: (err: Error) => void;
+	timer: ReturnType<typeof setTimeout>;
+}
+
+// CDP accessibility-tree node shape (subset). The full AX tree is a flat
+// array of these with parent/child links carried by id refs. We surface
+// the value-bearing fields the v7 walker + claudeai.ts page-objects
+// actually consume; remaining CDP fields (ignoredReasons,
+// frameId, …) are accessible via the string-keyed bag.
+export interface AxValue {
+	type: string;
+	value?: unknown;
+}
+export interface AxProperty {
+	name: string;
+	value: AxValue;
+}
+export interface AxNode {
+	nodeId: string;
+	parentId?: string;
+	childIds?: string[];
+	backendDOMNodeId?: number;
+	role?: { type: string; value: string };
+	name?: { type: string; value: string };
+	// AX state/relation properties (`haspopup`, `expanded`, `modal`,
+	// `checked`, `disabled`, …). claudeai.ts reads `haspopup` to
+	// discriminate menu-trigger buttons from action buttons that
+	// happen to share an accessible name.
+	properties?: AxProperty[];
+	ignored?: boolean;
+	[k: string]: unknown;
+}
+
+export class InspectorClient {
+	// why: 30s default for send() timeouts. "Slow but not stuck."
+	// Lower defaults break legitimately-slow operations like initial
+	// page-load on a cold app or a chunky DOM snapshot; higher defaults
+	// turn renderer-side hangs (blocked event loop, modal trapping focus,
+	// network-bound script stalled) into invisible silent freezes.
+	// Consumers can override per-call (timeoutMs arg) or process-wide
+	// (set the static InspectorClient.defaultTimeoutMs; send() reads it).
+	static defaultTimeoutMs = 30000;
+
+	private ws: WebSocket;
+	private nextId = 0;
+	private pending = new Map<number, PendingCall>();
+	// Idempotency flag for close(). Runners + electron.ts close() may
+	// both call this on the same instance (intentionally — see
+	// electron.ts launchClaude tracking comment); the flag guarantees
+	// a second call is a true no-op rather than a redundant ws.close().
+	private closed = false;
+
+	private constructor(ws: WebSocket) {
+		this.ws = ws;
+		this.ws.addEventListener('message', (ev) => this.handleMessage(ev));
+	}
+
+	static async connect(port: number): Promise<InspectorClient> {
+		const meta = await fetch(`http://127.0.0.1:${port}/json/list`).then((r) =>
+			r.json(),
+		) as Array<{ webSocketDebuggerUrl: string }>;
+		if (!meta.length) {
+			throw new Error(`Inspector at ${port} has no debuggee`);
+		}
+		const url = meta[0]!.webSocketDebuggerUrl;
+		const ws = new WebSocket(url);
+		await new Promise<void>((resolve, reject) => {
+			ws.addEventListener('open', () => resolve(), { once: true });
+			ws.addEventListener(
+				'error',
+				(e) => reject(new Error(`inspector ws error: ${e.type}`)),
+				{ once: true },
+			);
+		});
+		const client = new InspectorClient(ws);
+		await client.send('Runtime.enable');
+		await client.send('Runtime.runIfWaitingForDebugger');
+		return client;
+	}
+
+	private handleMessage(ev: MessageEvent): void {
+		const msg = JSON.parse(typeof ev.data === 'string' ? ev.data : '{}') as {
+			id?: number;
+			error?: unknown;
+			result?: unknown;
+		};
+		if (msg.id !== undefined && this.pending.has(msg.id)) {
+			const { resolve, reject, timer } = this.pending.get(msg.id)!;
+			this.pending.delete(msg.id);
+			clearTimeout(timer);
+			if (msg.error) {
+				reject(new Error(JSON.stringify(msg.error)));
+			} else {
+				resolve(msg.result);
+			}
+		}
+	}
+
+	// why: every pending call gets a timer. When the renderer event loop
+	// is blocked (modal focus trap, network-bound script stalled, DOM
+	// snapshot too large) the CDP reply never arrives and the promise
+	// would hang forever. We reject with a clear "method=X" error and
+	// drop the pending entry (no leak), but we deliberately do NOT
+	// close the websocket — a single hung eval shouldn't tear down the
+	// connection; the next call may succeed.
+	send(
+		method: string,
+		params: Record<string, unknown> = {},
+		timeoutMs?: number,
+	): Promise<unknown> {
+		const id = ++this.nextId;
+		const ms = timeoutMs ?? InspectorClient.defaultTimeoutMs;
+		return new Promise((resolve, reject) => {
+			const timer = setTimeout(() => {
+				if (this.pending.delete(id)) {
+					reject(
+						new Error(
+							`inspector.send timed out after ${ms}ms (method=${method})`,
+						),
+					);
+				}
+			}, ms);
+			this.pending.set(id, { resolve, reject, timer });
+			this.ws.send(JSON.stringify({ id, method, params }));
+		});
+	}
+
+	// Evaluate an async function body in the main process; it should end
+	// with `return X` (no return yields null). Returns the JSON-parsed
+	// value. JSON-stringification inside the IIFE dodges the inspector's
+	// Promise-result deep-marshaling quirks (returnByValue produces empty
+	// objects for awaited Promise resolutions on this build).
+	//
+	// Bare `require` is NOT a global in the CDP eval scope — go through
+	// `process.mainModule.require('electron'|'node:fs'|…)` instead.
+	async evalInMain<T = unknown>(body: string, timeoutMs?: number): Promise<T> {
+		const expression =
+			'globalThis.__r = (async () => { ' +
+			'const __v = await (async () => { ' +
+			body +
+			' })(); ' +
+			'return JSON.stringify(__v === undefined ? null : __v); ' +
+			'})(); globalThis.__r;';
+		const result = (await this.send(
+			'Runtime.evaluate',
+			{
+				expression,
+				awaitPromise: true,
+				returnByValue: true,
+			},
+			timeoutMs,
+		)) as { result?: { value?: unknown }; exceptionDetails?: unknown };
+
+		if (result.exceptionDetails) {
+			throw new Error(
+				`evalInMain threw: ${JSON.stringify(result.exceptionDetails)}`,
+			);
+		}
+		const v = result.result?.value;
+		if (typeof v !== 'string') {
+			throw new Error(
+				`evalInMain expected JSON string, got ${JSON.stringify(result.result)}`,
+			);
+		}
+		return JSON.parse(v) as T;
+	}
+
+	// Convenience: evaluate JS in a specific webContents (renderer).
+	// `urlFilter` selects which webContents (substring match on getURL()).
+	async evalInRenderer<T = unknown>(
+		urlFilter: string,
+		js: string,
+		timeoutMs?: number,
+	): Promise<T> {
+		const escaped = JSON.stringify(js);
+		const result = await this.evalInMain<T>(
+			`
+			const { webContents } = process.mainModule.require('electron');
+			const all = webContents.getAllWebContents();
+			const target = all.find(w => w.getURL().includes(${JSON.stringify(urlFilter)}));
+			if (!target) {
+				throw new Error('no webContents matching: ' + ${JSON.stringify(urlFilter)});
+			}
+			return await target.executeJavaScript(${escaped});
+		`,
+			timeoutMs,
+		);
+		return result;
+	}
+
+	// Query the renderer's full accessibility tree via Chrome DevTools
+	// Protocol's `Accessibility.getFullAXTree`. Reachable from main
+	// process JS (this client connects to Node's debugger, not Chromium's
+	// — but webContents.debugger gives us full CDP access from there).
+	//
+	// `urlFilter` selects which webContents to attach to (substring match
+	// on getURL()). Idempotent attach: re-using the same webContents
+	// across calls won't double-attach. Caller is responsible for AX
+	// cost — at large surfaces full-tree latency may be ≥100ms (see
+	// fingerprint-v7-plan.md "Open questions"); for those, use a
+	// scoped subtree query instead.
+	async getAccessibleTree(
+		urlFilter: string,
+		timeoutMs?: number,
+	): Promise<AxNode[]> {
+		const result = await this.evalInMain<{ nodes: AxNode[] }>(
+			`
+			const { webContents } = process.mainModule.require('electron');
+			const all = webContents.getAllWebContents();
+			const target = all.find(w => w.getURL().includes(${JSON.stringify(urlFilter)}));
+			if (!target) {
+				throw new Error('no webContents matching: ' + ${JSON.stringify(urlFilter)});
+			}
+			if (!target.debugger.isAttached()) {
+				target.debugger.attach('1.3');
+			}
+			try {
+				await target.debugger.sendCommand('Accessibility.enable');
+			} catch (err) {
+				// Already-enabled is benign; surface anything else.
+				if (!String(err && err.message).includes('already enabled')) {
+					throw err;
+				}
+			}
+			const r = await target.debugger.sendCommand(
+				'Accessibility.getFullAXTree',
+			);
+			return r;
+		`,
+			timeoutMs,
+		);
+		return result.nodes;
+	}
+
+	// Resolve the AX-tree-supplied backendNodeId to a renderer-side
+	// JS object handle, then invoke `.click()` on it. This is the
+	// click-path counterpart to `getAccessibleTree`: capture identifies
+	// nodes by backendDOMNodeId, click consumes the same id without any
+	// selector reconstruction. `DOM.resolveNode` handles cross-frame
+	// nodes natively, and `Runtime.callFunctionOn` runs in the node's
+	// own execution context — so the click dispatches against the right
+	// document even when the target sits in an iframe.
+	async clickByBackendNodeId(
+		urlFilter: string,
+		backendNodeId: number,
+		timeoutMs?: number,
+	): Promise<void> {
+		await this.evalInMain<null>(
+			`
+			const { webContents } = process.mainModule.require('electron');
+			const all = webContents.getAllWebContents();
+			const target = all.find(w => w.getURL().includes(${JSON.stringify(urlFilter)}));
+			if (!target) {
+				throw new Error('no webContents matching: ' + ${JSON.stringify(urlFilter)});
+			}
+			if (!target.debugger.isAttached()) {
+				target.debugger.attach('1.3');
+			}
+			const resolved = await target.debugger.sendCommand(
+				'DOM.resolveNode',
+				{ backendNodeId: ${backendNodeId} },
+			);
+			const objectId = resolved && resolved.object && resolved.object.objectId;
+			if (!objectId) {
+				throw new Error(
+					'clickByBackendNodeId: DOM.resolveNode returned no objectId for ' +
+						${backendNodeId},
+				);
+			}
+			try {
+				await target.debugger.sendCommand('Runtime.callFunctionOn', {
+					objectId,
+					functionDeclaration: 'function() { this.click(); }',
+				});
+			} finally {
+				try {
+					await target.debugger.sendCommand('Runtime.releaseObject', {
+						objectId,
+					});
+				} catch (_) {
+					// Releasing a stale handle is benign.
+				}
+			}
+			return null;
+		`,
+			timeoutMs,
+		);
+	}
+
+	close(): void {
+		if (this.closed) return;
+		this.closed = true;
+		// Drain pending timers + reject in-flight promises so callers
+		// don't hang on close. Without this an outstanding send() keeps
+		// the event loop alive past close().
+		for (const [, pending] of this.pending) {
+			clearTimeout(pending.timer);
+			pending.reject(new Error('inspector closed'));
+		}
+		this.pending.clear();
+		try {
+			this.ws.close();
+		} catch {
+			// already closed
+		}
+	}
+}
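The request/reply bookkeeping in `InspectorClient.send()` and `handleMessage()` can be sketched without a websocket. This is a minimal sketch, not the harness code: `Correlator`, `transmit`, `receive`, and `inFlight` are hypothetical names, and the transport is reduced to a callback so the id correlation and per-call timer are visible on their own.

```typescript
type PendingEntry = {
	resolve: (value: unknown) => void;
	reject: (err: Error) => void;
	timer: ReturnType<typeof setTimeout>;
};

// Hypothetical stand-in for the websocket-backed client.
class Correlator {
	private nextId = 0;
	private pending = new Map<number, PendingEntry>();
	private transmit: (frame: string) => void;

	constructor(transmit: (frame: string) => void) {
		this.transmit = transmit;
	}

	send(method: string, timeoutMs: number): Promise<unknown> {
		const id = ++this.nextId;
		return new Promise((resolve, reject) => {
			// Every call gets a timer; on expiry drop the entry, then reject.
			const timer = setTimeout(() => {
				if (this.pending.delete(id)) {
					reject(new Error(`timed out after ${timeoutMs}ms (method=${method})`));
				}
			}, timeoutMs);
			this.pending.set(id, { resolve, reject, timer });
			this.transmit(JSON.stringify({ id, method }));
		});
	}

	// Route a reply frame to the matching pending call, if any.
	receive(frame: string): void {
		const msg = JSON.parse(frame) as { id?: number; result?: unknown };
		if (msg.id === undefined) return;
		const entry = this.pending.get(msg.id);
		if (!entry) return;
		this.pending.delete(msg.id);
		clearTimeout(entry.timer);
		entry.resolve(msg.result);
	}

	get inFlight(): number {
		return this.pending.size;
	}
}
```

Clearing the timer on reply matters as much as rejecting on expiry: a live timer would keep the event loop running after the call has already settled.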
--- a/tools/test-harness/src/lib/isolation.ts
+++ b/tools/test-harness/src/lib/isolation.ts
@@ -0,0 +1,158 @@
+// Per-test config isolation.
+//
+// Decision 1 in docs/testing/automation.md calls for hermetic
+// XDG_CONFIG_HOME / CLAUDE_CONFIG_DIR per test (S19 is the underlying
+// primitive). Without it, persisted state leaks between tests:
+// SingletonLock from one run blocks the next; S35's saved
+// quickWindowPosition contaminates S29's closed-to-tray sanity; etc.
+//
+// Shape: each call to `createIsolation()` builds a fresh config root
+// under $TMPDIR/claude-test-<random>/ and returns the env vars to merge
+// into the spawned app, plus a teardown that removes the dir. Pass the
+// same handle to multiple `launchClaude({ isolation })` calls when a
+// test needs to launch the same app twice with shared state (e.g. S35
+// position-memory across restart).
+//
+// `seedFromHost: true` extends this for tests that need the host's
+// signed-in auth state (U01). The host directory itself stays
+// untouched after the kill+copy: the test runs hermetically against
+// a copy of just the auth-relevant files, and the tmpdir is rm -rf'd
+// on cleanup so secrets never persist past the test process.
+
+import { cp, mkdir, mkdtemp, rm, stat } from 'node:fs/promises';
+import { homedir, tmpdir } from 'node:os';
+import { join } from 'node:path';
+
+import { killHostClaude } from './host-claude.js';
+
+export interface Isolation {
+	configHome: string;
+	configDir: string;
+	cacheHome: string;
+	dataHome: string;
+	env: Record<string, string>;
+	cleanup(): Promise<void>;
+}
+
+export interface CreateIsolationOptions {
+	// When true: kill any running host Claude (LevelDB / SQLite hold
+	// writer locks while it runs), then copy the auth-relevant subset
+	// of $XDG_CONFIG_HOME/Claude into the new configDir. The host
+	// config never gets mutated by the test; secrets never leave the
+	// per-launch tmpdir.
+	seedFromHost?: boolean;
+}
+
+// Allowlist of relative paths under ~/.config/Claude/ that carry auth
+// or first-launch UI state. Everything else is deliberately
+// regenerated fresh in the tmpdir:
+//   - Cache/, Code Cache/, GPUCache/, Dawn*Cache/  — cheap to rebuild
+//   - blob_storage/, Crashpad/, logs/             — irrelevant to auth
+//   - SingletonLock, SingletonCookie, SingletonSocket — block startup
+//   - .org.chromium.Chromium.*                    — host-specific lock turds
+//   - claude-code-sessions/, claude-code-vm/, local-agent-mode-sessions/
+//     — large, account-specific, not needed for renderer auth
+//
+// Cookies + Local State are the auth-cookie pair (the latter holds
+// the os_crypt key wrapper on platforms that need it). IndexedDB +
+// Local Storage hold the renderer-side auth context that claude.ai's
+// route guards check before redirecting to /login — cookies alone
+// leave you bouncing back to login.
+const SEED_PATHS = [
+	'Cookies',
+	'Cookies-journal',
+	'Local State',
+	'Local Storage',
+	'IndexedDB',
+	'Session Storage',
+	'WebStorage',
+	'SharedStorage',
+	'Network Persistent State',
+	'config.json',
+	'claude_desktop_config.json',
+	'developer_settings.json',
+];
+
+async function exists(path: string): Promise<boolean> {
+	try {
+		await stat(path);
+		return true;
+	} catch {
+		return false;
+	}
+}
+
+async function seedAuthFromHost(targetConfigDir: string): Promise<void> {
+	const hostConfigHome =
+		process.env.XDG_CONFIG_HOME ?? join(homedir(), '.config');
+	const hostClaudeDir = join(hostConfigHome, 'Claude');
+
+	if (!(await exists(hostClaudeDir))) {
+		throw new Error(
+			`seedFromHost: host config dir not found at ${hostClaudeDir}. ` +
+				'Sign into Claude Desktop on this machine first, then re-run.',
+		);
+	}
+
+	await mkdir(targetConfigDir, { recursive: true });
+
+	let copied = 0;
+	for (const rel of SEED_PATHS) {
+		const src = join(hostClaudeDir, rel);
+		if (!(await exists(src))) continue;
+		const dst = join(targetConfigDir, rel);
+		await cp(src, dst, {
+			recursive: true,
+			preserveTimestamps: true,
+			errorOnExist: false,
+		});
+		copied++;
+	}
+
+	if (copied === 0) {
+		throw new Error(
+			`seedFromHost: ${hostClaudeDir} exists but contains none of the ` +
+				'expected auth files. Open Claude Desktop, sign in, fully close, ' +
+				'and re-run.',
+		);
+	}
+}
+
+export async function createIsolation(
+	opts: CreateIsolationOptions = {},
+): Promise<Isolation> {
+	const root = await mkdtemp(join(tmpdir(), 'claude-test-'));
+	const configHome = join(root, 'config');
+	const configDir = join(configHome, 'Claude');
+	const cacheHome = join(root, 'cache');
+	const dataHome = join(root, 'data');
+
+	if (opts.seedFromHost) {
+		// Order matters: kill before copy. While the host app runs,
+		// LevelDB holds the LOCK on IndexedDB / Local Storage and may
+		// be mid-write, so a copy can capture an inconsistent snapshot,
+		// and SQLite Cookies has WAL pages that may not be checkpointed.
+		await killHostClaude();
+		await seedAuthFromHost(configDir);
+	}
+
+	const env: Record<string, string> = {
+		XDG_CONFIG_HOME: configHome,
+		XDG_CACHE_HOME: cacheHome,
+		XDG_DATA_HOME: dataHome,
+		// CLAUDE_CONFIG_DIR is honored by launcher-common.sh and by
+		// the app itself for picking the persisted-settings location.
+		CLAUDE_CONFIG_DIR: configDir,
+	};
+
+	return {
+		configHome,
+		configDir,
+		cacheHome,
+		dataHome,
+		env,
+		async cleanup() {
+			await rm(root, { recursive: true, force: true });
+		},
+	};
+}
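The handle-plus-cleanup shape of `createIsolation()` pairs naturally with a `try`/`finally` wrapper so the tmpdir is removed even when the test body throws. This is a minimal sketch under stated assumptions: it uses synchronous `node:fs` calls for brevity, and `createSketchIsolation` and `withIsolation` are hypothetical names that do not exist in the harness.

```typescript
import { existsSync, mkdtempSync, rmSync } from 'node:fs';
import { tmpdir } from 'node:os';
import { join } from 'node:path';

interface SketchIsolation {
	root: string;
	env: Record<string, string>;
	cleanup(): void;
}

// Hypothetical, trimmed-down counterpart of createIsolation().
function createSketchIsolation(): SketchIsolation {
	const root = mkdtempSync(join(tmpdir(), 'claude-test-'));
	const configHome = join(root, 'config');
	return {
		root,
		env: {
			XDG_CONFIG_HOME: configHome,
			CLAUDE_CONFIG_DIR: join(configHome, 'Claude'),
		},
		cleanup() {
			rmSync(root, { recursive: true, force: true });
		},
	};
}

// Run `fn` against a fresh root and always remove the root afterwards.
function withIsolation<T>(fn: (iso: SketchIsolation) => T): T {
	const iso = createSketchIsolation();
	try {
		return fn(iso);
	} finally {
		iso.cleanup();
	}
}
```

The `finally` branch is the point of the wrapper: seeded auth files only stay confined to the test run if cleanup also happens on the failure path.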
--- a/tools/test-harness/src/lib/name-classifier.ts
+++ b/tools/test-harness/src/lib/name-classifier.ts
@@ -0,0 +1,150 @@
+// Name-classifier vocabulary + instance-shape registry. The v7 walker
+// (Phase 2) consumes this to decide whether a captured accessible-name
+// is stable copy ("Search", "Send"), instance-shaped ("AWAaddrick·Max",
+// "Today+12"), or unknown copy that needs human triage. The vocabulary
+// `stable` / `suspect` arrays are derived from a prior inventory walk
+// by `explore/derive-vocabulary.ts` and re-derived on each major
+// upstream release.
+//
+// First-match-wins ordering: more specific shapes go before general
+// ones so e.g. a model-version pattern hits before a generic
+// title-case-words pattern.
+
+export interface InstanceShape {
+	id: string;
+	regex: RegExp;
+	// Canonical pattern recorded into the v7 fingerprint's NameMatcher
+	// when this shape matches. Null on shapes that should *not*
+	// contribute a regex matcher — those entries fall through to
+	// `kind: instance` ancestor-presence checks at resolve time.
+	pattern: string | null;
+}
+
+export const INSTANCE_SHAPES: readonly InstanceShape[] = [
+	// Plan badge — `<handle>·<tier>` with optional trailing PUA glyph
+	// (Claude Desktop ships private-area font icons as the badge
+	// ornament; e.g. AWAaddrick·Max).
+	{
+		id: 'plan-badge',
+		regex: /^.+·(Free|Pro|Max|Team|Enterprise)[\uE000-\uF8FF\s]*$/u,
+		pattern: '\\w+·(Free|Pro|Max|Team|Enterprise)',
+	},
+	// Model-version names. Stable across users, versioned across
+	// releases — recording as a pattern lets a re-walked inventory
+	// keep resolving when upstream bumps "Opus 4.7" → "Opus 4.8".
+	{ id: 'opus-version', regex: /^Opus \d/, pattern: '^Opus \\d' },
+	{ id: 'sonnet-version', regex: /^Sonnet \d/, pattern: '^Sonnet \\d' },
+	{ id: 'haiku-version', regex: /^Haiku \d/, pattern: '^Haiku \\d' },
+	// Usage / quota percentage suffix ("Usage: plan 11%").
+	{ id: 'percentage', regex: /\d{1,3}%$/, pattern: '\\d{1,3}%' },
+	// Relative date a list row often appends to a title ("Untitled
+	// conversationToday+12"). The shape includes an optional `+N`
+	// counter for collapsed-instance groupings.
+	{
+		id: 'relative-date',
+		regex:
+			/(Today|Yesterday|\d+\s(day|hour|minute|second|week|month|year)s?\sago)/,
+		pattern:
+			'(Today|Yesterday|\\d+\\s(day|hour|minute|second|week|month|year)s?\\sago)(\\+\\d+)?',
+	},
+	// File / quota size at the start of the name ("1.5 GB").
+	{
+		id: 'size-with-unit',
+		regex: /^\d+\.\d+\s\w+/,
+		pattern: '^\\d+\\.\\d+\\s\\w+',
+	},
+	// User handle anywhere in the name ("@aaddrick").
+	{ id: 'user-handle', regex: /@\w+/, pattern: '@\\w+' },
+	// Cowork session row in the sidebar. Names are status-prefixed
+	// session titles ("Idle Review PR 555…", "Awaiting input Plan
+	// automated testing strategy…", "Pull request merged Review issue
+	// 373"). The status enum is bounded; the title varies per session.
+	// Recording as a pattern lets the v7 instance-collapse fold the
+	// whole sidebar list into one representative entry — without this
+	// shape the title classifies as `suspect` (or `stable` if literal-
+	// matching once) and each session is captured + drilled
+	// individually. Placed before `long-title` so the more specific
+	// shape wins (long-title returns `pattern: null`, which loses
+	// account-portability for these rows).
+	{
+		id: 'cowork-session',
+		regex:
+			/^(Idle|Ready|Working|Awaiting input|Pull request merged|Done|Failed|Cancelled)\s/,
+		pattern:
+			'^(Idle|Ready|Working|Awaiting input|Pull request merged|Done|Failed|Cancelled)\\s',
+	},
+	// Per-row action triggers in list-row contexts. Claude.ai exposes a
+	// "⋮" menu next to each cowork session / conversation row with an
+	// aria-label `More options for <row title>` — one button per row.
+	// Without this shape the per-row title makes each button literally
+	// unique, so each gets its own stable entry and the BFS drills
+	// every one. With the shape they collapse to a single representative
+	// per surface, mirroring the cowork-session row collapse above.
+	{
+		id: 'row-more-options',
+		regex: /^More options for /,
+		pattern: '^More options for ',
+	},
+	// Title-case prose: two capitalized words followed by a lowercase
+	// word. No pattern recorded; the title is per-conversation, not a
+	// recurring shape, so the resolver should fall back to
+	// ancestor-presence rather than try to match the literal text.
+	{
+		id: 'long-title',
+		regex: /^[A-Z][a-z]+ [A-Z][a-z]+ [a-z]/,
+		pattern: null,
+	},
+] as const;
+
+export type NameClass = 'stable' | 'instance' | 'positional' | 'suspect';
+
+export interface NameClassification {
+	kind: NameClass;
+	// Present iff `kind === 'instance'`.
+	shapeId?: string;
+	// Present iff `kind === 'instance'`. Null when the matched shape
+	// has no canonical regex (e.g. long-title) — caller should drop the
+	// name from the fingerprint and rely on ariaPath + ancestor
+	// presence.
+	pattern?: string | null;
+}
+
+export interface Vocabulary {
+	stable: ReadonlySet<string>;
+	suspect: ReadonlySet<string>;
+}
+
+// classifyName decides how a captured accessible-name should be
+// matched at resolve time. Priority order tracks the v7 plan's "Name
+// classifier" §:
+//   1. Empty / whitespace → 'positional' (no usable name)
+//   2. Matches an instance-shape regex → 'instance' + shapeId
+//   3. Present in vocabulary.stable → 'stable'
+//   4. Default → 'suspect' (treated as stable by the walker but
+//      surfaced for reconciliation review)
+//
+// The list-row-child rule from the plan ('option/listitem inside
+// listbox/list' → 'instance') depends on ariaPath context the
+// classifier doesn't have access to here. The walker checks that
+// condition before calling classifyName.
+export function classifyName(
+	name: string | null,
+	vocabulary: Vocabulary,
+): NameClassification {
+	if (name === null || name.trim() === '') {
+		return { kind: 'positional' };
+	}
+	for (const shape of INSTANCE_SHAPES) {
+		if (shape.regex.test(name)) {
+			return {
+				kind: 'instance',
+				shapeId: shape.id,
+				pattern: shape.pattern,
+			};
+		}
+	}
+	if (vocabulary.stable.has(name)) {
+		return { kind: 'stable' };
+	}
+	return { kind: 'suspect' };
+}
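The first-match-wins ordering that `classifyName` relies on can be shown with a two-entry table. This is a minimal sketch: `SHAPES` and `classifySketch` are hypothetical names, and the `cowork-session` regex is cut down to two statuses for brevity.

```typescript
interface Shape {
	id: string;
	regex: RegExp;
}

// Hypothetical two-entry table; ids mirror entries in INSTANCE_SHAPES.
const SHAPES: readonly Shape[] = [
	// Specific shape first ...
	{ id: 'cowork-session', regex: /^(Idle|Working)\s/ },
	// ... general shape last.
	{ id: 'long-title', regex: /^[A-Z][a-z]+ [A-Z][a-z]+ [a-z]/ },
];

type Kind = 'stable' | 'instance' | 'positional' | 'suspect';

function classifySketch(
	name: string | null,
	stable: ReadonlySet<string>,
): { kind: Kind; shapeId?: string } {
	if (name === null || name.trim() === '') return { kind: 'positional' };
	// Array.prototype.find returns the first entry whose regex matches.
	const hit = SHAPES.find((s) => s.regex.test(name));
	if (hit) return { kind: 'instance', shapeId: hit.id };
	return stable.has(name) ? { kind: 'stable' } : { kind: 'suspect' };
}
```

A name such as "Idle Review the PR" satisfies both regexes, so table order alone decides which shape id gets recorded.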
--- a/tools/test-harness/src/lib/quickentry.ts
+++ b/tools/test-harness/src/lib/quickentry.ts
@@ -0,0 +1,656 @@
+// Quick Entry domain wrapper — single point of coupling to upstream's
+// main-process structure for QE-* tests.
+//
+// Why centralize: upstream symbol names (Ko for popup, ut for main, h1
+// for the visibility check) drift between releases per CLAUDE.md's
+// "Working with Minified JavaScript" notes. If this lookup logic lives
+// in 12 separate spec files, every release becomes a 12-file fix. If
+// it lives here, it's one fix.
+//
+// Discovery strategy: don't rely on minified symbol names. Use shape:
+//   - Popup webContents = the new entry that appears after the shortcut
+//     fires (snapshot/diff pattern).
+//   - Popup BrowserWindow = the only one constructed with
+//     transparent: true && alwaysOnTop: true.
+//   - Main BrowserWindow = the one whose webContents URL contains
+//     "claude.ai".
+//
+// Shortcut injection: ydotool through /dev/uinput. Works on X11,
+// XWayland, and native Wayland with portal-grabbed shortcuts (KDE-W,
+// Ubu-W, KDE-X). Does NOT work where the OS-level grab itself is broken
+// (#404 GNOME-W); that failure is what the test checks, not a tool gap.
+// Tests that need the popup open *without* exercising the OS shortcut
+// grab can call `installInterceptor()` first to capture a popup
+// BrowserWindow ref via the loadFile/loadURL hooks, but they still
+// need a trigger. For the closeout sweep the assumption is ydotool is
+// present and the OS grab works on the row under test. S11/S12
+// explicitly test the grab path; everything else assumes it.
+
+import { execFile } from 'node:child_process';
+import { readFile } from 'node:fs/promises';
+import { homedir } from 'node:os';
+import { join } from 'node:path';
+import { promisify } from 'node:util';
+import type { InspectorClient } from './inspector.js';
+import { retryUntil, sleep } from './retry.js';
+
+const exec = promisify(execFile);
+
+export interface WebContentsInfo {
+	id: number;
+	url: string;
+}
+
+export interface BrowserWindowState {
+	visible: boolean;
+	minimized: boolean;
+	fullScreen: boolean;
+	focused: boolean;
+	bounds: { x: number; y: number; width: number; height: number };
+}
+
+// Linux key codes for the upstream default Ctrl+Alt+Space accelerator.
+// Override via constructor option for tests that exercise a remapped
+// shortcut.
+const DEFAULT_KEY_SEQUENCE = [
+	'29:1', // LEFTCTRL down
+	'56:1', // LEFTALT  down
+	'57:1', // SPACE    down
+	'57:0', // SPACE    up
+	'56:0', // LEFTALT  up
+	'29:0', // LEFTCTRL up
+];
+
+export class QuickEntry {
+	constructor(
+		private readonly inspector: InspectorClient,
+		private readonly keySeq: string[] = DEFAULT_KEY_SEQUENCE,
+	) {}
+
+	// Capture BrowserWindow refs by hooking prototype methods, not the
+	// constructor.
+	//
+	// Why prototype-level: scripts/frame-fix-wrapper.js returns the
+	// electron module wrapped in a Proxy whose `get` trap returns a
+	// closure-captured PatchedBrowserWindow. A constructor-level wrap
+	// (`electron.BrowserWindow = Wrapped`) writes to the underlying
+	// module but the Proxy keeps returning PatchedBrowserWindow on
+	// reads, so the wrap is bypassed entirely. Hooking
+	// `BrowserWindow.prototype.loadFile` instead captures every
+	// instance regardless of which subclass it was constructed
+	// through — Patched, frame-fix-wrapped, or plain.
+	//
+	// The popup is identified by its loadFile target:
+	// `.vite/renderer/quick_window/quick-window.html`
+	// (build-reference index.js:515443).
+	async installInterceptor(): Promise<void> {
+		await this.inspector.evalInMain<null>(`
+			if (globalThis.__qeInterceptorInstalled) return null;
+			const electron = process.mainModule.require('electron');
+			const proto = electron.BrowserWindow.prototype;
+			globalThis.__qeWindows = [];
+			const origLoadFile = proto.loadFile;
+			proto.loadFile = function(filePath, ...rest) {
+				try {
+					const url = String(filePath || '');
+					globalThis.__qeWindows.push({
+						ref: this,
+						loadedFile: url,
+					});
+				} catch (e) { /* recording must never throw */ }
+				return origLoadFile.call(this, filePath, ...rest);
+			};
+			const origLoadURL = proto.loadURL;
+			proto.loadURL = function(url, ...rest) {
+				try {
+					globalThis.__qeWindows.push({
+						ref: this,
+						loadedFile: String(url || ''),
+					});
+				} catch (e) { /* recording must never throw */ }
+				return origLoadURL.call(this, url, ...rest);
+			};
+			globalThis.__qeInterceptorInstalled = true;
+			return null;
+		`);
+	}
+
+	// The popup is the BrowserWindow whose load target contains
+	// `quick-window.html` (or `quick_window/` as a fallback). Stable
+	// path: upstream uses it verbatim (build-reference index.js:515443).
+	private popupSelector(): string {
+		return `(w => {
+			if (!w || !w.ref || w.ref.isDestroyed()) return false;
+			const f = String(w.loadedFile || '');
+			return f.indexOf('quick-window.html') !== -1
+				|| f.indexOf('quick_window/') !== -1;
+		})`;
+	}
+
+	async listWebContents(): Promise<WebContentsInfo[]> {
+		return await this.inspector.evalInMain<WebContentsInfo[]>(`
+			const { webContents } = process.mainModule.require('electron');
+			return webContents.getAllWebContents().map(w => ({
+				id: w.id, url: w.getURL(),
+			}));
+		`);
+	}
+
+	// Find the popup by its file:// URL (see isPopupUrl): the main shell
+	// and the embedded claude.ai BrowserView never match that path.
+	async getPopupWebContents(): Promise<WebContentsInfo | null> {
+		const all = await this.listWebContents();
+		const popup = all.find((w) => isPopupUrl(w.url));
+		return popup ?? null;
+	}
+
+	// Send the configured accelerator via ydotool. Errors out (caller
+	// can catch + skip) if ydotool isn't on PATH.
+	//
+	// YDOTOOL_SOCKET is honored from the parent env; defaults to
+	// /tmp/.ydotool_socket (the path the shipped systemd unit uses
+	// after the override drop-in). Without YDOTOOL_SOCKET, the client
+	// probes /run/user/$UID/.ydotool_socket — a location the daemon
+	// doesn't bind to, so the call fails confusingly.
+	async openViaShortcut(): Promise<void> {
+		await ensureYdotool();
+		await exec('ydotool', ['key', ...this.keySeq], {
+			env: {
+				...process.env,
+				YDOTOOL_SOCKET:
+					process.env.YDOTOOL_SOCKET ?? '/tmp/.ydotool_socket',
+			} as Record<string, string>,
+		});
+	}
+
+	// openViaShortcut + waitForPopupReady, with retry for the
+	// upstream-only-shows-when-logged-in race (build-reference
+	// index.js:515604: `function lHn() { return !user.isLoggedOut; }`).
+	// On a fresh launch, the renderer URL flips past /login before
+	// the main-process user object is populated; the first shortcut
+	// constructs the popup but skips show(). A second shortcut after
+	// a brief settle hits the populated-user path. Nominal budget is
+	// `attempts * perAttemptMs + (attempts - 1) * retryDelayMs`.
+	async openAndWaitReady(opts: {
+		attempts?: number;
+		perAttemptMs?: number;
+		retryDelayMs?: number;
+	} = {}): Promise<void> {
+		const attempts = opts.attempts ?? 3;
+		const perAttemptMs = opts.perAttemptMs ?? 8_000;
+		const retryDelayMs = opts.retryDelayMs ?? 1_500;
+		let lastErr: unknown = null;
+		for (let i = 0; i < attempts; i++) {
+			await this.openViaShortcut();
+			try {
+				await this.waitForPopupReady(perAttemptMs);
+				return;
+			} catch (err) {
+				lastErr = err;
+				if (i < attempts - 1) await sleep(retryDelayMs);
+			}
+		}
+		throw new Error(
+			`openAndWaitReady: popup never became ready after ${attempts} ` +
+				`shortcut presses. Last error: ` +
+				(lastErr instanceof Error ? lastErr.message : String(lastErr)),
+		);
+	}
+
+	// Wait for the popup webContents to appear after openViaShortcut().
+	async waitForPopup(timeoutMs = 5000): Promise<WebContentsInfo> {
+		const wc = await retryUntil(
+			async () => this.getPopupWebContents(),
+			{ timeout: timeoutMs, interval: 100 },
+		);
+		if (!wc) {
+			throw new Error(
+				`Quick Entry popup webContents did not appear within ${timeoutMs}ms`,
+			);
+		}
+		return wc;
+	}
+
+	// Wait for the popup to become hidden (the upstream "submit
+	// accepted" signal). Upstream reuses the popup BrowserWindow
+	// across invocations — Ko stays alive, only the visibility
+	// toggles — so checking webContents existence would never
+	// resolve. Read isVisible() on the captured BrowserWindow ref
+	// instead.
+	async waitForPopupClosed(timeoutMs = 5000): Promise<void> {
+		const closed = await retryUntil(
+			async () => {
+				const state = await this.getPopupState();
+				if (!state) return true; // destroyed → closed
+				return state.visible ? null : true;
+			},
+			{ timeout: timeoutMs, interval: 100 },
+		);
+		if (!closed) {
+			throw new Error(
+				`Quick Entry popup did not become hidden within ${timeoutMs}ms`,
+			);
+		}
+	}
+
+	// Read live properties of the popup BrowserWindow. Replaces the
+	// previous getPopupConstructionArgs — construction-time options
+	// aren't observable through the prototype-method hook, but every
+	// upstream-relevant signal has a runtime equivalent. Frame state
+	// uses `getContentBounds() vs getBounds()` (frameless windows
+	// have equal content + frame bounds). Transparent uses the
+	// background color (popup is `#00000000`).
+	async getPopupRuntimeProps(): Promise<{
+		frameless: boolean;
+		transparent: boolean;
+		alwaysOnTop: boolean;
+		backgroundColor: string;
+	} | null> {
+		// `skipTaskbar` was previously reported here but BrowserWindow
+		// has no isSkipTaskbar() getter; the field hardcoded `false`
+		// regardless of how the popup was constructed, which is
+		// misleading. Dropped — no current spec consumes it. If a
+		// future test needs it, capture via a setSkipTaskbar wrap in
+		// installInterceptor() rather than faking a getter.
+		return await this.inspector.evalInMain(`
+			const wins = globalThis.__qeWindows || [];
+			const isPopup = ${this.popupSelector()};
+			const popup = wins.find(isPopup);
+			if (!popup || !popup.ref || popup.ref.isDestroyed()) return null;
+			const w = popup.ref;
+			const bounds = w.getBounds();
+			const content = w.getContentBounds();
+			const bg = (w.getBackgroundColor && w.getBackgroundColor()) || '';
+			return {
+				frameless: bounds.width === content.width && bounds.height === content.height,
+				transparent: bg === '#00000000' || bg === '#0000',
+				alwaysOnTop: w.isAlwaysOnTop(),
+				backgroundColor: bg,
+			};
+		`);
+	}
+
+	// Read the popup BrowserWindow's runtime visibility / bounds /
+	// focus / fullscreen state. Used by waitForPopupReady and
+	// waitForPopupClosed; the popup is reused across invocations
+	// (Ko stays alive, only visibility toggles), so isVisible() is
+	// the right "open vs closed" signal — not webContents existence.
+	async getPopupState(): Promise<(BrowserWindowState & { alwaysOnTop: boolean }) | null> {
+		return await this.inspector.evalInMain(`
+			const wins = globalThis.__qeWindows || [];
+			const isPopup = ${this.popupSelector()};
+			const popup = wins.find(isPopup);
+			if (!popup || !popup.ref || popup.ref.isDestroyed()) return null;
+			const w = popup.ref;
+			return {
+				visible: w.isVisible(),
+				minimized: w.isMinimized(),
+				fullScreen: w.isFullScreen(),
+				focused: w.isFocused(),
+				bounds: w.getBounds(),
+				alwaysOnTop: w.isAlwaysOnTop(),
+			};
+		`);
+	}
+
+	// Wait for the popup to be fully ready for input — meaning:
+	//   (a) BrowserWindow has been show()n (isVisible === true),
+	//       which only fires after upstream's `ready-to-show` event,
+	//       which is after React's mount + first-pass effects, which
+	//       is when document.addEventListener('keydown', ...) gets
+	//       attached;
+	//   (b) the textarea exists in the DOM.
+	// Without (a), first-time-mount typing fires keydown into a
+	// document with no listener and the submit silently drops.
+	async waitForPopupReady(timeoutMs = 5000): Promise<void> {
+		const popup = await this.waitForPopup(timeoutMs);
+		let lastState: unknown = null;
+		const ready = await retryUntil(
+			async () => {
+				const state = await this.getPopupState();
+				const dom = await this.inspector
+					.evalInMain<{
+						readyState: string;
+						hasTextarea: boolean;
+					} | null>(
+						`
+							const { webContents } = process.mainModule.require('electron');
+							const wc = webContents.fromId(${popup.id});
+							if (!wc || wc.isDestroyed()) return null;
+							return await wc.executeJavaScript(\`(() => ({
+								readyState: document.readyState,
+								hasTextarea: !!(document.querySelector('textarea')
+									|| document.querySelector('[contenteditable="true"]')),
+							}))()\`);
+						`,
+					)
+					.catch(() => null);
+				lastState = { state, dom };
+				if (!state || !state.visible) return null;
+				return dom && dom.hasTextarea ? dom : null;
+			},
+			{ timeout: timeoutMs, interval: 100 },
+		);
+		if (!ready) {
+			throw new Error(
+				`Popup did not become visible with a textarea within ${timeoutMs}ms. ` +
+					`Last observed: ${JSON.stringify(lastState)}`,
+			);
+		}
+	}
+
+	// Type a prompt into the popup's textarea and submit. The popup is
+	// a React app with a textarea + send button; React tracks the last
+	// known value on the element, so plain `el.value = ...` is ignored.
+	// The native-setter call in typeAndSubmitJs is the usual workaround.
+	//
+	// Waits for the textarea to exist before dispatching — first-time
+	// lazy popup creation needs the React mount to complete, otherwise
+	// the input event lands before any state listener and upstream
+	// drops the submit as empty.
+	async typeAndSubmit(text: string): Promise<void> {
+		await this.waitForPopupReady();
+		const popup = await this.getPopupWebContents();
+		if (!popup) throw new Error('popup vanished after waitForPopupReady');
+		const popupId = popup.id;
+		await this.inspector.evalInMain<null>(`
+			const { webContents } = process.mainModule.require('electron');
+			const wc = webContents.fromId(${popupId});
+			if (!wc) throw new Error('popup webContents ${popupId} gone');
+			await wc.executeJavaScript(${JSON.stringify(typeAndSubmitJs(text))});
+			return null;
+		`);
+	}
+
+	// Read the persisted popup position (S35) directly from the
+	// on-disk store. electron-store defaults to `config.json` under the
+	// app's userData dir; for claude-desktop that's
+	// `${configDir}/Claude/config.json` (or `~/.config/Claude/...`
+	// when no isolation is in play). Reading the file beats the
+	// previous globalThis-walk: that probe matched any object with
+	// .get/.set returning a `quickWindowPosition` value, which is
+	// fragile against unrelated minified objects coincidentally
+	// matching the shape.
+	//
+	// Optional `configDir` keeps the call backward-compatible — pass
+	// `app.isolation?.configDir` from runners under per-test isolation,
+	// omit it to fall back to the host's `~/.config/Claude`.
+	async getStoredPosition(configDir?: string): Promise<unknown | null> {
+		const storePath = configDir
+			? join(configDir, 'config.json')
+			: join(homedir(), '.config/Claude/config.json');
+		try {
+			const raw = await readFile(storePath, 'utf8');
+			const parsed = JSON.parse(raw) as { quickWindowPosition?: unknown };
+			return parsed.quickWindowPosition ?? null;
+		} catch {
+			// File missing (never saved) or unreadable — both null.
+			return null;
+		}
+	}
+}
+
+// Upstream loads the popup via
+//   loadFile('.vite/renderer/quick_window/quick-window.html')
+// (build-reference index.js:515443). Anchor on that exact path. Fall
+// back to a broader 'quick_window/' substring if upstream renames just
+// the HTML file.
+export function isPopupUrl(url: string): boolean {
+	if (!url.startsWith('file://')) return false;
+	if (url.includes('claude.ai')) return false;
+	if (url.includes('quick_window/quick-window.html')) return true;
+	if (url.includes('/quick_window/')) return true;
+	return false;
+}
+
+// React-friendly value setter. document.activeElement isn't reliable
+// because the popup may not have focus on construction; we walk the
+// DOM for the only textarea (or contenteditable).
+function typeAndSubmitJs(text: string): string {
+	const escaped = JSON.stringify(text);
+	return `
+		(async () => {
+			const input = document.querySelector('textarea')
+				|| document.querySelector('[contenteditable="true"]');
+			if (!input) throw new Error('no textarea/contenteditable in popup DOM');
+			input.focus();
+			if (input.tagName === 'TEXTAREA') {
+				const setter = Object.getOwnPropertyDescriptor(
+					HTMLTextAreaElement.prototype, 'value'
+				).set;
+				setter.call(input, ${escaped});
+				input.dispatchEvent(new Event('input', { bubbles: true }));
+			} else {
+				input.textContent = ${escaped};
+				input.dispatchEvent(new InputEvent('input', { bubbles: true, data: ${escaped} }));
+			}
+			// Submit via Enter keydown — popup binds its own keyhandler
+			// (renderer-side per the closeout doc).
+			input.dispatchEvent(new KeyboardEvent('keydown', {
+				key: 'Enter', code: 'Enter', keyCode: 13, which: 13,
+				bubbles: true, cancelable: true,
+			}));
+			input.dispatchEvent(new KeyboardEvent('keyup', {
+				key: 'Enter', code: 'Enter', keyCode: 13, which: 13,
+				bubbles: true,
+			}));
+		})()
+	`;
+}
+
+// Main-window state manipulation. Used by QE-7/8/9/10/11 to set the
+// precondition (minimized, hidden-to-tray, fullscreen, etc.) before
+// triggering Quick Entry.
+//
+// All methods walk webContents to find the claude.ai-hosting
+// BrowserWindow via BrowserWindow.fromWebContents(). The
+// `BrowserWindow.getAllWindows()` registry is broken by frame-fix-
+// wrapper (see lib/inspector.ts gotchas) but `fromWebContents` uses a
+// different code path and remains reliable.
+export class MainWindow {
+	constructor(private readonly inspector: InspectorClient) {}
+
+	async setState(action: 'minimize' | 'hide' | 'show' | 'restore' | 'fullScreen' | 'unFullScreen' | 'focus' | 'close'): Promise<void> {
+		await this.inspector.evalInMain<null>(`
+			const { webContents, BrowserWindow } = process.mainModule.require('electron');
+			const main = webContents.getAllWebContents().find(w => w.getURL().includes('claude.ai'));
+			if (!main) throw new Error('no claude.ai webContents — main not yet loaded');
+			const win = BrowserWindow.fromWebContents(main);
+			if (!win) throw new Error('no BrowserWindow for claude.ai webContents');
+			switch (${JSON.stringify(action)}) {
+				case 'minimize':    win.minimize(); break;
+				case 'hide':        win.hide(); break;
+				case 'show':        win.show(); break;
+				case 'restore':     win.restore(); break;
+				case 'fullScreen':  win.setFullScreen(true); break;
+				case 'unFullScreen':win.setFullScreen(false); break;
+				case 'focus':       win.focus(); break;
+				// 'close' fires the BrowserWindow 'close' event so
+				// frame-fix-wrapper.js:178-185 (the close-to-tray
+				// interceptor) and the upstream before-quit flow
+				// run as they would on a real X-button click. NOT
+				// the same as 'hide' — that bypasses the wrapper.
+				// T08 asserts on this distinction.
+				case 'close':       win.close(); break;
+			}
+			return null;
+		`);
+		// Compositor-side state changes are async — small settle.
+		await sleep(150);
+	}
+
+	async getState(): Promise<BrowserWindowState | null> {
+		return await this.inspector.evalInMain(`
+			const { webContents, BrowserWindow } = process.mainModule.require('electron');
+			const main = webContents.getAllWebContents().find(w => w.getURL().includes('claude.ai'));
+			if (!main) return null;
+			const win = BrowserWindow.fromWebContents(main);
+			if (!win || win.isDestroyed()) return null;
+			return {
+				visible: win.isVisible(),
+				minimized: win.isMinimized(),
+				fullScreen: win.isFullScreen(),
+				focused: win.isFocused(),
+				bounds: win.getBounds(),
+			};
+		`);
+	}
+}
+
+// Wait for the claude.ai user object to be loaded — the precondition
+// for upstream's lHn() (`!user.isLoggedOut`) returning true. The
+// shortcut handler calls Ko.show() only when lHn() is true; if the
+// renderer hasn't finished loading the user yet, the popup gets
+// constructed and ready-to-show fires, but show() is silently
+// skipped (build-reference index.js:515604). The user object is
+// available once the renderer has navigated past the login page —
+// e.g. /new, /chat/<uuid>, /code, /projects.
+//
+// Returns the post-login URL on success. Returns null on timeout —
+// caller can decide to skip vs fail.
+//
+// Anchored at the host root and bounded with a path-terminator class so
+// only `/login`, `/auth`, `/sign-in` etc. as the *first* path segment
+// match. The previous unanchored `/\/(login|auth|sign[-_]?in)/i` also
+// caught longer segments such as `/authors` and any URL containing
+// `/login` further down the path.
+const LOGIN_URL_RE =
+	/^https?:\/\/[^/]+\/(login|auth|sign[-_]?in)(?:[/?#]|$)/i;
+
+export async function waitForUserLoaded(
+	inspector: InspectorClient,
+	timeoutMs = 30_000,
+): Promise<string | null> {
+	return await retryUntil(
+		async () => {
+			const urls = await inspector.evalInMain<string[]>(`
+				const { webContents } = process.mainModule.require('electron');
+				return webContents.getAllWebContents()
+					.filter(w => w.getURL().includes('claude.ai'))
+					.map(w => w.getURL());
+			`);
+			const postLogin = urls.find(
+				(u) => !LOGIN_URL_RE.test(u) && u.includes('claude.ai'),
+			);
+			return postLogin ?? null;
+		},
+		{ timeout: timeoutMs, interval: 250 },
+	);
+}
+
+// Wait for a new chat session to load in the claude.ai webContents.
+// Returns the URL once a /chat/<uuid> path is reached. This is the
+// network-coupled half of the layered submit assertion (S31): a slow
+// claude.ai or a network blip can fail this independently of any QE
+// regression. Callers should treat its failure as Should, not Critical.
+const CHAT_URL_RE = /\/chat\/[0-9a-f-]{8,}/i;
+
+export async function waitForNewChat(
+	inspector: InspectorClient,
+	timeoutMs = 15_000,
+): Promise<string | null> {
+	return await retryUntil(
+		async () => {
+			const all = await inspector.evalInMain<{ url: string }[]>(`
+				const { webContents } = process.mainModule.require('electron');
+				return webContents.getAllWebContents()
+					.filter(w => w.getURL().includes('claude.ai'))
+					.map(w => ({ url: w.getURL() }));
+			`);
+			const match = all.find((w) => CHAT_URL_RE.test(w.url));
+			return match ? match.url : null;
+		},
+		{ timeout: timeoutMs, interval: 250 },
+	);
+}
+
+// Local-only assertion half: did the popup-side IPC fire with the
+// right payload? Records the popup's `requestDismissWithPayload` IPC
+// messages with a parallel listener on the main side. Start it before
+// typeAndSubmit without awaiting, then await it afterwards; resolves
+// with the captured payload (or null on timeout).
+export async function captureSubmitIpc(
+	inspector: InspectorClient,
+	timeoutMs = 5000,
+): Promise<{ text: string } | null> {
+	await inspector.evalInMain<null>(`
+		if (!globalThis.__qeIpcInstalled) {
+			const { ipcMain } = process.mainModule.require('electron');
+			globalThis.__qeIpcCalls = [];
+			// Record 'requestDismiss'-shaped messages on both channels.
+			// Channel names are minified-stable: requestDismiss /
+			// requestDismissWithPayload (closeout doc index.js:515409).
+			const channels = ['requestDismissWithPayload', 'requestDismiss'];
+			for (const ch of channels) {
+				// Best-effort: register a parallel listener that records
+				// messages without disturbing the original handler. Note
+				// that ipcMain.on sees send() traffic only, not invoke().
+				ipcMain.on(ch, (_event, payload) => {
+					globalThis.__qeIpcCalls.push({ channel: ch, payload, ts: Date.now() });
+				});
+			}
+			globalThis.__qeIpcInstalled = true;
+		}
+		return null;
+	`);
+	return await retryUntil(
+		async () => {
+			const calls = await inspector.evalInMain<
+				{ channel: string; payload: unknown; ts: number }[]
+			>(`return globalThis.__qeIpcCalls || []`);
+			const submit = calls.find(
+				(c) =>
+					c.channel === 'requestDismissWithPayload' &&
+					c.payload != null &&
+					typeof c.payload === 'object',
+			);
+			if (!submit) return null;
+			const p = submit.payload as Record<string, unknown>;
+			const text =
+				typeof p.text === 'string'
+					? p.text
+					: typeof p.prompt === 'string'
+						? p.prompt
+						: typeof p.value === 'string'
+							? p.value
+							: '';
+			return { text };
+		},
+		{ timeout: timeoutMs, interval: 100 },
+	);
+}
+
+async function ensureYdotool(): Promise<void> {
+	try {
+		// `ydotool` with no args exits 1 and prints the help text — that
+		// confirms the binary works without sending input. Avoid
+		// `ydotool --help` which is rejected as an unknown command.
+		await exec('ydotool', [], {
+			env: {
+				...process.env,
+				YDOTOOL_SOCKET:
+					process.env.YDOTOOL_SOCKET ?? '/tmp/.ydotool_socket',
+			} as Record<string, string>,
+		});
+	} catch (err) {
+		const e = err as { code?: string | number; stderr?: string };
+		// exit 1 with usage help is normal — only fail on ENOENT (no
+		// binary) or stderr socket errors.
+		const stderr = (e.stderr ?? '').toString();
+		if (e.code === 'ENOENT') {
+			throw new Error(
+				'ydotool binary not found on PATH. Install with ' +
+					'`dnf install ydotool` / `apt install ydotool`.',
+			);
+		}
+		if (stderr.includes('failed to connect socket')) {
+			throw new Error(
+				'ydotoold socket not reachable. Start the daemon ' +
+					'(`sudo systemctl start ydotool.service`) and ensure ' +
+					'YDOTOOL_SOCKET points at its bind path. Underlying: ' +
+					stderr.trim(),
+			);
+		}
+		// Any other non-zero exit (notably exit 1 with usage) is fine.
+	}
+}
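Reviewer sketch for `LOGIN_URL_RE` in the file above: the snippet below compares the anchored pattern with the earlier unanchored one quoted in its comment. The sample URLs are assumptions chosen for illustration, not paths confirmed to exist on claude.ai, and `classify` is a hypothetical helper.

```typescript
// Anchored pattern, copied from LOGIN_URL_RE in the file above.
const anchored = /^https?:\/\/[^/]+\/(login|auth|sign[-_]?in)(?:[/?#]|$)/i;
// Earlier unanchored pattern, as quoted in the LOGIN_URL_RE comment.
const unanchored = /\/(login|auth|sign[-_]?in)/i;

// Hypothetical helper: report how each pattern treats one URL.
function classify(url: string): { anchored: boolean; unanchored: boolean } {
	return { anchored: anchored.test(url), unanchored: unanchored.test(url) };
}

// Hypothetical sample URLs.
const sampleUrls = [
	'https://claude.ai/login',
	'https://claude.ai/login?returnTo=%2Fnew',
	'https://claude.ai/settings/login-history',
	'https://claude.ai/new',
];

for (const url of sampleUrls) {
	const r = classify(url);
	console.log(`${url} anchored=${r.anchored} unanchored=${r.unanchored}`);
}
```

Only the third sample differs between the two patterns: `/login` appears below the first path segment, so the unanchored pattern would treat it as a login page while the anchored one does not.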
--- a/tools/test-harness/src/lib/retry.ts
+++ b/tools/test-harness/src/lib/retry.ts
@@ -0,0 +1,27 @@
+export interface RetryOptions {
+	timeout?: number;
+	interval?: number;
+	message?: string;
+}
+
+export async function retryUntil<T>(
+	fn: () => Promise<T | null | undefined>,
+	options: RetryOptions = {},
+): Promise<T | null> {
+	const timeout = options.timeout ?? 10_000;
+	const interval = options.interval ?? 250;
+	const start = Date.now();
+
+	while (Date.now() - start < timeout) {
+		const result = await fn();
+		if (result !== null && result !== undefined) {
+			return result;
+		}
+		await sleep(interval);
+	}
+	return null;
+}
+
+export function sleep(ms: number): Promise<void> {
+	return new Promise((resolve) => setTimeout(resolve, ms));
+}
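Reviewer sketch of how `retryUntil` behaves for callers. The helpers are re-declared locally so the snippet runs standalone; `makeFlakyProbe` and `demo` are hypothetical names that do not exist in the harness.

```typescript
// Local copies of the helpers from retry.ts so the sketch is self-contained.
function sleep(ms: number): Promise<void> {
	return new Promise((resolve) => setTimeout(resolve, ms));
}

async function retryUntil<T>(
	fn: () => Promise<T | null | undefined>,
	options: { timeout?: number; interval?: number } = {},
): Promise<T | null> {
	const timeout = options.timeout ?? 10_000;
	const interval = options.interval ?? 250;
	const start = Date.now();
	while (Date.now() - start < timeout) {
		const result = await fn();
		if (result !== null && result !== undefined) return result;
		await sleep(interval);
	}
	return null;
}

// Hypothetical probe: resolves to null until the Nth call, then to a string.
function makeFlakyProbe(succeedOn: number): () => Promise<string | null> {
	let calls = 0;
	return async () => {
		calls += 1;
		return calls >= succeedOn ? `ready after ${calls} calls` : null;
	};
}

async function demo(): Promise<{ hit: string | null; miss: string | null }> {
	// Succeeds on the third poll, well inside the timeout.
	const hit = await retryUntil(makeFlakyProbe(3), { timeout: 1_000, interval: 10 });
	// A probe that never succeeds resolves to null after the timeout;
	// nothing is thrown, so the caller has to check for null.
	const miss = await retryUntil<string>(async () => null, { timeout: 50, interval: 10 });
	return { hit, miss };
}
```

Only `null` and `undefined` trigger another poll. A probe that returns `false` or `0` ends the wait immediately, which is why the polling callbacks in this patch return `null` rather than `false` for "not yet".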
--- a/tools/test-harness/src/lib/row.ts
+++ b/tools/test-harness/src/lib/row.ts
@@ -0,0 +1,48 @@
+// Row-aware skip primitive.
+//
+// Spec files declare which matrix rows they apply to. Anything else is
+// skipped (not failed) so the JUnit run carries `<skipped>` →
+// `matrix.md` cell `-`. See Decision 1 in docs/testing/automation.md
+// for the JUnit-to-cell mapping.
+//
+// Usage in a runner:
+//   skipUnlessRow(testInfo, ['KDE-W', 'GNOME-W', 'Ubu-W']);
+//
+// The reason is auto-formatted from the row list so the dashboard
+// caller doesn't have to write it.
+
+import type { TestInfo } from '@playwright/test';
+import { getEnv } from './env.js';
+
+export type Row =
+	| 'KDE-W'
+	| 'KDE-X'
+	| 'GNOME-W'
+	| 'GNOME-X'
+	| 'Ubu-W'
+	| 'Ubu-X'
+	| 'COSMIC'
+	| 'Sway'
+	| 'Niri'
+	| 'Hypr-O'
+	| 'Hypr-N'
+	| 'i3';
+
+export function currentRow(): string {
+	return getEnv().row;
+}
+
+export function skipUnlessRow(testInfo: TestInfo, allowed: Row[]): void {
+	const row = currentRow();
+	if (allowed.includes(row as Row)) return;
+	testInfo.skip(
+		true,
+		`row ${row} not in [${allowed.join(', ')}] — applies-to mismatch`,
+	);
+}
+
+export function skipOnRow(testInfo: TestInfo, blocked: Row[]): void {
+	const row = currentRow();
+	if (!blocked.includes(row as Row)) return;
+	testInfo.skip(true, `row ${row} excluded`);
+}
--- a/tools/test-harness/src/lib/sni.ts
+++ b/tools/test-harness/src/lib/sni.ts
@@ -0,0 +1,53 @@
+import { getSessionBus, getConnectionPid, method } from './dbus.js';
+import type { Variant } from 'dbus-next';
+
+const WATCHER_DEST = 'org.kde.StatusNotifierWatcher';
+const WATCHER_PATH = '/StatusNotifierWatcher';
+const ITEM_IFACE = 'org.kde.StatusNotifierItem';
+
+export interface SniItem {
+	service: string;
+	objectPath: string;
+}
+
+export async function listRegisteredItems(): Promise<SniItem[]> {
+	const bus = getSessionBus();
+	const proxy = await bus.getProxyObject(WATCHER_DEST, WATCHER_PATH);
+	const props = proxy.getInterface('org.freedesktop.DBus.Properties');
+	const result = await method(props, 'Get')(
+		WATCHER_DEST, // interface name; same string as the bus name
+		'RegisteredStatusNotifierItems',
+	);
+	const variant = result as Variant<string[]>;
+	return variant.value.map(parseItemAddress);
+}
+
+export async function findItemByPid(pid: number): Promise<SniItem | null> {
+	const items = await listRegisteredItems();
+	for (const item of items) {
+		try {
+			const itemPid = await getConnectionPid(item.service);
+			if (itemPid === pid) {
+				return item;
+			}
+		} catch {
+			// connection may have gone away mid-iteration; skip
+		}
+	}
+	return null;
+}
+
+export async function activateItem(item: SniItem): Promise<void> {
+	const bus = getSessionBus();
+	const proxy = await bus.getProxyObject(item.service, item.objectPath);
+	const iface = proxy.getInterface(ITEM_IFACE);
+	await method(iface, 'Activate')(0, 0);
+}
+
+function parseItemAddress(raw: string): SniItem {
+	const slash = raw.indexOf('/');
+	if (slash === -1) {
+		return { service: raw, objectPath: '/StatusNotifierItem' };
+	}
+	return { service: raw.slice(0, slash), objectPath: raw.slice(slash) };
+}
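Reviewer sketch of the address split performed by `parseItemAddress`. The function body follows the one in sni.ts; the sample entries are hypothetical values, not captured from a real StatusNotifierWatcher.

```typescript
interface SniItem {
	service: string;
	objectPath: string;
}

// Same split rule as parseItemAddress in sni.ts: everything before the
// first '/' is the bus name, the rest is the object path.
function parseItemAddress(raw: string): SniItem {
	const slash = raw.indexOf('/');
	if (slash === -1) {
		return { service: raw, objectPath: '/StatusNotifierItem' };
	}
	return { service: raw.slice(0, slash), objectPath: raw.slice(slash) };
}

// Hypothetical registry entries.
const sampleEntries = [
	':1.42/StatusNotifierItem',
	'org.kde.StatusNotifierItem-1234-1',
	':1.77/org/ayatana/NotificationItem/example',
];

for (const entry of sampleEntries) {
	console.log(entry, parseItemAddress(entry));
}
```

An entry without a slash is treated as a bare bus name and gets the default `/StatusNotifierItem` path.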
--- a/tools/test-harness/src/lib/wm.ts
+++ b/tools/test-harness/src/lib/wm.ts
@@ -0,0 +1,71 @@
+import { execFile } from 'node:child_process';
+import { promisify } from 'node:util';
+
+const exec = promisify(execFile);
+
+export interface FrameExtents {
+	left: number;
+	right: number;
+	top: number;
+	bottom: number;
+}
+
+export async function findX11WindowByPid(pid: number): Promise<string | null> {
+	// Walk _NET_CLIENT_LIST and match on _NET_WM_PID. Pure xprop, no
+	// xdotool dependency — Electron's main window will surface here once
+	// the WM has accepted it.
+	const ids = await listClientWindows();
+	let firstMatch: string | null = null;
+	for (const id of ids) {
+		const wmPid = await getWindowPid(id);
+		if (wmPid !== pid) continue;
+		const title = await getWindowProperty(id, '_NET_WM_NAME');
+		if (title) return id;
+		if (!firstMatch) firstMatch = id;
+	}
+	return firstMatch;
+}
+
+async function listClientWindows(): Promise<string[]> {
+	try {
+		const { stdout } = await exec('xprop', ['-root', '_NET_CLIENT_LIST']);
+		// _NET_CLIENT_LIST(WINDOW): window id # 0x1234, 0x5678, ...
+		const m = stdout.match(/#\s*(.+)$/m);
+		if (!m) return [];
+		return m[1]!.split(',').map((s) => s.trim()).filter(Boolean);
+	} catch {
+		return [];
+	}
+}
+
+async function getWindowPid(windowId: string): Promise<number | null> {
+	const raw = await getWindowProperty(windowId, '_NET_WM_PID');
+	if (!raw) return null;
+	const n = parseInt(raw, 10);
+	return Number.isNaN(n) ? null : n;
+}
+
+export async function getFrameExtents(windowId: string): Promise<FrameExtents | null> {
+	const raw = await getWindowProperty(windowId, '_NET_FRAME_EXTENTS');
+	if (!raw) return null;
+	const nums = raw.split(',').map((s) => parseInt(s.trim(), 10));
+	if (nums.length !== 4 || nums.some(Number.isNaN)) return null;
+	return { left: nums[0]!, right: nums[1]!, top: nums[2]!, bottom: nums[3]! };
+}
+
+export async function getWindowTitle(windowId: string): Promise<string | null> {
+	const raw = await getWindowProperty(windowId, '_NET_WM_NAME');
+	if (!raw) return null;
+	const m = raw.match(/^"(.*)"$/s);
+	return m ? m[1]! : raw;
+}
+
+async function getWindowProperty(windowId: string, prop: string): Promise<string | null> {
+	try {
+		const { stdout } = await exec('xprop', ['-id', windowId, prop]);
+		const m = stdout.match(/=\s*(.+)$/m);
+		return m ? m[1]!.trim() : null;
+	} catch {
+		return null;
+	}
+}
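Reviewer sketch of the xprop parsing that wm.ts relies on, run against canned output instead of a live X server. The regexes follow the ones in the file; the window ids and extents are hypothetical values, and the helper names are illustrative only.

```typescript
interface FrameExtents {
	left: number;
	right: number;
	top: number;
	bottom: number;
}

// Same regex as listClientWindows: take everything after '#'.
function parseClientList(stdout: string): string[] {
	const m = stdout.match(/#\s*(.+)$/m);
	if (!m) return [];
	return m[1]!.split(',').map((s) => s.trim()).filter(Boolean);
}

// Same regex as getWindowProperty: take everything after '='.
function parsePropertyValue(stdout: string): string | null {
	const m = stdout.match(/=\s*(.+)$/m);
	return m ? m[1]!.trim() : null;
}

// Same field order as getFrameExtents: left, right, top, bottom.
function parseFrameExtents(raw: string | null): FrameExtents | null {
	if (!raw) return null;
	const nums = raw.split(',').map((s) => parseInt(s.trim(), 10));
	if (nums.length !== 4 || nums.some(Number.isNaN)) return null;
	return { left: nums[0]!, right: nums[1]!, top: nums[2]!, bottom: nums[3]! };
}

// Canned xprop output with hypothetical values.
const clientListOut = '_NET_CLIENT_LIST(WINDOW): window id # 0x3a00007, 0x4200003\n';
const extentsOut = '_NET_FRAME_EXTENTS(CARDINAL) = 0, 0, 37, 0\n';
const missingOut = '_NET_FRAME_EXTENTS:  not found.\n';

console.log(parseClientList(clientListOut));
console.log(parseFrameExtents(parsePropertyValue(extentsOut)));
console.log(parseFrameExtents(parsePropertyValue(missingOut)));
```

A property that xprop reports as missing has no `=` in its output, so it comes back as `null` rather than as a zeroed extents object.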
--- a/tools/test-harness/src/runners/H01_cdp_gate_canary.spec.ts
+++ b/tools/test-harness/src/runners/H01_cdp_gate_canary.spec.ts
@@ -0,0 +1,184 @@
+import { test, expect } from '@playwright/test';
+import { spawn } from 'node:child_process';
+import { existsSync } from 'node:fs';
+import { dirname } from 'node:path';
+import { createIsolation } from '../lib/isolation.js';
+
+// H-prefix runners are HARNESS self-tests — they validate the test
+// harness's preconditions and the build pipeline's invariants, distinct
+// from T-tests (upstream test cases) and S-tests (doc-spec entries).
+// They tend to be cheap (file probes, exit-code assertions) and exist
+// to catch silent drift in the things our other tests assume.
+//
+// H01 — CDP auth gate canary.
+//
+// The whole L1 strategy (lib/electron.ts:96-110) hinges on the fact
+// that the shipped Electron exits the app whenever
+// `--remote-debugging-port` / `--remote-debugging-pipe` is on argv
+// without a valid CLAUDE_CDP_AUTH token. If upstream removes that
+// gate, every L1 test silently weakens — Playwright's
+// `_electron.launch()` (which always injects --remote-debugging-port=0)
+// would start working again, but our SIGUSR1-attach pathway would
+// keep "passing" without exercising the contract it was built for.
+//
+// This canary spawns the bundled Electron directly with
+// --remote-debugging-port=0 and NO auth token, then asserts the
+// process exits with code 1 (the gate's `process.exit(1)` per
+// lib/electron.ts:96-97) and was not killed by signal. Timeout
+// without exit means the gate is gone.
+//
+// Spawn-only — no app stays running, no inspector attach, no
+// X11 window probe. Pure exit-code observation under isolation
+// so the host config never sees the failed launch.
+//
+// Row-independent: the gate's Linux behavior is the same on every
+// row we ship to. Don't `skipUnlessRow`.
+
+// DEFAULT_INSTALL_PATHS mirror lib/electron.ts:123-132 — kept inline
+// rather than importing resolveInstall() so this canary can run even
+// if a future change to electron.ts breaks the import surface (the
+// canary should be the LEAST coupled spec to any moving part).
+const DEFAULT_INSTALL_PATHS: { electron: string; asar: string }[] = [
+	{
+		electron: '/usr/lib/claude-desktop/node_modules/electron/dist/electron',
+		asar: '/usr/lib/claude-desktop/node_modules/electron/dist/resources/app.asar',
+	},
+	{
+		electron: '/opt/Claude/node_modules/electron/dist/electron',
+		asar: '/opt/Claude/node_modules/electron/dist/resources/app.asar',
+	},
+];
+
+function resolveInstallInline(): { electron: string; asar: string } {
+	const envBin = process.env.CLAUDE_DESKTOP_ELECTRON;
+	const envAsar = process.env.CLAUDE_DESKTOP_APP_ASAR;
+	if (envBin && envAsar) return { electron: envBin, asar: envAsar };
+	for (const candidate of DEFAULT_INSTALL_PATHS) {
+		if (existsSync(candidate.electron) && existsSync(candidate.asar)) {
+			return candidate;
+		}
+	}
+	throw new Error(
+		'Could not locate claude-desktop install. Set CLAUDE_DESKTOP_ELECTRON ' +
+			'and CLAUDE_DESKTOP_APP_ASAR, or install the deb/rpm package.',
+	);
+}
+
+test.setTimeout(30_000);
+
+test('H01 — CDP auth gate fires on --remote-debugging-port without token', async ({}, testInfo) => {
+	testInfo.annotations.push({ type: 'severity', description: 'Critical' });
+	testInfo.annotations.push({ type: 'surface', description: 'CDP auth gate' });
+
+	const { electron: electronBin, asar } = resolveInstallInline();
+	const appDir = dirname(dirname(dirname(dirname(electronBin))));
+
+	// Fresh isolation — the gate trips before any persisted state is
+	// touched, but if anything sneaks past `process.exit(1)` we'd
+	// rather it write to /tmp than ~/.config/Claude.
+	const isolation = await createIsolation();
+	const start = Date.now();
+
+	// Raw spawn — no LAUNCHER_INJECTED_FLAGS, no isolation env beyond
+	// what we set explicitly. The OPPOSITE of launchClaude(): we WANT
+	// the debug-port flag on argv so the gate fires.
+	const argv = [
+		'--remote-debugging-port=0',
+		asar,
+	];
+
+	// Build env: scrub CLAUDE_CDP_AUTH so a developer who set it
+	// locally doesn't accidentally pass the gate. Keep the rest of
+	// the parent env so Electron's normal load path (DISPLAY,
+	// XDG_RUNTIME_DIR, etc.) still works up to the gate check.
+	const env: Record<string, string> = {};
+	for (const [k, v] of Object.entries(process.env)) {
+		if (v !== undefined) env[k] = v;
+	}
+	delete env.CLAUDE_CDP_AUTH;
+	for (const [k, v] of Object.entries(isolation.env)) {
+		env[k] = v;
+	}
+
+	const proc = spawn(electronBin, argv, {
+		cwd: appDir,
+		env,
+		stdio: 'ignore',
+		detached: false,
+	});
+
+	let exitCode: number | null = null;
+	let signalCode: NodeJS.Signals | null = null;
+	let timedOut = false;
+
+	try {
+		await Promise.race([
+			new Promise<void>((resolve) => {
+				proc.once('exit', (code, signal) => {
+					exitCode = code;
+					signalCode = signal;
+					resolve();
+				});
+			}),
+			new Promise<void>((resolve) => {
+				setTimeout(() => {
+					timedOut = true;
+					resolve();
+				}, 10_000);
+			}),
+		]);
+	} finally {
+		// If the gate didn't fire we have a live Electron — kill it
+		// hard so the test environment isn't polluted by a running
+		// app pointed at the host's display.
+		if (proc.exitCode === null && proc.signalCode === null) {
+			proc.kill('SIGKILL');
+			await new Promise<void>((resolve) => {
+				proc.once('exit', () => resolve());
+				setTimeout(() => resolve(), 2_000);
+			});
+		}
+		await isolation.cleanup();
+	}
+
+	const elapsedMs = Date.now() - start;
+
+	await testInfo.attach('spawn-argv', {
+		body: JSON.stringify([electronBin, ...argv], null, 2),
+		contentType: 'application/json',
+	});
+	await testInfo.attach('exit-info', {
+		body: JSON.stringify(
+			{
+				exitCode,
+				signalCode,
+				timedOut,
+				elapsedMs,
+				note:
+					'Gate fires via process.exit(1) (lib/electron.ts:96-107). ' +
+					'exitCode=1, signalCode=null is the canonical signature.',
+			},
+			null,
+			2,
+		),
+		contentType: 'application/json',
+	});
+
+	if (timedOut) {
+		throw new Error(
+			'CDP gate did not fire — app stayed running with ' +
+				'--remote-debugging-port flag and no auth token, gate may ' +
+				'have been removed (lib/electron.ts:96-107). The L1 test ' +
+				'strategy depends on this gate being present.',
+		);
+	}
+
+	expect(
+		exitCode,
+		'gate exits with code 1 (process.exit(1) in index.pre.js)',
+	).toBe(1);
+	expect(
+		signalCode,
+		'process exited via gate, not killed by signal',
+	).toBe(null);
+});
--- a/tools/test-harness/src/runners/H02_frame_fix_wrapper_present.spec.ts
+++ b/tools/test-harness/src/runners/H02_frame_fix_wrapper_present.spec.ts
@@ -0,0 +1,145 @@
+import { test, expect } from '@playwright/test';
+import { listAsar, readAsarFile, resolveAsarPath } from '../lib/asar.js';
+
+// H02 — frame-fix-wrapper presence (file probe).
+//
+// The wrapper at scripts/frame-fix-wrapper.js is the linchpin of every
+// Linux frame fix (close-to-tray, autostart shim, KWin child-bounds
+// jiggle, AZERTY Ctrl+Q). It's injected by patch_app_asar in
+// scripts/patches/app-asar.sh:18-49: the script copies the wrapper
+// into the asar root, writes a frame-fix-entry.js shim that requires
+// it, then rewrites package.json's `main` to point at the shim.
+//
+// If any of those steps silently breaks (missing source file, asar
+// pack failure, package.json rewrite skipped), the app reverts to
+// upstream's frameless-window behavior on every Linux row and our
+// test harness's hook patterns (CLAUDE.md "Hooking Electron")
+// stop matching what's loaded. S09 only covers the quick-window
+// patch; nothing else asserts the wrapper landed at all.
+//
+// Four checks, ordered cheapest-first:
+//   1. Both files exist in the asar manifest.
+//   2. frame-fix-wrapper.js contains `Proxy(` (the Proxy pattern is
+//      the entire reason the wrapper works — see CLAUDE.md and
+//      lib/quickentry.ts:75-81).
+//   3. frame-fix-entry.js requires the wrapper.
+//   4. package.json's `main` references frame-fix-entry (substring,
+//      not exact, since patches don't always preserve `.js`).
+//
+// Pure file probe — no app launch. Fast (<1s). Row-independent.
+
+test('H02 — frame-fix-wrapper.js + frame-fix-entry.js injected into app.asar', async ({}, testInfo) => {
+	testInfo.annotations.push({ type: 'severity', description: 'Critical' });
+	testInfo.annotations.push({
+		type: 'surface',
+		description: 'Frame fix wrapper injection',
+	});
+
+	const asarPath = resolveAsarPath();
+	await testInfo.attach('asar-path', {
+		body: asarPath,
+		contentType: 'text/plain',
+	});
+
+	// 1. Manifest probe. listAsar returns full paths inside the
+	//    archive (e.g. '/frame-fix-wrapper.js' or 'frame-fix-wrapper.js'
+	//    depending on @electron/asar's normalization). Use endsWith
+	//    so either form matches.
+	const manifest = listAsar(asarPath);
+	const frameFixFiles = manifest.filter(
+		(p) =>
+			p.endsWith('frame-fix-wrapper.js') ||
+			p.endsWith('frame-fix-entry.js'),
+	);
+	const wrapperPresent = frameFixFiles.some((p) =>
+		p.endsWith('frame-fix-wrapper.js'),
+	);
+	const entryPresent = frameFixFiles.some((p) =>
+		p.endsWith('frame-fix-entry.js'),
+	);
+
+	await testInfo.attach('frame-fix-files', {
+		body: JSON.stringify(
+			{
+				found: frameFixFiles,
+				wrapperPresent,
+				entryPresent,
+			},
+			null,
+			2,
+		),
+		contentType: 'application/json',
+	});
+
+	expect(
+		wrapperPresent,
+		'frame-fix-wrapper.js is present in app.asar manifest',
+	).toBe(true);
+	expect(
+		entryPresent,
+		'frame-fix-entry.js is present in app.asar manifest',
+	).toBe(true);
+
+	// 2. Wrapper contents — the Proxy pattern is the load-bearing
+	//    structure (see scripts/frame-fix-wrapper.js:491-506 and
+	//    CLAUDE.md "Frame Fix Wrapper" section). A wrapper without
+	//    a Proxy is a stub that doesn't intercept anything.
+	const wrapper = readAsarFile('frame-fix-wrapper.js', asarPath);
+	const proxyPresent = wrapper.includes('Proxy(');
+	expect(
+		proxyPresent,
+		'frame-fix-wrapper.js uses the Proxy() pattern (CLAUDE.md "Frame Fix Wrapper")',
+	).toBe(true);
+
+	// 3. Entry shim — it must require the wrapper, otherwise it's
+	//    not actually loading any of the patches.
+	const entry = readAsarFile('frame-fix-entry.js', asarPath);
+	const entryRequiresWrapper =
+		entry.includes("require('./frame-fix-wrapper") ||
+		entry.includes('require("./frame-fix-wrapper');
+	expect(
+		entryRequiresWrapper,
+		'frame-fix-entry.js requires ./frame-fix-wrapper',
+	).toBe(true);
+
+	// 4. package.json `main` — patch_app_asar in app-asar.sh:40-49
+	//    rewrites pkg.main to 'frame-fix-entry.js'. Substring match
+	//    on 'frame-fix-entry' tolerates patches that change the
+	//    shim's extension or add a path prefix.
+	const pkgJsonRaw = readAsarFile('package.json', asarPath);
+	let mainEntry = '';
+	try {
+		const parsed = JSON.parse(pkgJsonRaw) as { main?: unknown };
+		if (typeof parsed.main === 'string') mainEntry = parsed.main;
+	} catch (err) {
+		throw new Error(
+			'package.json in app.asar is not valid JSON: ' +
+				(err instanceof Error ? err.message : String(err)),
+		);
+	}
+
+	await testInfo.attach('package-main', {
+		body: JSON.stringify({ main: mainEntry }, null, 2),
+		contentType: 'application/json',
+	});
+
+	expect(
+		mainEntry.includes('frame-fix-entry'),
+		'package.json `main` references frame-fix-entry (app-asar.sh:40-49)',
+	).toBe(true);
+
+	await testInfo.attach('evidence', {
+		body: JSON.stringify(
+			{
+				wrapperPresent,
+				entryPresent,
+				proxyPresent,
+				entryRequiresWrapper,
+				mainEntry,
+			},
+			null,
+			2,
+		),
+		contentType: 'application/json',
+	});
+});
--- a/tools/test-harness/src/runners/H03_patch_fingerprints.spec.ts
+++ b/tools/test-harness/src/runners/H03_patch_fingerprints.spec.ts
@@ -0,0 +1,161 @@
+import { test, expect } from '@playwright/test';
+import { readAsarFile, resolveAsarPath } from '../lib/asar.js';
+
+// H03 — build pipeline patch fingerprints (file probe).
+//
+// scripts/patches/*.sh layers a stack of regex-based mutations onto
+// the bundled JS at build time. Each patch lands a distinctive
+// string somewhere in the asar; if a patch silently skips (anchor
+// regex misses, idempotency guard short-circuits the wrong way,
+// build orchestrator drops the call), that string is absent and
+// the patch's behavior is gone.
+//
+// S09 already covers quick-window.sh. This test repeats that check
+// and adds the rest in one manifest so future drift is observable in
+// a single JSON dump. Fingerprints are pinned to STRINGS THE PATCH
+// INJECTS (not strings the patch matches against), so an upstream
+// rename of the matched site doesn't false-positive a passing patch.
+//
+// Pure file probe — no app launch. Fast (<1s). Row-independent.
+
+interface PatchEntry {
+	patch: string;
+	fingerprint: string;
+	file: string;
+	// One-line note explaining where the fingerprint comes from
+	// in the patch script — surfaced in the attached manifest so
+	// future maintainers can tie a failure back to the right
+	// scripts/patches/*.sh:LINE.
+	source: string;
+}
+
+const MANIFEST: PatchEntry[] = [
+	{
+		patch: 'quick-window.sh',
+		fingerprint: 'XDG_CURRENT_DESKTOP',
+		file: '.vite/build/index.js',
+		source:
+			'patches/quick-window.sh injects an XDG_CURRENT_DESKTOP env-var ' +
+			'gate; same fingerprint S09 asserts.',
+	},
+	{
+		patch: 'app-asar.sh (frame-fix injection)',
+		fingerprint: 'frame-fix-entry',
+		file: 'package.json',
+		source:
+			'patches/app-asar.sh:40-49 rewrites package.json main to ' +
+			"'frame-fix-entry.js'.",
+	},
+	{
+		patch: 'tray.sh (startup-delay nativeTheme guard)',
+		fingerprint: '_trayStartTime',
+		file: '.vite/build/index.js',
+		source:
+			'patches/tray.sh:67-69 injects `let _trayStartTime=Date.now();` ' +
+			"into the nativeTheme `on('updated')` handler. Variable name " +
+			'is unique to our patch — upstream never declares it.',
+	},
+	{
+		patch: 'cowork.sh (Linux daemon quit handler)',
+		fingerprint: 'cowork-linux-daemon-shutdown',
+		file: '.vite/build/index.js',
+		source:
+			'patches/cowork.sh:602-605 registers a Linux-only quit handler ' +
+			"with name:'cowork-linux-daemon-shutdown'. Distinctive string " +
+			'unique to the patch.',
+	},
+	{
+		patch: 'claude-code.sh (Linux platform branch)',
+		fingerprint: 'linux-arm64',
+		file: '.vite/build/index.js',
+		source:
+			'patches/claude-code.sh:20-24 injects `linux-arm64` / `linux-x64` ' +
+			'platform-bundle branches into getHostPlatform. Upstream throws ' +
+			'on Linux; the string is absent without the patch.',
+	},
+];
+
+// TODOs intentionally left where a stable fingerprint isn't easy:
+//   - tray.sh has multiple sub-patches (icon selection, in-place
+//     update, menu-bar default). _trayStartTime above covers the
+//     menu-handler patch reliably; the in-place update patch
+//     anchors on a generated name like `${TRAY_VAR}.setImage(...)`
+//     where TRAY_VAR is minifier-renamed every release, so no
+//     fingerprint there is stable enough to assert without a
+//     second extraction step. Acceptable: the menu-handler
+//     fingerprint is upstream of the in-place patch in the same
+//     subsystem, so a missing _trayStartTime implies a much
+//     bigger build problem anyway.
+
+test('H03 — build pipeline patch fingerprints present in app.asar', async ({}, testInfo) => {
+	testInfo.annotations.push({ type: 'severity', description: 'Critical' });
+	testInfo.annotations.push({
+		type: 'surface',
+		description: 'Build pipeline patch fingerprints',
+	});
+
+	const asarPath = resolveAsarPath();
+	await testInfo.attach('asar-path', {
+		body: asarPath,
+		contentType: 'text/plain',
+	});
+
+	// Read each unique file once, then check fingerprints against
+	// the cached contents. Saves repeated asar extraction for
+	// patches that share a target file.
+	const fileCache = new Map<string, string>();
+	const results: {
+		patch: string;
+		fingerprint: string;
+		file: string;
+		source: string;
+		found: boolean;
+	}[] = [];
+
+	for (const entry of MANIFEST) {
+		let contents = fileCache.get(entry.file);
+		if (contents === undefined) {
+			try {
+				contents = readAsarFile(entry.file, asarPath);
+				fileCache.set(entry.file, contents);
+			} catch (err) {
+				// File missing — record as a "not found" result so
+				// the manifest dump shows the failure shape rather
+				// than aborting on the first hiccup.
+				results.push({
+					patch: entry.patch,
+					fingerprint: entry.fingerprint,
+					file: entry.file,
+					source:
+						entry.source +
+						' [READ ERROR: ' +
+						(err instanceof Error ? err.message : String(err)) +
+						']',
+					found: false,
+				});
+				continue;
+			}
+		}
+		results.push({
+			patch: entry.patch,
+			fingerprint: entry.fingerprint,
+			file: entry.file,
+			source: entry.source,
+			found: contents.includes(entry.fingerprint),
+		});
+	}
+
+	// Always attach the manifest — passing tests should still
+	// surface the verified fingerprints so future drift is visible
+	// without re-running with -v.
+	await testInfo.attach('patch-manifest', {
+		body: JSON.stringify(results, null, 2),
+		contentType: 'application/json',
+	});
+
+	const missing = results.filter((r) => !r.found);
+	expect(
+		missing,
+		'every expected patch fingerprint is present in the bundled app.asar',
+	).toEqual([]);
+});
--- a/tools/test-harness/src/runners/H04_cowork_daemon_lifecycle.spec.ts
+++ b/tools/test-harness/src/runners/H04_cowork_daemon_lifecycle.spec.ts
@@ -0,0 +1,205 @@
+import { test, expect } from '@playwright/test';
+import { execFile } from 'node:child_process';
+import { promisify } from 'node:util';
+import { launchClaude } from '../lib/electron.js';
+import { skipUnlessRow } from '../lib/row.js';
+import { sleep } from '../lib/retry.js';
+import { captureSessionEnv } from '../lib/diagnostics.js';
+
+const exec = promisify(execFile);
+
+// H04 — cowork daemon spawn / cleanup contract.
+//
+// docs/learnings/cowork-vm-daemon.md describes the contract that
+// patches/cowork.sh implements: the app's auto-launch path
+// (cowork.sh:262-362) forks cowork-vm-service.js as a detached
+// child on first VM-service connection attempt, and the Linux
+// quit handler registered at cowork.sh:584-633 SIGTERMs that
+// daemon on app exit. No existing test asserts that contract
+// end-to-end. If the auto-launch regresses, the app falls back
+// to "VM service not running" errors silently; if the quit
+// handler regresses, daemons leak across app sessions and
+// pollute the next launch's socket binding.
+//
+// Shape: pgrep baseline (must be empty after launchClaude's
+// cleanupPreLaunch — see lib/electron.ts:160-191), launch with
+// isolation, wait for mainVisible, poll for a daemon pid, then
+// close + verify cleanup.
+//
+// The daemon spawn is conditional — cowork.sh:265 anchors on
+// 'VM service not running. The service failed to start.' which
+// only fires when something in the renderer triggers a VM
+// connection. On a freshly-launched app that never hits the
+// Cowork tab, the daemon may legitimately not appear within
+// the budget. Treat that as `testInfo.skip` rather than a fail.
+//
+// Row-gated to the same set as the QE tests — daemon is a Linux
+// thing, gating mirrors S30.
+
+const PGREP_PATTERN = 'cowork-vm-service\\.js';
+
+async function pgrepPids(pattern: string): Promise<Set<number>> {
+	try {
+		const { stdout } = await exec('pgrep', ['-f', pattern], {
+			timeout: 5_000,
+		});
+		return new Set(
+			stdout
+				.split('\n')
+				.map((l) => parseInt(l.trim(), 10))
+				.filter((n) => !Number.isNaN(n)),
+		);
+	} catch (err) {
+		// pgrep exits 1 with empty stdout when no matches. Treat as
+		// the empty set; on any other failure, parse whatever stdout we got.
+		const e = err as { code?: number; stdout?: string };
+		if (e.code === 1) return new Set();
+		const out = e.stdout ?? '';
+		return new Set(
+			out
+				.split('\n')
+				.map((l) => parseInt(l.trim(), 10))
+				.filter((n) => !Number.isNaN(n)),
+		);
+	}
+}
+
+test.setTimeout(60_000);
+
+test('H04 — cowork daemon spawns under app, exits with app', async ({}, testInfo) => {
+	testInfo.annotations.push({ type: 'severity', description: 'Should' });
+	testInfo.annotations.push({
+		type: 'surface',
+		description: 'Cowork daemon lifecycle',
+	});
+	skipUnlessRow(testInfo, ['KDE-W', 'GNOME-W', 'Ubu-W', 'KDE-X', 'GNOME-X']);
+
+	await testInfo.attach('session-env', {
+		body: JSON.stringify(captureSessionEnv(), null, 2),
+		contentType: 'application/json',
+	});
+
+	// Baseline — launchClaude's cleanupPreLaunch (lib/electron.ts:160-191)
+	// pkills any leftover cowork daemon before spawning, so a stray
+	// pid here would mean the cleanup itself is broken.
+	const baselinePids = await pgrepPids(PGREP_PATTERN);
+	await testInfo.attach('baseline-pids', {
+		body: JSON.stringify(
+			{
+				pids: Array.from(baselinePids),
+				note:
+					'cleanupPreLaunch should leave this empty before launch. ' +
+					'Non-empty here is a bug in lib/electron.ts:160-191.',
+			},
+			null,
+			2,
+		),
+		contentType: 'application/json',
+	});
+
+	const useHostConfig = process.env.CLAUDE_TEST_USE_HOST_CONFIG === '1';
+	const app = await launchClaude({
+		isolation: useHostConfig ? null : undefined,
+	});
+	let daemonPid: number | null = null;
+	let lingeringPids: number[] = [];
+
+	try {
+		// mainVisible — main shell up; the daemon spawn is gated on
+		// renderer activity (cowork.sh:262-362) which can begin
+		// asynchronously after the shell paints. Lower readiness
+		// levels race the spawn window.
+		await app.waitForReady('mainVisible');
+
+		// Poll up to 15s for a new daemon pid. cowork.sh's auto-
+		// launch only fires when the renderer attempts a VM service
+		// connection; on a passive launch (no Cowork tab interaction)
+		// the daemon may legitimately not appear in this window.
+		const start = Date.now();
+		while (Date.now() - start < 15_000) {
+			const pids = await pgrepPids(PGREP_PATTERN);
+			const newPids = Array.from(pids).filter(
+				(p) => !baselinePids.has(p),
+			);
+			if (newPids.length > 0) {
+				daemonPid = newPids[0]!;
+				break;
+			}
+			await sleep(500);
+		}
+
+		if (daemonPid === null) {
+			await testInfo.attach('skip-reason', {
+				body: JSON.stringify(
+					{
+						reason:
+							'cowork daemon not spawned within 15s of mainVisible',
+						note:
+							'Auto-launch in cowork.sh:262-362 is gated on a VM ' +
+							'service connection attempt from the renderer; on a ' +
+							'passive launch with no Cowork-tab interaction it may ' +
+							'legitimately not fire. Not a regression on its own.',
+					},
+					null,
+					2,
+				),
+				contentType: 'application/json',
+			});
+			testInfo.skip(
+				true,
+				'cowork daemon not spawned by this build — gating in ' +
+					'cowork.sh:262-362 may have suppressed it on a passive launch',
+			);
+			return;
+		}
+
+		await testInfo.attach('daemon-spawned', {
+			body: JSON.stringify(
+				{
+					pid: daemonPid,
+					elapsedMs: Date.now() - start,
+				},
+				null,
+				2,
+			),
+			contentType: 'application/json',
+		});
+	} finally {
+		await app.close();
+	}
+
+	// Quit handler (cowork.sh:584-633) waits up to 10s for the
+	// daemon to exit after SIGTERM. Give it a 5s settle window —
+	// graceful exit is the common case, but on a slow runner the
+	// kill loop's poll cadence (200ms × 50) can stretch. Re-pgrep
+	// after the wait.
+	await sleep(5_000);
+
+	const postExitPids = await pgrepPids(PGREP_PATTERN);
+	lingeringPids = Array.from(postExitPids).filter(
+		(p) => p === daemonPid || !baselinePids.has(p),
+	);
+
+	await testInfo.attach('post-exit-pgrep', {
+		body: JSON.stringify(
+			{
+				baseline: Array.from(baselinePids),
+				postExit: Array.from(postExitPids),
+				lingering: lingeringPids,
+				note:
+					'Lingering daemon pids after app.close() indicate the ' +
+					'Linux quit handler in cowork.sh:584-633 did not run, ' +
+					'or its 10s SIGTERM-then-noop loop completed without ' +
+					'the daemon actually exiting (escalate to SIGKILL upstream).',
+			},
+			null,
+			2,
+		),
+		contentType: 'application/json',
+	});
+
+	expect(
+		lingeringPids,
+		'no cowork-vm-service daemon lingers 5s after app.close()',
+	).toEqual([]);
+});
--- a/tools/test-harness/src/runners/H05_ui_drift_check.spec.ts
+++ b/tools/test-harness/src/runners/H05_ui_drift_check.spec.ts
@@ -0,0 +1,399 @@
+import { test, expect } from '@playwright/test';
+import { readdirSync, readFileSync, existsSync } from 'node:fs';
+import { dirname, resolve } from 'node:path';
+import { fileURLToPath } from 'node:url';
+
+import { launchClaude } from '../lib/electron.js';
+import { sleep } from '../lib/retry.js';
+import { captureSessionEnv } from '../lib/diagnostics.js';
+import { capture, type Snapshot } from '../../explore/snapshot.js';
+import { diff } from '../../explore/diff.js';
+import type { InspectorClient } from '../lib/inspector.js';
+
+// H05 — claude.ai UI drift detection.
+//
+// docs/testing/claudeai-ui-mapping-plan.md Phase 5: catch upstream
+// renderer changes that would break the page-object selectors in
+// lib/claudeai.ts BEFORE they fail a real spec mid-sweep.
+//
+// For each baseline JSON under docs/testing/ui-snapshots/:
+//   1. Navigate the renderer to the captured claudeAiUrl (if any).
+//   2. Capture a fresh Snapshot via the same `capture()` the explore
+//      CLI uses — no forked logic.
+//   3. Compare against the baseline via the same `diff()` the explore
+//      CLI uses. Attach the per-snapshot diff if non-empty.
+//   4. A snapshot is "clean" if `diff(...).entries.length === 0`.
+//
+// Pass criterion: ≥80% of snapshots clean (per the plan). The
+// threshold is forgiving on purpose — a single rendered surface
+// shifting class names shouldn't block CI; we want a signal, not a
+// blast radius.
+//
+// Per-snapshot timing target ≤200ms (snapshot capture only — the
+// 30s navigation settle is excluded). Exceedance is a soft warning
+// surfaced via attachment, never a hard fail.
+//
+// Skip behaviours:
+//   - Zero baselines: skip with the "capture some first" message
+//     (the directory is gitignored beyond .gitkeep + README, so a
+//     fresh checkout legitimately has none).
+//   - Not signed in (no claude.ai webContents at the claudeAi
+//     readiness level): skip — most baselines target post-login
+//     surfaces and would fail spuriously on /login.
+//
+// Row-gated to the same set as the QE-driven specs since the host
+// must be capable of reaching claude.ai under launchClaude.
+
+const SNAPSHOT_DIR = resolve(
+	dirname(fileURLToPath(import.meta.url)),
+	'..',
+	'..',
+	'..',
+	'..',
+	'docs',
+	'testing',
+	'ui-snapshots',
+);
+
+// 200ms is the per-snapshot capture target from the plan. Surface
+// (not enforce) when a single capture exceeds this.
+const CAPTURE_BUDGET_MS = 200;
+
+// 80% from the plan: pass if at least this fraction of snapshots
+// have zero diffs. Compared as cleanCount / totalCount >= 0.8, so
+// 5/5 passes, 4/5 passes, 3/5 fails, etc.
+const CLEAN_FRACTION_REQUIRED = 0.8;
+
+// Navigation settle: after setting location.href, we poll for the
+// URL to land + readyState to reach 'complete' before snapshotting.
+// Coupled to the renderer route load + auth-gate redirect time;
+// 30s is the same upper bound used by waitForReady('claudeAi').
+const NAV_SETTLE_TIMEOUT_MS = 30_000;
+const NAV_SETTLE_INTERVAL_MS = 500;
+
+interface SnapshotFile {
+	name: string;
+	path: string;
+	baseline: Snapshot;
+}
+
+interface PerSnapshotResult {
+	name: string;
+	url: string | null;
+	clean: boolean;
+	captureMs: number;
+	summary: { removed: number; changed: number; added: number };
+	skipped?: string;
+	error?: string;
+}
+
+function loadBaselines(): SnapshotFile[] {
+	if (!existsSync(SNAPSHOT_DIR)) return [];
+	const files = readdirSync(SNAPSHOT_DIR).filter((f) => f.endsWith('.json'));
+	const out: SnapshotFile[] = [];
+	for (const file of files) {
+		const path = resolve(SNAPSHOT_DIR, file);
+		const raw = readFileSync(path, 'utf8');
+		try {
+			out.push({
+				name: file.replace(/\.json$/, ''),
+				path,
+				baseline: JSON.parse(raw) as Snapshot,
+			});
+		} catch (err) {
+			// Surface the bad file as a skipped result rather than
+			// aborting the whole run — one corrupt baseline shouldn't
+			// hide drift in the rest.
+			out.push({
+				name: file.replace(/\.json$/, ''),
+				path,
+				baseline: {
+					capturedAt: '',
+					claudeAiUrl: '',
+					appVersion: null,
+					pageState: { url: '', title: '', readyState: '' },
+					dfPills: [],
+					compactPills: [],
+					ariaLabeledButtons: [],
+					openMenu: null,
+					modals: [],
+				},
+			});
+			// Stash the parse error on the file object via a side
+			// channel: the spec body reads `_loadError` back off the
+			// entry as the "load failed" signal.
+			(out[out.length - 1] as { _loadError?: string })._loadError =
+				err instanceof Error ? err.message : String(err);
+		}
+	}
+	return out;
+}
+
+// Drive the active claude.ai webContents to the target URL. We set
+// location.href in the renderer rather than calling webContents.loadURL
+// from main: setting from the renderer keeps the React app's history
+// stack intact (it's the same pathway a user-initiated navigation
+// takes), avoiding the "blank window then re-mount" flicker loadURL
+// triggers. Then poll for the URL to land and readyState=='complete'.
+async function navigateRendererTo(
+	inspector: InspectorClient,
+	targetUrl: string,
+): Promise<void> {
+	await inspector.evalInRenderer<null>(
+		'claude.ai',
+		`(() => { window.location.href = ${JSON.stringify(targetUrl)}; return null; })()`,
+	);
+
+	const start = Date.now();
+	while (Date.now() - start < NAV_SETTLE_TIMEOUT_MS) {
+		try {
+			const state = await inspector.evalInRenderer<{
+				url: string;
+				readyState: string;
+			}>(
+				'claude.ai',
+				`(() => ({ url: location.href, readyState: document.readyState }))()`,
+			);
+			if (
+				state.readyState === 'complete' &&
+				sameOrigin(state.url, targetUrl)
+			) {
+				// One extra tick to let claude.ai's React render finish
+				// — readyState='complete' fires before the SPA mounts.
+				await sleep(500);
+				return;
+			}
+		} catch {
+			// During navigation the webContents URL changes and the
+			// 'claude.ai' filter may transiently miss; just retry.
+		}
+		await sleep(NAV_SETTLE_INTERVAL_MS);
+	}
+	throw new Error(
+		`renderer did not settle on ${targetUrl} within ${NAV_SETTLE_TIMEOUT_MS}ms`,
+	);
+}
+
+// Compare URLs by origin + pathname (not origin alone, despite the
+// name). claude.ai appends tracking params, modal state, etc. after
+// route resolution, so an exact match is too strict.
+function sameOrigin(a: string, b: string): boolean {
+	try {
+		const ua = new URL(a);
+		const ub = new URL(b);
+		return ua.origin === ub.origin && ua.pathname === ub.pathname;
+	} catch {
+		return a === b;
+	}
+}
+
+test.setTimeout(180_000);
+
+test('H05 — claude.ai UI drift detection', async ({}, testInfo) => {
+	testInfo.annotations.push({ type: 'severity', description: 'Should' });
+	testInfo.annotations.push({
+		type: 'surface',
+		description: 'claude.ai UI drift detection',
+	});
+
+	await testInfo.attach('session-env', {
+		body: JSON.stringify(captureSessionEnv(), null, 2),
+		contentType: 'application/json',
+	});
+
+	const baselines = loadBaselines();
+	await testInfo.attach('baselines-found', {
+		body: JSON.stringify(
+			{
+				dir: SNAPSHOT_DIR,
+				count: baselines.length,
+				names: baselines.map((b) => b.name),
+			},
+			null,
+			2,
+		),
+		contentType: 'application/json',
+	});
+
+	if (baselines.length === 0) {
+		testInfo.skip(
+			true,
+			'no baselines under docs/testing/ui-snapshots/ — capture some ' +
+				'with `npm run explore:snapshot <name>` first',
+		);
+		return;
+	}
+
+	const useHostConfig = process.env.CLAUDE_TEST_USE_HOST_CONFIG === '1';
+	const app = await launchClaude({
+		isolation: useHostConfig ? null : undefined,
+	});
+
+	const results: PerSnapshotResult[] = [];
+
+	try {
+		// claudeAi level: a claude.ai webContents exists. We don't
+		// require userLoaded here because some baselines might
+		// legitimately be of /login surfaces; per-snapshot navigation
+		// will land us on whatever the baseline captured.
+		const { inspector, claudeAiUrl } = await app.waitForReady('claudeAi');
+		if (!claudeAiUrl) {
+			testInfo.skip(
+				true,
+				'claude.ai webContents never loaded — likely not signed in. ' +
+					'Set CLAUDE_TEST_USE_HOST_CONFIG=1 to share host config.',
+			);
+			return;
+		}
+
+		await testInfo.attach('initial-claude-ai-url', {
+			body: claudeAiUrl,
+			contentType: 'text/plain',
+		});
+
+		for (const file of baselines) {
+			const loadError = (file as { _loadError?: string })._loadError;
+			if (loadError) {
+				results.push({
+					name: file.name,
+					url: null,
+					clean: false,
+					captureMs: 0,
+					summary: { removed: 0, changed: 0, added: 0 },
+					error: `failed to parse baseline: ${loadError}`,
+				});
+				await testInfo.attach(`drift-${file.name}.json`, {
+					body: JSON.stringify({ error: loadError }, null, 2),
+					contentType: 'application/json',
+				});
+				continue;
+			}
+
+			const targetUrl = file.baseline.claudeAiUrl;
+
+			// Navigate (best-effort). If a baseline has no URL,
+			// snapshot the current renderer state in place — it
+			// matches the explore CLI's bare `snapshot <name>`
+			// pathway, which captures wherever the app is sitting.
+			if (targetUrl) {
+				try {
+					await navigateRendererTo(inspector, targetUrl);
+				} catch (err) {
+					results.push({
+						name: file.name,
+						url: targetUrl,
+						clean: false,
+						captureMs: 0,
+						summary: { removed: 0, changed: 0, added: 0 },
+						error: `navigation failed: ${err instanceof Error ? err.message : String(err)}`,
+					});
+					continue;
+				}
+			}
+
+			const captureStart = Date.now();
+			let fresh: Snapshot;
+			try {
+				fresh = await capture(inspector);
+			} catch (err) {
+				results.push({
+					name: file.name,
+					url: targetUrl || null,
+					clean: false,
+					captureMs: Date.now() - captureStart,
+					summary: { removed: 0, changed: 0, added: 0 },
+					error: `capture failed: ${err instanceof Error ? err.message : String(err)}`,
+				});
+				continue;
+			}
+			const captureMs = Date.now() - captureStart;
+
+			const result = diff(file.baseline, fresh);
+			const clean = result.entries.length === 0;
+
+			results.push({
+				name: file.name,
+				url: targetUrl || null,
+				clean,
+				captureMs,
+				summary: result.summary,
+			});
+
+			// Always attach the diff payload — clean diffs are the
+			// "no entries" case and confirm the snapshot was actually
+			// compared (vs. silently skipped). Naming per-snapshot so
+			// the report shows them side-by-side.
+			await testInfo.attach(`drift-${file.name}.json`, {
+				body: JSON.stringify(result, null, 2),
+				contentType: 'application/json',
+			});
+		}
+
+		inspector.close();
+	} finally {
+		await app.close();
+	}
+
+	const cleanCount = results.filter((r) => r.clean).length;
+	const totalCount = results.length;
+	const cleanFraction = totalCount === 0 ? 0 : cleanCount / totalCount;
+	const slowSnapshots = results.filter(
+		(r) => r.captureMs > CAPTURE_BUDGET_MS,
+	);
+	const errored = results.filter((r) => r.error);
+
+	await testInfo.attach('drift-summary', {
+		body: JSON.stringify(
+			{
+				totalCount,
+				cleanCount,
+				cleanFraction,
+				thresholdRequired: CLEAN_FRACTION_REQUIRED,
+				results,
+				slowSnapshots: slowSnapshots.map((r) => ({
+					name: r.name,
+					captureMs: r.captureMs,
+					budgetMs: CAPTURE_BUDGET_MS,
+				})),
+				erroredSnapshots: errored.map((r) => ({
+					name: r.name,
+					error: r.error,
+				})),
+			},
+			null,
+			2,
+		),
+		contentType: 'application/json',
+	});
+
+	if (slowSnapshots.length > 0) {
+		// Soft warning only — surface as an attachment, don't fail.
+		// Capture latency is bounded by the renderer's main-thread
+		// availability, which is noisy. The plan's 200ms is a
+		// "looking-good" target, not a contract.
+		await testInfo.attach('slow-capture-warning', {
+			body: JSON.stringify(
+				{
+					note:
+						`${slowSnapshots.length} snapshot(s) exceeded the ` +
+						`${CAPTURE_BUDGET_MS}ms capture target. Soft warning — ` +
+						'not a fail. Investigate if this trends upward.',
+					snapshots: slowSnapshots.map((r) => ({
+						name: r.name,
+						captureMs: r.captureMs,
+					})),
+				},
+				null,
+				2,
+			),
+			contentType: 'application/json',
+		});
+	}
+
+	expect(
+		cleanFraction,
+		`at least ${Math.round(CLEAN_FRACTION_REQUIRED * 100)}% of snapshots ` +
+			`must have zero diffs (got ${cleanCount}/${totalCount} clean — see ` +
+			'drift-*.json attachments for per-snapshot diffs)',
+	).toBeGreaterThanOrEqual(CLEAN_FRACTION_REQUIRED);
+});
--- a/tools/test-harness/src/runners/S01_appimage_launches_without_libfuse2t64.spec.ts
+++ b/tools/test-harness/src/runners/S01_appimage_launches_without_libfuse2t64.spec.ts
@@ -0,0 +1,356 @@
+import { test, expect } from '@playwright/test';
+import { spawn, execFile } from 'node:child_process';
+import { existsSync, statSync } from 'node:fs';
+import { open } from 'node:fs/promises';
+import { promisify } from 'node:util';
+import { tmpdir } from 'node:os';
+import { join } from 'node:path';
+import { mkdtemp, rm } from 'node:fs/promises';
+
+const exec = promisify(execFile);
+
+// S01 — AppImage launches without manual `libfuse2t64` install.
+//
+// Per docs/testing/cases/distribution.md S01: on Ubuntu 24.04+ the
+// project AppImage currently fails with `dlopen(): error loading
+// libfuse.so.2` unless the user manually installs `libfuse2t64`.
+// The case-doc anchor (scripts/packaging/appimage.sh:226) notes the
+// upstream `appimagetool` runtime is bundled as-is — no FUSE shim,
+// no postinst dep declaration, no clear error message. CI papers
+// over the gap by `apt install libfuse2`-ing before exec
+// (.github/workflows/test-artifacts.yml:47).
+//
+// Assertion shape:
+//   1. Locate an AppImage. Skip cleanly if not running from one.
+//   2. Spawn the AppImage with a brief grace window. Capture stderr.
+//   3. Assert stderr does NOT contain `libfuse.so.2` (or the broader
+//      `dlopen` failure pattern that the AppImage runtime emits when
+//      FUSE is missing).
+//   4. Kill the proc — we don't need a full launch, just the FUSE
+//      load attempt which happens before any squashfs mount.
+//
+// Why a runtime spawn rather than a static probe: the failure mode
+// is `dlopen()` of libfuse.so.2 inside the AppImage runtime ELF
+// itself, not anything our scripts produce. Only a real spawn on
+// the target host exercises that dynamic loader path.
+//
+// Approach choice: we do NOT use `--appimage-version`. That flag is
+// handled by the AppImage runtime BEFORE any FUSE mount, so it
+// would exit 0 even on a host missing libfuse2 and silently pass
+// the test. Instead we let the runtime reach its mount step, watch
+// stderr for the dlopen error (which fires within ~100ms when the
+// lib is absent), then kill before the Electron child has a chance
+// to persist anything.
+//
+// Isolation: we spawn with temp `HOME` and `XDG_*_HOME` overrides,
+// so even if Electron does come up briefly before we kill it,
+// nothing lands in `~/.config/Claude`.
+//
+// Row gating: this isn't matrix-row-driven — it's install-method-
+// driven. The harness's `ROW` env doesn't carry "is this row's
+// install an AppImage?", so we detect at runtime via launcher path
+// + magic-byte sniff. Skip when the local install isn't AppImage.
+
+interface AppImageProbeResult {
+	path: string | null;
+	reason: string;
+}
+
+// AppImages are ELF executables with a filesystem image appended.
+// The ELF header carries a magic at offset 8: `AI\x02` for type 2
+// (the format our build emits) or `AI\x01` for type 1. `file` also
+// reports it, but an ELF + magic check is cheap enough to inline.
+async function probeAppImagePath(): Promise<AppImageProbeResult> {
+	const explicit = process.env.CLAUDE_DESKTOP_LAUNCHER;
+	const candidates: string[] = [];
+	if (explicit) candidates.push(explicit);
+
+	// Fallback search: <root>/test-build holds `./build.sh --build appimage`
+	// output. Root: $CLAUDE_DESKTOP_REPO_ROOT, else cwd/../.. (harness dir).
+	const projectRoot =
+		process.env.CLAUDE_DESKTOP_REPO_ROOT || join(process.cwd(), '..', '..');
+	const testBuildDir = `${projectRoot}/test-build`;
+	if (existsSync(testBuildDir)) {
+		try {
+			const fs = await import('node:fs/promises');
+			const entries = await fs.readdir(testBuildDir);
+			for (const entry of entries) {
+				if (entry.endsWith('.AppImage')) {
+					candidates.push(`${testBuildDir}/${entry}`);
+				}
+			}
+		} catch {
+			// best-effort
+		}
+	}
+
+	for (const candidate of candidates) {
+		if (!existsSync(candidate)) continue;
+		try {
+			const st = statSync(candidate);
+			if (!st.isFile()) continue;
+			// Quick filename hint: skip the magic-byte read entirely
+			// for unambiguous .AppImage suffixes.
+			if (candidate.endsWith('.AppImage')) {
+				return { path: candidate, reason: 'matched .AppImage suffix' };
+			}
+			// Magic-byte sniff: ELF (`\x7fELF`) at offset 0, AppImage
+			// type marker `AI\x01` or `AI\x02` at offset 8.
+			const fh = await open(candidate, 'r');
+			try {
+				const buf = Buffer.alloc(12);
+				await fh.read(buf, 0, 12, 0);
+				const elf = buf.subarray(0, 4).toString('hex') === '7f454c46';
+				const aiMagic = buf.subarray(8, 11);
+				const isAppImage =
+					elf &&
+					aiMagic[0] === 0x41 &&
+					aiMagic[1] === 0x49 &&
+					(aiMagic[2] === 0x01 || aiMagic[2] === 0x02);
+				if (isAppImage) {
+					return {
+						path: candidate,
+						reason: 'matched AppImage magic bytes',
+					};
+				}
+			} finally {
+				await fh.close();
+			}
+		} catch {
+			// fall through to next candidate
+		}
+	}
+
+	return {
+		path: null,
+		reason:
+			'no AppImage found via CLAUDE_DESKTOP_LAUNCHER or ' +
+			`${testBuildDir}/*.AppImage`,
+	};
+}
+
+async function captureFuseDpkg(): Promise<string> {
+	// Best-effort context capture for the case-doc's listed
+	// "Diagnostics on failure". `dpkg -l` is Debian-only — we still
+	// run it and let it fail cleanly on RPM hosts (the empty/error
+	// output is itself diagnostic).
+	try {
+		const { stdout, stderr } = await exec(
+			'sh',
+			['-c', 'dpkg -l 2>&1 | grep -i fuse || true'],
+			{ timeout: 5_000 },
+		);
+		return `${stdout}${stderr}`.trim() || '(no fuse-related dpkg entries)';
+	} catch (err) {
+		const e = err as { stdout?: string; stderr?: string; code?: number };
+		return (
+			`dpkg query failed (exit ${e.code ?? '?'})\n` +
+			`${(e.stdout ?? '').trim()}\n` +
+			`${(e.stderr ?? '').trim()}`
+		).trim();
+	}
+}
+
+// Matches the dlopen failure pattern the AppImage runtime prints
+// when libfuse2 is missing. The case-doc lists `libfuse.so.2` as the
+// canonical token; we also flag the broader `dlopen` + `fuse`
+// combination so a future runtime that changes the wording without
+// fixing the underlying bug still trips the test.
+function fuseFailureFound(stderr: string): { found: boolean; match?: string } {
+	const lower = stderr.toLowerCase();
+	if (lower.includes('libfuse.so.2')) {
+		return { found: true, match: 'libfuse.so.2' };
+	}
+	// Both 'dlopen' and 'fuse' on the same line of stderr — wider net
+	// for future-proofing.
+	for (const line of stderr.split('\n')) {
+		const ll = line.toLowerCase();
+		if (ll.includes('dlopen') && ll.includes('fuse')) {
+			return { found: true, match: line.trim() };
+		}
+	}
+	return { found: false };
+}
+
+test.setTimeout(30_000);
+
+test('S01 — AppImage launches without manual libfuse2t64', async ({}, testInfo) => {
+	testInfo.annotations.push({ type: 'severity', description: 'Critical' });
+	testInfo.annotations.push({
+		type: 'surface',
+		description: 'Distribution / AppImage',
+	});
+
+	const probe = await probeAppImagePath();
+	await testInfo.attach('appimage-probe', {
+		body: JSON.stringify(probe, null, 2),
+		contentType: 'application/json',
+	});
+
+	if (!probe.path) {
+		test.skip(true, `S01 only applies to AppImage installs: ${probe.reason}`);
+		return;
+	}
+
+	const appimagePath = probe.path;
+
+	// Always-on context: dpkg fuse state. Cheap, useful for triage
+	// regardless of pass/fail.
+	const dpkgFuse = await captureFuseDpkg();
+	await testInfo.attach('dpkg-fuse', {
+		body: dpkgFuse,
+		contentType: 'text/plain',
+	});
+
+	// Per-test sandbox so a brief Electron child doesn't pollute the
+	// host's ~/.config/Claude. We don't use launchClaude()'s isolation
+	// because it spawns the bundled Electron directly (bypassing the
+	// AppImage runtime's FUSE mount, which is exactly what we're
+	// trying to exercise here).
+	const sandboxRoot = await mkdtemp(join(tmpdir(), 'claude-s01-'));
+	const sandboxConfig = join(sandboxRoot, 'config');
+	const sandboxHome = join(sandboxRoot, 'home');
+
+	let exitCode: number | null = null;
+	let signalCode: NodeJS.Signals | null = null;
+	let timedOutBeforeFuseSignal = false;
+	const stderrChunks: Buffer[] = [];
+	const stdoutChunks: Buffer[] = [];
+	const start = Date.now();
+
+	try {
+		const proc = spawn(appimagePath, [], {
+			cwd: sandboxRoot,
+			env: {
+				...process.env,
+				HOME: sandboxHome,
+				XDG_CONFIG_HOME: sandboxConfig,
+				XDG_DATA_HOME: join(sandboxRoot, 'data'),
+				XDG_CACHE_HOME: join(sandboxRoot, 'cache'),
+				// Surface FUSE mount errors loudly; the AppImage runtime
+				// honours this for its diagnostic output.
+				APPIMAGE_DEBUG: '1',
+			},
+			stdio: ['ignore', 'pipe', 'pipe'],
+			detached: false,
+		});
+
+		proc.stderr?.on('data', (chunk: Buffer) => stderrChunks.push(chunk));
+		proc.stdout?.on('data', (chunk: Buffer) => stdoutChunks.push(chunk));
+
+		// Race three outcomes:
+		//   (a) process exits on its own (FUSE failure exits ~100-300ms)
+		//   (b) we observed a FUSE error in stderr — kill early
+		//   (c) timeout: app probably mounted fine and is starting up,
+		//       in which case absence of FUSE error in stderr is a PASS
+		const fuseSignal = new Promise<'fuse-error'>((resolve) => {
+			const checkInterval = setInterval(() => {
+				const soFar = Buffer.concat(stderrChunks).toString('utf8');
+				if (fuseFailureFound(soFar).found) {
+					clearInterval(checkInterval);
+					resolve('fuse-error');
+				}
+			}, 100);
+			proc.once('exit', () => clearInterval(checkInterval));
+		});
+		const exitSignal = new Promise<'exit'>((resolve) => {
+			proc.once('exit', (code, signal) => {
+				exitCode = code;
+				signalCode = signal;
+				resolve('exit');
+			});
+		});
+		const timeoutSignal = new Promise<'timeout'>((resolve) => {
+			setTimeout(() => {
+				timedOutBeforeFuseSignal = true;
+				resolve('timeout');
+			}, 8_000);
+		});
+
+		const winner = await Promise.race([
+			fuseSignal,
+			exitSignal,
+			timeoutSignal,
+		]);
+
+		// Whatever happened, kill the process so we don't leave
+		// Electron running. SIGTERM first, SIGKILL backstop.
+		if (proc.exitCode === null && proc.signalCode === null) {
+			proc.kill('SIGTERM');
+			await Promise.race([
+				new Promise<void>((resolve) =>
+					proc.once('exit', (code, signal) => {
+						exitCode = code;
+						signalCode = signal;
+						resolve();
+					}),
+				),
+				new Promise<void>((resolve) => setTimeout(resolve, 3_000)),
+			]);
+			if (proc.exitCode === null && proc.signalCode === null) {
+				proc.kill('SIGKILL');
+				await new Promise<void>((resolve) => {
+					proc.once('exit', (code, signal) => {
+						exitCode = code;
+						signalCode = signal;
+						resolve();
+					});
+					setTimeout(() => resolve(), 2_000);
+				});
+			}
+		}
+
+		await testInfo.attach('race-winner', {
+			body: winner,
+			contentType: 'text/plain',
+		});
+	} finally {
+		await rm(sandboxRoot, { recursive: true, force: true }).catch(() => {});
+	}
+
+	const elapsedMs = Date.now() - start;
+	const stderrFull = Buffer.concat(stderrChunks).toString('utf8');
+	const stdoutFull = Buffer.concat(stdoutChunks).toString('utf8');
+	const stderrTail =
+		stderrFull.length > 4096 ? stderrFull.slice(-4096) : stderrFull;
+	const stdoutTail =
+		stdoutFull.length > 4096 ? stdoutFull.slice(-4096) : stdoutFull;
+
+	const fuseCheck = fuseFailureFound(stderrFull);
+
+	await testInfo.attach('appimage-path', {
+		body: appimagePath,
+		contentType: 'text/plain',
+	});
+	await testInfo.attach('exit-info', {
+		body: JSON.stringify(
+			{
+				exitCode,
+				signalCode,
+				timedOutBeforeFuseSignal,
+				elapsedMs,
+				fuseFailureMatch: fuseCheck.match ?? null,
+			},
+			null,
+			2,
+		),
+		contentType: 'application/json',
+	});
+	await testInfo.attach('stderr-tail-4k', {
+		body: stderrTail || '(empty)',
+		contentType: 'text/plain',
+	});
+	await testInfo.attach('stdout-tail-4k', {
+		body: stdoutTail || '(empty)',
+		contentType: 'text/plain',
+	});
+
+	expect(
+		fuseCheck.found,
+		`AppImage stderr should not report a libfuse.so.2 dlopen failure ` +
+			`(matched: ${fuseCheck.match ?? 'n/a'}). The case-doc S01 ` +
+			`scenario fails on Ubuntu 24.04 unless libfuse2t64 is manually ` +
+			`installed; see scripts/packaging/appimage.sh:226 for the ` +
+			`upstream-runtime-as-is build choice.`,
+	).toBe(false);
+});
--- a/tools/test-harness/src/runners/S02_xdg_current_desktop_substring_match.spec.ts
+++ b/tools/test-harness/src/runners/S02_xdg_current_desktop_substring_match.spec.ts
@@ -0,0 +1,184 @@
+import { test, expect } from '@playwright/test';
+import { existsSync, readFileSync } from 'node:fs';
+import { join, resolve } from 'node:path';
+
+// S02 — XDG_CURRENT_DESKTOP detection uses substring match.
+//
+// Backs S02 in docs/testing/cases/distribution.md.
+//
+// Ubuntu sets XDG_CURRENT_DESKTOP=ubuntu:GNOME (colon-separated,
+// distro-prefixed). A naive `== "GNOME"` (or POSIX `= "GNOME"`)
+// equality check misses Ubuntu and silently disables every DE-gated
+// branch on those rows. The expected pattern is a substring/glob
+// match (case-insensitive) over the colon-separated value:
+//
+//   launcher-common.sh:38-44  →  desktop="${XDG_CURRENT_DESKTOP,,}"
+//                                [[ "$desktop" == *niri* ]]
+//   quick-window.sh:34-35     →  (process.env.XDG_CURRENT_DESKTOP||"")
+//                                  .toLowerCase().includes("kde")
+//   quick-window.sh:117-118   →  same shape, injected into index.js
+//
+// This is a source-tree regression detector: if a future change
+// rewrites either gate to a strict-equality form, the runner trips.
+// It does NOT assert the presence of any specific good pattern (the
+// case doc anchors describe several different shapes — niri glob,
+// KDE includes(), runtime JS gate); it asserts the *absence* of the
+// bad ones.
+//
+// Pure file probe — no app launch. Fast (<1s). Row-independent.
+//
+// Path resolution probes, in order:
+//   1. $CLAUDE_DESKTOP_REPO_ROOT/scripts (override)
+//   2. ../../scripts relative to cwd (dev worktree, where the harness
+//      runs from tools/test-harness/)
+//   3. /usr/lib/claude-desktop/scripts (deb/rpm install layout)
+// If none resolve, the test skips with a reason.
+
+interface BadHit {
+	file: string;
+	line: number;
+	text: string;
+}
+
+function resolveScriptsDir(): string | null {
+	const env = process.env.CLAUDE_DESKTOP_REPO_ROOT;
+	if (env) {
+		const p = join(env, 'scripts');
+		if (
+			existsSync(join(p, 'launcher-common.sh')) &&
+			existsSync(join(p, 'patches', 'quick-window.sh'))
+		) {
+			return p;
+		}
+	}
+	// Dev worktree probe — tools/test-harness lives two dirs deep,
+	// so cwd/../../scripts is the repo's scripts/ when tests are run
+	// from tools/test-harness/.
+	const devProbe = resolve(process.cwd(), '..', '..', 'scripts');
+	if (
+		existsSync(join(devProbe, 'launcher-common.sh')) &&
+		existsSync(join(devProbe, 'patches', 'quick-window.sh'))
+	) {
+		return devProbe;
+	}
+	// Installed path (deb/rpm).
+	const installedProbe = '/usr/lib/claude-desktop/scripts';
+	if (
+		existsSync(join(installedProbe, 'launcher-common.sh')) &&
+		existsSync(join(installedProbe, 'patches', 'quick-window.sh'))
+	) {
+		return installedProbe;
+	}
+	return null;
+}
+
+// Bad patterns: shell + JS strict-equality forms against
+// XDG_CURRENT_DESKTOP. Each regex is intentionally narrow so the
+// expected substring/glob shapes don't false-positive:
+//
+//   - Shell `[[ "$XDG_CURRENT_DESKTOP" == "GNOME" ]]` — bash strict
+//     equality with a *literal* RHS (no glob `*`). The `*niri*`
+//     glob form is fine and must NOT match.
+//   - Shell `[ "$XDG_CURRENT_DESKTOP" = "GNOME" ]` — POSIX strict
+//     equality.
+//   - JS `process.env.XDG_CURRENT_DESKTOP === "GNOME"` (and `==`).
+//
+// The bash and JS patterns cover the variable on either side of the
+// operator, so `"GNOME" == "$XDG_CURRENT_DESKTOP"` is also caught.
+//
+// `lowered` form (`"${XDG_CURRENT_DESKTOP,,}" == *niri*`) uses a
+// glob and is allowed; the bad-RHS regexes require the literal to
+// have no `*` wildcards inside the quotes.
+const BAD_PATTERNS: { name: string; re: RegExp }[] = [
+	{
+		// bash [[ ... == "literal" ]] with XDG_CURRENT_DESKTOP on
+		// either side. RHS literal contains no `*` (glob-free).
+		name: 'bash [[ == ]] strict equality (no glob)',
+		re: /\[\[[^\]]*\$\{?XDG_CURRENT_DESKTOP[^\]]*==\s*"[^"*]*"[^\]]*\]\]/,
+	},
+	{
+		name: 'bash [[ == ]] strict equality, var on right (no glob)',
+		re: /\[\[[^\]]*==\s*"\$\{?XDG_CURRENT_DESKTOP[^\]]*\]\]/,
+	},
+	{
+		// POSIX [ ... = "literal" ] with XDG_CURRENT_DESKTOP.
+		name: 'POSIX [ = ] strict equality',
+		re: /\[\s+[^\]]*\$\{?XDG_CURRENT_DESKTOP[^\]]*=\s*"[^"]*"[^\]]*\]/,
+	},
+	{
+		// JS strict equality (=== or ==) against a string literal.
+		// Either single or double quotes; either side of the operator.
+		name: 'JS === / == strict equality',
+		re: /process\.env\.XDG_CURRENT_DESKTOP\s*===?\s*['"][^'"]*['"]|['"][^'"]*['"]\s*===?\s*process\.env\.XDG_CURRENT_DESKTOP/,
+	},
+];
+
+function scanFile(absPath: string): BadHit[] {
+	const text = readFileSync(absPath, 'utf8');
+	const lines = text.split('\n');
+	const hits: BadHit[] = [];
+	for (let i = 0; i < lines.length; i++) {
+		const line = lines[i] ?? '';
+		// Cheap pre-filter: only check lines mentioning the env var.
+		if (!line.includes('XDG_CURRENT_DESKTOP')) continue;
+		for (const { re } of BAD_PATTERNS) {
+			if (re.test(line)) {
+				hits.push({
+					file: absPath,
+					line: i + 1,
+					text: line.length > 200 ? line.slice(0, 200) + '…' : line,
+				});
+				break;
+			}
+		}
+	}
+	return hits;
+}
+
+test('S02 — XDG_CURRENT_DESKTOP detection uses substring match, not strict ==', async ({}, testInfo) => {
+	testInfo.annotations.push({ type: 'severity', description: 'Should' });
+	testInfo.annotations.push({
+		type: 'surface',
+		description: 'Distribution / desktop detection',
+	});
+
+	const scriptsDir = resolveScriptsDir();
+	if (!scriptsDir) {
+		test.skip(
+			true,
+			'No accessible scripts/ dir (set CLAUDE_DESKTOP_REPO_ROOT or install deb/rpm)',
+		);
+		return;
+	}
+
+	await testInfo.attach('scripts-dir', {
+		body: scriptsDir,
+		contentType: 'text/plain',
+	});
+
+	const targets = [
+		join(scriptsDir, 'launcher-common.sh'),
+		join(scriptsDir, 'patches', 'quick-window.sh'),
+	];
+
+	await testInfo.attach('files-checked', {
+		body: JSON.stringify(targets, null, 2),
+		contentType: 'application/json',
+	});
+
+	const allHits: BadHit[] = [];
+	for (const t of targets) {
+		allHits.push(...scanFile(t));
+	}
+
+	await testInfo.attach('bad-pattern-hits', {
+		body: JSON.stringify(allHits, null, 2),
+		contentType: 'application/json',
+	});
+
+	expect(
+		allHits,
+		// eslint-disable-next-line max-len
+		'No strict-equality checks against XDG_CURRENT_DESKTOP — ubuntu:GNOME would miss them. Use substring/glob match (case-insensitive) instead.',
+	).toEqual([]);
+});
--- a/tools/test-harness/src/runners/S03_deb_dependencies_declared.spec.ts
+++ b/tools/test-harness/src/runners/S03_deb_dependencies_declared.spec.ts
@@ -0,0 +1,157 @@
+import { test, expect } from '@playwright/test';
+import { execFile } from 'node:child_process';
+import { promisify } from 'node:util';
+import { captureSessionEnv } from '../lib/diagnostics.js';
+
+const exec = promisify(execFile);
+
+// S03 — DEB control file declares runtime dependencies.
+//
+// Per docs/testing/cases/distribution.md S03:
+//   Expected: All transitive runtime deps are declared in the package
+//   and pulled by APT. First launch succeeds without manual `apt
+//   install` of any extra package.
+//
+// Code anchor: scripts/packaging/deb.sh:185-197 — the DEBIAN/control
+// file emits Package/Version/Section/Priority/Architecture/Maintainer/
+// Description fields and **no `Depends:` line**, with the inline
+// comment at :181-183 ("No external dependencies are required at
+// runtime"). The case-doc treats this as a regression: Critical
+// surface, expected contract is "deps declared", current state is
+// "deps absent". So this runner is a regression detector — on a
+// deb-installed host today it will FAIL until upstream emits a
+// Depends line. Don't invert the assertion to make it green; failing
+// is the signal.
+//
+// Layer: pure spawn probe. `dpkg-query -W -f='${Depends}'
+// claude-desktop` reads the field straight out of dpkg's status db,
+// so we don't need to know where the .deb lives in apt's cache or
+// how the package was originally fetched.
+//
+// Skip behaviour: if dpkg-query exits non-zero (no dpkg installed,
+// or claude-desktop not in dpkg's db), the package isn't deb-managed
+// on this host and S03 has nothing to assert against.
+//
+// Subtlety on mixed-tooling hosts: a Fedora/RPM box that also has
+// `dpkg` installed for cross-distro dev can wind up with a stale
+// `claude-desktop` entry in dpkg's status db (matching the field
+// shape from a previous deb install). dpkg-query exits 0 in that
+// case and we still run the assertion — the field shape we read is
+// authoritative for what a current deb install would look like, so
+// it's a valid signal even if the binary on PATH is the rpm one.
+
+test('S03 — DEB control file declares runtime dependencies', async (
+	{},
+	testInfo,
+) => {
+	testInfo.annotations.push({
+		type: 'severity',
+		description: 'Critical',
+	});
+	testInfo.annotations.push({
+		type: 'surface',
+		description: 'Distribution / DEB packaging',
+	});
+
+	await testInfo.attach('session-env', {
+		body: JSON.stringify(captureSessionEnv(), null, 2),
+		contentType: 'application/json',
+	});
+
+	// Read the Depends field from dpkg's status db. If dpkg-query
+	// itself isn't installed (ENOENT) or the package isn't in the db
+	// (exit 1), skip — S03 only applies to deb-managed installs.
+	let dependsField: string;
+	let pkgVersion = '';
+	try {
+		const { stdout } = await exec(
+			'dpkg-query',
+			['-W', '-f=${Depends}', 'claude-desktop'],
+			{ timeout: 5_000 },
+		);
+		dependsField = stdout.trim();
+	} catch (err) {
+		const e = err as { stderr?: string; code?: number | string };
+		await testInfo.attach('dpkg-query-error', {
+			body: JSON.stringify(
+				{
+					code: e.code ?? null,
+					stderr: (e.stderr ?? '').trim(),
+				},
+				null,
+				2,
+			),
+			contentType: 'application/json',
+		});
+		test.skip(
+			true,
+			'S03 only applies to deb-installed claude-desktop ' +
+				'(dpkg-query missing or package not in dpkg db)',
+		);
+		return;
+	}
+
+	// Capture the full Depends payload, version, and resolved binary
+	// path as evidence regardless of pass/fail. Per Decision 7 these
+	// are always-on attachments.
+	try {
+		const { stdout } = await exec(
+			'dpkg-query',
+			['-W', '-f=${Version}', 'claude-desktop'],
+			{ timeout: 5_000 },
+		);
+		pkgVersion = stdout.trim();
+	} catch {
+		// Version probe is best-effort — Depends-field result above
+		// already proves the package is in the db.
+	}
+
+	let installPath = '';
+	try {
+		const { stdout } = await exec('which', ['claude-desktop'], {
+			timeout: 5_000,
+		});
+		installPath = stdout.trim();
+	} catch {
+		// `which` fails when the launcher isn't on PATH (e.g. dpkg
+		// has a stale record but the binary's been removed). Capture
+		// the empty string and let the Depends assertion run.
+	}
+
+	await testInfo.attach('depends-field', {
+		body: dependsField,
+		contentType: 'text/plain',
+	});
+	await testInfo.attach('package-version', {
+		body: pkgVersion,
+		contentType: 'text/plain',
+	});
+	await testInfo.attach('install-path', {
+		body: installPath,
+		contentType: 'text/plain',
+	});
+	await testInfo.attach('evidence', {
+		body: JSON.stringify(
+			{
+				dependsField,
+				dependsLength: dependsField.length,
+				packageVersion: pkgVersion,
+				installPath,
+			},
+			null,
+			2,
+		),
+		contentType: 'application/json',
+	});
+
+	// Core S03 assertion. Upstream contract: a Critical-severity
+	// runtime install pulls all transitive deps via APT, which
+	// requires the control file to declare them. Empty Depends ==
+	// regression against scripts/packaging/deb.sh:185-197.
+	expect(
+		dependsField,
+		'DEBIAN/control Depends: field is non-empty per upstream ' +
+			'contract (case-doc S03 — currently fails until ' +
+			'scripts/packaging/deb.sh:185-197 emits a Depends line)',
+	).not.toBe('');
+});
--- a/tools/test-harness/src/runners/S04_rpm_requires_declared.spec.ts
+++ b/tools/test-harness/src/runners/S04_rpm_requires_declared.spec.ts
@@ -0,0 +1,228 @@
+import { test, expect } from '@playwright/test';
+import { execFile } from 'node:child_process';
+import { promisify } from 'node:util';
+
+const exec = promisify(execFile);
+
+// S04 — RPM install via DNF pulls all required runtime deps.
+//
+// Mirror of S03 for the RPM/DNF branch. Case-doc:
+// docs/testing/cases/distribution.md#s04--rpm-install-via-dnf-pulls-all-required-runtime-deps
+//
+// Severity: Critical. Surface: DNF repository / dependency
+// declarations. Applies to KDE-W, KDE-X, GNOME, Sway, i3, Niri (any
+// RPM-based distro).
+//
+// Case-doc anchors `scripts/packaging/rpm.sh:188` (`AutoReqProv: no`
+// disables RPM's auto-dep generation; the spec declares no
+// `Requires:`) and `:194-198` (strip + build-id disabled because
+// Electron binaries don't tolerate them — bundled approach).
+//
+// **Regression-detector shape.** The assertion direction is "Requires
+// has at least one declared runtime dep" — i.e. at least one line in
+// `rpm -qR claude-desktop` that isn't an `rpmlib(...)` capability and
+// isn't a `%post`/`%postun` interpreter path (`/bin/sh` etc). Today
+// that filter empties out and the test FAILS, which is the documented
+// state per the case-doc until upstream `rpm.sh` flips
+// `AutoReqProv: on` (or declares an explicit `Requires:` block).
+//
+// `rpm -qR` always emits `rpmlib(CompressedFileNames)`,
+// `rpmlib(FileDigests)`, `rpmlib(PayloadFilesHavePrefix)`, and
+// `rpmlib(PayloadIsZstd)` regardless of spec content — those are
+// satisfied by the rpm runtime itself, not by declared deps. Bare
+// interpreter paths like `/bin/sh` come from scriptlet detection on
+// the spec's `%post` / `%postun`, not from declared library deps.
+// Both get filtered out so the assertion is strictly "did anyone
+// declare a runtime dep, by hand or via AutoReqProv".
+//
+// Skip cleanly when:
+//   - `rpm` isn't on PATH (Debian/Ubuntu host, AppImage-only host).
+//   - `rpm -q claude-desktop` says the package isn't rpm-installed
+//     (deb host with rpm tooling for cross-distro dev, AppImage extract).
+//
+// Layer: spawn probe + stdout parse. No app launch. Row-independent
+// in shape, but only meaningful on RPM-based rows.
+
+interface ProbeResult {
+	cmd: string;
+	exitCode: number | null;
+	stdout: string;
+	stderr: string;
+}
+
+async function probe(
+	bin: string,
+	args: string[],
+): Promise<ProbeResult> {
+	const cmd = `${bin} ${args.join(' ')}`;
+	try {
+		const { stdout, stderr } = await exec(bin, args, {
+			timeout: 5_000,
+		});
+		return {
+			cmd,
+			exitCode: 0,
+			stdout: stdout.trim(),
+			stderr: stderr.trim(),
+		};
+	} catch (err) {
+		const e = err as {
+			stdout?: string;
+			stderr?: string;
+			code?: number | string;
+		};
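+		// execFile puts a numeric exit status in `code`. Spawn
+		// failures carry a string code (e.g. 'ENOENT') and a
+		// timeout kill leaves no numeric status, so both are
+		// recorded as "no exit code".
+		const code =
+			typeof e.code === 'number' ? e.code : null;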
+		return {
+			cmd,
+			exitCode: code,
+			stdout: (e.stdout ?? '').trim(),
+			stderr: (e.stderr ?? '').trim(),
+		};
+	}
+}
+
+function formatProbe(p: ProbeResult): string {
+	const tail = [
+		p.stdout && `stdout: ${p.stdout}`,
+		p.stderr && `stderr: ${p.stderr}`,
+	]
+		.filter(Boolean)
+		.join('\n');
+	return `$ ${p.cmd} (exit ${p.exitCode ?? '?'})\n${tail}`.trim();
+}
+
+// `rpm -qR` lines we don't count as "declared runtime deps":
+//   - `rpmlib(...)` capabilities — auto-emitted by rpm regardless of
+//     the spec, satisfied by the rpm runtime itself.
+//   - Bare interpreter paths (`/bin/sh`, `/bin/bash`, `/usr/bin/env`)
+//     — picked up from the spec's scriptlets (`%post` / `%postun`),
+//     not from declared library deps.
+function isAutoEmittedRequire(line: string): boolean {
+	const trimmed = line.trim();
+	if (!trimmed) return true;
+	if (trimmed.startsWith('rpmlib(')) return true;
+	// Strip a trailing version constraint ("/bin/sh >= 1.0") before
+	// matching so the shape is just the capability/path.
+	const head = trimmed.split(/\s+/)[0] ?? '';
+	if (
+		head === '/bin/sh' ||
+		head === '/bin/bash' ||
+		head === '/usr/bin/env' ||
+		head === '/usr/bin/sh' ||
+		head === '/usr/bin/bash'
+	) {
+		return true;
+	}
+	return false;
+}
+
+test('S04 — RPM package declares runtime requirements', async (
+	{},
+	testInfo,
+) => {
+	testInfo.annotations.push({
+		type: 'severity',
+		description: 'Critical',
+	});
+	testInfo.annotations.push({
+		type: 'surface',
+		description: 'DNF repository / dependency declarations',
+	});
+
+	// Skip cleanly on hosts without rpm tooling.
+	const rpmWhich = await probe('which', ['rpm']);
+	await testInfo.attach('which-rpm', {
+		body: formatProbe(rpmWhich),
+		contentType: 'text/plain',
+	});
+	if (rpmWhich.exitCode !== 0 || !rpmWhich.stdout) {
+		test.skip(
+			true,
+			'S04 only applies to rpm-installed claude-desktop ' +
+				'(rpm not on PATH)',
+		);
+		return;
+	}
+
+	// Resolve installed package version. `rpm -q` returns non-zero if
+	// the package isn't installed via rpm (Debian/AppImage host with
+	// rpm tooling, etc) — that's the second skip path.
+	const rpmQ = await probe('rpm', ['-q', 'claude-desktop']);
+	await testInfo.attach('rpm-q', {
+		body: formatProbe(rpmQ),
+		contentType: 'text/plain',
+	});
+	if (rpmQ.exitCode !== 0) {
+		test.skip(
+			true,
+			'S04 only applies to rpm-installed claude-desktop ' +
+				'(rpm -q claude-desktop returned non-zero)',
+		);
+		return;
+	}
+
+	// Capture install path for the diagnostics bundle. Failure here
+	// isn't a skip — `which` not finding `claude-desktop` on a host
+	// where `rpm -q claude-desktop` succeeds is unusual but harmless
+	// for the assertion shape.
+	const whichClaude = await probe('which', ['claude-desktop']);
+	await testInfo.attach('which-claude-desktop', {
+		body: formatProbe(whichClaude),
+		contentType: 'text/plain',
+	});
+
+	const rpmRequires = await probe('rpm', ['-qR', 'claude-desktop']);
+	await testInfo.attach('rpm-qR', {
+		body: formatProbe(rpmRequires),
+		contentType: 'text/plain',
+	});
+	expect(
+		rpmRequires.exitCode,
+		`rpm -qR claude-desktop must succeed on an rpm-installed host`,
+	).toBe(0);
+
+	const allLines = rpmRequires.stdout
+		.split('\n')
+		.map((l) => l.trim())
+		.filter((l) => l.length > 0);
+	const declaredRequires = allLines.filter(
+		(l) => !isAutoEmittedRequire(l),
+	);
+
+	await testInfo.attach('requires-classified', {
+		body: JSON.stringify(
+			{
+				all: allLines,
+				declared: declaredRequires,
+				declaredCount: declaredRequires.length,
+			},
+			null,
+			2,
+		),
+		contentType: 'application/json',
+	});
+
+	// Core S04 assertion. Per case-doc "Expected": "All transitive
+	// runtime deps are declared in the RPM and pulled by DNF." A
+	// non-empty `declaredRequires` is the minimum signal — it doesn't
+	// prove the *full* set is declared, but it proves the spec moved
+	// off `AutoReqProv: no` with no manual `Requires:` (the current
+	// state per scripts/packaging/rpm.sh:188).
+	//
+	// Today this fails by design — the failure IS the regression-
+	// detector state. The assertion flips green once
+	// scripts/packaging/rpm.sh starts declaring runtime deps (manual
+	// Requires lines, AutoReqProv flip, or both).
+	expect(
+		declaredRequires.length,
+		`rpm -qR claude-desktop should report at least one declared ` +
+			`runtime requirement (non-rpmlib(...), non-interpreter). ` +
+			`Currently empty per scripts/packaging/rpm.sh:188 ` +
+			`(\`AutoReqProv: no\`, no \`Requires:\`).`,
+	).toBeGreaterThan(0);
+});
--- a/tools/test-harness/src/runners/S05_doctor_recognises_rpm_install.spec.ts
+++ b/tools/test-harness/src/runners/S05_doctor_recognises_rpm_install.spec.ts
@@ -0,0 +1,201 @@
+import { test, expect } from '@playwright/test';
+import { execFile } from 'node:child_process';
+import { promisify } from 'node:util';
+import {
+	runDoctor,
+	captureSessionEnv,
+} from '../lib/diagnostics.js';
+
+const exec = promisify(execFile);
+
+// S05 — Doctor recognises rpm-installed claude-desktop, doesn't
+// false-flag as AppImage.
+//
+// Per docs/testing/cases/distribution.md S05 (sibling of T13 in
+// launch.md — same surface, intentional matrix overlap):
+//
+// * Steps: on a Fedora/Nobara/RPM-based distro with claude-desktop
+//   installed via dnf, run `claude-desktop --doctor` and look for the
+//   install-method line.
+// * Expected: doctor detects rpm install (e.g. via `rpm -qf` against
+//   the binary path) and reports it cleanly. No `not found via dpkg
+//   (AppImage?)` warning.
+// * Currently: scripts/doctor.sh's install-method probe is gated on
+//   `command -v dpkg-query` and has no `rpm -qf` branch. Case-doc
+//   anchors the block as :290-299; the actual lines in the file as of
+//   runner-write time are :353-362 (drift noted, see report). On
+//   RPM-only hosts (no dpkg-query) the entire block is skipped — no
+//   install-method line is printed at all. On hosts with both
+//   dpkg-query installed AND an rpm-installed claude-desktop, the
+//   `_warn 'claude-desktop not found via dpkg (AppImage?)'` branch
+//   fires only if dpkg-query comes up empty. (Anecdotally on some
+//   Fedora hosts dpkg-query returns a stale Version string against
+//   `claude-desktop` — in that case the PASS path runs and the
+//   warning is suppressed for the wrong reason, but S05 still
+//   passes by the letter of the assertion.)
+//
+// Scope split vs T13:
+//
+// * T13 (launch.md) covers all rows: detect rpm OR deb, assert no
+//   false-flag for whichever owns the binary. Skips on AppImage /
+//   hand-built / undetectable installs.
+// * S05 (this file) is RPM-only: skips when `rpm -qf` doesn't claim
+//   the binary, regardless of whether dpkg owns it. The matrix wants
+//   both cells filled; the overlap is intentional. S05 records its
+//   own result on Fedora rows even when T13's broader gating skips
+//   (e.g. if `rpm` is missing from PATH, T13 falls through to the
+//   `unknown` branch and skips, while S05 reports a skip for the
+//   same reason but separately).
+//
+// Layer: spawn probe + stdout grep. Doesn't touch the running app
+// instance; doctor is `--doctor`-gated and exits without launching
+// Electron.
+//
+// Diagnostics on failure (per case-doc): full --doctor output,
+// `rpm -qf $(which claude-desktop)`, the doctor source line that
+// decides the format. Captured unconditionally as attachments so
+// post-hoc triage from a JUnit-only run is possible.
+
+const FALSE_FLAG_FRAGMENT = 'not found via dpkg (AppImage?)';
+
+interface ProbeResult {
+	cmd: string;
+	exitCode: number | null;
+	stdout: string;
+	stderr: string;
+}
+
+async function probe(
+	bin: string,
+	args: string[],
+): Promise<ProbeResult> {
+	const cmd = `${bin} ${args.join(' ')}`;
+	try {
+		const { stdout, stderr } = await exec(bin, args, {
+			timeout: 5_000,
+		});
+		return {
+			cmd,
+			exitCode: 0,
+			stdout: stdout.trim(),
+			stderr: stderr.trim(),
+		};
+	} catch (err) {
+		const e = err as {
+			stdout?: string;
+			stderr?: string;
+			code?: number;
+		};
+		return {
+			cmd,
+			exitCode: typeof e.code === 'number' ? e.code : null,
+			stdout: (e.stdout ?? '').trim(),
+			stderr: (e.stderr ?? '').trim(),
+		};
+	}
+}
+
+function formatProbe(p: ProbeResult): string {
+	const tail = [
+		p.stdout && `stdout: ${p.stdout}`,
+		p.stderr && `stderr: ${p.stderr}`,
+	]
+		.filter(Boolean)
+		.join('\n');
+	return `$ ${p.cmd} (exit ${p.exitCode ?? '?'})\n${tail}`.trim();
+}
+
+test('S05 — Doctor recognises rpm install, no dpkg false-flag', async (
+	{},
+	testInfo,
+) => {
+	testInfo.annotations.push({
+		type: 'severity',
+		description: 'Should',
+	});
+	testInfo.annotations.push({
+		type: 'surface',
+		description: 'CLI / --doctor',
+	});
+
+	// Applies to RPM-based rows per case-doc (KDE-W, KDE-X, GNOME,
+	// Sway, i3, Niri). Rather than gating on the ROW env var, gate on
+	// the actual install method — the assertion has no signal on
+	// non-rpm hosts regardless of how the matrix labels them.
+
+	await testInfo.attach('session-env', {
+		body: JSON.stringify(captureSessionEnv(), null, 2),
+		contentType: 'application/json',
+	});
+
+	const launcher =
+		process.env.CLAUDE_DESKTOP_LAUNCHER ?? 'claude-desktop';
+	const whichProbe = await probe('which', [launcher]);
+	await testInfo.attach('which-claude-desktop', {
+		body: formatProbe(whichProbe),
+		contentType: 'text/plain',
+	});
+
+	const installPath =
+		whichProbe.stdout.split('\n')[0]?.trim() ?? '';
+	if (whichProbe.exitCode !== 0 || !installPath) {
+		test.skip(
+			true,
+			`claude-desktop not reachable on PATH ` +
+				`(launcher='${launcher}'); rpm-install probe needs ` +
+				`a resolvable binary`,
+		);
+		return;
+	}
+
+	// Detect rpm install. `rpm -qf` returns 0 + the owning package's
+	// NEVRA when the file is rpm-managed, non-zero otherwise. We also
+	// run `rpm -q claude-desktop` to surface the package metadata
+	// independent of which file `which` resolved (helpful when the
+	// launcher is a wrapper script that shadows the real binary).
+	const rpmFile = await probe('rpm', ['-qf', installPath]);
+	const rpmPkg = await probe('rpm', ['-q', 'claude-desktop']);
+	await testInfo.attach('rpm-qf', {
+		body: formatProbe(rpmFile),
+		contentType: 'text/plain',
+	});
+	await testInfo.attach('rpm-q-claude-desktop', {
+		body: formatProbe(rpmPkg),
+		contentType: 'text/plain',
+	});
+
+	if (rpmFile.exitCode !== 0) {
+		// Not rpm-installed. S05's assertion only has signal on RPM
+		// rows; on deb / AppImage / hand-built / undetectable installs
+		// this is a clean skip (T13 covers the deb-side mirror).
+		test.skip(
+			true,
+			`S05 only applies to rpm-installed claude-desktop; ` +
+				`rpm -qf ${installPath} returned ` +
+				`exit ${rpmFile.exitCode ?? '?'} ` +
+				`(stderr: ${rpmFile.stderr || '<empty>'})`,
+		);
+		return;
+	}
+
+	const result = await runDoctor(launcher);
+	await testInfo.attach('doctor-output', {
+		body: result.output,
+		contentType: 'text/plain',
+	});
+	await testInfo.attach('doctor-exit-code', {
+		body: String(result.exitCode),
+		contentType: 'text/plain',
+	});
+
+	// Core S05 assertion: doctor must NOT print the dpkg false-flag
+	// warning for an rpm-installed copy. T02 already asserts the
+	// exit-code contract (`doctor exits 0`) — don't duplicate that
+	// here; S05 is purely about the install-method line.
+	expect(
+		result.output,
+		`doctor must not false-flag rpm install ` +
+			`(${rpmFile.stdout || 'rpm-owned'} at ${installPath}) ` +
+			`as missing-dpkg AppImage`,
+	).not.toContain(FALSE_FLAG_FRAGMENT);
+});
--- a/tools/test-harness/src/runners/S07_claude_use_wayland_opt_in.spec.ts
+++ b/tools/test-harness/src/runners/S07_claude_use_wayland_opt_in.spec.ts
@@ -0,0 +1,167 @@
+import { test, expect } from '@playwright/test';
+import { launchClaude } from '../lib/electron.js';
+import { skipUnlessRow } from '../lib/row.js';
+import { readPidArgv, argvHasFlag } from '../lib/argv.js';
+import { readLauncherLog, captureSessionEnv } from '../lib/diagnostics.js';
+import { retryUntil } from '../lib/retry.js';
+
+// S07 — `CLAUDE_USE_WAYLAND=1` opt-in path works.
+//
+// Backs S07 in docs/testing/cases/shortcuts-and-input.md.
+//
+// Case-doc anchors:
+//   scripts/launcher-common.sh:28-29 — `CLAUDE_USE_WAYLAND=1` opt-in
+//     (sets `use_x11_on_wayland=false`, taking the native-Wayland
+//     branch in build_electron_args).
+//   scripts/launcher-common.sh:100-111 — native-Wayland Electron flags:
+//     `--enable-features=UseOzonePlatform,WaylandWindowDecorations`,
+//     `--ozone-platform=wayland`, `--enable-wayland-ime`,
+//     `--wayland-text-input-version=3`, plus `GDK_BACKEND=wayland`.
+//
+// What this asserts: when the harness's Wayland mode is engaged
+// (`CLAUDE_HARNESS_USE_WAYLAND=1`), the spawned Electron's argv
+// contains `--ozone-platform=wayland` and
+// `--enable-features=UseOzonePlatform`, the flags the launcher's
+// CLAUDE_USE_WAYLAND=1 branch emits (see
+// LAUNCHER_INJECTED_FLAGS_WAYLAND in src/lib/electron.ts:134-141).
+//
+// Gating choice — harness-mode vs launcher-script:
+//
+// The harness deliberately bypasses the launcher script (CDP-gate
+// reasons — see lib/electron.ts:102-117), so it constructs its own
+// flag set. Setting `extraEnv: { CLAUDE_USE_WAYLAND: '1' }` would
+// only affect the child env, not the harness's flag selector. To
+// exercise the Wayland branch end-to-end the harness exposes
+// `CLAUDE_HARNESS_USE_WAYLAND=1`, which:
+//   1. swaps to LAUNCHER_INJECTED_FLAGS_WAYLAND (the same flag
+//      set the launcher's Wayland branch emits), and
+//   2. exports `CLAUDE_USE_WAYLAND=1` + `GDK_BACKEND=wayland` into
+//      the child env.
+//
+// This test asserts the argv half of that contract (the child env
+// is not read back). With CLAUDE_HARNESS_USE_WAYLAND unset we skip:
+// the X11 default doesn't model the opt-in path. Run the suite with
+// `CLAUDE_HARNESS_USE_WAYLAND=1 npx playwright test ...` to
+// activate the assertion.
+//
+// Row gate: native-Wayland-capable rows only. KDE-W is intentionally
+// included even though the case-doc Applies-to lists wlroots rows
+// (Sway/Niri/Hypr) — KDE Plasma Wayland can also run native Wayland
+// when CLAUDE_USE_WAYLAND=1 is set, and KDE-W is the harness's CI
+// row, so we want this to be exercisable there.
+
+test.setTimeout(45_000);
+
+test('S07 — CLAUDE_USE_WAYLAND opt-in surfaces in Electron argv', async ({}, testInfo) => {
+	testInfo.annotations.push({ type: 'severity', description: 'Should' });
+	testInfo.annotations.push({
+		type: 'surface',
+		description: 'Display backend / Wayland opt-in',
+	});
+	skipUnlessRow(testInfo, [
+		'Sway',
+		'Niri',
+		'Hypr-O',
+		'Hypr-N',
+		'GNOME-W',
+		'KDE-W',
+	]);
+
+	if (process.env.CLAUDE_HARNESS_USE_WAYLAND !== '1') {
+		test.skip(
+			true,
+			'S07 requires CLAUDE_HARNESS_USE_WAYLAND=1 (the harness ' +
+				'Wayland-mode that mirrors the launcher CLAUDE_USE_WAYLAND ' +
+				'branch). Re-run with the env set.',
+		);
+		return;
+	}
+
+	await testInfo.attach('session-env', {
+		body: JSON.stringify(captureSessionEnv(), null, 2),
+		contentType: 'application/json',
+	});
+	await testInfo.attach('harness-env', {
+		body: JSON.stringify(
+			{
+				CLAUDE_HARNESS_USE_WAYLAND:
+					process.env.CLAUDE_HARNESS_USE_WAYLAND ?? null,
+				CLAUDE_USE_WAYLAND: process.env.CLAUDE_USE_WAYLAND ?? null,
+			},
+			null,
+			2,
+		),
+		contentType: 'application/json',
+	});
+
+	const useHostConfig = process.env.CLAUDE_TEST_USE_HOST_CONFIG === '1';
+	const app = await launchClaude({
+		isolation: useHostConfig ? null : undefined,
+	});
+
+	try {
+		// Don't waitForX11Window — under native Wayland the app is
+		// going through Ozone-Wayland directly, no XWayland window
+		// appears. /proc/$pid/cmdline is populated by exec(), so we
+		// just need the spawned Electron to stay alive long enough
+		// to read it. Poll for non-null + non-empty argv.
+		const argv = await retryUntil(
+			async () => {
+				const a = await readPidArgv(app.pid);
+				return a && a.length > 0 ? a : null;
+			},
+			{ timeout: 15_000, interval: 250 },
+		);
+		await testInfo.attach('electron-argv', {
+			body: JSON.stringify(argv, null, 2),
+			contentType: 'application/json',
+		});
+		expect(argv, 'could read /proc/$pid/cmdline').not.toBeNull();
+
+		// Launcher log is only populated when the launcher script
+		// runs; the harness spawns Electron directly. Capture the
+		// log if it happens to exist (host-leftover from an earlier
+		// real-launcher run) for diagnostic context only.
+		const log = await readLauncherLog();
+		if (log) {
+			const tail = log.split('\n').slice(-50).join('\n');
+			await testInfo.attach('launcher-log-tail', {
+				body: tail,
+				contentType: 'text/plain',
+			});
+		}
+
+		const ozoneWayland = argvHasFlag(argv ?? [], '--ozone-platform=wayland');
+		const useOzone = argvHasFlag(
+			argv ?? [],
+			'--enable-features=UseOzonePlatform',
+		);
+		await testInfo.attach('flag-presence', {
+			body: JSON.stringify(
+				{
+					'--ozone-platform=wayland': ozoneWayland,
+					'--enable-features=UseOzonePlatform': useOzone,
+					note:
+						'When CLAUDE_HARNESS_USE_WAYLAND=1 the harness ' +
+						'must emit the same Electron flag set as the ' +
+						'launcher script\'s CLAUDE_USE_WAYLAND=1 branch.',
+				},
+				null,
+				2,
+			),
+			contentType: 'application/json',
+		});
+
+		expect(
+			ozoneWayland,
+			'spawned Electron has --ozone-platform=wayland on argv',
+		).toBe(true);
+		expect(
+			useOzone,
+			'spawned Electron has --enable-features=UseOzonePlatform ' +
+				'(co-emitted with the wayland ozone flag)',
+		).toBe(true);
+	} finally {
+		await app.close();
+	}
+});
--- a/tools/test-harness/src/runners/S08_tray_dedupe_fastpath_fingerprint.spec.ts
+++ b/tools/test-harness/src/runners/S08_tray_dedupe_fastpath_fingerprint.spec.ts
@@ -0,0 +1,129 @@
+import { test, expect } from '@playwright/test';
+import { readAsarFile, resolveAsarPath } from '../lib/asar.js';
+import { skipUnlessRow } from '../lib/row.js';
+
+// S08 — Tray rebuild-race fast-path injected (file probe).
+//
+// Backs the static side of S08 in
+// docs/testing/cases/tray-and-window-chrome.md. T03 already covers the
+// runtime SNI-count assertion (post-`nativeTheme.themeSource` toggle:
+// exactly one StatusNotifierItem stays registered). This spec is the
+// complementary build-time fingerprint — verifies that
+// `patch_tray_inplace_update` in scripts/patches/tray.sh actually
+// landed in the bundled `index.js`, so a silent regex miss in the
+// patch script (idempotency guard short-circuits, anchor regex drifts
+// against minifier churn, etc.) is observable without having to wait
+// for a runtime tray-duplication failure on KDE.
+//
+// Fingerprint: literal `.setImage(` substring in
+// `.vite/build/index.js`.
+//
+// Why this is load-bearing and stable:
+//
+//   - Pristine upstream (`build-reference/app-extracted/.vite/build/
+//     index.js`) contains zero `.setImage(` occurrences. The tray
+//     constructs exclusively via `new <EL>.Tray(<EL>.nativeImage
+//     .createFromPath(...))` and never re-images in place. (Verified
+//     by `grep -cE '\.setImage\s*\(' index.js` → 0.)
+//   - The injected fast-path emitted by `patch_tray_inplace_update`
+//     (scripts/patches/tray.sh:212-217) calls
+//     `<TRAY_VAR>.setImage(<EL_VAR>.nativeImage.createFromPath(
+//     <PATH_VAR>))` — that is the entire point of the fast-path
+//     (skip destroy + recreate, update the existing Tray's image in
+//     place so the SNI registration stays put on KDE Plasma).
+//   - The Electron API name `setImage` is not a minified local —
+//     it's a method on `Tray.prototype` and stays literal across
+//     upstream version bumps regardless of the bundler's variable
+//     renaming. So the fingerprint is robust to the same minifier
+//     churn that forces tray.sh to extract `tray_var` / `electron_var`
+//     / `path_var` dynamically.
+//   - Idempotency marker in tray.sh:174-180 keys on the same literal
+//     post-rename `setImage(<EL>.nativeImage.createFromPath(<PATH>))`
+//     sequence; presence of `.setImage(` therefore tracks 1:1 with
+//     the patch's own self-detection.
+//
+// Why not the other candidates considered:
+//
+//   - `_trayStartTime`: already covered by H03 for the prior tray.sh
+//     sub-patch (`patch_tray_menu_handler`). H03's note explicitly
+//     calls out that the in-place update sub-patch needs its own
+//     fingerprint, which is what S08 supplies here.
+//   - `process.platform!=="darwin"`: appears 50+ times in the
+//     minified bundle (every Electron-on-Linux / -on-Windows
+//     branch). Not distinctive.
+//   - `setContextMenu` count >= 2: works (upstream has exactly one
+//     occurrence; patched bundle has two — fast-path + slow-path),
+//     but is brittle to any future upstream code that calls
+//     `setContextMenu` for an unrelated reason. `.setImage(`
+//     presence-only is stricter and simpler.
+//
+// Pure file probe — no app launch. Fast (<1s). Row-gated to KDE
+// (case-doc Applies-to: KDE-W, KDE-X) since the underlying SNI
+// rebuild race only manifests on KDE Plasma's `systemtray` widget;
+// other DEs handle UnregisterItem/Register sequencing without the
+// duplicate-icon visual artifact, so the fast-path is a should-have
+// there but the assertion isn't load-bearing for the row.
+
+test('S08 — Tray rebuild-race fast-path injected (file probe)', async ({}, testInfo) => {
+	skipUnlessRow(testInfo, ['KDE-W', 'KDE-X']);
+
+	testInfo.annotations.push({ type: 'severity', description: 'Should' });
+	testInfo.annotations.push({
+		type: 'surface',
+		description: 'Tray icon / KDE rebuild race',
+	});
+
+	const asarPath = resolveAsarPath();
+	await testInfo.attach('asar-path', {
+		body: asarPath,
+		contentType: 'text/plain',
+	});
+
+	const indexJs = readAsarFile('.vite/build/index.js', asarPath);
+
+	// `.setImage(` is the patch-injected literal. Match-count is
+	// surfaced for diagnostics: 0 = patch missed, 1+ = patch landed.
+	// (We don't pin to exactly 1 — if upstream ever ships a
+	// legitimate second `.setImage(` site, the patch's fast-path is
+	// still present and S08 should still pass.)
+	const setImageCount = (indexJs.match(/\.setImage\s*\(/g) ?? []).length;
+	const fastPathPresent = setImageCount > 0;
+
+	// Bonus diagnostic signal: the slow-path destroy+recreate block
+	// is preserved by the patch (it stays in place for initial-
+	// creation and tray-disable cases — see tray.sh:182-188 and
+	// docs/learnings/tray-rebuild-race.md "The fix"). So a healthy
+	// patched bundle has >= 2 `setContextMenu` calls (fast + slow
+	// path) and >= 1 `.setImage(` call (fast path only). Pristine
+	// upstream has exactly 1 `setContextMenu` and 0 `.setImage(`.
+	const setContextMenuCount = (
+		indexJs.match(/\.setContextMenu\s*\(/g) ?? []
+	).length;
+
+	await testInfo.attach('fingerprint-evidence', {
+		body: JSON.stringify(
+			{
+				file: '.vite/build/index.js',
+				fingerprint: '.setImage(',
+				setImageCount,
+				setContextMenuCount,
+				fastPathPresent,
+				source:
+					'scripts/patches/tray.sh:212-217 (patch_tray_inplace_update) ' +
+					'injects `<TRAY>.setImage(<EL>.nativeImage.' +
+					'createFromPath(<PATH>))` before the destroy+recreate ' +
+					'block. Upstream never calls .setImage on the tray, ' +
+					'so non-zero count == patch landed.',
+			},
+			null,
+			2,
+		),
+		contentType: 'application/json',
+	});
+
+	expect(
+		fastPathPresent,
+		'app.asar contains the in-place `.setImage(` call injected by ' +
+			'patch_tray_inplace_update (scripts/patches/tray.sh)',
+	).toBe(true);
+});
--- a/tools/test-harness/src/runners/S09_quick_window_patch_only_kde.spec.ts
+++ b/tools/test-harness/src/runners/S09_quick_window_patch_only_kde.spec.ts
@@ -0,0 +1,47 @@
+import { test, expect } from '@playwright/test';
+import { readAsarFile, resolveAsarPath } from '../lib/asar.js';
+
+// S09 — Quick window patch runs only on KDE (post-#406 gate).
+// Backs QE-19 in docs/testing/quick-entry-closeout.md.
+//
+// The patch in scripts/patches/quick-window.sh injects an
+// `(process.env.XDG_CURRENT_DESKTOP||"").toLowerCase().includes("kde")`
+// gate into the bundled JS. The string `XDG_CURRENT_DESKTOP` shows up
+// in app.asar's index.js if and only if the patch ran at build time.
+// The patch ships in every build; the KDE-vs-non-KDE branch is
+// decided at runtime by the env-var check.
+//
+// Pure file probe — no app launch. Fast (<1s).
+//
+// Runtime gate effectiveness is verified implicitly by S31 passing
+// on KDE (popup-show works through the patched code path) and the
+// upstream-equivalent path running on non-KDE rows.
+
+test('S09 — Quick window patch runs only on KDE (post-#406 gate)', async ({}, testInfo) => {
+	testInfo.annotations.push({ type: 'severity', description: 'Critical' });
+	testInfo.annotations.push({ type: 'surface', description: 'Patch gate' });
+
+	const asarPath = resolveAsarPath();
+	await testInfo.attach('asar-path', {
+		body: asarPath,
+		contentType: 'text/plain',
+	});
+
+	const indexJs = readAsarFile('.vite/build/index.js', asarPath);
+
+	// The gate string is the build-time fingerprint of the patch. If
+	// the patch didn't run, the bundled JS won't contain it.
+	const gatePresent = indexJs.includes('XDG_CURRENT_DESKTOP');
+	expect(
+		gatePresent,
+		'app.asar contains the XDG_CURRENT_DESKTOP gate string injected by quick-window.sh',
+	).toBe(true);
+
+	// Bonus signal: the lowercase `kde` literal the gate compares
+	// against. A bare substring is weak evidence; diagnostics only.
+	const patchedComment = indexJs.includes('kde');
+	await testInfo.attach('gate-evidence', {
+		body: JSON.stringify({ gatePresent, patchedComment }, null, 2),
+		contentType: 'application/json',
+	});
+});
--- a/tools/test-harness/src/runners/S10_quick_entry_popup_transparent.spec.ts
+++ b/tools/test-harness/src/runners/S10_quick_entry_popup_transparent.spec.ts
@@ -0,0 +1,122 @@
+import { test, expect } from '@playwright/test';
+import { launchClaude } from '../lib/electron.js';
+import { skipUnlessRow } from '../lib/row.js';
+import { QuickEntry } from '../lib/quickentry.js';
+import { captureSessionEnv } from '../lib/diagnostics.js';
+
+// S10 — Quick Entry popup is transparent (no opaque square frame).
+// Backs the KDE-W row of S10 in
+// docs/testing/cases/shortcuts-and-input.md.
+//
+// Upstream constructs the popup BrowserWindow with
+//   transparent: true, backgroundColor: "#00000000", frame: false
+// at build-reference index.js:515380, 515383, 515381. On KDE Plasma
+// Wayland the compositor honours the alpha channel and the popup
+// renders with a transparent background; on broken-Electron versions
+// (electron/electron#50213, the 41.0.4-41.x.y bisect window per
+// @noctuum on #370) the alpha is dropped and an opaque square frame
+// shows behind the rounded prompt UI.
+//
+// Construction-time options aren't observable through the prototype-
+// method hook in lib/quickentry.ts (the Proxy from frame-fix-wrapper
+// returns the closure-captured PatchedBrowserWindow on `electron.
+// BrowserWindow` reads — see the doc-comment on
+// QuickEntry.installInterceptor and CLAUDE.md "Test harness Electron
+// hooks" learning). Runtime-side, `getBackgroundColor()` reflects
+// what the BrowserWindow was actually constructed with — so we read
+// it via getPopupRuntimeProps() and assert
+//   transparent === true && backgroundColor in {'#00000000','#0000'}
+// matching the predicate in lib/quickentry.ts:266.
+//
+// Gated to KDE-W: other KDE rows (KDE-X) don't have the same
+// compositor / Electron-Wayland concern that the case-doc S10
+// surfaces. If S10 fails on a host whose bundled Electron is in the
+// 41.0.4-41.x.y window, that's the upstream regression — see S33 for
+// the version-capture half. Don't wrap in skip on failure; surface
+// it as a regression-detector signal.
+
+test.setTimeout(60_000);
+
+test('S10 — Quick Entry popup is transparent (no opaque square frame)', async ({}, testInfo) => {
+	testInfo.annotations.push({ type: 'severity', description: 'Should' });
+	testInfo.annotations.push({
+		type: 'surface',
+		description: 'Quick Entry window (KDE Wayland)',
+	});
+	skipUnlessRow(testInfo, ['KDE-W']);
+
+	await testInfo.attach('session-env', {
+		body: JSON.stringify(captureSessionEnv(), null, 2),
+		contentType: 'application/json',
+	});
+
+	const useHostConfig = process.env.CLAUDE_TEST_USE_HOST_CONFIG === '1';
+	const app = await launchClaude({
+		isolation: useHostConfig ? null : undefined,
+	});
+
+	await testInfo.attach('isolation', {
+		body: JSON.stringify(
+			{
+				useHostConfig,
+				configDir: app.isolation?.configDir ?? null,
+			},
+			null,
+			2,
+		),
+		contentType: 'application/json',
+	});
+
+	try {
+		// Main needs to be up before the shortcut can lazily construct
+		// the popup — the popup-show path reads renderer state via
+		// upstream's lHn() user-loaded check (see openAndWaitReady's
+		// retry-loop comment in lib/quickentry.ts).
+		const { inspector } = await app.waitForReady('mainVisible');
+		const qe = new QuickEntry(inspector);
+		await qe.installInterceptor();
+
+		// Fire the OS shortcut and wait for the popup BrowserWindow to
+		// be visible with its textarea mounted — same handshake S29
+		// uses. If ydotool isn't reachable, openAndWaitReady throws
+		// the install-instructions error from ensureYdotool — that
+		// surfaces as a clear test failure (acceptable per the
+		// case-doc; not wrapped in a skip).
+		await qe.openAndWaitReady();
+
+		const props = await qe.getPopupRuntimeProps();
+		await testInfo.attach('popup-runtime-props', {
+			body: JSON.stringify(props, null, 2),
+			contentType: 'application/json',
+		});
+
+		expect(
+			props,
+			'getPopupRuntimeProps returned null — interceptor did not ' +
+				'capture the popup BrowserWindow ref',
+		).not.toBeNull();
+		// Predicate matches lib/quickentry.ts:266 — '#00000000' is the
+		// canonical 8-digit form Electron returns for the upstream
+		// construction value, '#0000' is the short form some Electron
+		// builds normalise to. Either is acceptable.
+		expect(
+			props!.backgroundColor === '#00000000'
+				|| props!.backgroundColor === '#0000',
+			`popup backgroundColor must be transparent (#00000000 or ` +
+				`#0000), got ${JSON.stringify(props!.backgroundColor)}. ` +
+				`If the bundled Electron is in the 41.0.4-41.x.y window ` +
+				`(see S33), this is the electron#50213 regression ` +
+				`tracked under issue #370.`,
+		).toBe(true);
+		expect(
+			props!.transparent,
+			'popup transparent flag (derived from backgroundColor) is ' +
+				'false — opaque square frame would render behind the ' +
+				'rounded prompt UI',
+		).toBe(true);
+
+		inspector.close();
+	} finally {
+		await app.close();
+	}
+});
--- a/tools/test-harness/src/runners/S11_quick_entry_from_other_focus.spec.ts
+++ b/tools/test-harness/src/runners/S11_quick_entry_from_other_focus.spec.ts
@@ -0,0 +1,262 @@
+import { test, expect } from '@playwright/test';
+import { launchClaude } from '../lib/electron.js';
+import { skipUnlessRow } from '../lib/row.js';
+import { QuickEntry } from '../lib/quickentry.js';
+import {
+	focusOtherWindow,
+	getFocusedWindowId,
+	spawnMarkerWindow,
+	WaylandFocusUnavailable,
+	XdotoolUnavailable,
+	type MarkerWindow,
+} from '../lib/input.js';
+import { captureSessionEnv, readLauncherLog } from '../lib/diagnostics.js';
+import { sleep } from '../lib/retry.js';
+
+// S11 — Quick Entry shortcut fires from any focus on Wayland
+// (mutter XWayland key-grab). Backs the S11 row in
+// docs/testing/cases/shortcuts-and-input.md (severity: Critical).
+//
+// What this catches vs what it doesn't
+// ------------------------------------
+// The case-doc's load-bearing concern is the GNOME-W mutter
+// XWayland key-grab regression — issue #404 — where mutter under
+// native Wayland refuses to honour the XWayland-side global key
+// grab, so the shortcut becomes focus-bound. This spec CANNOT
+// detect that regression: there is no portable focus-injection
+// path on native Wayland (each compositor exposes its own IPC
+// and the libei input-emulation portal isn't universally
+// honoured). The lib/input.ts focus-shifter primitive throws
+// `WaylandFocusUnavailable` on native Wayland rows by design —
+// see its leading comment for the full reasoning. The Wayland-
+// side regression detector is a primitive-gap; it stays manual
+// until libei adoption broadens.
+//
+// What this spec DOES catch is a regression in the X11-side of
+// the global-shortcut path (the side that currently works on
+// GNOME-X / Ubu-X — `🔧` and `✅` respectively in the matrix).
+// If the X11 grab broke on those rows, S11 would catch it. So
+// this is a regression detector on a CURRENTLY-PASSING path,
+// unlike S12 which is a currently-failing detector for the
+// `--enable-features=GlobalShortcutsPortal` wiring.
+//
+// Row gate
+// --------
+// Case-doc applies-to is "GNOME, Ubu" (both W and X variants),
+// but the focus-shifter primitive is X11-only, gated strictly on
+// `XDG_SESSION_TYPE === 'x11'`. Wayland rows can't be exercised
+// here — they would either skip via the row gate or trip
+// `WaylandFocusUnavailable` from the primitive. So the runner's
+// row gate is the X11 subset only: GNOME-X, Ubu-X. The Wayland
+// rows for S11 stay manual / matrix-cell-from-doc until a
+// libei-based primitive lands.
+
+test.setTimeout(60_000);
+
+test('S11 — Quick Entry shortcut fires from any focus (X11 path)', async ({}, testInfo) => {
+	skipUnlessRow(testInfo, ['GNOME-X', 'Ubu-X']);
+	testInfo.annotations.push({ type: 'severity', description: 'Critical' });
+	testInfo.annotations.push({
+		type: 'surface',
+		description: 'Quick Entry / global shortcut',
+	});
+
+	// Single-shot diagnostic record. We attach this once at the
+	// end (or on early throw) rather than spreading five separate
+	// attachments — mirrors S31's results shape so matrix-regen
+	// has one well-known JSON to scrape per spec.
+	const diag: {
+		sessionEnv: Record<string, string>;
+		markerTitle: string | null;
+		activeWidBeforeFocus: string | null;
+		activeWidAfterFocus: string | null;
+		popupState: unknown;
+		openError: string | null;
+		focusError: string | null;
+		launcherLogTail: string | null;
+	} = {
+		sessionEnv: captureSessionEnv(),
+		markerTitle: null,
+		activeWidBeforeFocus: null,
+		activeWidAfterFocus: null,
+		popupState: null,
+		openError: null,
+		focusError: null,
+		launcherLogTail: null,
+	};
+
+	const attachDiag = async () => {
+		await testInfo.attach('s11-diagnostics', {
+			body: JSON.stringify(diag, null, 2),
+			contentType: 'application/json',
+		});
+	};
+
+	const useHostConfig = process.env.CLAUDE_TEST_USE_HOST_CONFIG === '1';
+	const app = await launchClaude({
+		isolation: useHostConfig ? null : undefined,
+	});
+
+	let marker: MarkerWindow | null = null;
+	try {
+		// `mainVisible` is the cheapest level that gives us a
+		// registered global shortcut. Upstream registers via
+		// globalShortcut.register early in main-process startup
+		// (build-reference index.js:499416), but we still want
+		// the main window mapped so the popup-construction path
+		// has something to anchor to.
+		const { inspector } = await app.waitForReady('mainVisible');
+		const qe = new QuickEntry(inspector);
+		await qe.installInterceptor();
+
+		// Capture pre-focus active WID for the diagnostic record.
+		// On a healthy X11 session this is the Claude main window
+		// (we just `mainVisible`-readied it). If null, xprop is
+		// missing or _NET_ACTIVE_WINDOW is unset — neither is a
+		// blocker for the test, just less useful diagnostics.
+		diag.activeWidBeforeFocus = await getFocusedWindowId();
+
+		// Marker title is unique-per-test to avoid colliding with
+		// any leftover xterm from a previous run (xterm exits its
+		// `sleep 600` after 10min so leaks are bounded, but a
+		// re-run inside that window would otherwise match the
+		// stale window).
+		const markerTitle =
+			`claude-test-s11-marker-${testInfo.testId}-${Date.now()}`;
+		diag.markerTitle = markerTitle;
+
+		try {
+			marker = await spawnMarkerWindow(markerTitle);
+		} catch (err) {
+			// Most likely cause: xterm not on PATH. The primitive
+			// throws a plain Error with the install hint. Skip
+			// rather than fail — this is an environment gap.
+			const msg = err instanceof Error ? err.message : String(err);
+			diag.focusError = `spawnMarkerWindow: ${msg}`;
+			await attachDiag();
+			testInfo.skip(
+				true,
+				'xterm not installed; required for the focus-shift target. ' +
+					`Underlying: ${msg}`,
+			);
+			return;
+		}
+
+		// `focusOtherWindow` calls `xdotool search --name <title>`
+		// once and throws if there are zero matches; only the
+		// post-focus _NET_ACTIVE_WINDOW verification has its own
+		// retry. So we need a brief readiness poll for the marker
+		// window to actually map into the X tree before we attempt
+		// the focus shift — and the focus shift itself must
+		// eventually succeed within the budget.
+		//
+		// We capture the LAST error (rather than rethrowing on the
+		// first) so the diagnostic carries the real cause if every
+		// attempt fails. WaylandFocusUnavailable / XdotoolUnavailable
+		// are sticky — they won't change between retries — so we
+		// short-circuit out on the first occurrence and skip.
+		let focusOk = false;
+		let lastFocusErr: unknown = null;
+		let earlySkipReason: string | null = null;
+		const focusBudgetMs = 5_000;
+		const focusStart = Date.now();
+		while (Date.now() - focusStart < focusBudgetMs) {
+			try {
+				await focusOtherWindow(markerTitle);
+				focusOk = true;
+				break;
+			} catch (err) {
+				lastFocusErr = err;
+				if (err instanceof WaylandFocusUnavailable) {
+					earlySkipReason =
+						'WaylandFocusUnavailable on a row that was ' +
+						'supposed to be X11-gated. Check XDG_SESSION_TYPE.';
+					break;
+				}
+				if (err instanceof XdotoolUnavailable) {
+					earlySkipReason =
+						'xdotool not installed; required for the ' +
+						'focus-shift step. ' +
+						(err instanceof Error ? err.message : String(err));
+					break;
+				}
+				// "no X11 window matches" (marker not mapped yet) or
+				// "compositor refused activation" — both can resolve on
+				// retry. Brief pause then loop.
+				await sleep(100);
+			}
+		}
+
+		if (earlySkipReason) {
+			diag.focusError =
+				lastFocusErr instanceof Error
+					? lastFocusErr.message
+					: String(lastFocusErr);
+			await attachDiag();
+			testInfo.skip(true, earlySkipReason);
+			return;
+		}
+
+		if (!focusOk) {
+			const msg =
+				lastFocusErr instanceof Error
+					? lastFocusErr.message
+					: String(lastFocusErr);
+			diag.focusError = msg;
+			diag.launcherLogTail = await readLauncherLog();
+			await attachDiag();
+			throw new Error(
+				`focusOtherWindow failed within ${focusBudgetMs}ms: ${msg}`,
+			);
+		}
+
+		// At this point focus is on the marker xterm. Capture the
+		// post-focus active WID — should equal the marker's WID,
+		// not Claude's. (We don't have a clean way to fetch the
+		// marker's WID independently here without re-running
+		// xdotool; the value-vs-pre comparison in the diagnostic
+		// is sufficient evidence of the shift.)
+		diag.activeWidAfterFocus = await getFocusedWindowId();
+
+		// Now press the global shortcut. The whole point of S11:
+		// even though the marker xterm holds focus (and Claude
+		// does not), the OS-level grab should fire the popup.
+		try {
+			await qe.openAndWaitReady();
+		} catch (err) {
+			diag.openError = err instanceof Error ? err.message : String(err);
+			diag.popupState = await qe.getPopupState();
+			diag.launcherLogTail = await readLauncherLog();
+			await attachDiag();
+			throw err;
+		}
+
+		const popupState = await qe.getPopupState();
+		diag.popupState = popupState;
+		diag.launcherLogTail = await readLauncherLog();
+		await attachDiag();
+
+		// Single critical assertion: popup exists AND is visible
+		// after the shortcut press from non-Claude focus. A null
+		// state means the BrowserWindow was never constructed —
+		// the X11 grab didn't fire. visible === false means it
+		// constructed but show() was suppressed (the upstream
+		// lHn() short-circuit, or a regression in the visibility
+		// flow). Either is a fail for S11's contract.
+		expect(
+			popupState && popupState.visible,
+			'Quick Entry popup is visible after shortcut press from ' +
+				'non-Claude focus (X11 path)',
+		).toBe(true);
+	} finally {
+		// Marker xterm cleanup is idempotent. Always run before
+		// app.close() so the kill happens even if the spec
+		// throws between the two.
+		if (marker) {
+			await marker.kill().catch(() => {
+				// best-effort — process may already be dead
+			});
+		}
+		await app.close();
+	}
+});
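Review note on the focus-shift loop in S11 (S14 repeats it for Niri): the loop retries transient failures inside a time budget, stops at once on errors that cannot change between attempts, and always keeps the last error for the diagnostic record. A minimal standalone sketch of that pattern follows. `retryWithStickyErrors`, `RetryOutcome`, and the injectable `now` / `pause` parameters are hypothetical names used for illustration only; none of them exist in the harness.

```typescript
// Hypothetical helper, not part of the harness: a budgeted retry loop that
// stops early on "sticky" errors and reports the last error it saw.
type RetryOutcome =
	| { kind: 'ok'; attempts: number }
	| { kind: 'sticky'; attempts: number; lastError: unknown }
	| { kind: 'exhausted'; attempts: number; lastError: unknown };

async function retryWithStickyErrors(
	attempt: () => Promise<void>,
	isSticky: (err: unknown) => boolean,
	budgetMs: number,
	pauseMs: number,
	now: () => number = Date.now,
	pause: (ms: number) => Promise<void> = (ms) =>
		new Promise<void>((resolve) => setTimeout(resolve, ms)),
): Promise<RetryOutcome> {
	const start = now();
	let attempts = 0;
	let lastError: unknown = null;
	while (now() - start < budgetMs) {
		attempts += 1;
		try {
			await attempt();
			return { kind: 'ok', attempts };
		} catch (err) {
			// Keep the most recent failure so the caller can report it.
			lastError = err;
			// Sticky errors will not change on retry, so stop right away.
			if (isSticky(err)) return { kind: 'sticky', attempts, lastError };
			await pause(pauseMs);
		}
	}
	return { kind: 'exhausted', attempts, lastError };
}
```

In the spec, the `sticky` outcome would map to `testInfo.skip(...)` and `exhausted` to the thrown `focusOtherWindow failed within ...` error, with `isSticky` being something like `(err) => err instanceof WaylandFocusUnavailable || err instanceof XdotoolUnavailable`.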
--- a/tools/test-harness/src/runners/S12_global_shortcuts_portal_flag.spec.ts
+++ b/tools/test-harness/src/runners/S12_global_shortcuts_portal_flag.spec.ts
@@ -0,0 +1,95 @@
+import { test, expect } from '@playwright/test';
+import { launchClaude } from '../lib/electron.js';
+import { skipUnlessRow } from '../lib/row.js';
+import { readPidArgv, argvHasFlag } from '../lib/argv.js';
+import { readLauncherLog, captureSessionEnv } from '../lib/diagnostics.js';
+
+// S12 — `--enable-features=GlobalShortcutsPortal` launcher flag
+// wired up for GNOME Wayland. Backs QE-6 in
+// docs/testing/quick-entry-closeout.md.
+//
+// On GNOME Wayland, mutter no longer honors XWayland-side key grabs,
+// so the Quick Entry global shortcut fails from unfocused state
+// (#404). The fix is to route global shortcuts through XDG Desktop
+// Portal: pass `--enable-features=GlobalShortcutsPortal` to Electron
+// from the launcher when XDG_CURRENT_DESKTOP includes GNOME and
+// XDG_SESSION_TYPE is wayland.
+//
+// As of writing, this fix is NOT implemented. The test asserts the
+// fix's signature (the flag is in the spawned Electron's argv) and
+// will therefore FAIL on GNOME-W until the launcher patch lands.
+// That's intentional — it's the regression detector, not a smoke
+// test. Once the patch is in, this becomes a Critical green cell.
+//
+// Row gate: GNOME Wayland rows only (GNOME-W, Ubu-W). KDE rows skip with `-`.
+
+test.setTimeout(45_000);
+
+test('S12 — --enable-features=GlobalShortcutsPortal launcher flag wired up for GNOME Wayland', async ({}, testInfo) => {
+	testInfo.annotations.push({ type: 'severity', description: 'Critical' });
+	testInfo.annotations.push({
+		type: 'surface',
+		description: 'Launcher flag wiring',
+	});
+	skipUnlessRow(testInfo, ['GNOME-W', 'Ubu-W']);
+
+	await testInfo.attach('session-env', {
+		body: JSON.stringify(captureSessionEnv(), null, 2),
+		contentType: 'application/json',
+	});
+
+	const useHostConfig = process.env.CLAUDE_TEST_USE_HOST_CONFIG === '1';
+	const app = await launchClaude({
+		isolation: useHostConfig ? null : undefined,
+	});
+
+	try {
+		await app.waitForX11Window(15_000);
+
+		const argv = await readPidArgv(app.pid);
+		await testInfo.attach('electron-argv', {
+			body: JSON.stringify(argv, null, 2),
+			contentType: 'application/json',
+		});
+		expect(argv, 'could read /proc/$pid/cmdline').not.toBeNull();
+
+		// Launcher log carries a stable line — see
+		// scripts/launcher-common.sh:98, 102 — that says which backend
+		// was selected. Capture it for diagnostic context.
+		const log = await readLauncherLog();
+		if (log) {
+			const tail = log.split('\n').slice(-50).join('\n');
+			await testInfo.attach('launcher-log-tail', {
+				body: tail,
+				contentType: 'text/plain',
+			});
+		}
+
+		const present = argvHasFlag(
+			argv ?? [],
+			'--enable-features=GlobalShortcutsPortal',
+		);
+		await testInfo.attach('flag-presence', {
+			body: JSON.stringify(
+				{
+					flag: '--enable-features=GlobalShortcutsPortal',
+					present,
+					note:
+						'On GNOME Wayland this flag must be present for ' +
+						'#404 to be closeable. Until the launcher patch ' +
+						'lands, this test fails as a regression detector.',
+				},
+				null,
+				2,
+			),
+			contentType: 'application/json',
+		});
+
+		expect(
+			present,
+			'--enable-features=GlobalShortcutsPortal is in Electron argv on GNOME Wayland',
+		).toBe(true);
+	} finally {
+		await app.close();
+	}
+});
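Review note on the S12 flag check: the spec passes the exact string `--enable-features=GlobalShortcutsPortal` to `argvHasFlag`, whose implementation in `../lib/argv.js` is not shown in this diff. If the launcher ever combines several features in one switch (for example `--enable-features=A,GlobalShortcutsPortal`), an exact-match comparison would miss it. The sketch below shows one way to check by feature name instead. `parseProcCmdline`, `enabledFeatures`, and `hasEnabledFeature` are hypothetical helpers, and the sketch assumes plain comma-separated feature names with no parameter suffixes.

```typescript
// Hypothetical helpers, not part of the harness.

// Split the raw contents of /proc/<pid>/cmdline into argv. The kernel
// separates arguments with NUL bytes and usually ends with a trailing NUL.
function parseProcCmdline(raw: string): string[] {
	if (raw.length === 0) return [];
	const trimmed = raw.endsWith('\0') ? raw.slice(0, -1) : raw;
	return trimmed.split('\0');
}

// Collect every feature name listed in any `--enable-features=` switch.
function enabledFeatures(argv: readonly string[]): Set<string> {
	const prefix = '--enable-features=';
	const features = new Set<string>();
	for (const arg of argv) {
		if (!arg.startsWith(prefix)) continue;
		for (const name of arg.slice(prefix.length).split(',')) {
			const cleaned = name.trim();
			if (cleaned.length > 0) features.add(cleaned);
		}
	}
	return features;
}

// True when `feature` is named by some `--enable-features=` switch in argv.
function hasEnabledFeature(argv: readonly string[], feature: string): boolean {
	return enabledFeatures(argv).has(feature);
}
```

This only answers whether the name appears in argv; whether the runtime honors it when the switch is repeated is a separate question that the sketch does not address.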
--- a/tools/test-harness/src/runners/S14_quick_entry_from_other_focus_niri.spec.ts
+++ b/tools/test-harness/src/runners/S14_quick_entry_from_other_focus_niri.spec.ts
@@ -0,0 +1,266 @@
+import { test, expect } from '@playwright/test';
+import { launchClaude } from '../lib/electron.js';
+import { skipUnlessRow } from '../lib/row.js';
+import { QuickEntry } from '../lib/quickentry.js';
+import {
+	focusOtherWindow,
+	getFocusedWindowId,
+	spawnMarkerWindow,
+	NiriIpcUnavailable,
+	FootUnavailable,
+	type MarkerWindow,
+} from '../lib/input-niri.js';
+import { captureSessionEnv, readLauncherLog } from '../lib/diagnostics.js';
+import { sleep } from '../lib/retry.js';
+
+// S14 — Quick Entry shortcut fires from any focus on Niri
+// (XDG portal BindShortcuts path). Backs the S14 row in
+// docs/testing/cases/shortcuts-and-input.md (severity: Critical
+// for Niri users).
+//
+// What this catches vs what it doesn't
+// ------------------------------------
+// On Niri the launcher special-cases the app to native Wayland
+// (`scripts/launcher-common.sh:41-44`), so upstream's
+// `globalShortcut.register` (`index.js:499416`) routes through
+// Electron's `xdg-desktop-portal` `BindShortcuts` path inside
+// Chromium rather than an X11 grab. The case-doc records this
+// path as currently failing on Niri:
+// `Failed to call BindShortcuts (error code 5)`. So this spec
+// is a known-failing detector — the shape mirrors S12's
+// `--enable-features=GlobalShortcutsPortal` GNOME-W detector:
+// the assertion encodes the contract, and the test will start
+// passing automatically once the upstream / portal-side issue
+// is resolved on Niri without any spec edit.
+//
+// The user-visible symptom (Quick Entry shortcut doesn't fire
+// on Niri) is the same as #404 (mutter XWayland key-grab on
+// GNOME-W) but the root cause is different: Niri is wlroots
+// Wayland with no XWayland by default, so the X11-side
+// `lib/input.ts` focus-shifter cannot exercise this path.
+// `lib/input-niri.ts` is the substrate — `niri msg --json`
+// for the focus-injection + readback chain, `foot --title` for
+// the Wayland-native marker window. The mutter / GNOME-W
+// regression detector remains a separate primitive gap (libei
+// when broadly available, or a per-compositor mutter-IPC
+// primitive — neither shipped).
+//
+// Row gate
+// --------
+// Niri only. Other Wayland rows (KDE-W, GNOME-W, Ubu-W) each
+// need their own compositor IPC and stay manual / matrix-cell-
+// from-doc until a libei-based primitive lands.
+
+test.setTimeout(60_000);
+
+test('S14 — Quick Entry shortcut fires from any focus (Niri Wayland path)', async ({}, testInfo) => {
+	skipUnlessRow(testInfo, ['Niri']);
+	testInfo.annotations.push({ type: 'severity', description: 'Critical' });
+	testInfo.annotations.push({
+		type: 'surface',
+		description: 'XDG Desktop Portal BindShortcuts',
+	});
+
+	// Single-shot diagnostic record. We attach this once at the
+	// end (or on early throw) rather than spreading five separate
+	// attachments — mirrors S31's results shape so matrix-regen
+	// has one well-known JSON to scrape per spec.
+	const diag: {
+		sessionEnv: Record<string, string>;
+		markerTitle: string | null;
+		activeWidBeforeFocus: number | null;
+		activeWidAfterFocus: number | null;
+		popupState: unknown;
+		openError: string | null;
+		focusError: string | null;
+		launcherLogTail: string | null;
+	} = {
+		sessionEnv: captureSessionEnv(),
+		markerTitle: null,
+		activeWidBeforeFocus: null,
+		activeWidAfterFocus: null,
+		popupState: null,
+		openError: null,
+		focusError: null,
+		launcherLogTail: null,
+	};
+
+	const attachDiag = async () => {
+		await testInfo.attach('s14-diagnostics', {
+			body: JSON.stringify(diag, null, 2),
+			contentType: 'application/json',
+		});
+	};
+
+	const useHostConfig = process.env.CLAUDE_TEST_USE_HOST_CONFIG === '1';
+	const app = await launchClaude({
+		isolation: useHostConfig ? null : undefined,
+	});
+
+	let marker: MarkerWindow | null = null;
+	try {
+		// `mainVisible` is the cheapest level that gives us a
+		// registered global shortcut. Upstream registers via
+		// globalShortcut.register early in main-process startup
+		// (build-reference index.js:499416), but we still want
+		// the main window mapped so the popup-construction path
+		// has something to anchor to.
+		const { inspector } = await app.waitForReady('mainVisible');
+		const qe = new QuickEntry(inspector);
+		await qe.installInterceptor();
+
+		// Capture pre-focus active window id for the diagnostic
+		// record. On a healthy Niri session this is the Claude
+		// main window (we just `mainVisible`-readied it). If
+		// null, `niri msg` is unavailable or there is no focused
+		// window — neither blocks the test, just less useful
+		// diagnostics.
+		diag.activeWidBeforeFocus = await getFocusedWindowId();
+
+		// Marker title is unique-per-test to avoid colliding with
+		// any leftover foot from a previous run (foot exits its
+		// `sleep 600` after 10min so leaks are bounded, but a
+		// re-run inside that window would otherwise match the
+		// stale window).
+		const markerTitle =
+			`claude-test-s14-marker-${testInfo.testId}-${Date.now()}`;
+		diag.markerTitle = markerTitle;
+
+		try {
+			marker = await spawnMarkerWindow(markerTitle);
+		} catch (err) {
+			// Most likely cause: foot not on PATH. The primitive
+			// throws `FootUnavailable` with the install hint. Skip
+			// rather than fail — this is an environment gap.
+			const msg = err instanceof Error ? err.message : String(err);
+			diag.focusError = `spawnMarkerWindow: ${msg}`;
+			await attachDiag();
+			testInfo.skip(
+				true,
+				'foot not installed; required for the focus-shift target. ' +
+					`Underlying: ${msg}`,
+			);
+			return;
+		}
+
+		// `focusOtherWindow` queries `niri msg --json windows`
+		// once and throws if there are zero matches; only the
+		// post-focus focused-window verification has its own
+		// retry. So we need a brief readiness poll for the
+		// marker window to actually appear in the niri window
+		// list before we attempt the focus shift — and the focus
+		// shift itself must eventually succeed within the budget.
+		//
+		// We capture the LAST error (rather than rethrowing on
+		// the first) so the diagnostic carries the real cause if
+		// every attempt fails. NiriIpcUnavailable / FootUnavailable
+		// are sticky — they won't change between retries — so we
+		// short-circuit out on the first occurrence and skip.
+		let focusOk = false;
+		let lastFocusErr: unknown = null;
+		let earlySkipReason: string | null = null;
+		const focusBudgetMs = 5_000;
+		const focusStart = Date.now();
+		while (Date.now() - focusStart < focusBudgetMs) {
+			try {
+				await focusOtherWindow(markerTitle);
+				focusOk = true;
+				break;
+			} catch (err) {
+				lastFocusErr = err;
+				if (err instanceof NiriIpcUnavailable) {
+					earlySkipReason =
+						'NiriIpcUnavailable on a row that was ' +
+						'supposed to be Niri-gated. Check NIRI_SOCKET / ' +
+						'`niri msg` availability.';
+					break;
+				}
+				if (err instanceof FootUnavailable) {
+					earlySkipReason =
+						'foot not installed; required for the ' +
+						'focus-shift step. ' +
+						(err instanceof Error ? err.message : String(err));
+					break;
+				}
+				// "no window matches" (marker not yet listed by
+				// niri) or "focus-window action did not stick" —
+				// both can resolve on retry. Brief pause then loop.
+				await sleep(100);
+			}
+		}
+
+		if (earlySkipReason) {
+			diag.focusError =
+				lastFocusErr instanceof Error
+					? lastFocusErr.message
+					: String(lastFocusErr);
+			await attachDiag();
+			testInfo.skip(true, earlySkipReason);
+			return;
+		}
+
+		if (!focusOk) {
+			const msg =
+				lastFocusErr instanceof Error
+					? lastFocusErr.message
+					: String(lastFocusErr);
+			diag.focusError = msg;
+			diag.launcherLogTail = await readLauncherLog();
+			await attachDiag();
+			throw new Error(
+				`focusOtherWindow failed within ${focusBudgetMs}ms: ${msg}`,
+			);
+		}
+
+		// At this point focus is on the marker foot. Capture the
+		// post-focus focused-window id — should equal the
+		// marker's id, not Claude's. (We don't have a clean way
+		// to fetch the marker's id independently here without
+		// re-running `niri msg`; the value-vs-pre comparison in
+		// the diagnostic is sufficient evidence of the shift.)
+		diag.activeWidAfterFocus = await getFocusedWindowId();
+
+		// Now press the global shortcut. The whole point of S14:
+		// even though the marker foot holds focus (and Claude
+		// does not), the portal-routed BindShortcuts grab should
+		// fire the popup. Currently known-failing per case-doc
+		// S14 (`Failed to call BindShortcuts (error code 5)`).
+		try {
+			await qe.openAndWaitReady();
+		} catch (err) {
+			diag.openError = err instanceof Error ? err.message : String(err);
+			diag.popupState = await qe.getPopupState();
+			diag.launcherLogTail = await readLauncherLog();
+			await attachDiag();
+			throw err;
+		}
+
+		const popupState = await qe.getPopupState();
+		diag.popupState = popupState;
+		diag.launcherLogTail = await readLauncherLog();
+		await attachDiag();
+
+		// Single critical assertion: popup exists AND is visible
+		// after the shortcut press from non-Claude focus. A null
+		// state means the BrowserWindow was never constructed —
+		// the portal grab didn't fire. visible === false means
+		// it constructed but show() was suppressed (the upstream
+		// lHn() short-circuit, or a regression in the visibility
+		// flow). Either is a fail for S14's contract.
+		expect(
+			popupState && popupState.visible,
+			'Quick Entry popup is visible after shortcut press from ' +
+				'non-Claude focus (Niri Wayland path)',
+		).toBe(true);
+	} finally {
+		// Marker foot cleanup is idempotent. Always run before
+		// app.close() so the kill happens even if the spec
+		// throws between the two.
+		if (marker) {
+			await marker.kill().catch(() => {
+				// best-effort — process may already be dead
+			});
+		}
+		await app.close();
+	}
+});
--- a/tools/test-harness/src/runners/S15_appimage_extract_works.spec.ts
+++ b/tools/test-harness/src/runners/S15_appimage_extract_works.spec.ts
@@ -0,0 +1,367 @@
+import { test, expect } from '@playwright/test';
+import { spawn } from 'node:child_process';
+import { existsSync, statSync } from 'node:fs';
+import { mkdtemp, open, readdir, rm } from 'node:fs/promises';
+import { tmpdir } from 'node:os';
+import { join } from 'node:path';
+
+// S15 — AppImage `--appimage-extract` fallback works as documented.
+//
+// Per docs/testing/cases/distribution.md S15: on FUSE-less hosts the
+// AppImage runtime ships an extract fallback. Running the AppImage
+// with `--appimage-extract` should drop a `squashfs-root/` into the
+// CWD with a working `AppRun` inside, runnable without FUSE. The
+// case-doc anchors point at scripts/packaging/appimage.sh:282/:312
+// (built with stock `appimagetool`, which always supports
+// `--appimage-extract`) and the AppRun script at
+// scripts/packaging/appimage.sh:70-118; CI exercises the same path
+// (tests/test-artifact-appimage.sh:36-44).
+//
+// Assertion shape:
+//   1. Locate an AppImage. Skip cleanly if not running from one.
+//   2. mkdtemp a work dir, spawn `<AppImage> --appimage-extract` with
+//      that dir as CWD. Assert exit 0.
+//   3. Assert `squashfs-root/AppRun` exists.
+//   4. Spawn `squashfs-root/AppRun --version` with a 5s timeout. The
+//      case-doc accepts "exit 0 or doesn't immediately fail"; we fail
+//      on a FUSE/dlopen error in stderr, and otherwise pass on exit 0
+//      or on a run still alive at the timeout that has already
+//      printed a version string.
+//   5. rm the extracted tree in `finally`.
+//
+// AppImage detection mirrors S01's inline probe (probe
+// CLAUDE_DESKTOP_LAUNCHER, fall back to <repo>/test-build/*.AppImage,
+// accept a `.AppImage` suffix as-is, otherwise verify ELF magic +
+// AppImage type marker). Inline rather than extracted to a shared
+// lib: only two callers today, and the canary-style runners
+// benefit from being decoupled from moving helper surfaces.
+
+interface AppImageProbeResult {
+	path: string | null;
+	reason: string;
+}
+
+// AppImages are ELF executables with a filesystem image appended.
+// The ELF header carries magic bytes at offset 8: `AI\x02` for type 2
+// (the format our build emits) or `AI\x01` for type 1.
+async function probeAppImagePath(): Promise<AppImageProbeResult> {
+	const explicit = process.env.CLAUDE_DESKTOP_LAUNCHER;
+	const candidates: string[] = [];
+	if (explicit) candidates.push(explicit);
+
+	const projectRoot = '/home/aaddrick/source/claude-desktop-debian';
+	const testBuildDir = `${projectRoot}/test-build`;
+	if (existsSync(testBuildDir)) {
+		try {
+			const entries = await readdir(testBuildDir);
+			for (const entry of entries) {
+				if (entry.endsWith('.AppImage')) {
+					candidates.push(`${testBuildDir}/${entry}`);
+				}
+			}
+		} catch {
+			// best-effort
+		}
+	}
+
+	for (const candidate of candidates) {
+		if (!existsSync(candidate)) continue;
+		try {
+			const st = statSync(candidate);
+			if (!st.isFile()) continue;
+			if (candidate.endsWith('.AppImage')) {
+				return { path: candidate, reason: 'matched .AppImage suffix' };
+			}
+			const fh = await open(candidate, 'r');
+			try {
+				const buf = Buffer.alloc(12);
+				await fh.read(buf, 0, 12, 0);
+				const elf = buf.subarray(0, 4).toString('hex') === '7f454c46';
+				const aiMagic = buf.subarray(8, 11);
+				const isAppImage =
+					elf &&
+					aiMagic[0] === 0x41 &&
+					aiMagic[1] === 0x49 &&
+					(aiMagic[2] === 0x01 || aiMagic[2] === 0x02);
+				if (isAppImage) {
+					return {
+						path: candidate,
+						reason: 'matched AppImage magic bytes',
+					};
+				}
+			} finally {
+				await fh.close();
+			}
+		} catch {
+			// fall through to next candidate
+		}
+	}
+
+	return {
+		path: null,
+		reason:
+			'no AppImage found via CLAUDE_DESKTOP_LAUNCHER or ' +
+			`${testBuildDir}/*.AppImage`,
+	};
+}
+
+interface SpawnResult {
+	exitCode: number | null;
+	signalCode: NodeJS.Signals | null;
+	stdout: string;
+	stderr: string;
+	timedOut: boolean;
+	elapsedMs: number;
+}
+
+async function runWithTimeout(
+	cmd: string,
+	args: string[],
+	cwd: string,
+	timeoutMs: number,
+): Promise<SpawnResult> {
+	const start = Date.now();
+	const proc = spawn(cmd, args, {
+		cwd,
+		env: process.env,
+		stdio: ['ignore', 'pipe', 'pipe'],
+		detached: false,
+	});
+
+	const stdoutChunks: Buffer[] = [];
+	const stderrChunks: Buffer[] = [];
+	proc.stdout?.on('data', (c: Buffer) => stdoutChunks.push(c));
+	proc.stderr?.on('data', (c: Buffer) => stderrChunks.push(c));
+
+	let exitCode: number | null = null;
+	let signalCode: NodeJS.Signals | null = null;
+	let timedOut = false;
+
+	await Promise.race([
+		new Promise<void>((resolve) => {
+			proc.once('exit', (code, signal) => {
+				exitCode = code;
+				signalCode = signal;
+				resolve();
+			});
+		}),
+		new Promise<void>((resolve) => {
+			setTimeout(() => {
+				timedOut = true;
+				resolve();
+			}, timeoutMs);
+		}),
+	]);
+
+	if (proc.exitCode === null && proc.signalCode === null) {
+		proc.kill('SIGTERM');
+		await Promise.race([
+			new Promise<void>((resolve) =>
+				proc.once('exit', (code, signal) => {
+					exitCode = code;
+					signalCode = signal;
+					resolve();
+				}),
+			),
+			new Promise<void>((resolve) => setTimeout(resolve, 2_000)),
+		]);
+		if (proc.exitCode === null && proc.signalCode === null) {
+			proc.kill('SIGKILL');
+			await new Promise<void>((resolve) => {
+				proc.once('exit', (code, signal) => {
+					exitCode = code;
+					signalCode = signal;
+					resolve();
+				});
+				setTimeout(() => resolve(), 1_000);
+			});
+		}
+	}
+
+	return {
+		exitCode,
+		signalCode,
+		stdout: Buffer.concat(stdoutChunks).toString('utf8'),
+		stderr: Buffer.concat(stderrChunks).toString('utf8'),
+		timedOut,
+		elapsedMs: Date.now() - start,
+	};
+}
+
+function tail(s: string, n: number): string {
+	if (s.length <= n) return s;
+	return s.slice(-n);
+}
+
+test.setTimeout(60_000);
+
+test('S15 — AppImage --appimage-extract fallback works', async ({}, testInfo) => {
+	// Case-doc S15 lists Severity: Could. Surface label is the harness
+	// taxonomy ("Distribution / AppImage extract") rather than the
+	// case-doc's free-text "AppImage runtime / FUSE-less fallback".
+	testInfo.annotations.push({ type: 'severity', description: 'Could' });
+	testInfo.annotations.push({
+		type: 'surface',
+		description: 'Distribution / AppImage extract',
+	});
+
+	const probe = await probeAppImagePath();
+	await testInfo.attach('appimage-probe', {
+		body: JSON.stringify(probe, null, 2),
+		contentType: 'application/json',
+	});
+
+	if (!probe.path) {
+		test.skip(true, `S15 only applies to AppImage installs: ${probe.reason}`);
+		return;
+	}
+
+	const appImagePath = probe.path;
+	await testInfo.attach('appimage-path', {
+		body: appImagePath,
+		contentType: 'text/plain',
+	});
+
+	// mkdtemp so the extract tree lands in $TMPDIR, not the harness
+	// CWD. `--appimage-extract` writes `squashfs-root/` relative to
+	// CWD, so we just spawn with cwd = the temp dir.
+	const extractDir = await mkdtemp(join(tmpdir(), 'claude-s15-'));
+	const squashRoot = join(extractDir, 'squashfs-root');
+	const appRun = join(squashRoot, 'AppRun');
+
+	await testInfo.attach('extract-dir', {
+		body: extractDir,
+		contentType: 'text/plain',
+	});
+
+	try {
+		// Step 1: extraction. 30s budget — extracting ~200MB of
+		// squashfs to disk is well under that on any modern host.
+		const extract = await runWithTimeout(
+			appImagePath,
+			['--appimage-extract'],
+			extractDir,
+			30_000,
+		);
+
+		await testInfo.attach('extract-exit', {
+			body: JSON.stringify(
+				{
+					exitCode: extract.exitCode,
+					signalCode: extract.signalCode,
+					timedOut: extract.timedOut,
+					elapsedMs: extract.elapsedMs,
+				},
+				null,
+				2,
+			),
+			contentType: 'application/json',
+		});
+		await testInfo.attach('extract-stderr-tail-4k', {
+			body: tail(extract.stderr, 4096) || '(empty)',
+			contentType: 'text/plain',
+		});
+		await testInfo.attach('extract-stdout-tail-4k', {
+			body: tail(extract.stdout, 4096) || '(empty)',
+			contentType: 'text/plain',
+		});
+
+		expect(
+			extract.exitCode,
+			`AppImage --appimage-extract should exit 0 ` +
+				`(stderr tail: ${tail(extract.stderr, 256)})`,
+		).toBe(0);
+		expect(
+			extract.signalCode,
+			'extraction process should not be killed by signal',
+		).toBe(null);
+
+		// Step 2: assert squashfs-root/AppRun exists.
+		const appRunExists = existsSync(appRun);
+		await testInfo.attach('apprun-exists', {
+			body: JSON.stringify(
+				{
+					path: appRun,
+					exists: appRunExists,
+					squashfsRootExists: existsSync(squashRoot),
+				},
+				null,
+				2,
+			),
+			contentType: 'application/json',
+		});
+		expect(
+			appRunExists,
+			`squashfs-root/AppRun should exist after extract at ${appRun}`,
+		).toBe(true);
+
+		// Step 3: spawn `AppRun --version` with a 5s timeout. AppRun
+		// is a wrapper script (scripts/packaging/appimage.sh:70-118)
+		// that hands off to the real Electron entry — `--version`
+		// is the cheapest probe that exercises the full launch path
+		// without bringing up a window. The case-doc accepts "exit 0
+		// or doesn't immediately fail"; a clean exit 0 is best, but
+		// we also flag obvious FUSE / dlopen errors as failures.
+		const apprun = await runWithTimeout(
+			appRun,
+			['--version'],
+			squashRoot,
+			5_000,
+		);
+
+		await testInfo.attach('apprun-exit', {
+			body: JSON.stringify(
+				{
+					exitCode: apprun.exitCode,
+					signalCode: apprun.signalCode,
+					timedOut: apprun.timedOut,
+					elapsedMs: apprun.elapsedMs,
+				},
+				null,
+				2,
+			),
+			contentType: 'application/json',
+		});
+		await testInfo.attach('apprun-stderr-tail-4k', {
+			body: tail(apprun.stderr, 4096) || '(empty)',
+			contentType: 'text/plain',
+		});
+		await testInfo.attach('apprun-stdout-tail-4k', {
+			body: tail(apprun.stdout, 4096) || '(empty)',
+			contentType: 'text/plain',
+		});
+
+		// Hard fail on the cardinal "didn't run at all" patterns: a
+		// FUSE / dlopen complaint here would mean the extract path
+		// ALSO depends on FUSE (which would defeat its purpose).
+		const stderrLower = apprun.stderr.toLowerCase();
+		const fuseFailure =
+			stderrLower.includes('libfuse.so.2') ||
+			(stderrLower.includes('dlopen') && stderrLower.includes('fuse'));
+		expect(
+			fuseFailure,
+			`AppRun --version stderr should not show a FUSE/dlopen ` +
+				`failure (the extract fallback exists precisely to avoid ` +
+				`FUSE). stderr tail: ${tail(apprun.stderr, 256)}`,
+		).toBe(false);
+
+		// Soft acceptance: exit 0 is canonical, but Electron's
+		// `--version` printer can misbehave on Linux when accessory
+		// subsystems (sandbox, dbus) are missing while still printing
+		// the version. Accept exit 0 OR (timed out while still alive
+		// AND stdout or stderr shows a version string).
+		const versionLooksOk =
+			/\d+\.\d+\.\d+/.test(apprun.stdout) ||
+			/\d+\.\d+\.\d+/.test(apprun.stderr);
+		const acceptableTimeout = apprun.timedOut && versionLooksOk;
+		expect(
+			apprun.exitCode === 0 || acceptableTimeout,
+			`AppRun --version should exit 0 or print a version before ` +
+				`timeout. exit=${apprun.exitCode} signal=${apprun.signalCode} ` +
+				`timedOut=${apprun.timedOut} ` +
+				`outputHasVersion=${versionLooksOk}`,
+		).toBe(true);
+	} finally {
+		await rm(extractDir, { recursive: true, force: true }).catch(() => {});
+	}
+});
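Review note on `probeAppImagePath` in S15: the byte check (ELF magic at offset 0, `AI` plus a type byte at offset 8) is inlined next to the file I/O, and it is skipped whenever the candidate already ends in `.AppImage`. A small pure function makes the same header check testable without touching disk. `detectAppImageType` is a hypothetical name used for illustration; the byte values are the ones the spec already compares against.

```typescript
// Hypothetical helper, not part of the harness.
type AppImageType = 1 | 2;

// Returns the AppImage type when `header` (the first bytes of a file)
// starts with the ELF magic and carries the `AI` marker at offset 8;
// returns null for anything else, including headers that are too short.
function detectAppImageType(header: Uint8Array): AppImageType | null {
	if (header.length < 11) return null;
	const isElf =
		header[0] === 0x7f && // DEL
		header[1] === 0x45 && // 'E'
		header[2] === 0x4c && // 'L'
		header[3] === 0x46; // 'F'
	if (!isElf) return null;
	// 'A' 'I' at offsets 8 and 9, followed by the type byte at offset 10.
	if (header[8] !== 0x41 || header[9] !== 0x49) return null;
	if (header[10] === 0x01) return 1;
	if (header[10] === 0x02) return 2;
	return null;
}
```

A caller would read the first 12 bytes of the candidate, as the spec does with `fh.read(buf, 0, 12, 0)`, and pass the buffer in; running the check for `.AppImage`-suffixed candidates too would also catch a misnamed file.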