docs(testing): session 3 plan/inventory + rotate session 4 prompt

Updates the post-execution status section with session 3's seven
shipped specs, the eipc-registry finding (corrects session 2's T38
assumption), and the four reclassifications (T22/T31/T33/T38 from
Tier 2 IPC probes to Tier 1 fingerprints). Captures the
authentication-state lesson too — launches that depend on
authenticated renderer state need createIsolation({ seedFromHost:
true }), even if the case-doc-shaped Tier 2 form looks hermetic on
paper.

README inventory grows from 50 to 57 specs and adds a note that
LocalSessions_$_* / CustomPlugins_$_* channels use a custom eipc
protocol, not Electron's standard ipcMain.handle() — so future
runners should anchor on channel-name strings (Tier 1) rather than
introspect _invokeHandlers (broken).

Followup prompt rewritten for session 4: focus-shifter primitive +
S11/S14, T35 MCP separation fingerprints (Phase 1) and optional
fixture-readback (Phase 2, may abort), and the eipc-registry
exposer as a flagged primitive gap.

Co-Authored-By: Claude <claude@anthropic.com>
This commit is contained in:
aaddrick
2026-05-03 17:40:33 -04:00
parent 549bf4281a
commit ee7b35ff86
3 changed files with 410 additions and 285 deletions

View File

@@ -1,18 +1,22 @@
# test-harness runner implementation — session 3 prompt
# test-harness runner implementation — session 4 prompt
This file is meant to be **copied verbatim into a fresh Claude Code
session** as the initial user message. Don't paraphrase it; the
orchestration depends on the exact directives below.
You're picking up after a runner-implementation session that landed 10
new specs (5 Tier 2 + 4 Tier 2-reframes + 1 Tier 1 reclass), lifting
harness coverage from 40/76 (53%) to 50/76 (66%). Three commits on
`docs/compat-matrix`:
You're picking up after a runner-implementation session that landed 7
new specs (T22, T24, T30, T31, T32, T33, T37) and reclassified one
session-2 carryover (T38), lifting harness coverage from 50/76 (66%)
to 57/76 (75%). Three commits on `docs/compat-matrix`:
- `XXX``test(harness): session 2 runners + lib/claudeai mock helper`
(10 new spec files; `installShowItemInFolderMock` added to
`lib/claudeai.ts` mirroring the `installOpenDialogMock` pattern;
README inventory + plan-doc status section updated).
- `XXX``test(harness): session 3 runners + eipc-registry finding`
(7 new spec files; `lib/electron-mocks.ts` extracted from
`lib/claudeai.ts` once T24 brought the third mock-then-call helper
online — `installOpenDialogMock` / `installShowItemInFolderMock` /
`installOpenExternalMock` plus their `getCalls` readers; T17, T24,
T25 imports updated; T22/T31/T33/T38 reclassified to Tier 1
fingerprints after the eipc-registry finding; README inventory +
plan-doc status updated).
(Substitute the actual SHA after committing — the user reviews and
commits at the end of every session.)
@@ -21,250 +25,225 @@ The plan doc at
[`docs/testing/runner-implementation-plan.md`](runner-implementation-plan.md)
captures the tier classification and execution-time reclassifications.
Its "Status (post-execution)" section is the source of truth for what's
done and what's deferred — read **session 2** then **session 1**
sub-sections.
done and what's deferred — read **session 3** first, then **session 2**,
then **session 1** sub-sections.
This session is a continuation, not a restart. Start by reading the
plan doc's status sections.
### Big new findings from session 2
### Big new findings from session 3
1. **Mock-then-call beats invoke-then-cleanup for Tier 2 reframes of
side-effecting Electron APIs.** T17's existing
`installOpenDialogMock` pattern was extended to T25 via a new
`installShowItemInFolderMock` in `lib/claudeai.ts`. Net: no host
file-manager pop-up during the run, AND the assertion strengthens
from "didn't throw" to "the egress was reached + the path arg
flowed through verbatim". Apply this pattern to any future
`shell.*` / `dialog.*` Tier 2 reframes (T24 `shell.openExternal`
would mock cleanly the same way).
1. **`ipcMain._invokeHandlers` does NOT see `claude.web` eipc
channels.** This is load-bearing — corrects a session-2 assumption.
The `LocalSessions_$_*` / `CustomPlugins_$_*` channels named in
case-doc Code anchors use a **custom message-port protocol**
(`$eipc_message$_<UUID>_$_claude.web_$_<name>` framing at
`index.js:68816`) that's distinct from Electron stdlib IPC. KDE-W
run revealed the standard registry holds only three chat-tab
MCP-bridge handlers regardless of ready level
(`mainVisible`/`claudeAi`/`userLoaded`) and regardless of whether
the launch is hermetic or `seedFromHost: true` authenticated. The
eipc registry itself is a closure-local — same gotcha as session
2's `Sbn()` (S28) and `cE()`/`Tce()` (S19). Reverse-engineering the
eipc bootstrap to expose the registry is a primitive gap that
would unblock proper Tier 2 runtime probes for **T22, T31, T33,
T38** and any future LocalSessions_/CustomPlugins_ tests.
Reference: T22's leading comment, plan-doc session 3 status.
2. **`gdbus monitor --dest <name>` only sees signals OWNED BY that
destination, not method calls TO it.** T23 had to switch from the
plan's gdbus suggestion to `dbus-monitor` (eavesdrop match rule)
to observe `org.freedesktop.Notifications.Notify` calls from
Electron. If T27 / T22 / S24 ever ship Tier 2 reframes that need
to observe method calls on a service, use `dbus-monitor`.
2. **For tests that depend on authenticated renderer state, ALWAYS
use `createIsolation({ seedFromHost: true })`.** Session 3's first
four launch-based specs (T22/T24/T31/T33 — all originally drafted
as IPC handler probes) defaulted to hermetic isolation,
i.e. unauthenticated. Even if the eipc registry HAD been at
`ipcMain._invokeHandlers`, the LocalSessions/CustomPlugins
handlers register only after the renderer's authenticated init
path runs — default isolation never gets past `/login`. T16 and
T26 are the canonical seedFromHost templates; copy that shape any
time the assertion depends on claude.ai's renderer modules being
loaded.
3. **`ipcMain._invokeHandlers` channel naming carries a build-stable
UUID prefix:** `$eipc_message$_<UUID>_$_claude.web_$_<name>`. T38
anchors on the `_$_<name>` suffix to survive UUID rotation; the
prefix is captured as diagnostic. Useful precedent for any future
IPC-introspection probes.
3. **`shell.openExternal` mock-then-call works identically to
`shell.showItemInFolder` mock — but the mock returns
`Promise<boolean>` not void.** T24 ships the mock pattern in
`lib/electron-mocks.ts`. If a future spec needs to mock another
`shell.*` method, mirror this shape: idempotency flag on
`globalThis`, recorder pushes to a `__claudeAi*Calls` array, mock
matches the documented return type. The mocks live in
`lib/electron-mocks.ts` (extracted in session 3 — was in
`lib/claudeai.ts` until the third helper landed).
4. **Closure-local minified helpers are NOT reachable from
globalThis.** S28's plan called for inspector-eval against
`Sbn()`, but `Sbn` is a closure-local — couldn't be invoked. S28
reclassified to Tier 1 (asar fingerprint of the classifier
expression). For any future "Tier 2 reframe via inspector-eval
against minified helper X" entry: confirm reachability before
classifying.
4. **Asar fingerprint regex with multi-string proximity gates works
well for cadence-style code.** T30 anchors three strings
(`300*1e3`, `3600*1e3`, `AutoArchiveEngine`) in colocation with
tuned distance windows (≤200 chars, ≤3000 chars), then runs an
`.includes()` for a fourth string (`ccAutoArchiveOnPrClose`)
inside the captured window. Single match globally. Pattern for
any future "these constants are colocated with this class" test.
5. **`safeStorage` on Linux uses random IVs.** Only decrypted
plaintext is comparable across encrypt calls; ciphertext bytes
are not deterministic. S25 compares plaintexts.
6. **`extraEnv` precedence.** `lib/electron.ts:317-323` spreads in
order: `process.env`, `LAUNCHER_INJECTED_ENV`, `isolation?.env`,
`waylandEnv`, then `opts.extraEnv`, then `CI: '1'`. Override
wins. S19 leans on this; load-bearing for future tests that
need to override isolation defaults.
5. **Build-reference is in BEAUTIFIED form; installed asar is
MINIFIED. Numeric literals differ.** T30 case-doc named the
constants as `300_000` / `3_600_000` (with underscores —
beautified preserves them). The actual installed asar has
`300*1e3` / `3600*1e3`. Always grep the installed asar before
settling on a fingerprint string. The
`/usr/lib/claude-desktop/node_modules/electron/dist/resources/app.asar`
path on KDE-W is the source of truth for fingerprints.
### Authoritative reference
Read these in order before fanning out:
- [`docs/testing/runner-implementation-plan.md`](runner-implementation-plan.md)
— tier classification + status section. Read both **session 2**
and **session 1** "Status (post-execution)" sub-sections. The
Tier-3 list (line ~342) is the candidate pool for further
reframes.
— tier classification + status section. Read **session 3**,
**session 2**, then **session 1** "Status (post-execution)" sub-
sections. The Tier-3 list (line ~342) is the candidate pool for
further reframes.
- [`tools/test-harness/README.md`](../../tools/test-harness/README.md)
— runner conventions, the now-50-spec inventory, primitives in
`lib/`, isolation defaults, the CDP-gate workaround, the
`seedFromHost` reference.
— runner conventions, the now-57-spec inventory, primitives in
`lib/`, isolation defaults, the CDP-gate workaround, the eipc note.
- [`docs/testing/cases/README.md`](cases/README.md) — case-doc
structure and the four anchor scopes.
- [`tools/test-harness/src/lib/`](../../tools/test-harness/src/lib/)
— the existing primitives. Notable additions since session 2:
- `claudeai.ts``installShowItemInFolderMock` /
`getShowItemInFolderCalls` (mirrors `installOpenDialogMock`).
If 3+ tests start using mock-then-call, consider extracting to
`lib/electron-mocks.ts` — but don't pre-extract.
— the existing primitives. Notable additions since session 3:
- `electron-mocks.ts` — extracted from `claudeai.ts` once T24
brought the third mock-then-call helper online. Three pairs
today (`installOpenDialogMock`, `installShowItemInFolderMock`,
`installOpenExternalMock` + their readers). If a future spec
needs another `shell.*` / `dialog.*` / similar mock, add it
here as a fourth sibling.
- [`tools/test-harness/src/runners/`](../../tools/test-harness/src/runners/)
— every existing spec is a template. Notable session 2 templates:
- `T16_code_tab_loads.spec.ts` / `T26_routines_page_renders.spec.ts`
— seedFromHost + post-login renderer-side AX nav. T16 uses the
existing `CodeTab.activate()`; T26 inlines a similar AX walker
for the sidebar. Pattern for any further "click an AX-tree
button after login" test.
- `T25_show_item_in_folder_no_throw.spec.ts` — mock-then-call
pattern (mirrors T17). Use this shape for any future Tier 2
reframe of a side-effecting Electron API.
- `T38_open_in_editor_handler_registered.spec.ts` — IPC handler
registry introspection via `ipcMain._invokeHandlers`. Pattern
for any "is this handler wired up" check.
- `T23_notification_reaches_dbus.spec.ts` — dbus-monitor
subprocess + inspector-fired notification + buffer scan.
- `T10_cowork_daemon_respawn.spec.ts` — H04 extension: spawn,
SIGKILL, poll for new pid. Pattern for any "service auto-respawn
contract" test.
- `S25_safestorage_token_persists.spec.ts` — two-launch with
shared isolation handle + safeStorage round-trip via tmpfile.
- `S28_worktree_permission_classifier.spec.ts` — single-regex
asar fingerprint for a multi-string-OR classifier expression.
— every existing spec is a template. Notable session 3 templates:
- `T22_pr_monitoring_handler.spec.ts` — multi-fingerprint Tier 1
(eipc channel-name string + Linux-fallthrough throw site).
Pattern for any "the IPC channel name is in the bundle" probe
when the registry isn't introspectable.
- `T24_open_in_editor_no_throw.spec.ts` — mock-then-call with a
`Promise<boolean>` egress. Pattern for any future `shell.*`
egress that returns a Promise (not void).
- `T30_auto_archive_cadence_constants.spec.ts` — single-regex
multi-string-proximity asar fingerprint with a tuned distance
window. Pattern for any "these constants are colocated with
this class" test.
- `T31_side_chat_handlers_registered.spec.ts`
`T33_plugin_browser_handler_registered.spec.ts` — eipc channel-
name fingerprints. Pattern for any "is this IPC channel name
in the bundle" probe.
- `T37_claude_md_memory_fingerprint.spec.ts` — multi-anchor Tier
1 with a single-occurrence high-signal log line as the primary
anchor + broader namespace tokens for context.
- [`docs/testing/cases/*.md`](cases/) — the spec each runner
asserts. The **Code anchors:** field tells you exactly where
upstream implements the feature.
### Tests in scope this session
Five categories, in priority order:
**Realistic ceiling: ~5 new specs this session.** Session 3 hit ~7
because Tier 1 fingerprints are cheap. Session 4's candidates are
heavier — most need either a new primitive (focus-shifter, eipc-
registry exposer) or fixture-then-readback against state that may
not be reachable.
Three categories:
| # | Tests | Source files | Notes |
|---|---|---|---|
| **A** Deferred from session 2 | T31, T32, S06, S11, S14 | `code-tab-workflow.md` (T31, T32), `shortcuts-and-input.md` (S06, S11, S14) | T31/T32 need a Code-tab session OPEN; S06 needs Wayland row; S11/S14 need a new focus-shifter primitive |
| **B** Tier 3 → Tier 2 reframes (read-only) | T22, T35, T37 | `code-tab-workflow.md` (T22), `extensibility.md` (T35, T37) | Each can ship as a *reads-from-disk-or-IPC-registry* probe without writing to the user's account |
| **C** Asar fingerprint cleanups | T24, T30, T33 | `code-tab-handoff.md` (T24), `code-tab-workflow.md` (T30), `extensibility.md` (T33) | Each has a load-bearing string set in `index.js` that pins the wiring without needing a launch |
| **D** New primitive — focus-shifter | (unblocks A's S11/S14) | `lib/input.ts` | xdotool / ydotool focus-stealing helper. Build only if S11/S14 worth shipping this session |
| **E** Mock-then-call extension | T24 (mock form) | `code-tab-handoff.md` (T24) | Mirror of T25's pattern but for `shell.openExternal` — handler reaches the egress with the right URL |
| **A** Focus-shifter primitive + S11/S14 | (lib/input.ts) + S11, S14 | `shortcuts-and-input.md` (S11, S14) | One PR builds the primitive (`focusOtherWindow()`), a second PR ships both runners |
| **B** T35 — MCP server config picked up | T35 | `extensibility.md` (T35) | Place fixture `~/.claude.json` + `<project>/.mcp.json` under isolation; assert on parsed-state readback. Risky — the parsed-state target may be a closure-local (same blocker as T37b/S19/S28) |
| **C** Deferred items audit | various | — | Re-walk session 1/2/3 deferrals; pick anything that's now tractable given the eipc finding + electron-mocks split |
Realistic ceiling: **~6-8 new specs** this session. Don't try all
13 — Categories A and B are heavier than session 2's mix because
they need either a Code-tab session opened OR a new primitive built
first.
#### Category A — focus-shifter primitive (3 specs)
### Detailed scope per category
- **`lib/input.ts:focusOtherWindow()`.** Build the primitive first.
- **X11 path:** `xdotool search --name '<test-marker>' windowfocus`
or similar. xdotool is available on most rows.
- **Wayland path:** No portable focus injection. Skip cleanly per
row gate. KDE-W might allow `kwin_x11`-class hacks but those are
not portable.
- Verify by spawning a marker window (e.g. a `xterm -title
'<marker>'` background process), focusing it, then asserting
`xprop -root _NET_ACTIVE_WINDOW` returns its WID.
- **S11** — Quick Entry shortcut fires from any focus.
Launch app → focus marker window → fire `Ctrl+Alt+Space` via
ydotool → assert popup appears (existing primitives).
Row gate: GNOME-W, Ubu-W (mutter XWayland key-grab story is the
load-bearing context). Currently broken on GNOME-W per #404; this
runner is a regression detector.
- **S14** — Global shortcuts via XDG portal work on Niri.
Same shape as S11. Row gate: Niri. Currently fails per case-doc.
Reframe possible: assert `--enable-features=GlobalShortcutsPortal`
is in argv (this is what S12 already does). The DELIVERY-side
test needs the focus-shifter primitive.
#### Category Adeferred items (5)
#### Category BT35 MCP server config (1 spec)
- **T31 — Side chat opens.** Needs: `seedFromHost` + Code-tab
session OPEN (env pill → Local → choose folder → wait for session
load). After session loads, send `Ctrl+;` via ydotool OR find the
IPC handler `startSideChat` in `ipcMain._invokeHandlers` (T38
pattern) and assert it's registered + invokable. The lighter form
is the IPC-registry probe; the heavier form is full
click-chain-into-side-chat.
- **T32 — Slash command menu.** Needs: same Code-tab session OPEN
preamble as T31. Then trigger `/` in the prompt textarea and
assert the slash menu renders (AX-tree query for menuitem* nodes
in the prompt area). Heavier than T31 because the slash menu is
rendered server-side by claude.ai's bundle.
- **S06 — URL handler segfault on native Wayland.** Needs:
`CLAUDE_HARNESS_USE_WAYLAND=1` row + `coredumpctl info
claude-desktop` observation after firing
`xdg-open 'claude://chat/new'`. Skip cleanly if not on a
Wayland row.
- **S11 / S14 — focus-shifter delivery.** Needs: `lib/input.ts`
with `focusOtherWindow()` (xdotool on X11; skip on Wayland or
use compositor-specific). Then S11 / S14 launch app, focus
another window, fire shortcut, assert popup appears. Build the
primitive in one PR (Category D), then both runners in a second
PR.
T35 case-doc anchors at `:215418` (Code-tab loads
`<project>/.mcp.json`), `:176766` (`~/.claude.json` reader), `:489098`
(Code-session passes `settingSources: ["user","project","local"]` to
agent SDK), `:130821` (`claude_desktop_config.json` is chat-tab path
constant — separate userData dir per `:130829` `kee()`).
#### Category B — Tier 3 → Tier 2 reframes (3)
**Phase 1 (cheap, ship today): asar separation fingerprint.** Assert:
These each have a slice that doesn't write to the user's real
account:
1. `claude_desktop_config.json` string is in `index.js`. (Chat-tab
MCP path constant — load-bearing for the per-tab separation.)
2. `kee()` resolution path: assert the userData-dir resolver is
present.
3. The strings `~/.claude.json` and `.mcp.json` are in `index.js`.
(Code-tab MCP loaders.)
- **T22 — PR monitoring (read-only half).** The Tier 3 form opens a
PR; the Tier 2 reframe is "after `seedFromHost`, IPC handler
`getPrChecks` is registered + the `gh CLI not found in PATH`
string is in the bundle". The handler-registered probe is the
shippable form; the missing-`gh` warning string is a static
fingerprint. Both ship as one runner.
- **T35 — MCP server config picked up.** Reframe: place a fixture
`claude_desktop_config.json` under the isolation's configDir
(no host config touch needed — fresh isolation), then via
inspector eval, read whatever main-process state holds the
parsed MCP server list. Anchor on a known path under
`${configDir}/Claude/`.
- **T37 — `CLAUDE.md` memory loads.** Reframe: place a fixture
`~/.claude/CLAUDE.md` (or under `CLAUDE_CONFIG_DIR/CLAUDE.md`
with extraEnv override — see S19's pattern), then via inspector
eval read the loaded memory state. Anchor needs to come from
case-doc Code anchors.
This Tier 1 form pins the wiring without needing a launch. It does
NOT verify "the MCP server actually starts when a Code session
opens" — that's the full Tier 3 form, needs login + a Code-tab
session OPEN + an MCP server fixture.
#### Category C — asar fingerprint cleanups (3)
**Phase 2 (risky, do only if Phase 1 lands and budget allows):
fixture-then-readback Tier 2.** Place a fixture
`<isolationDir>/Claude/claude_desktop_config.json` containing a
synthetic `mcpServers` entry. Launch with `seedFromHost: true` (so
the renderer is authenticated) + extraEnv override pointing the
chat-tab loader at the isolationDir. Try inspector-eval to read the
parsed MCP server list. **STOP AND REPORT** if the parsed-state
target is a closure-local (same blocker as T37b/S19/S28). Don't
ship a stub.
Each is Tier 1 / no launch:
#### Category C — deferred items audit
- **T24 — Open in external editor (asar fingerprint).** The full
click-chain T24 is Tier 3. The fingerprint half: assert `Mtt`
registry is in `index.js` with the editor scheme strings
(`vscode://`, `cursor://`, `zed://`, `windsurf://`).
- **T30 — Auto-archive on PR merge (cadence constants).** Static
fingerprint of the sweep cadence — assert `300_000` (5 min)
and `3_600_000` (1 h) appear near the auto-archive code.
- **T33 — Plugin browser (IPC handler registered).** Same shape
as T38 — assert `listMarketplaces` IPC handler is registered.
Walk through session 1/2/3 deferrals and identify any that are now
tractable given session 3's findings. Specifically:
#### Category D — primitive build
- **`lib/input.ts:focusOtherWindow()`.** xdotool on X11
(`xdotool search --name '<test-marker>' windowfocus`); on
Wayland skip cleanly (no portable focus injection). Used by
S11 / S14. Don't build unless those are in scope this session.
#### Category E — mock-then-call extension
- **T24 (mock form, alternative to Category C).** Mock
`shell.openExternal` via a new `installOpenExternalMock` in
`lib/claudeai.ts` (mirror of `installShowItemInFolderMock`),
then `inspector.evalInMain` calls
`shell.openExternal('vscode://file/tmp/test')` and assert
the recorded call list contains the URL. Strictly stronger
than the Category C fingerprint form. **Pick C OR E for T24
— not both.**
### Why this iteration
The harness is at 50/76 coverage and every release tag now
exercises the smoke-set + a chunk of critical surfaces
automatically. Remaining work clusters in three pockets:
- **Code-tab cluster (T15-T39, mostly login-walled).** Session 2
unblocked the *render-only* half via `seedFromHost`; session 3
should push into the *open-a-session* half (T31, T32) and the
read-only-Tier-3-reframes (T22, T35, T37).
- **Wayland-specific tests (S06).** Need a Wayland row + the
harness's `CLAUDE_HARNESS_USE_WAYLAND=1` switch.
- **Focus-shift-dependent tests (S11, S14).** Need
`lib/input.ts:focusOtherWindow()` built first.
After this session, future sessions can focus on the genuinely
heavy Tier 3 work (destructive-write login tests; multi-launch
state) with a clearer cost model.
### Known mechanism-recipe table (session 1 + session 2)
| Pattern | Use when | Worked example |
|---|---|---|
| `createIsolation({ seedFromHost: true })` | spec needs a signed-in renderer; read-only | T07, T16, T26 |
| `isolation: null` + pre-launch `killHostClaude()` | spec needs SingletonLock collision (delivery probes) | T05 |
| Default isolation | most other tests | T01, T03, T04, S29 |
| `isolation: <handle>` (shared across launches) | multi-launch persistent state | S35, S25 |
| `MainWindow.setState('close')` | exercise the wrapper close-interceptor | T08 |
| Mock-then-call for `shell.*` / `dialog.*` | Tier 2 reframe of side-effecting API | T17, T25 |
| `ipcMain._invokeHandlers` registry probe | "is this IPC handler wired up" | T38 |
| `dbus-monitor` subprocess | observe DBus method calls TO a destination | T23 |
| Asar single-regex multi-string-OR fingerprint | classifier-style code that combines several strings | S28 |
- **S20** — `powerSaveBlocker` Inhibit. Issue #569 still open; not
this session.
- **T18** — drag-drop. X11 path is Tier 3 with xdotool drag. Wayland
blocked until libei. Not this session.
- **T34** — OAuth round-trip. Hard to mock; not this session.
- **eipc-registry exposer (primitive gap)**. If you're feeling
ambitious, reverse-engineer the eipc bootstrap and find a way to
expose the channel→handler registry from main. Would unblock
proper Tier 2 runtime probes for T22/T31/T33/T38. **High-risk,
high-reward.** Likely involves walking the bundled `index.js`
near `:68820` (`le(i)` origin validation) and `:68816` (channel
framing) and identifying a stable handle. If the registry is
truly closure-local with no exposed surface, abort and document.
### Constraints to respect (don't violate)
These are unchanged from sessions 1 and 2 and still load-bearing:
These are unchanged from sessions 1/2/3 and still load-bearing:
- **Default isolation** unless the spec needs otherwise. Never
write to `~/.config/Claude` without explicit gating
(`CLAUDE_TEST_USE_HOST_CONFIG=1` opt-out, OR `seedFromHost: true`
with read-only-then-discard semantics, OR an explicit comment
documenting why).
- **Default isolation** unless the spec needs otherwise. Use
`seedFromHost: true` for any test that depends on authenticated
renderer state — never assume default isolation gets past
`/login`. T16/T26 are the templates.
- **Don't introspect `ipcMain._invokeHandlers` for `claude.web`
channels.** Session 3 confirmed those use a custom eipc protocol
not in the standard registry. T22/T31/T33/T38 are now Tier 1
fingerprints. If you build the eipc-registry exposer (Category
C), update the plan-doc and this prompt accordingly.
- **CDP auth gate is alive** — runtime SIGUSR1 attach via
`app.attachInspector()`, never Playwright's `_electron.launch()`
or `chromium.connectOverCDP()`.
- **BrowserWindow Proxy gotcha** — use `webContents.getAllWebContents()`
not `BrowserWindow.getAllWindows()`. Constructor-level wraps
don't work; use prototype-method hooks.
- **BrowserWindow Proxy gotcha** — use
`webContents.getAllWebContents()` not `BrowserWindow.getAllWindows()`.
Constructor-level wraps don't work; use prototype-method hooks.
- **`skipUnlessRow()` always first.** First line of every `test()`
body when the test is row-gated.
- **No fixed sleeps.** `retryUntil` from `lib/retry.ts`, or
@@ -279,22 +258,30 @@ These are unchanged from sessions 1 and 2 and still load-bearing:
- **Don't break existing runners.** `npm run typecheck` must stay
clean. H01-H05 are the canaries; `npm test` must still pass them
after every commit.
- **For mock-then-call: leading comment must document why mock
beats invoke** (T25's leading comment is the worked example —
three short paragraphs).
- **Always grep the installed asar** to verify a fingerprint string
is present (and how often) BEFORE shipping. Build-reference is
beautified — strings differ from the minified bundle. Use
`node -e "const {extractFile}=require('@electron/asar'); ..."`
from inside `tools/test-harness` (where `@electron/asar` is on
the require path).
- **For mock-then-call: the helper goes in `lib/electron-mocks.ts`,
not `lib/claudeai.ts`.** Session 3 extracted them. The pattern is
documented in T24/T25's leading comments.
### Phases
#### Phase 0 — calibration
1. `cd tools/test-harness && npm run typecheck` — should pass.
2. Read the plan doc's "Status (post-execution)" session 2 section,
then read T25's leading comment (the mock-then-call pattern's
worked-example doc) and T16/T26 (the seedFromHost-then-AX-nav
pattern). Confirm you understand both.
3. Pick one Category B candidate (suggest T22 — read-only half) and
sketch the runner shape mentally. Don't write it yet — confirm
you can plan from the spec.
2. Read the plan doc's "Status (post-execution)" session 3 section,
then read T22's leading comment (the eipc-registry finding's
worked-example doc) and T24's leading comment (the
`Promise<boolean>` mock-then-call variant). Confirm you
understand both.
3. Pick one Category candidate and sketch the runner shape mentally.
Don't write it yet — confirm you can plan from the spec. Verify
any fingerprint strings exist in the installed asar before
committing to them.
If Phase 0 surfaces a problem (typecheck failing, primitives
unclear, patterns not understood), stop and report. Don't fan out.
@@ -304,34 +291,23 @@ unclear, patterns not understood), stop and report. Don't fan out.
Spawn parallel subagents (cap at 6 in flight) for the highest-
confidence candidates first.
**Suggested initial batch (4-5 specs):**
**Suggested initial batch (~3-4 specs):**
- **B / T22 — PR monitoring read-only half.** seedFromHost +
`ipcMain._invokeHandlers` for `getPrChecks` + asar fingerprint
for the missing-`gh` warning string.
- **C / T24 (asar fingerprint OR mock form, pick one).** If you
want stronger coverage, go mock form (Category E shape — mirror
of T25's `installOpenExternalMock` helper). If you want a quick
Tier 1, go asar fingerprint (`Mtt` registry + scheme strings).
- **C / T30 — Auto-archive cadence constants.** Pure asar probe.
- **C / T33 — Plugin browser handler registered.** T38 pattern.
- **B / T35 — MCP server config picked up.** Fixture under
isolation configDir + inspector eval.
- **A / `lib/input.ts:focusOtherWindow()` primitive.** Build the
X11 path with xdotool, skip cleanly on Wayland. Verify with a
marker-window round-trip.
- **B / T35 Phase 1 — MCP separation fingerprints.** Pure asar
probe; load-bearing strings only.
- (Hold A/S11 + A/S14 for batch 2 — they depend on the primitive
landing.)
If those land cleanly, dispatch a second batch:
If those land cleanly, dispatch batch 2:
- **A / T31 — Side chat opens (handler-registered shape).**
seedFromHost + `ipcMain._invokeHandlers` for `startSideChat`
/ `sendSideChatMessage` / `stopSideChat`. The lighter probe.
- **A / T32 — Slash command menu (asar fingerprint).** The full
AX-tree form needs a Code-tab session open AND server-side
rendered menu — heavy. The fingerprint form: `getSupportedCommands`
+ `slashCommands` schema present in `index.js`.
- **A / S11 / S14 (only if Category D primitive is built first).**
Build `lib/input.ts:focusOtherWindow()` in PR 1; ship S11/S14
in PR 2.
- **B / T37 — CLAUDE.md memory loads.** Fixture file + inspector
eval against the loaded memory state.
- **A / S11** — Quick Entry shortcut from any focus.
- **A / S14** — Global shortcuts via XDG portal on Niri.
- **B / T35 Phase 2** — fixture-then-readback (only if Phase 1
lands AND a reachable readback target is found; STOP AND REPORT
otherwise).
#### Per-subagent prompt shape
@@ -350,8 +326,8 @@ Read in order:
Write tools/test-harness/src/runners/<TEST-ID>_short_name.spec.ts.
[per-test specifics: pattern (seedFromHost / mock-then-call /
ipcMain._invokeHandlers / asar fingerprint / shared isolation),
assertion shape, skip rules, key constraint warnings]
asar fingerprint / shared isolation), assertion shape, skip rules,
key constraint warnings]
Constraints:
- Tabs, ~80-char wrap.
@@ -367,9 +343,10 @@ If the test isn't reasonable to implement (anchors don't resolve
to anything assertable, the test depends on state you can't
construct, the existing primitives don't cover the surface), DO
NOT write a stub. Report under Open questions and stop. Sessions
1 and 2 had cumulative ~6 "stop and report" outcomes that were
1, 2, and 3 had cumulative ~8 "stop and report" outcomes that were
the right call (S20 deferral, T05 reshape, T07 needs seedFromHost,
T08 needs setState('close'), S28 reclassification, T38 framing).
T08 needs setState('close'), S28 reclassification, T38 framing,
session-3 eipc-registry finding, T37 fixture-readback deferral).
Report shape (~150 words):
## <TEST-ID> runner
@@ -389,9 +366,11 @@ After fan-out returns:
1. `cd tools/test-harness && npm run typecheck` — must stay clean.
2. Run the new runners against KDE-W (the dev box) — but flag the
user first if any are destructive (seedFromHost kills running
Claude; T31/T32 require an open Code-tab session that may
accumulate state). Capture pass/skip/fail per spec for the
matrix.
Claude). **CRITICAL:** Test that any spec depending on
authenticated state actually uses `seedFromHost: true` — session
3 shipped specs with default isolation that needed
authentication, masking the eipc-registry finding for several
iterations. Capture pass/skip/fail per spec for the matrix.
3. Update [`docs/testing/runner-implementation-plan.md`](runner-implementation-plan.md)
"Status (post-execution)" section to reflect newly-shipped
specs and any reclassifications discovered mid-flight.
@@ -400,8 +379,8 @@ After fan-out returns:
5. Write a final report listing:
- Specs landed (pass / skip / needs-tuning per row)
- Specs deferred (with the per-test rationale)
- Specs reclassified (Tier 3 → Tier 2, Tier 2 → blocked, etc.)
- Updated coverage stat (was 50/76 = 66%, now N/76 = M%)
- Specs reclassified (Tier 3 → Tier 2, Tier 2 → Tier 1, etc.)
- Updated coverage stat (was 57/76 = 75%, now N/76 = M%)
6. Don't commit. The user reviews and commits.
7. Rotate this prompt: rewrite
`docs/testing/runner-implementation-followup-prompt.md` for the
@@ -409,7 +388,7 @@ After fan-out returns:
### Self-correction loop
Same as sessions 1 and 2:
Same as sessions 1, 2, and 3:
1. Subagent typecheck failure → re-spawn with explicit fix
instruction.
@@ -417,6 +396,13 @@ Same as sessions 1 and 2:
file → re-spawn with explicit "use the Write tool" instruction.
3. Two subagents wrote runners that share a primitive but with
different shapes → factor into `lib/<topic>.ts` BEFORE shipping.
4. **NEW for session 4:** Spec passes locally but the assertion is
actually trivial (e.g. an unauthenticated launch where the
handler check vacuously passes because no handlers are
registered) → re-examine the assertion shape. Session 3's eipc-
registry finding came from running the specs and finding only
3 handlers in the registry; the lesson is to verify the
assertion is meaningful, not just that it passes.
Cap re-spawns at 2 per file. Past that, mark as needing human
review and move on.
@@ -425,52 +411,51 @@ review and move on.
Stop and write the final report when one of:
1. **All Category A + B + C target specs landed and typecheck-clean.**
1. **All Category A + B target specs landed and typecheck-clean.**
Write coverage update, stop.
2. **Hit re-spawn cap on 3+ runners.** Stop, write up which are
blocked.
3. **Discovered a primitive gap that breaks 5+ Tier 2/Tier 3
tests.** Stop, propose where the new primitive should live in
`lib/`. Future session adds the primitive first, then resumes.
4. **Session budget hits ~7 new specs.** Stop, synthesize, leave
4. **Session budget hits ~5 new specs.** Stop, synthesize, leave
the rest for the next session.
### What you should NOT do
- **Don't try to land Category A + B + C in one batch.** That's
~9-11 specs. Pick the highest-confidence subset for the first
batch and decide whether to do more based on what came back.
- **Don't try to land Category A + B + C in one batch.** Pick the
highest-confidence subset for the first batch.
- **Don't ship stubs.** If a runner can't actually assert what the
spec says, mark it as Tier 3 / blocked / primitive-gap and don't
write a placeholder. The cumulative six "stop and report"
outcomes from sessions 1+2 were the right call — every one
write a placeholder. The cumulative eight "stop and report"
outcomes from sessions 1/2/3 were the right call — every one
revealed a real constraint.
- **Don't break existing runners.** H01-H05 are the canaries.
- **Don't pre-extract `lib/electron-mocks.ts`.** The
`installShowItemInFolderMock` + `installOpenDialogMock` pair
doesn't yet justify a new file; if T24 ships as Category E
(mock form), THAT's the third — extract then.
- **Don't restructure `lib/`** beyond targeted additions.
Premature abstractions are wrong abstractions.
Premature abstractions are wrong abstractions. `electron-mocks.ts`
was extracted in session 3 once the third helper landed —
threshold-driven, not speculative.
- **Don't run destructive Tier 3 tests** that write to the user's
real claude.ai account (T22 PR write, T27 scheduling, T29
worktree creation, T34 OAuth, T36 hooks). Only the *read-only
reframes* of those are in scope this session.
reframes* of those are in scope.
- **Don't introspect `ipcMain._invokeHandlers` for `claude.web`
eipc channels.** Confirmed broken in session 3. If you need
runtime IPC verification for those channels, the eipc-registry
exposer is the primitive gap to land first.
- **Don't implement the #569 power-inhibit patch in this session.**
That's a separate workstream. The S20 spec follows the patch,
not the other way around.
That's a separate workstream.
- **Don't commit.** The user reviews and commits.
### Final report format
```markdown
## Runner implementation summary (session 3)
## Runner implementation summary (session 4)
- Category A landed: N / 5
- Category B landed: N / 3
- Category C landed: N / 3
- Category A landed: N / 3 (focus-shifter primitive + S11 + S14)
- Category B landed: N / 1-2 (T35 Phase 1 + maybe Phase 2)
- Reclassified mid-flight: N (with reasons)
- Coverage: was 50/76 (66%), now <NEW>/76 (<PCT>%)
- Coverage: was 57/76 (75%), now <NEW>/76 (<PCT>%)
- Typecheck: clean | <errors>
- KDE-W test run: <pass/skip/fail counts>
@@ -478,7 +463,7 @@ Stop and write the final report when one of:
| Cat | Test ID | File | Assertion shape | Status |
|---|---|---|---|---|
| B | T22 | T22_pr_monitoring_handler.spec.ts | seedFromHost + IPC handler probe + asar fingerprint | ✓ pass |
| A | S11 | S11_quick_entry_from_other_focus.spec.ts | … | ✓ pass |
| ... |
## Notable findings
@@ -514,8 +499,24 @@ git diff --stat
the right substrate — see `T17_folder_picker.spec.ts` for the
end-to-end example. Don't query DOM by CSS selector unless
`claudeai.ts` doesn't already cover the surface.
- For mock-then-call: see T25 for the canonical pattern. Mock
installation is in `lib/claudeai.ts` alongside the dialog-mock;
add a sibling export, don't pre-extract a new file.
- For mock-then-call: helpers live in `lib/electron-mocks.ts` (not
`claudeai.ts` anymore — extracted in session 3). See T24's
leading comment for the `Promise<boolean>` variant + T25's for
the void variant.
- **For asar fingerprints: ALWAYS grep the installed asar first.**
Build-reference is beautified; the bundle is minified.
```bash
cd tools/test-harness && node -e "
const {extractFile} = require('@electron/asar');
const buf = extractFile(
'/usr/lib/claude-desktop/node_modules/electron/dist/resources/app.asar',
'.vite/build/index.js'
);
const s = buf.toString('utf8');
for (const k of ['<your-needle>', '<another>']) {
console.log(k, '->', s.split(k).length - 1);
}
"
```
Begin with Phase 0. Don't fan out until calibration succeeds.

View File

@@ -18,6 +18,110 @@ work begins.
## Status (post-execution)
**Shipped session 3 (7 new specs):** T22, T24, T30, T31, T32, T33, T37.
Coverage moved from 50/76 (66%) to 57/76 (75%).
Session 3 findings + reclassifications:
- **eipc-registry finding (load-bearing — corrects session 2 T38).** The
`LocalSessions_$_*` and `CustomPlugins_$_*` channels named in case-doc
Code anchors (`:68816` framing comment, `:71392` listMarketplaces, etc.)
do **not** register through Electron's standard `ipcMain.handle()`
registry. KDE-W run revealed `ipcMain._invokeHandlers` holds only three
chat-tab MCP-bridge handlers (`list-mcp-servers`,
`connect-to-mcp-server`, `request-open-mcp-settings`) regardless of
ready level (`mainVisible` / `claudeAi` / `userLoaded`) and regardless
of whether the launch was hermetic (default isolation) or authenticated
(`createIsolation({ seedFromHost: true })`). Confirmed via inspector
walk of `globalThis` — no Map containing 5+ keys with the
`LocalSessions_$_*` shape exists at any reachable surface. The custom
`$eipc_message$_<UUID>_$_claude.web_$_<name>` protocol uses a closure-
local message-port registry that's not introspectable from main without
reverse-engineering the eipc bootstrap (deferred — same gotcha as
session 2's S28 with `Sbn()`).
- **T38 reclassified from Tier 2 → Tier 1.** Session 2 shipped T38 as a
`ipcMain._invokeHandlers` introspection probe assuming the channel
registered through stdlib IPC; the eipc-registry finding above shows
that probe never resolved a real handler. Reclassified to a Tier 1
asar fingerprint asserting the channel-name string
`LocalSessions_$_openInEditor` is present in bundled `index.js` (case-
doc anchor `:68816` framing / `:464011` egress). Same drift signal,
zero false-positive surface, no launch needed. Updated leading
comment links the eipc-registry finding for future maintainers.
- **T22, T31, T33 shipped as Tier 1 fingerprints, not Tier 2 IPC
probes.** All three were originally drafted using the (now-known-broken)
T38 handler-registered pattern. After the eipc-registry finding,
rewritten as pure asar fingerprints anchoring on the eipc channel-name
strings:
- **T22** asserts `LocalSessions_$_getPrChecks` *and* the
`"gh CLI not found in PATH"` throw site (case-doc anchors
`:464281` / `:464964` / `:464368`). Two-fingerprint runner; the
missing-`gh` string is the Linux-specific UX backstop since
`installGh()` is macOS-only.
- **T31** asserts the side-chat trio: `LocalSessions_$_startSideChat`,
`LocalSessions_$_sendSideChatMessage`, `LocalSessions_$_stopSideChat`
(case-doc anchors `:487025` / `:487265`). Trio is load-bearing —
side chat is broken without all three.
- **T33** asserts `CustomPlugins_$_listMarketplaces` and
`CustomPlugins_$_listAvailablePlugins` (case-doc anchors `:71392` /
`:71534` / `:507176`). Both load-bearing for the plugin-browser
populate flow.
- **T24 shipped as Tier 2 mock-then-call (Category E pattern).**
Mirrors T25's `installShowItemInFolderMock` shape; new
`installOpenExternalMock` helper records every `shell.openExternal`
call without launching a real editor on the host. Strictly stronger
than the asar-fingerprint alternative (Category C / `Mtt` registry
fingerprint) — exercises the actual egress at index.js:464011 with
the URL flowing through verbatim. The meaningful difference from T25:
`shell.openExternal` returns `Promise<boolean>` (not void), so the
mock returns a resolved Promise.
- **T30 sweep cadence regex tuned to minified bundle.** Case-doc names
the constants as `300_000` / `3_600_000` (beautified form); installed
asar has them as `300*1e3` / `3600*1e3`. Single regex with two
proximity windows — tail of `300*1e3` to `3600*1e3` ≤ 200 chars,
tail of `3600*1e3` to `AutoArchiveEngine` ≤ 3000 chars — confirmed
to match exactly once globally. Followed by an `.includes()` check
for `ccAutoArchiveOnPrClose` inside the captured window to colocate
the gate key.
- **T37 fixture-readback form deferred — Tier 1 fingerprint shipped.**
Session prompt suggested placing a fixture `~/.claude/CLAUDE.md` and
inspector-eval'ing the loaded memory state. The parsed-memory state
target is a closure-local minified symbol (same gotcha as S28 from
session 2 / S19's `cE()`/`Tce()` re-implementation note); without a
reachable readback target the fixture form would assert nothing
beyond "the spec didn't crash". Shipped as Tier 1 fingerprint
anchoring on `[GlobalMemory] Copied CLAUDE.md` (single-occurrence
log line, the cleanest possible anchor) plus `CLAUDE.md` filename
literal and `CLAUDE_CONFIG_DIR` env-var token.
- **`lib/electron-mocks.ts` extracted.** With T24 landing the third
mock-then-call helper (after T17's dialog mock and T25's
showItemInFolder mock), the threshold from the session prompt was
hit. Moved `installOpenDialogMock` / `installShowItemInFolderMock` /
`installOpenExternalMock` plus their `getCalls` readers + interfaces
out of `lib/claudeai.ts` into `lib/electron-mocks.ts`. T17, T24, T25
imports updated. The mocks are generic Electron module patches —
not claude.ai-domain — so the new home keeps `claudeai.ts` focused
on AX-tree page-objects.
- **Authentication state in launch-based specs.** All four launch-
based specs in this session (T22/T24/T31/T33 originally) ran with
default isolation, i.e. unauthenticated. After the eipc-registry
finding the three IPC probes converted to pure file probes (no
launch needed); T24 (mock-then-call) doesn't depend on auth state
because `shell.openExternal` is a stdlib Electron module patched
in main. For future Tier 2 reframes that DO depend on authenticated
renderer state (e.g. testing claude.ai DOM after login), the
T16/T26 `seedFromHost: true` pattern is the correct gate.
Tier 2 → Tier 2 candidates remaining for a future session: **S11,
S14** (focus-shifter primitive still unbuilt). **T35** (MCP server
config picked up — needs a reachable readback for parsed MCP server
state, same blocker as T37b/S19/S28). The eipc-registry surface
itself is a primitive gap — landing it would unlock proper Tier 2
runtime probes for T22/T31/T33/T38 and any future LocalSessions_*
or CustomPlugins_* tests.
---
**Shipped session 2 (10 new specs):** T10, T16, T23, T25, T26, T38, S10,
S19, S25, S28. Coverage moved from 40/76 (53%) to 50/76 (66%).

View File

@@ -7,8 +7,8 @@ architecture, decisions, and rationale.
## Status
Fifty specs wired (14 cross-env T-tests, 31 env-specific S-tests, 5
H-prefix harness self-tests). See
Fifty-seven specs wired (21 cross-env T-tests, 31 env-specific S-tests,
5 H-prefix harness self-tests). See
[`docs/testing/runner-implementation-plan.md`](../../docs/testing/runner-implementation-plan.md)
for the tiered triage of remaining tests and the per-spec rationale
behind tier classification.
@@ -32,10 +32,17 @@ behind tier classification.
| [T14b](../../docs/testing/cases/launch.md#t14--multi-instance-behavior) | Second invocation under same isolation exits cleanly; primary pid stays alive (runtime probe) | spawn delta + pgrep |
| [T16](../../docs/testing/cases/code-tab-foundations.md#t16--code-tab-loads) | After `seedFromHost` + `userLoaded`, `CodeTab.activate()` resolves and ≥1 compact pill renders (env pill = Code-body mounted) | L1 + AX-tree |
| [T17](../../docs/testing/cases/code-tab-foundations.md#t17--folder-picker-opens) | Code df-pill → env pill → Local → Select folder → Open folder triggers `dialog.showOpenDialog` (requires `CLAUDE_TEST_USE_HOST_CONFIG=1`) | L1 |
| [T22](../../docs/testing/cases/code-tab-workflow.md#t22--pr-monitoring-via-gh) | Bundled `index.js` contains `LocalSessions_$_getPrChecks` eipc channel name *and* `gh CLI not found in PATH` Linux-fallthrough throw site (Tier 1 fingerprint — eipc registry not introspectable from main) | file probe |
| [T23](../../docs/testing/cases/code-tab-handoff.md#t23--desktop-notifications-fire) | Firing `new Notification({title})` from main reaches the session bus's `org.freedesktop.Notifications.Notify` (observed via `dbus-monitor`) | L1 + DBus subprocess |
| [T24](../../docs/testing/cases/code-tab-handoff.md#t24--open-in-external-editor) | After `installOpenExternalMock` mirroring T25's pattern, `evalInMain` calls `shell.openExternal('vscode://file/...')`; mock records the URL verbatim, no real editor launch | L1 (mocked egress) |
| [T25](../../docs/testing/cases/code-tab-handoff.md#t25--show-in-files--file-manager) | After `installShowItemInFolderMock` mirroring T17's dialog-mock pattern, `evalInMain` calls `shell.showItemInFolder(<synthetic path>)`; mock records the call verbatim, no throw — no host side effect | L1 (mocked egress) |
| [T26](../../docs/testing/cases/routines.md#t26--routines-page-renders) | After `seedFromHost` + `userLoaded`, click "Routines" sidebar AX button; assert "New routine" / "All" / "Calendar" anchor renders | L1 + AX-tree |
| [T38](../../docs/testing/cases/code-tab-handoff.md#t38--continue-in-ide) | `ipcMain._invokeHandlers` registry contains a channel ending in `LocalSessions_$_openInEditor` (handler-registered probe) | L1 (IPC introspection) |
| [T30](../../docs/testing/cases/code-tab-workflow.md#t30--auto-archive-on-pr-merge) | Bundled `index.js` colocates the auto-archive sweep cadence (`300*1e3``3600*1e3``AutoArchiveEngine`) with the `ccAutoArchiveOnPrClose` gate key (single-regex multi-string fingerprint) | file probe |
| [T31](../../docs/testing/cases/code-tab-workflow.md#t31--side-chat-opens) | Bundled `index.js` contains all three side-chat eipc channel names (`startSideChat`, `sendSideChatMessage`, `stopSideChat`) — load-bearing trio | file probe |
| [T32](../../docs/testing/cases/code-tab-workflow.md#t32--slash-command-menu) | Bundled `index.js` contains `LocalSessions_$_getSupportedCommands` eipc channel + `slashCommands` schema field | file probe |
| [T33](../../docs/testing/cases/extensibility.md#t33--plugin-browser) | Bundled `index.js` contains `CustomPlugins_$_listMarketplaces` and `CustomPlugins_$_listAvailablePlugins` eipc channel names (browser populate flow) | file probe |
| [T37](../../docs/testing/cases/extensibility.md#t37--claudemd-memory-loads) | Bundled `index.js` contains `[GlobalMemory] Copied CLAUDE.md` log line + `CLAUDE.md` filename literal + `CLAUDE_CONFIG_DIR` env-var token (memory-loading wiring) | file probe |
| [T38](../../docs/testing/cases/code-tab-handoff.md#t38--continue-in-ide) | Bundled `index.js` contains `LocalSessions_$_openInEditor` eipc channel name (Tier 1 fingerprint — reclassified from session 2's broken `ipcMain._invokeHandlers` probe; eipc registry is closure-local) | file probe |
| H01 | CDP auth gate exits with code 1 when spawned with `--remote-debugging-port` and no `CLAUDE_CDP_AUTH` token | spawn probe |
| H02 | `frame-fix-wrapper.js` + `frame-fix-entry.js` injected into `app.asar` (Proxy + main-field reference) | file probe |
| H03 | Build-pipeline patch fingerprints all present in `app.asar` (KDE gate, frame-fix inject, tray, cowork, claude-code) | file probe |
@@ -75,17 +82,30 @@ These specs exercise the substrate primitives in `lib/`: `xprop`
shell-outs (T01, T04), `dbus-next` (T03), `dbus-monitor` subprocess
eavesdrop (T23), Node-inspector runtime-attach
(T07/T16/T17/T26/S10/S29-S35/T05-T14b L1 specs), `app.asar` content reads
(S08/S09/S21/S22/S26/S27/S28/T11/T14a/H02/H03/S33), `/proc/$pid/cmdline`
reads (S07/S12), pgrep-based pid deltas (T10/T14b/H04/S16/S30),
`mount(8)` parsing (S16), source-tree probes against
`scripts/launcher-common.sh` (S02), `dpkg-query` / `rpm -qR` / `rpm -qf`
calls (S03/S04/S05/T13), `safeStorage.encryptString` round-trip across
two launches (S25), `extraEnv` precedence over isolation env (S19),
`ipcMain._invokeHandlers` registry introspection (T38), and the
(S08/S09/S21/S22/S26/S27/S28/T11/T14a/T22/T30/T31/T32/T33/T37/T38/H02/H03/S33),
`/proc/$pid/cmdline` reads (S07/S12), pgrep-based pid deltas
(T10/T14b/H04/S16/S30), `mount(8)` parsing (S16), source-tree probes
against `scripts/launcher-common.sh` (S02), `dpkg-query` / `rpm -qR` /
`rpm -qf` calls (S03/S04/S05/T13), `safeStorage.encryptString`
round-trip across two launches (S25), `extraEnv` precedence over
isolation env (S19), the `lib/electron-mocks.ts` mock-then-call
helpers — `installOpenDialogMock` (T17), `installShowItemInFolderMock`
(T25), `installOpenExternalMock` (T24) — and the
`createIsolation({ seedFromHost: true })` primitive that lets
login-required tests run hermetically against a copy of the host's
signed-in auth state (T07, T16, T26).
Note on eipc channels: the `LocalSessions_$_*` and `CustomPlugins_$_*`
channel names referenced in the case-doc Code anchors do not register
through Electron's standard `ipcMain.handle()` registry — they use a
custom `$eipc_message$_<UUID>_$_claude.web_$_<name>` message-port
protocol whose registry is closure-local and not reachable from
`globalThis` at any ready level. T22, T31, T33, T38 anchor on the
eipc channel-name *strings* in the bundle (Tier 1 fingerprint) rather
than introspecting a registry. See
[`runner-implementation-plan.md`](../../docs/testing/runner-implementation-plan.md)
session 3 status section for the finding.
Per-row pass/skip counts depend on which sweep runs against the row;
see `runner-implementation-plan.md` for tier classification and
matrix-regen for the most-recent per-row outcomes. The Quick Entry