mirror of
https://github.com/aaddrick/claude-desktop-debian.git
synced 2026-05-17 00:26:21 +03:00
docs(testing): session 3 plan/inventory + rotate session 4 prompt
Updates the post-execution status section with session 3's seven
shipped specs, the eipc-registry finding (corrects session 2's T38
assumption), and the four reclassifications (T22/T31/T33/T38 from
Tier 2 IPC probes to Tier 1 fingerprints). Captures the
authentication-state lesson too — launches that depend on
authenticated renderer state need createIsolation({ seedFromHost:
true }), even if the case-doc-shaped Tier 2 form looks hermetic on
paper.
README inventory grows from 50 to 57 specs and adds a note that
LocalSessions_$_* / CustomPlugins_$_* channels use a custom eipc
protocol, not Electron's standard ipcMain.handle() — so future
runners should anchor on channel-name strings (Tier 1) rather than
introspect _invokeHandlers (broken).
Followup prompt rewritten for session 4: focus-shifter primitive +
S11/S14, T35 MCP separation fingerprints (Phase 1) and optional
fixture-readback (Phase 2, may abort), and the eipc-registry
exposer as a flagged primitive gap.
Co-Authored-By: Claude <claude@anthropic.com>
@@ -1,18 +1,22 @@
# test-harness runner implementation — session 3 prompt
# test-harness runner implementation — session 4 prompt

This file is meant to be **copied verbatim into a fresh Claude Code
session** as the initial user message. Don't paraphrase it; the
orchestration depends on the exact directives below.

You're picking up after a runner-implementation session that landed 10
new specs (5 Tier 2 + 4 Tier 2-reframes + 1 Tier 1 reclass), lifting
harness coverage from 40/76 (53%) to 50/76 (66%). Three commits on
`docs/compat-matrix`:
You're picking up after a runner-implementation session that landed 7
new specs (T22, T24, T30, T31, T32, T33, T37) and reclassified one
session-2 carryover (T38), lifting harness coverage from 50/76 (66%)
to 57/76 (75%). Three commits on `docs/compat-matrix`:

- `XXX` — `test(harness): session 2 runners + lib/claudeai mock helper`
  (10 new spec files; `installShowItemInFolderMock` added to
  `lib/claudeai.ts` mirroring the `installOpenDialogMock` pattern;
  README inventory + plan-doc status section updated).
- `XXX` — `test(harness): session 3 runners + eipc-registry finding`
  (7 new spec files; `lib/electron-mocks.ts` extracted from
  `lib/claudeai.ts` once T24 brought the third mock-then-call helper
  online — `installOpenDialogMock` / `installShowItemInFolderMock` /
  `installOpenExternalMock` plus their `getCalls` readers; T17, T24,
  T25 imports updated; T22/T31/T33/T38 reclassified to Tier 1
  fingerprints after the eipc-registry finding; README inventory +
  plan-doc status updated).

(Substitute the actual SHA after committing — the user reviews and
commits at the end of every session.)
@@ -21,250 +25,225 @@ The plan doc at
[`docs/testing/runner-implementation-plan.md`](runner-implementation-plan.md)
captures the tier classification and execution-time reclassifications.
Its "Status (post-execution)" section is the source of truth for what's
done and what's deferred — read **session 2** then **session 1**
sub-sections.
done and what's deferred — read **session 3** first, then **session 2**,
then **session 1** sub-sections.

This session is a continuation, not a restart. Start by reading the
plan doc's status sections.

### Big new findings from session 2
### Big new findings from session 3

1. **Mock-then-call beats invoke-then-cleanup for Tier 2 reframes of
   side-effecting Electron APIs.** T17's existing
   `installOpenDialogMock` pattern was extended to T25 via a new
   `installShowItemInFolderMock` in `lib/claudeai.ts`. Net: no host
   file-manager pop-up during the run, AND the assertion strengthens
   from "didn't throw" to "the egress was reached + the path arg
   flowed through verbatim". Apply this pattern to any future
   `shell.*` / `dialog.*` Tier 2 reframes (T24 `shell.openExternal`
   would mock cleanly the same way).
1. **`ipcMain._invokeHandlers` does NOT see `claude.web` eipc
   channels.** This is load-bearing — corrects a session-2 assumption.
   The `LocalSessions_$_*` / `CustomPlugins_$_*` channels named in
   case-doc Code anchors use a **custom message-port protocol**
   (`$eipc_message$_<UUID>_$_claude.web_$_<name>` framing at
   `index.js:68816`) that's distinct from Electron stdlib IPC. KDE-W
   run revealed the standard registry holds only three chat-tab
   MCP-bridge handlers regardless of ready level
   (`mainVisible`/`claudeAi`/`userLoaded`) and regardless of whether
   the launch is hermetic or `seedFromHost: true` authenticated. The
   eipc registry itself is a closure-local — same gotcha as session
   2's `Sbn()` (S28) and `cE()`/`Tce()` (S19). Reverse-engineering the
   eipc bootstrap to expose the registry is a primitive gap that
   would unblock proper Tier 2 runtime probes for **T22, T31, T33,
   T38** and any future LocalSessions_/CustomPlugins_ tests.
   Reference: T22's leading comment, plan-doc session 3 status.

2. **`gdbus monitor --dest <name>` only sees signals OWNED BY that
   destination, not method calls TO it.** T23 had to switch from the
   plan's gdbus suggestion to `dbus-monitor` (eavesdrop match rule)
   to observe `org.freedesktop.Notifications.Notify` calls from
   Electron. If T27 / T22 / S24 ever ship Tier 2 reframes that need
   to observe method calls on a service, use `dbus-monitor`.
2. **For tests that depend on authenticated renderer state, ALWAYS
   use `createIsolation({ seedFromHost: true })`.** Session 3's first
   four launch-based specs (T22/T24/T31/T33 — all originally drafted
   as IPC handler probes) defaulted to hermetic isolation,
   i.e. unauthenticated. Even if the eipc registry HAD been at
   `ipcMain._invokeHandlers`, the LocalSessions/CustomPlugins
   handlers register only after the renderer's authenticated init
   path runs — default isolation never gets past `/login`. T16 and
   T26 are the canonical seedFromHost templates; copy that shape any
   time the assertion depends on claude.ai's renderer modules being
   loaded.

3. **`ipcMain._invokeHandlers` channel naming carries a build-stable
   UUID prefix:** `$eipc_message$_<UUID>_$_claude.web_$_<name>`. T38
   anchors on the `_$_<name>` suffix to survive UUID rotation; the
   prefix is captured as diagnostic. Useful precedent for any future
   IPC-introspection probes.
3. **`shell.openExternal` mock-then-call works identically to
   `shell.showItemInFolder` mock — but the mock returns
   `Promise<boolean>` not void.** T24 ships the mock pattern in
   `lib/electron-mocks.ts`. If a future spec needs to mock another
   `shell.*` method, mirror this shape: idempotency flag on
   `globalThis`, recorder pushes to a `__claudeAi*Calls` array, mock
   matches the documented return type. The mocks live in
   `lib/electron-mocks.ts` (extracted in session 3 — was in
   `lib/claudeai.ts` until the third helper landed).

4. **Closure-local minified helpers are NOT reachable from
   globalThis.** S28's plan called for inspector-eval against
   `Sbn()`, but `Sbn` is a closure-local — couldn't be invoked. S28
   reclassified to Tier 1 (asar fingerprint of the classifier
   expression). For any future "Tier 2 reframe via inspector-eval
   against minified helper X" entry: confirm reachability before
   classifying.
4. **Asar fingerprint regex with multi-string proximity gates works
   well for cadence-style code.** T30 anchors three strings
   (`300*1e3`, `3600*1e3`, `AutoArchiveEngine`) in colocation with
   tuned distance windows (≤200 chars, ≤3000 chars), then runs an
   `.includes()` for a fourth string (`ccAutoArchiveOnPrClose`)
   inside the captured window. Single match globally. Pattern for
   any future "these constants are colocated with this class" test.

5. **`safeStorage` on Linux uses random IVs.** Only decrypted
   plaintext is comparable across encrypt calls; ciphertext bytes
   are not deterministic. S25 compares plaintexts.

6. **`extraEnv` precedence.** `lib/electron.ts:317-323` spreads in
   order: `process.env`, `LAUNCHER_INJECTED_ENV`, `isolation?.env`,
   `waylandEnv`, then `opts.extraEnv`, then `CI: '1'`. Override
   wins. S19 leans on this; load-bearing for future tests that
   need to override isolation defaults.
5. **Build-reference is in BEAUTIFIED form; installed asar is
   MINIFIED. Numeric literals differ.** T30 case-doc named the
   constants as `300_000` / `3_600_000` (with underscores —
   beautified preserves them). The actual installed asar has
   `300*1e3` / `3600*1e3`. Always grep the installed asar before
   settling on a fingerprint string. The
   `/usr/lib/claude-desktop/node_modules/electron/dist/resources/app.asar`
path on KDE-W is the source of truth for fingerprints.
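
The mock-then-call shape named in the findings above (idempotency flag on `globalThis`, a recorder array, a mock that matches the return type) might look roughly like this. It is a minimal sketch, not the code in `lib/electron-mocks.ts`: `ShellLike`, the `Sketch`-suffixed function names, and both `globalThis` property names are hypothetical, and the boolean resolution simply follows the note above. Electron's published typings may differ, so check them before relying on the resolved value.

```typescript
// Hypothetical stand-in for Electron's `shell`; a real helper would patch
// the object returned by require('electron') inside the main process.
interface ShellLike {
  openExternal(url: string): Promise<boolean>;
}

// Hypothetical globals: one idempotency flag, one recorder array.
type MockGlobals = {
  __claudeAiOpenExternalMockInstalled?: boolean;
  __claudeAiOpenExternalCalls?: string[];
};

function installOpenExternalMockSketch(shell: ShellLike): void {
  const g = globalThis as unknown as MockGlobals;
  // Idempotency: a second install must not wrap the mock or reset the recorder.
  if (g.__claudeAiOpenExternalMockInstalled) return;
  g.__claudeAiOpenExternalMockInstalled = true;

  const calls: string[] = [];
  g.__claudeAiOpenExternalCalls = calls;

  // Record the argument verbatim instead of reaching the host's URL handler.
  shell.openExternal = async (url: string): Promise<boolean> => {
    calls.push(url);
    return true;
  };
}

// Reader counterpart: returns a copy so callers cannot mutate the recorder.
function getOpenExternalCallsSketch(): string[] {
  const g = globalThis as unknown as MockGlobals;
  return [...(g.__claudeAiOpenExternalCalls ?? [])];
}
```

A spec would install the mock, trigger the code path under test, then assert that the recorded list contains the expected URL, which is the "egress was reached and the argument flowed through verbatim" form of the assertion.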

### Authoritative reference

Read these in order before fanning out:

- [`docs/testing/runner-implementation-plan.md`](runner-implementation-plan.md)
  — tier classification + status section. Read both **session 2**
  and **session 1** "Status (post-execution)" sub-sections. The
  Tier-3 list (line ~342) is the candidate pool for further
  reframes.
  — tier classification + status section. Read **session 3**,
  **session 2**, then **session 1** "Status (post-execution)" sub-
  sections. The Tier-3 list (line ~342) is the candidate pool for
  further reframes.
- [`tools/test-harness/README.md`](../../tools/test-harness/README.md)
  — runner conventions, the now-50-spec inventory, primitives in
  `lib/`, isolation defaults, the CDP-gate workaround, the
  `seedFromHost` reference.
  — runner conventions, the now-57-spec inventory, primitives in
  `lib/`, isolation defaults, the CDP-gate workaround, the eipc note.
- [`docs/testing/cases/README.md`](cases/README.md) — case-doc
  structure and the four anchor scopes.
- [`tools/test-harness/src/lib/`](../../tools/test-harness/src/lib/)
  — the existing primitives. Notable additions since session 2:
  - `claudeai.ts` — `installShowItemInFolderMock` /
    `getShowItemInFolderCalls` (mirrors `installOpenDialogMock`).
    If 3+ tests start using mock-then-call, consider extracting to
    `lib/electron-mocks.ts` — but don't pre-extract.
  — the existing primitives. Notable additions since session 3:
  - `electron-mocks.ts` — extracted from `claudeai.ts` once T24
    brought the third mock-then-call helper online. Three pairs
    today (`installOpenDialogMock`, `installShowItemInFolderMock`,
    `installOpenExternalMock` + their readers). If a future spec
    needs another `shell.*` / `dialog.*` / similar mock, add it
    here as a fourth sibling.
- [`tools/test-harness/src/runners/`](../../tools/test-harness/src/runners/)
  — every existing spec is a template. Notable session 2 templates:
  - `T16_code_tab_loads.spec.ts` / `T26_routines_page_renders.spec.ts`
    — seedFromHost + post-login renderer-side AX nav. T16 uses the
    existing `CodeTab.activate()`; T26 inlines a similar AX walker
    for the sidebar. Pattern for any further "click an AX-tree
    button after login" test.
  - `T25_show_item_in_folder_no_throw.spec.ts` — mock-then-call
    pattern (mirrors T17). Use this shape for any future Tier 2
    reframe of a side-effecting Electron API.
  - `T38_open_in_editor_handler_registered.spec.ts` — IPC handler
    registry introspection via `ipcMain._invokeHandlers`. Pattern
    for any "is this handler wired up" check.
  - `T23_notification_reaches_dbus.spec.ts` — dbus-monitor
    subprocess + inspector-fired notification + buffer scan.
  - `T10_cowork_daemon_respawn.spec.ts` — H04 extension: spawn,
    SIGKILL, poll for new pid. Pattern for any "service auto-respawn
    contract" test.
  - `S25_safestorage_token_persists.spec.ts` — two-launch with
    shared isolation handle + safeStorage round-trip via tmpfile.
  - `S28_worktree_permission_classifier.spec.ts` — single-regex
    asar fingerprint for a multi-string-OR classifier expression.
  — every existing spec is a template. Notable session 3 templates:
  - `T22_pr_monitoring_handler.spec.ts` — multi-fingerprint Tier 1
    (eipc channel-name string + Linux-fallthrough throw site).
    Pattern for any "the IPC channel name is in the bundle" probe
    when the registry isn't introspectable.
  - `T24_open_in_editor_no_throw.spec.ts` — mock-then-call with a
    `Promise<boolean>` egress. Pattern for any future `shell.*`
    egress that returns a Promise (not void).
  - `T30_auto_archive_cadence_constants.spec.ts` — single-regex
    multi-string-proximity asar fingerprint with a tuned distance
    window. Pattern for any "these constants are colocated with
    this class" test.
  - `T31_side_chat_handlers_registered.spec.ts` /
    `T33_plugin_browser_handler_registered.spec.ts` — eipc
    channel-name fingerprints. Pattern for any "is this IPC channel
    name in the bundle" probe.
  - `T37_claude_md_memory_fingerprint.spec.ts` — multi-anchor Tier
    1 with a single-occurrence high-signal log line as the primary
    anchor + broader namespace tokens for context.
- [`docs/testing/cases/*.md`](cases/) — the spec each runner
  asserts. The **Code anchors:** field tells you exactly where
upstream implements the feature.
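
As an illustration of the multi-string-proximity fingerprint that the T30 template stands for, a check along the following lines could be run against the extracted bundle text. This is a sketch under assumptions, not T30's actual regex: the order of the three anchors and the pairing of each distance window (200 and 3000 characters) with a particular gap are guesses, and both function names are hypothetical.

```typescript
// Collect every window that starts at `300*1e3`, reaches `3600*1e3` within
// 200 characters, and then reaches `AutoArchiveEngine` within 3000 more.
// Anchor order and window pairing are assumptions made for this sketch.
function findCadenceWindows(bundle: string): string[] {
  const re = /300\*1e3[\s\S]{0,200}?3600\*1e3[\s\S]{0,3000}?AutoArchiveEngine/g;
  const windows: string[] = [];
  let match: RegExpExecArray | null;
  while ((match = re.exec(bundle)) !== null) {
    windows.push(match[0]);
  }
  return windows;
}

// The fingerprint holds when there is exactly one such window in the whole
// bundle and the fourth string sits somewhere inside it.
function cadenceFingerprintHolds(bundle: string): boolean {
  const windows = findCadenceWindows(bundle);
  return windows.length === 1 && windows[0].includes('ccAutoArchiveOnPrClose');
}
```

Requiring exactly one window keeps the probe from passing on an unrelated pair of timer constants elsewhere in the bundle, at the cost of failing loudly if upstream ever duplicates the block.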

### Tests in scope this session

Five categories, in priority order:
**Realistic ceiling: ~5 new specs this session.** Session 3 hit ~7
because Tier 1 fingerprints are cheap. Session 4's candidates are
heavier — most need either a new primitive (focus-shifter, eipc-
registry exposer) or fixture-then-readback against state that may
not be reachable.

Three categories:

| # | Tests | Source files | Notes |
|---|---|---|---|
| **A** Deferred from session 2 | T31, T32, S06, S11, S14 | `code-tab-workflow.md` (T31, T32), `shortcuts-and-input.md` (S06, S11, S14) | T31/T32 need a Code-tab session OPEN; S06 needs Wayland row; S11/S14 need a new focus-shifter primitive |
| **B** Tier 3 → Tier 2 reframes (read-only) | T22, T35, T37 | `code-tab-workflow.md` (T22), `extensibility.md` (T35, T37) | Each can ship as a *reads-from-disk-or-IPC-registry* probe without writing to the user's account |
| **C** Asar fingerprint cleanups | T24, T30, T33 | `code-tab-handoff.md` (T24), `code-tab-workflow.md` (T30), `extensibility.md` (T33) | Each has a load-bearing string set in `index.js` that pins the wiring without needing a launch |
| **D** New primitive — focus-shifter | (unblocks A's S11/S14) | `lib/input.ts` | xdotool / ydotool focus-stealing helper. Build only if S11/S14 worth shipping this session |
| **E** Mock-then-call extension | T24 (mock form) | `code-tab-handoff.md` (T24) | Mirror of T25's pattern but for `shell.openExternal` — handler reaches the egress with the right URL |
| **A** Focus-shifter primitive + S11/S14 | (lib/input.ts) + S11, S14 | `shortcuts-and-input.md` (S11, S14) | One PR builds the primitive (`focusOtherWindow()`), a second PR ships both runners |
| **B** T35 — MCP server config picked up | T35 | `extensibility.md` (T35) | Place fixture `~/.claude.json` + `<project>/.mcp.json` under isolation; assert on parsed-state readback. Risky — the parsed-state target may be a closure-local (same blocker as T37b/S19/S28) |
| **C** Deferred items audit | various | — | Re-walk session 1/2/3 deferrals; pick anything that's now tractable given the eipc finding + electron-mocks split |

Realistic ceiling: **~6-8 new specs** this session. Don't try all
13 — Categories A and B are heavier than session 2's mix because
they need either a Code-tab session opened OR a new primitive built
first.
#### Category A — focus-shifter primitive (3 specs)

### Detailed scope per category
- **`lib/input.ts:focusOtherWindow()`.** Build the primitive first.
  - **X11 path:** `xdotool search --name '<test-marker>' windowfocus`
    or similar. xdotool is available on most rows.
  - **Wayland path:** No portable focus injection. Skip cleanly per
    row gate. KDE-W might allow `kwin_x11`-class hacks but those are
    not portable.
  - Verify by spawning a marker window (e.g. a `xterm -title
    '<marker>'` background process), focusing it, then asserting
    `xprop -root _NET_ACTIVE_WINDOW` returns its WID.
- **S11** — Quick Entry shortcut fires from any focus.
  Launch app → focus marker window → fire `Ctrl+Alt+Space` via
  ydotool → assert popup appears (existing primitives).
  Row gate: GNOME-W, Ubu-W (mutter XWayland key-grab story is the
  load-bearing context). Currently broken on GNOME-W per #404; this
  runner is a regression detector.
- **S14** — Global shortcuts via XDG portal work on Niri.
  Same shape as S11. Row gate: Niri. Currently fails per case-doc.
  Reframe possible: assert `--enable-features=GlobalShortcutsPortal`
  is in argv (this is what S12 already does). The DELIVERY-side
test needs the focus-shifter primitive.
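
One possible shape for the X11 path of the primitive, with the marker-window verification folded in, is sketched below. Several things here are assumptions rather than the planned implementation: the `Sketch` suffix and the `FocusResult` type are hypothetical, the Wayland gate reads `XDG_SESSION_TYPE` instead of the harness's row gate, and `xdotool windowactivate --sync` is swapped in for the `windowfocus` form quoted above because activation is the variant expected to update `_NET_ACTIVE_WINDOW` on most window managers.

```typescript
import { execFileSync } from 'node:child_process';

type FocusResult = 'focused' | 'not-focused' | 'skipped-wayland';

// Parse `xprop -root _NET_ACTIVE_WINDOW` output, which looks like
// "_NET_ACTIVE_WINDOW(WINDOW): window id # 0x3e00007".
function parseActiveWindowId(xpropOutput: string): number | null {
  const match = /window id # (0x[0-9a-fA-F]+)/.exec(xpropOutput);
  return match ? parseInt(match[1], 16) : null;
}

function focusOtherWindowSketch(
  marker: string,
  sessionType: string | undefined = process.env.XDG_SESSION_TYPE,
): FocusResult {
  // No portable focus injection on Wayland: report a clean skip.
  if (sessionType === 'wayland') return 'skipped-wayland';

  const run = (cmd: string, args: string[]): string =>
    execFileSync(cmd, args, { encoding: 'utf8' });

  // `xdotool search` prints one decimal window id per line; take the first.
  // If nothing matches, expect either a thrown error or a NaN id here.
  const firstLine = run('xdotool', ['search', '--name', marker]).trim().split('\n')[0];
  const wid = Number(firstLine);

  run('xdotool', ['windowactivate', '--sync', String(wid)]);

  // Verify against the window manager's own idea of the active window.
  const active = parseActiveWindowId(run('xprop', ['-root', '_NET_ACTIVE_WINDOW']));
  return active === wid ? 'focused' : 'not-focused';
}
```

Returning a three-way result rather than a boolean lets a runner distinguish "skip this row" from "the focus shift genuinely failed", which matters for S11 acting as a regression detector.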

#### Category A — deferred items (5)
#### Category B — T35 MCP server config (1 spec)

- **T31 — Side chat opens.** Needs: `seedFromHost` + Code-tab
  session OPEN (env pill → Local → choose folder → wait for session
  load). After session loads, send `Ctrl+;` via ydotool OR find the
  IPC handler `startSideChat` in `ipcMain._invokeHandlers` (T38
  pattern) and assert it's registered + invokable. The lighter form
  is the IPC-registry probe; the heavier form is full
  click-chain-into-side-chat.
- **T32 — Slash command menu.** Needs: same Code-tab session OPEN
  preamble as T31. Then trigger `/` in the prompt textarea and
  assert the slash menu renders (AX-tree query for menuitem* nodes
  in the prompt area). Heavier than T31 because the slash menu is
  rendered server-side by claude.ai's bundle.
- **S06 — URL handler segfault on native Wayland.** Needs:
  `CLAUDE_HARNESS_USE_WAYLAND=1` row + `coredumpctl info
  claude-desktop` observation after firing
  `xdg-open 'claude://chat/new'`. Skip cleanly if not on a
  Wayland row.
- **S11 / S14 — focus-shifter delivery.** Needs: `lib/input.ts`
  with `focusOtherWindow()` (xdotool on X11; skip on Wayland or
  use compositor-specific). Then S11 / S14 launch app, focus
  another window, fire shortcut, assert popup appears. Build the
  primitive in one PR (Category D), then both runners in a second
  PR.
T35 case-doc anchors at `:215418` (Code-tab loads
`<project>/.mcp.json`), `:176766` (`~/.claude.json` reader), `:489098`
(Code-session passes `settingSources: ["user","project","local"]` to
agent SDK), `:130821` (`claude_desktop_config.json` is chat-tab path
constant — separate userData dir per `:130829` `kee()`).

#### Category B — Tier 3 → Tier 2 reframes (3)
**Phase 1 (cheap, ship today): asar separation fingerprint.** Assert:

These each have a slice that doesn't write to the user's real
account:
1. `claude_desktop_config.json` string is in `index.js`. (Chat-tab
   MCP path constant — load-bearing for the per-tab separation.)
2. `kee()` resolution path: assert the userData-dir resolver is
   present.
3. The strings `~/.claude.json` and `.mcp.json` are in `index.js`.
   (Code-tab MCP loaders.)

- **T22 — PR monitoring (read-only half).** The Tier 3 form opens a
  PR; the Tier 2 reframe is "after `seedFromHost`, IPC handler
  `getPrChecks` is registered + the `gh CLI not found in PATH`
  string is in the bundle". The handler-registered probe is the
  shippable form; the missing-`gh` warning string is a static
  fingerprint. Both ship as one runner.
- **T35 — MCP server config picked up.** Reframe: place a fixture
  `claude_desktop_config.json` under the isolation's configDir
  (no host config touch needed — fresh isolation), then via
  inspector eval, read whatever main-process state holds the
  parsed MCP server list. Anchor on a known path under
  `${configDir}/Claude/`.
- **T37 — `CLAUDE.md` memory loads.** Reframe: place a fixture
  `~/.claude/CLAUDE.md` (or under `CLAUDE_CONFIG_DIR/CLAUDE.md`
  with extraEnv override — see S19's pattern), then via inspector
  eval read the loaded memory state. Anchor needs to come from
  case-doc Code anchors.
This Tier 1 form pins the wiring without needing a launch. It does
NOT verify "the MCP server actually starts when a Code session
opens" — that's the full Tier 3 form, needs login + a Code-tab
session OPEN + an MCP server fixture.
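
The string-presence half of Phase 1 could be expressed as a small pure check over the extracted `index.js` text, roughly as below. It is a sketch under assumptions: `checkMcpSeparation` and `SeparationReport` are hypothetical names, the needles are copied from the list above without having been confirmed against the minified bundle, and the `kee()` resolver assertion is left out because a minified identifier is not a stable anchor.

```typescript
interface SeparationReport {
  chatTabPresent: boolean;   // chat-tab MCP path constant found
  codeTabMissing: string[];  // code-tab loader strings NOT found
  ok: boolean;               // true only when both sides are pinned
}

// Needles as listed in the plan; unverified against the installed asar.
const CHAT_TAB_NEEDLE = 'claude_desktop_config.json';
const CODE_TAB_NEEDLES = ['~/.claude.json', '.mcp.json'];

function checkMcpSeparation(
  bundle: string,
  codeTabNeedles: string[] = CODE_TAB_NEEDLES,
): SeparationReport {
  const chatTabPresent = bundle.includes(CHAT_TAB_NEEDLE);
  const codeTabMissing = codeTabNeedles.filter((needle) => !bundle.includes(needle));
  return {
    chatTabPresent,
    codeTabMissing,
    ok: chatTabPresent && codeTabMissing.length === 0,
  };
}
```

Reporting the missing needles by name, instead of a bare boolean, makes a failing run say which side of the per-tab separation drifted.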

#### Category C — asar fingerprint cleanups (3)
**Phase 2 (risky, do only if Phase 1 lands and budget allows):
fixture-then-readback Tier 2.** Place a fixture
`<isolationDir>/Claude/claude_desktop_config.json` containing a
synthetic `mcpServers` entry. Launch with `seedFromHost: true` (so
the renderer is authenticated) + extraEnv override pointing the
chat-tab loader at the isolationDir. Try inspector-eval to read the
parsed MCP server list. **STOP AND REPORT** if the parsed-state
target is a closure-local (same blocker as T37b/S19/S28). Don't
ship a stub.

Each is Tier 1 / no launch:
#### Category C — deferred items audit

- **T24 — Open in external editor (asar fingerprint).** The full
  click-chain T24 is Tier 3. The fingerprint half: assert `Mtt`
  registry is in `index.js` with the editor scheme strings
  (`vscode://`, `cursor://`, `zed://`, `windsurf://`).
- **T30 — Auto-archive on PR merge (cadence constants).** Static
  fingerprint of the sweep cadence — assert `300_000` (5 min)
  and `3_600_000` (1 h) appear near the auto-archive code.
- **T33 — Plugin browser (IPC handler registered).** Same shape
  as T38 — assert `listMarketplaces` IPC handler is registered.
Walk through session 1/2/3 deferrals and identify any that are now
tractable given session 3's findings. Specifically:

#### Category D — primitive build

- **`lib/input.ts:focusOtherWindow()`.** xdotool on X11
  (`xdotool search --name '<test-marker>' windowfocus`); on
  Wayland skip cleanly (no portable focus injection). Used by
  S11 / S14. Don't build unless those are in scope this session.

#### Category E — mock-then-call extension

- **T24 (mock form, alternative to Category C).** Mock
  `shell.openExternal` via a new `installOpenExternalMock` in
  `lib/claudeai.ts` (mirror of `installShowItemInFolderMock`),
  then `inspector.evalInMain` calls
  `shell.openExternal('vscode://file/tmp/test')` and assert
  the recorded call list contains the URL. Strictly stronger
  than the Category C fingerprint form. **Pick C OR E for T24
  — not both.**

### Why this iteration

The harness is at 50/76 coverage and every release tag now
exercises the smoke-set + a chunk of critical surfaces
automatically. Remaining work clusters in three pockets:

- **Code-tab cluster (T15-T39, mostly login-walled).** Session 2
  unblocked the *render-only* half via `seedFromHost`; session 3
  should push into the *open-a-session* half (T31, T32) and the
  read-only-Tier-3-reframes (T22, T35, T37).
- **Wayland-specific tests (S06).** Need a Wayland row + the
  harness's `CLAUDE_HARNESS_USE_WAYLAND=1` switch.
- **Focus-shift-dependent tests (S11, S14).** Need
  `lib/input.ts:focusOtherWindow()` built first.

After this session, future sessions can focus on the genuinely
heavy Tier 3 work (destructive-write login tests; multi-launch
state) with a clearer cost model.

### Known mechanism-recipe table (session 1 + session 2)

| Pattern | Use when | Worked example |
|---|---|---|
| `createIsolation({ seedFromHost: true })` | spec needs a signed-in renderer; read-only | T07, T16, T26 |
| `isolation: null` + pre-launch `killHostClaude()` | spec needs SingletonLock collision (delivery probes) | T05 |
| Default isolation | most other tests | T01, T03, T04, S29 |
| `isolation: <handle>` (shared across launches) | multi-launch persistent state | S35, S25 |
| `MainWindow.setState('close')` | exercise the wrapper close-interceptor | T08 |
| Mock-then-call for `shell.*` / `dialog.*` | Tier 2 reframe of side-effecting API | T17, T25 |
| `ipcMain._invokeHandlers` registry probe | "is this IPC handler wired up" | T38 |
| `dbus-monitor` subprocess | observe DBus method calls TO a destination | T23 |
| Asar single-regex multi-string-OR fingerprint | classifier-style code that combines several strings | S28 |
- **S20** — `powerSaveBlocker` Inhibit. Issue #569 still open; not
  this session.
- **T18** — drag-drop. X11 path is Tier 3 with xdotool drag. Wayland
  blocked until libei. Not this session.
- **T34** — OAuth round-trip. Hard to mock; not this session.
- **eipc-registry exposer (primitive gap)**. If you're feeling
  ambitious, reverse-engineer the eipc bootstrap and find a way to
  expose the channel→handler registry from main. Would unblock
  proper Tier 2 runtime probes for T22/T31/T33/T38. **High-risk,
  high-reward.** Likely involves walking the bundled `index.js`
  near `:68820` (`le(i)` origin validation) and `:68816` (channel
  framing) and identifying a stable handle. If the registry is
truly closure-local with no exposed surface, abort and document.
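
Whatever handle the exposer ends up finding, matching on channel names will have to tolerate the UUID segment of the `$eipc_message$_<UUID>_$_claude.web_$_<name>` framing. A minimal sketch of that split follows; the function names and the `EipcChannel` type are hypothetical, and the example channel used in the usage note is illustrative, not copied from the bundle.

```typescript
interface EipcChannel {
  uuid: string; // prefix segment, kept only as a diagnostic
  name: string; // everything after `claude.web_$_`, the part worth anchoring on
}

// Lazy first group so the UUID stops at the first `_$_claude.web_$_`;
// the name may itself contain `_$_` separators.
const EIPC_CHANNEL_RE = /^\$eipc_message\$_(.+?)_\$_claude\.web_\$_(.+)$/;

function parseEipcChannel(channel: string): EipcChannel | null {
  const match = EIPC_CHANNEL_RE.exec(channel);
  return match ? { uuid: match[1], name: match[2] } : null;
}

// Suffix anchor: true when the channel ends in `_$_<name>`, whatever the UUID.
function hasChannelSuffix(channel: string, name: string): boolean {
  return channel.endsWith(`_$_${name}`);
}
```

For a channel such as `$eipc_message$_<UUID>_$_claude.web_$_LocalSessions_$_getPrChecks`, `parseEipcChannel` would report the name as `LocalSessions_$_getPrChecks`, and `hasChannelSuffix(channel, 'getPrChecks')` would hold across UUID rotation.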

### Constraints to respect (don't violate)

These are unchanged from sessions 1 and 2 and still load-bearing:
These are unchanged from sessions 1/2/3 and still load-bearing:

- **Default isolation** unless the spec needs otherwise. Never
  write to `~/.config/Claude` without explicit gating
  (`CLAUDE_TEST_USE_HOST_CONFIG=1` opt-out, OR `seedFromHost: true`
  with read-only-then-discard semantics, OR an explicit comment
  documenting why).
- **Default isolation** unless the spec needs otherwise. Use
  `seedFromHost: true` for any test that depends on authenticated
  renderer state — never assume default isolation gets past
  `/login`. T16/T26 are the templates.
- **Don't introspect `ipcMain._invokeHandlers` for `claude.web`
  channels.** Session 3 confirmed those use a custom eipc protocol
  not in the standard registry. T22/T31/T33/T38 are now Tier 1
  fingerprints. If you build the eipc-registry exposer (Category
  C), update the plan-doc and this prompt accordingly.
- **CDP auth gate is alive** — runtime SIGUSR1 attach via
  `app.attachInspector()`, never Playwright's `_electron.launch()`
  or `chromium.connectOverCDP()`.
- **BrowserWindow Proxy gotcha** — use `webContents.getAllWebContents()`
  not `BrowserWindow.getAllWindows()`. Constructor-level wraps
  don't work; use prototype-method hooks.
- **BrowserWindow Proxy gotcha** — use
  `webContents.getAllWebContents()` not `BrowserWindow.getAllWindows()`.
  Constructor-level wraps don't work; use prototype-method hooks.
- **`skipUnlessRow()` always first.** First line of every `test()`
  body when the test is row-gated.
- **No fixed sleeps.** `retryUntil` from `lib/retry.ts`, or
@@ -279,22 +258,30 @@ These are unchanged from sessions 1 and 2 and still load-bearing:
- **Don't break existing runners.** `npm run typecheck` must stay
  clean. H01-H05 are the canaries; `npm test` must still pass them
  after every commit.
- **For mock-then-call: leading comment must document why mock
  beats invoke** (T25's leading comment is the worked example —
  three short paragraphs).
- **Always grep the installed asar** to verify a fingerprint string
  is present (and how often) BEFORE shipping. Build-reference is
  beautified — strings differ from the minified bundle. Use
  `node -e "const {extractFile}=require('@electron/asar'); ..."`
  from inside `tools/test-harness` (where `@electron/asar` is on
  the require path).
- **For mock-then-call: the helper goes in `lib/electron-mocks.ts`,
  not `lib/claudeai.ts`.** Session 3 extracted them. The pattern is
documented in T24/T25's leading comments.
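
The "grep the installed asar, and count how often" step could be wrapped as a small helper along these lines. It is a sketch, not harness code: `countOccurrences` and `countInAsar` are hypothetical names, the inner file path is left as a parameter because the location of `index.js` inside the archive is not stated here, and the dynamic import is only there so the pure counting function stays usable where `@electron/asar` is not installed.

```typescript
// Count non-overlapping occurrences of a literal needle.
function countOccurrences(haystack: string, needle: string): number {
  if (needle.length === 0) return 0;
  let count = 0;
  let from = haystack.indexOf(needle);
  while (from !== -1) {
    count += 1;
    from = haystack.indexOf(needle, from + needle.length);
  }
  return count;
}

// Extract one file from the archive and report how often each needle occurs.
// `innerFile` is the path of the bundle inside the asar (caller-supplied).
async function countInAsar(
  asarPath: string,
  innerFile: string,
  needles: string[],
): Promise<Record<string, number>> {
  const moduleName = '@electron/asar';
  const mod: any = await import(moduleName);
  // Cover both named-export and default-export interop shapes.
  const extractFile = mod.extractFile ?? mod.default?.extractFile;
  const text: string = extractFile(asarPath, innerFile).toString('utf8');

  const counts: Record<string, number> = {};
  for (const needle of needles) {
    counts[needle] = countOccurrences(text, needle);
  }
  return counts;
}
```

A count of zero means the string came from the beautified build-reference and needs re-deriving from the minified bundle; a count above one means the fingerprint needs a tighter anchor before it ships.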

### Phases

#### Phase 0 — calibration

1. `cd tools/test-harness && npm run typecheck` — should pass.
2. Read the plan doc's "Status (post-execution)" session 2 section,
   then read T25's leading comment (the mock-then-call pattern's
   worked-example doc) and T16/T26 (the seedFromHost-then-AX-nav
   pattern). Confirm you understand both.
3. Pick one Category B candidate (suggest T22 — read-only half) and
   sketch the runner shape mentally. Don't write it yet — confirm
   you can plan from the spec.
2. Read the plan doc's "Status (post-execution)" session 3 section,
   then read T22's leading comment (the eipc-registry finding's
   worked-example doc) and T24's leading comment (the
   `Promise<boolean>` mock-then-call variant). Confirm you
   understand both.
3. Pick one candidate from any category and sketch the runner shape
   mentally. Don't write it yet — confirm you can plan from the spec.
   Verify any fingerprint strings exist in the installed asar before
   committing to them.

If Phase 0 surfaces a problem (typecheck failing, primitives
unclear, patterns not understood), stop and report. Don't fan out.
@@ -304,34 +291,23 @@ unclear, patterns not understood), stop and report. Don't fan out.
Spawn parallel subagents (cap at 6 in flight) for the highest-
confidence candidates first.

**Suggested initial batch (4-5 specs):**
**Suggested initial batch (~3-4 specs):**

- **B / T22 — PR monitoring read-only half.** seedFromHost +
  `ipcMain._invokeHandlers` for `getPrChecks` + asar fingerprint
  for the missing-`gh` warning string.
- **C / T24 (asar fingerprint OR mock form, pick one).** If you
  want stronger coverage, go mock form (Category E shape — mirror
  of T25's `installOpenExternalMock` helper). If you want a quick
  Tier 1, go asar fingerprint (`Mtt` registry + scheme strings).
- **C / T30 — Auto-archive cadence constants.** Pure asar probe.
- **C / T33 — Plugin browser handler registered.** T38 pattern.
- **B / T35 — MCP server config picked up.** Fixture under
  isolation configDir + inspector eval.
- **A / `lib/input.ts:focusOtherWindow()` primitive.** Build the
  X11 path with xdotool, skip cleanly on Wayland. Verify with a
  marker-window round-trip.
- **B / T35 Phase 1 — MCP separation fingerprints.** Pure asar
  probe; load-bearing strings only.
- (Hold A/S11 + A/S14 for batch 2 — they depend on the primitive
  landing.)

If those land cleanly, dispatch a second batch:
If those land cleanly, dispatch batch 2:

- **A / T31 — Side chat opens (handler-registered shape).**
  seedFromHost + `ipcMain._invokeHandlers` for `startSideChat`
  / `sendSideChatMessage` / `stopSideChat`. The lighter probe.
- **A / T32 — Slash command menu (asar fingerprint).** The full
  AX-tree form needs a Code-tab session open AND server-side
  rendered menu — heavy. The fingerprint form: `getSupportedCommands`
  + `slashCommands` schema present in `index.js`.
- **A / S11 / S14 (only if Category D primitive is built first).**
  Build `lib/input.ts:focusOtherWindow()` in PR 1; ship S11/S14
  in PR 2.
- **B / T37 — CLAUDE.md memory loads.** Fixture file + inspector
  eval against the loaded memory state.
- **A / S11** — Quick Entry shortcut from any focus.
- **A / S14** — Global shortcuts via XDG portal on Niri.
- **B / T35 Phase 2** — fixture-then-readback (only if Phase 1
  lands AND a reachable readback target is found; STOP AND REPORT
  otherwise).

#### Per-subagent prompt shape

@@ -350,8 +326,8 @@ Read in order:
Write tools/test-harness/src/runners/<TEST-ID>_short_name.spec.ts.

[per-test specifics: pattern (seedFromHost / mock-then-call /
ipcMain._invokeHandlers / asar fingerprint / shared isolation),
assertion shape, skip rules, key constraint warnings]
asar fingerprint / shared isolation), assertion shape, skip rules,
key constraint warnings]

Constraints:
- Tabs, ~80-char wrap.
@@ -367,9 +343,10 @@ If the test isn't reasonable to implement (anchors don't resolve
to anything assertable, the test depends on state you can't
construct, the existing primitives don't cover the surface), DO
NOT write a stub. Report under Open questions and stop. Sessions
1 and 2 had cumulative ~6 "stop and report" outcomes that were
1, 2, and 3 had cumulative ~8 "stop and report" outcomes that were
the right call (S20 deferral, T05 reshape, T07 needs seedFromHost,
T08 needs setState('close'), S28 reclassification, T38 framing).
T08 needs setState('close'), S28 reclassification, T38 framing,
session-3 eipc-registry finding, T37 fixture-readback deferral).

Report shape (~150 words):
## <TEST-ID> runner
@@ -389,9 +366,11 @@ After fan-out returns:
1. `cd tools/test-harness && npm run typecheck` — must stay clean.
2. Run the new runners against KDE-W (the dev box) — but flag the
   user first if any are destructive (seedFromHost kills running
   Claude; T31/T32 require an open Code-tab session that may
   accumulate state). Capture pass/skip/fail per spec for the
   matrix.
   Claude). **CRITICAL:** Test that any spec depending on
   authenticated state actually uses `seedFromHost: true` — session
   3 shipped specs with default isolation that needed
   authentication, masking the eipc-registry finding for several
   iterations. Capture pass/skip/fail per spec for the matrix.
3. Update [`docs/testing/runner-implementation-plan.md`](runner-implementation-plan.md)
   "Status (post-execution)" section to reflect newly-shipped
   specs and any reclassifications discovered mid-flight.
@@ -400,8 +379,8 @@ After fan-out returns:
5. Write a final report listing:
   - Specs landed (pass / skip / needs-tuning per row)
   - Specs deferred (with the per-test rationale)
   - Specs reclassified (Tier 3 → Tier 2, Tier 2 → blocked, etc.)
   - Updated coverage stat (was 50/76 = 66%, now N/76 = M%)
   - Specs reclassified (Tier 3 → Tier 2, Tier 2 → Tier 1, etc.)
   - Updated coverage stat (was 57/76 = 75%, now N/76 = M%)
6. Don't commit. The user reviews and commits.
7. Rotate this prompt: rewrite
   `docs/testing/runner-implementation-followup-prompt.md` for the
@@ -409,7 +388,7 @@ After fan-out returns:

### Self-correction loop

Same as sessions 1 and 2:
Same as sessions 1, 2, and 3:

1. Subagent typecheck failure → re-spawn with explicit fix
   instruction.
@@ -417,6 +396,13 @@ Same as sessions 1 and 2:
   file → re-spawn with explicit "use the Write tool" instruction.
3. Two subagents wrote runners that share a primitive but with
   different shapes → factor into `lib/<topic>.ts` BEFORE shipping.
4. **NEW for session 4:** Spec passes locally but the assertion is
   actually trivial (e.g. an unauthenticated launch where the
   handler check vacuously passes because no handlers are
   registered) → re-examine the assertion shape. Session 3's eipc-
   registry finding came from running the specs and finding only
   3 handlers in the registry; the lesson is to verify the
   assertion is meaningful, not just that it passes.
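
Item 4 can be made mechanical with a guard that refuses to trust a lookup when the registry it inspects is suspiciously small. The sketch below is illustrative only: `probeChannel`, `ProbeOutcome`, and the `minRegistrySize` floor are hypothetical names rather than harness APIs, and the channel list is assumed to have been read out of the running app beforehand.

```typescript
// Hypothetical guard, not part of the harness: classify a handler lookup
// so an under-populated registry is reported instead of silently trusted.
type ProbeOutcome =
	| { kind: 'found'; channel: string }
	| { kind: 'missing' }
	| { kind: 'registry-too-small'; size: number };

function probeChannel(
	registeredChannels: readonly string[],
	channelSuffix: string,
	minRegistrySize: number, // assumption: the caller picks a plausible floor
): ProbeOutcome {
	// A near-empty registry usually means the probe is looking at the
	// wrong surface, or the launch never reached the expected state.
	if (registeredChannels.length < minRegistrySize) {
		return { kind: 'registry-too-small', size: registeredChannels.length };
	}
	const channel = registeredChannels.find((c) => c.endsWith(channelSuffix));
	return channel === undefined
		? { kind: 'missing' }
		: { kind: 'found', channel };
}
```

Fed the three MCP-bridge channel names session 3 observed and a floor of 5, the guard reports `registry-too-small` instead of a pass or a plain miss, which is the signal that the assertion shape needs another look.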

Cap re-spawns at 2 per file. Past that, mark as needing human
review and move on.
@@ -425,52 +411,51 @@ review and move on.

Stop and write the final report when one of:

1. **All Category A + B + C target specs landed and typecheck-clean.**
1. **All Category A + B target specs landed and typecheck-clean.**
   Write coverage update, stop.
2. **Hit re-spawn cap on 3+ runners.** Stop, write up which are
   blocked.
3. **Discovered a primitive gap that breaks 5+ Tier 2/Tier 3
   tests.** Stop, propose where the new primitive should live in
   `lib/`. Future session adds the primitive first, then resumes.
4. **Session budget hits ~7 new specs.** Stop, synthesize, leave
4. **Session budget hits ~5 new specs.** Stop, synthesize, leave
   the rest for the next session.

### What you should NOT do

- **Don't try to land Category A + B + C in one batch.** That's
  ~9-11 specs. Pick the highest-confidence subset for the first
  batch and decide whether to do more based on what came back.
- **Don't try to land Category A + B + C in one batch.** Pick the
  highest-confidence subset for the first batch.
- **Don't ship stubs.** If a runner can't actually assert what the
  spec says, mark it as Tier 3 / blocked / primitive-gap and don't
  write a placeholder. The cumulative six "stop and report"
  outcomes from sessions 1+2 were the right call — every one
  write a placeholder. The cumulative eight "stop and report"
  outcomes from sessions 1/2/3 were the right call — every one
  revealed a real constraint.
- **Don't break existing runners.** H01-H05 are the canaries.
- **Don't pre-extract `lib/electron-mocks.ts`.** The
  `installShowItemInFolderMock` + `installOpenDialogMock` pair
  doesn't yet justify a new file; if T24 ships as Category E
  (mock form), THAT's the third — extract then.
- **Don't restructure `lib/`** beyond targeted additions.
  Premature abstractions are wrong abstractions.
  Premature abstractions are wrong abstractions. `electron-mocks.ts`
  was extracted in session 3 once the third helper landed —
  threshold-driven, not speculative.
- **Don't run destructive Tier 3 tests** that write to the user's
  real claude.ai account (T22 PR write, T27 scheduling, T29
  worktree creation, T34 OAuth, T36 hooks). Only the *read-only
  reframes* of those are in scope this session.
  reframes* of those are in scope.
- **Don't introspect `ipcMain._invokeHandlers` for `claude.web`
  eipc channels.** Confirmed broken in session 3. If you need
  runtime IPC verification for those channels, the eipc-registry
  exposer is the primitive gap to land first.
- **Don't implement the #569 power-inhibit patch in this session.**
  That's a separate workstream. The S20 spec follows the patch,
  not the other way around.
  That's a separate workstream.
- **Don't commit.** The user reviews and commits.

### Final report format

```markdown
## Runner implementation summary (session 3)
## Runner implementation summary (session 4)

- Category A landed: N / 5
- Category B landed: N / 3
- Category C landed: N / 3
- Category A landed: N / 3 (focus-shifter primitive + S11 + S14)
- Category B landed: N / 1-2 (T35 Phase 1 + maybe Phase 2)
- Reclassified mid-flight: N (with reasons)
- Coverage: was 50/76 (66%), now <NEW>/76 (<PCT>%)
- Coverage: was 57/76 (75%), now <NEW>/76 (<PCT>%)
- Typecheck: clean | <errors>
- KDE-W test run: <pass/skip/fail counts>

@@ -478,7 +463,7 @@ Stop and write the final report when one of:

| Cat | Test ID | File | Assertion shape | Status |
|---|---|---|---|---|
| B | T22 | T22_pr_monitoring_handler.spec.ts | seedFromHost + IPC handler probe + asar fingerprint | ✓ pass |
| A | S11 | S11_quick_entry_from_other_focus.spec.ts | … | ✓ pass |
| ... |

## Notable findings
@@ -514,8 +499,24 @@ git diff --stat
  the right substrate — see `T17_folder_picker.spec.ts` for the
  end-to-end example. Don't query DOM by CSS selector unless
  `claudeai.ts` doesn't already cover the surface.
- For mock-then-call: see T25 for the canonical pattern. Mock
  installation is in `lib/claudeai.ts` alongside the dialog-mock;
  add a sibling export, don't pre-extract a new file.
- For mock-then-call: helpers live in `lib/electron-mocks.ts` (not
  `claudeai.ts` anymore — extracted in session 3). See T24's
  leading comment for the `Promise<boolean>` variant + T25's for
  the void variant.
- **For asar fingerprints: ALWAYS grep the installed asar first.**
  Build-reference is beautified; the bundle is minified.
```bash
# The script is double-quoted for the shell, so write any literal $ in a
# needle as \$ (channel names such as LocalSessions_$_... contain one).
cd tools/test-harness && node -e "
  const {extractFile} = require('@electron/asar');
  const buf = extractFile(
    '/usr/lib/claude-desktop/node_modules/electron/dist/resources/app.asar',
    '.vite/build/index.js'
  );
  const s = buf.toString('utf8');
  for (const k of ['<your-needle>', '<another>']) {
    console.log(k, '->', s.split(k).length - 1);
  }
"
```
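
For the mock-then-call note above, the general shape can be sketched without Electron at all. This is a hedged illustration, not the helper in `lib/electron-mocks.ts`: `ShellLike`, `OpenExternalMock`, and `sketchOpenExternalMock` are hypothetical names, and the `Promise<boolean>` return type simply follows how these notes describe the mocked call.

```typescript
// Stand-in for the single method of Electron's `shell` module that the
// sketch patches, so the example runs without Electron installed.
interface ShellLike {
	openExternal(url: string): Promise<boolean>;
}

interface OpenExternalMock {
	calls: string[];
	restore(): void;
}

// Hypothetical sketch; the real helper's signature may differ.
function sketchOpenExternalMock(shell: ShellLike): OpenExternalMock {
	const original = shell.openExternal;
	const calls: string[] = [];
	shell.openExternal = (url: string): Promise<boolean> => {
		calls.push(url); // record the URL verbatim, launch nothing
		return Promise.resolve(true); // resolved Promise, not void
	};
	return {
		calls,
		restore: () => {
			shell.openExternal = original;
		},
	};
}
```

A runner following this shape would install the mock, trigger the egress, assert on `calls`, and call `restore()` in teardown so later specs see the unpatched method.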

Begin with Phase 0. Don't fan out until calibration succeeds.
@@ -18,6 +18,110 @@ work begins.

## Status (post-execution)

**Shipped session 3 (7 new specs):** T22, T24, T30, T31, T32, T33, T37.
Coverage moved from 50/76 (66%) to 57/76 (75%).

Session 3 findings + reclassifications:

- **eipc-registry finding (load-bearing — corrects session 2 T38).** The
  `LocalSessions_$_*` and `CustomPlugins_$_*` channels named in case-doc
  Code anchors (`:68816` framing comment, `:71392` listMarketplaces, etc.)
  do **not** register through Electron's standard `ipcMain.handle()`
  registry. KDE-W run revealed `ipcMain._invokeHandlers` holds only three
  chat-tab MCP-bridge handlers (`list-mcp-servers`,
  `connect-to-mcp-server`, `request-open-mcp-settings`) regardless of
  ready level (`mainVisible` / `claudeAi` / `userLoaded`) and regardless
  of whether the launch was hermetic (default isolation) or authenticated
  (`createIsolation({ seedFromHost: true })`). Confirmed via inspector
  walk of `globalThis` — no Map containing 5+ keys with the
  `LocalSessions_$_*` shape exists at any reachable surface. The custom
  `$eipc_message$_<UUID>_$_claude.web_$_<name>` protocol uses a closure-
  local message-port registry that's not introspectable from main without
  reverse-engineering the eipc bootstrap (deferred — same gotcha as
  session 2's S28 with `Sbn()`).
- **T38 reclassified from Tier 2 → Tier 1.** Session 2 shipped T38 as a
  `ipcMain._invokeHandlers` introspection probe assuming the channel
  registered through stdlib IPC; the eipc-registry finding above shows
  that probe never resolved a real handler. Reclassified to a Tier 1
  asar fingerprint asserting the channel-name string
  `LocalSessions_$_openInEditor` is present in bundled `index.js` (case-
  doc anchor `:68816` framing / `:464011` egress). Same drift signal,
  zero false-positive surface, no launch needed. Updated leading
  comment links the eipc-registry finding for future maintainers.
- **T22, T31, T33 shipped as Tier 1 fingerprints, not Tier 2 IPC
  probes.** All three were originally drafted using the (now-known-broken)
  T38 handler-registered pattern. After the eipc-registry finding,
  rewritten as pure asar fingerprints anchoring on the eipc channel-name
  strings:
  - **T22** asserts `LocalSessions_$_getPrChecks` *and* the
    `"gh CLI not found in PATH"` throw site (case-doc anchors
    `:464281` / `:464964` / `:464368`). Two-fingerprint runner; the
    missing-`gh` string is the Linux-specific UX backstop since
    `installGh()` is macOS-only.
  - **T31** asserts the side-chat trio: `LocalSessions_$_startSideChat`,
    `LocalSessions_$_sendSideChatMessage`, `LocalSessions_$_stopSideChat`
    (case-doc anchors `:487025` / `:487265`). Trio is load-bearing —
    side chat is broken without all three.
  - **T33** asserts `CustomPlugins_$_listMarketplaces` and
    `CustomPlugins_$_listAvailablePlugins` (case-doc anchors `:71392` /
    `:71534` / `:507176`). Both load-bearing for the plugin-browser
    populate flow.
- **T24 shipped as Tier 2 mock-then-call (Category E pattern).**
  Mirrors T25's `installShowItemInFolderMock` shape; new
  `installOpenExternalMock` helper records every `shell.openExternal`
  call without launching a real editor on the host. Strictly stronger
  than the asar-fingerprint alternative (Category C / `Mtt` registry
  fingerprint) — exercises the actual egress at index.js:464011 with
  the URL flowing through verbatim. The meaningful difference from T25:
  `shell.openExternal` returns `Promise<boolean>` (not void), so the
  mock returns a resolved Promise.
- **T30 sweep cadence regex tuned to minified bundle.** Case-doc names
  the constants as `300_000` / `3_600_000` (beautified form); installed
  asar has them as `300*1e3` / `3600*1e3`. Single regex with two
  proximity windows — tail of `300*1e3` to `3600*1e3` ≤ 200 chars,
  tail of `3600*1e3` to `AutoArchiveEngine` ≤ 3000 chars — confirmed
  to match exactly once globally. Followed by an `.includes()` check
  for `ccAutoArchiveOnPrClose` inside the captured window to colocate
  the gate key.
- **T37 fixture-readback form deferred — Tier 1 fingerprint shipped.**
  Session prompt suggested placing a fixture `~/.claude/CLAUDE.md` and
  inspector-eval'ing the loaded memory state. The parsed-memory state
  target is a closure-local minified symbol (same gotcha as S28 from
  session 2 / S19's `cE()`/`Tce()` re-implementation note); without a
  reachable readback target the fixture form would assert nothing
  beyond "the spec didn't crash". Shipped as Tier 1 fingerprint
  anchoring on `[GlobalMemory] Copied CLAUDE.md` (single-occurrence
  log line, the cleanest possible anchor) plus `CLAUDE.md` filename
  literal and `CLAUDE_CONFIG_DIR` env-var token.
- **`lib/electron-mocks.ts` extracted.** With T24 landing the third
  mock-then-call helper (after T17's dialog mock and T25's
  showItemInFolder mock), the threshold from the session prompt was
  hit. Moved `installOpenDialogMock` / `installShowItemInFolderMock` /
  `installOpenExternalMock` plus their `getCalls` readers + interfaces
  out of `lib/claudeai.ts` into `lib/electron-mocks.ts`. T17, T24, T25
  imports updated. The mocks are generic Electron module patches —
  not claude.ai-domain — so the new home keeps `claudeai.ts` focused
  on AX-tree page-objects.
- **Authentication state in launch-based specs.** All four launch-
  based specs in this session (T22/T24/T31/T33 originally) ran with
  default isolation, i.e. unauthenticated. After the eipc-registry
  finding the three IPC probes converted to pure file probes (no
  launch needed); T24 (mock-then-call) doesn't depend on auth state
  because `shell.openExternal` is a stdlib Electron module patched
  in main. For future Tier 2 reframes that DO depend on authenticated
  renderer state (e.g. testing claude.ai DOM after login), the
  T16/T26 `seedFromHost: true` pattern is the correct gate.
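
The T30 note in the list above describes one regex with two proximity windows followed by an `.includes()` check. A minimal sketch of one reading of that shape follows; the bounds and strings are taken from the note, while `SWEEP_CADENCE`, `CadenceCheck`, and `checkSweepCadence` are hypothetical names and the shipped spec may structure its windows differently.

```typescript
// Assumption: at most 200 chars between the two constants and at most
// 3000 chars from the second constant to the class name, per the T30 note.
const SWEEP_CADENCE =
	/300\*1e3[\s\S]{0,200}?3600\*1e3[\s\S]{0,3000}?AutoArchiveEngine/g;

interface CadenceCheck {
	matchCount: number;
	gateColocated: boolean;
}

function checkSweepCadence(bundle: string): CadenceCheck {
	const hits = bundle.match(SWEEP_CADENCE) ?? [];
	// Only a unique match is a trustworthy anchor for the gate-key check.
	const only = hits.length === 1 ? hits[0] : undefined;
	return {
		matchCount: hits.length,
		gateColocated:
			only !== undefined && only.includes('ccAutoArchiveOnPrClose'),
	};
}
```

Reporting the match count separately from the gate check keeps the two failure modes apart: a count other than one points at bundle drift in the cadence constants, while a unique match without the gate key points at the setting having moved.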

Tier 2 → Tier 2 candidates remaining for a future session: **S11,
S14** (focus-shifter primitive still unbuilt). **T35** (MCP server
config picked up — needs a reachable readback for parsed MCP server
state, same blocker as T37b/S19/S28). The eipc-registry surface
itself is a primitive gap — landing it would unlock proper Tier 2
runtime probes for T22/T31/T33/T38 and any future `LocalSessions_*`
or `CustomPlugins_*` tests.

---

**Shipped session 2 (10 new specs):** T10, T16, T23, T25, T26, T38, S10,
S19, S25, S28. Coverage moved from 40/76 (53%) to 50/76 (66%).

@@ -7,8 +7,8 @@ architecture, decisions, and rationale.

## Status

Fifty specs wired (14 cross-env T-tests, 31 env-specific S-tests, 5
H-prefix harness self-tests). See
Fifty-seven specs wired (21 cross-env T-tests, 31 env-specific S-tests,
5 H-prefix harness self-tests). See
[`docs/testing/runner-implementation-plan.md`](../../docs/testing/runner-implementation-plan.md)
for the tiered triage of remaining tests and the per-spec rationale
behind tier classification.
@@ -32,10 +32,17 @@ behind tier classification.
| [T14b](../../docs/testing/cases/launch.md#t14--multi-instance-behavior) | Second invocation under same isolation exits cleanly; primary pid stays alive (runtime probe) | spawn delta + pgrep |
| [T16](../../docs/testing/cases/code-tab-foundations.md#t16--code-tab-loads) | After `seedFromHost` + `userLoaded`, `CodeTab.activate()` resolves and ≥1 compact pill renders (env pill = Code-body mounted) | L1 + AX-tree |
| [T17](../../docs/testing/cases/code-tab-foundations.md#t17--folder-picker-opens) | Code df-pill → env pill → Local → Select folder → Open folder triggers `dialog.showOpenDialog` (requires `CLAUDE_TEST_USE_HOST_CONFIG=1`) | L1 |
| [T22](../../docs/testing/cases/code-tab-workflow.md#t22--pr-monitoring-via-gh) | Bundled `index.js` contains `LocalSessions_$_getPrChecks` eipc channel name *and* `gh CLI not found in PATH` Linux-fallthrough throw site (Tier 1 fingerprint — eipc registry not introspectable from main) | file probe |
| [T23](../../docs/testing/cases/code-tab-handoff.md#t23--desktop-notifications-fire) | Firing `new Notification({title})` from main reaches the session bus's `org.freedesktop.Notifications.Notify` (observed via `dbus-monitor`) | L1 + DBus subprocess |
| [T24](../../docs/testing/cases/code-tab-handoff.md#t24--open-in-external-editor) | After `installOpenExternalMock` mirroring T25's pattern, `evalInMain` calls `shell.openExternal('vscode://file/...')`; mock records the URL verbatim, no real editor launch | L1 (mocked egress) |
| [T25](../../docs/testing/cases/code-tab-handoff.md#t25--show-in-files--file-manager) | After `installShowItemInFolderMock` mirroring T17's dialog-mock pattern, `evalInMain` calls `shell.showItemInFolder(<synthetic path>)`; mock records the call verbatim, no throw — no host side effect | L1 (mocked egress) |
| [T26](../../docs/testing/cases/routines.md#t26--routines-page-renders) | After `seedFromHost` + `userLoaded`, click "Routines" sidebar AX button; assert "New routine" / "All" / "Calendar" anchor renders | L1 + AX-tree |
| [T38](../../docs/testing/cases/code-tab-handoff.md#t38--continue-in-ide) | `ipcMain._invokeHandlers` registry contains a channel ending in `LocalSessions_$_openInEditor` (handler-registered probe) | L1 (IPC introspection) |
| [T30](../../docs/testing/cases/code-tab-workflow.md#t30--auto-archive-on-pr-merge) | Bundled `index.js` colocates the auto-archive sweep cadence (`300*1e3` ≤ `3600*1e3` ≤ `AutoArchiveEngine`) with the `ccAutoArchiveOnPrClose` gate key (single-regex multi-string fingerprint) | file probe |
| [T31](../../docs/testing/cases/code-tab-workflow.md#t31--side-chat-opens) | Bundled `index.js` contains all three side-chat eipc channel names (`startSideChat`, `sendSideChatMessage`, `stopSideChat`) — load-bearing trio | file probe |
| [T32](../../docs/testing/cases/code-tab-workflow.md#t32--slash-command-menu) | Bundled `index.js` contains `LocalSessions_$_getSupportedCommands` eipc channel + `slashCommands` schema field | file probe |
| [T33](../../docs/testing/cases/extensibility.md#t33--plugin-browser) | Bundled `index.js` contains `CustomPlugins_$_listMarketplaces` and `CustomPlugins_$_listAvailablePlugins` eipc channel names (browser populate flow) | file probe |
| [T37](../../docs/testing/cases/extensibility.md#t37--claudemd-memory-loads) | Bundled `index.js` contains `[GlobalMemory] Copied CLAUDE.md` log line + `CLAUDE.md` filename literal + `CLAUDE_CONFIG_DIR` env-var token (memory-loading wiring) | file probe |
| [T38](../../docs/testing/cases/code-tab-handoff.md#t38--continue-in-ide) | Bundled `index.js` contains `LocalSessions_$_openInEditor` eipc channel name (Tier 1 fingerprint — reclassified from session 2's broken `ipcMain._invokeHandlers` probe; eipc registry is closure-local) | file probe |
| H01 | CDP auth gate exits with code 1 when spawned with `--remote-debugging-port` and no `CLAUDE_CDP_AUTH` token | spawn probe |
| H02 | `frame-fix-wrapper.js` + `frame-fix-entry.js` injected into `app.asar` (Proxy + main-field reference) | file probe |
| H03 | Build-pipeline patch fingerprints all present in `app.asar` (KDE gate, frame-fix inject, tray, cowork, claude-code) | file probe |
@@ -75,17 +82,30 @@ These specs exercise the substrate primitives in `lib/`: `xprop`
shell-outs (T01, T04), `dbus-next` (T03), `dbus-monitor` subprocess
eavesdrop (T23), Node-inspector runtime-attach
(T07/T16/T17/T26/S10/S29-S35/T05-T14b L1 specs), `app.asar` content reads
(S08/S09/S21/S22/S26/S27/S28/T11/T14a/H02/H03/S33), `/proc/$pid/cmdline`
reads (S07/S12), pgrep-based pid deltas (T10/T14b/H04/S16/S30),
`mount(8)` parsing (S16), source-tree probes against
`scripts/launcher-common.sh` (S02), `dpkg-query` / `rpm -qR` / `rpm -qf`
calls (S03/S04/S05/T13), `safeStorage.encryptString` round-trip across
two launches (S25), `extraEnv` precedence over isolation env (S19),
`ipcMain._invokeHandlers` registry introspection (T38), and the
(S08/S09/S21/S22/S26/S27/S28/T11/T14a/T22/T30/T31/T32/T33/T37/T38/H02/H03/S33),
`/proc/$pid/cmdline` reads (S07/S12), pgrep-based pid deltas
(T10/T14b/H04/S16/S30), `mount(8)` parsing (S16), source-tree probes
against `scripts/launcher-common.sh` (S02), `dpkg-query` / `rpm -qR` /
`rpm -qf` calls (S03/S04/S05/T13), `safeStorage.encryptString`
round-trip across two launches (S25), `extraEnv` precedence over
isolation env (S19), the `lib/electron-mocks.ts` mock-then-call
helpers — `installOpenDialogMock` (T17), `installShowItemInFolderMock`
(T25), `installOpenExternalMock` (T24) — and the
`createIsolation({ seedFromHost: true })` primitive that lets
login-required tests run hermetically against a copy of the host's
signed-in auth state (T07, T16, T26).

Note on eipc channels: the `LocalSessions_$_*` and `CustomPlugins_$_*`
channel names referenced in the case-doc Code anchors do not register
through Electron's standard `ipcMain.handle()` registry — they use a
custom `$eipc_message$_<UUID>_$_claude.web_$_<name>` message-port
protocol whose registry is closure-local and not reachable from
`globalThis` at any ready level. T22, T31, T33, T38 anchor on the
eipc channel-name *strings* in the bundle (Tier 1 fingerprint) rather
than introspecting a registry. See
[`runner-implementation-plan.md`](../../docs/testing/runner-implementation-plan.md)
session 3 status section for the finding.
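
As a rough illustration of what anchoring on channel-name strings means in practice, the sketch below checks a bundle's text for a set of required names. It is an assumption-laden example rather than the shipped runner code: `countOccurrences`, `missingFingerprints`, and `SIDE_CHAT_TRIO` are hypothetical names, and the bundle is assumed to have already been read into a string.

```typescript
// Count non-overlapping occurrences of a literal needle via split.
function countOccurrences(haystack: string, needle: string): number {
	if (needle.length === 0) return 0;
	return haystack.split(needle).length - 1;
}

// Hypothetical helper: return every required channel name that is absent
// from the bundle text. An empty result means all fingerprints are present.
function missingFingerprints(
	bundle: string,
	required: readonly string[],
): string[] {
	return required.filter((needle) => countOccurrences(bundle, needle) === 0);
}

// Channel names as listed for T31 in the plan doc.
const SIDE_CHAT_TRIO = [
	'LocalSessions_$_startSideChat',
	'LocalSessions_$_sendSideChatMessage',
	'LocalSessions_$_stopSideChat',
] as const;
```

Returning the list of missing names, instead of a single boolean, lets a failing spec say which channel drifted.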

Per-row pass/skip counts depend on which sweep runs against the row;
see `runner-implementation-plan.md` for tier classification and
matrix-regen for the most-recent per-row outcomes. The Quick Entry