Build out a Playwright-based regression-detection harness covering the compat-matrix surfaces (KDE-W, KDE-X, GNOME, Sway, i3, Niri, packaging formats).

Adds:
- Planning + decision docs under docs/testing/: README, matrix, runbook, automation, cases/ (11 case files), quick-entry-closeout
- Playwright scaffolding (config, tsconfig)
- 78 spec runners under tools/test-harness/src/runners/: T## case-doc runners and S## distribution/smoke runners
- Substrate primitives in tools/test-harness/src/lib/: AX-tree loader (snapshotAx + waitForAxNode + axTreeToSnapshot), focus-shifter, eipc-registry, niri-native bridge, drag-drop bridge, electron-mocks, claudeai page-objects, inspector client

S03 (DEB Depends declared) and S04 (RPM Requires declared) ship marked test.fail(). They're regression detectors for the case-doc gap (deb.sh emits no Depends:, rpm.sh sets AutoReqProv: no), and the expected-failure shape lets them report green on every host until upstream packaging starts declaring runtime deps.

127 files, no runtime changes; harness is opt-in via 'cd tools/test-harness && npx playwright test'.

Co-authored-by: Claude <claude@anthropic.com>
Linux Compatibility Test Harness
In-VM (or on-host) Playwright + DBus runner for the test cases under
docs/testing/cases/. See
docs/testing/automation.md for the
architecture, decisions, and rationale.
Status
Seventy-four specs wired (36 cross-env T-tests, 33 env-specific S-tests, 5 H-prefix harness self-tests).
| Test | What it checks | Layer |
|---|---|---|
| T01 | X11 window with our pid appears within 15s; title matches /claude/i | L2 (xprop) |
| T02 | claude-desktop --doctor exits 0 | spawn probe |
| T03 | A StatusNotifierItem is registered by the claude-desktop pid AND exactly one (no rebuild-race duplicates) | L2 (DBus) |
| T04 | Window has _NET_FRAME_EXTENTS (sum > 0) and a "Claude" title | L2 (xprop) |
| T05 | xdg-open 'claude://...' delivers via app.on('second-instance') to the running app | spawn + L1 hook |
| T06 | globalShortcut.isRegistered('Ctrl+Alt+Space') returns true after mainVisible | L1 |
| T07 | Five topbar buttons render with non-zero rects (uses seedFromHost for hermetic auth) | L1 + DOM |
| T08 | win.close() fires the wrapper interceptor; window hidden, proc alive | L1 |
| T09 | setLoginItemSettings({ openAtLogin }) writes/removes $XDG_CONFIG_HOME/autostart/claude-desktop.desktop | L1 + filesystem |
| T10 | After H04-style spawn detection, kill -9 the daemon and confirm a different pid respawns within ~20s (Patch 6 cooldown + retry) | pgrep delta + spawn delta |
| T11 | Plugin-install code path fingerprints present in bundled index.js | file probe |
| T11_runtime | After seedFromHost + userLoaded, the install-flow eipc surface (installPlugin, uninstallPlugin, updatePlugin, listInstalledPlugins, LocalPlugins/getPlugins; five-suffix presence probe) is registered on the claude.ai webContents AND BOTH read-side handlers across the two impl objects are callable through the renderer-side wrapper: CustomPlugins/listInstalledPlugins([]) returns array shape (drives Manage plugins panel), LocalPlugins/getPlugins() returns array shape (reads ~/.claude/plugins/installed_plugins.json per case-doc :465822); Tier 2 reframe of T11 (case-doc anchor :507181) | L1 (eipc registry + invoke) |
| T12 | app.getGPUFeatureStatus() returns a populated object; renderer reached visible | L1 |
| T13 | --doctor does not false-flag rpm/deb installs as missing-dpkg AppImage | spawn + stdout grep |
| T14a | requestSingleInstanceLock + 'second-instance' strings in bundled index.js (file probe) | file probe |
| T14b | Second invocation under same isolation exits cleanly; primary pid stays alive (runtime probe) | spawn delta + pgrep |
| T16 | After seedFromHost + userLoaded, CodeTab.activate() resolves and ≥1 compact pill renders (env pill = Code-body mounted) | L1 + AX-tree |
| T17 | After seedFromHost + userLoaded, Code df-pill → env pill → Local → Select folder → Open folder triggers dialog.showOpenDialog (mock installed via installOpenDialogMock); skips cleanly when host has no signed-in Claude config | L1 + AX-tree |
| T18 | Bundled mainView.js preload contains the path-resolution bridge fingerprints: getPathForFile (2×: property key + the webUtils.getPathForFile( call, both at case-doc :9267), webUtils, filePickers, and the claudeAppSettings contextBridge.exposeInMainWorld namespace (case-doc :9552); pins the load-bearing wiring without faking OS-level XDND drag (xdotool can't put file URIs on the X11 selection; Wayland needs per-compositor IPC + libei) | file probe |
| T19 | After seedFromHost + userLoaded, the integrated-terminal eipc surface (startShellPty, writeShellPty, stopShellPty, resizeShellPty, getShellPtyBuffer; five-suffix presence probe) is registered on the claude.ai webContents AND the foundational LocalSessions/getAll returns array shape (Tier 2 reframe of the case-doc T19 case; case-doc anchors are write-side startShellPty etc. so reframe asserts the FULL terminal IPC surface registers + a stateless read-side surrogate is invocable) | L1 (eipc registry + invoke) |
| T20 | After seedFromHost + userLoaded, the file-pane eipc surface (readSessionFile, writeSessionFile, pickSessionFile; three-suffix presence probe) is registered on the claude.ai webContents AND the foundational LocalSessions/getAll returns array shape (Tier 2 reframe of the case-doc T20 case; the case-doc's readSessionFile anchor is read-side but needs (sessionId, path) args not constructible from a fresh isolation, so the registration probe + foundational getAll invocation is the strongest non-destructive Tier 2 layer) | L1 (eipc registry + invoke) |
| T21 | After seedFromHost + userLoaded, the preview-pane eipc surface (getConfiguredServices, startFromConfig, stopServer, getAutoVerify, capturePreviewScreenshot; five-suffix presence probe) is registered on the claude.ai webContents AND BOTH case-doc-anchored read-side handlers are callable through the renderer-side wrapper: getConfiguredServices(cwd) returns array shape, getAutoVerify(cwd) returns boolean shape (Tier 2 reframe of the case-doc T21 case; cwd validator is typeof cwd === 'string' only, smoke-tested session 11) | L1 (eipc registry + invoke) |
| T22 | Bundled index.js contains LocalSessions_$_getPrChecks eipc channel name and gh CLI not found in PATH Linux-fallthrough throw site (Tier 1 fingerprint) | file probe |
| T22b | After seedFromHost + userLoaded, the LocalSessions_$_getPrChecks eipc handler is registered on the claude.ai webContents (webContents.ipc._invokeHandlers; Tier 2 runtime probe sibling of T22, strictly stronger than the bundle-string fingerprint) | L1 (eipc registry) |
| T23 | Firing new Notification({title}) from main reaches the session bus's org.freedesktop.Notifications.Notify (observed via dbus-monitor) | L1 + DBus subprocess |
| T24 | After installOpenExternalMock mirroring T25's pattern, evalInMain calls shell.openExternal('vscode://file/...'); mock records the URL verbatim, no real editor launch | L1 (mocked egress) |
| T25 | After installShowItemInFolderMock mirroring T17's dialog-mock pattern, evalInMain calls shell.showItemInFolder(<synthetic path>); mock records the call verbatim, no throw, no host side effect | L1 (mocked egress) |
| T26 | After seedFromHost + userLoaded, click "Routines" sidebar AX button; assert "New routine" / "All" / "Calendar" anchor renders | L1 + AX-tree |
| T27 | After seedFromHost + userLoaded, both Cowork and CCD getAllScheduledTasks eipc handlers are registered AND callable through the renderer-side wrapper, returning array shape; Tier 2 reframe of the case-doc T27 case | L1 (eipc invoke) |
| T30 | Bundled index.js colocates the auto-archive sweep cadence (300*1e3 ≤ 3600*1e3 ≤ AutoArchiveEngine) with the ccAutoArchiveOnPrClose gate key (single-regex multi-string fingerprint) | file probe |
| T31 | Bundled index.js contains all three side-chat eipc channel names (startSideChat, sendSideChatMessage, stopSideChat); load-bearing trio | file probe |
| T31b | After seedFromHost + userLoaded, all three side-chat eipc handlers (startSideChat, sendSideChatMessage, stopSideChat) are registered on the claude.ai webContents; load-bearing trio (Tier 2 runtime sibling of T31) | L1 (eipc registry) |
| T32 | Bundled index.js contains LocalSessions_$_getSupportedCommands eipc channel + slashCommands schema field | file probe |
| T33 | Bundled index.js contains CustomPlugins_$_listMarketplaces and CustomPlugins_$_listAvailablePlugins eipc channel names (browser populate flow) | file probe |
| T33b | After seedFromHost + userLoaded, both plugin-browser eipc handlers (listMarketplaces, listAvailablePlugins) are registered on the claude.ai webContents; load-bearing pair (Tier 2 runtime sibling of T33) | L1 (eipc registry) |
| T33c | After seedFromHost + userLoaded, both plugin-browser eipc handlers (listMarketplaces, listAvailablePlugins) are callable through the renderer-side wrapper with args = [[]] (empty egressAllowedDomains), each returning array shape; Tier 2 invocation upgrade of T33b, strictly stronger than registration alone | L1 (eipc invoke) |
| T35 | Bundled index.js contains the four-needle MCP-config separation fingerprint: claude_desktop_config.json (chat-tab path), .claude.json + .mcp.json (Code-tab loaders), "user","project","local" (settingSources triple Code-session passes to the agent SDK); pins per-tab separation without launch | file probe |
| T35b | After seedFromHost + userLoaded, the claude.settings/MCP/getMcpServersConfig eipc handler is registered AND callable through the renderer-side wrapper, returning a non-array object (Tier 2 runtime sibling of T35, strictly stronger than the bundle-string fingerprint) | L1 (eipc invoke) |
| T36 | Bundled index.js contains the hooks runtime fingerprint: hook_started / hook_progress / hook_response (single-occurrence Verbose-transcript runtime emits) plus PreToolUse / UserPromptSubmit registry tokens; pins the runtime hook-fire path the case-doc Verbose-transcript claim hangs on | file probe |
| T37 | Bundled index.js contains [GlobalMemory] Copied CLAUDE.md log line + CLAUDE.md filename literal + CLAUDE_CONFIG_DIR env-var token (memory-loading wiring) | file probe |
| T37b | After seedFromHost + userLoaded, the claude.web/CoworkMemory/readGlobalMemory eipc handler is registered AND callable through the renderer-side wrapper, returning the documented string \| null shape (Tier 2 runtime sibling of T37) | L1 (eipc invoke) |
| T38 | Bundled index.js contains LocalSessions_$_openInEditor eipc channel name (Tier 1 fingerprint) | file probe |
| T38b | After seedFromHost + userLoaded, the LocalSessions_$_openInEditor eipc handler is registered on the claude.ai webContents (Tier 2 runtime sibling of T38) | L1 (eipc registry) |
| H01 | CDP auth gate exits with code 1 when spawned with --remote-debugging-port and no CLAUDE_CDP_AUTH token | spawn probe |
| H02 | frame-fix-wrapper.js + frame-fix-entry.js injected into app.asar (Proxy + main-field reference) | file probe |
| H03 | Build-pipeline patch fingerprints all present in app.asar (KDE gate, frame-fix inject, tray, cowork, claude-code) | file probe |
| H04 | cowork daemon spawns under app and exits with app — soft-skips on rows where it isn't gated to spawn | pgrep delta |
| H05 | UI-drift canary against the AX-tree fingerprint walker (requires CLAUDE_TEST_USE_HOST_CONFIG=1) | L1 (AX) |
| S01 | AppImage launches without libfuse.so.2 complaint (skips on non-AppImage rows) | spawn + stderr grep |
| S02 | No strict == equality against XDG_CURRENT_DESKTOP in launcher / patches (regression detector) | source-tree probe |
| S03 | dpkg-query Depends: field non-empty (currently fails as upstream-contract regression detector) | dpkg-query |
| S04 | rpm -qR has at least one non-rpmlib(...) requirement (currently fails per #autoreqprov off) | rpm -qR |
| S05 | Doctor does not false-flag rpm-installed package (skips when rpm -qf doesn't claim the binary) | spawn + stdout grep |
| S07 | Under CLAUDE_HARNESS_USE_WAYLAND=1, spawned Electron has --ozone-platform=wayland on argv | argv probe |
| S08 | setImage-based in-place fast-path injected by tray.sh (KDE-only, file probe) | file probe |
| S09 | KDE-gate string present in bundled index.js (patch ran at build) | file probe |
| S10 | KDE-W only: popup runtime getBackgroundColor() === '#00000000' after Quick Entry opens (regression-detector against electron#50213 if bundled Electron in 41.0.4-bisect-window) | L1 + ydotool |
| S11 | GNOME-X / Ubu-X only (X11-side regression detector): spawn xterm marker, xdotool windowfocus to it, verify _NET_ACTIVE_WINDOW shifted, fire Ctrl+Alt+Space via ydotool, assert popup visible. Wayland-side mutter regression (#404) is a primitive gap; it needs Wayland-native focus injection (libei) | L1 + xdotool focus + ydotool shortcut |
| S12 | --enable-features=GlobalShortcutsPortal in Electron argv (GNOME-W only; currently a known-failing regression detector) | argv probe |
| S14 | Niri only: spawn foot marker, niri msg action focus-window to it, verify niri msg --json focused-window shifted, fire Ctrl+Alt+Space via ydotool, assert popup visible. Currently known-failing detector for the Niri portal BindShortcuts path (parallels S12's GNOME-W detector) | L1 + niri msg focus + ydotool shortcut |
| S15 | --appimage-extract exits 0; squashfs-root/AppRun --version runs without FUSE error | spawn + filesystem |
| S16 | mount(8) shows new .mount_claude while app is up; gone within 10s of close | mount delta |
| S17 | Shell-path-worker overlays user's login-shell PATH onto a deliberately-scrubbed env | L1 + utilityProcess |
| S19 | extraEnv: { CLAUDE_CONFIG_DIR } reaches main-process process.env; cE()-equivalent resolves under the override path | L1 + extraEnv |
| S21 | No handle-lid-switch / HandleLidSwitch strings in bundle (lid policy deferred to OS) | asar absence probe |
| S22 | new Set(["darwin","win32"]) platform gate present; no 2-element Set pairing linux (file-probe form) | asar regex |
| S25 | safeStorage.encryptString → file → app restart → file → safeStorage.decryptString round-trips the same plaintext (skips when isEncryptionAvailable === false) | L1 + shared isolation handle |
| S26 | setFeedURL present + project suppression marker present (currently fails; gated on #567) | asar fingerprint |
| S27 | installed_plugins.json + homedir resolver present; no */plugins system paths in bundle | asar fingerprint |
| S28 | Bundled index.js contains the worktree permission classifier expression ("Permission denied" \|\| "Access is denied" \|\| "could not lock config file" → "permission-denied") plus the Failed to create git worktree: log line | asar fingerprint |
| S29 | Popup opens when main is hidden-to-tray (lazy-create sanity) | L1 |
| S30 | No new claude-desktop pid spawns after post-exit shortcut press | pgrep delta + ydotool |
| S31 | Submit reaches new chat from visible / minimized / hidden-to-tray (QE-7/8/9) | L1 + ydotool |
| S32 | GNOME mutter stale-isFocused() regression (GNOME-W/Ubu-W only; known-failing today) | L1 + ydotool |
| S33 | Captures bundled Electron version against the #370 / electron#50213 bisect threshold | file read |
| S34 | Popup does not appear when main is fullscreen (upstream contract) | L1 + ydotool |
| S35 | Popup position persists across invocations and across app restart (two-launch test) | L1 + shared isolation handle + ydotool |
| S36 | Multi-monitor fallback: skip-on-single-monitor with documented fixme for the disconnect orchestration | display probe |
| S37 | Main-window destroy unreachable on Linux per close-to-tray override — documented skip | — |
These specs exercise the substrate primitives in lib/:
- xprop shell-outs (T01, T04)
- dbus-next (T03)
- dbus-monitor subprocess eavesdrop (T23)
- Node-inspector runtime-attach (T07/T16/T17/T26/S10/S29-S35/T05-T14b L1 specs)
- app.asar content reads (S08/S09/S21/S22/S26/S27/S28/T11/T14a/T18/T22/T30/T31/T32/T33/T35/T36/T37/T38/H02/H03/S33; mostly index.js, T18 reads mainView.js)
- /proc/$pid/cmdline reads (S07/S12)
- pgrep-based pid deltas (T10/T14b/H04/S16/S30)
- mount(8) parsing (S16)
- source-tree probes against scripts/launcher-common.sh (S02)
- dpkg-query / rpm -qR / rpm -qf calls (S03/S04/S05/T13)
- safeStorage.encryptString round-trip across two launches (S25)
- extraEnv precedence over isolation env (S19)
- the lib/electron-mocks.ts mock-then-call helpers: installOpenDialogMock (T17), installShowItemInFolderMock (T25), installOpenExternalMock (T24)
- the lib/input.ts focus-shifter (focusOtherWindow + spawnMarkerWindow for S11; X11 only, WaylandFocusUnavailable thrown on native Wayland) and its Niri-native sibling lib/input-niri.ts (niri msg --json for the focus-injection + readback chain, foot --title for the marker window; NiriIpcUnavailable thrown off-Niri; consumed by S14)
- the lib/eipc.ts registry walker (getEipcChannels / waitForEipcChannel / waitForEipcChannels against webContents.ipc._invokeHandlers; opaque on the UUID, suffix-matched against case-doc anchors; consumed by T19 / T20 / T22b / T31b / T33b / T38b) plus its session 8 invoke surface (invokeEipcChannel, which calls a registered handler through the renderer-side wrapper at window['claude.<scope>'].<Iface>.<method>; consumed by T19 / T20 / T27 / T33c / T35b / T37b)
- the lib/ax.ts AX-tree substrate (snapshotAx for one-shot reads + waitForAxNode / waitForAxNodes for predicate-based polling, plus re-exports of RawElement / AxNode / axTreeToSnapshot / waitForAxTreeStable from explore/walker.ts so consumers stay inside lib/; consumed by claudeai.ts page-objects + T26). This was a threshold-driven extraction in session 13, once T26 had to duplicate the formerly-private snapshotAx from claudeai.ts. Session 14 migrated activateTab from a one-shot snapshot to waitForAxNode polling, which fixes the T16 'no AX-tree button with accessibleName="Code" found' failure mode where the Code button hadn't rendered yet at click time, and converted CodeTab.activate's post-click findCompactPills retry loop to waitForAxNodes.
- the createIsolation({ seedFromHost: true }) primitive that lets login-required tests run hermetically against a copy of the host's signed-in auth state (T07, T11_runtime, T16, T17, T19, T20, T21, T22b, T26, T27, T31b, T33b, T33c, T35b, T37b, T38b). Session 15 migrated T17 from the legacy CLAUDE_TEST_USE_HOST_CONFIG=1 / isolation: null shape to seedFromHost, fixing a pre-existing 60s spec-timeout flake where the unauth'd default isolation polled userLoaded past Playwright's spec budget. Session 16 verified the migration end-to-end: seedFromHost clones the host's signed-in config, waitForReady('userLoaded') resolves to a post-login URL, and the session-14 CodeTab.activate({ timeout: 15_000 }) succeeds. T17 now reaches a NEW failure mode at the next chain step (openFolderPicker after selectLocal: the Select folder… pill doesn't render on the /epitaxy workspace route; it likely needs /new context, deferred for a future session).
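The /proc/$pid/cmdline reads behind S07 and S12 come down to splitting a NUL-separated buffer and looking for a flag. The following is a minimal sketch of that idea, assuming the file content has already been read into a string. `parseCmdline`, `hasFlag`, and `hasEnabledFeature` are hypothetical names for illustration and are not the lib/argv.ts API.

```typescript
// Hypothetical helpers; the real lib/argv.ts reader may be shaped differently.

/** Split raw /proc/<pid>/cmdline content (NUL-separated) into argv entries. */
function parseCmdline(raw: string): string[] {
  // Each argument is NUL-terminated, so a plain split leaves a trailing
  // empty entry. Note this filter also drops legitimately empty arguments.
  return raw.split("\0").filter((arg) => arg.length > 0);
}

/** True when `flag` appears verbatim or in `flag=value` form. */
function hasFlag(argv: string[], flag: string): boolean {
  return argv.some((arg) => arg === flag || arg.indexOf(flag + "=") === 0);
}

/** True when any comma-separated `--enable-features=` list contains `feature`. */
function hasEnabledFeature(argv: string[], feature: string): boolean {
  const prefix = "--enable-features=";
  return argv
    .filter((arg) => arg.indexOf(prefix) === 0)
    .some((arg) => arg.slice(prefix.length).split(",").indexOf(feature) !== -1);
}
```

An S07-style check would then be `hasFlag(argv, "--ozone-platform=wayland")`, and an S12-style check `hasEnabledFeature(argv, "GlobalShortcutsPortal")`.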
Note on eipc channels: the LocalSessions_$_* and CustomPlugins_$_*
channel names referenced in the case-doc Code anchors don't register
through Electron's global ipcMain.handle() registry (which only
carries 3 chat-tab MCP-bridge handlers). They DO register through
Electron's stdlib IpcMainImpl — just on the per-webContents IPC
scope (webContents.ipc._invokeHandlers, Electron 17+) rather than
the global one. The framing is
$eipc_message$_<UUID>_$_<scope>_$_<iface>_$_<method> (UUID stable
across builds at c0eed8c9-…); 117 LocalSessions_* + 16
CustomPlugins_* + 50+ other interfaces register on the claude.ai
webContents. T22 / T31 / T33 / T38 ship as Tier 1 fingerprints
against the bundled channel-name strings; T22b / T31b / T33b / T38b
are the runtime registry-presence siblings (strictly stronger,
require seedFromHost). T27 / T33c / T35b / T37b go one step
further — they invoke the resolved handlers through the renderer-
side wrapper at window['claude.<scope>'].<Iface>.<method>. T19 /
T20 are first-runtime-probe siblings of case-doc tests whose anchors
are write-side handlers (startShellPty / writeSessionFile); they
ship a five-suffix / three-suffix registration probe over the
case-doc-anchored write-side surface plus a single foundational
read-side LocalSessions/getAll invocation as the read-side
surrogate (case-doc connection: integrated terminal and file pane
both bind to LocalSessions; getAll proves the LocalSessions impl
object is reachable through the renderer wrapper). T21 and
T11_runtime extend the dual-invocation pattern: when a case-doc has
read-side anchors with resolvable arg shapes, invoke the case-doc-
anchored handlers directly rather than through a foundational
surrogate (T21: getConfiguredServices array + getAutoVerify
boolean on a single Launch impl object; T11_runtime: cross-impl-object
dual invocation, CustomPlugins/listInstalledPlugins array +
LocalPlugins/getPlugins array, which proves the install plumbing
crosses both interfaces intact, strictly stronger than
single-interface coverage). All wrapper invocations use the wrapper
exposed by mainView.js via contextBridge.exposeInMainWorld after a
top-frame + origin gate (Qc(): claude.ai / claude.com / preview.* /
localhost). Calling through the wrapper carries an honest senderFrame
for the inlined le()/Vi() per-handler origin gate, so the test surface
matches real attack surface. T33c also demonstrates the schema-rev
path: when invocation rejects with
Argument "<name>" at position N ... failed to pass validation, the
verbatim rejection string is the cheapest grep target back to the
inline hand-rolled validator block (bundle bytes 5013601 / 5018821 for
the two CustomPlugins methods). See lib/eipc.ts for both surfaces.
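The channel framing above lends itself to a small parser plus a UUID-opaque suffix match. This is a sketch under stated assumptions: `parseEipcChannel` and `matchesEipcSuffix` are hypothetical names, lib/eipc.ts may be structured differently, and any UUID passed in an example is a placeholder rather than the real build UUID.

```typescript
// Hypothetical parser for the `$eipc_message$_<UUID>_$_<scope>_$_<iface>_$_<method>`
// framing. Not the lib/eipc.ts implementation.
interface EipcChannel {
  uuid: string;
  scope: string;
  iface: string;
  method: string;
}

const EIPC_PREFIX = "$eipc_message$_";
const EIPC_SEP = "_$_";

/** Parse a framed channel name; returns null when the shape differs. */
function parseEipcChannel(channel: string): EipcChannel | null {
  if (channel.indexOf(EIPC_PREFIX) !== 0) return null;
  const parts = channel.slice(EIPC_PREFIX.length).split(EIPC_SEP);
  if (parts.length !== 4) return null;
  const [uuid, scope, iface, method] = parts;
  if (!uuid || !scope || !iface || !method) return null;
  return { uuid, scope, iface, method };
}

/** UUID-opaque match: compare only the trailing `<iface>_$_<method>` part. */
function matchesEipcSuffix(channel: string, suffix: string): boolean {
  const tail = EIPC_SEP + suffix;
  return (
    channel.length >= tail.length &&
    channel.lastIndexOf(tail) === channel.length - tail.length
  );
}
```

Matching on the suffix keeps a probe stable if the UUID segment ever changes, which is the property the registry walker's "opaque on the UUID" description relies on.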
Per-row pass/skip counts depend on which sweep runs against the row.
The Quick Entry runners (S29-S35) all share the same primitive set
(installInterceptor() + openAndWaitReady() + scenario-specific
state setup).
Prerequisites
On the host or VM running the sweep:
- Node.js ≥ 20
- claude-desktop installed (deb / rpm / AppImage), reachable via claude-desktop on PATH or CLAUDE_DESKTOP_LAUNCHER env var
- xprop (for L2 window queries; dnf install xorg-x11-utils on Fedora; apt install x11-utils on Debian/Ubuntu)
- zstd (optional; used to bundle results)
Quick Entry runners (S29–S37, future QE-*)
Quick Entry tests inject the OS-level shortcut via ydotool /
/dev/uinput. One-time setup per host or VM:
# Install the binary + daemon
sudo dnf install -y ydotool # or: sudo apt install ydotool
# Make ydotoold's socket world-writable so the test runner reaches it
sudo mkdir -p /etc/systemd/system/ydotool.service.d
sudo tee /etc/systemd/system/ydotool.service.d/override.conf <<'EOF'
[Service]
ExecStart=
ExecStart=/usr/bin/ydotoold --socket-perm=0666
EOF
sudo systemctl daemon-reload
sudo systemctl enable --now ydotool.service
After this, ydotool key 29:1 29:0 (Ctrl tap) should exit 0. The
runner sets YDOTOOL_SOCKET=/tmp/.ydotool_socket automatically;
override the env var if your daemon binds elsewhere.
ydotool cannot drive portal-grabbed shortcuts (kernel uinput
events vs compositor portal grabs) — those tests stay manual until
libei adoption broadens. See docs/testing/automation.md.
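The `ydotool key 29:1 29:0` form above uses `<keycode>:<pressed>` pairs. A chord such as Ctrl+Alt+Space can be expressed as a press sequence followed by releases in reverse order. This is a sketch only: `chordArgs` is a hypothetical helper, the ordering is an assumption about what a harness would send, and the keycodes are the Linux input-event-codes values for left Ctrl, left Alt, and Space.

```typescript
// Linux input-event-codes values for the keys in Ctrl+Alt+Space.
const KEY_LEFTCTRL = 29;
const KEY_LEFTALT = 56;
const KEY_SPACE = 57;

/**
 * Build `ydotool key` arguments for a chord: press each key in order,
 * then release in reverse order. Hypothetical helper, not lib/quickentry.ts.
 */
function chordArgs(keycodes: number[]): string[] {
  const presses = keycodes.map((code) => `${code}:1`);
  const releases = keycodes
    .slice()
    .reverse()
    .map((code) => `${code}:0`);
  return presses.concat(releases);
}

// Arguments for: ydotool key <args...>
const ctrlAltSpace = chordArgs([KEY_LEFTCTRL, KEY_LEFTALT, KEY_SPACE]);
```

Releasing in reverse order keeps the modifiers held for the whole time the final key is down.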
Install
cd tools/test-harness
npm install
package-lock.json is gitignored for now; commit it once the dep set is settled.
Run
# Full sweep against the locally installed claude-desktop
ROW=KDE-W ./orchestrator/sweep.sh
# Single test
npx playwright test src/runners/T01_app_launch.spec.ts
# Headed (watch the app launch in front of you)
npx playwright test --headed
# Run the full suite under native Wayland instead of X11/XWayland
CLAUDE_HARNESS_USE_WAYLAND=1 npm test
# Grounding probe — dump runtime state for the case-doc grounding sweep
npm run grounding-probe -- --launch --include-synthetic \
--out ../../docs/testing/cases-grounding-runtime.json
Results land at results/results-${ROW}-${DATE}/:
results/results-KDE-W-20260430T143000Z/
├── junit.xml # JUnit summary (matrix-regen input)
├── html/ # Playwright HTML report
└── test-output/ # Per-test attachments (screenshots, logs, etc.)
A bundled results-${ROW}-${DATE}.tar.zst sits next to the dir if zstd
is installed.
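The directory name in the example above looks like a row label plus a compact UTC timestamp. A sketch of that naming, assuming the stamp is an ISO 8601 basic-format UTC time (inferred from the example, not read from sweep.sh); `compactUtcStamp` and `resultsDirName` are hypothetical names.

```typescript
/** Format a Date as a compact UTC stamp such as 20260430T143000Z. */
function compactUtcStamp(date: Date): string {
  // toISOString() yields e.g. 2026-04-30T14:30:00.000Z; strip the
  // separators and the millisecond field.
  return date
    .toISOString()
    .replace(/[-:]/g, "")
    .replace(/\.\d{3}Z$/, "Z");
}

/** Build the per-run directory name from the row label and a timestamp. */
function resultsDirName(row: string, date: Date): string {
  return `results-${row}-${compactUtcStamp(date)}`;
}
```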
Environment variables
| Var | Default | Purpose |
|---|---|---|
| ROW | KDE-W | Matrix row label, propagated into the bundle name and per-test annotations. Drives skipUnlessRow() in spec files |
| CLAUDE_DESKTOP_LAUNCHER | claude-desktop (PATH lookup) | Path to the launcher / Electron binary Playwright spawns |
| CLAUDE_DESKTOP_ELECTRON | probed | Override the resolved Electron binary path (skips deb/rpm install probing) |
| CLAUDE_DESKTOP_APP_ASAR | probed | Override the resolved app.asar path |
| CLAUDE_TEST_USE_HOST_CONFIG | unset | When 1, opt out of per-test isolation and use the host's real ~/.config/Claude. Required for tests that need a signed-in claude.ai (S31, future submit-side QE runners). Side effect: these tests write to your real account; chats / settings persist |
| CLAUDE_HARNESS_USE_WAYLAND | unset | When 1, every runner spawns Electron with the native-Wayland backend (--ozone-platform=wayland + sibling flags from launcher-common.sh) instead of the default X11-via-XWayland. CLAUDE_USE_WAYLAND=1 is also exported into the spawn env for in-app paths that read it. Per-launch overrides via launchClaude({ extraEnv }) still win |
| YDOTOOL_SOCKET | /tmp/.ydotool_socket | Path to the ydotoold socket. Override only if the daemon binds elsewhere |
| OUTPUT_DIR | ./results | Where bundles land |
| RESULTS_DIR | per-run derived | Single-run output dir (set by sweep.sh; usually you don't set this manually) |
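The defaults in the table can be read as a single resolution step over the process environment. A minimal sketch, assuming the env is passed in as a plain object; `resolveHarnessEnv` and the `HarnessEnv` shape are hypothetical and only cover the variables with a fixed default or an on/off meaning.

```typescript
type EnvLike = Record<string, string | undefined>;

// Hypothetical resolved shape; the harness may not centralize this.
interface HarnessEnv {
  row: string;
  launcher: string;
  ydotoolSocket: string;
  outputDir: string;
  useHostConfig: boolean;
  useWayland: boolean;
}

/** Apply the documented defaults to whatever the environment provides. */
function resolveHarnessEnv(env: EnvLike): HarnessEnv {
  return {
    row: env["ROW"] || "KDE-W",
    launcher: env["CLAUDE_DESKTOP_LAUNCHER"] || "claude-desktop",
    ydotoolSocket: env["YDOTOOL_SOCKET"] || "/tmp/.ydotool_socket",
    outputDir: env["OUTPUT_DIR"] || "./results",
    // Both switches are documented as active "when 1".
    useHostConfig: env["CLAUDE_TEST_USE_HOST_CONFIG"] === "1",
    useWayland: env["CLAUDE_HARNESS_USE_WAYLAND"] === "1",
  };
}
```

In a Node context this would be called as `resolveHarnessEnv(process.env)`.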
Per-test isolation default
launchClaude() creates a fresh XDG_CONFIG_HOME / CLAUDE_CONFIG_DIR
under $TMPDIR/claude-test-* for every launch and removes it on
close(). This is the default to prevent state leaks between tests
(SingletonLock collisions, persisted Quick Entry positions, etc. —
see Decision 1 in docs/testing/automation.md).
Three escape hatches:
- launchClaude(): default, fresh per-launch isolation.
- launchClaude({ isolation }): pass a shared Isolation handle to launch the same app twice with persistent state (e.g. S35 position-memory across restart).
- launchClaude({ isolation: null }): opt out entirely; share the host's ~/.config/Claude. Used by tests gated on CLAUDE_TEST_USE_HOST_CONFIG for signed-in claude.ai access.
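The three forms differ only in what the `isolation` option is: absent, a handle, or null. The sketch below models that dispatch as a pure function. It is not the lib/electron.ts code: `planIsolation`, `IsolationHandle.configHome`, and the `cleanupOnClose` field are hypothetical, and the assumption that a shared handle is not removed on close is mine.

```typescript
// Hypothetical shapes for illustration only.
interface IsolationHandle {
  configHome: string;
}

type IsolationOption = IsolationHandle | null | undefined;

type IsolationPlan =
  | { mode: "fresh"; cleanupOnClose: true }
  | { mode: "shared"; configHome: string; cleanupOnClose: false }
  | { mode: "host"; cleanupOnClose: false };

/** Map the `isolation` launch option to one of the three documented modes. */
function planIsolation(option: IsolationOption): IsolationPlan {
  if (option === undefined) {
    // Default: fresh sandbox per launch, removed on close().
    return { mode: "fresh", cleanupOnClose: true };
  }
  if (option === null) {
    // Explicit opt-out: use the host's real config directory.
    return { mode: "host", cleanupOnClose: false };
  }
  // Shared handle: state persists across launches (assumed not cleaned here).
  return { mode: "shared", configHome: option.configHome, cleanupOnClose: false };
}
```

Distinguishing `undefined` from `null` is what lets "no option" mean isolate and "explicit null" mean share the host config.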
Layout
tools/test-harness/
├── package.json
├── tsconfig.json
├── playwright.config.ts
├── src/
│ ├── lib/ # shared helpers
│ │ ├── electron.ts # spawn + isolation + inspector attach
│ │ ├── inspector.ts # Node-inspector RPC client (SIGUSR1 path)
│ │ ├── dbus.ts # dbus-next session-bus + helpers
│ │ ├── sni.ts # StatusNotifierWatcher / Item
│ │ ├── wm.ts # xprop wrappers (X11 + XWayland)
│ │ ├── env.ts # XDG_CURRENT_DESKTOP / SESSION_TYPE branching
│ │ ├── row.ts # skipUnlessRow / skipOnRow primitives
│ │ ├── isolation.ts # per-test XDG_CONFIG_HOME sandbox
│ │ ├── argv.ts # /proc/$pid/cmdline reader + flag check
│ │ ├── asar.ts # in-place app.asar reads (no temp extract)
│ │ ├── quickentry.ts # Quick Entry domain wrapper (popup, MainWindow, ydotool)
│ │ ├── claudeai.ts # claude.ai renderer UI domain (CodeTab, dialog mock, atoms)
│ │ ├── electron-mocks.ts # mock-then-call helpers (dialog/showItemInFolder/openExternal)
│ │ ├── input.ts # focus-shifter primitive (X11 only — xdotool + xprop verify; spawnMarkerWindow xterm)
│ │ ├── input-niri.ts # focus-shifter primitive (Niri only — niri msg --json verify; spawnMarkerWindow foot)
│ │ ├── eipc.ts # eipc-channel registry walker (per-webContents IPC scope; suffix-matched, UUID-opaque)
│ │ ├── retry.ts # poll-until-true with timeout
│ │ └── diagnostics.ts # launcher log, --doctor, session env
│ └── runners/ # one .spec.ts per test ID
│ ├── T01_app_launch.spec.ts
│ ├── T03_tray_icon_present.spec.ts
│ ├── T04_window_decorations.spec.ts
│ ├── T17_folder_picker.spec.ts
│ ├── S09_quick_window_patch_only_kde.spec.ts
│ ├── S12_global_shortcuts_portal_flag.spec.ts
│ ├── S29_quick_entry_lazy_create_closed_to_tray.spec.ts
│ ├── S30_quick_entry_noop_after_app_exit.spec.ts
│ ├── S31_quick_entry_submit_reaches_new_chat.spec.ts
│ ├── S32_quick_entry_submit_gnome_stale_isfocused.spec.ts
│ ├── S33_electron_version_capture.spec.ts
│ ├── S34_shortcut_focuses_fullscreen_main.spec.ts
│ ├── S35_quick_entry_position_persisted_across_restarts.spec.ts
│ ├── S36_quick_entry_fallback_to_primary_display.spec.ts
│ ├── S37_quick_entry_popup_after_main_destroy.spec.ts
│ ├── H01_cdp_gate_canary.spec.ts
│ ├── H02_frame_fix_wrapper_present.spec.ts
│ ├── H03_patch_fingerprints.spec.ts
│ └── H04_cowork_daemon_lifecycle.spec.ts
├── probe.ts # one-off renderer-DOM probe (debugger on :9229)
├── grounding-probe.ts # case-grounding runtime capture (see "Grounding probe" below)
└── orchestrator/
└── sweep.sh # row-aware harness invocation
H-prefix specs are harness self-tests — they validate the harness's preconditions and the build pipeline's invariants (CDP gate alive, patches landed, daemon lifecycle clean). Cheap, run in <1s each except H04 which launches the app.
How L1 testing works (the SIGUSR1 path)
The shipped Electron has a CDP auth gate that exits the app whenever
--remote-debugging-port or --remote-debugging-pipe is on argv and a
valid CLAUDE_CDP_AUTH token isn't in env. Both Playwright's
_electron.launch() and chromium.connectOverCDP() inject the gated
flag, so both are blocked.
The gate doesn't check --inspect or runtime SIGUSR1, which is the
same code path as the in-app Developer → Enable Main Process Debugger
menu item. So:
- launchClaude() spawns Electron with no debug-port flags (gate asleep) and waits for the X11 window.
- app.attachInspector() sends SIGUSR1 to the pid; Node's inspector opens on port 9229.
- lib/inspector.ts connects via WebSocket and exposes evalInMain(body) and evalInRenderer(urlFilter, js) for tests.
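The gate condition described in this section can be modeled as a pure predicate over argv and env. This is a sketch of the documented behavior, not the shipped gate: `cdpGateWouldExit` is a hypothetical name, and because token validation is not specified here it is injected as a caller-supplied function.

```typescript
const GATED_FLAGS = ["--remote-debugging-port", "--remote-debugging-pipe"];

/**
 * Model of the documented rule: the app exits when a gated flag is on argv
 * and no valid CLAUDE_CDP_AUTH token is in env. `isValidToken` stands in
 * for the real (unspecified) validation.
 */
function cdpGateWouldExit(
  argv: string[],
  env: Record<string, string | undefined>,
  isValidToken: (token: string) => boolean
): boolean {
  const gated = argv.some((arg) =>
    GATED_FLAGS.some((flag) => arg === flag || arg.indexOf(flag + "=") === 0)
  );
  if (!gated) return false; // --inspect and runtime SIGUSR1 are not checked
  const token = env["CLAUDE_CDP_AUTH"];
  return token === undefined || !isValidToken(token);
}
```

Under this model a launch with `--remote-debugging-port=9222` and no token exits, while a launch with no debug flags passes, which is why the harness attaches afterwards via SIGUSR1.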
From the inspector you can:
- Drive the renderer via webContents.executeJavaScript()
- Install main-process mocks (e.g. dialog.showOpenDialog for T17)
- Inspect any Electron API state
Two gotchas worth knowing:
- BrowserWindow.getAllWindows() returns no windows because frame-fix-wrapper substitutes the BrowserWindow class. Use webContents.getAllWebContents() instead; it works correctly and includes both the shell window and the embedded claude.ai BrowserView.
- Runtime.evaluate with awaitPromise: true returns empty objects for awaited Promise resolutions. inspector.evalInMain<T>() returns JSON.stringify(value) from the IIFE and parses on the caller side to dodge this.
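The JSON round trip in the second gotcha can be sketched as two small functions: one wraps a body in an async IIFE that resolves to a JSON string, the other parses that string on the caller side. `wrapForInspector` and `parseInspectorResult` are hypothetical names; the real lib/inspector.ts may build its expression differently.

```typescript
/**
 * Wrap a function body in an async IIFE whose resolved value is a JSON
 * string, so the evaluator returns a primitive instead of an object.
 */
function wrapForInspector(body: string): string {
  return (
    "(async () => {" +
    ` const value = await (async () => { ${body} })();` +
    // JSON.stringify(undefined) is undefined, so map it to null first.
    " return JSON.stringify(value === undefined ? null : value);" +
    " })()"
  );
}

/** Parse the JSON string that came back from the evaluator. */
function parseInspectorResult<T>(raw: string): T {
  return JSON.parse(raw) as T;
}
```

Because a string is a primitive, it survives the evaluate call by value, and the caller recovers the structured result with one `JSON.parse`.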
Full writeup with rationale and tradeoffs:
docs/testing/automation.md "The CDP auth gate".
Grounding probe
grounding-probe.ts is a separate entry-point — not a Playwright spec —
that connects to a live Claude Desktop and dumps the runtime state
backing the load-bearing claims in
docs/testing/cases/. It exists because
static grep against the 546k-line beautified bundle has known blind
spots (lazy import()s, dynamic handler tables, conditional wiring),
and some claims (S26 autoUpdater gate, S20 powerSaveBlocker path) can
only be verified at runtime.
# Self-contained: launchClaude() + capture + tear down
npm run grounding-probe -- --launch
# Plus the one synthetic probe (powerSaveBlocker start+stop)
npm run grounding-probe -- --launch --include-synthetic
# Attach to an already-running app (manual --inspect=9229 setup)
npm run grounding-probe -- --port 9229 --out /tmp/probe.json
Output is keyed by test ID — see the file's header comment for the full table. Diff captures across upstream version bumps to spot behavior drift the static sweep would miss. Surfaces inside modals or popups (T22 PR toolbar, T26 preset list, T31 side chat, T32 slash menu) need the surface open at probe time — the AX-tree fingerprint is a snapshot of what's currently on screen.
Known limitations
- T04 uses xprop (no xdotool dependency; walks _NET_CLIENT_LIST + _NET_WM_PID). Works on X11 native and KDE Wayland (XWayland), not on native-Wayland sessions where the app is running through Ozone-Wayland directly. Per Decision 6, project default is X11; native-Wayland window-state queries are deferred until those tests get added.
- T17 is shallow: it intercepts dialog.showOpenDialog at the Electron main process level. The integration question "does Claude make the right portal call?" is a v2 concern; portal-level mocking via dbus-next is sketched in docs/testing/automation.md but requires displacing the running portal service or running under dbus-run-session.
- render-matrix.sh isn't here yet. sweep.sh prints a summary; the matrix.md regen step from JUnit is the next addition.
- No CI wrapper. Decision 4: the harness is invocable from CI but sweeps run from the dev box for the first ~20 tests.
Adding a test
- Pick the T## / S## from docs/testing/cases/.
- Drop src/runners/T##_short_name.spec.ts. Use the existing runners as templates; match the layer (L1 / L2) to the test's assertion shape.
- First line of the test body: skipUnlessRow(testInfo, ['KDE-W', ...]). JUnit <skipped> → matrix -, never ✗ for a row that doesn't apply.
- Tag the test with severity and surface annotations so the JUnit output carries them.
- Capture diagnostics via testInfo.attach(); these become Decision 7 "always-on" captures regardless of pass/fail. For tests that need richer state on failure, wrap your scenarios in a results-collector and attach a single JSON dump (S31's pattern).
- No fixed sleeps. Use retryUntil or Playwright's auto-wait.
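The "no fixed sleeps" rule amounts to polling a condition until it holds or a deadline passes. A minimal sketch of that pattern follows. `retryUntilSketch` and its options are hypothetical and do not claim to match the signature of the retryUntil helper in lib/retry.ts.

```typescript
interface RetryOptions {
  timeoutMs: number;
  intervalMs: number;
}

/**
 * Poll `probe` until it returns a non-null value, or reject once the
 * deadline passes. Hypothetical stand-in for a poll-until-true helper.
 */
async function retryUntilSketch<T>(
  probe: () => Promise<T | null> | T | null,
  options: RetryOptions
): Promise<T> {
  const deadline = Date.now() + options.timeoutMs;
  for (;;) {
    const value = await probe();
    if (value !== null) return value;
    if (Date.now() >= deadline) {
      throw new Error(`condition not met within ${options.timeoutMs}ms`);
    }
    // Short pause between probes; the deadline bounds the total wait.
    await new Promise<void>((resolve) => setTimeout(resolve, options.intervalMs));
  }
}
```

Unlike a fixed sleep, this returns as soon as the condition holds and fails with a clear message when it never does.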
Hooking Electron — read this before reaching for BrowserWindow
scripts/frame-fix-wrapper.js returns the electron module wrapped
in a Proxy whose get trap returns a closure-captured
PatchedBrowserWindow. Constructor-level wraps don't work — your
electron.BrowserWindow = WrappedCtor write lands on the underlying
module but the Proxy keeps returning PatchedBrowserWindow on
read, so the wrap is bypassed. The reliable hook is at the
prototype-method level:
// in inspector.evalInMain(...)
const proto = electron.BrowserWindow.prototype;
const orig = proto.loadFile;
proto.loadFile = function(filePath, ...rest) {
  // record `this` + filePath; identify popups by filePath suffix
  return orig.call(this, filePath, ...rest);
};
This captures every instance regardless of subclass identity.
Construction-time options (transparent: true, frame: false,
etc.) aren't observable through this hook — use runtime
equivalents instead (getBackgroundColor(), getContentBounds() vs getBounds(), isAlwaysOnTop()). lib/quickentry.ts is the
worked example.
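The difference between the two hook levels can be reproduced without Electron. The sketch below is a self-contained model, not the wrapper's code: `underlying`, `electronLike`, `WrappedCtor`, and the `quick-entry.html` path are hypothetical stand-ins, and `PatchedBrowserWindow` here is a toy class that only mimics the closure-captured one.

```typescript
// Toy stand-in for the wrapper's closure-captured class.
class PatchedBrowserWindow {
  loadFile(filePath: string): string {
    return `loaded:${filePath}`;
  }
}

// Underlying module object plus a Proxy whose get trap pins BrowserWindow.
const underlying: { BrowserWindow: unknown } = { BrowserWindow: class Original {} };
const electronLike = new Proxy(underlying, {
  get(target, prop, receiver) {
    if (prop === "BrowserWindow") return PatchedBrowserWindow;
    return Reflect.get(target, prop, receiver);
  },
});

// Constructor-level wrap: with no set trap, the write lands on `underlying`,
class WrappedCtor {}
electronLike.BrowserWindow = WrappedCtor;
// but reads still go through the get trap, so the wrap is bypassed.
const readBack = electronLike.BrowserWindow;

// Prototype-level hook: every instance shares the patched method.
const seenPaths: string[] = [];
const windowProto = PatchedBrowserWindow.prototype;
const origLoadFile = windowProto.loadFile;
windowProto.loadFile = function (this: PatchedBrowserWindow, filePath: string): string {
  seenPaths.push(filePath); // record the call before delegating
  return origLoadFile.call(this, filePath);
};

const loadResult = new PatchedBrowserWindow().loadFile("quick-entry.html");
```

After this runs, `readBack` is still `PatchedBrowserWindow` even though `underlying.BrowserWindow` is `WrappedCtor`, while the prototype hook has recorded the `loadFile` call.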