test(harness): add Linux compatibility test harness (#579)

Build out a Playwright-based regression-detection harness covering
the compat-matrix surfaces (KDE-W, KDE-X, GNOME, Sway, i3, Niri,
packaging formats). Adds:

- Planning + decision docs under docs/testing/ — README, matrix,
  runbook, automation, cases/ (11 case files), quick-entry-closeout
- Playwright scaffolding (config, tsconfig)
- 78 spec runners under tools/test-harness/src/runners/ — T## case-
  doc runners and S## distribution/smoke runners
- Substrate primitives in tools/test-harness/src/lib/: AX-tree
  loader (snapshotAx + waitForAxNode + axTreeToSnapshot), focus-
  shifter, eipc-registry, niri-native bridge, drag-drop bridge,
  electron-mocks, claudeai page-objects, inspector client

S03 (DEB Depends declared) and S04 (RPM Requires declared) ship
marked test.fail() — they're regression detectors for the case-doc
gap (deb.sh emits no Depends:, rpm.sh sets AutoReqProv: no), and
the expected-failure shape lets them report green on every host
until upstream packaging starts declaring runtime deps.

127 files, no runtime changes; harness is opt-in via
'cd tools/test-harness && npx playwright test'.

Co-authored-by: Claude <claude@anthropic.com>
Aaddrick
2026-05-04 23:17:37 -04:00
committed by GitHub
parent b8e1a1fc30
commit 3506c14918
128 changed files with 22882 additions and 2 deletions


@@ -2,5 +2,8 @@
 # Ref: https://github.com/codespell-project/codespell#using-a-config-file
 skip = .git*,.codespellrc
 check-hidden = true
 # ignore-regex =
-# ignore-words-list =
+# openIn — substring of `openInEditor` IPC channel name (upstream).
+# YHe — minified function identifier in build-reference anchor.
+# hel — three-char literal in QE-13 example ("hel (3) submits").
+ignore-words-list = openIn,YHe,hel

.gitignore

@@ -24,6 +24,13 @@ Thumbs.db
 # Test build output
 test-build/
+# Playwright stray output — the harness writes to
+# tools/test-harness/results/ per playwright.config.ts, but Playwright
+# also drops a default `test-results/.last-run.json` next to the cwd
+# it's invoked from. Ignore it at the repo root so an accidental run
+# from here doesn't dirty the tree.
+test-results/
 # Reference files for source inspection
 build-reference/


@@ -15,6 +15,8 @@ The [`docs/learnings/`](docs/learnings/) directory contains hard-won technical k
 - [`tray-rebuild-race.md`](docs/learnings/tray-rebuild-race.md) — why destroy + recreate on `nativeTheme` updates briefly duplicates the tray icon on KDE Plasma, and the in-place `setImage` + `setContextMenu` fast-path that avoids the SNI re-registration race
 - [`mcp-double-spawn.md`](docs/learnings/mcp-double-spawn.md) — Stdio MCPs spawn 2× when chat and Code/Agent panels are both active, root cause in upstream session managers, MCP-author workaround
 - [`linux-topbar-shim.md`](docs/learnings/linux-topbar-shim.md) — why claude.ai's in-app topbar is missing on Linux, the four gates that hide it, why the upstream `frame:false` + WCO config has unclickable buttons on X11 (Chromium-level implicit drag region), and the resolution: hybrid mode (system frame + UA-spoof shim → stacked layout, full button functionality)
+- [`test-harness-electron-hooks.md`](docs/learnings/test-harness-electron-hooks.md) — why constructor-level `BrowserWindow` wraps are silently bypassed by `frame-fix-wrapper`'s Proxy, and the prototype-method hook pattern that works (used by the Quick Entry test runners)
+- [`test-harness-ax-tree-walker.md`](docs/learnings/test-harness-ax-tree-walker.md) — five non-obvious traps in the v7 fingerprint walker after the AX-tree migration: AX-enable async lag, navigateTo-to-same-URL no-op, claude.ai's flat `dialog>button[]` lists, the `more options for X` per-row shape, and sidebar virtualization vs the lookup-failure threshold
 ## Code Style


@@ -0,0 +1,134 @@
# Test-harness AX-tree walker — non-obvious traps
Notes from the v6 → v7 fingerprint migration that switched
`tools/test-harness/explore/walker.ts` from a renderer-side
`document.querySelectorAll` IIFE to Chromium's accessibility tree
(`Accessibility.getFullAXTree` over CDP). All five gotchas below cost
a wasted live-walk to find; capturing them here so the next person
debugging a 0-entry inventory or a redrive cascade can skip the
discovery loop.
## 1. `Accessibility.enable` is async; the first `getFullAXTree` lies
Inspector clients call `target.debugger.sendCommand('Accessibility.enable')`
before the first `getFullAXTree`. Both calls return immediately, but
Chromium populates the AX tree asynchronously — the very first
read can return a tree containing only the `RootWebArea` and a
generic shell (4 nodes total) even when the DOM has hundreds of
interactive elements. The walker's existing `waitForStable` is a
DOM-mutation-quiescence observer with a 1.5s ceiling; on claude.ai's
SPA the DOM mutates constantly so `waitForStable` returns at the
ceiling without the AX tree ever catching up.
**Fix:** `waitForAxTreeStable` polls `getFullAXTree` until two
consecutive reads return the same node count. Called once before the
seed snapshot (with `minNodes: 20` to gate against the 4-node "still
loading" case), once after each `navigateTo` in `redrivePath`, and
baked into every `snapshotSurface` call (with `minNodes: 1` for the
post-click case where the tree is already populated).
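
The polling loop described in this fix could be sketched roughly as follows. The `readCount` callback, the option names, and the defaults are assumptions made for illustration, not the walker's actual signature; in the real harness the count would presumably come from a `getFullAXTree` read.

```ts
// Hypothetical sketch: readCount stands in for "read the AX tree, return its size".
type CountReader = () => Promise<number>;

interface StableOptions {
  minNodes: number;    // reject "settled" reads smaller than this
  intervalMs?: number; // delay between polls
  timeoutMs?: number;  // overall ceiling before giving up
}

async function waitForStableCount(
  readCount: CountReader,
  { minNodes, intervalMs = 100, timeoutMs = 10_000 }: StableOptions,
): Promise<number> {
  const deadline = Date.now() + timeoutMs;
  let previous = -1;
  while (Date.now() < deadline) {
    const current = await readCount();
    // Settled only when two consecutive reads agree AND the tree is big enough.
    if (current >= minNodes && current === previous) return current;
    previous = current;
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
  throw new Error(`AX tree did not settle at >= ${minNodes} nodes (last read: ${previous})`);
}
```

With `minNodes: 20`, two consecutive 4-node reads do not count as settled, which is the gate against the "still loading" case.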
**Symptom you'll see:** seed entries: 0. Walker exits with no
inventory. Stderr says `walker: AX tree settled at 4 nodes` (or
similar small number).
## 2. `navigateTo(sameUrl)` is a no-op; redrives carry prior state
The walker's `navigateTo(url)` short-circuits when `currentUrl === url`
(per the original v6 implementation). Every BFS pop re-navigates
to `startUrl` to replay the recorded path against a clean state, but
when `currentUrl` already matches `startUrl` the navigation is
skipped. Anything a prior drill left behind — open dialog, expanded
sidebar, scrolled focus, route params — carries into the next
redrive's snapshots. `clickById` then suffix-matches the requested
fingerprint against a contaminated surface and silently fails to find
elements that were absolutely on the seed surface.
**Fix:** `redrivePath` uses `reloadPage(inspector)` (which evals
`location.reload()` in the renderer) instead of
`navigateTo(startUrl)`. The reload discards the React tree and forces
a fresh mount even when the URL matches.
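
A toy model can make the difference between the two calls concrete. `FakeRenderer` below is purely illustrative and is not harness code; it only mimics the same-URL short-circuit and the state reset that a reload forces.

```ts
// Toy model only: FakeRenderer is a stand-in, not part of the harness.
class FakeRenderer {
  url = '/start';
  openDialogs: string[] = [];

  navigateTo(url: string): void {
    if (url === this.url) return; // same-URL short-circuit: nothing is reset
    this.url = url;
    this.openDialogs = [];
  }

  reload(): void {
    this.openDialogs = []; // a fresh mount discards leftover UI state
  }
}

const renderer = new FakeRenderer();
renderer.openDialogs.push('connect-apps'); // left behind by a prior drill

renderer.navigateTo(renderer.url);
const dialogsAfterNavigate = renderer.openDialogs.length; // dialog still open

renderer.reload();
const dialogsAfterReload = renderer.openDialogs.length; // clean surface
```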
**Symptom you'll see:** the first one or two BFS items succeed, then
every subsequent redrive fails with
`clickById: no element matches "<seed-id>" on current surface`, even
though the DevTools console confirms that the `<seed-id>` button is
visibly present.
## 3. claude.ai uses flat `dialog>button[]` and `complementary>button[]`, not `role=list`
The v7 plan's `isListRowChild` check assumes list rows use ARIA list
semantics (`option/listitem` inside `listbox/list`). claude.ai
exposes the connect-apps marketplace as a `dialog` with ~80 plain
`button` children (no `list` wrapper) and the cowork sidebar as a
`complementary` landmark with ~70 plain `button` children. Without
the heuristic those buttons literal-match by name → each gets a
unique stable entry → the BFS queues each individually for drilling
→ inventory bloats from 32 to 442+ entries and most drills fail
because the per-row buttons are virtualized.
**Fix:** `isListRowChild` extended in two ways. (a) `LIST_ROW_ROLES`
includes `button`, `LIST_ANCESTOR_ROLES` includes `group`. (b) A
sibling-count fallback fires when `siblingTotal >= 15` regardless of
ancestor role — sits well above realistic toolbar sizes (≤10) and
well below the smallest claude.ai marketplace (~80). Step 3
(positional fallback) also gates on `!isListRowChild` so list rows
fall through to step 4's `instance` collapse instead of fragmenting
into per-index positionals that can't fold.
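
A minimal sketch of the extended check, under stated assumptions: the `RowCandidate` shape is hypothetical, the role sets list only the roles named in this note, and the sketch assumes the sibling-count fallback still requires a row-like role.

```ts
// Hypothetical node shape; the real walker's types may differ.
interface RowCandidate {
  role: string;
  ancestorRoles: string[]; // roles of enclosing nodes
  siblingTotal: number;    // siblings under the same parent
}

const LIST_ROW_ROLES = new Set(['option', 'listitem', 'button']);
const LIST_ANCESTOR_ROLES = new Set(['listbox', 'list', 'group']);
const SIBLING_FALLBACK_MIN = 15; // above toolbar sizes (<=10), below ~80-row lists

function isListRowChild(node: RowCandidate): boolean {
  if (!LIST_ROW_ROLES.has(node.role)) return false;
  // (a) ARIA-style list semantics, extended with `group`.
  if (node.ancestorRoles.some((role) => LIST_ANCESTOR_ROLES.has(role))) return true;
  // (b) Sibling-count fallback, regardless of ancestor role.
  return node.siblingTotal >= SIBLING_FALLBACK_MIN;
}
```

Under this sketch a marketplace `button` with 80 siblings inside a `dialog` counts as a list row, while a toolbar `button` with 6 siblings does not.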
**Symptom you'll see:** dialog kind count balloons (>200). One surface
dominates the `surfaceBreakdown` query in the inventory. Each
marketplace card or sidebar row gets its own `kind: structural`
entry with a slugified product name in the id-tail.
## 4. The `more options for X` per-row trigger needs its own shape
Cowork sidebar rows have a "⋮" menu next to each session whose
aria-label is `More options for <session title>`. These don't match
the `cowork-session` shape (which gates on status prefix), so even
after `cowork-session` collapsed the session list, the sibling
"More options for" buttons still emitted individually. Same for any
future per-row action button claude.ai adds.
**Fix:** new `INSTANCE_SHAPES` entry `row-more-options` with regex
`/^More options for /` and matching pattern. Generic enough to cover
any per-row trigger that follows the `<verb> for <row title>` shape.
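
As a rough illustration, a shape table and matcher might look like the sketch below. Only the `row-more-options` name and its regex come from this note; the `InstanceShape` fields and the `matchInstanceShape` helper are hypothetical.

```ts
// Hypothetical structure; only the shape name and regex are taken from the note.
interface InstanceShape {
  name: string;
  pattern: RegExp;
}

const INSTANCE_SHAPES: InstanceShape[] = [
  { name: 'row-more-options', pattern: /^More options for / },
];

// Returns the first shape whose pattern matches the accessible name, or null.
function matchInstanceShape(accessibleName: string): string | null {
  const hit = INSTANCE_SHAPES.find((shape) => shape.pattern.test(accessibleName));
  return hit ? hit.name : null;
}
```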
**Symptom you'll see:** after fixing (1)-(3), a fresh wave of
redrive failures all matching `more-options-for-X` slugs.
## 5. Sidebar virtualization causes structural redrive misses; bump the threshold
claude.ai's cowork sidebar appears to virtualize the session list:
each fresh page load exposes a slightly different subset of sessions
in the AX tree (subset, not just ordering — actually different
membership). The walker captures session N at seed time but on
redrive after `reloadPage` session N may not be in the tree. Each
miss counts toward `MAX_CONSECUTIVE_LOOKUP_FAILURES`, and a stretch
of 25+ consecutive cowork-row redrives can blow through the original
threshold without the renderer being meaningfully wedged.
**Fix:** threshold bumped 25 → 75. The timeout counter (still 5
strikes) gates against actual renderer hangs; the lookup-failure
counter is more about "discovered DOM has drifted from seed", and on
a virtualized list a generous threshold is correct. Subtree pruning
(already in place) keeps the bursts from compounding by dropping
queue items whose path shares the failed step's prefix.
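
The two mechanisms in this fix, the consecutive-failure counter and subtree pruning, can be sketched as below. The `QueueItem` shape and helper names are assumptions for illustration; only the threshold value of 75 comes from this note.

```ts
const MAX_CONSECUTIVE_LOOKUP_FAILURES = 75;

// Hypothetical queue item: the recorded click path from the seed surface.
interface QueueItem {
  path: string[];
}

function hasPrefix(path: string[], prefix: string[]): boolean {
  return prefix.length <= path.length && prefix.every((step, i) => path[i] === step);
}

// Drop every queued item that would have to replay the failed step.
function pruneSubtree(queue: QueueItem[], failedPrefix: string[]): QueueItem[] {
  return queue.filter((item) => !hasPrefix(item.path, failedPrefix));
}

class LookupFailureCounter {
  private consecutive = 0;

  recordSuccess(): void {
    this.consecutive = 0; // any successful redrive resets the streak
  }

  // Returns true when the walk should abort.
  recordFailure(): boolean {
    this.consecutive += 1;
    return this.consecutive >= MAX_CONSECUTIVE_LOOKUP_FAILURES;
  }
}
```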
**Symptom you'll see:** the walker aborts mid-walk with
`25 consecutive redrive lookup failures` and the failed ids all
share a common ariaPath prefix (`root.complementary.button-by-name.X`).
## Driver: prefer `walk-isolated.ts` over `explore walk`
`npm run explore:walk` connects to whatever Node inspector is on
:9229 — i.e. the host Claude Desktop the user is currently using.
That mutates the host profile (visited surfaces, navigation history,
route changes) and races with the human at the keyboard.
`tools/test-harness/explore/walk-isolated.ts` mirrors what H05 / U01
do: kills any running host instance, copies auth into a tmpdir
(`createIsolation({ seedFromHost: true })`), spawns a fresh Electron
with isolated `XDG_CONFIG_HOME`, attaches the inspector via
`SIGUSR1`, runs the walk, tears down. Same flag set as
`explore walk` plus `--no-seed` for the rare case you want a
fresh-sign-in run. Use it.


@@ -0,0 +1,99 @@
# Hooking Electron from the test harness
Why constructor-level `BrowserWindow` wraps don't work in this
codebase, and the prototype-method hook that does.
## TL;DR
The test harness attaches a Node inspector at runtime (see
[`docs/testing/automation.md`](../testing/automation.md#the-cdp-auth-gate-and-the-runtime-attach-workaround-that-beats-it))
and from there can evaluate arbitrary JS in the main process. To
observe BrowserWindow construction (e.g. find the Quick Entry popup
ref, capture construction-time options), the natural-feeling
approach is to wrap `electron.BrowserWindow`:
```js
const electron = process.mainModule.require('electron');
const Orig = electron.BrowserWindow;
electron.BrowserWindow = function(opts) {
  // record opts...
  return new Orig(opts);
};
```
**This is silently bypassed.** `scripts/frame-fix-wrapper.js`
returns the electron module wrapped in a `Proxy`; the Proxy's
`get` trap returns a closure-captured `PatchedBrowserWindow`
class. Reads of `electron.BrowserWindow` go through the trap and
always return `PatchedBrowserWindow`, regardless of what was
written to the underlying module. Writes succeed (Reflect.set on
the target) but reads ignore them. Upstream code calling
`new hA.BrowserWindow(opts)` constructs from `PatchedBrowserWindow`,
so your wrap is never invoked and your registry stays empty.
The reliable hook is at the **prototype-method level**:
```js
const electron = process.mainModule.require('electron');
const proto = electron.BrowserWindow.prototype;
const origLoadFile = proto.loadFile;
proto.loadFile = function(filePath, ...rest) {
  // every BrowserWindow instance reaches this, regardless of
  // which subclass constructed it
  return origLoadFile.call(this, filePath, ...rest);
};
```
This is what `tools/test-harness/src/lib/quickentry.ts:installInterceptor`
does.
## Why prototype-level works through the Proxy
`electron.BrowserWindow` returns `PatchedBrowserWindow`, which
`extends` the original `BrowserWindow` class. Both share the
underlying Electron-native prototype chain via `extends`. Setting
`PatchedBrowserWindow.prototype.loadFile = wrappedFn` shadows the
inherited method on every instance — `Patched`-constructed,
frame-fix-constructed, plain. There's no Proxy in front of
`PatchedBrowserWindow.prototype`, so the assignment sticks and is
visible to all subsequent `instance.loadFile(...)` calls.
`loadFile` and `loadURL` are reasonable identification points
because every BrowserWindow that displays content calls one of
them shortly after construction. The file path / URL is a stable
upstream-controlled string (no minification — these are file paths
to bundle assets), making it a durable identifier across releases.
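
The mechanics can be reproduced without Electron. The sketch below is a stand-alone model, not the actual `frame-fix-wrapper` code: `BaseWindow`, `PatchedWindow`, and `realModule` are stand-ins chosen for illustration, and the Proxy has only a `get` trap, as described above.

```ts
// Stand-alone model of the Proxy + closure; none of these names are Electron's.
class BaseWindow {
  loadFile(path: string): string {
    return `loaded:${path}`;
  }
}
class PatchedWindow extends BaseWindow {}

const realModule: Record<string, unknown> = { BrowserWindow: BaseWindow };
const wrapped = new Proxy(realModule, {
  get(target, prop, receiver) {
    if (prop === 'BrowserWindow') return PatchedWindow; // closure-captured class
    return Reflect.get(target, prop, receiver);
  },
});

// Constructor-level wrap: the write lands on the target, reads ignore it.
let ctorWrapCalls = 0;
wrapped.BrowserWindow = function ctorWrap() {
  ctorWrapCalls += 1;
};
const Ctor = wrapped.BrowserWindow as typeof PatchedWindow;
const readStillPatched = Ctor === PatchedWindow;

// Prototype-level hook: no Proxy sits in front of the prototype.
const seenPaths: string[] = [];
const proto = Ctor.prototype;
const origLoadFile = proto.loadFile;
proto.loadFile = function (this: BaseWindow, path: string): string {
  seenPaths.push(path);
  return origLoadFile.call(this, path);
};

const loadResult = new Ctor().loadFile('quick-entry.html');
```

After this runs, the constructor wrap has never been called, while the prototype hook has recorded the one `loadFile` path.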
## Why constructor-level *can* work elsewhere
If frame-fix-wrapper is removed (or stops returning a Proxy), the
naïve constructor wrap would work. Watch for this: an upstream
fork that adopts `BaseWindow` over `BrowserWindow`, or a
build-time replacement of frame-fix-wrapper, would change the
hook surface. The prototype-method approach survives both.
## What can't be observed at the prototype level
Construction-time options (`transparent: true`, `frame: false`,
`skipTaskbar: true`, etc.) are consumed by the native side
during `super(options)` and not stored on the instance in a
reflective form. The harness reads runtime equivalents instead:
- `transparent` → `getBackgroundColor() === '#00000000'`
- `frame: false` → `getBounds().width === getContentBounds().width`
(frameless windows have equal frame and content bounds)
- `alwaysOnTop` → `isAlwaysOnTop()` (note: the popup sets this
via `setAlwaysOnTop()` *after* construction at
`index.js:515399`, so this is the only viable read regardless of
hook approach)
`skipTaskbar` has no public getter; if a test needs it, capture
it at the prototype level by hooking a method that takes the same
options shape, or accept that this signal is unobservable
post-construction.
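
The runtime equivalents listed above could be gathered in one helper along these lines. This is a sketch: `WindowLike` is a hypothetical structural type covering only the four getters used here, and `inferWindowTraits` is not a harness function.

```ts
// Hypothetical structural type: just the getters this sketch needs.
interface Rect {
  x: number;
  y: number;
  width: number;
  height: number;
}

interface WindowLike {
  getBackgroundColor(): string;
  getBounds(): Rect;
  getContentBounds(): Rect;
  isAlwaysOnTop(): boolean;
}

interface WindowTraits {
  transparent: boolean;
  frameless: boolean;
  alwaysOnTop: boolean;
}

function inferWindowTraits(win: WindowLike): WindowTraits {
  return {
    // Fully transparent background stands in for `transparent: true`.
    transparent: win.getBackgroundColor() === '#00000000',
    // Frameless windows have equal frame and content widths.
    frameless: win.getBounds().width === win.getContentBounds().width,
    alwaysOnTop: win.isAlwaysOnTop(),
  };
}
```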
## See also
- [`tools/test-harness/src/lib/quickentry.ts`](../../tools/test-harness/src/lib/quickentry.ts) — `installInterceptor()` worked example
- [`scripts/frame-fix-wrapper.js`](../../scripts/frame-fix-wrapper.js) — the Proxy + closure
- [`tools/test-harness/src/lib/inspector.ts`](../../tools/test-harness/src/lib/inspector.ts) — how the harness gets main-process JS access in the first place
- [`docs/testing/automation.md`](../testing/automation.md) — overall harness architecture

docs/testing/README.md

@@ -0,0 +1,111 @@
# Linux Compatibility Testing
*Last updated: 2026-05-03*
This directory holds the manual test plan for the Linux fork of Claude Desktop. The structure is designed for human readers today and scripted runners tomorrow.
## Layout
| Folder / file | Purpose |
|---------------|---------|
| [`matrix.md`](./matrix.md) | **The dashboard.** Cross-environment results table + per-section env-specific status snapshots. Single source of truth for test status. |
| [`runbook.md`](./runbook.md) | How to run a sweep: VM setup, diagnostic capture, status update workflow, severity guidance. |
| [`cases/`](./cases/) | Functional test specs grouped by feature surface. Stable IDs: `T###` cross-env, `S###` env-specific. |
## Environment key
| Abbrev | Distro | DE | Display server |
|--------|--------|-----|----------------|
| KDE-W | Fedora 43 | KDE Plasma | Wayland |
| KDE-X | Fedora 43 | KDE Plasma | X11 |
| GNOME | Fedora 43 | GNOME | Wayland |
| Ubu | Ubuntu 24.04 | GNOME | Wayland |
| Sway | Fedora 43 | Sway | Wayland (wlroots) |
| i3 | Fedora 43 | i3 | X11 |
| Niri | Fedora 43 | Niri | Wayland (wlroots) |
| Hypr-O | OmarchyOS | Hyprland | Wayland (wlroots) |
| Hypr-N | NixOS | Hyprland | Wayland (wlroots) |
Status legend: `✓` pass · `✗` fail · `🔧` mitigated · `?` untested · `-` N/A
Cells include linked issue/PR numbers when relevant — e.g. `✗ #404` or `🔧 #406`. A bare `✗` means the failure is verified but no tracking issue is filed yet.
## Severity tiers
Each test is tagged with one of:
| Tier | Meaning | Sweep cadence |
|------|---------|---------------|
| **Smoke** | Release-gate. Must pass before any tag is cut. | Every release tag, on KDE-W + one wlroots row |
| **Critical** | Regression-blocker. Failure on any supported environment blocks the release. | Every release tag, on every active row |
| **Should** | Important but not blocking. Track as bugs, fix before next stable. | Quarterly + on demand |
| **Could** | Edge cases, nice-to-have. | On demand only |
## Smoke set
The minimum set that gates a release. Run on **KDE-W** (daily-driver) plus **Hypr-N** (clean wlroots). Sweep target: ~20 minutes.
| ID | Surface | One-line check |
|----|---------|----------------|
| [T01](./cases/launch.md#t01--app-launch) | Launch | App opens; main window renders within ~10s |
| [T03](./cases/tray-and-window-chrome.md#t03--tray-icon-present) | Tray | Tray icon appears; click toggles window |
| [T04](./cases/tray-and-window-chrome.md#t04--window-decorations-draw) | Window | OS-native frame draws and responds |
| [T05](./cases/shortcuts-and-input.md#t05--url-handler-opens-claudeai-links-in-app) | Input | `xdg-open https://claude.ai/...` opens in-app |
| [T07](./cases/tray-and-window-chrome.md#t07--in-app-topbar-renders--clickable) | Window | Hybrid topbar renders, every button clicks |
| [T08](./cases/tray-and-window-chrome.md#t08--hide-to-tray-on-close) | Window | Close button hides to tray, doesn't quit |
| [T11](./cases/extensibility.md#t11--plugin-install-anthropic--partners) | Extensibility | Anthropic & Partners plugin install completes |
| [T15](./cases/code-tab-foundations.md#t15--sign-in-completes-via-browser-handoff) | Auth | Sign-in completes via `xdg-open` browser handoff |
| [T16](./cases/code-tab-foundations.md#t16--code-tab-loads) | Code tab | Code tab loads (no 403, no blank screen) |
| [T17](./cases/code-tab-foundations.md#t17--folder-picker-opens) | Code tab | Folder picker opens via portal/native chooser |
## Test corpus snapshot
| Bucket | Count |
|--------|-------|
| Cross-environment functional (`T###`) | 39 |
| Environment-specific functional (`S###`) | 37 |
| UI surfaces inventoried | 10 |
| Total functional tests | 76 |
For detailed status by ID, see [`matrix.md`](./matrix.md).
## Automation status
Automation is partially landed. The harness lives at
[`tools/test-harness/`](../../tools/test-harness/) — twenty Playwright
specs wired (T01, T03, T04, T17, S09, S12, S29-S37, plus four H-prefix
self-tests), thirteen passing on KDE-W and six skipping cleanly per
spec intent. See [`tools/test-harness/README.md`](../../tools/test-harness/README.md)
for the live status table, [`automation.md`](./automation.md) for
architectural decisions, and the SIGUSR1 / runtime-attach pattern that
bypasses the app's CDP auth gate.
### Grounding sweep + probe
Separate from the test sweep:
[`runbook.md` "Grounding sweep"](./runbook.md#grounding-sweep) covers
the workflow for verifying case docs themselves against the live
build on every upstream version bump — static anchor pass plus a
runtime probe ([`tools/test-harness/grounding-probe.ts`](../../tools/test-harness/grounding-probe.ts))
that captures IPC handler registry, accelerator state, autoUpdater
gate, AX-tree fingerprint, and other claims static analysis can't
disambiguate. Anchor and drift conventions live in
[`cases/README.md`](./cases/README.md#anchor-scope).
The structure remains automation-friendly for new tests:
1. **Stable test IDs.** `T01`-`T39` and `S01`-`S37` won't move. New tests append. Sequential, not semantic.
2. **Standardized test bodies.** Every functional test has `Severity`, `Steps`, `Expected`, `Diagnostics on failure`, and `References` sections. The Steps and Diagnostics fields are scripted-runner-shaped.
3. **Per-element UI checklists.** Each UI surface file lists interactive elements in a table — every row is a candidate `webContents.executeJavaScript` / `xprop` / DBus assertion.
4. **Severity-driven sweeps.** Tests with a `runner:` field execute via [`tools/test-harness/orchestrator/sweep.sh`](../../tools/test-harness/orchestrator/sweep.sh); JUnit XML lands in `results/results-${ROW}-${DATE}/junit.xml`. Tests without a `runner:` continue to run manually.
For tests that don't have a runner yet, status updates land in [`matrix.md`](./matrix.md) by hand after each manual sweep. For tests that do, the automation invocation is the source of truth — see [`runbook.md`](./runbook.md#automated-runs).
## Conventions
- **One PR per sweep result, not per cell change.** Bundle a full row update into a single commit titled `test: KDE-W sweep $(date +%F)`. Reduces matrix-merge noise.
- **Tested-version pin.** Every status update should mention the `claude-desktop` upstream version + the project version (`v1.3.x+claude...`) in the commit. Otherwise a `✓` from six months ago looks current.
- **Diagnostics on failure are mandatory.** Don't file `✗` without the captures listed in the test's `Diagnostics on failure` block. The runbook covers how to capture each.
- **Issue links go inline.** Status cells link directly to the relevant issue/PR.
See [`runbook.md`](./runbook.md) for the full mechanics.

docs/testing/automation.md

@@ -0,0 +1,439 @@
# Automation Plan
*Last updated: 2026-04-30*
> **Status:** Direction agreed; first vertical slice scaffolded at
> [`tools/test-harness/`](../../tools/test-harness/) covering T01, T03, T04,
> T17 on KDE-W. The [Decisions](#decisions) table captures the calls
> already made; [Still open](#still-open) is the short list of things
> genuinely undecided. This file will fold into [`README.md`](./README.md)
> and [`runbook.md`](./runbook.md) once the harness has run a few real
> sweeps.
The [`README.md`](./README.md) automation roadmap is one paragraph. This file
is the longer version — what shape the harness takes, which tools fit which
tests, which anti-patterns to design against, and what to build first.
## Why this exists
The 67 tests in [`cases/`](./cases/) already have stable IDs and
standardized bodies. That structure is unusually friendly to
automation — but only if the harness is shaped to match the corpus,
rather than the other way around. Three things make that non-trivial:
1. The tests aren't homogeneous. Some are pure-renderer (Code tab), some are
native-OS-level (tray, autostart, URL handler), some are visual/UX checks
that probably stay manual forever.
2. The matrix is nine environments, four display servers, and two package
formats. Input injection on Wayland is genuinely different from X11, and
X11 is the project's default backend (Wayland-native is opt-in until
portal coverage matures across compositors).
3. Many failures are environment-specific by construction (mutter XWayland
key-grab, BindShortcuts on Niri, Omarchy Ozone-Wayland env exports). A
single "run everything everywhere" harness will mis-skip those.
## Decisions
| # | Decision | Rationale |
|---|----------|-----------|
| 1 | **Single language: TypeScript.** Every runner is `.ts`; OS tools are shelled out via `child_process` and wrapped as TS helpers. Python only as a last-resort escape hatch for AT-SPI cases that resist portal mocking. | Playwright Electron is JS-native (post-Spectron); `dbus-next` covers DBus end-to-end; portal mocking removes the dogtail dependency for most native-dialog tests. Three-language overhead doesn't pay back. |
| 2 | **Harness location: `tools/test-harness/`.** Sibling to `scripts/`. | Keeps `docs/testing/` documentation-only; matches the project's existing `tools/` / `scripts/` split. |
| 3 | **VM images: Packer for imperative distros + Nix flake for `Hypr-N`.** | Packer builds golden snapshots that boot fast and rebuild as code; Nix flake handles NixOS natively without a second wrapper. Vagrant's per-boot provisioning model is the wrong tradeoff for hermetic per-test snapshots. |
| 4 | **No CI infrastructure initially.** Harness is invocable from CI (orchestrator is a bash script with `ROW`, `ARTIFACT`, `OUTPUT_DIR` env vars), but sweeps run manually from the dev box for the first ~20 tests. CI wrapper comes after there's signal on which tests are stable enough to run unattended. | Avoids weeks of GHA / nested-KVM debugging for tests that aren't ready to be unattended. The bash orchestrator is the same code either way. |
| 5 | **Selectors: semantic locators only (`getByRole`, `getByLabel`, `getByText`).** No CSS classes against minified renderer output. No proactive `data-testid` injection patch. Escalate per-test only when a specific test proves unstable: first ask upstream for a stable `data-testid`; only carry an `app-asar.sh` patch if upstream declines. | Building selector-injection infrastructure up front is a guess at where rot will happen. Modern React apps usually have enough ARIA roles and visible text for `getByRole`/`getByText` to be durable. Measure before patching. |
| 6 | **X11-default verification is Smoke. Wayland-native characterization is Should.** Add a Smoke test asserting the launcher log shows X11/XWayland selected on each row (the project's release-gate behavior). Add per-row Should tests characterizing what happens if Electron's default Wayland selection is allowed — these are informational, not release-gating. | The project chose X11 default because portal `GlobalShortcuts` coverage is patchy. The new Wayland-default tests exist to map that landscape, not to gate releases on it. |
| 7 | **Diagnostic retention: last 10 greens + all reds, on `main` only.** Captures `--doctor`, launcher log, screenshot every run. Reds retained indefinitely; greens rotate. | Cheap regression-bisect baseline; bounded storage; reds are the things you actually need to look at six weeks later. |
| 8 | **JUnit XML lives as workflow-run artifacts.** Each sweep run uploads `results-${ROW}-${DATE}.tar.zst` containing JUnit + diagnostic bundle. Default 90-day retention, extend to 365 if needed. The matrix-regen step downloads the latest run's artifacts and updates `matrix.md` in a PR. | Zero new infrastructure; GH provides storage, lifecycle, auth. If cross-run analytics later require longer history, promote to a separate `claude-desktop-debian-test-history` repo *then* — not before there's signal on what to keep. |
## The three layers
Looking at the corpus, every test falls into one of three buckets, and each
bucket maps to a different shape of TS code (not a different language):
| Layer | What it covers | Implementation |
|-------|----------------|----------------|
| **L1 — Renderer** | Code tab, plugin install, settings, prompt area, slash menu, side chat | `playwright-electron` (`_electron.launch()`) directly |
| **L2 — Native / OS** | Tray (DBus), window decorations, URL handler (`xdg-open`), autostart, `--doctor`, multi-instance, hide-to-tray, native file picker (T17) | TS + `dbus-next` for DBus; `child_process` shell-outs wrapped as TS helpers (`xprop`, `wlr-randr`, `swaymsg`, `niri msg`, `pgrep`, `ydotool`); `dbus-next`-driven portal mocking for native-dialog tests |
| **L3 — Manual** | "Icon is crisp on HiDPI", drag-and-drop feel, T28 catch-up after suspend (real wall-clock), subjective UX checks | Human eyes; capture in [`runbook.md`](./runbook.md) sweep loop |
The `runner:` field [`README.md`](./README.md) hints at is the right unit.
One TS file per test under `tools/test-harness/runners/`, free to mix L1 and
L2 calls within a single test file. Tests without a `runner:` field stay
manual indefinitely — that's a feature, not a TODO.
## Architecture
```
host (orchestrator)               per-row VM (or Nobara host for KDE-W)
─────────────────────             ──────────────────────────────────────
tools/sweep.sh          ssh →     tools/test-harness/run.ts
                                  ├── L1 runners (playwright-electron)
                                  ├── L2 runners (dbus-next + shell-outs)
                                  └── junit.xml + diagnostic bundle
tools/render-matrix.sh  ← scp     /tmp/results-${ROW}-${DATE}.tar.zst
matrix.md (regenerated)
```
The orchestrator is dumb: copy artifact in, kick the harness, copy results
out. Per-row variation lives in `tools/test-images/${ROW}/` (Packer recipe +
cloud-init / autoinstall, or a Nix flake for `Hypr-N`). The harness inside
each VM is the same checked-in TS code, branched on `XDG_CURRENT_DESKTOP` /
`XDG_SESSION_TYPE` for env-specific helpers.
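
One way the env-based branching might look is sketched below. The function names and the tool mapping are assumptions for illustration, and the desktop strings compositors actually export should be checked per row. The environment is passed in as a plain record so the logic can be exercised without touching `process.env`.

```ts
type SessionType = 'wayland' | 'x11' | 'unknown';

interface DesktopEnv {
  desktops: string[]; // lower-cased entries of XDG_CURRENT_DESKTOP
  session: SessionType;
}

function detectDesktopEnv(env: Record<string, string | undefined>): DesktopEnv {
  // XDG_CURRENT_DESKTOP is a colon-separated list, e.g. "ubuntu:GNOME".
  const desktops = (env.XDG_CURRENT_DESKTOP ?? '')
    .split(':')
    .filter(Boolean)
    .map((name) => name.toLowerCase());
  const raw = (env.XDG_SESSION_TYPE ?? '').toLowerCase();
  const session: SessionType = raw === 'wayland' || raw === 'x11' ? raw : 'unknown';
  return { desktops, session };
}

// Hypothetical mapping from environment to the window-listing tool a helper shells out to.
function toplevelTool(e: DesktopEnv): 'swaymsg' | 'niri msg' | 'xprop' | null {
  if (e.desktops.includes('sway')) return 'swaymsg';
  if (e.desktops.includes('niri')) return 'niri msg';
  if (e.session === 'x11') return 'xprop';
  return null; // no known tool for this row; the caller decides whether to skip
}
```

A runner would call `detectDesktopEnv(process.env)` once and pass the result to its helpers.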
Result format pivots on **JUnit XML** — well-trodden ground. Several actions
already exist that turn JUnit into Markdown summaries
([`junit-to-md`](https://github.com/davidahouse/junit-to-md), the
[Test Summary Action](https://github.com/marketplace/actions/junit-test-dashboard)).
The matrix-regen step is just "download artifact, merge per-row JUnit, render
cells, commit a PR."
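
The "merge per-row results, render cells" step could be sketched as below. The `CaseResult` shape is hypothetical (real input would be parsed from JUnit XML), and mapping a skipped test to `-` is an assumption; the symbols themselves follow the status legend in `README.md`.

```ts
type Outcome = 'passed' | 'failed' | 'skipped';
type Cell = '✓' | '✗' | '-' | '?';

// Hypothetical flattened record, one per test case per environment row.
interface CaseResult {
  id: string;  // e.g. "T01"
  row: string; // e.g. "KDE-W"
  outcome: Outcome;
}

function toCell(outcome: Outcome | undefined): Cell {
  switch (outcome) {
    case 'passed':
      return '✓';
    case 'failed':
      return '✗';
    case 'skipped':
      return '-'; // assumption: a clean skip renders as N/A
    default:
      return '?'; // no result recorded for this row
  }
}

function renderMatrixRows(results: CaseResult[], ids: string[], rows: string[]): string[] {
  const byKey = new Map<string, Outcome>();
  for (const r of results) byKey.set(`${r.id}/${r.row}`, r.outcome);
  return ids.map((id) => {
    const cells = rows.map((row) => toCell(byKey.get(`${id}/${row}`)));
    return `| ${id} | ${cells.join(' | ')} |`;
  });
}
```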
### Why not drive Playwright over the wire?
The obvious sketch is "orchestrator on the host opens a CDP / DevTools port
on each VM and runs the whole suite from one place." It looks clean but has
real costs:
- CDP over network is fragile; port forwards are a constant footgun on
flaky links.
- Doesn't help with L2 at all — DBus calls, `xprop`, `pgrep`, file-system
probes still have to run in-VM.
- You'd end up maintaining two transports anyway, so the centralization
win evaporates.
In-VM Playwright via `_electron.launch()` is the [official Electron
recommendation](https://www.electronjs.org/docs/latest/tutorial/automated-testing)
since Spectron was archived in Feb 2022. No remote debug port needed; it
spawns Electron directly and gives you a context.
## Toolchain choices per layer
### L1 — `playwright-electron`
- Spawn via `_electron.launch({ args: ['main.js'] })` — no `--remote-debugging-port`.
- Gate `nodeIntegration: true` and `contextIsolation: false` behind
`process.env.CI === '1'` so tests get full main-process access without
weakening production security. (Electron docs explicitly recommend this
pattern.)
- **Locator policy: semantic only.** `getByRole`, `getByLabel`,
`getByText`, `getByPlaceholder`. No CSS selectors against minified class
names — they rot every upstream release. No `data-testid` infrastructure
built up front; if a specific test proves unstable, first ask upstream
for a stable `data-testid`, only carry an `app-asar.sh` patch as a last
resort.
- Use Playwright auto-wait. No fixed `sleep`s anywhere in the harness.
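
The `CI` gate from the second bullet could be expressed as a small pure function, sketched here. `webPreferencesFor` is a hypothetical helper name; the only behavior taken from the text is that `nodeIntegration: true` and `contextIsolation: false` apply solely when `CI` is exactly `'1'`.

```ts
interface GatedWebPreferences {
  nodeIntegration: boolean;
  contextIsolation: boolean;
}

// Hypothetical helper: relaxes isolation only when CI is exactly '1'.
function webPreferencesFor(env: Record<string, string | undefined>): GatedWebPreferences {
  const underTest = env.CI === '1';
  return {
    nodeIntegration: underTest,
    contextIsolation: !underTest,
  };
}
```

In the main process the result would be spread into the `webPreferences` option at window construction, with `process.env` as the argument.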
### L2 — `dbus-next` + wrapped shell-outs
The unifying observation: most of L2 is either DBus (which `dbus-next`
handles natively from TS) or short subprocess invocations of OS tools
(which `child_process.exec()` handles, wrapped as a typed TS helper). No
parallel bash test scripts; the test code reads as TS.
- **DBus everywhere it applies.**
[`dbus-next`](https://github.com/dbusjs/node-dbus-next) is actively
maintained, has TypeScript typings, and is designed for Linux desktop
integration. Replaces `gdbus call ...` invocations:
- Tray / SNI state queries (`org.kde.StatusNotifierWatcher`,
`org.freedesktop.DBus`).
- Portal availability checks (`org.freedesktop.portal.Desktop`).
- Suspend inhibitor inspection (`org.freedesktop.login1`).
- AT-SPI introspection where actually needed
(`org.a11y.atspi.*`).
- **Compositor / window-manager state via shell-out helpers.** No good
Node bindings exist for `xprop`, `wlr-randr`, `swaymsg`, or `niri msg`,
but invoking them from `child_process.exec()` inside a TS helper is
perfectly fine, and the test code stays unified:
```ts
// tools/test-harness/lib/wm.ts
export async function listToplevels(): Promise<Toplevel[]> { ... }
```
Each helper is a thin typed wrapper; the test reads as TS, not
bash-with-extra-steps.
- **Native dialogs (T17 folder picker, etc.) via portal mocking.** The
`org.freedesktop.portal.FileChooser` interface is just DBus. For tests
that exercise the *integration* (does Claude make the right portal call
and handle the result?) — which is what T17 actually tests — register
a mock backend over `dbus-next`, intercept the call, return a canned
path. No real dialog ever renders. This is both faster and a more
honest unit of test than driving a real chooser.
- **AT-SPI escape hatch.** For the rare test where portal mocking isn't
enough (driving an *actual* GTK/Qt dialog tree), the fallback is a
small Python [`dogtail`](https://pypi.org/project/dogtail/) script
invoked via `child_process.exec()` — same shape as the other shell-out
helpers, just Python on the other end. Today, T17 is the only test
that might need this; portal mocking probably covers it. We adopt
Python only when a specific test forces it, not speculatively.
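As an illustration of the shell-out helper shape described above, here is a sketch that wraps `xprop -root _NET_CLIENT_LIST`. The function names are hypothetical, and the sketch swaps in `execFile` instead of `exec` so that no shell quoting is involved; the parser is kept separate so it can be tested without an X server.

```ts
import { execFile } from "node:child_process";
import { promisify } from "node:util";

const execFileAsync = promisify(execFile);

// Parses the output of `xprop -root _NET_CLIENT_LIST`, which looks like:
//   _NET_CLIENT_LIST(WINDOW): window id # 0x1e00003, 0x3400006
// Returns an empty list when the property is absent or has no entries.
export function parseClientList(stdout: string): string[] {
  const hash = stdout.indexOf("#");
  if (hash === -1) return [];
  return stdout
    .slice(hash + 1)
    .split(",")
    .map((id) => id.trim())
    .filter((id) => /^0x[0-9a-f]+$/i.test(id));
}

// Hypothetical typed wrapper (not the harness's wm.ts): runs xprop and
// returns the X11 window ids the window manager currently tracks.
export async function listX11WindowIds(): Promise<string[]> {
  const { stdout } = await execFileAsync("xprop", ["-root", "_NET_CLIENT_LIST"]);
  return parseClientList(stdout);
}
```

Splitting the parser from the subprocess call keeps the OS dependency at the edge of the helper, which is one way to keep the rest of the test code plain TS.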
### Input injection — `ydotool` now, `libei` next
- [`ydotool`](https://github.com/ReimuNotMoe/ydotool) goes through
`/dev/uinput`, so it works on both X11 and Wayland. Needs root or a
`uinput` group; not a problem inside a test VM. Invoked via the same
`child_process` shell-out pattern — `tools/test-harness/lib/input.ts`.
- Portal-grabbed shortcuts (T06, S11, S14) `ydotool` **cannot** trigger.
That's a kernel-vs-compositor boundary issue, not a tool gap. Those
tests stay manual until libei is widely available.
- The future-correct path is
[`libei`](https://www.phoronix.com/news/LIBEI-Emulated-Input-Wayland) +
the `RemoteDesktop` portal via `libportal`. KDE, GNOME, and wlroots
are all moving there. Worth a roadmap note that the shortcut tests
have a path to automation — just not today.
### VM lifecycle
- One image-build recipe per row in `tools/test-images/${ROW}/`. Packer
for the imperative distros (Fedora 43, Ubuntu 24.04, OmarchyOS, and
manual-install rows like i3 / Niri); Nix flake for `Hypr-N`.
- Rebuild nightly or per release-tag sweep — don't `apt update` /
`dnf update` inside a test run; mirrors hiccup, tests go red for the
wrong reason.
- Each test gets a hermetic `XDG_CONFIG_HOME` / `CLAUDE_CONFIG_DIR`
(S19 is already the test-isolation primitive). No shared state
between tests.
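The hermetic-directory rule can be sketched as a small fixture helper. This is an illustration under assumptions: `makeHermeticEnv` and the directory layout are hypothetical, and the real isolation primitive is whatever S19 specifies.

```ts
import { mkdirSync, mkdtempSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Hypothetical fixture helper: creates a fresh directory tree per call and
// returns an environment that points the app at it. The "config" and
// "claude" subdirectory names are assumptions made for this sketch.
export function makeHermeticEnv(
  base: Record<string, string | undefined> = process.env,
): Record<string, string | undefined> {
  const root = mkdtempSync(join(tmpdir(), "harness-"));
  const xdgConfigHome = join(root, "config");
  const claudeConfigDir = join(root, "claude");
  mkdirSync(xdgConfigHome);
  mkdirSync(claudeConfigDir);
  // Later keys win, so inherited values of these two variables are replaced.
  return { ...base, XDG_CONFIG_HOME: xdgConfigHome, CLAUDE_CONFIG_DIR: claudeConfigDir };
}
```

Passing the returned object as the `env` of the spawned app gives each test its own configuration tree, so a leftover lock file from one test cannot reach the next.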
## The CDP auth gate (and the runtime-attach workaround that beats it)
*Discovered during the first KDE-W run-through; resolved by routing
through the in-app debugger menu's code path.*
The shipped `index.pre.js` contains an authenticated-CDP gate:
```js
uF(process.argv) && !qL() && process.exit(1);
```
`uF(argv)` matches **`--remote-debugging-port`** or
**`--remote-debugging-pipe`** on argv. `qL()` validates an ed25519-signed
token in `CLAUDE_CDP_AUTH` (signed payload
`${timestamp_ms}.${base64(userDataDir)}`, 5-minute TTL) against a hardcoded
public key. If the gate flag is on argv and a valid token isn't in env,
the app exits with code 1 right after `frame-fix-wrapper` completes.
Playwright's `_electron.launch()` injects `--remote-debugging-port=0`, and
`chromium.connectOverCDP()` only works against an app started with a
remote-debugging port, so both routes trigger the gate. The signing key is
held upstream; we can't forge tokens.
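Because the gate keys on argv, a launch helper can guard against tripping it by accident. The sketch below models only the documented flag check; it is not the upstream `uF` implementation, and whether upstream matches on an exact flag or a prefix is an assumption here.

```ts
// Flags the document says the gate looks for. The matching rule (exact flag
// or "flag=value") is an assumption, not taken from the upstream code.
const GATED_FLAGS = ["--remote-debugging-port", "--remote-debugging-pipe"];

export function hasGatedFlag(argv: readonly string[]): boolean {
  return argv.some((arg) =>
    GATED_FLAGS.some((flag) => arg === flag || arg.startsWith(`${flag}=`)),
  );
}

// Hypothetical guard a launch helper could call before spawning the app.
export function assertLaunchArgsSafe(argv: readonly string[]): void {
  if (hasGatedFlag(argv)) {
    throw new Error("launch args would trigger the CDP auth gate");
  }
}
```

A guard like this fails fast in the harness instead of leaving a silent exit code 1 to diagnose.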
**Crucially, the gate doesn't check `--inspect` or runtime SIGUSR1.** Those
trigger the **Node inspector**, not the Chrome remote-debugging port —
different surface. Notably, the in-app `Developer → Enable Main Process
Debugger` menu item *also* opens the Node inspector at runtime; that
menu's existence is the hint that this path is tolerated by upstream.
The harness uses this:
1. Spawn Electron with no debug-port flags. Gate stays asleep.
2. Wait for the X11 window to appear (signal that the app is up).
3. Send `SIGUSR1` to the main process pid. Same code path as the menu —
`inspector.open()` runs at runtime and the Node inspector starts on
port 9229.
4. Fetch `http://127.0.0.1:9229/json/list` and connect a WebSocket to the
first entry's `webSocketDebuggerUrl`.
5. Use `Runtime.evaluate` to run JS in the main process. From there:
- `webContents.getAllWebContents()` lists all live web contents
(including `https://claude.ai/...` once it loads into the
BrowserView).
- `webContents.executeJavaScript(...)` drives renderer-side DOM /
state queries.
- Main-process mocks (e.g. `dialog.showOpenDialog = ...` for T17) are
installed by direct assignment.
[`tools/test-harness/src/lib/inspector.ts`](../../tools/test-harness/src/lib/inspector.ts)
wraps this; [`tools/test-harness/src/lib/electron.ts`](../../tools/test-harness/src/lib/electron.ts)
exposes `app.attachInspector()` on the launched-app handle.
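Steps 3 and 4 of the sequence above can be sketched as follows. This is not the code in `inspector.ts`; `openInspector` and `pickDebuggerUrl` are hypothetical names, and the sketch assumes the inspector's default address and a Node version with a global `fetch`.

```ts
// One entry of the inspector's /json/list response; only the field used
// here is declared.
interface InspectorTarget {
  webSocketDebuggerUrl?: string;
}

export function pickDebuggerUrl(targets: InspectorTarget[]): string {
  const url = targets[0]?.webSocketDebuggerUrl;
  if (!url) throw new Error("inspector exposed no webSocketDebuggerUrl");
  return url;
}

// Hypothetical attach step: signal the main process, then retry the
// discovery request until the inspector is listening.
export async function openInspector(
  pid: number,
  attempts = 50,
  retryDelayMs = 100,
): Promise<string> {
  process.kill(pid, "SIGUSR1"); // asks Node to start its inspector at runtime
  for (let attempt = 1; ; attempt++) {
    try {
      const res = await fetch("http://127.0.0.1:9229/json/list");
      return pickDebuggerUrl((await res.json()) as InspectorTarget[]);
    } catch (err) {
      if (attempt >= attempts) throw err;
      await new Promise((resolve) => setTimeout(resolve, retryDelayMs));
    }
  }
}
```

The returned URL is what a WebSocket client would connect to before sending `Runtime.evaluate` messages.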
**Two implementation gotchas worth recording:**
- **`BrowserWindow.getAllWindows()` returns 0** because frame-fix-wrapper
substitutes the `BrowserWindow` class and the substitution breaks the
static registry. Use `webContents.getAllWebContents()` instead — that
registry stays intact and includes both the shell window and the
embedded claude.ai BrowserView.
- **`Runtime.evaluate` with `awaitPromise: true` + `returnByValue: true`
returns empty objects** for awaited Promise resolutions on this build's
V8. Workaround: have the IIFE return a `JSON.stringify(value)` and
`JSON.parse` on the caller side. `inspector.evalInMain<T>()` does this
internally so callers don't think about it.
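The JSON round-trip workaround can be illustrated with two small functions. This is a sketch of the idea and not the body of `inspector.evalInMain<T>()`; the function names are hypothetical, and mapping `undefined` to `null` is a choice made here so the result is always a string.

```ts
// Wraps a main-process expression so the inspector returns a JSON string
// instead of a by-value object. `undefined` becomes `null`, because
// JSON.stringify(undefined) would not produce a string.
export function wrapForEval(expression: string): string {
  return `(async () => JSON.stringify((await (${expression})) ?? null))()`;
}

// Decodes the string that comes back for a wrapped expression.
export function decodeEvalResult<T>(raw: string): T {
  return JSON.parse(raw) as T;
}
```

The wrapped string is what would be sent as the `expression` of `Runtime.evaluate` with `awaitPromise: true`; the caller then decodes the string result.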
**Status of the harness today:**
- **L2** — fully working (DBus, xprop). T03 / T04 pass.
- **L1 — T01** — passes via X11 window probe (no inspector needed).
- **L1 — T17 / similar** — framework works end-to-end (verified inspector
attach + dialog mock + webContents detection + Code-tab navigation
click). Selector tuning to match claude.ai's actual Code-tab UI is
ordinary iterate-as-needed work, not a blocker.
- **No `app-asar.sh` patch needed** to neutralize the gate. The
`dogtail`/AT-SPI escape hatch (Decision 1) is also no longer the
fallback for L1 — it's only relevant for native dialogs that the
inspector pattern can't reach.
## Notable shifts since the existing roadmap was written
These three changed the landscape in 2025 and the existing
[`README.md`](./README.md) Automation roadmap section predates them:
1. **Electron 38+ defaults to native Wayland.** [Electron 38 release
notes](https://www.electronjs.org/blog/electron-38-0) and the
[Wayland tech talk](https://www.electronjs.org/blog/tech-talk-wayland)
document this. Electron now has a Wayland CI job upstream. The project
keeps X11 as the default backend (Decision 6) because portal coverage
for `GlobalShortcuts` is uneven across compositors — the new tests
characterize what works where, not what to ship by default.
2. **Spectron is dead.** Archived Feb 2022; Playwright is the
[official recommendation](https://www.electronjs.org/blog/spectron-deprecation-notice).
No discussion needed about which framework — that's settled.
3. **`libei` is real and shipping.** KWin, mutter, and wlroots have all
moved. The shortcut-test gap (T06 / S11 / S14) is automatable in the
medium term, not "manual forever."
## Anti-patterns to design against
Pulled from the [Playwright flaky-test
checklist](https://testdino.com/blog/playwright-automation-checklist/),
the [Codepipes anti-patterns
catalogue](https://blog.codepipes.com/testing/software-testing-antipatterns.html),
and the [TestDevLab top 5
list](https://www.testdevlab.com/blog/5-test-automation-anti-patterns-and-how-to-avoid-them).
Designing the harness with these in mind from day one is much cheaper than
backing them out later:
| Anti-pattern | What it looks like | How to avoid in this project |
|---|---|---|
| Silent retry | Test passes on attempt 2; dashboard shows green; flake hidden | Log retry count to JUnit; `matrix.md` shows `✓*` for retried-pass; treat retried-pass as a Should-fix bug |
| Async-wait by `sleep` | `sleep 5` instead of `waitFor`; ICSE 2021 found ~45% of UI flakes here | No fixed sleeps in `tools/test-harness/`. Always poll a condition (window exists, log line, DBus name owned). Lint for `\bsleep\b` and `setTimeout` with literal numbers in test code |
| Mixing orchestration with verification | One test installs the package, launches, checks tray, asserts URL handler — five failure modes, one red cell | One test, one assertion class. Setup goes in shared fixtures, not test bodies |
| End-to-end as the only layer | All regressions caught at full-stack UI level | Keep `scripts/patches/*.sh` independently testable; add unit-level tests on patcher logic separately from the full-app sweep |
| Implementation-coupled selectors | `div.css-7xz92q` deep selectors against minified renderer classes | Decision 5: semantic locators only. If a selector proves unstable, first ask upstream for a stable `data-testid`; only carry an `app-asar.sh` patch as a last resort, per-test |
| Timing-sensitive assertions | "Within 500ms after click, X appears" | Time bounds are upper-bound sanity only. Use Playwright's auto-wait with a generous `timeout`; don't fight the framework |
| Hidden global state across tests | Test 4 fails because test 2 left `~/.config/Claude/SingletonLock` behind | Hermetic per-test `XDG_CONFIG_HOME` / `CLAUDE_CONFIG_DIR` (S19). Treat shared state as an isolation bug, not a known quirk |
| Long-lived VM state drift | Six-month-old snapshot has stale package mirrors; tests fail with 404s | Image rebuild as code (Packer / Nix flake); rebuild nightly or per release-tag. Never `apt update` mid-test |
| Treating skip as fail | wlroots-only test fails on KDE because it can't be skipped properly | `?` and `-` are first-class in [`matrix.md`](./matrix.md). Map JUnit `<skipped>` → `-`, `<error>` (harness broke) → `?`, only `<failure>` → the failing-cell mark |
| Diagnostics only on failure | Test goes red; capture fires; previous green run had no baseline to diff against | Decision 7: capture `--doctor`, launcher log, screenshot **on every run**. Last 10 greens + all reds on `main` |
| Network coupling | "Tray icon present" fails because Cloudflare hiccupped during sign-in | Tests that don't *need* network shouldn't touch it. Sign-in is one fixture; tray test runs on a pre-signed-in profile snapshot |
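The JUnit-to-cell rules in the table can be written down as a small mapping. This is a sketch under assumptions: the type and function names are hypothetical, the plain pass mark `✓` is inferred from the `✓*` retried-pass mark, and `FAIL_MARK` is a placeholder because the failing-cell glyph is not stated here.

```ts
type JUnitOutcome = "passed" | "skipped" | "error" | "failure";

interface CaseResult {
  outcome: JUnitOutcome;
  retries: number; // attempts beyond the first
}

// Placeholder assumption: substitute whatever glyph matrix.md actually uses.
const FAIL_MARK = "✗";

export function toCell({ outcome, retries }: CaseResult): string {
  switch (outcome) {
    case "skipped":
      return "-"; // test does not apply or was skipped on this row
    case "error":
      return "?"; // the harness broke, so the result is unknown
    case "failure":
      return FAIL_MARK;
    case "passed":
      // A pass that needed a retry stays visible instead of showing plain green.
      return retries > 0 ? "✓*" : "✓";
  }
}
```

Keeping the retry count in the input is what lets a retried pass surface in the matrix instead of being hidden by the runner.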
## What stays manual (for now)
These have no automation path that's worth the cost today, and it's more
honest to say so in the roadmap than to pretend they'll be automated
"soon":
- **T06 / S11 / S14** — global shortcut tests behind portal grabs. Path
exists (libei + RemoteDesktop portal) but compositor-side support is
patchy. Revisit when libei adoption broadens.
- **T15** — sign-in browser handoff. Needs a fixture account and an
upstream auth flow that won't necessarily welcome scripted login.
- **T28** — scheduled task catch-up after suspend. Real wall-clock event;
not worth simulating.
- **Anything in `ui/` tagged "looks right"** — HiDPI sharpness, theme
rendering, drag-feel. AT-SPI sees the tree, not the pixels.
T17 (folder picker) was previously in this list. Portal mocking via
`dbus-next` moves it into L2. If real-dialog testing turns out to be
necessary anyway, the dogtail escape hatch covers it.
The matrix already supports leaving these manual via the `?` / `-` /
existing-cell semantics — no schema change needed.
## Suggested first vertical slice
The smallest end-to-end that proves every architectural decision:
- **One row:** KDE-W (daily-driver host, no VM startup tax).
- **One test:** T01 — App launch.
- **Full pipeline:** orchestrator glue → harness entry → Playwright
`_electron.launch()` → JUnit XML → matrix-regen step → cell flips
from `?` to `✓` automatically.
That single slice forces every decision out into the open: harness
language (TS), JUnit emission, results-bundle layout, matrix-regen
rules, diagnostic-capture format. Resist building the orchestrator
before there's a passing test it can orchestrate. Once the slice is
real, adding tests 2 through 10 is mostly mechanical.
After T01: the next sensible additions are T03 (tray — exercises
`dbus-next` end-to-end), T04 (window decorations — exercises the
shell-out helper pattern), and T17 (folder picker — exercises portal
mocking). Those four runners cover every distinct shape of TS code in
the harness; everything else after them is a recombination.
## Still open
Most of the framing decisions are settled in the [Decisions](#decisions)
table. What remains:
1. **Owner assignments per row.** [`MEMORY.md`](https://github.com/aaddrick/claude-desktop-debian/blob/main/.claude/projects/-home-aaddrick-source-claude-desktop-debian/memory/MEMORY.md)
notes cowork → @RayCharlizard, nix → @typedrat. Hypr-N row is the
natural fit for @typedrat once the Nix flake exists. The other eight
rows: aaddrick by default, but worth asking the contributor base in a
discussion thread.
2. **AT-SPI escape-hatch trigger.** Decision 1 punts on Python until a
specific test forces it. T17 is the only candidate today, and portal
mocking probably covers it. If T17 actually needs real-dialog
automation, that's the first reopen.
3. **Selector rot rate.** Decision 5 starts with semantic locators and
measures. After ~20 tests on the renderer, revisit whether
`getByRole`/`getByText` is holding up or whether per-test
`data-testid` patches are warranted. No prediction; this is a
measure-and-decide call.
4. **CI execution model.** Decision 4 punts on this entirely until the
harness has signal on which tests are stable. Reopen after the first
~20 tests have run from the dev box for a few weeks.
5. **Smoke-set Wayland-default test wording.** Decision 6 calls for a
Smoke test asserting X11/XWayland selection on each row, plus
per-row Should tests for Wayland characterization. The exact T-IDs
and case-file homes for those tests need to be drafted next time
`cases/` is touched.
## Sources
Background reading the recommendations draw on. Linked here so the
calls have receipts:
### Electron testing & Playwright
- [Electron — Automated Testing](https://www.electronjs.org/docs/latest/tutorial/automated-testing) — official tutorial, recommends Playwright
- [Electron — Spectron Deprecation Notice](https://www.electronjs.org/blog/spectron-deprecation-notice) — Feb 2022 archive
- [Playwright — Electron class](https://playwright.dev/docs/api/class-electron)
- [Playwright — ElectronApplication class](https://playwright.dev/docs/api/class-electronapplication)
- [Testing Electron apps with Playwright and GitHub Actions (Simon Willison)](https://til.simonwillison.net/electron/testing-electron-playwright)
- [`spaceagetv/electron-playwright-example`](https://github.com/spaceagetv/electron-playwright-example) — multi-window Playwright + Electron example
### DBus / TypeScript
- [`dbus-next` — actively-maintained Node DBus library with TS typings](https://github.com/dbusjs/node-dbus-next)
- [`dbus-next` on npm](https://www.npmjs.com/package/dbus-next)
### Wayland / X11 / input injection
- [Electron — Tech Talk: How Electron went Wayland-native](https://www.electronjs.org/blog/tech-talk-wayland)
- [Electron 38.0.0 release notes](https://www.electronjs.org/blog/electron-38-0)
- [PR #33355: fix calling X11 functions under Wayland](https://github.com/electron/electron/pull/33355)
- [LIBEI — Phoronix overview](https://www.phoronix.com/news/LIBEI-Emulated-Input-Wayland)
- [libei + RemoteDesktop portal — RustDesk discussion](https://github.com/rustdesk/rustdesk/discussions/4515)
- [`ydotool` README](https://github.com/ReimuNotMoe/ydotool)
- [`kwin-mcp` — KDE Plasma 6 Wayland automation tools](https://github.com/isac322/kwin-mcp)
### Portals / AT-SPI
- [XDG Desktop Portal — main repo](https://github.com/flatpak/xdg-desktop-portal)
- [`org.freedesktop.portal.FileChooser` interface XML](https://github.com/flatpak/xdg-desktop-portal/blob/main/data/org.freedesktop.portal.FileChooser.xml)
- [File Chooser portal documentation](https://flatpak.github.io/xdg-desktop-portal/docs/doc-org.freedesktop.portal.FileChooser.html)
- [`dogtail` on PyPI](https://pypi.org/project/dogtail/) — fallback only
- [Automation through Accessibility — Fedora Magazine](https://fedoramagazine.org/automation-through-accessibility/)
### Anti-patterns / flaky tests
- [Playwright automation checklist to reduce flaky tests (TestDino)](https://testdino.com/blog/playwright-automation-checklist/)
- [Flaky Tests: The Complete Guide to Detection & Prevention (TestDino)](https://testdino.com/blog/flaky-tests/)
- [5 Test Automation Anti-Patterns (TestDevLab)](https://www.testdevlab.com/blog/5-test-automation-anti-patterns-and-how-to-avoid-them)
- [Software Testing Anti-patterns (Codepipes)](https://blog.codepipes.com/testing/software-testing-antipatterns.html)
### JUnit XML reporting
- [`junit-to-md`](https://github.com/davidahouse/junit-to-md)
- [Test Summary GitHub Action](https://github.com/marketplace/actions/junit-test-dashboard)
- [Test Reporter](https://github.com/marketplace/actions/test-reporter)
### CI / VM matrix
- [Transient — QEMU CI wrapper](https://www.starlab.io/blog/simple-painless-application-testing-on-virtualized-hardwarenbsp)
- [`cirruslabs/tart` — VMs for CI automation](https://github.com/cirruslabs/tart)
---
*Once the first vertical slice (KDE-W + T01) ships, the relevant pieces of
this file fold into [`README.md`](./README.md) (Automation roadmap) and
[`runbook.md`](./runbook.md) (the harness invocation). Until then: working
notes that have crossed from brainstorm to plan.*


@@ -0,0 +1,94 @@
# Functional Test Cases
Test specifications grouped by feature surface. For live status, see [`../matrix.md`](../matrix.md). For sweep workflow, see [`../runbook.md`](../runbook.md).
## Files
| File | Surfaces covered | Tests |
|------|------------------|-------|
| [`launch.md`](./launch.md) | App startup, doctor, package detection, multi-instance | T01, T02, T13, T14 |
| [`tray-and-window-chrome.md`](./tray-and-window-chrome.md) | Tray icon, window decorations, hybrid topbar, hide-to-tray | T03, T04, T07, T08, S08, S13 |
| [`shortcuts-and-input.md`](./shortcuts-and-input.md) | URL handler, Quick Entry, global shortcuts | T05, T06, S06, S07, S09, S10, S11, S12, S14, S29, S30, S31, S32, S33, S34, S35, S36, S37 |
| [`code-tab-foundations.md`](./code-tab-foundations.md) | Sign-in, Code tab load, folder picker, drag-drop, terminal, file pane | T15, T16, T17, T18, T19, T20 |
| [`code-tab-workflow.md`](./code-tab-workflow.md) | Preview, PR monitor, worktrees, auto-archive, side chat, slash menu | T21, T22, T29, T30, T31, T32 |
| [`code-tab-handoff.md`](./code-tab-handoff.md) | Notifications, external editor, file manager, connector OAuth, IDE handoff | T23, T24, T25, T34, T38, T39 |
| [`routines.md`](./routines.md) | Scheduled tasks, catch-up runs, suspend inhibit, config dir | T26, T27, T28, S19, S20, S21 |
| [`extensibility.md`](./extensibility.md) | Plugins, MCP, hooks, CLAUDE.md memory, worktree storage | T11, T33, T35, T36, T37, S27, S28 |
| [`distribution.md`](./distribution.md) | DEB, RPM, AppImage, dependency pulls, auto-update | S01, S02, S03, S04, S05, S15, S16, S26 |
| [`platform-integration.md`](./platform-integration.md) | Autostart, Cowork, WebGL, PATH inheritance, Computer Use, Dispatch | T09, T10, T12, S17, S18, S22, S23, S24, S25 |
## Standard test body
Every test in this directory follows this structure:
```markdown
### T## — Title
**Severity:** Smoke | Critical | Should | Could
**Surface:** human-readable surface tag (e.g. "Code tab → Environment")
**Applies to:** All | <subset of rows>
**Issues:** linked issue/PR list, or `—`
**Steps:**
1. ...
2. ...
**Expected:** what should happen.
**Diagnostics on failure:** which captures to attach when filing. See [`../runbook.md#diagnostic-capture`](../runbook.md#diagnostic-capture).
**References:** docs links, learnings, related issues.
**Code anchors:** `<file>:<line>` pointers to the upstream code or
wrapper script that backs the load-bearing claim above. Added during
the grounding sweep — see "Anchor scope" for guidance on where
anchors can and can't land.
**Inventory anchor:** (optional) `<element-id>` from
[`../ui-inventory.json`](../ui-inventory.json) — only if the surface
shows up in the v7 walker's idle capture. For surfaces inside modals
or popups, append a sentence noting which click-chain opens them so
the next inventory regeneration can grab them.
```
The Steps and Diagnostics fields are written so they can later become
script entry points without a rewrite.
### Anchor scope
Where the load-bearing claim lives determines where the anchor goes:
- **Upstream code** — any file under
`build-reference/app-extracted/.vite/build/` (most often `index.js`,
the main process). Use `index.js:N` style anchors.
- **Our wrapper code** — `scripts/launcher-common.sh`, `scripts/doctor.sh`,
`scripts/patches/*.sh`, `scripts/frame-fix-wrapper.js`,
`scripts/wco-shim.js`. Use `<repo-relative-path>:N` style anchors.
- **Server-rendered (claude.ai SPA)** — anchorable only via the v7
walker inventory (`docs/testing/ui-inventory.json`) or a runtime
capture from `tools/test-harness/grounding-probe.ts`. Idle-state
inventory misses contextual surfaces (modals, popups, slash menus,
context menus, side panels) — note that explicitly.
- **Upstream `claude` CLI binary** — out of scope for this matrix
(e.g. T39 `/desktop` is a CLI slash-command, not in the Electron
asar). Mark as Ambiguous and link to a separate CLI matrix if one
exists.
If a claim spans multiple scopes (a wrapper script triggering
upstream behavior, e.g. T01's launcher-log + main-window-opens),
list all the anchors. The whole point is making the next sweep
faster — over-anchoring is fine, missing anchors is not.
### Drift markers
When a sweep finds upstream behavior no longer matches the case:
- **Edited Steps/Expected** — fix the case in place, mention what
changed in the commit message. The case is the spec.
- **Missing in build X.Y.Z** — prepend a blockquote under the test
heading: `> **⚠ Missing in build 1.5354.0** — <one-line note>.
Re-verify after next upstream bump.` Use when the feature isn't
in the build at all (deprecated, behind unset flag, never shipped).
- **Ambiguous** — don't edit; flag in the sweep report. Use when
the load-bearing claim could be one of several candidate code
paths and static analysis can't disambiguate.


@@ -0,0 +1,197 @@
# Code Tab — Foundations
Tests covering Code-tab availability on Linux (officially unsupported per upstream docs), sign-in flow, folder picker, drag-and-drop, and the basic editing surfaces (terminal, file pane). See [`../matrix.md`](../matrix.md) for status.
## T15 — Sign-in completes in the embedded webview
> **Drift in build 1.5354.0** — Sign-in is an in-app `mainView.webContents.loadURL` flow, not an `xdg-open` browser handoff. Claude.ai/login renders inside the embedded BrowserView; the resulting `sessionKey` cookie is then exchanged at `${apiHost}/v1/oauth/${org}/authorize` with redirect URI `https://claude.ai/desktop/callback`. No system browser is involved.
**Severity:** Smoke
**Surface:** Auth / embedded webview
**Applies to:** All rows
**Issues:** —
**Steps:**
1. Launch a fresh app instance (signed-out state).
2. Click **Sign in**. Observe claude.ai/login rendering inside the app.
3. Authenticate. Observe the in-app navigation completing back to the
workspace.
**Expected:** Sign-in stays inside the embedded webview (`will-navigate`
handler `Ihr` keeps `/login/` paths in-app). After auth the
`sessionKey` cookie is captured and silently exchanged for an OAuth
token via the `desktop/callback` redirect. Account dropdown populates;
no auth banner remains.
**Diagnostics on failure:** DevTools console for the `mainView`
BrowserView, network captures of the `/v1/oauth/{org}/authorize` and
`/v1/oauth/token` calls, launcher log, cookie jar inspection
(`sessionKey` on `.claude.ai`).
**References:** [Code tab auth troubleshooting](https://code.claude.com/docs/en/desktop#403-or-authentication-errors-in-the-code-tab)
**Code anchors:**
- `build-reference/app-extracted/.vite/build/index.js:141996` — desktop
OAuth redirect URI `https://claude.ai/desktop/callback`
- `build-reference/app-extracted/.vite/build/index.js:142431` — POST to
`${apiHost}/v1/oauth/${org}/authorize` with `Bearer ${sessionKey}`
- `build-reference/app-extracted/.vite/build/index.js:216565` — `Ihr`
treats `/login/` paths as in-app (not external)
- `build-reference/app-extracted/.vite/build/index.js:141316` —
`mainView.webContents.loadURL(...)` drives the embedded sign-in
## T16 — Code tab loads
**Severity:** Smoke
**Surface:** Code tab — top-level UI
**Applies to:** All rows
**Issues:** —
**Steps:**
1. After sign-in, click the **Code** tab at the top center.
2. Wait a few seconds.
**Expected:** Code tab renders the session UI (sidebar, prompt area, environment dropdown). Per upstream docs the Code tab is "not supported" on Linux — the patched build under this project should render the UI normally or surface a clear, actionable message. Not a blank screen, infinite spinner, or `Error 403: Forbidden`.
**Diagnostics on failure:** Screenshot, DevTools console, network captures (auth/feature-flag responses), launcher log, the active patch set in `scripts/patches/`.
**References:** [Use Claude Code Desktop](https://code.claude.com/docs/en/desktop), [Get started with the desktop app](https://code.claude.com/docs/en/desktop-quickstart)
**Code anchors:**
- `build-reference/app-extracted/.vite/build/index.js:525066` —
`sidebarMode === "code"` rewrites the BrowserView path to `/epitaxy`
- `build-reference/app-extracted/.vite/build/index.js:496066` — Code
deeplinks (`claude://code?...`) navigate to `/epitaxy?...`
- `build-reference/app-extracted/.vite/build/index.js:105273` — `IHi`
recognises `/epitaxy` and `/epitaxy/...` as the Code-tab path
- `build-reference/app-extracted/.vite/build/index.js:105346` —
`sidebarMode` enum contains `"code"`
**Inventory anchor:** `…tablist.tab-by-name.code` (role `tab`, label
`Code`) — confirms the Code tab is reachable from the new-chat tablist
in the captured idle state.
## T17 — Folder picker opens
**Severity:** Smoke
**Surface:** Code tab → Environment selection
**Applies to:** All rows
**Issues:** —
**Runner:** [`tools/test-harness/src/runners/T17_folder_picker.spec.ts`](../../../tools/test-harness/src/runners/T17_folder_picker.spec.ts) — runtime-attach via SIGUSR1 + main-process `dialog.showOpenDialog` mock + `webContents.executeJavaScript` to drive the renderer. The click chain to reach the folder-picker button still awaits selector tuning.
**Steps:**
1. In the Code tab, click the environment pill → **Local** → **Select folder**.
2. Choose a project directory.
**Expected:** Native file chooser opens. On Wayland sessions the chooser is `xdg-desktop-portal`-backed (verify with `busctl --user tree org.freedesktop.portal.Desktop`). On X11 sessions the GTK/Qt native picker fires. Selected path appears in the env pill.
**Diagnostics on failure:** `systemctl --user status xdg-desktop-portal`, `XDG_SESSION_TYPE`, the portal backend in use (`xdg-desktop-portal-kde`, `xdg-desktop-portal-gnome`, `xdg-desktop-portal-wlr`), launcher log.
**References:** [Local sessions](https://code.claude.com/docs/en/desktop#local-sessions)
**Code anchors:**
- `build-reference/app-extracted/.vite/build/index.js:66403` — IPC
channel `claude.web_FileSystem_browseFolder` (renderer → main)
- `build-reference/app-extracted/.vite/build/index.js:509188` —
`browseFolder` impl calls `dialog.showOpenDialog` with
`properties: ["openDirectory", "createDirectory"]`
- `build-reference/app-extracted/.vite/build/index.js:450534` —
`grantViaPicker` (Operon host-access folder grant) uses the same
`["openDirectory"]` shape
- `tools/test-harness/src/lib/claudeai.ts:122` — `installOpenDialogMock`
intercepts both `(opts)` and `(window, opts)` arities, matching the
call sites at index.js:509196 and :450534
**Inventory anchor:** `root.main.region.button-by-name.select-folder`
(role `button`, label `Select folder…`) — the persistent button the
T17 runner clicks before the dialog mock fires.
## T18 — Drag-and-drop files into prompt
**Severity:** Critical
**Surface:** Code tab → Prompt area
**Applies to:** All rows
**Issues:** —
**Steps:**
1. Open a Code-tab session.
2. From the system file manager, drag one or more files into the prompt area.
3. Repeat with multiple files at once.
**Expected:** Files attach to the prompt. The renderer resolves dropped
`File` objects to absolute paths via the preload-bridged
`claudeAppSettings.filePickers.getPathForFile` (Electron's
`webUtils.getPathForFile`). Multi-file drops attach each file. Works on
both Wayland and X11.
**Diagnostics on failure:** Screen recording, `wl-paste --list-types` (Wayland) or `xclip -selection clipboard -t TARGETS -o` (X11) during drag, DevTools console, launcher log.
**References:** [Add files and context](https://code.claude.com/docs/en/desktop#add-files-and-context-to-prompts)
**Code anchors:**
- `build-reference/app-extracted/.vite/build/mainView.js:9267` —
`filePickers.getPathForFile` wraps `webUtils.getPathForFile`
- `build-reference/app-extracted/.vite/build/mainView.js:9552` —
exposed to the renderer as `window.claudeAppSettings`
## T19 — Integrated terminal
**Severity:** Critical
**Surface:** Code tab → Terminal pane
**Applies to:** All rows
**Issues:** —
**Steps:**
1. In a Code-tab session, press `` Ctrl+` `` (or open via the Views menu).
2. Confirm the terminal opens in the session's working directory.
3. Run `git status`, `npm --version`, `gh auth status`.
**Expected:** Terminal pane opens in the session's working directory, inherits the same `PATH` Claude sees. Standard commands run cleanly. Terminal pane is local-session-only per docs.
**Diagnostics on failure:** Terminal pane content, `echo $PATH` from inside the pane, `pwd`, the shell binary in use, launcher log.
**References:** [Run commands in the terminal](https://code.claude.com/docs/en/desktop#run-commands-in-the-terminal)
**Code anchors:**
- `build-reference/app-extracted/.vite/build/index.js:69135` — IPC
channel `claude.web_LocalSessions_startShellPty` (also
`resizeShellPty`, `writeShellPty` at :69184, :69210)
- `build-reference/app-extracted/.vite/build/index.js:486438` —
`startShellPty` body: spawns `node-pty` in
`n.worktreePath ?? n.cwd` with `TERM=xterm-256color`
- `build-reference/app-extracted/.vite/build/index.js:486463` —
`node-pty` dynamic import (optional dep, `package.json` line 100)
- `build-reference/app-extracted/.vite/build/index.js:259306` —
`shell-path-worker/shellPathWorker.js` resolves the user's interactive
PATH; `FX()` (line 259311) returns it for the spawned PTY env
## T20 — File pane opens and saves
**Severity:** Critical
**Surface:** Code tab → File pane
**Applies to:** All rows
**Issues:** —
**Steps:**
1. In a Code-tab session, click a file path in chat or diff to open it in the file pane.
2. Make a small edit. Click **Save**.
3. Modify the file externally (e.g. `echo >> file`). Re-edit in the pane. Observe the on-disk-changed warning.
**Expected:** File opens in the editor pane. Edits write back to disk on Save. If the file changed on disk since opening, the pane shows the on-disk-changed warning and offers override or discard. (The conflict check is sha256-based, not mtime-based — `writeSessionFile` reads the current bytes, hashes them, and rejects with `Conflict` if the renderer-supplied `expectedHash` doesn't match.)
**Diagnostics on failure:** `sha256sum <file>` output (and stat mtime for cross-checking), launcher log, DevTools console, screen recording of the warning state.
**References:** [Open and edit files](https://code.claude.com/docs/en/desktop#open-and-edit-files)
**Code anchors:**
- `build-reference/app-extracted/.vite/build/index.js:68922` — IPC
channel `claude.web_LocalSessions_readSessionFile`
- `build-reference/app-extracted/.vite/build/index.js:69003` — IPC
channel `claude.web_LocalSessions_writeSessionFile` with
`expectedHash` argument at position 3
- `build-reference/app-extracted/.vite/build/index.js:492874` —
`readSessionFile` impl
- `build-reference/app-extracted/.vite/build/index.js:492954` —
`writeSessionFile` impl: sha256-hashes current on-disk bytes,
returns `{ status: nW.Conflict, currentHash }` when `expectedHash`
mismatches
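
The hash-based conflict check can be sketched as a pure function, which helps when reasoning about why touching a file's mtime alone should not trigger the warning. This is an illustrative sketch only: `checkWrite`, `sha256Hex`, the `WriteResult` shape, and the hex digest encoding are assumptions, not the minified upstream implementation.

```typescript
import { createHash } from "node:crypto";

// Hypothetical result shape, loosely mirroring `{ status, currentHash }`.
type WriteResult =
  | { status: "ok"; newHash: string }
  | { status: "conflict"; currentHash: string };

// SHA-256 of the given text, hex-encoded (the encoding is an assumption).
function sha256Hex(content: string): string {
  return createHash("sha256").update(content).digest("hex");
}

// `onDisk` is the file content right now; `expectedHash` is the hash the
// renderer recorded when it opened the file; `next` is the edited content.
function checkWrite(onDisk: string, expectedHash: string, next: string): WriteResult {
  const currentHash = sha256Hex(onDisk);
  if (currentHash !== expectedHash) {
    // Content changed since open: reject and report the current hash.
    return { status: "conflict", currentHash };
  }
  return { status: "ok", newHash: sha256Hex(next) };
}
```

Under this model, step 3 (`echo >> file`) changes the bytes and therefore the hash, so the save is rejected even if the mtime were somehow preserved.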

@@ -0,0 +1,163 @@
# Code Tab — Handoffs to Other Apps
Tests covering desktop notifications, "Open in" external editor, "Show in Files" file manager, connector OAuth round-trips, IDE handoff, and graceful failure of the macOS/Windows-only `/desktop` CLI command. See [`../matrix.md`](../matrix.md) for status.
## T23 — Desktop notifications fire
**Severity:** Critical
**Surface:** Notifications (libnotify / XDG Notifications)
**Applies to:** All rows
**Issues:**
**Steps:**
1. Trigger each notification source: scheduled-task fire ([T27](./routines.md#t27--scheduled-task-fires-and-notifies)), CI completion ([T22](./code-tab-workflow.md#t22--pr-monitoring-via-gh)), Dispatch handoff ([S24](./platform-integration.md#s24--dispatch-spawned-code-session-appears-with-badge-and-notification)).
2. Observe each notification appears.
3. Click each — confirm it focuses the relevant session.
**Expected:** Notifications appear in the active DE's notification area (Plasma's notification daemon, Mako on wlroots, gnome-shell, etc.) and are clickable to focus the relevant session.
**Diagnostics on failure:** `gdbus call --session --dest=org.freedesktop.Notifications --object-path=/org/freedesktop/Notifications --method=org.freedesktop.DBus.Introspectable.Introspect`, `notify-send "test"` (sanity check daemon), launcher log, DE-specific notification logs.
**References:** [Scheduled tasks](https://code.claude.com/docs/en/desktop-scheduled-tasks), [Monitor pull request status](https://code.claude.com/docs/en/desktop#monitor-pull-request-status)
**Code anchors:** `build-reference/app-extracted/.vite/build/index.js:494456` (`new hA.Notification(r)` — backed by Electron's libnotify on Linux); `:495110` (`showNotification(title, body, tag, navigateTo)` dispatches Swift on macOS, Electron elsewhere); `:511174`, `:512738` (cu-lock / tool-permission notifications wire a click callback that navigates to `/local_sessions/{sessionId}` to focus the session).
## T24 — Open in external editor
**Severity:** Should
**Surface:** Code tab → Right-click → Open in
**Applies to:** All rows
**Issues:**
**Steps:**
1. Install at least one of: VS Code, Cursor, Zed, Windsurf (any install method —
flatpak, AppImage, distro package). Xcode is darwin-only and absent on Linux.
2. In the Code tab, right-click a file path → **Open in** → choose the editor.
3. Confirm the editor opens at that file.
**Expected:** Right-click → **Open in** launches the chosen editor with the file
path. Editor is invoked by URL scheme (`vscode://file/<path>`,
`cursor://file/<path>`, `zed://file/<path>`, `windsurf://file/<path>`) via
`shell.openExternal`, which delegates to `xdg-open`'s
`x-scheme-handler/<editor>` resolution rather than hard-coded paths.
**Diagnostics on failure:** `xdg-mime query default x-scheme-handler/vscode` (or
`cursor`/`zed`/`windsurf`), `desktop-file-validate` on the editor's `.desktop`
file, `xdg-open vscode://file/<path>` from terminal (sanity check), launcher
log.
**References:** [Open files in other apps](https://code.claude.com/docs/en/desktop#open-files-in-other-apps)
**Code anchors:** `build-reference/app-extracted/.vite/build/index.js:59076`
(editor enum: VSCode, Cursor, Zed, Windsurf, Xcode); `:463902` (`Mtt`
registry — `vscode://`, `cursor://`, `zed://`, `windsurf://`, `xcode://` with
darwin-only flag on Xcode); `:463956` (`getInstalledEditors` probes via
`app.getApplicationInfoForProtocol`); `:464011`
(`shell.openExternal('<scheme>://file/<encoded-path>:<line>')` — path is
URL-encoded but `/` separators are preserved); `:68816` IPC handler
`LocalSessions.openInEditor(path, editor, sshConfig, line)`.
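
To make the URL shape concrete, here is a minimal sketch of how such a handoff URL could be built. `editorUrl` is a hypothetical helper, and the sketch assumes an absolute POSIX path whose leading `/` doubles as the separator after `file`; the upstream encoder may differ in detail.

```typescript
// Builds `<scheme>://file/<encoded-path>[:<line>]` for an absolute POSIX path.
function editorUrl(scheme: string, absPath: string, line?: number): string {
  // Encode each path segment on its own so that "/" separators survive.
  const encoded = absPath.split("/").map(encodeURIComponent).join("/");
  const suffix = line === undefined ? "" : `:${line}`;
  return `${scheme}://file${encoded}${suffix}`;
}
```

Pasting the resulting URL into `xdg-open` from a terminal reproduces the handoff without the app, which separates URL-construction bugs from scheme-handler registration problems.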
## T25 — Show in Files / file manager
**Severity:** Should
**Surface:** Code tab → Right-click → Show in Files
**Applies to:** All rows
**Issues:**
**Steps:**
1. In the Code tab, right-click a file path → "Show in Files" (Linux equivalent of macOS "Show in Finder" / Windows "Show in Explorer").
2. Confirm the system file manager opens the containing folder with the file selected.
**Expected:** System file manager (Nautilus on GNOME, Dolphin on KDE, Thunar on Xfce, etc.) opens with the file pre-selected. Resolution respects `xdg-mime` defaults.
**Diagnostics on failure:** `xdg-mime query default inode/directory`, `xdg-open <dir>` from terminal, the menu label rendered (was it Linux-specific or stuck on "Show in Finder"?), launcher log.
**References:** [Open files in other apps](https://code.claude.com/docs/en/desktop#open-files-in-other-apps)
**Code anchors:** `build-reference/app-extracted/.vite/build/index.js:66652` IPC
handler `FileSystem.showInFolder(path)`; `:509431` impl thin-wraps
`hA.shell.showItemInFolder(Tc(path))`. Electron's `showItemInFolder` on Linux
falls back to `xdg-open` on the parent directory when no DBus FileManager1
service is present, so the file is rarely pre-selected on minimal DEs — only
the parent folder opens.
## T34 — Connector OAuth round-trip
**Severity:** Critical
**Surface:** Connectors → OAuth handoff
**Applies to:** All rows
**Issues:**
**Steps:**
1. In a Code-tab session, click **+** → **Connectors** → choose a service (Slack, GitHub, Linear, Notion, Google Calendar).
2. Step through the OAuth flow in the system browser.
3. Return to Claude Desktop and verify the connector appears in **Settings → Connectors**.
4. Use the connector in a prompt (e.g. "list my Slack channels").
**Expected:** Adding a connector launches the browser via `xdg-open`, OAuth callback hands control back to Claude Desktop, connector appears in Settings, and is usable in subsequent prompts.
**Diagnostics on failure:** `xdg-mime query default x-scheme-handler/https`, the callback URL scheme, network captures of OAuth redirect, launcher log, DevTools console.
**References:** [Connect external tools](https://code.claude.com/docs/en/desktop#connect-external-tools), [Connectors for everyday life](https://claude.com/blog/connectors-for-everyday-life)
**Code anchors:**
`build-reference/app-extracted/.vite/build/index.js:524819`
(`hA.app.setAsDefaultProtocolClient("claude")` — registers the `claude://`
deep-link scheme used by the OAuth callback); `:525026` mainWindow
`setWindowOpenHandler` routes external URLs through `MAA(url)`;
`:525102`–`:525135` (only `http:`/`https:`/`mailto:`/`tel:`/`sms:`/
`ms-(excel|powerpoint|word):` are forwarded to system handlers; everything
else is dropped); `:136233` `$a(url)` thin-wraps `hA.shell.openExternal(url)`
(this is the single egress point for browser handoff); `:159634`
`mcpSubmitOAuthCallbackUrl(serverName, callbackUrl)` and `:159651`
`claudeOAuthCallback(authorizationCode, state)` — IPC bridges that consume
the deep-link callback. See [`docs/learnings/plugin-install.md`](../../learnings/plugin-install.md)
for orgId/sessionKey cookie chain that gates connector listing.
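
The scheme allowlist described above can be approximated as follows. This is a sketch under assumptions: `shouldForward` and `FORWARDED` are hypothetical names, and treating unparsable input as dropped is a guess at upstream behaviour rather than something the anchors confirm.

```typescript
// Protocols (as reported by URL.protocol, trailing colon included) that are
// handed to the system handler; everything else is dropped.
const FORWARDED = /^(https?|mailto|tel|sms|ms-(excel|powerpoint|word)):$/;

function shouldForward(rawUrl: string): boolean {
  let protocol: string;
  try {
    protocol = new URL(rawUrl).protocol; // e.g. "https:"
  } catch {
    return false; // not a parsable URL: treat as dropped (assumption)
  }
  return FORWARDED.test(protocol);
}
```

When the browser never opens during step 2, checking the authorization URL's scheme against this list is a quick first triage before looking at `xdg-open`.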
## T38 — Continue in IDE
**Severity:** Should
**Surface:** Code tab → Continue in menu
**Applies to:** All rows
**Issues:**
**Steps:**
1. In a Code-tab session, click the IDE icon (bottom right of session toolbar) → **Continue in** → choose an IDE.
2. Confirm the IDE opens at the working directory.
**Expected:** Selected IDE opens the project at the current working directory. Resolution via `xdg-open` / `.desktop` files.
**Diagnostics on failure:** `xdg-open <project-dir>` sanity check, `xdg-mime query default x-scheme-handler/vscode` (or matching scheme for the chosen IDE), launcher log, the IDE's `.desktop` file.
**References:** [Continue in another surface](https://code.claude.com/docs/en/desktop#continue-in-another-surface)
**Code anchors:** Same IPC surface as [T24](#t24--open-in-external-editor) —
`build-reference/app-extracted/.vite/build/index.js:68816`
(`LocalSessions.openInEditor(path, editor, sshConfig, line)` accepts a
directory path the same way as a file path); `:463902` editor registry;
`:464011` `shell.openExternal('<scheme>://file/<cwd>')`. The "Continue in"
chooser UI is rendered server-side by claude.ai and not present in the local
asar — only the IPC bridge can be code-anchored.
## T39 — `/desktop` CLI handoff (graceful N/A)
> **Note** — This test exercises the upstream `claude` CLI binary, not the
> Electron app. The CLI ships separately from this packaging (out of
> `build-reference/`), so no anchor in `app-extracted/.vite/build/` exists for
> the slash-command handler. Re-verify behaviour against the CLI binary that
> ships with the upstream version under test (currently 1.5354.0).
**Severity:** Could
**Surface:** CLI `/desktop` command
**Applies to:** All rows (Linux equally)
**Issues:**
**Steps:**
1. In a CLI session, run `/desktop`.
2. Inspect exit code and output.
**Expected:** `/desktop` is documented as macOS/Windows-only. On Linux it must fail gracefully — print a clear "not supported on Linux" message and exit cleanly. No partial state transition, no panic, no corrupted session file.
**Diagnostics on failure:** Full CLI output, exit code, the session file before/after (`~/.claude/sessions/...`), strace if the CLI hangs.
**References:** [Coming from the CLI](https://code.claude.com/docs/en/desktop#coming-from-the-cli)

@@ -0,0 +1,151 @@
# Code Tab — Workflow Surfaces
Tests covering the dev-server preview pane, PR monitoring, worktree isolation, auto-archive, side chat, and the slash command menu. See [`../matrix.md`](../matrix.md) for status.
## T21 — Dev server preview pane
**Severity:** Should
**Surface:** Code tab → Preview pane
**Applies to:** All rows
**Issues:**
**Steps:**
1. In a Code-tab session, ensure `.claude/launch.json` is configured (or let auto-detect populate it).
2. Click **Preview** dropdown → **Start**.
3. Interact with the embedded browser. Verify auto-verify takes screenshots.
4. Stop the server from the dropdown.
**Expected:** Configured dev server starts. Embedded browser renders the running app. Auto-verify takes screenshots and inspects DOM. Stopping from the dropdown actually stops the process.
**Diagnostics on failure:** `lsof -i :<port>` to see the server, screenshot of preview pane state, `.claude/launch.json` content, launcher log, DevTools console.
**References:** [Preview your app](https://code.claude.com/docs/en/desktop#preview-your-app)
**Code anchors:**
- `build-reference/app-extracted/.vite/build/index.js:262175` — `Pae = "Claude Preview"` + `preview_*` MCP tool table (`preview_start`, `preview_stop`, `preview_list`, `preview_screenshot`, `preview_snapshot`, `preview_inspect`, `preview_click`, `preview_fill`, `preview_eval`, `preview_network`, `preview_resize`).
- `build-reference/app-extracted/.vite/build/index.js:259604` — `setAutoVerify()` and `parseLaunchJson()` (reads `.claude/launch.json`, honours `autoVerify` flag default-on).
- `build-reference/app-extracted/.vite/build/index.js:260015` — `capturePage()` / `captureViaCDP()` drive `preview_screenshot` against the embedded preview WebContents.
## T22 — PR monitoring via `gh`
**Severity:** Critical
**Surface:** Code tab → CI status bar
**Applies to:** All rows
**Issues:**
**Steps:**
1. Ensure `gh` is installed and authenticated (`gh auth status`).
2. In a Code-tab session, ask Claude to open a PR for a small change.
3. Observe the CI status bar. Toggle **Auto-fix** and **Auto-merge**.
4. Run a separate test on a row where `gh` is **not** installed — confirm the missing-`gh` prompt appears the first time a PR action is taken.
**Expected:** With `gh` present and authenticated, CI status bar surfaces in the session toolbar. Auto-fix and Auto-merge toggles work (auto-merge requires the corresponding GitHub repo setting). If `gh` is missing, the app surfaces a prompt directing the user to https://cli.github.com (auto-install via `installGh` only runs on macOS/brew; Linux returns an error string with the install URL).
**Diagnostics on failure:** `gh auth status`, `which gh`, launcher log, DevTools console, screenshot of status bar, the GitHub repo's "Allow auto-merge" setting.
**References:** [Monitor pull request status](https://code.claude.com/docs/en/desktop#monitor-pull-request-status)
**Code anchors:**
- `build-reference/app-extracted/.vite/build/index.js:464281` — `GitHubPrManager` (`prStateCache`, `prChecksCache`); `getPrChecks` at line 464964 fans out to `gh pr view`.
- `build-reference/app-extracted/.vite/build/index.js:464368` — `"gh CLI not found in PATH"` throw site that backs the missing-`gh` prompt.
- `build-reference/app-extracted/.vite/build/index.js:464480` — `installGh()`: macOS-only `brew install gh`; Linux/Windows return error pointing to https://cli.github.com.
- `build-reference/app-extracted/.vite/build/index.js:465019` — `autoMergeRequest { enabledAt }` GraphQL fragment; `enableAutoMerge` / `disableAutoMerge` at lines 465531 / 465556.
- `build-reference/app-extracted/.vite/build/index.js:534033` — `AutoFixEngine.handleSessionEvent` toggles on `autoFixEnabled` per session.
## T29 — Worktree isolation
**Severity:** Critical
**Surface:** Code tab → Sidebar (parallel sessions)
**Applies to:** All rows
**Issues:**
**Steps:**
1. In a Code-tab session against a Git project, open two new sessions in parallel via **+ New session**.
2. Make different edits in each session.
3. Confirm `<project-root>/.claude/worktrees/<branch>` exists for each.
4. Archive one session via the sidebar archive icon.
**Expected:** Each session creates an isolated worktree at `<project-root>/.claude/worktrees/<branch>` (or the dir configured in Settings → Claude Code → "Worktree location"). Edits in one session do not appear in another until committed. Archiving removes the worktree.
**Diagnostics on failure:** `git worktree list` from project root, `ls -la <project-root>/.claude/worktrees/`, launcher log.
**References:** [Work in parallel with sessions](https://code.claude.com/docs/en/desktop#work-in-parallel-with-sessions)
**Code anchors:**
- `build-reference/app-extracted/.vite/build/index.js:462835` — `getWorktreeParentDir()`: returns `<baseRepo>/.claude/worktrees`, or `<chillingSlothLocation.customPath>/<basename>` when overridden in Settings.
- `build-reference/app-extracted/.vite/build/index.js:462843` — `createWorktree()`: runs `git worktree add` with `core.longpaths=true` under the parent dir.
- `build-reference/app-extracted/.vite/build/index.js:463290` — `git worktree remove --force` invoked on archive (cleanup path).
- `build-reference/app-extracted/.vite/build/index.js:55231` — `chillingSlothLocation: "default"` settings key (Settings → "Worktree location").
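
The parent-directory resolution described in the first anchor can be sketched as below, which is handy for predicting where step 3 should look. The `WorktreeLocation` type and `worktreeParentDir` name are hypothetical, and reading `<basename>` as the basename of the base repository is an assumption.

```typescript
import * as path from "node:path";

// Hypothetical model of the setting: either the default, or a custom root.
type WorktreeLocation = "default" | { customPath: string };

function worktreeParentDir(baseRepo: string, location: WorktreeLocation): string {
  if (location === "default") {
    // Default: worktrees live inside the repository itself.
    return path.posix.join(baseRepo, ".claude", "worktrees");
  }
  // Override: one sub-directory per repository under the custom root.
  return path.posix.join(location.customPath, path.posix.basename(baseRepo));
}
```

With a custom location set, `ls` against the default path will come up empty even when isolation is working, so check the setting before filing a failure.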
## T30 — Auto-archive on PR merge
**Severity:** Should
**Surface:** Code tab → Sidebar
**Applies to:** All rows
**Issues:**
**Steps:**
1. In Settings → Claude Code, enable **Auto-archive on PR close** (`ccAutoArchiveOnPrClose`).
2. Open a PR from a local session. Merge or close it on GitHub.
3. Wait up to ~5–6 minutes (sweep runs every 5 minutes, with a 30s startup delay). Observe the sidebar.
**Expected:** Local session whose PR is `merged` or `closed` is archived from the sidebar on the next sweep tick (≤ ~5 min) after the merge/close event. Cached PR-state lookups have a 1-hour cooldown for sessions whose state isn't yet terminal. Remote and SSH sessions are not affected.
**Diagnostics on failure:** Screenshot of sidebar, `gh pr view <num>` output (confirming merge state), launcher log, settings file content (`ccAutoArchiveOnPrClose`).
**References:** [Work in parallel with sessions](https://code.claude.com/docs/en/desktop#work-in-parallel-with-sessions)
**Code anchors:**
- `build-reference/app-extracted/.vite/build/index.js:55269` — default `ccAutoArchiveOnPrClose: !1` setting.
- `build-reference/app-extracted/.vite/build/index.js:533517` — sweep cadence constants: `$3n = 300_000` ms (5 min interval), `W3n = 3_600_000` ms (1 h recheck cooldown), `Fst = 10` (concurrent batch size).
- `build-reference/app-extracted/.vite/build/index.js:533520` — `AutoArchiveEngine.start()` schedules the 5-min interval + 30s initial delay.
- `build-reference/app-extracted/.vite/build/index.js:533537` — `sweep()` gates on `Qi("ccAutoArchiveOnPrClose")` and archives sessions whose `prState` lowercases to `merged` or `closed` (`D3A` predicate at line 533607).
- `build-reference/app-extracted/.vite/build/index.js:533571` — `archiveSession(..., { cleanupWorktree: true })` removes the worktree alongside the archive.
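
The wait time in step 3 follows from the sweep cadence. The sketch below models it under an explicit assumption that sweeps fire at 30 s after start and then every 5 minutes; the exact schedule inside `AutoArchiveEngine.start()` is not confirmed here, and all names are hypothetical. It ignores the 1-hour recheck cooldown.

```typescript
const STARTUP_DELAY_MS = 30_000; // 30 s initial delay
const SWEEP_INTERVAL_MS = 300_000; // 5 min between sweeps

// A PR state is terminal when it lowercases to "merged" or "closed".
function isTerminal(prState: string | undefined): boolean {
  const s = prState?.toLowerCase();
  return s === "merged" || s === "closed";
}

// Time (ms since app start) of the first sweep at or after `eventAt`,
// assuming sweeps at 30 s, 5 min 30 s, 10 min 30 s, and so on.
function nextSweepAt(eventAt: number): number {
  if (eventAt <= STARTUP_DELAY_MS) return STARTUP_DELAY_MS;
  const ticks = Math.ceil((eventAt - STARTUP_DELAY_MS) / SWEEP_INTERVAL_MS);
  return STARTUP_DELAY_MS + ticks * SWEEP_INTERVAL_MS;
}
```

Under this model a merge that lands just after a sweep waits almost a full interval, which is why the step allows a little more than five minutes.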
## T31 — Side chat opens
**Severity:** Should
**Surface:** Code tab → Side chat overlay
**Applies to:** All rows
**Issues:**
**Steps:**
1. In a Code-tab session, press `Ctrl+;` (or type `/btw` in the prompt).
2. Ask a question in the side chat. Confirm the side chat sees the main thread context.
3. Close the side chat. Confirm focus returns to the main session and the side chat content is not in the main thread.
**Expected:** Side chat opens, has access to main-thread context, but its replies do not appear in the main conversation. Closing returns focus.
**Diagnostics on failure:** Screenshot, launcher log, DevTools console.
**References:** [Ask a side question](https://code.claude.com/docs/en/desktop#ask-a-side-question-without-derailing-the-session)
**Code anchors:**
- `build-reference/app-extracted/.vite/build/index.js:487025` — side-chat system-prompt suffix: "You are running in a side chat — a lightweight fork… nothing you say here lands in the main transcript."
- `build-reference/app-extracted/.vite/build/index.js:487265` — `this.sideChats = new Map()` per-session fork registry.
- `build-reference/app-extracted/.vite/build/index.js:491658` — `startSideChat()` implementation; emits `side_chat_ready` / `side_chat_assistant` / `side_chat_turn_end` / `side_chat_closed` / `side_chat_error` events.
- `build-reference/app-extracted/.vite/build/mainView.js:7506` — preload IPC bridges: `startSideChat`, `sendSideChatMessage`, `stopSideChat` (the renderer SPA wires `Ctrl+;` / `/btw` to these — UI lives in claude.ai's remote bundle, not build-reference).
## T32 — Slash command menu
**Severity:** Should
**Surface:** Code tab → Prompt slash menu
**Applies to:** All rows
**Issues:**
**Steps:**
1. In a Code-tab session, type `/` in the prompt box.
2. Verify built-in commands, custom skills under `~/.claude/skills/`, project skills, and skills from installed plugins all appear.
3. Select an entry — confirm it inserts as a highlighted token.
**Expected:** Slash menu lists every available command/skill. Selection inserts the token correctly.
**Diagnostics on failure:** Screenshot of slash menu, `ls ~/.claude/skills/`, project `.claude/skills/`, installed plugin manifest, launcher log.
**References:** [Use skills](https://code.claude.com/docs/en/desktop#use-skills)
**Code anchors:**
- `build-reference/app-extracted/.vite/build/index.js:459463` — `getSupportedCommands({sessionId})` aggregates per-session `slashCommands` + cowork command registry (`p2()`) + built-ins (`Q_t`).
- `build-reference/app-extracted/.vite/build/index.js:332711` — `slashCommands: Di.array(Di.string()).optional()` schema field on the session record.
- `build-reference/app-extracted/.vite/build/index.js:377670` — `SkillManager` constructor: `skillDir = <agentDir>/.claude/skills`, `_discoverSkills()` walks project skills.
- `build-reference/app-extracted/.vite/build/index.js:444678` — private/public skill split under `<skillsRoot>/skills/{private,public}` for plugin-supplied skills.

@@ -0,0 +1,168 @@
# Distribution — DEB, RPM, AppImage
Tests covering Ubuntu/DEB-specific install behavior, Fedora/RPM-specific install behavior, AppImage fallback paths, and the auto-update interaction with system package managers. See [`../matrix.md`](../matrix.md) for status.
## S01 — AppImage launches without manual `libfuse2t64` install
**Severity:** Critical (for Ubuntu users)
**Surface:** AppImage runtime / FUSE
**Applies to:** Ubu (and any Ubuntu 24.04+ host)
**Issues:**
**Steps:**
1. Fresh Ubuntu 24.04 install with default packages only.
2. Download the project AppImage.
3. Make executable and run it.
**Expected:** AppImage runs without first installing `libfuse2t64`. Either the AppImage bundles its own FUSE shim, the `.desktop`/postinst declares the dep, or the launcher gives a clear error pointing at the package name.
**Currently:** Fails on Ubuntu 24.04 with `dlopen(): error loading libfuse.so.2`. Workaround: `sudo apt install libfuse2t64`. Not yet filed.
**Diagnostics on failure:** Full stderr from the AppImage launch, `ldd ./claude-desktop-*.AppImage`, `dpkg -l | grep -i fuse`.
**References:**
**Code anchors:** `scripts/packaging/appimage.sh:226` (downloads the upstream `appimagetool` AppImage as-is — no FUSE shim or static-mksquashfs bundling), `scripts/launcher-common.sh:64` (AppImage forces `--no-sandbox` "due to FUSE constraints"), `.github/workflows/test-artifacts.yml:47` (CI installs `libfuse2` before running the AppImage — i.e. the runtime hard-depends on libfuse2/libfuse2t64). No postinst dep declaration or user-facing FUSE error message exists.
## S02 — `XDG_CURRENT_DESKTOP=ubuntu:GNOME` doesn't break DE detection
**Severity:** Critical
**Surface:** DE detection / patch gate
**Applies to:** Ubu
**Issues:**
**Steps:**
1. On Ubuntu 24.04 (where `XDG_CURRENT_DESKTOP=ubuntu:GNOME`), launch the app.
2. Inspect launcher log for any DE-detection branches that should fire as GNOME.
3. Audit `scripts/launcher-common.sh` and any DE-gated patches for string-equality checks against `XDG_CURRENT_DESKTOP`.
**Expected:** DE-detection logic handles Ubuntu's colon-separated value. `contains "GNOME"` or splitting on `:` is the safe pattern; `== "GNOME"` would miss Ubuntu.
**Diagnostics on failure:** `echo $XDG_CURRENT_DESKTOP`, the relevant launcher.sh code path, launcher log, the patches that ran or didn't.
**References:** Surfaced via session-capture review.
**Code anchors:** `scripts/launcher-common.sh:35-44` (Niri auto-detect lowercases `XDG_CURRENT_DESKTOP` and uses `*niri*` glob — handles colon-separated values), `scripts/patches/quick-window.sh:34-35` and `:117-118` (KDE gate uses `.toLowerCase().includes("kde")` — substring, not equality), `scripts/doctor.sh:304` (purely informational `_info "Desktop: $desktop"`, no branching). No `==` equality checks against `XDG_CURRENT_DESKTOP` exist anywhere in shell or patched JS.
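
For comparison with the substring checks cited above, the "split on `:`" pattern from the Expected line can be sketched as follows. It is written in TypeScript to match the harness rather than the launcher's shell, and `desktopTokens` / `isDesktop` are hypothetical helpers, not functions in this repo.

```typescript
// Splits XDG_CURRENT_DESKTOP into lowercase tokens, e.g.
// "ubuntu:GNOME" becomes ["ubuntu", "gnome"].
function desktopTokens(xdgCurrentDesktop: string | undefined): string[] {
  return (xdgCurrentDesktop ?? "")
    .split(":")
    .map((token) => token.trim().toLowerCase())
    .filter((token) => token.length > 0);
}

// True when any token matches `name`, case-insensitively.
function isDesktop(xdgCurrentDesktop: string | undefined, name: string): boolean {
  return desktopTokens(xdgCurrentDesktop).includes(name.toLowerCase());
}
```

Token matching is slightly stricter than a substring test: it cannot be fooled by a desktop name that merely contains the target as part of a longer word.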
## S03 — DEB install via APT pulls all required runtime deps
**Severity:** Critical
**Surface:** APT repository / dependency declarations
**Applies to:** Ubu (any DEB-based distro)
**Issues:** [`docs/learnings/apt-worker-architecture.md`](../../learnings/apt-worker-architecture.md)
**Steps:**
1. Add the project's APT repo per the README install instructions.
2. `sudo apt install claude-desktop` on a fresh container/VM.
3. Run `claude-desktop` — first launch should succeed with no further package installs.
**Expected:** All transitive runtime deps are declared in the package and pulled by APT. First launch succeeds without manual `apt install` of any extra package.
**Diagnostics on failure:** `apt-cache depends claude-desktop`, missing-library errors from the launcher, `ldd` against the binary.
**References:** [`docs/learnings/apt-worker-architecture.md`](../../learnings/apt-worker-architecture.md)
**Code anchors:** `scripts/packaging/deb.sh:185-197` (DEBIAN/control file — no `Depends:` field is emitted; relies on bundled Electron + the comment "No external dependencies are required at runtime" at line 183), `scripts/packaging/deb.sh:202-230` (postinst only sets chrome-sandbox suid, no dep-pull). Worker chain serving the package: `worker/src/worker.js:22-31` (`DEB_RE`) and `:33-43` (302 → GitHub Releases).
## S04 — RPM install via DNF pulls all required runtime deps
**Severity:** Critical
**Surface:** DNF repository / dependency declarations
**Applies to:** KDE-W, KDE-X, GNOME, Sway, i3, Niri (any RPM-based distro)
**Issues:** [`docs/learnings/apt-worker-architecture.md`](../../learnings/apt-worker-architecture.md) *(covers both APT and DNF)*
**Steps:**
1. Add the project's DNF repo per the README.
2. `sudo dnf install claude-desktop` on a fresh container/VM.
3. Run `claude-desktop` — first launch should succeed.
**Expected:** All transitive runtime deps are declared in the RPM and pulled by DNF. First launch succeeds with no further package installs.
**Diagnostics on failure:** `dnf repoquery --requires claude-desktop`, `rpm -qR claude-desktop`, launcher missing-library errors.
**References:** [`docs/learnings/apt-worker-architecture.md`](../../learnings/apt-worker-architecture.md)
**Code anchors:** `scripts/packaging/rpm.sh:188` (`AutoReqProv: no` — explicitly disables RPM's auto-dep generation; spec declares no `Requires:`), `scripts/packaging/rpm.sh:194-198` (strip + build-id disabled because Electron binaries don't tolerate them — bundled approach). Worker chain: `worker/src/worker.js:28-31` (`RPM_RE`).
## S05 — Doctor recognises dnf-installed package, doesn't false-flag as AppImage
**Severity:** Should
**Surface:** Doctor package-format detection
**Applies to:** KDE-W, KDE-X, GNOME, Sway, i3, Niri
**Issues:**
**Steps:**
1. On a Fedora/Nobara/RPM-based distro with claude-desktop installed via dnf, run `claude-desktop --doctor`.
2. Look for the install-method line.
**Expected:** Doctor detects rpm install (e.g. via `rpm -qf` against the binary path) and reports it cleanly. No `not found via dpkg (AppImage?)` warning.
**Currently:** Doctor's install-method check is gated on `command -v dpkg-query`, so on RPM-only hosts (no dpkg installed) the block is skipped entirely — no install-method line is printed. On hosts that have *both* `dpkg-query` and an rpm-installed `claude-desktop` (uncommon, e.g. mixed Debian + dnf), the misleading `claude-desktop not found via dpkg (AppImage?)` WARN does fire. Either way, no `rpm -qf` branch exists. Affects KDE-W, KDE-X, GNOME, Sway, i3, Niri rows ([T13](./launch.md#t13--doctor-reports-correct-package-format)). Not yet filed.
**Diagnostics on failure:** Full `--doctor` output, `rpm -qf $(which claude-desktop)`, the doctor source line that decides the format.
**References:** [T13](./launch.md#t13--doctor-reports-correct-package-format)
**Code anchors:** `scripts/doctor.sh:353-362` — install-method check is gated on `command -v dpkg-query`; only runs on Debian-family hosts. Falls through to `_warn 'claude-desktop not found via dpkg (AppImage?)'` only if `dpkg-query` is present but returns empty. On Fedora/RPM hosts (`dpkg-query` absent), the entire block is skipped and **no install-method line is printed at all** — neither the misleading WARN nor a correct `rpm -qf` PASS. The drift is "no detection" rather than "false-flag as AppImage" on dpkg-less systems.
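
The behaviour this test expects can be written down as a small decision table. The sketch below is purely illustrative of the desired outcome, not of `doctor.sh` as it exists today: `Probes`, `detectInstallMethod`, and the idea of probing both package databases are assumptions, and the real fix would live in shell.

```typescript
type InstallMethod = "deb" | "rpm" | "unknown";

// Result of asking each package database whether it owns the binary.
// `undefined` means the tool (dpkg-query or rpm) is not installed.
interface Probes {
  dpkgOwns?: boolean;
  rpmOwns?: boolean;
}

function detectInstallMethod(probes: Probes): InstallMethod {
  if (probes.dpkgOwns === true) return "deb";
  if (probes.rpmOwns === true) return "rpm";
  // Neither database claims the binary: AppImage or manual install.
  return "unknown";
}
```

The mixed-host case (`dpkgOwns: false`, `rpmOwns: true`) is the one that currently produces the misleading WARN, so it is worth covering explicitly in any fix.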
## S15 — AppImage extraction (`--appimage-extract`) works as documented fallback
**Severity:** Could
**Surface:** AppImage runtime / FUSE-less fallback
**Applies to:** Any AppImage row
**Issues:**
**Steps:**
1. On a host without FUSE, run `./claude-desktop-*.AppImage --appimage-extract`.
2. Inspect `squashfs-root/`.
3. Run `squashfs-root/AppRun`.
**Expected:** Extraction completes. `squashfs-root/AppRun` launches the app cleanly without FUSE.
**Diagnostics on failure:** Extraction stderr, `ls squashfs-root/`, AppRun stderr.
**References:** Linked from the runtime error message when FUSE is missing.
**Code anchors:** `scripts/packaging/appimage.sh:282` and `:312` (built with stock `appimagetool`, which always supports `--appimage-extract`), `scripts/packaging/appimage.sh:70-118` (`AppRun` script that lives at `squashfs-root/AppRun` after extraction). CI exercises this path: `tests/test-artifact-appimage.sh:36-44` and `.github/workflows/ci.yml:388` both run `--appimage-extract` and assert `squashfs-root/` exists.
## S16 — AppImage mount cleans up on app exit
**Severity:** Should
**Surface:** AppImage mount lifecycle
**Applies to:** Any AppImage row
**Issues:** [CLAUDE.md "Common Gotchas"](https://github.com/aaddrick/claude-desktop-debian/blob/main/CLAUDE.md)
**Steps:**
1. Launch the AppImage. Confirm `mount | grep claude` shows the mount.
2. Quit the app cleanly via tray → Quit (or `Ctrl+Q`).
3. Re-run `mount | grep claude` — mount should be gone.
**Expected:** AppImage's mount at `/tmp/.mount_claude*` is unmounted and the directory removed when all child Electron processes exit. Stale mounts after force-quit are handled by `pkill -9 -f "mount_claude"` per CLAUDE.md but should not be the common case.
**Diagnostics on failure:** `mount | grep claude` after exit, `ls -la /tmp/.mount_claude*`, `pgrep -af claude`, `journalctl -k -n 50` for mount errors.
**References:** [CLAUDE.md "Common Gotchas"](https://github.com/aaddrick/claude-desktop-debian/blob/main/CLAUDE.md)
**Code anchors:** Mount lifecycle is owned by upstream `appimagetool`'s runtime, not this repo — `scripts/packaging/appimage.sh:282`/`:312` invokes the stock tool with no custom AppRun-side cleanup. `CLAUDE.md:179-183` documents `pkill -9 -f "mount_claude"` as the manual recovery for stale mounts after force-quit. No project-side unmount handler exists; the test asserts upstream behavior, not ours.
## S26 — Auto-update is disabled when installed via `apt` / `dnf`
> **⚠ Missing in build 1.5354.0** — No project-side suppression of upstream auto-update exists; the launcher exports `ELECTRON_FORCE_IS_PACKAGED=true`, which causes upstream's `lii()` gate to return true on Linux and the auto-update tick loop to start. Suppression is "accidental" — it relies on Electron's built-in `autoUpdater` module being unimplemented on Linux (so `setFeedURL`/`checkForUpdates` throw, the `error` listener logs, and no download happens). Tracked at [#567](https://github.com/aaddrick/claude-desktop-debian/issues/567); re-verify after next upstream bump.
**Severity:** Critical
**Surface:** Auto-update path
**Applies to:** All DEB/RPM rows
**Issues:** [#567](https://github.com/aaddrick/claude-desktop-debian/issues/567)
**Steps:**
1. Install via APT or DNF.
2. Launch the app and let it sit for ~5 minutes.
3. Inspect launcher log + filesystem for any auto-update download attempt.
**Expected:** When installed via the project's APT or DNF repo, the in-app auto-update path is suppressed. The app does not download replacement binaries (which would race the package manager). Updates flow through `apt upgrade` / `dnf upgrade` only. AppImage installs may continue to self-update or punt to the user.
**Diagnostics on failure:** Launcher log, network captures (look for downloads from `releases.anthropic.com` or `api.anthropic.com/api/desktop/linux/...`), filesystem changes under `~/.config/Claude/`.
**References:** [`docs/learnings/apt-worker-architecture.md`](../../learnings/apt-worker-architecture.md)
**Code anchors:** `scripts/launcher-common.sh:249` (`export ELECTRON_FORCE_IS_PACKAGED=true` — makes upstream think it's installed); `build-reference/app-extracted/.vite/build/index.js:508761-508769` (upstream `lii()` returns `hA.app.isPackaged` on Linux — passes the gate); `:508554-508559` (only suppression hook is enterprise-policy `disableAutoUpdates`, no Linux/distro carve-out); `:508770-508774` (feed URL `https://api.anthropic.com/api/desktop/linux/<arch>/squirrel/update?...`); `:508800-508803` (calls `hA.autoUpdater.setFeedURL` + `.checkForUpdates()` unconditionally on Linux). No patch in `scripts/patches/*.sh` neutralizes the autoUpdater module or sets `disableAutoUpdates`. AppImage continues to ship update info: `scripts/packaging/appimage.sh:308-309` (`gh-releases-zsync` zsync metadata embedded for releases).


@@ -0,0 +1,153 @@
# Extensibility — Plugins, MCP, Hooks, Memory
Tests covering the Anthropic & Partners plugin install flow, the plugin browser, MCP server config, hooks, `CLAUDE.md` memory loading, and per-user storage of plugins/worktrees. See [`../matrix.md`](../matrix.md) for status.
## T11 — Plugin install (Anthropic & Partners)
**Severity:** Smoke
**Surface:** Plugin browser → install flow
**Applies to:** All rows
**Issues:** [`docs/learnings/plugin-install.md`](../../learnings/plugin-install.md)
**Steps:**
1. In a Code-tab session, click **+** → **Plugins** → **Add plugin**.
2. Find an Anthropic & Partners plugin. Click **Install**.
3. Verify it lands in **Manage plugins** and its skills appear in the slash menu.
4. Re-install the same plugin to verify idempotence.
**Expected:** Install completes end-to-end: gate logic accepts, backend endpoint responds, plugin appears in the plugin list. Re-install is idempotent.
**Diagnostics on failure:** DevTools network panel during install, launcher log, `~/.claude/plugins/` content, the gate-logic code path (see learnings doc).
**References:** [`docs/learnings/plugin-install.md`](../../learnings/plugin-install.md), [Install plugins](https://code.claude.com/docs/en/desktop#install-plugins)
**Code anchors:** `build-reference/app-extracted/.vite/build/index.js:507181` (`installPlugin` IPC + gate, with `pluginSource === "remote"` branch and CLI fallback); `:507193` log `[CustomPlugins] installPlugin: attempting remote API install`; `:465816` `dx()` returns `~/.claude/plugins`; `:465822` `installed_plugins.json` (idempotency record).
**Inventory anchor:** `…customize.main.navigation.button-by-name.add-plugin` (role `button`, label `Add plugin`); sibling `…button-by-name.browse-plugins` (label `Browse plugins`). Both are persistent in the Customize panel — anchors the entry-point click chain.
## T33 — Plugin browser
**Severity:** Should
**Surface:** Plugin browser UI
**Applies to:** All rows
**Issues:**
**Steps:**
1. Click **+** → **Plugins** → **Add plugin**.
2. Confirm entries from the official Anthropic marketplace appear.
3. Install a non-Anthropic plugin end-to-end.
4. Verify it shows in **Manage plugins** and contributes its skills to the slash menu.
**Expected:** Plugin browser opens, shows the marketplace, install completes. Installed plugins appear under Manage plugins and contribute to the slash menu.
**Diagnostics on failure:** Screenshot of plugin browser, network captures, launcher log, `~/.claude/plugins/` listing.
**References:** [Install plugins](https://code.claude.com/docs/en/desktop#install-plugins)
**Code anchors:** `build-reference/app-extracted/.vite/build/index.js:71392` (`CustomPlugins.listMarketplaces` IPC); `:71534` (`listAvailablePlugins` IPC); `:507176` (`listMarketplaces` main-process handler); `:496236` deep-link route `plugins/new` opens the browser surface.
**Inventory anchor:** `…customize.main.navigation.button-by-name.browse-plugins` (role `button`, label `Browse plugins`); sibling `…link-by-name.connectors` (role `link`, label `Connectors`). The browser surface itself (marketplace listings, install button) appears under a child dialog not captured at idle — re-capture with the dialog open to anchor those.
## T35 — MCP server config picked up
**Severity:** Critical
**Surface:** MCP / Code tab
**Applies to:** All rows
**Issues:**
**Steps:**
1. Add an MCP server to `~/.claude.json` or `<project>/.mcp.json`.
2. Open a Code-tab session against the project.
3. Type `/` in the prompt — verify MCP-provided tools appear in the slash menu (or invoke one directly).
4. Separately, confirm `claude_desktop_config.json` (Chat-tab MCP) is **not** picked up by Code tab.
**Expected:** MCP servers in `~/.claude.json` or `.mcp.json` start when a Code session opens. Tools appear in the slash menu, calls succeed end-to-end. `claude_desktop_config.json` is separate per upstream docs.
**Diagnostics on failure:** Server stderr (MCP servers log to stderr), `~/.claude.json` and `.mcp.json` content, launcher log, DevTools console for MCP wire errors.
**References:** [MCP servers: desktop chat app vs Claude Code](https://code.claude.com/docs/en/desktop#shared-configuration), [`docs/learnings/plugin-install.md`](../../learnings/plugin-install.md)
**Code anchors:** `build-reference/app-extracted/.vite/build/index.js:215418` (Code-tab loads `<project>/.mcp.json` per scanned dir); `:176766` reads `~/.claude.json`; `:489098` Code-session passes `settingSources: ["user", "project", "local"]` to the agent SDK; `:130821` `claude_desktop_config.json` is the chat-tab path constant (separate userData dir at `:130829` `kee()`), confirming the two trees do not overlap.
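
For step 1, a project-scoped server entry might look like the sketch below. The `mcpServers` key with per-server `command` and `args` fields follows the upstream MCP config format as commonly documented. The server name `demo-echo`, the package name, and the `write_sample_mcp_config` helper are placeholders for illustration, not anything shipped in this repo.

```shell
# Write a minimal .mcp.json into a project directory (hypothetical helper).
# "demo-echo" and its package name are placeholders; substitute a real server.
write_sample_mcp_config() {
  local project_dir="$1"
  cat > "${project_dir}/.mcp.json" <<'EOF'
{
  "mcpServers": {
    "demo-echo": {
      "command": "npx",
      "args": ["-y", "your-mcp-server-package"]
    }
  }
}
EOF
}
```

Writing the file into a scratch project (for example `write_sample_mcp_config "$(mktemp -d)"`) keeps the test from touching a real `~/.claude.json`.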
## T36 — Hooks fire
**Severity:** Critical
**Surface:** Hooks runtime
**Applies to:** All rows
**Issues:**
**Steps:**
1. Add a `SessionStart` hook in `~/.claude/settings.json` that writes a marker file.
2. Open a new Code-tab session.
3. Confirm the marker file exists.
4. Repeat with `PreToolUse` / `PostToolUse` hooks. Switch transcript view to Verbose to see the hook output.
**Expected:** Hooks defined in `~/.claude/settings.json` execute at the documented points. Hook output is visible in Verbose transcript mode. A failing hook surfaces a clear error rather than silently breaking the session.
**Diagnostics on failure:** Hook script stderr, marker file presence, launcher log, settings file content, Verbose transcript output.
**References:** [Shared configuration](https://code.claude.com/docs/en/desktop#shared-configuration)
**Code anchors:** `build-reference/app-extracted/.vite/build/index.js:489098` Code-session sets `settingSources: ["user", "project", "local"]` (agent SDK reads `~/.claude/settings.json` hooks from this); `:455717` built-in `PreToolUse` hooks registry the runtime extends; `:455819` `UserPromptSubmit`; `:465680` `PostToolUse`; `:465754` `Stop`; `:493411` runtime emits `hook_started` / `hook_progress` / `hook_response` for `SessionStart` (Verbose transcript path).
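
The marker-file hook from step 1 could be set up roughly as follows. This is a sketch that assumes the hooks schema described in the upstream hooks documentation (an event name mapping to a list of entries, each with a `hooks` array of `command` items); verify the shape against those docs before relying on it. The helper name and both paths are hypothetical.

```shell
# Write a settings file with a SessionStart hook that touches a marker file.
# The target path is an argument so a real ~/.claude/settings.json is never
# overwritten by accident. Assumes the marker path has no quotes or spaces.
write_session_start_hook() {
  local settings_path="$1" marker="$2"
  mkdir -p "$(dirname "$settings_path")"
  cat > "$settings_path" <<EOF
{
  "hooks": {
    "SessionStart": [
      {
        "hooks": [
          { "type": "command", "command": "touch ${marker}" }
        ]
      }
    ]
  }
}
EOF
}
```

After opening a new Code-tab session, `[ -e "$marker" ]` is the pass/fail check for step 3.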
## T37 — `CLAUDE.md` memory loads
**Severity:** Critical
**Surface:** Memory / Code tab session prompt
**Applies to:** All rows
**Issues:**
**Steps:**
1. Confirm a project `CLAUDE.md` exists at the working folder.
2. Confirm `~/.claude/CLAUDE.md` exists with at least one identifying token.
3. Open a Code-tab session against the project.
4. Ask Claude "what's in your CLAUDE.md" — verify the response matches on-disk content.
5. Edit `CLAUDE.md`. Start a new session — verify the new content is loaded.
**Expected:** Project `CLAUDE.md` and `CLAUDE.local.md` at the working folder, plus `~/.claude/CLAUDE.md`, are loaded into the session's system prompt. Edits to any of these files take effect on the next session start.
**Diagnostics on failure:** `cat CLAUDE.md` and `cat ~/.claude/CLAUDE.md` outputs, launcher log, system-prompt dump if accessible (Verbose transcript may show it).
**References:** [Shared configuration](https://code.claude.com/docs/en/desktop#shared-configuration)
**Code anchors:** `build-reference/app-extracted/.vite/build/index.js:259691` working-dir scan reads `CLAUDE.md` and `.claude/CLAUDE.md`; `:455188` global account memory `zhA(accountId, orgId)` is copied to the per-session `.claude/CLAUDE.md` at session start (`[GlobalMemory] Copied CLAUDE.md`); `:283107` `cE()` resolves `CLAUDE_CONFIG_DIR` or `~/.claude`, the dir whose `CLAUDE.md` the agent SDK loads via `settingSources: ["user", ...]` (see T36 anchor at `:489098`).
## S27 — Plugins install per-user, not into system paths
**Severity:** Should
**Surface:** Plugin storage
**Applies to:** All rows
**Issues:**
**Steps:**
1. As a non-root user, install a plugin via the desktop plugin browser.
2. Inspect `~/.claude/plugins/` for the install.
3. Verify nothing was written under `/usr` or other system-managed trees: run `touch /tmp/marker` before the install in step 1, then `find /usr -newer /tmp/marker -name '*claude*' 2>/dev/null` afterwards.
**Expected:** Plugins land under `~/.claude/plugins/` (or the equivalent per-user dir). Never under `/usr`. Non-root install/enable/disable works without `sudo`.
**Diagnostics on failure:** `find / -name '*<plugin-name>*' 2>/dev/null`, install logs, launcher log.
**References:** [Install plugins](https://code.claude.com/docs/en/desktop#install-plugins)
**Code anchors:** `build-reference/app-extracted/.vite/build/index.js:283107` `cE()` resolves the config root to `CLAUDE_CONFIG_DIR` or `~/.claude` — never `/usr`; `:465815` `dx()` returns `<cE()>/plugins`; `:465821`/`:465824`/`:465827` `installed_plugins.json`, `known_marketplaces.json`, `marketplaces/` all sit under `dx()`. No system-path writes in the install path.
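
The marker-file check in step 3 can be wrapped in a small helper. This is a minimal sketch; `files_newer_than` is a hypothetical name, and unlike the command in step 3 it drops the `-name '*claude*'` filter so that unexpectedly named files are also caught.

```shell
# List regular files under the given roots modified after the marker file.
# Usage: touch "$marker"; <install plugin>; files_newer_than "$marker" /usr /opt
files_newer_than() {
  local marker="$1"
  shift
  find "$@" -newer "$marker" -type f 2>/dev/null
}
```

An empty result for `/usr` (and any other system-managed tree passed in) is the passing outcome.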
## S28 — Worktree creation surfaces clear error on read-only mounts
**Severity:** Could
**Surface:** Worktree creation on read-only filesystem
**Applies to:** All rows (NixOS users hit this most often)
**Issues:**
**Steps:**
1. Place a project on a read-only mount (e.g. squashfs, NFS read-only export, `mount -o ro` bind).
2. Open a Code-tab session against it.
3. Try to start a parallel session that needs a worktree.
**Expected:** Worktree creation fails with a clear error pointing at the read-only mount. No silent loss of work, no writes to a wrong directory, no parent-repo corruption.
**Diagnostics on failure:** `mount | grep <project-path>`, `git worktree add` direct invocation (does it fail the same way?), launcher log, screenshot of error dialog.
**References:** [Work in parallel with sessions](https://code.claude.com/docs/en/desktop#work-in-parallel-with-sessions)
**Code anchors:** `build-reference/app-extracted/.vite/build/index.js:462841` worktree parent dir is `<repo>/.claude/worktrees` (or `chillingSlothLocation.customPath` override at `:462836`); `:462928` `git worktree add` failure path returns `null` after `R.error("Failed to create git worktree: …")`; `:462760` `Sbn()` classifies "Permission denied" / "Access is denied" / "could not lock config file" as `"permission-denied"` (the read-only-mount taxonomy bucket).
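
When triaging a failure, it can help to check which bucket the captured `git worktree add` error text would fall into. The sketch below restates the three substrings the anchor attributes to `Sbn()`; it is not the upstream implementation, and `classify_worktree_error` is a hypothetical name.

```shell
# Map an error message to the bucket described in the code anchor above:
# three substrings count as "permission-denied", everything else is "other".
classify_worktree_error() {
  case "$1" in
    *"Permission denied"*|*"Access is denied"*|*"could not lock config file"*)
      echo "permission-denied" ;;
    *)
      echo "other" ;;
  esac
}
```

Under these three patterns alone, a message that only says `Read-only file system` would land in `other`, so it is worth recording the exact stderr text from the read-only mount in the diagnostics.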


@@ -0,0 +1,77 @@
# Launch & Process Lifecycle
Tests covering app startup, the `--doctor` health check, package-format detection, and multi-instance behavior. See [`../matrix.md`](../matrix.md) for status.
## T01 — App launch
**Severity:** Smoke
**Surface:** App startup
**Applies to:** All rows
**Issues:**
**Runner:** [`tools/test-harness/src/runners/T01_app_launch.spec.ts`](../../../tools/test-harness/src/runners/T01_app_launch.spec.ts)
**Steps:**
1. From a clean session, run `claude-desktop` (deb/rpm) or launch the AppImage.
2. Wait up to 10 seconds.
**Expected:** Main window opens within ~10s. No error toast, no crash. The launcher log at `~/.cache/claude-desktop-debian/launcher.log` shows the expected backend selection (`Using X11 backend via XWayland` on Wayland sessions, or native Wayland when forced).
**Diagnostics on failure:** Launcher log, `--doctor` output, session env (`XDG_SESSION_TYPE`, `XDG_CURRENT_DESKTOP`), `dmesg | tail -50`, any crash report under `~/.config/Claude/logs/`.
**References:**
**Code anchors:** `scripts/launcher-common.sh:98` (X11-via-XWayland log line), `scripts/launcher-common.sh:102` (native-Wayland log line), `build-reference/app-extracted/.vite/build/index.js:524875` (`app.on("ready")` registration), `build-reference/app-extracted/.vite/build/index.js:524881-524931` (main `BrowserWindow` factory `Ori()`: `titleBarStyle`, mainWindow.js preload, initial `show`).
## T02 — Doctor health check
**Severity:** Critical
**Surface:** CLI / `--doctor`
**Applies to:** All rows
**Issues:** [PR #538](https://github.com/aaddrick/claude-desktop-debian/pull/538)
**Steps:**
1. Run `claude-desktop --doctor`.
2. Inspect exit code (`echo $?`) and stdout/stderr.
**Expected:** Exits 0. All checks PASS or report expected WARN. No FAIL checks. Doctor currently reports display-server, menu-bar mode, Electron path/version, Chrome sandbox perms, SingletonLock, MCP config, Node.js, desktop entry, disk space, and a Cowork section — it does **not** surface the resolved titlebar style. See also [T13](#t13--doctor-reports-correct-package-format) for the package-format detection slice.
**Diagnostics on failure:** Full `--doctor` output, the install path being inspected (`which claude-desktop`), package metadata (`dpkg -S` / `rpm -qf` against the binary).
**References:** [PR #538](https://github.com/aaddrick/claude-desktop-debian/pull/538)
**Code anchors:** `scripts/doctor.sh:280` (`run_doctor` entry point), `scripts/doctor.sh:301-319` (display-server check), `scripts/doctor.sh:401-417` (SingletonLock check), `scripts/doctor.sh:744-753` (exit-code summary).
## T13 — Doctor reports correct package format
**Severity:** Should
**Surface:** CLI / `--doctor`
**Applies to:** All rows (currently `✗` on every Fedora row — see [S05](./distribution.md#s05--doctor-recognises-dnf-installed-package-doesnt-false-flag-as-appimage))
**Issues:** *(no issue filed; surfaced via session-capture review)*
**Steps:**
1. Install via the relevant package manager (`apt` / `dnf`) or AppImage.
2. Run `claude-desktop --doctor` and look for the install-method line.
**Expected:** Doctor identifies the install method correctly. On RPM-based distros (Fedora, Nobara) it does **not** report `not found via dpkg (AppImage?)` — that warning currently false-flags every dnf install. On DEB-based distros it does not assume AppImage when dpkg returns the package metadata.
**Diagnostics on failure:** `dpkg -S $(which claude-desktop)`, `rpm -qf $(which claude-desktop)`, full `--doctor` output, the line of doctor source that decides the format.
**References:** [S05](./distribution.md#s05--doctor-recognises-dnf-installed-package-doesnt-false-flag-as-appimage)
**Code anchors:** `scripts/doctor.sh:353-362` — version probe is dpkg-only (`dpkg-query -W -f='${Version}' claude-desktop`); on RPM/AppImage hosts that lack `dpkg-query` the block is skipped, but on a Fedora host that *does* have `dpkg-query` installed (e.g. for cross-distro tooling) the `_warn 'claude-desktop not found via dpkg (AppImage?)'` branch fires for any dnf-installed copy. There is no corresponding `rpm -qf` / `rpm -q claude-desktop` branch.
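
To make the missing branch concrete, an install-method probe that consults both package databases could look like the sketch below. This is a hypothetical illustration, not the logic in `scripts/doctor.sh`; the function name and the `deb` / `rpm` / `unknown` labels are assumptions.

```shell
# Hypothetical install-method probe with an rpm branch alongside dpkg.
# Prints one of: deb, rpm, unknown.
detect_install_method() {
  local pkg="${1:-claude-desktop}"
  if command -v dpkg-query >/dev/null 2>&1 &&
     dpkg-query -W -f='${Version}' "$pkg" >/dev/null 2>&1; then
    echo "deb"
  elif command -v rpm >/dev/null 2>&1 && rpm -q "$pkg" >/dev/null 2>&1; then
    echo "rpm"
  else
    echo "unknown"
  fi
}
```

The point of the ordering is that a host with `dpkg-query` present but no matching package falls through to the `rpm` query instead of stopping at an AppImage guess.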
## T14 — Multi-instance behavior
**Severity:** Critical
**Surface:** App lifecycle
**Applies to:** All rows
**Issues:** [PR #536](https://github.com/aaddrick/claude-desktop-debian/pull/536) (closed, docs-only — no in-tree opt-in flag)
**Steps:**
1. Launch `claude-desktop`. Wait for the main window.
2. Launch `claude-desktop` again from another terminal or `.desktop` invocation.
3. Optionally: follow the manual `--user-data-dir` recipe sketched in PR #536 (separate Electron `userData` per profile so each gets its own `SingletonLock` — note the PR was closed, the recipe is not shipped in-tree).
**Expected:** Second invocation focuses the existing window — no new process. The launcher's `cleanup_stale_lock` removes a `SingletonLock` whose owning PID is no longer running. With separate `--user-data-dir` per profile (manual workaround, not an in-tree feature), each profile runs an independent Electron instance.
**Diagnostics on failure:** `pgrep -af claude-desktop`, `ls -la ~/.config/Claude/SingletonLock`, launcher log, any "another instance is running" dialog text.
**References:** [PR #536](https://github.com/aaddrick/claude-desktop-debian/pull/536)
**Code anchors:** `build-reference/app-extracted/.vite/build/index.js:525162-525173` (`requestSingleInstanceLock()` + `app.on("second-instance", ...)` — shows existing window, restores if minimized, focuses), `build-reference/app-extracted/.vite/build/index.js:525204-525207` (early-return on lost lock at `app.on("ready")`), `scripts/launcher-common.sh:187-208` (`cleanup_stale_lock` — drops a `SingletonLock` symlink whose `hostname-PID` target points at a dead PID).
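
For the diagnostics step, the stale-lock condition can be checked by hand. The sketch below assumes the symlink target has the `hostname-PID` form described in the anchor; it is not the launcher's `cleanup_stale_lock`, and `lock_is_stale` is a hypothetical name.

```shell
# Return 0 if the SingletonLock symlink points at a PID that is not running,
# 1 if the lock is live, missing, or its target cannot be parsed.
lock_is_stale() {
  local lock="$1" target pid
  [ -L "$lock" ] || return 1
  target="$(readlink "$lock")" || return 1
  pid="${target##*-}"                  # text after the last hyphen
  case "$pid" in
    ''|*[!0-9]*) return 1 ;;           # not a plain number
  esac
  ! kill -0 "$pid" 2>/dev/null         # kill -0 probes without signalling
}
```

Usage: `lock_is_stale ~/.config/Claude/SingletonLock && echo stale`. Two caveats: the hostname part of the target is ignored, and `kill -0` also fails for a live PID owned by another user, so the sketch is only meaningful when run as the user who owns the app process.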


@@ -0,0 +1,282 @@
# Platform Integration
Tests covering autostart, Cowork integration, WebGL graceful degradation, `.desktop`-launch env inheritance, encrypted env-var storage, the macOS/Windows-only Computer Use feature, and Dispatch session pairing. See [`../matrix.md`](../matrix.md) for status.
## T09 — AutoStart via XDG
**Severity:** Critical
**Surface:** XDG Autostart
**Applies to:** All rows
**Issues:** [PR #450](https://github.com/aaddrick/claude-desktop-debian/pull/450)
**Steps:**
1. In Settings, toggle "Open at Login" / "Start at boot" ON.
2. Inspect `~/.config/autostart/` for a `.desktop` entry.
3. Log out and back in. Verify the app launches automatically.
4. Toggle OFF. Verify the autostart entry is removed.
**Expected:** Toggling ON creates a `~/.config/autostart/*.desktop` entry that is XDG-spec compliant (not a custom systemd unit or shell hook). After login, app launches automatically. Toggling OFF removes the entry.
**Diagnostics on failure:** `ls -la ~/.config/autostart/`, content of the .desktop file, `desktop-file-validate` on it, launcher log.
**References:** [PR #450](https://github.com/aaddrick/claude-desktop-debian/pull/450)
**Code anchors:**
- `scripts/frame-fix-wrapper.js:376` — XDG Autostart shim
intercepting `app.{get,set}LoginItemSettings` (writes/removes
`$XDG_CONFIG_HOME/autostart/claude-desktop.desktop`).
- `scripts/frame-fix-wrapper.js:429`: `buildAutostartContent()`
emits the spec-compliant `[Desktop Entry]` block.
- `build-reference/app-extracted/.vite/build/index.js:524205`
upstream `isStartupOnLoginEnabled` / `setStartupOnLoginEnabled` IPC
surface that the wrapper interposes on.
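
For comparison when inspecting the generated file, a minimal autostart entry can be produced by hand. This is a sketch under assumptions: the keys below are standard Desktop Entry keys plus the common `X-GNOME-Autostart-enabled` extension, and the exact field set emitted by `buildAutostartContent()` may differ. `write_autostart_entry` is a hypothetical helper.

```shell
# Emit a minimal XDG autostart entry to the given path (illustrative only).
write_autostart_entry() {
  local dest="$1" exec_cmd="${2:-claude-desktop}"
  mkdir -p "$(dirname "$dest")"
  cat > "$dest" <<EOF
[Desktop Entry]
Type=Application
Name=Claude
Exec=${exec_cmd}
X-GNOME-Autostart-enabled=true
EOF
}
```

Running `desktop-file-validate` on both this file and the one the app writes gives a quick baseline for the spec-compliance check.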
## T10 — Cowork integration
**Severity:** Should
**Surface:** Cowork tab + VM daemon
**Applies to:** All rows
**Issues:** [`docs/learnings/cowork-vm-daemon.md`](../../learnings/cowork-vm-daemon.md)
**Steps:**
1. Sign into the app. Open the Cowork tab.
2. Confirm Cowork-specific UI renders (ghost icon in topbar, Cowork menus).
3. Trigger a Cowork action that needs the VM daemon.
4. Kill the VM daemon process; verify it respawns within the documented timeout.
**Expected:** Cowork features render. VM daemon spawns when needed, files are visible, daemon respawns within the documented timeout if it crashes.
**Diagnostics on failure:** `pgrep -af cowork`, daemon logs, launcher log, the respawn-logic code path (see learnings doc).
**References:** [`docs/learnings/cowork-vm-daemon.md`](../../learnings/cowork-vm-daemon.md)
**Code anchors:**
- `build-reference/app-extracted/.vite/build/index.js:143371`
upstream's Windows named-pipe path (`\\.\pipe\cowork-vm-service`)
that `scripts/patches/cowork.sh` Patch 1 rewrites to
`$XDG_RUNTIME_DIR/cowork-vm-service.sock`.
- `build-reference/app-extracted/.vite/build/index.js:143453`
`kUe()` retry loop (5 attempts, 1 s gap) that the auto-launch
injection from Patch 6 piggybacks on after the rewrite.
- `scripts/patches/cowork.sh:244` — Patch 6 (auto-launch + stdio
pipe + 10 s rate-limited respawn — issue #408).
- `scripts/patches/cowork.sh:365` — Patch 6b (extends the
reinstall-delete list with `sessiondata.img` / `rootfs.img.zst`
so a wedged daemon can self-recover).
## T12 — WebGL warn-only
**Severity:** Could
**Surface:** Chromium GPU diagnostics
**Applies to:** All rows (especially VM rows and hybrid-GPU laptops)
**Issues:**
**Steps:**
1. Launch the app. Open DevTools → navigate to `chrome://gpu`.
2. Inspect WebGL1/WebGL2 status.
3. Use the app for ~5 minutes — exercise UI, sidebar, settings.
**Expected:** WebGL1/2 may report as blocklisted (typical on virtio-gpu in VMs and on hybrid GPU laptops). This is informational. UI continues to render without graphical glitches; no feature is broken by the blocklist.
**Diagnostics on failure:** `chrome://gpu` full content, screenshot of any visual glitch, `glxinfo | head -20` (X11) or `eglinfo` (Wayland), `lspci -k | grep -A2 VGA`.
**References:**
**Code anchors:**
- `build-reference/app-extracted/.vite/build/index.js:524809`
`app.disableHardwareAcceleration()` is gated on the user-toggleable
`isHardwareAccelerationDisabled` setting; upstream does not pass
`--ignore-gpu-blocklist` or `--use-gl=*`, so chrome://gpu reflects
Chromium's stock blocklist behaviour.
- `build-reference/app-extracted/.vite/build/index.js:500571`
the only `webgl:!1` override is scoped to the feedback popup
(`in-memory-feedback` partition); main UI does not disable WebGL.
## S17 — App launched from `.desktop` inherits shell `PATH`
**Severity:** Critical
**Surface:** `.desktop`-launch env handling
**Applies to:** All rows
**Issues:**
**Steps:**
1. Configure `~/.bashrc` (or `~/.zshrc`) with `export PATH="$HOME/.custom-bin:$PATH"` and a custom binary in that dir.
2. Launch the app via dmenu/krunner/GNOME Activities/Plasma launcher (i.e. **not** from a terminal).
3. Open a Code-tab terminal pane. Run `which <custom-binary>`.
4. Repeat for `npm`, `node`, `git`, `gh`.
**Expected:** Code session can find tools defined in the user's shell profile, even when the app was launched non-interactively. Either the launcher script sources the user's shell profile, or the app reads `~/.bashrc` / `~/.zshrc` to extract `PATH` the way macOS does.
**Diagnostics on failure:** `echo $PATH` from inside the integrated terminal, the env passed to the app process (`tr '\0' '\n' < /proc/$(pgrep -of electron)/environ | grep '^PATH='`; `-o` picks the oldest match, since `pgrep -f electron` usually returns several PIDs), launcher log.
**References:** [Local sessions](https://code.claude.com/docs/en/desktop#local-sessions), [Session not finding installed tools](https://code.claude.com/docs/en/desktop#session-not-finding-installed-tools)
**Code anchors:**
- `build-reference/app-extracted/.vite/build/index.js:259300`
`SLr()` resolves the bundled `shell-path-worker/shellPathWorker.js`.
- `build-reference/app-extracted/.vite/build/index.js:259349`
`NLr()` forks it via `utilityProcess.fork`; on success
`FX()` (line 259311) merges the extracted env into `process.env`.
- `build-reference/app-extracted/.vite/build/shell-path-worker/shellPathWorker.js:205`
`extractPathFromShell()` runs the user's login shell (`-l -i`)
and parses the printed `$PATH` between sentinels (mac-style env
inheritance now applied on Linux too).
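
The sentinel technique described for `extractPathFromShell()` can be reproduced from a terminal to see what the worker would likely extract. This is a rough shell sketch, not the worker's code; the sentinel strings and both function names are made up for illustration.

```shell
# Pull the text between two sentinels out of noisy shell output.
extract_between() {
  local text="$1" start="$2" end="$3"
  case "$text" in
    *"$start"*"$end"*) ;;
    *) return 1 ;;                     # sentinels missing: nothing to extract
  esac
  text="${text#*"$start"}"             # drop everything through the start sentinel
  printf '%s\n' "${text%%"$end"*}"     # drop the end sentinel and what follows
}

# Ask a login + interactive shell for PATH, wrapped in sentinels so that
# rc-file banners and prompts can be discarded.
path_from_login_shell() {
  local shell_bin="${1:-${SHELL:-/bin/sh}}" raw
  raw="$("$shell_bin" -l -i -c 'printf "__PATH_START__%s__PATH_END__" "$PATH"' 2>/dev/null)"
  extract_between "$raw" "__PATH_START__" "__PATH_END__"
}
```

Comparing `path_from_login_shell` output with the `PATH` seen inside the Code-tab terminal shows whether the custom directory is lost in the shell profile or in the app.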
## S18 — Local environment editor persists across reboot
**Severity:** Should
**Surface:** Local env editor / encrypted store
**Applies to:** All rows
**Issues:**
**Steps:**
1. Open the local environment editor. Add `TEST_VAR=hello`.
2. Restart the app — verify variable is still there.
3. Reboot the host. Sign back in. Verify variable is still there.
**Expected:** Variables saved via the local environment editor (per-app, encrypted) survive a logout/login cycle and a full reboot. On Linux this implies the encrypted store is wired to libsecret / kwallet / gnome-keyring and unlocks at session start.
**Diagnostics on failure:** `secret-tool search` (libsecret), `kwallet-query` (KDE), `seahorse` UI inspection (GNOME), launcher log, the env-editor IPC call.
**References:** [Local sessions](https://code.claude.com/docs/en/desktop#local-sessions)
**Code anchors:**
- `build-reference/app-extracted/.vite/build/index.js:259251`
`I2t = new K_({ name: "ccd-environment-config", ... })` electron-store
backing file (`~/.config/Claude/ccd-environment-config.json`).
- `build-reference/app-extracted/.vite/build/index.js:259253`
`hLr()` writes via `safeStorage.encryptString` (libsecret on Linux).
- `build-reference/app-extracted/.vite/build/index.js:259268`
`J1()` decrypts on read; bails to `{}` if `safeStorage` reports
encryption unavailable (no keyring backend running).
- `build-reference/app-extracted/.vite/build/index.js:70782`
`LocalSessionEnvironment.save` IPC entry that calls into `hLr`.
## S22 — Computer-use toggle is absent or visibly disabled on Linux
**Severity:** Should
**Surface:** Settings → Desktop app → General
**Applies to:** All rows
**Issues:**
**Steps:**
1. Open Settings → Desktop app → General.
2. Look for the "Computer use" toggle.
**Expected:** Toggle either does not render on Linux, or renders as a disabled control with a clear "not supported on Linux" hint. Must not appear functional and silently fail (e.g. flip on but never produce screen-control behavior).
**Diagnostics on failure:** Screenshot of the Settings page, DevTools inspection of the toggle DOM (is it conditionally hidden? disabled? always-rendered?), launcher log.
**References:** [Let Claude use your computer](https://code.claude.com/docs/en/desktop#let-claude-use-your-computer), [Dispatch and computer use](https://claude.com/blog/dispatch-and-computer-use)
**Code anchors:**
- `build-reference/app-extracted/.vite/build/index.js:240557`
`qDA = new Set(["darwin", "win32"])` excludes Linux from the
computer-use platform set.
- `build-reference/app-extracted/.vite/build/index.js:241190`
`TF()` (the master enable check) short-circuits to `false` when
`qDA.has(process.platform)` is false, so toggling
`chicagoEnabled` on Linux can't activate the feature.
- `build-reference/app-extracted/.vite/build/index.js:242387`
`tvr()` returns `{ status: "unsupported", reason: "Computer use
is not available on this platform", unsupportedCode:
"unsupported_platform" }` for the Settings UI — confirms the
toggle should render with a platform-unavailable hint, not silent
failure.
## S23 — Dispatch-spawned sessions don't soft-lock on a never-approvable computer-use prompt
**Severity:** Critical (for Dispatch users)
**Surface:** Dispatch session lifecycle on Linux
**Applies to:** All rows with Dispatch enabled
**Issues:**
**Steps:**
1. From a paired phone, dispatch a task that would invoke computer use.
2. Observe the Code-tab session that spawns on the desktop.
3. Try to interact with other parts of the app.
**Expected:** Permission prompt times out or denies cleanly rather than hanging the session indefinitely. User can continue interacting with the rest of the app.
**Diagnostics on failure:** Screenshot of session state, launcher log, sidebar state (is the Dispatch session blocking the whole sidebar?), `pgrep -af claude`.
**References:** [Sessions from Dispatch](https://code.claude.com/docs/en/desktop#sessions-from-dispatch)
**Code anchors:**
- `build-reference/app-extracted/.vite/build/index.js:512789`
`tool_permission_request` notification handler explicitly skips
`toolName.startsWith("computer:")`, so the desktop never queues a
user-facing prompt for computer-use tool calls (which couldn't run
on Linux anyway — see S22).
- `build-reference/app-extracted/.vite/build/index.js:241190`
`TF()` gates computer-use execution off entirely on Linux, so a
Dispatch-spawned session that requests it should hit the upstream
"Set up computer use" remote-client setup card
(`index.js:330114`) rather than block on a desktop prompt.
## S24 — Dispatch-spawned Code session appears with badge and notification
**Severity:** Critical
**Surface:** Dispatch handoff
**Applies to:** All rows with Dispatch enabled
**Issues:**
**Steps:**
1. From a paired phone, dispatch a task that routes to Code (e.g. "fix this bug").
2. Observe the desktop sidebar.
3. Confirm a desktop notification fires.
4. Open the session and confirm 30-min approval expiry per upstream docs.
**Expected:** Dispatch task creates a sidebar entry tagged **Dispatch**, posts a desktop notification, and lands ready for review. App-permission approvals on this session expire after 30 minutes per upstream docs.
**Diagnostics on failure:** Screenshot of sidebar (badge present?), notification daemon state, launcher log, the Dispatch pairing config under `~/.config/Claude/`.
**References:** [Sessions from Dispatch](https://code.claude.com/docs/en/desktop#sessions-from-dispatch), [Dispatch and computer use](https://claude.com/blog/dispatch-and-computer-use)
**Code anchors:**
- `build-reference/app-extracted/.vite/build/index.js:144561`
`Sd = "dispatch_child"` session-type constant.
- `build-reference/app-extracted/.vite/build/index.js:512200`
`onRemoteSessionStart` IPC routes a Dispatch-initiated child
session into the local sidebar via `dispatchOnRemoteSessionStart`.
- `build-reference/app-extracted/.vite/build/index.js:285621`
`notifyDispatchParentIfNeeded()` posts the
`Task "<title>" <state>` meta-notification when the dispatch
child finishes (lands the result in the parent thread's
notification queue).
- `build-reference/app-extracted/.vite/build/index.js:285954`
`kind:"dispatch_child"` is the sidebar badge tag.
## S25 — Mobile pairing survives Linux session restart
**Severity:** Should
**Surface:** Dispatch pairing persistence
**Applies to:** All rows with Dispatch enabled
**Issues:**
**Steps:**
1. Pair the desktop with a phone.
2. Quit the app fully. Re-launch.
3. Try a Dispatch task. Verify pairing still works without re-pairing.
4. Log out of the desktop session and back in. Re-test.
**Expected:** Pairing remains active across app restart and logout/login. Pairing token is stored under `~/.config/Claude/` (or wherever the secure store lives) and survives.
**Diagnostics on failure:** `ls -la ~/.config/Claude/`, secret-store inspection, launcher log, pairing-flow IPC.
**References:** [Sessions from Dispatch](https://code.claude.com/docs/en/desktop#sessions-from-dispatch)
**Code anchors:**
- `build-reference/app-extracted/.vite/build/index.js:511984`
`ZEe = "coworkTrustedDeviceToken"` electron-store key for the
trusted-device token.
- `build-reference/app-extracted/.vite/build/index.js:511989`
`oYn()` writes the token via `safeStorage.encryptString` (libsecret
on Linux); `aYn()` (`:512003`) decrypts on read.
- `build-reference/app-extracted/.vite/build/index.js:512022`
`gYn()` re-enrolls via `POST /api/auth/trusted_devices` only when
there's no cached token, so a successful pair survives restart.
- `build-reference/app-extracted/.vite/build/index.js:330229`
`_5r = "bridge-state.json"` (per-org/account bridge state under
`~/.config/Claude/bridge-state.json`); `JF()`/`X0A()` at `:330230`
read/locate it.


@@ -0,0 +1,125 @@
# Routines & Scheduled Tasks
Tests covering the Routines page, scheduled task firing, catch-up runs after suspend, and the suspend-inhibit toggle. See [`../matrix.md`](../matrix.md) for status.
## T26 — Routines page renders
**Severity:** Critical
**Surface:** Routines page
**Applies to:** All rows
**Issues:**
**Steps:**
1. Sign into the app, open the Code tab.
2. Click **Routines** in the sidebar.
3. Click **New routine** → **Local**.
**Expected:** Routines list opens. New-routine form shows all schedule presets (Manual, Hourly, Daily, Weekdays, Weekly), permission-mode picker, model picker, working-folder picker, and worktree toggle.
**Diagnostics on failure:** Screenshot of the Routines page (or the failure state), DevTools console output, launcher log, network captures of the routines API call (`mitmproxy` or DevTools network panel).
**References:** [Schedule recurring tasks](https://code.claude.com/docs/en/desktop-scheduled-tasks)
**Code anchors:** `build-reference/app-extracted/.vite/build/index.js:507710` (create payload — `permissionMode`, `model`, `userSelectedFolders`, `useWorktree`, `cronExpression`, `fireAt`); `build-reference/app-extracted/.vite/build/index.js:280299` (`@hourly: "0 * * * *"` preset)
**Inventory anchors:** `root.complementary.button-by-name.routines` (sidebar entry); `root.complementary.button-by-name.routines.main.region.button-by-name.new-routine` (form trigger); siblings `…button-by-name.all`, `…button-by-name.calendar` (list-view tabs). Preset list (Hourly/Daily/etc.) lives inside the New-routine modal and is not in the idle-state inventory — re-capture with the modal open to anchor.
## T27 — Scheduled task fires and notifies
**Severity:** Critical
**Surface:** Routines runtime + libnotify
**Applies to:** All rows
**Issues:**
**Steps:**
1. Create a Manual task with a simple instruction (e.g. "echo hello").
2. Click **Run now**. Observe.
3. Optionally: create an Hourly task and verify across the next hour boundary.
**Expected:** A fresh session starts, appears in the **Scheduled** section of the sidebar, and posts a desktop notification when it begins. Subsequent runs respect the deterministic offset described in upstream docs.
**Diagnostics on failure:** Launcher log, screenshot of sidebar, `gdbus call --session --dest=org.freedesktop.Notifications --object-path=/org/freedesktop/Notifications --method=org.freedesktop.DBus.Introspectable.Introspect` (verify daemon present), task SKILL.md content under `~/.claude/scheduled-tasks/<task-name>/`.
**References:** [How scheduled tasks run](https://code.claude.com/docs/en/desktop-scheduled-tasks#how-scheduled-tasks-run)
**Code anchors:** `build-reference/app-extracted/.vite/build/index.js:282332` (`runNow(A)` — manual dispatch); `build-reference/app-extracted/.vite/build/index.js:512837` (`Rc.showNotification(...,scheduled-${l},...)` — desktop notification on completion); `build-reference/app-extracted/.vite/build/index.js:282654` (`getJitterSecondsForTask` — deterministic per-task offset via `v2r(A, n*60)`, capped by `dispatchJitterMaxMinutes` default 10)
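**Sketch:** a minimal illustration of what "deterministic per-task offset" means, assuming a simple string hash. The bundled `v2r` helper is minified and its real hashing is not shown here, so the FNV-1a-style hash and the names `stableOffset` and `jitterSecondsForTask` below are hypothetical. Only the shape comes from the anchor: a stable task id maps to a stable offset inside a window of `n * 60` seconds, with `n` defaulting to 10.

```typescript
// Hypothetical stand-in for the bundled v2r(A, n*60): maps a task id to a
// stable offset in [0, windowSeconds). The FNV-1a-style hash over UTF-16 code
// units is an assumption, not upstream's algorithm.
function stableOffset(taskId: string, windowSeconds: number): number {
  let hash = 0x811c9dc5; // FNV-1a 32-bit offset basis
  for (let i = 0; i < taskId.length; i++) {
    hash ^= taskId.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0; // multiply by FNV prime, keep unsigned
  }
  return windowSeconds > 0 ? hash % windowSeconds : 0;
}

// The jitter window is capped by a max-minutes setting (default 10 per the anchor).
function jitterSecondsForTask(taskId: string, maxMinutes: number = 10): number {
  return stableOffset(taskId, maxMinutes * 60);
}
```

A runner checking step 3 could rely on the same property: repeated runs of one task should land at the same offset past the hour, and that offset should stay under ten minutes.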
## T28 — Scheduled task catch-up after suspend
**Severity:** Should
**Surface:** Routines runtime / wake-from-suspend
**Applies to:** All rows
**Issues:**
**Steps:**
1. Create an Hourly task.
2. Suspend the host (`systemctl suspend`).
3. Wait past at least one hourly slot. Wake the host.
4. Observe whether a catch-up run starts.
**Expected:** Exactly one catch-up run for the most recently missed slot (older missed slots are discarded). Notification announces the catch-up. Missed runs older than seven days are not retried.
**Diagnostics on failure:** Task history in the routines detail page, launcher log, `journalctl --since="-1 day" | grep -i suspend`.
**References:** [Missed runs](https://code.claude.com/docs/en/desktop-scheduled-tasks#missed-runs)
**Code anchors:** `build-reference/app-extracted/.vite/build/index.js:281695` (`R2r` — walks back from now, capped at `10080 * 60 * 1e3` ms = 7 days, returns at most one missed slot, dedupes by `IfA` bucket-key); `build-reference/app-extracted/.vite/build/index.js:281942` (`scheduledTaskPostWakeDelayMs` default 60000 ms — gates dispatch after `powerMonitor.on("resume")`); `build-reference/app-extracted/.vite/build/index.js:282569` (catch-up branch: `c ? 0 : this.getJitterSecondsForTask(o.id)` — missed-slot dispatch skips jitter)
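**Sketch:** the "at most one missed slot, capped at seven days" rule, reduced to a fixed-interval schedule. This is an assumption-laden simplification: upstream `R2r` works from the task's schedule rather than a plain interval, and `mostRecentMissedSlot` is a hypothetical name. Slots are assumed to align to multiples of the interval since the epoch.

```typescript
const LOOKBACK_MS = 10080 * 60 * 1e3; // 7 days, the cap quoted in the anchor

// Returns the most recent slot boundary that fell after lastRunMs and at or
// before nowMs, or null when nothing was missed inside the lookback window.
// Older missed slots are never returned, so at most one catch-up is produced.
function mostRecentMissedSlot(
  lastRunMs: number,
  nowMs: number,
  intervalMs: number,
): number | null {
  const latestSlot = Math.floor(nowMs / intervalMs) * intervalMs;
  if (latestSlot <= lastRunMs) return null; // no boundary crossed since the last run
  if (nowMs - latestSlot > LOOKBACK_MS) return null; // too old to retry
  return latestSlot;
}
```

For an Hourly task that last ran at the 10:00 slot and wakes at 13:05, the sketch yields only the 13:00 slot; the 11:00 and 12:00 slots are discarded, which is the behavior step 4 looks for.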
## S19 — `CLAUDE_CONFIG_DIR` redirects scheduled-task storage
**Severity:** Could
**Surface:** Config dir env var
**Applies to:** All rows
**Issues:**
**Steps:**
1. In the local environment editor, set `CLAUDE_CONFIG_DIR=/some/other/path`.
2. Restart the app.
3. Create a scheduled task. Inspect filesystem.
**Expected:** Tasks resolve under `${CLAUDE_CONFIG_DIR}/scheduled-tasks/<task-name>/SKILL.md` rather than `~/.claude/scheduled-tasks/`. Pre-existing tasks under the old path are not silently dropped.
**Diagnostics on failure:** `ls -la ${CLAUDE_CONFIG_DIR}/scheduled-tasks/` and `~/.claude/scheduled-tasks/`, launcher log, env dump.
**References:** [Manage scheduled tasks](https://code.claude.com/docs/en/desktop-scheduled-tasks#manage-scheduled-tasks)
**Code anchors:** `build-reference/app-extracted/.vite/build/index.js:283108` (`cE()` — resolves `process.env.CLAUDE_CONFIG_DIR ?? ~/.claude`, handles `~` prefix); `build-reference/app-extracted/.vite/build/index.js:283118` (`Tce()` — returns `${cE()}/scheduled-tasks`); `build-reference/app-extracted/.vite/build/index.js:488317` and `:509032` (call sites passing `taskFilesDir: Tce()` into the scheduled-tasks substrate)
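**Sketch:** the resolution order described in the anchors, written as pure functions so it can be checked without touching `process.env`. The names `resolveConfigDir` and `scheduledTasksDir` are hypothetical stand-ins for the minified `cE()` and `Tce()`; the exact `~` handling upstream is assumed, not confirmed.

```typescript
// Resolves the config dir from an env map and a home directory.
// Mirrors `process.env.CLAUDE_CONFIG_DIR ?? ~/.claude` with "~" expansion.
function resolveConfigDir(
  env: Record<string, string | undefined>,
  home: string,
): string {
  const raw = env["CLAUDE_CONFIG_DIR"] ?? `${home}/.claude`;
  // Expand a leading "~" or "~/" to the home directory (assumed behavior).
  if (raw === "~") return home;
  if (raw.startsWith("~/")) return `${home}/${raw.slice(2)}`;
  return raw;
}

// Scheduled tasks live in a fixed subdirectory of the resolved config dir.
function scheduledTasksDir(
  env: Record<string, string | undefined>,
  home: string,
): string {
  return `${resolveConfigDir(env, home)}/scheduled-tasks`;
}
```

With `CLAUDE_CONFIG_DIR=/some/other/path` the result is `/some/other/path/scheduled-tasks`, which is the directory step 3 should find populated.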
## S20 — "Keep computer awake" inhibits idle suspend
**Severity:** Should
**Surface:** Suspend inhibitor
**Applies to:** All rows
**Issues:**
**Steps:**
1. Open Settings → Desktop app → General → "Keep computer awake". Toggle ON.
2. Run `systemd-inhibit --list`. Look for a Claude-owned lock whose `WHAT` column reads `idle:sleep`.
3. Toggle OFF. Re-run `systemd-inhibit --list` — lock should be gone.
**Expected:** Toggling ON registers `systemd-inhibit --what=idle:sleep` (or the `org.freedesktop.PowerManagement.Inhibit` DBus call). Toggling OFF releases the lock.
**Diagnostics on failure:** `systemd-inhibit --list` before/after, `busctl --user tree org.freedesktop.PowerManagement` (if the path uses that backend), launcher log, the relevant settings IPC call.
**References:** [How scheduled tasks run](https://code.claude.com/docs/en/desktop-scheduled-tasks#how-scheduled-tasks-run)
**Code anchors:** `build-reference/app-extracted/.vite/build/index.js:241897` (`hA.powerSaveBlocker.start("prevent-app-suspension")` — single block call, ref-counted by `PhA` Set); `build-reference/app-extracted/.vite/build/index.js:241905` (`hA.powerSaveBlocker.stop(BP)` when last claim drops); `build-reference/app-extracted/.vite/build/index.js:241909` (settings binding: `PHe = "keepAwakeEnabled"`); `build-reference/app-extracted/.vite/build/index.js:241914` (`vy.on("keepAwakeEnabled", YHe)` — toggle observer)
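**Sketch:** one way a ref-counted wrapper around `powerSaveBlocker` can behave, matching the "single block call, ref-counted by a Set" description. The `KeepAwake` class and the claim names are hypothetical; the blocker is injected as an interface so the sketch runs without Electron. Only `start(type)` returning an id and `stop(id)` are taken from Electron's documented API.

```typescript
// Minimal shape of Electron's powerSaveBlocker, injected for illustration.
interface Blocker {
  start(type: "prevent-app-suspension" | "prevent-display-sleep"): number;
  stop(id: number): void;
}

class KeepAwake {
  private claims = new Set<string>(); // who currently wants the machine awake
  private blockerId: number | null = null;
  private blocker: Blocker;

  constructor(blocker: Blocker) {
    this.blocker = blocker;
  }

  acquire(claim: string): void {
    this.claims.add(claim);
    // Only the first claim starts the block; later claims just join the set.
    if (this.blockerId === null) {
      this.blockerId = this.blocker.start("prevent-app-suspension");
    }
  }

  release(claim: string): void {
    this.claims.delete(claim);
    // Stop only when the last claim drops.
    if (this.claims.size === 0 && this.blockerId !== null) {
      this.blocker.stop(this.blockerId);
      this.blockerId = null;
    }
  }
}
```

Under this model, toggling the setting OFF releases the lock only if no other claim is outstanding, which is worth keeping in mind when step 3 still shows a lock.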
## S21 — Lid-close still suspends per OS policy
**Severity:** Critical
**Surface:** Suspend inhibitor scope
**Applies to:** All rows (laptop hosts)
**Issues:**
**Steps:**
1. With "Keep computer awake" ON, close the laptop lid.
2. Observe whether the machine suspends.
**Expected:** Machine still suspends per logind's `HandleLidSwitch=suspend`. The inhibit lock taken in [S20](#s20--keep-computer-awake-inhibits-idle-suspend) targets `idle:sleep`, not `handle-lid-switch`, so lid-close behavior is unaffected.
**Diagnostics on failure:** `loginctl show-session --property=HandleLidSwitch`, `journalctl --since="-5 minutes"`, the actual `--what=` flags on the Claude-owned inhibitor.
**References:** [How scheduled tasks run](https://code.claude.com/docs/en/desktop-scheduled-tasks#how-scheduled-tasks-run)
**Code anchors:** `build-reference/app-extracted/.vite/build/index.js:241897` (only `"prevent-app-suspension"` is passed to `powerSaveBlocker.start` — Electron maps this to `idle:sleep`); no `handle-lid-switch` / `HandleLidSwitch` token anywhere in `index.js` (verified via `grep -nE 'lid|HandleLidSwitch|handle-lid' index.js`)

@@ -0,0 +1,365 @@
# Shortcuts & Input
Tests covering URL handling, the Quick Entry global shortcut, and DE-specific shortcut/input failure modes. See [`../matrix.md`](../matrix.md) for status.
## T05 — `claude://` URL handler opens links in-app
**Severity:** Smoke
**Surface:** URL handler / xdg-open
**Applies to:** All rows
**Issues:**
**Steps:**
1. With Claude Desktop running, in another app run `xdg-open 'claude://chat/new?q=hello'` (or click a `claude://` link in a browser/terminal).
2. Observe.
**Expected:** Link is delivered to the running Claude Desktop process — no new browser tab, no crash, no error dialog. (Upstream's `claudeURLHandler` only accepts the `claude:`, `claude-dev:`, `claude-nest:`, `claude-nest-dev:`, `claude-nest-prod:` schemes; bare `https://claude.ai/...` clicks route through the user's default browser, not Claude Desktop. The `.desktop` file registers `MimeType=x-scheme-handler/claude` only, matching the upstream contract.)
**Diagnostics on failure:** `xdg-mime query default x-scheme-handler/claude`, the registered `.desktop` file content, launcher log, app crash report (if any), `coredumpctl list claude-desktop` (if subprocess died — see [S06](#s06--url-handler-doesnt-segfault-on-native-wayland)).
**References:** upstream `index.js:495996-496009` (`bEe()` protocol filter), `index.js:524819` (`setAsDefaultProtocolClient("claude")`), `index.js:525140-525148` (macOS `open-url`), `index.js:525162-525172` (Linux/Win `second-instance` argv path), project `scripts/packaging/{deb,rpm,appimage}.sh` (MimeType registration).
**Code anchors:** build-reference/app-extracted/.vite/build/index.js:495996, 524819, 525140, 525162
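**Sketch:** the scheme filter from the Expected text, as a standalone check. The function name `isAcceptedClaudeUrl` is hypothetical and the regex-based scheme extraction is an assumption; upstream's `bEe()` may parse URLs differently. The scheme list itself is the one quoted above.

```typescript
// Schemes listed for claudeURLHandler in the Expected text.
const ACCEPTED_SCHEMES = new Set([
  "claude:",
  "claude-dev:",
  "claude-nest:",
  "claude-nest-dev:",
  "claude-nest-prod:",
]);

// Returns true when the raw URL's scheme is one the handler accepts.
function isAcceptedClaudeUrl(raw: string): boolean {
  const match = /^([a-z][a-z0-9+.-]*:)/i.exec(raw.trim());
  return match !== null && ACCEPTED_SCHEMES.has(match[1].toLowerCase());
}
```

A `claude://chat/new?q=hello` link passes the filter, while a bare `https://claude.ai/...` link does not, consistent with it being routed to the default browser.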
## T06 — Quick Entry global shortcut (unfocused)
**Severity:** Critical
**Surface:** Global shortcut / Electron globalShortcut
**Applies to:** All rows
**Issues:** [#404](https://github.com/aaddrick/claude-desktop-debian/issues/404), [#393](https://github.com/aaddrick/claude-desktop-debian/issues/393), [PR #406](https://github.com/aaddrick/claude-desktop-debian/pull/406), [PR #102](https://github.com/aaddrick/claude-desktop-debian/pull/102), [PR #153](https://github.com/aaddrick/claude-desktop-debian/pull/153)
**Steps:**
1. Launch app, focus another application (browser, terminal).
2. Press the configured Quick Entry shortcut (default `Ctrl+Alt+Space`).
3. Type a prompt and submit.
4. Repeat from a different virtual desktop / workspace.
**Expected:** Quick Entry prompt opens regardless of focused app or workspace. Shortcut is globally registered, not focus-bound. Submitting creates a new session and shows it in the main window.
**Diagnostics on failure:** Launcher log (look for `Using X11 backend via XWayland (for global hotkey support)` or portal-shortcut markers), `XDG_SESSION_TYPE`, `XDG_CURRENT_DESKTOP`, output of `gdbus call --session --dest=org.freedesktop.portal.Desktop --object-path=/org/freedesktop/portal/desktop --method=org.freedesktop.DBus.Introspectable.Introspect`, the active patch set in `scripts/patches/`.
**References:** [#404](https://github.com/aaddrick/claude-desktop-debian/issues/404), [#393](https://github.com/aaddrick/claude-desktop-debian/issues/393), [PR #406](https://github.com/aaddrick/claude-desktop-debian/pull/406)
**Code anchors:** build-reference/app-extracted/.vite/build/index.js:499376 (`ort` default accelerator: `"Ctrl+Alt+Space"` non-mac, `"Alt+Space"` on mac), 499416 (`globalShortcut.register`), 525287-525290 (Quick Entry trigger callback registered against `Pw.QUICK_ENTRY`).
## S06 — URL handler doesn't segfault on native Wayland
**Severity:** Critical (for wlroots rows)
**Surface:** URL handler subprocess
**Applies to:** Sway, Niri, Hypr-O, Hypr-N (any native-Wayland session)
**Issues:**
**Steps:**
1. Launch the app on a native Wayland session (no XWayland forcing).
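2. From another app, click a `claude://` link or run `xdg-open 'claude://chat/new?q=hello'` (see T05: bare `https://claude.ai/...` links go to the default browser and never reach the handler).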
**Expected:** Link opens in-app cleanly. No `Failed to connect to Wayland display` errors followed by a SIGSEGV from the URL handler subprocess.
**Diagnostics on failure:** `coredumpctl info claude-desktop`, `WAYLAND_DISPLAY` env in the subprocess (if capturable via `strace -f -e execve`), launcher log, full env dump.
**Currently:** Sway capture shows `Failed to connect to Wayland display: No such file or directory (2)` followed by `Segmentation fault` from the URL handler subprocess. The main app process keeps running; the URL handler dies. Not yet filed.
**References:**
**Code anchors:** build-reference/app-extracted/.vite/build/index.js:495996 (`bEe()` URL handler), 525140-525148 (`open-url` macOS), 525162-525172 (`second-instance` argv path on Linux); project `scripts/launcher-common.sh:96-99` (`--ozone-platform=x11` default), `scripts/launcher-common.sh:41-44` (Niri force-native-Wayland).
## S07 — `CLAUDE_USE_WAYLAND=1` opt-in path works without crashing
**Severity:** Should
**Surface:** Native Wayland mode
**Applies to:** Sway, Niri, Hypr-O, Hypr-N
**Issues:** [PR #228](https://github.com/aaddrick/claude-desktop-debian/pull/228), [PR #232](https://github.com/aaddrick/claude-desktop-debian/pull/232)
**Steps:**
1. Set `CLAUDE_USE_WAYLAND=1`. Launch the app.
2. Use the app for ~5 minutes — open chats, switch tabs, exercise basic flows.
**Expected:** App runs on native Wayland (no XWayland) and continues to render and respond. Paths that were previously broken (see PR #228) keep working.
**Diagnostics on failure:** Launcher log (confirm Wayland mode active), `--doctor`, full env dump, screenshot of any crash dialog.
**References:** [PR #228](https://github.com/aaddrick/claude-desktop-debian/pull/228), [PR #232](https://github.com/aaddrick/claude-desktop-debian/pull/232)
**Code anchors:** project `scripts/launcher-common.sh:28-29` (`CLAUDE_USE_WAYLAND=1` opt-out of XWayland), 100-111 (native-Wayland Electron flags: `UseOzonePlatform,WaylandWindowDecorations`, `--ozone-platform=wayland`, `--enable-wayland-ime`, `--wayland-text-input-version=3`, `GDK_BACKEND=wayland`).
## S09 — Quick window patch runs only on KDE (post-#406 gate)
**Severity:** Critical
**Surface:** Patch gate
**Applies to:** All rows (verifies the gate, not the feature)
**Issues:** [PR #406](https://github.com/aaddrick/claude-desktop-debian/pull/406), [#393](https://github.com/aaddrick/claude-desktop-debian/issues/393)
**Steps:**
1. On a KDE row, launch the app. Inspect launcher log for quick-window-patch markers.
2. On a non-KDE row, launch the app. Inspect launcher log — the markers should be absent.
**Expected:** On KDE sessions the quick-window patch is applied (Quick Entry uses the patched code path). On non-KDE sessions the patch is **not** applied, preventing the [#393](https://github.com/aaddrick/claude-desktop-debian/issues/393) regression on GNOME etc.
**Diagnostics on failure:** Launcher log, `XDG_CURRENT_DESKTOP`, the patch-gate code path in `scripts/patches/`.
**References:** [PR #406](https://github.com/aaddrick/claude-desktop-debian/pull/406), [#393](https://github.com/aaddrick/claude-desktop-debian/issues/393)
**Code anchors:** project `scripts/patches/quick-window.sh:32-42` (KDE-gated `blur()` insertion), 115-125 (KDE-gated focus/visibility check replacement); upstream sites the patch rewrites are around `index.js:515374-515471` (Quick Entry popup construction + handlers).
## S10 — Quick Entry popup is transparent (no opaque square frame)
**Severity:** Should
**Surface:** Quick Entry window (KDE Wayland)
**Applies to:** KDE-W
**Issues:** [#370](https://github.com/aaddrick/claude-desktop-debian/issues/370), [#223](https://github.com/aaddrick/claude-desktop-debian/issues/223), [PR #244](https://github.com/aaddrick/claude-desktop-debian/pull/244)
**Steps:**
1. On KDE Plasma Wayland, invoke Quick Entry.
2. Observe the popup background.
**Expected:** Quick Entry popup renders with a transparent background — no opaque square frame visible behind the rounded prompt UI.
**Diagnostics on failure:** Screenshot, KDE compositor settings (`kreadconfig5 --file kwinrc --group Compositing --key Backend`; `kreadconfig6` on Plasma 6), launcher log, BrowserWindow construction args.
**References:** [#370](https://github.com/aaddrick/claude-desktop-debian/issues/370) (current open report), [#223](https://github.com/aaddrick/claude-desktop-debian/issues/223) (closed predecessor), [PR #244](https://github.com/aaddrick/claude-desktop-debian/pull/244)
**Code anchors:** build-reference/app-extracted/.vite/build/index.js:515380 (`transparent: !0`), 515383 (`backgroundColor: "#00000000"`), 515381 (`frame: !1`), 515377 (`skipTaskbar: !0`).
## S11 — Quick Entry shortcut fires from any focus on Wayland (mutter XWayland key-grab)
**Severity:** Critical (for GNOME users)
**Surface:** Global shortcut on GNOME mutter
**Applies to:** GNOME, Ubu
**Issues:** [#404](https://github.com/aaddrick/claude-desktop-debian/issues/404), [PR #406](https://github.com/aaddrick/claude-desktop-debian/pull/406)
**Steps:**
1. On GNOME/mutter Wayland, launch the app.
2. Focus another application; press the Quick Entry shortcut.
3. Repeat from another virtual desktop.
**Expected:** Shortcut fires regardless of focused app or workspace.
**Diagnostics on failure:** Launcher log (note `Using X11 backend via XWayland (for global hotkey support)`), `XDG_CURRENT_DESKTOP`, mutter version (`gnome-shell --version`), the active patch set.
**Currently:** Fedora 43 GNOME Wayland reproduces [#404](https://github.com/aaddrick/claude-desktop-debian/issues/404) — mutter doesn't honour the XWayland-side key grab, so the shortcut is focus-bound. On Ubuntu 24.04 GNOME, the [PR #406](https://github.com/aaddrick/claude-desktop-debian/pull/406) KDE-only gate prevents the regressing patch from running, leaving the older (working) code path active — hence `🔧` on Ubu. The unsolved fix path is [S12](#s12----enable-featuresglobalshortcutsportal-launcher-flag-wired-up-for-gnome-wayland).
**References:** [#404](https://github.com/aaddrick/claude-desktop-debian/issues/404), [PR #406](https://github.com/aaddrick/claude-desktop-debian/pull/406)
**Code anchors:** project `scripts/launcher-common.sh:96-99` (XWayland-default `--ozone-platform=x11`); upstream `index.js:499416` (`globalShortcut.register`).
## S12 — `--enable-features=GlobalShortcutsPortal` launcher flag wired up for GNOME Wayland
**Severity:** Critical
**Surface:** Launcher flag wiring
**Applies to:** GNOME, Ubu (any GNOME Wayland)
**Issues:** [#404](https://github.com/aaddrick/claude-desktop-debian/issues/404)
**Steps:**
1. On GNOME Wayland, launch the app.
2. Inspect the Electron command line via `pgrep -af claude-desktop` — look for `--enable-features=GlobalShortcutsPortal`.
3. Test Quick Entry shortcut from unfocused state (see [T06](#t06--quick-entry-global-shortcut-unfocused)).
**Expected:** Launcher detects GNOME Wayland and appends `--enable-features=GlobalShortcutsPortal` to Electron's argv, routing global shortcuts through XDG Desktop Portal instead of X11 key grabs. Once wired, [#404](https://github.com/aaddrick/claude-desktop-debian/issues/404) is closeable.
**Diagnostics on failure:** Full process argv (`tr '\0' ' ' < /proc/$(pgrep -o -f electron)/cmdline`; `-o` picks the oldest match so the path resolves to a single PID), launcher log, `XDG_CURRENT_DESKTOP`.
**Currently:** Not yet implemented. Tracking under [#404](https://github.com/aaddrick/claude-desktop-debian/issues/404).
> **⚠ Missing in build 1.5354.0** — `--enable-features=GlobalShortcutsPortal` is not appended by `scripts/launcher-common.sh` for any GNOME Wayland variant. Re-verify after next upstream bump and after #404 lands.
**References:** [#404](https://github.com/aaddrick/claude-desktop-debian/issues/404)
**Code anchors:** project `scripts/launcher-common.sh:59-112` (`build_electron_args` — no `GlobalShortcutsPortal` branch present).
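**Sketch:** how a runner might check step 2 from a captured `/proc/<pid>/cmdline` payload. The helper name `hasEnabledFeature` is hypothetical, and the sketch treats any `--enable-features=` argument as relevant; it does not model how Chromium resolves repeated switches.

```typescript
// Splits a /proc/<pid>/cmdline payload (NUL-separated argv) and reports
// whether any --enable-features= argument lists the given feature.
function hasEnabledFeature(cmdline: string, feature: string): boolean {
  const prefix = "--enable-features=";
  return cmdline
    .split("\0")
    .filter((arg) => arg.startsWith(prefix))
    .some((arg) => arg.slice(prefix.length).split(",").includes(feature));
}
```

Once the launcher flag is wired up, `hasEnabledFeature(cmdline, "GlobalShortcutsPortal")` would be expected to flip from false to true on GNOME Wayland rows.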
## S14 — Global shortcuts via XDG portal work on Niri
**Severity:** Critical (for Niri users)
**Surface:** XDG Desktop Portal `BindShortcuts`
**Applies to:** Niri
**Issues:**
**Steps:**
1. On Niri, launch the app (the launcher special-cases Niri to native Wayland + portal).
2. Configure the Quick Entry shortcut.
3. Observe portal interaction in launcher log.
**Expected:** `BindShortcuts` succeeds. Configured Quick Entry shortcut is registered and fires.
**Diagnostics on failure:** Launcher log capture of the `BindShortcuts` call, `busctl --user tree org.freedesktop.portal.Desktop`, Niri version, full env.
**Currently:** `Failed to call BindShortcuts (error code 5)` — portal global shortcuts fail on Niri. Different root cause from [#404](https://github.com/aaddrick/claude-desktop-debian/issues/404), same user-visible symptom (Quick Entry shortcut doesn't fire). Not yet filed.
**References:**
**Code anchors:** project `scripts/launcher-common.sh:41-44` (Niri force-native-Wayland branch); upstream `index.js:499416` (`globalShortcut.register`, which on native Wayland routes through Electron's `xdg-desktop-portal` `BindShortcuts` path inside Chromium).
## S29 — Quick Entry popup is created lazily on first shortcut press (closed-to-tray sanity)
**Severity:** Critical
**Surface:** Quick Entry popup lifecycle
**Applies to:** All rows
**Issues:** [#393](https://github.com/aaddrick/claude-desktop-debian/issues/393)
**Steps:**
1. Launch app, wait for main window to appear, hide-to-tray (close via X — see [T08](./tray-and-window-chrome.md#t08--hide-to-tray-on-close)).
2. Confirm no Claude window is mapped (e.g. `wmctrl -l | grep -i claude` returns empty on X11; use `swaymsg -t get_tree` on Sway or the compositor's equivalent on other Wayland sessions).
3. Press the Quick Entry shortcut.
4. Type `hello`, press Enter.
**Expected:** Popup appears even though no Claude window was mapped before the keypress. Upstream constructs the popup `BrowserWindow` lazily on first shortcut invocation (`if (!Ko || ...) Ko = new BrowserWindow(...)` near `index.js:515375`), so the popup does not need a pre-existing main window. New chat session is created and reachable on submit.
**Diagnostics on failure:** Launcher log, `~/.config/Claude/logs/`, `XDG_CURRENT_DESKTOP`, screenshot of empty desktop after shortcut press.
**References:** [#393](https://github.com/aaddrick/claude-desktop-debian/issues/393), upstream `index.js:515375-515397`
**Code anchors:** build-reference/app-extracted/.vite/build/index.js:515374 (`if (!Ko ...) Ko = new BrowserWindow(...)` lazy construction guard), 515394 (`preload: ".vite/build/quickWindow.js"`), 515438 (`Ko.loadFile(".vite/renderer/quick_window/quick-window.html")`).
## S30 — Quick Entry shortcut becomes a no-op after full app exit
**Severity:** Should
**Surface:** Global shortcut unregistration
**Applies to:** All rows
**Issues:**
**Steps:**
1. Launch app. Confirm Quick Entry shortcut works (popup opens).
2. Quit Claude Desktop fully via tray → Quit (or `pkill -f app.asar`). Confirm no `electron` processes for the app remain.
3. Press the Quick Entry shortcut.
**Expected:** No popup appears. No error dialog. No zombie process. Electron unregisters the global shortcut on app exit; the shortcut becomes a system-level no-op.
**Diagnostics on failure:** `pgrep -af app.asar` output, `journalctl --user -e -n 100`, OS-level shortcut bindings (`gsettings list-recursively | grep -i shortcut`).
**References:** upstream `index.js:499416` (registration site)
**Code anchors:** build-reference/app-extracted/.vite/build/index.js:499398-499428 (`nG()` register/unregister wrapper — passing `null` accelerator unregisters), 499416 (`hA.globalShortcut.register`), 499403 (`hA.globalShortcut.unregister`).
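**Sketch:** a register/unregister wrapper in the spirit of the `nG()` description, where a `null` accelerator only unregisters. The `makeBinder` name and its exact control flow are hypothetical; the injected interface mirrors the documented `globalShortcut.register(accelerator, callback)` and `globalShortcut.unregister(accelerator)` signatures so the sketch runs without Electron.

```typescript
// Minimal shape of Electron's globalShortcut, injected for illustration.
interface Shortcuts {
  register(accelerator: string, callback: () => void): boolean;
  unregister(accelerator: string): void;
}

// Rebinding replaces the previous accelerator; a null accelerator only
// unregisters whatever is currently bound.
function makeBinder(shortcuts: Shortcuts, callback: () => void) {
  let current: string | null = null;
  return (accelerator: string | null): boolean => {
    if (current !== null) {
      shortcuts.unregister(current);
      current = null;
    }
    if (accelerator === null) return true;
    const ok = shortcuts.register(accelerator, callback);
    if (ok) current = accelerator;
    return ok;
  };
}
```

This covers only in-process rebinding. The no-op after full exit in step 3 comes from the process going away, not from any wrapper logic.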
## S31 — Quick Entry submit makes the new chat reachable from any main-window state
**Severity:** Critical
**Surface:** Submit → main window show
**Applies to:** All rows
**Issues:** [#393](https://github.com/aaddrick/claude-desktop-debian/issues/393), [PR #406](https://github.com/aaddrick/claude-desktop-debian/pull/406)
**Steps:**
1. For each main-window state: (a) visible-and-focused, (b) minimized, (c) hidden-to-tray, (d) on a different workspace, (e) closed via X (project's hide-to-tray override).
2. Set the state, then invoke Quick Entry, type `hello`, submit.
3. Record what happens to the main window: auto-restored, requires tray click, came to current workspace, stayed on its own workspace.
**Expected:** The new chat session is **reachable** from each starting state. Acceptance is "user can reach the new chat" — not "main window auto-restored." Upstream calls `mainWin.show()` + `mainWin.focus()` only (`index.js:515566, 515599`), with no `restore()`, no `setVisibleOnAllWorkspaces()`, no `moveTop()`. Whether `show()` un-minimizes or migrates workspaces is purely compositor-dependent. The failure case is "new chat created but the user has no way to surface it" — that's a regression. Anything that reaches the chat (even via a tray click) is upstream-acceptable.
**Diagnostics on failure:** `~/.config/Claude/logs/`, screenshot at each state, output of `wmctrl -l` (X11) or `swaymsg -t get_tree` (sway), launcher log.
**Currently:** On non-KDE rows, the post-#406 KDE-only patch gate leaves the upstream code path (`isFocused()` short-circuit) active. Andrej730's #393 GNOME repro shows the stale-`isFocused()` bug can still suppress `show()` in tray-only state. See [S32](#s32--quick-entry-submit-on-gnome-mutter-doesnt-trip-electron-stale-isfocused).
**References:** [#393](https://github.com/aaddrick/claude-desktop-debian/issues/393), upstream `index.js:515566, 515599, 105164-105171`
**Code anchors:** build-reference/app-extracted/.vite/build/index.js:515567 (`h1() || ut.show(), ut.focus()` in `gHn()` existing-chat path), 515598-515599 (`h1() || ut.show(), ut.focus()` in `ynt()` new-chat path), 105164-105171 (`h1()` returns `ut.isFocused() || mainView.webContents.isFocused()`).
## S32 — Quick Entry submit on GNOME mutter doesn't trip Electron stale-`isFocused()`
**Severity:** Critical (for GNOME users)
**Surface:** Electron `BrowserWindow.isFocused()` on Linux
**Applies to:** GNOME, Ubu
**Issues:** [#393](https://github.com/aaddrick/claude-desktop-debian/issues/393)
**Steps:**
1. On GNOME Wayland, launch the app, then close to tray.
2. Confirm the app is in tray-only state (no window mapped, no Dash entry, no taskbar entry).
3. Invoke Quick Entry, type `hello`, submit.
4. Repeat after re-pinning the app to the Dash and reproducing the tray-only state from there.
**Expected:** Submit produces a reachable new chat session in both Dash-pinned and not-pinned cases. **The Dash distinction is empirical, not code-driven** — upstream has no notion of Dash presence. The underlying failure mode is Electron's `BrowserWindow.isFocused()` returning stale-true on Linux mutter, which causes upstream's `h1() || ut.show()` short-circuit (`index.js:515566`) to skip `show()`. Andrej730 traced this on #393.
**Diagnostics on failure:** Bundled `index.js` h1() body (extract via `npx asar extract`); add temporary logging in `h1()` per Andrej730's diff in #393 if reproducing locally; `gnome-shell --version`; `~/.config/Claude/logs/`.
**Currently:** Open. The KDE-only gate from PR #406 leaves this path unfixed on GNOME. Resolution requires either (a) widening the patch to all DEs by dropping the `isFocused()` fallback in the patched code, or (b) waiting for an upstream Electron fix to `isFocused()` on Linux.
**References:** [#393](https://github.com/aaddrick/claude-desktop-debian/issues/393) (Andrej730's diagnosis with `eU()` logging output)
**Code anchors:** build-reference/app-extracted/.vite/build/index.js:105164-105171 (`h1()` body — the exact short-circuit Andrej730 instrumented), 515567 + 515598 (the two `h1() || ut.show()` call sites the suppression hits).
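**Sketch:** why a stale-true focus check suppresses `show()`. The expression `h1() || ut.show(), ut.focus()` parses as `(h1() || ut.show()), ut.focus()`, so `focus()` always runs but `show()` runs only when the focus check is false. The `WindowLike` interface and `surface` function are hypothetical, and the sketch folds `h1()` down to a single `isFocused()` call.

```typescript
interface WindowLike {
  isFocused(): boolean;
  show(): void;
  focus(): void;
}

// Same control flow as `h1() || ut.show(), ut.focus()`:
// show() is skipped whenever the focus check reports true.
function surface(win: WindowLike): void {
  if (!win.isFocused()) win.show();
  win.focus();
}
```

If `isFocused()` wrongly reports true while the window is hidden to tray, the window receives `focus()` without ever being shown, which matches the symptom described in #393.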
## S33 — Quick Entry transparent rendering tracked against bundled Electron version
**Severity:** Should
**Surface:** Bundled Electron version
**Applies to:** All rows (relevant where #370 reproduces)
**Issues:** [#370](https://github.com/aaddrick/claude-desktop-debian/issues/370)
**Steps:**
1. After install, capture the Electron version bundled with the app: extract `app.asar.unpacked` and run the bundled Electron with `--version`, or read it from the bundled binary's metadata.
2. Record the version in [`../matrix.md`](../matrix.md) per row, alongside the [S10](#s10--quick-entry-popup-is-transparent-no-opaque-square-frame) status.
**Expected:** Captured version is recorded. If the version is **41.0.4 through 41.x.y** and S10 fails, the upstream electron/electron#50213 regression hypothesis (per @noctuum's bisect on #370) holds and the issue is blocked on upstream. If the version is **41.0.3 or earlier** and S10 fails, the bisect is wrong — investigate. If the version is **a later release that includes a CSD-rendering fix** and S10 still fails, the upstream-regression hypothesis is also wrong.
**Diagnostics on failure:** Output of the version capture command, link to electron/electron#50213, the BrowserWindow construction args from the bundled `index.js`.
**Currently:** Per @noctuum's bisect, 41.0.4 introduced the regression. No upstream fix shipped as of last check.
**References:** [#370](https://github.com/aaddrick/claude-desktop-debian/issues/370), upstream `index.js:515380, 515383` (already sets `transparent: true` and `backgroundColor: "#00000000"`)
**Code anchors:** build-reference/app-extracted/.vite/build/index.js:515380 (`transparent: !0`), 515383 (`backgroundColor: "#00000000"`), 515374-515397 (popup `BrowserWindow` construction args block, including `frame: !1`, `hasShadow: Zr`, `type: Zr ? "panel" : void 0`).
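**Sketch:** the version triage from the Expected text as a small classifier. The function name and verdict labels are hypothetical. It assumes a well-formed `major.minor.patch` string, optionally prefixed with `v`. It cannot tell whether a later release contains a CSD-rendering fix; that still needs the upstream release notes.

```typescript
type Verdict = "predates-suspect-range" | "in-suspect-range" | "later-major";

// Classifies a bundled Electron version against the 41.0.4 regression
// hypothesis from the bisect on #370.
function classifyElectron(version: string): Verdict {
  const parts = version
    .replace(/^v/, "")
    .split(".")
    .map((p) => parseInt(p, 10));
  const major = parts[0] ?? 0;
  const minor = parts[1] ?? 0;
  const patch = parts[2] ?? 0;
  if (major < 41) return "predates-suspect-range";
  if (major > 41) return "later-major";
  // Within 41.x, only 41.0.0 through 41.0.3 predate the bisected regression.
  return minor === 0 && patch < 4 ? "predates-suspect-range" : "in-suspect-range";
}
```

Paired with the S10 result, `in-suspect-range` plus a failing S10 supports the upstream-regression hypothesis, while `predates-suspect-range` plus a failing S10 suggests the bisect needs another look.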
## S34 — Quick Entry shortcut focuses fullscreen main window instead of showing popup
**Severity:** Should
**Surface:** Shortcut behavior on fullscreen main
**Applies to:** All rows
**Issues:**
**Steps:**
1. Launch app. Put the main window into native fullscreen (F11 or platform equivalent).
2. Press the Quick Entry shortcut.
**Expected:** Popup does **not** appear. Main window receives focus and `ide()` runs (upstream behavior at `index.js:525287-525290`). This is intentional upstream UX — assumes the user wants to interact with the existing fullscreen Claude rather than overlay a popup on it.
**Diagnostics on failure:** Screenshot, launcher log, confirm fullscreen state via `wmctrl -l -G` / Wayland equivalent.
**References:** upstream `index.js:525287-525290`
**Code anchors:** build-reference/app-extracted/.vite/build/index.js:525287-525290 (Quick Entry callback: `ut && !ut.isDestroyed() && ut.isFullScreen() ? (ut.focus(), ide()) : Yri()`), 515234-515241 (`ide()`: `show()` + `focus()` + `webContents.send(TEe.cmdK)` for the cmd-K dispatch).
## S35 — Quick Entry popup position is persisted across invocations and across app restarts
**Severity:** Should
**Surface:** Popup placement memory
**Applies to:** All rows
**Issues:**
**Steps:**
1. Launch app. Invoke Quick Entry. Note the popup position (record monitor + coordinates if possible — e.g. `xdotool getactivewindow getwindowgeometry` on X11).
2. Dismiss (Esc). Re-invoke. Position should be unchanged across this dismiss/re-invoke cycle.
3. Quit Claude Desktop fully (`pkill -f app.asar`). Re-launch. Invoke Quick Entry.
4. Confirm position matches the pre-restart capture.
**Expected:** Popup reappears at the same monitor + position before and after a full app restart. Upstream stores the position under the `quickWindowPosition` key, written with `an.set(...)` when the popup hides and read back with `an.get("quickWindowPosition")` (`index.js:515491-515526`), keyed on monitor label + resolution.
**Diagnostics on failure:** Captured coordinates pre/post-restart, content of any persisted settings file (project's settings storage location varies by OS).
**References:** upstream `index.js:515491-515526`
**Code anchors:** build-reference/app-extracted/.vite/build/index.js:515444-515461 (`Ko.on("hide", …)` persists `quickWindowPosition` via `an.set(...)`), 515491-515521 (`aHn()` resolves saved monitor by `label + bounds.width + bounds.height`, falling back to label-only or proportional placement), 515489 (`Ko.setPosition(...)` after show).
## S36 — Quick Entry popup falls back to primary display when saved monitor is gone
**Severity:** Smoke
**Surface:** Multi-monitor placement
**Applies to:** All rows with a multi-monitor capable host
**Issues:**
**Steps:**
1. **Multi-monitor required.** With an external monitor connected, invoke Quick Entry on the external monitor. Trigger position persistence (per [S35](#s35--quick-entry-popup-position-is-persisted-across-invocations-and-across-app-restarts)).
2. Disconnect the external monitor (libvirt: detach the second display device; bare metal: unplug).
3. Invoke Quick Entry.
**Expected:** Popup appears on the primary display, not at off-screen coordinates. Upstream falls back to `cHn()` when the saved monitor is no longer present (`index.js:515502`).
**Diagnostics on failure:** `xrandr` (X11) / `wlr-randr` (wlroots) output before and after disconnect, captured popup coordinates, screenshot.
**Skip when:** Single-monitor VM or host. Skip with `-` in the dashboard.
**References:** upstream `index.js:515502`
**Code anchors:** build-reference/app-extracted/.vite/build/index.js:515502 (`return cHn();` early-return when no saved position), 515523-515527 (`cHn()` centres popup on `screen.getPrimaryDisplay()` workArea), 515514-515515 (`label`-only match fallback before primary-display fallback).
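**Sketch:** the fallback chain from the anchors (label + resolution match, then label only, then primary display), plus the centring step. Names and types are hypothetical, and the proportional-placement branch mentioned under S35 is deliberately left out, so this is a simplification of `aHn()` and `cHn()` rather than a port.

```typescript
interface Display { label: string; width: number; height: number; x: number; y: number; }
interface SavedPosition { label: string; width: number; height: number; }

// Picks the display for the popup: exact label + resolution match first,
// then label only, then the primary display.
function pickDisplay(
  saved: SavedPosition | null,
  displays: Display[],
  primary: Display,
): Display {
  if (saved === null) return primary;
  const exact = displays.find(
    (d) => d.label === saved.label && d.width === saved.width && d.height === saved.height,
  );
  if (exact) return exact;
  return displays.find((d) => d.label === saved.label) ?? primary;
}

// Centres a popup of the given size inside a display's area.
function centreOn(d: Display, popupWidth: number, popupHeight: number): { x: number; y: number } {
  return {
    x: d.x + Math.round((d.width - popupWidth) / 2),
    y: d.y + Math.round((d.height - popupHeight) / 2),
  };
}
```

After the external monitor is unplugged in step 2, the saved label no longer matches any connected display, so the sketch lands on the primary display and yields on-screen coordinates.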
## S37 — Quick Entry popup remains functional after main window destroy
**Severity:** Should
**Surface:** Popup lifecycle independence from main window
**Applies to:** All rows (where reachable)
**Issues:**
**Steps:**
1. Launch app, focus main window.
2. **Trigger main window destroy without quitting the app.** On this project, the X-button hide-to-tray override means the standard close path does **not** destroy `ut`. Reach the destroy path via one of:
   - DevTools console on the main window: `require('electron').remote.getCurrentWindow().destroy()` (if `remote` is exposed; not guaranteed).
   - A debug build with the hide-to-tray override removed.
   - Skip and mark `-` if unreachable.
3. After destroy: invoke Quick Entry, type `hello`, submit.
**Expected:** Popup appears and accepts input. Upstream's `!ut || ut.isDestroyed()` guard at `index.js:515595` skips the show/focus block without crashing. The new chat is created in the data layer; whether it has a window to surface in is a separate question (upstream contract is "popup itself does not crash").
**Diagnostics on failure:** Crash dump, `~/.config/Claude/logs/`, sequence of actions taken to reach the destroy path.
**Currently:** Likely unreachable on Linux without a debug build, due to project's hide-to-tray override of the X button. Mark `-` (N/A) on rows where the destroy path can't be triggered.
**References:** upstream `index.js:515595`
**Code anchors:** build-reference/app-extracted/.vite/build/index.js:515595-515602 (`setTimeout(() => { !ut || ut.isDestroyed() || (h1() || ut.show(), ut.focus(), Qe == null || Qe.webContents.focus(), iri()); }, 0)` — guard skips show/focus block on destroy without throwing); 515547 (companion guard in `nde()` chat-id submit path: `else if (ut && !ut.isDestroyed())`).
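The guard quoted above can be restated in readable form. The sketch below mirrors only the shape of the quoted expression; `MainWindowLike`, `surfaceMainWindow`, and the `alreadyVisible` callback (standing in for `h1()`) are hypothetical names introduced for illustration.

```typescript
// Minimal stand-in for the parts of the main window the guard touches.
interface MainWindowLike {
  isDestroyed(): boolean;
  show(): void;
  focus(): void;
}

// Returns true when the main window was surfaced, false when the guard
// skipped the show/focus block (window missing or destroyed).
function surfaceMainWindow(main: MainWindowLike | null, alreadyVisible: () => boolean): boolean {
  if (!main || main.isDestroyed()) return false; // guard: never touch a dead window
  if (!alreadyVisible()) main.show();            // skip show() when already visible
  main.focus();
  return true;
}
```

The point of the test is the first branch: a destroyed or absent main window yields a quiet no-op rather than an exception.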


@@ -0,0 +1,123 @@
# Tray & Window Chrome
Tests covering the tray icon, OS-native window decorations, the hybrid in-app topbar (PR #538), and hide-to-tray on close. See [`../matrix.md`](../matrix.md) for status.
## T03 — Tray icon present
**Severity:** Smoke
**Surface:** System tray / SNI
**Applies to:** All rows
**Issues:**
**Runner:** [`tools/test-harness/src/runners/T03_tray_icon_present.spec.ts`](../../../tools/test-harness/src/runners/T03_tray_icon_present.spec.ts) — registration only (left-click toggle + theme-switch in-place rebuild are v2)
**Steps:**
1. Launch the app. Wait a few seconds.
2. Locate the tray icon in the system tray / status area.
3. Right-click → confirm standard menu (Show, Quit, etc.). Left-click → confirm window toggles.
4. Switch the system theme between light and dark; observe the tray icon update.
**Expected:** Tray icon appears within a few seconds of app launch. Right-click exposes the standard menu. Left-click toggles main window visibility. Theme changes update the icon in place without spawning a duplicate.
**Diagnostics on failure:** `RegisteredStatusNotifierItems` from the SNI watcher (see [runbook](../runbook.md#tray--dbus-state-kde)), the tray daemon process for the DE (Plasma's `plasmashell`, GNOME's `gnome-shell` + AppIndicator extension state, etc.), launcher log.
**References:** [`docs/learnings/tray-rebuild-race.md`](../../learnings/tray-rebuild-race.md)
**Code anchors:** `build-reference/app-extracted/.vite/build/index.js:525627` (`vy.on("menuBarEnabled", () => { Sde() })` — re-entry), `index.js:525631-525673` (`function Sde()` — tray construction), `index.js:525645` (`new hA.Tray(hA.nativeImage.createFromPath(t))`), `index.js:525646` (`qh.on("click", () => void Yri())` — left-click handler), `index.js:525653` (`qh.setContextMenu(mnt())` — Linux right-click via context menu), `index.js:515150-515169` (`function mnt()` — Show App + Quit menu items), `index.js:525623` (`hA.nativeTheme.on("updated", ...)` — theme-change re-entry).
## T04 — Window decorations draw
**Severity:** Smoke
**Surface:** Window chrome
**Applies to:** All rows
**Issues:** [PR #127](https://github.com/aaddrick/claude-desktop-debian/pull/127), [PR #538](https://github.com/aaddrick/claude-desktop-debian/pull/538)
**Runner:** [`tools/test-harness/src/runners/T04_window_decorations.spec.ts`](../../../tools/test-harness/src/runners/T04_window_decorations.spec.ts) — X11 / XWayland only (checks `_NET_FRAME_EXTENTS`); native-Wayland window-state queries are deferred
**Steps:**
1. Launch the app.
2. Confirm window has a working OS-native frame: close, minimize, maximize render and respond.
3. Resize via window edges.
**Expected:** Frame is drawn by the DE/compositor (not the app). All controls render and respond. Resize works.
**Diagnostics on failure:** `xprop _NET_WM_WINDOW_TYPE` (X11) / `swaymsg -t get_tree` or compositor-equivalent (Wayland), launcher log line for `frame:` setting, screenshot.
**References:** [PR #127](https://github.com/aaddrick/claude-desktop-debian/pull/127), [PR #538](https://github.com/aaddrick/claude-desktop-debian/pull/538) (hybrid mode keeps native frame), [`docs/learnings/linux-topbar-shim.md`](../../learnings/linux-topbar-shim.md)
**Code anchors:** Upstream factory passes `titleBarStyle: "hidden"` and `titleBarOverlay: ys` (Windows-only flag) to `BrowserWindow` at `build-reference/app-extracted/.vite/build/index.js:524892-524909` (`Ori()`). On Linux the wrapper at `scripts/frame-fix-wrapper.js:122` overrides to `options.frame = true` and at `scripts/frame-fix-wrapper.js:129-130` deletes the macOS-only `titleBarStyle` / `titleBarOverlay` so the DE draws the frame. (Hybrid-mode plumbing — `CLAUDE_TITLEBAR_STYLE` resolution and the `native`/`hybrid`/`hidden` branches — lives on `main` per PR #538; the docs/compat-matrix branch's `frame-fix-wrapper.js` carries only the unconditional `frame:true` patch, which is sufficient for T04's "frame draws" assertion.)
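For the X11 / XWayland check, the `_NET_FRAME_EXTENTS` property can be parsed from `xprop` output. The sketch below assumes the usual `xprop` line format (`_NET_FRAME_EXTENTS(CARDINAL) = left, right, top, bottom`) and assumes that "any non-zero extent" is an acceptable proxy for "the window manager drew a frame". The function names are hypothetical, and some window managers may report zero extents even when decorations are present.

```typescript
interface FrameExtents { left: number; right: number; top: number; bottom: number }

// Parse a line such as "_NET_FRAME_EXTENTS(CARDINAL) = 0, 0, 37, 0".
// EWMH defines the order as left, right, top, bottom.
function parseFrameExtents(xpropOutput: string): FrameExtents | null {
  const m = /_NET_FRAME_EXTENTS\(CARDINAL\)\s*=\s*(\d+),\s*(\d+),\s*(\d+),\s*(\d+)/.exec(xpropOutput);
  if (!m) return null;
  return { left: Number(m[1]), right: Number(m[2]), top: Number(m[3]), bottom: Number(m[4]) };
}

// Assumption: a server-side frame shows up as at least one non-zero extent.
function hasServerSideFrame(extents: FrameExtents | null): boolean {
  return extents !== null
    && (extents.left > 0 || extents.right > 0 || extents.top > 0 || extents.bottom > 0);
}
```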
## T07 — In-app topbar renders + clickable
**Severity:** Smoke
**Surface:** In-app topbar (hybrid mode)
**Applies to:** All rows on PR #538 builds
**Issues:** [PR #538](https://github.com/aaddrick/claude-desktop-debian/pull/538), [PR #127](https://github.com/aaddrick/claude-desktop-debian/pull/127)
**Steps:**
1. Launch a PR #538 build.
2. Observe the in-app topbar below the OS frame.
3. Click each of: hamburger menu, sidebar toggle, search, back, forward, Cowork ghost.
**Expected:** Every topbar button listed in step 3 renders below the native frame. Each responds to mouse clicks (no implicit drag region capturing the events). If any single button fails to render or click, the test is `✗`; note which one in the linked issue.
**Diagnostics on failure:** Screenshot, env (`OZONE_PLATFORM`, `ELECTRON_OZONE_PLATFORM_HINT`, `GDK_BACKEND`, `QT_QPA_PLATFORM`, `MOZ_ENABLE_WAYLAND`, `SDL_VIDEODRIVER`), launcher log, DevTools `document.querySelector('.topbar')` HTML if accessible.
**References:** [PR #538](https://github.com/aaddrick/claude-desktop-debian/pull/538), [PR #127](https://github.com/aaddrick/claude-desktop-debian/pull/127), [`docs/learnings/linux-topbar-shim.md`](../../learnings/linux-topbar-shim.md)
**Code anchors:** UA-spoof shim source `scripts/wco-shim.js` (lines 1-30 module guard / `CLAUDE_TITLEBAR_STYLE != 'native'` gate, lines 184-191 `navigator.userAgent` redefinition matching `/(win32|win64|windows|wince)/i`, lines 52-53 `CONTROLS_WIDTH=140` / `TITLEBAR_HEIGHT=40`); injection orchestrator `scripts/patches/wco-shim.sh` (`patch_wco_shim()` prepends shim source to `mainView.js`); hybrid-mode wrapper branch `scripts/frame-fix-wrapper.js:62-70` (`VALID_TITLEBAR_STYLES`, default `hybrid`) and `:152-240` (per-mode `frame` / `titleBarStyle` handling).
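The two conditions named in the anchors, the shim's run gate and the page-side Windows regex, are easy to exercise in isolation. This is a sketch under stated assumptions: `shimShouldRun` and `looksLikeWindows` are hypothetical helper names, and the sample user-agent strings are illustrative rather than captured from a real build.

```typescript
// Regex quoted in the anchors above (the bundle's isWindows() check).
const WINDOWS_UA = /(win32|win64|windows|wince)/i;

// Gate described above: Linux only, and not when the style is 'native'.
function shimShouldRun(platform: string, titlebarStyle: string | undefined): boolean {
  return platform === 'linux' && titlebarStyle !== 'native';
}

// True when a user-agent string would satisfy the Windows check.
function looksLikeWindows(userAgent: string): boolean {
  return WINDOWS_UA.test(userAgent);
}
```

When the topbar fails to render, evaluating `navigator.userAgent` in DevTools and passing it through the same regex shows whether the override took effect before the page ran its check.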
## T08 — Hide-to-tray on close
**Severity:** Smoke
**Surface:** Window lifecycle
**Applies to:** All rows
**Issues:** [PR #451](https://github.com/aaddrick/claude-desktop-debian/pull/451)
**Steps:**
1. Launch the app. Click the window close (X) button.
2. Confirm app process is still running (`pgrep -af claude-desktop`).
3. Click the tray icon (or invoke Quick Entry) → window restores.
4. Quit explicitly via tray menu or `Ctrl+Q`.
**Expected:** Close button hides main window to tray, doesn't quit. App keeps running. Tray-click restores. Explicit Quit ends the process.
**Diagnostics on failure:** `pgrep -af claude-desktop` after close, launcher log, screenshot of any dialog.
**References:** [PR #451](https://github.com/aaddrick/claude-desktop-debian/pull/451)
**Code anchors:** Upstream Linux quit-on-last-close at `build-reference/app-extracted/.vite/build/index.js:525550-525552` (`hA.app.on("window-all-closed", () => { Zr || Ap() })`; `Zr` is darwin). Wrapper interception at `scripts/frame-fix-wrapper.js:178-185` (`this.on('close', e => { if (!result.app._quittingIntentionally && !this.isDestroyed()) { e.preventDefault(); this.hide() } })`) and `scripts/frame-fix-wrapper.js:370-374` (`app.on('before-quit', () => { app._quittingIntentionally = true })`, which arms the bypass for tray-Quit / `Ctrl+Q` / SIGTERM). `CLOSE_TO_TRAY` gate (Linux + `CLAUDE_QUIT_ON_CLOSE !== '1'`) at `scripts/frame-fix-wrapper.js:49-51`. Tray Quit menu item `mnt()` `click: rde` at `index.js:515166`; `function rde()` at `index.js:515306-515308` calls `Ap(!1)`.
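The close decision described by these anchors reduces to one boolean. The sketch below restates it as a pure function so the four inputs are explicit; `CloseContext` and both function names are hypothetical, and the real wrapper reads these values from `process` and the window rather than from a plain object.

```typescript
interface CloseContext {
  platform: string;                    // e.g. process.platform
  quitOnCloseEnv: string | undefined;  // value of CLAUDE_QUIT_ON_CLOSE
  quittingIntentionally: boolean;      // set by the before-quit handler
  destroyed: boolean;                  // window.isDestroyed()
}

// CLOSE_TO_TRAY gate as described above: Linux and CLAUDE_QUIT_ON_CLOSE !== '1'.
function closeToTrayEnabled(platform: string, quitOnCloseEnv: string | undefined): boolean {
  return platform === 'linux' && quitOnCloseEnv !== '1';
}

// True when the close event should be cancelled and the window hidden.
function shouldHideInsteadOfClose(ctx: CloseContext): boolean {
  return closeToTrayEnabled(ctx.platform, ctx.quitOnCloseEnv)
    && !ctx.quittingIntentionally
    && !ctx.destroyed;
}
```

Steps 1 to 3 exercise the `true` branch; step 4 exercises the `quittingIntentionally` bypass.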
## S08 — Tray icon doesn't duplicate after `nativeTheme` update
**Severity:** Should
**Surface:** Tray (KDE)
**Applies to:** KDE-W, KDE-X
**Issues:** [`docs/learnings/tray-rebuild-race.md`](../../learnings/tray-rebuild-race.md)
**Steps:**
1. Launch the app on KDE.
2. Toggle system theme (light ↔ dark).
3. Observe the tray for ~10 seconds.
**Expected:** Tray icon updates in place via `setImage` + `setContextMenu`. SNI service stays registered — no de-register / re-register churn that would leave a duplicate icon visible until KDE garbage-collects.
**Diagnostics on failure:** SNI watcher state before/after theme switch (see [runbook](../runbook.md#tray--dbus-state-kde)), launcher log, `journalctl --user -u plasma-plasmashell -n 50`.
**References:** [`docs/learnings/tray-rebuild-race.md`](../../learnings/tray-rebuild-race.md). Mitigated upstream — the in-place fast-path is the current behavior.
**Code anchors:** Upstream destroy+recreate slow-path at `build-reference/app-extracted/.vite/build/index.js:525643` (`qh && (qh.destroy(), (qh = null))`) followed immediately by `new hA.Tray(...)` at `:525645` and `setContextMenu(mnt())` at `:525653` — the SNI re-register that races on KDE. Fast-path injection in `scripts/patches/tray.sh` `patch_tray_inplace_update()` (lines 95-231): extracts `tray_var` / `menu_func` / `path_var` / `enabled_var` dynamically, then injects `if (TRAY && ENABLED !== false) { TRAY.setImage(EL.nativeImage.createFromPath(PATH)); process.platform !== "darwin" && TRAY.setContextMenu(MENU()); return }` before the destroy block. Idempotency marker at `tray.sh:174-180` keys on the post-rename `setImage(...nativeImage.createFromPath(PATH_VAR))` literal. Mutex + 250 ms DBus settle delay (the prior mitigation, kept for the legitimate slow-path entries) at `tray.sh:48-60`.
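One way to turn "no de-register / re-register churn" into an assertion is to diff the SNI watcher's item list before and after the theme switch. This is a sketch only: it assumes the two `RegisteredStatusNotifierItems` lists have already been captured as string arrays, and that a destroy+recreate shows up as a changed entry. The item names in the usage note are illustrative, not captured output.

```typescript
interface SniDiff { added: string[]; removed: string[] }

// Compare two snapshots of RegisteredStatusNotifierItems.
function diffSniItems(before: string[], after: string[]): SniDiff {
  const beforeSet = new Set(before);
  const afterSet = new Set(after);
  return {
    added: after.filter((item) => !beforeSet.has(item)),
    removed: before.filter((item) => !afterSet.has(item)),
  };
}

// The in-place fast-path should leave the registration list untouched.
function registrationStable(before: string[], after: string[]): boolean {
  const diff = diffSniItems(before, after);
  return diff.added.length === 0 && diff.removed.length === 0;
}
```

For example, `registrationStable([':1.90/StatusNotifierItem'], [':1.90/StatusNotifierItem'])` holds, while a list that swaps in a different item name after the theme toggle does not.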
## S13 — Hybrid topbar shim survives Omarchy's Ozone-Wayland env exports
**Severity:** Critical (for Omarchy users)
**Surface:** In-app topbar (hybrid mode) under Omarchy env
**Applies to:** Hypr-O
**Issues:** [PR #538](https://github.com/aaddrick/claude-desktop-debian/pull/538)
**Steps:**
1. On OmarchyOS, export Omarchy's session-wide env (`ELECTRON_OZONE_PLATFORM_HINT=wayland`, `OZONE_PLATFORM=wayland`, `GDK_BACKEND=wayland,x11,*`, `QT_QPA_PLATFORM=wayland;xcb`, `MOZ_ENABLE_WAYLAND=1`, `SDL_VIDEODRIVER=wayland,x11`).
2. Launch a PR #538 build.
3. Click each topbar button listed in [T07](#t07--in-app-topbar-renders--clickable) step 3.
**Expected:** The hybrid-mode topbar shim (`scripts/wco-shim.js`) loads in time to spoof the UA before claude.ai's `isWindows()` check fires. All topbar buttons render and click.
**Diagnostics on failure:** Full session env, launcher log, `--doctor`, screenshot, video (per @lukedev45's bug report on PR #538), DevTools console for shim-load errors.
**Currently:** Reproduces partial render on OmarchyOS Hyprland per [@lukedev45](https://github.com/lukedev45)'s video on [PR #538](https://github.com/aaddrick/claude-desktop-debian/pull/538). @aaddrick attempted local repro on KDE Plasma + Wayland with the same env vars and could not reproduce; root cause TBD pending diagnostic capture from a broken run.
**References:** [PR #538](https://github.com/aaddrick/claude-desktop-debian/pull/538), [`docs/learnings/linux-topbar-shim.md`](../../learnings/linux-topbar-shim.md)
**Code anchors:** Shim is inlined at the top of `mainView.js` (the BrowserView preload), not loaded via `require` — see the rationale at `scripts/patches/wco-shim.sh:23-40` ("Sandboxed preloads can only require a fixed allowlist of modules…"). The injection prepends `scripts/wco-shim.js` source at the start of `app.asar.contents/.vite/build/mainView.js` so the UA override fires before the bundle's `isWindows()` regex (`/(win32|win64|windows|wince)/i`) ever runs in the page main world (`scripts/wco-shim.js:184-191`). The shim's IIFE no-ops on non-Linux at `wco-shim.js:29` and on `CLAUDE_TITLEBAR_STYLE === 'native'` at `wco-shim.js:30-32`, so the only env-export interaction with `OZONE_PLATFORM` etc. is via Chromium's own platform plumbing — none of those exports are read by the shim itself, which makes the partial-render repro on Omarchy mysterious to static analysis.
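Since the root cause is still open, a consistent env capture makes broken and working runs comparable. The helper below is a hypothetical sketch that records exactly the six variables named in the steps and marks unset ones explicitly; it takes the environment as a parameter so it can be run against `process.env` or a saved dump.

```typescript
// The six session-wide variables listed in the S13 steps.
const OZONE_ENV_KEYS = [
  'ELECTRON_OZONE_PLATFORM_HINT',
  'OZONE_PLATFORM',
  'GDK_BACKEND',
  'QT_QPA_PLATFORM',
  'MOZ_ENABLE_WAYLAND',
  'SDL_VIDEODRIVER',
] as const;

// Snapshot the variables, writing '<unset>' so absent keys stay visible in a diff.
function captureSessionEnv(env: Record<string, string | undefined>): Record<string, string> {
  const snapshot: Record<string, string> = {};
  for (const key of OZONE_ENV_KEYS) {
    snapshot[key] = env[key] ?? '<unset>';
  }
  return snapshot;
}
```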

docs/testing/matrix.md Normal file

@@ -0,0 +1,179 @@
# Test Status Matrix
*Last updated: 2026-04-30 · Tested against: claude-desktop 1.4758.0 (project varies per row)*
This is the live dashboard. Update this file (and only this file) when status changes. For the test specs themselves, see [`cases/`](./cases/). For orientation, see [`README.md`](./README.md).
Status legend: `✓` pass · `✗` fail · `🔧` mitigated · `?` untested · `-` N/A. Cells include linked issue/PR numbers when relevant.
## Cross-environment matrix (T-series)
| Test | KDE-W | KDE-X | GNOME | Ubu | Sway | i3 | Niri | Hypr-O | Hypr-N |
|------|-------|-------|-------|-----|------|----|------|--------|--------|
| [T01](./cases/launch.md#t01--app-launch) | ✓ | ? | ? | ? | ? | ? | ? | ? | ✓ |
| [T02](./cases/launch.md#t02--doctor-health-check) | ? | ? | ? | ? | ? | ? | ? | ? | ? |
| [T03](./cases/tray-and-window-chrome.md#t03--tray-icon-present) | ✓ | ? | ? | ? | ? | ? | ? | ? | ? |
| [T04](./cases/tray-and-window-chrome.md#t04--window-decorations-draw) | ✓ | ? | ? | ? | ? | ? | ? | ? | ✓ |
| [T05](./cases/shortcuts-and-input.md#t05--url-handler-opens-claudeai-links-in-app) | ? | ? | ? | ? | ✗ | ? | ? | ? | ? |
| [T06](./cases/shortcuts-and-input.md#t06--quick-entry-global-shortcut-unfocused) | ✓ | ✓ | ✗ [#404](https://github.com/aaddrick/claude-desktop-debian/issues/404) | 🔧 [#406](https://github.com/aaddrick/claude-desktop-debian/pull/406) | ? | ? | ✗ | ? | ? |
| [T07](./cases/tray-and-window-chrome.md#t07--in-app-topbar-renders--clickable) | ? | ? | ? | ? | ? | ? | ? | ✗ [#538](https://github.com/aaddrick/claude-desktop-debian/pull/538) | ✓ |
| [T08](./cases/tray-and-window-chrome.md#t08--hide-to-tray-on-close) | ✓ | ? | ? | ? | ? | ? | ? | ? | ? |
| [T09](./cases/platform-integration.md#t09--autostart-via-xdg) | ✓ | ? | ? | ? | ? | ? | ? | ? | ? |
| [T10](./cases/platform-integration.md#t10--cowork-integration) | ? | ? | ? | ? | ? | ? | ? | ? | ? |
| [T11](./cases/extensibility.md#t11--plugin-install-anthropic--partners) | ? | ? | ? | ? | ? | ? | ? | ? | ? |
| [T12](./cases/platform-integration.md#t12--webgl-warn-only) | ? | ? | ? | ? | ? | ? | ? | ? | ? |
| [T13](./cases/launch.md#t13--doctor-reports-correct-package-format) | ✗ | ✗ | ✗ | ? | ✗ | ✗ | ✗ | ? | ? |
| [T14](./cases/launch.md#t14--multi-instance-behavior) | ? | ? | ? | ? | ? | ? | ? | ? | ? |
| [T15](./cases/code-tab-foundations.md#t15--sign-in-completes-via-browser-handoff) | ? | ? | ? | ? | ? | ? | ? | ? | ? |
| [T16](./cases/code-tab-foundations.md#t16--code-tab-loads) | ? | ? | ? | ? | ? | ? | ? | ? | ? |
| [T17](./cases/code-tab-foundations.md#t17--folder-picker-opens) | ? | ? | ? | ? | ? | ? | ? | ? | ? |
| [T18](./cases/code-tab-foundations.md#t18--drag-and-drop-files-into-prompt) | ? | ? | ? | ? | ? | ? | ? | ? | ? |
| [T19](./cases/code-tab-foundations.md#t19--integrated-terminal) | ? | ? | ? | ? | ? | ? | ? | ? | ? |
| [T20](./cases/code-tab-foundations.md#t20--file-pane-opens-and-saves) | ? | ? | ? | ? | ? | ? | ? | ? | ? |
| [T21](./cases/code-tab-workflow.md#t21--dev-server-preview-pane) | ? | ? | ? | ? | ? | ? | ? | ? | ? |
| [T22](./cases/code-tab-workflow.md#t22--pr-monitoring-via-gh) | ? | ? | ? | ? | ? | ? | ? | ? | ? |
| [T23](./cases/code-tab-handoff.md#t23--desktop-notifications-fire) | ? | ? | ? | ? | ? | ? | ? | ? | ? |
| [T24](./cases/code-tab-handoff.md#t24--open-in-external-editor) | ? | ? | ? | ? | ? | ? | ? | ? | ? |
| [T25](./cases/code-tab-handoff.md#t25--show-in-files-file-manager) | ? | ? | ? | ? | ? | ? | ? | ? | ? |
| [T26](./cases/routines.md#t26--routines-page-renders) | ? | ? | ? | ? | ? | ? | ? | ? | ? |
| [T27](./cases/routines.md#t27--scheduled-task-fires-and-notifies) | ? | ? | ? | ? | ? | ? | ? | ? | ? |
| [T28](./cases/routines.md#t28--scheduled-task-catch-up-after-suspend) | ? | ? | ? | ? | ? | ? | ? | ? | ? |
| [T29](./cases/code-tab-workflow.md#t29--worktree-isolation) | ? | ? | ? | ? | ? | ? | ? | ? | ? |
| [T30](./cases/code-tab-workflow.md#t30--auto-archive-on-pr-merge) | ? | ? | ? | ? | ? | ? | ? | ? | ? |
| [T31](./cases/code-tab-workflow.md#t31--side-chat-opens) | ? | ? | ? | ? | ? | ? | ? | ? | ? |
| [T32](./cases/code-tab-workflow.md#t32--slash-command-menu) | ? | ? | ? | ? | ? | ? | ? | ? | ? |
| [T33](./cases/extensibility.md#t33--plugin-browser) | ? | ? | ? | ? | ? | ? | ? | ? | ? |
| [T34](./cases/code-tab-handoff.md#t34--connector-oauth-round-trip) | ? | ? | ? | ? | ? | ? | ? | ? | ? |
| [T35](./cases/extensibility.md#t35--mcp-server-config-picked-up) | ? | ? | ? | ? | ? | ? | ? | ? | ? |
| [T36](./cases/extensibility.md#t36--hooks-fire) | ? | ? | ? | ? | ? | ? | ? | ? | ? |
| [T37](./cases/extensibility.md#t37--claudemd-memory-loads) | ? | ? | ? | ? | ? | ? | ? | ? | ? |
| [T38](./cases/code-tab-handoff.md#t38--continue-in-ide) | ? | ? | ? | ? | ? | ? | ? | ? | ? |
| [T39](./cases/code-tab-handoff.md#t39--desktop-cli-handoff-graceful-na) | ? | ? | ? | ? | ? | ? | ? | ? | ? |
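
Because each cell leads with one legend symbol, a matrix row can be parsed mechanically, for instance to regenerate the failures rollup. The sketch below is a hypothetical helper, not part of the harness: it assumes cells contain no literal `|` and that the status is the first whitespace-delimited token of each cell.

```typescript
type Status = '✓' | '✗' | '🔧' | '?' | '-';
const STATUSES: readonly string[] = ['✓', '✗', '🔧', '?', '-'];

function isStatus(token: string): token is Status {
  return STATUSES.includes(token);
}

// Parse "| [T06](...) | ✓ | ✗ [#404](...) | ... |" into an ID plus statuses.
// Returns null when the first cell has no test ID or a cell lacks a legend symbol.
function parseMatrixRow(row: string): { id: string; statuses: Status[] } | null {
  const cells = row.split('|').slice(1, -1).map((cell) => cell.trim());
  const idMatch = /^\[([TS]\d+)\]/.exec(cells[0] ?? '');
  if (!idMatch) return null;
  const statuses: Status[] = [];
  for (const cell of cells.slice(1)) {
    const token = cell.split(/\s+/)[0] ?? '';
    if (!isStatus(token)) return null;
    statuses.push(token);
  }
  return { id: String(idMatch[1]), statuses };
}
```

Filtering parsed rows for any `✗` entry gives the set of tests that belong in the rollup.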
## Environment-specific status
### Ubuntu / DEB
| ID | Test | Status | Notes |
|----|------|--------|-------|
| [S01](./cases/distribution.md#s01--appimage-launches-without-manual-libfuse2t64-install) | AppImage launches without manual `libfuse2t64` install | ✗ | Workaround documented; not yet filed |
| [S02](./cases/distribution.md#s02--xdg_current_desktopubuntu-gnome-doesnt-break-de-detection) | `XDG_CURRENT_DESKTOP=ubuntu:GNOME` doesn't break DE detection | ? | — |
| [S03](./cases/distribution.md#s03--deb-install-via-apt-pulls-all-required-runtime-deps) | DEB install via APT pulls all required runtime deps | ? | — |
### Fedora / RPM
| ID | Test | Status | Notes |
|----|------|--------|-------|
| [S04](./cases/distribution.md#s04--rpm-install-via-dnf-pulls-all-required-runtime-deps) | RPM install via DNF pulls all required runtime deps | ? | — |
| [S05](./cases/distribution.md#s05--doctor-recognises-dnf-installed-package-doesnt-false-flag-as-appimage) | Doctor recognises dnf-installed package (no AppImage false-flag) | ✗ | Affects KDE-W, KDE-X, GNOME, Sway, i3, Niri (T13) |
### Wayland-native (wlroots)
Applies to: Sway, Niri, Hypr-O, Hypr-N (any session running native Wayland rather than XWayland).
| ID | Test | Status | Notes |
|----|------|--------|-------|
| [S06](./cases/shortcuts-and-input.md#s06--url-handler-doesnt-segfault-on-native-wayland) | URL handler doesn't segfault on native Wayland | ✗ on Sway | Captured; not yet filed |
| [S07](./cases/shortcuts-and-input.md#s07--claude_use_wayland1-opt-in-path-works-without-crashing) | `CLAUDE_USE_WAYLAND=1` opt-in path works | ? | [#228](https://github.com/aaddrick/claude-desktop-debian/pull/228), [#232](https://github.com/aaddrick/claude-desktop-debian/pull/232) |
### KDE
Applies to: KDE-W, KDE-X.
| ID | Test | Status | Notes |
|----|------|--------|-------|
| [S08](./cases/tray-and-window-chrome.md#s08--tray-icon-doesnt-duplicate-after-nativetheme-update) | Tray icon doesn't duplicate after `nativeTheme` update | 🔧 | [`tray-rebuild-race.md`](../learnings/tray-rebuild-race.md) |
| [S09](./cases/shortcuts-and-input.md#s09--quick-window-patch-runs-only-on-kde-post-406-gate) | Quick window patch runs only on KDE | ✓ | [PR #406](https://github.com/aaddrick/claude-desktop-debian/pull/406) |
| [S10](./cases/shortcuts-and-input.md#s10--quick-entry-popup-is-transparent-no-opaque-square-frame) | Quick Entry popup is transparent | ? | [#370](https://github.com/aaddrick/claude-desktop-debian/issues/370), [#223](https://github.com/aaddrick/claude-desktop-debian/issues/223) |
### GNOME
Applies to: GNOME, Ubu (Ubuntu's GNOME), and any other mutter session.
| ID | Test | Status | Notes |
|----|------|--------|-------|
| [S11](./cases/shortcuts-and-input.md#s11--quick-entry-shortcut-fires-from-any-focus-on-wayland-mutter-xwayland-key-grab) | Quick Entry shortcut fires from any focus | ✗ on GNOME, 🔧 on Ubu | [#404](https://github.com/aaddrick/claude-desktop-debian/issues/404), [PR #406](https://github.com/aaddrick/claude-desktop-debian/pull/406) |
| [S12](./cases/shortcuts-and-input.md#s12----enable-featuresglobalshortcutsportal-launcher-flag-wired-up-for-gnome-wayland) | `--enable-features=GlobalShortcutsPortal` wired up | ? | [#404](https://github.com/aaddrick/claude-desktop-debian/issues/404) |
### Omarchy
| ID | Test | Status | Notes |
|----|------|--------|-------|
| [S13](./cases/tray-and-window-chrome.md#s13--hybrid-topbar-shim-survives-omarchys-ozone-wayland-env-exports) | Hybrid topbar shim survives Omarchy's Ozone-Wayland env exports | ✗ | [PR #538](https://github.com/aaddrick/claude-desktop-debian/pull/538) |
### Niri
| ID | Test | Status | Notes |
|----|------|--------|-------|
| [S14](./cases/shortcuts-and-input.md#s14--global-shortcuts-via-xdg-portal-work-on-niri) | Global shortcuts via XDG portal work on Niri | ✗ | Captured; not yet filed |
### AppImage
| ID | Test | Status | Notes |
|----|------|--------|-------|
| [S15](./cases/distribution.md#s15--appimage-extraction---appimage-extract-works-as-documented-fallback) | AppImage extraction (`--appimage-extract`) works as fallback | ? | — |
| [S16](./cases/distribution.md#s16--appimage-mount-cleans-up-on-app-exit) | AppImage mount cleans up on app exit | ? | — |
### Linux launcher / `.desktop` env handling
| ID | Test | Status | Notes |
|----|------|--------|-------|
| [S17](./cases/platform-integration.md#s17--app-launched-from-desktop-inherits-shell-path) | App launched from `.desktop` inherits shell `PATH` | ? | — |
| [S18](./cases/platform-integration.md#s18--local-environment-editor-persists-across-reboot) | Local environment editor persists across reboot | ? | — |
| [S19](./cases/routines.md#s19--claude_config_dir-redirects-scheduled-task-storage) | `CLAUDE_CONFIG_DIR` redirects scheduled-task storage | ? | — |
### Idle-sleep / suspend
| ID | Test | Status | Notes |
|----|------|--------|-------|
| [S20](./cases/routines.md#s20--keep-computer-awake-inhibits-idle-suspend) | "Keep computer awake" inhibits idle suspend | ? | — |
| [S21](./cases/routines.md#s21--lid-close-still-suspends-per-os-policy) | Lid-close still suspends per OS policy | ? | — |
### Computer Use (Linux: out-of-scope per upstream)
| ID | Test | Status | Notes |
|----|------|--------|-------|
| [S22](./cases/platform-integration.md#s22--computer-use-toggle-is-absent-or-visibly-disabled-on-linux) | Computer-use toggle is absent or visibly disabled | ? | — |
| [S23](./cases/platform-integration.md#s23--dispatch-spawned-sessions-dont-soft-lock-on-a-never-approvable-computer-use-prompt) | Dispatch sessions don't soft-lock on never-approvable prompt | ? | — |
### Dispatch
| ID | Test | Status | Notes |
|----|------|--------|-------|
| [S24](./cases/platform-integration.md#s24--dispatch-spawned-code-session-appears-with-badge-and-notification) | Dispatch-spawned Code session appears with badge + notification | ? | — |
| [S25](./cases/platform-integration.md#s25--mobile-pairing-survives-linux-session-restart) | Mobile pairing survives Linux session restart | ? | — |
### Auto-update vs. system package manager
| ID | Test | Status | Notes |
|----|------|--------|-------|
| [S26](./cases/distribution.md#s26--auto-update-is-disabled-when-installed-via-apt--dnf) | Auto-update is disabled when installed via `apt` / `dnf` | ? | — |
### Plugin / worktree storage
| ID | Test | Status | Notes |
|----|------|--------|-------|
| [S27](./cases/extensibility.md#s27--plugins-install-per-user-not-into-system-paths) | Plugins install per-user, not into system paths | ? | — |
| [S28](./cases/extensibility.md#s28--worktree-creation-surfaces-clear-error-on-read-only-mounts) | Worktree creation surfaces clear error on read-only mounts | ? | — |
## Known failures rollup
Tests currently `✗` somewhere — investigation priority order:
| Test | Failing on | Root cause |
|------|------------|------------|
| [T05 / S06](./cases/shortcuts-and-input.md#s06--url-handler-doesnt-segfault-on-native-wayland) | Sway | URL handler subprocess SIGSEGV on native Wayland — `Failed to connect to Wayland display` |
| [T06 / S11](./cases/shortcuts-and-input.md#s11--quick-entry-shortcut-fires-from-any-focus-on-wayland-mutter-xwayland-key-grab) | GNOME | mutter doesn't honour XWayland-side key grab |
| [T06 / S14](./cases/shortcuts-and-input.md#s14--global-shortcuts-via-xdg-portal-work-on-niri) | Niri | `BindShortcuts` returns error code 5 |
| [T07 / S13](./cases/tray-and-window-chrome.md#s13--hybrid-topbar-shim-survives-omarchys-ozone-wayland-env-exports) | Hypr-O | Hybrid topbar shim partial render under Omarchy's Ozone-Wayland env exports |
| [T13 / S05](./cases/launch.md#t13--doctor-reports-correct-package-format) | every Fedora row | Doctor only checks dpkg, false-flags every dnf install as AppImage |
| [S01](./cases/distribution.md#s01--appimage-launches-without-manual-libfuse2t64-install) | Ubuntu 24.04 | AppImage requires `libfuse2t64`; not auto-pulled |
## Notes on the current state
- Most cells are `?` because every captured VM in the recent test session ran the **released** build (`dnf install` / `apt install` / current AppImage), which predates [PR #538](https://github.com/aaddrick/claude-desktop-debian/pull/538). Topbar verification (T07) on the VM rows specifically requires a branch build deployed before any cell can flip from `?`.
- KDE-W status reflects @aaddrick's daily-driver host (Nobara KDE Plasma Wayland) where multiple features have been in continuous use.
- Hypr-N status reflects @typedrat's report on [PR #538](https://github.com/aaddrick/claude-desktop-debian/pull/538) ("Working great on NixOS with Hyprland").
- Hypr-O status reflects @lukedev45's broken-case report on [PR #538](https://github.com/aaddrick/claude-desktop-debian/pull/538) (partial render, root cause unconfirmed but Omarchy-env-specific — see [S13](./cases/tray-and-window-chrome.md#s13--hybrid-topbar-shim-survives-omarchys-ozone-wayland-env-exports)).
- T13 is `✗` on every Fedora row because the dpkg false-flag is a deterministic property of the doctor script, not a per-environment failure mode. It will flip to `✓` everywhere once the doctor learns to detect rpm/dnf installs.
- T15–T39 are derived from upstream Claude Code Desktop docs (`code.claude.com/docs/en/desktop*`). These are features whose Linux behavior is officially undocumented (the docs explicitly state "Linux is not supported" for the Code tab). All cells start as `?` because the upstream Code-tab feature surface has not been systematically exercised on the patched Linux build.


@@ -0,0 +1,118 @@
# Quick Entry — Upstream Contract + Test Index
Reference doc for the Quick Entry surface. Two halves:
- [§ Upstream design intent](#upstream-design-intent) documents what upstream Quick Entry promises vs. doesn't, with code anchors into `build-reference/app-extracted/.vite/build/index.js`. Treat as the authoritative answer when triaging whether a Quick Entry behavior is a Linux compat regression (our problem) or upstream-by-design (not our problem).
- [§ Test list](#test-list) enumerates the QE-N items as conceptual checks and maps each to the concrete S-N / T-N case that backs it. Spec headnotes (S09, S12, S31, S37) cite specific QE-N IDs by anchor; [§ Scaffold integration](#scaffold-integration) is the authoritative QE-N → S-N table.
The QE-N items originated in the close-out sweep for [#393](https://github.com/aaddrick/claude-desktop-debian/issues/393), [#404](https://github.com/aaddrick/claude-desktop-debian/issues/404), and [#370](https://github.com/aaddrick/claude-desktop-debian/issues/370). The sweep has run; what remains is the upstream-contract reference + the test-index mapping.
## Upstream design intent
Read this before reading the test list. Several `QE-*` rows test things upstream does not actually promise — those tests are still valuable as black-box behavior checks, but the calibration of "expected" matters.
Source for everything below: `build-reference/app-extracted/.vite/build/index.js`. Symbol names (`h1`, `ut`, `Ko`, `ynt`, `nde`, `g3A`, `u7A`) drift between releases — anchor on shape, not name.
### What upstream promises
- **Global shortcut** registered via Electron `globalShortcut.register()` (`:499416`). No app-focus gate — fires regardless of which app is focused.
- **Popup is lazily created** on first shortcut press (`if (!Ko || ...) Ko = new BrowserWindow(...)` near `:515375`). The popup `BrowserWindow` is constructed on demand, not at app startup. This is what makes QE-4 (closed-to-tray) work.
- **Position memory:** popup position persists across invocations via `an.get("quickWindowPosition")` (`:515491-515526`), keyed on monitor label + resolution. If the original monitor is gone, falls back to primary display.
- **Submit always creates a NEW chat session** when no `chatId` is provided (`ynt(e)` at `:515546`). Quick Entry never appends to an existing conversation.
- **Click-outside dismiss** is wired in the main process via the popup `blur` handler (`Ko.on("blur", () => g3A(null))` at `:515465`).
- **Popup survives main-window close.** If the user closes the main window via the X button (not full quit), `!ut || ut.isDestroyed()` guards at `:515595` skip the `show()/focus()` calls; the popup itself remains functional.
- **Window construction** sets `transparent: true`, `backgroundColor: "#00000000"`, `frame: false`, `alwaysOnTop: true` (level `"pop-up-menu"`), `skipTaskbar: true`, `resizable: false`, `show: false` (`:515375-515397`). `hasShadow: Zr` and `type: Zr ? "panel" : void 0` are macOS-only (`Zr` is `process.platform === "darwin"`).
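The construction options in the last bullet can be written out as a plain object for reference when checking S10-style transparency bugs. This is a sketch assembled from the values listed above, not a copy of upstream's code: `quickEntryWindowOptions` is a hypothetical name, and the `"pop-up-menu"` level is left out because how upstream applies it is not shown here.

```typescript
// Options as listed above; the platform parameter stands in for upstream's Zr flag.
function quickEntryWindowOptions(platform: string): Record<string, unknown> {
  const isMac = platform === 'darwin';
  return {
    transparent: true,
    backgroundColor: '#00000000', // fully transparent ARGB
    frame: false,
    alwaysOnTop: true,
    skipTaskbar: true,
    resizable: false,
    show: false,                  // popup is shown later, on shortcut press
    hasShadow: isMac,             // macOS-only
    type: isMac ? 'panel' : undefined,
  };
}
```

On Linux the result has `hasShadow: false` and no window `type`, so any opaque frame or shadow seen around the popup comes from the compositor or Electron backend rather than from these options.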
### What upstream does NOT promise
- **Workspace migration.** No `setVisibleOnAllWorkspaces()`, no `moveTop()`, no `setWorkspace()` is called anywhere in the Quick Entry submit path. Whether the main window comes to the user's current workspace or stays on its own is purely a compositor decision driven by `mainWin.show()` + `mainWin.focus()`. **Linux/Wayland behavior here is not part of the upstream feature spec.**
- **Restore from minimized.** No `restore()` call in the submit path. `show()` un-minimizes on most WMs; whether it does on a given Wayland compositor is up to that compositor.
- **Multi-monitor placement on cursor / focused display.** Upstream uses last-saved position or primary display, never "where the user is right now."
- **Multi-window targeting.** All `show`/`focus` calls go through `ut` (the main window). If the user has multiple windows, behavior is undefined.
- **Popup re-creation if its `BrowserWindow` is destroyed.** Upstream does not re-construct `Ko` after destroy — it's only created on first shortcut press.
- **Compositor-aware behavior.** Upstream has no concept of "GNOME vs KDE vs wlroots." Anywhere our patches branch on `XDG_CURRENT_DESKTOP`, that's our project compensating for compositor-specific Electron breakage, not implementing an upstream-defined contract.
### Edge case: fullscreen main window
`:525287-525290` reads (paraphrased): *"if `ut` exists and `ut.isFullScreen()` is true, focus `ut` and call `ide()`; else show the Quick Entry popup."* So if the main window is fullscreen when the shortcut fires, **the popup does not appear** — the shortcut focuses the main window instead. QE-1 needs this caveat.
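The paraphrased branch can be sketched as a two-way decision, which is what QE-1 and QE-1b check from the outside. `ShortcutAction` and `onQuickEntryShortcut` are hypothetical names, and the sketch models only the fullscreen test described above, not what `ide()` does.

```typescript
type ShortcutAction = 'focus-main' | 'show-popup';

// Fullscreen main window wins: the shortcut focuses it instead of showing the popup.
function onQuickEntryShortcut(main: { isFullScreen(): boolean } | null): ShortcutAction {
  if (main && main.isFullScreen()) return 'focus-main';
  return 'show-popup';
}
```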
### Edge case: `h1()` is a *don't-show-if-already-focused* optimization
The visibility-check function (`h1()` at `:105164-105171`) is upstream's mechanism for "don't redundantly call `show()` if the main window is already focused." Sound design. The reason it's broken on Linux is Electron's `BrowserWindow.isFocused()` returning stale-true after `hide()` on Linux backends — i.e., **the patch we apply is fixing a Linux-Electron bug, not diverging from upstream intent.** Once `isFocused()` returns honest values on Linux, the patch could be retired.
## Test list
Each item is a single check. Severity tier matches the existing scaffolding (Critical / Should / Smoke). The **Existing** column links the case that already covers the item; `(new)` means the item should be added to [`cases/shortcuts-and-input.md`](./cases/shortcuts-and-input.md) before this sweep is reproducible by anyone else.
### Shortcut activation — covers #404
| ID | Severity | Step | Expected | Existing |
|----|----------|------|----------|----------|
| QE-1 | Smoke | App focused (not fullscreen), press shortcut | Popup appears. **Edge case from upstream design:** if main window is fullscreen, the shortcut focuses main and runs `ide()` instead of showing the popup (`:525287-525290`). Test this fullscreen variant separately as QE-1b — popup should *not* appear. | [S34](./cases/shortcuts-and-input.md#s34--quick-entry-shortcut-focuses-fullscreen-main-window-instead-of-showing-popup) (QE-1b only) |
| QE-2 | Critical | Other app focused, press shortcut | Popup appears | [T06](./cases/shortcuts-and-input.md#t06--quick-entry-global-shortcut-unfocused), [S11](./cases/shortcuts-and-input.md#s11--quick-entry-shortcut-fires-from-any-focus-on-wayland-mutter-xwayland-key-grab) |
| QE-3 | Critical | App on a different workspace, press shortcut | Popup appears on current workspace | [T06](./cases/shortcuts-and-input.md#t06--quick-entry-global-shortcut-unfocused) |
| QE-4 | Critical | App closed-to-tray (no window mapped), press shortcut | Popup appears | [S29](./cases/shortcuts-and-input.md#s29--quick-entry-popup-is-created-lazily-on-first-shortcut-press-closed-to-tray-sanity) |
| QE-5 | Should | App quit entirely, press shortcut | No popup, no error, no zombie process | [S30](./cases/shortcuts-and-input.md#s30--quick-entry-shortcut-becomes-a-no-op-after-full-app-exit) |
| QE-6 | Should | Inspect Electron argv via `cat /proc/$(pgrep -f 'app\.asar')/cmdline \| tr '\0' ' '` (the launcher script also matches `claude-desktop`, so anchor on `app.asar` to hit the Electron process). Cross-check launcher log line `Using X11 backend via XWayland (for global hotkey support)` vs `Using native Wayland backend (global hotkeys may not work)` (verbatim from `scripts/launcher-common.sh:98, 102`). | **Pre-S12 fix:** flag absent; shortcut fails on GNOME Wayland (this is the #404 repro). **Post-S12 fix:** `--enable-features=GlobalShortcutsPortal` present in argv on GNOME Wayland; QE-2 / QE-3 begin to pass. | [S12](./cases/shortcuts-and-input.md#s12----enable-featuresglobalshortcutsportal-launcher-flag-wired-up-for-gnome-wayland) |
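
A minimal sketch of the QE-6 argv check, assuming the flag arrives as its own `--enable-features=...` argument (possibly alongside other features). `has_portal_flag` is a hypothetical helper, not something the repo ships; it reads a NUL-separated cmdline file so it can be pointed at `/proc/<pid>/cmdline` directly.

```bash
# Sketch only: has_portal_flag is a hypothetical helper name.
# Succeeds when the NUL-separated argv in $1 carries GlobalShortcutsPortal.
has_portal_flag() (
    cmdline_file=$1
    # Turn the NUL separators into newlines so each argv entry is one line,
    # then look for the feature inside an --enable-features=... argument.
    tr '\0' '\n' < "$cmdline_file" \
        | grep -E -q -- '^--enable-features=.*GlobalShortcutsPortal'
)

# Usage against the running app (assumes pgrep returns exactly one pid):
#   has_portal_flag "/proc/$(pgrep -f 'app\.asar')/cmdline" && echo present
```

Reading from a file rather than calling `pgrep` inside the helper keeps the check usable on a saved cmdline capture attached to an issue.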
### Submit → main window — covers #393
| ID | Severity | Step | Expected | Existing |
|----|----------|------|----------|----------|
| QE-7 | Smoke | Main window visible, submit prompt from QE | Popup closes; main window navigates to a **new** chat session (not appended to current chat — `ynt(e)` at `:515546` always creates new). | [S31](./cases/shortcuts-and-input.md#s31--quick-entry-submit-makes-the-new-chat-reachable-from-any-main-window-state) |
| QE-8 | Critical | Main window minimized, submit | **Upstream calls `show() + focus()` only — no `restore()`.** Whether the WM un-minimizes is compositor-dependent. Test as black-box: record whether the new chat is reachable to the user (window comes back to view, OR user has to click tray/dock to see it). Both outcomes are upstream-acceptable; only "new chat created but unreachable" is a regression. | [S31](./cases/shortcuts-and-input.md#s31--quick-entry-submit-makes-the-new-chat-reachable-from-any-main-window-state) |
| QE-9 | Critical | Main window hidden-to-tray (after [T08](./cases/tray-and-window-chrome.md#t08--hide-to-tray-on-close)), submit | Same as QE-8 — `show()` should re-map a hidden window on most compositors, but upstream doesn't guarantee it. The new chat must be reachable; the path to reach it (auto vs tray-click) is compositor-dependent. | [S31](./cases/shortcuts-and-input.md#s31--quick-entry-submit-makes-the-new-chat-reachable-from-any-main-window-state) |
| QE-10 | Should | Main window on different workspace, submit | **Upstream has no workspace logic** (no `setVisibleOnAllWorkspaces`, no `moveTop`). Outcome is whatever the compositor decides on `show()` + `focus()`. Record observed behavior per row; do not treat any single outcome as the "right" one. | [S31](./cases/shortcuts-and-input.md#s31--quick-entry-submit-makes-the-new-chat-reachable-from-any-main-window-state) |
| QE-11 | Critical | **GNOME-specific (Andrej730 repro):** App in tray, *not* present in Dash/dock, submit | Main window opens. The codebase doesn't reason about Dash presence — this is purely a compositor-observed state. The underlying failure is `BrowserWindow.isFocused()` returning stale-true on GNOME mutter, which causes the patched (KDE) code path's `h1() || ut.show()` chain to short-circuit before `show()`. Test as a black-box repro. | [S32](./cases/shortcuts-and-input.md#s32--quick-entry-submit-on-gnome-mutter-doesnt-trip-electron-stale-isfocused) |
| QE-12 | Should | App in tray, *also* present in Dash/dock, submit | Main window opens (this state should not trip the stale-focus bug, but verify) | [S32](./cases/shortcuts-and-input.md#s32--quick-entry-submit-on-gnome-mutter-doesnt-trip-electron-stale-isfocused) |
| QE-13 | Smoke | Submit prompt with 1-2 chars (`hi`) | Upstream silently drops. The actual gate is `> 2` chars at `index.js:515530, 515533` — anything 3+ submits. So `hi` (2) drops, `hel` (3) submits. Document, do not fix. | — |
### Visual / window appearance — covers #370
| ID | Severity | Step | Expected | Existing |
|----|----------|------|----------|----------|
| QE-14 | Should | Inspect popup background | Transparent; no opaque square frame visible behind the rounded UI. **Note:** upstream already sets `transparent: true` and `backgroundColor: "#00000000"` (`:515380, :515383`), so the #370 triage-bot suggestion to "try setting backgroundColor to transparent" is moot — those are already in place. The Electron 41.0.4 regression is at the CSD/shadow rendering layer below those flags, not at the option-passing layer. | [S10](./cases/shortcuts-and-input.md#s10--quick-entry-popup-is-transparent-no-opaque-square-frame) |
| QE-15 | Smoke | Inspect popup chrome | No titlebar, no close/min/max buttons (frameless) | — |
| QE-16 | Smoke | Inspect popup edges | Drop shadow + rounded corners render (compositor-dependent — note where missing) | — |
| QE-17 | Smoke | Open popup, then click on another window | Popup stays above (always-on-top) | — |
| QE-18 | Should | `electron --version` against the running app's bundled binary; record version in matrix | When > 41.0.4 ships and #370 still reproduces, the upstream-regression hypothesis is wrong | [S33](./cases/shortcuts-and-input.md#s33--quick-entry-transparent-rendering-tracked-against-bundled-electron-version) |
### Patch-application sanity — regression prevention
| ID | Severity | Step | Expected | Existing |
|----|----------|------|----------|----------|
| QE-19 | Critical | **All rows.** Extract the installed `app.asar` (`npx asar extract /usr/lib/claude-desktop/app.asar /tmp/inspect-installed`) and grep the bundled JS for the KDE gate string injected by the patch: `grep -c 'XDG_CURRENT_DESKTOP' /tmp/inspect-installed/.vite/build/index.js`. The patch (`scripts/patches/quick-window.sh:34-35, 117-118`) injects `(process.env.XDG_CURRENT_DESKTOP\|\|"").toLowerCase().includes("kde")` — that string is the runtime fingerprint. Note: the `Patched quick window` / `WARNING: No quick entry show() calls patched` lines from the patch are **build-time stdout** (not in `launcher.log`); check the build log if you built locally. | Bundled JS contains the KDE gate string (patch ran at build time). The patch ships in every build; the KDE-vs-non-KDE branch is decided at runtime by the env-var check. **Runtime gate effectiveness is verified implicitly by QE-7 through QE-12 passing on KDE and the unpatched-equivalent path running on non-KDE.** | [S09](./cases/shortcuts-and-input.md#s09--quick-window-patch-runs-only-on-kde-post-406-gate) |
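
The fingerprint check in QE-19 could be scripted roughly as follows. `has_kde_gate` is a hypothetical name, and the sketch assumes the gate string survives bundling byte-for-byte as quoted in the row above; if a minifier changes the quoting, the looser `XDG_CURRENT_DESKTOP` grep from the Step column is the safer probe.

```bash
# Sketch only: has_kde_gate is a hypothetical helper name.
# Succeeds when the extracted bundle in $1 contains the injected KDE gate.
has_kde_gate() (
    bundle_js=$1
    # -F matches the gate as a fixed string, so the parentheses, dots, and
    # pipes in the JavaScript expression are not read as regex syntax.
    grep -F -q -- '(process.env.XDG_CURRENT_DESKTOP||"").toLowerCase().includes("kde")' "$bundle_js"
)

# Usage after the asar extract step:
#   has_kde_gate /tmp/inspect-installed/.vite/build/index.js || echo "patch missing"
```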
### Input behavior smoke — catches collateral breakage
| ID | Severity | Step | Expected | Existing |
|----|----------|------|----------|----------|
| QE-21 | Smoke | In popup: `Esc` dismisses; click-outside dismisses; `Shift+Enter` inserts newline; `Enter` submits | All four behave as labelled. **Implementation notes for diagnostics:** click-outside is wired in the **main process** via the popup's `blur` handler (`:515465`). `Esc` / `Enter` / `Shift+Enter` are **renderer-side** (not visible in `index.js`); they go through IPC to `requestDismiss()` (`:515409`) and `requestDismissWithPayload()`. If a dismiss key fails, isolate which side is broken before reporting. | — |
### Popup placement & lifecycle — upstream contract sanity
These verify upstream-promised behaviors that aren't directly broken by #393/#404/#370 but live in the same surface area. Failures here would indicate a separate regression — file a new issue rather than folding it into the close-out trio.
| ID | Severity | Step | Expected | Existing |
|----|----------|------|----------|----------|
| QE-22 | Should | Invoke Quick Entry. Note popup position. Dismiss (Esc). Quit Claude Desktop entirely (`pkill -f app.asar` after closing the main window, or via tray → Quit). Re-launch. Invoke Quick Entry. | Popup reappears at the same monitor + position as before the restart. Upstream persists position via `an.get("quickWindowPosition")` (`:515491-515526`), keyed on monitor label + resolution. Position must survive a full app restart, not just dismiss/re-invoke. | [S35](./cases/shortcuts-and-input.md#s35--quick-entry-popup-position-is-persisted-across-invocations-and-across-app-restarts) |
| QE-23 | Smoke | **Multi-monitor required.** With an external monitor connected, invoke Quick Entry on the external monitor — let the position be saved (trigger QE-22's persistence path). Disconnect the external monitor (libvirt: `virsh detach-device` for the second display, or unplug the host monitor passing through). Invoke Quick Entry. | Popup falls back to the primary display via `cHn()` (`:515502`). Does **not** appear at off-screen coordinates. Skip this row in single-monitor VMs. | [S36](./cases/shortcuts-and-input.md#s36--quick-entry-popup-falls-back-to-primary-display-when-saved-monitor-is-gone) |
| QE-24 | Should | Launch app, focus main window, then **destroy** the main window without quitting the app. On this project the X button hide-to-tray override means the standard close path won't destroy `ut`; force the destroy via a) DevTools console (`Cmd+Opt+I` / `Ctrl+Shift+I` → `require('electron').remote.getCurrentWindow().destroy()` if exposed), or b) accept that this case is unreachable on Linux without a code change and skip. After destroy, invoke Quick Entry, type, submit. | Popup remains functional (lazy-recreation on shortcut press; the `!ut \|\| ut.isDestroyed()` guard at `:515595` skips the show/focus block but does not crash). New chat creation may not have a window to surface in; if the app remains running with no main window, this is the "popup outlives main" path upstream guarantees. **If unreachable on Linux, mark this row N/A and document why.** | [S37](./cases/shortcuts-and-input.md#s37--quick-entry-popup-remains-functional-after-main-window-destroy) |
## Scaffold integration
The `QE-*` items in [§ Test list](#test-list) map onto formal `S##` test cases in [`cases/shortcuts-and-input.md`](./cases/shortcuts-and-input.md):
| Case | Title | Backs |
|------|-------|-------|
| [S29](./cases/shortcuts-and-input.md#s29--quick-entry-popup-is-created-lazily-on-first-shortcut-press-closed-to-tray-sanity) | Popup created lazily on first shortcut press (closed-to-tray sanity) | QE-4 |
| [S30](./cases/shortcuts-and-input.md#s30--quick-entry-shortcut-becomes-a-no-op-after-full-app-exit) | Shortcut becomes no-op after full app exit | QE-5 |
| [S31](./cases/shortcuts-and-input.md#s31--quick-entry-submit-makes-the-new-chat-reachable-from-any-main-window-state) | Submit makes the new chat reachable from any main-window state | QE-7 through QE-10 |
| [S32](./cases/shortcuts-and-input.md#s32--quick-entry-submit-on-gnome-mutter-doesnt-trip-electron-stale-isfocused) | Submit on GNOME mutter doesn't trip Electron stale-`isFocused()` | QE-11, QE-12 |
| [S33](./cases/shortcuts-and-input.md#s33--quick-entry-transparent-rendering-tracked-against-bundled-electron-version) | Transparent rendering tracked against bundled Electron version | QE-18 |
| [S34](./cases/shortcuts-and-input.md#s34--quick-entry-shortcut-focuses-fullscreen-main-window-instead-of-showing-popup) | Shortcut focuses fullscreen main instead of showing popup | QE-1b |
| [S35](./cases/shortcuts-and-input.md#s35--quick-entry-popup-position-is-persisted-across-invocations-and-across-app-restarts) | Popup position persisted across invocations and across app restarts | QE-22 |
| [S36](./cases/shortcuts-and-input.md#s36--quick-entry-popup-falls-back-to-primary-display-when-saved-monitor-is-gone) | Popup falls back to primary display when saved monitor is gone | QE-23 |
| [S37](./cases/shortcuts-and-input.md#s37--quick-entry-popup-remains-functional-after-main-window-destroy) | Popup remains functional after main window destroy | QE-24 |
QE-13, QE-15, QE-16, QE-17, and QE-21 are visual / input checks with no formal S-ID — run them by eye against [§ Upstream design intent](#upstream-design-intent).

340
docs/testing/runbook.md Normal file

@@ -0,0 +1,340 @@
# Testing Runbook
*Last updated: 2026-05-03*
How to run a test sweep, capture diagnostics, file failures, and update [`matrix.md`](./matrix.md). For the test specs themselves, see [`cases/`](./cases/). For the automation harness, see [`automation.md`](./automation.md) and [`tools/test-harness/`](../../tools/test-harness/). For the grounding sweep workflow (verify case docs against the live build), see [Grounding sweep](#grounding-sweep) below.
## When to sweep
| Trigger | Scope | Rows |
|---------|-------|------|
| Release tag (`vX.Y.Z+claude...`) | Smoke set | KDE-W + Hypr-N (or Sway) |
| Release tag, monthly | Smoke + Critical | All active rows |
| Upstream Claude Desktop bump | Smoke set + [grounding sweep](#grounding-sweep) | KDE-W + one wlroots row |
| PR touching `scripts/patches/*.sh` | Tests in the affected surface (use surface tags in cases files) | KDE-W minimum |
| Bug report citing an env | The relevant test on the reporter's row | Just that row |
## Setup: VM matrix
Each non-host row in [`matrix.md`](./matrix.md) is a QEMU/KVM guest. Standard config:
- 4 GB RAM, 2 vCPU minimum
- virtio-gpu **with** `gl=on` (3D acceleration). On hybrid GPU hosts, pin `rendernode=/dev/dri/renderD129` (AMD); avoid renderD128 (NVIDIA, EGL init fails on aaddrick's laptop)
- 32 GB qcow2 disk
- Bridged networking
- Virgil 3D enabled where possible (helps WebGL detection in T12)
ISOs / images per row:
| Row | Source |
|-----|--------|
| Fedora 43 (KDE-W, KDE-X, GNOME, Sway, i3, Niri) | https://fedoraproject.org/spins/ for KDE/GNOME, https://fedoraproject.org/sericea/ for Sway, manual install for i3/Niri |
| Ubuntu 24.04 (Ubu) | https://ubuntu.com/download/desktop |
| OmarchyOS (Hypr-O) | https://omarchy.org |
| NixOS (Hypr-N) | https://nixos.org/download with Hyprland module |
For the host (KDE-W), test against Nobara directly — no VM needed.
## Setup: building the install candidate
```bash
# Build from the branch under test
./build.sh --build appimage --clean no
./build.sh --build deb --clean no
./build.sh --build rpm --clean no
# Or pull from CI artifacts for a tagged release
gh run download <RUN_ID> -n claude-desktop-deb-amd64
gh run download <RUN_ID> -n claude-desktop-rpm-amd64
gh run download <RUN_ID> -n claude-desktop-appimage-amd64
```
Drop the resulting `.deb` / `.rpm` / `.AppImage` into a shared folder mounted into each guest, or `scp` per-guest.
## Running a sweep: the standard loop
For each test in scope:
1. **Read the test spec** in `cases/<surface>.md` (or `ui/<surface>.md` for UI checklists). Note the `Severity`, `Steps`, and `Expected` sections.
2. **Execute the steps** as described.
3. **Compare against Expected.** Mark the result as `✓`, `✗`, `🔧`, `?` (untested, for example because you couldn't run it for env reasons), or `-` (N/A).
4. **On `✗`**: capture the diagnostics from the test's `Diagnostics on failure` block (see [diagnostic capture](#diagnostic-capture) below). File an issue if one isn't already linked.
5. **Update [`matrix.md`](./matrix.md)** in a single PR per row per sweep, titled `test: <ROW> sweep YYYY-MM-DD`.
## Diagnostic capture
Standard captures referenced from test `Diagnostics on failure` blocks:
### `--doctor` output
```bash
claude-desktop --doctor 2>&1 | tee /tmp/doctor.txt
```
Or for AppImage:
```bash
./claude-desktop-*.AppImage --doctor 2>&1 | tee /tmp/doctor.txt
```
### Launcher log
```bash
cat ~/.cache/claude-desktop-debian/launcher.log
```
Truncate and re-run if the file is stale:
```bash
: > ~/.cache/claude-desktop-debian/launcher.log
claude-desktop 2>&1 | tee -a ~/.cache/claude-desktop-debian/launcher.log
```
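
To record which backend the launcher chose, one option is to classify the log by the two backend lines that `scripts/launcher-common.sh` emits (`Using X11 backend via XWayland ...` and `Using native Wayland backend ...`). `launcher_backend` is a hypothetical helper, shown as a sketch only.

```bash
# Sketch only: launcher_backend is a hypothetical helper name.
# Prints x11, wayland, or unknown for the launcher log passed as $1.
launcher_backend() (
    log_file=$1
    # Keep only the last backend line so an older launch in the same log
    # does not override the most recent one.
    line=$(grep -E 'Using (X11 backend via XWayland|native Wayland backend)' "$log_file" | tail -n 1)
    case $line in
        *'Using X11 backend via XWayland'*) echo x11 ;;
        *'Using native Wayland backend'*)   echo wayland ;;
        *)                                  echo unknown ;;
    esac
)

# Usage:
#   launcher_backend ~/.cache/claude-desktop-debian/launcher.log
```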
### Session env
```bash
echo "XDG_SESSION_TYPE=$XDG_SESSION_TYPE"
echo "XDG_CURRENT_DESKTOP=$XDG_CURRENT_DESKTOP"
echo "WAYLAND_DISPLAY=$WAYLAND_DISPLAY"
echo "DISPLAY=$DISPLAY"
echo "GDK_BACKEND=$GDK_BACKEND"
echo "QT_QPA_PLATFORM=$QT_QPA_PLATFORM"
echo "OZONE_PLATFORM=$OZONE_PLATFORM"
echo "ELECTRON_OZONE_PLATFORM_HINT=$ELECTRON_OZONE_PLATFORM_HINT"
```
### Tray / DBus state (KDE)
```bash
# List registered tray icons
gdbus call --session --dest=org.kde.StatusNotifierWatcher \
    --object-path=/StatusNotifierWatcher \
    --method=org.freedesktop.DBus.Properties.Get \
    org.kde.StatusNotifierWatcher RegisteredStatusNotifierItems
# Find which process owns a connection
gdbus call --session --dest=org.freedesktop.DBus \
    --object-path=/org/freedesktop/DBus \
    --method=org.freedesktop.DBus.GetConnectionUnixProcessID ":1.XXXX"
```
### Portal availability (Wayland)
```bash
systemctl --user status xdg-desktop-portal
busctl --user tree org.freedesktop.portal.Desktop
```
### Suspend inhibitors
```bash
systemd-inhibit --list
```
### App version
```bash
claude-desktop --version
gh variable get CLAUDE_DESKTOP_VERSION
gh variable get REPO_VERSION
```
Always include the upstream version + project version in the issue body and the matrix-update commit message.
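
As a sketch, a release tag of the form `vX.Y.Z+claude<upstream>` can be split into the two versions with plain parameter expansion. `split_version` is a hypothetical helper, and the tag format is assumed from the examples in this runbook.

```bash
# Sketch only: split_version is a hypothetical helper name.
# Splits "<project>+claude<upstream>" into its two halves.
split_version() (
    tag=$1
    case $tag in
        *+claude*) ;;
        *) echo "not a <project>+claude<upstream> tag: $tag" >&2; exit 1 ;;
    esac
    project=${tag%%+claude*}    # everything before the first "+claude"
    upstream=${tag#*+claude}    # everything after the first "+claude"
    printf 'project=%s upstream=%s\n' "$project" "$upstream"
)

# Usage:
#   split_version 'v1.3.23+claude1.4758.0'
```

The output is meant to be pasted into the issue body or the matrix-update commit message rather than parsed further.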
## Filing failures
Issue title format: `[<row>] <T## or S##>: <one-line symptom>`
Issue body template:
```markdown
**Test:** [T17 — Folder picker opens](./docs/testing/cases/code-tab-foundations.md#t17--folder-picker-opens)
**Environment:** GNOME (Fedora 43, Wayland)
**Project version:** v1.3.23+claude1.4758.0
**Upstream version:** 1.4758.0
## Steps
<paste from test spec>
## Expected
<paste from test spec>
## Actual
<observed behavior>
## Diagnostics
<--doctor output, launcher log, session env, anything else from the test's Diagnostics block>
## Notes
<any hypotheses, related PRs, recent regressions>
```
Link the issue back into [`matrix.md`](./matrix.md) on the affected cell using the standard format: `✗ #NNN`.
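
A quick way to sanity-check a title against the format above, as a sketch: `valid_issue_title` is a hypothetical helper, and the pattern assumes row names contain no `]` and test IDs are `T` or `S` followed by at least two digits.

```bash
# Sketch only: valid_issue_title is a hypothetical helper name.
# Succeeds when $1 looks like "[<row>] <T## or S##>: <one-line symptom>".
valid_issue_title() (
    title=$1
    # [<row>], a space, T## or S##, ": ", then a non-empty symptom.
    printf '%s\n' "$title" | grep -E -q '^\[[^]]+\] [TS][0-9]{2,}: .+'
)

# Usage:
#   valid_issue_title '[GNOME] T17: folder picker never opens' || echo "fix the title"
```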
## Updating the matrix
One PR per sweep per row. Bundle every status change for that row into a single commit so the matrix history reads as a sequence of sweep events, not individual cell flips.
Commit message template:
```
test(<row>): sweep <YYYY-MM-DD> — <project_version>+claude<upstream_version>
- T01 ? → ✓
- T03 ? → ✓
- T05 ? → ✗ (filed #NNN)
- T17 ? → ✓
- ...
```
If the same sweep also turned up new tests worth adding, those go in a separate commit before the status update so the diff stays focused.
## Severity guidance for new tests
When adding a test to `cases/` or `ui/`, pick severity using these heuristics:
| Tier | Pick when | Example |
|------|-----------|---------|
| Smoke | First-launch experience; if this fails the app is unusable for normal users | T01 (app launch), T03 (tray), T16 (Code tab loads) |
| Critical | Feature is documented in upstream docs **and** breaks core workflows when broken | T22 (PR monitoring), T34 (connector OAuth), T17 (folder picker) |
| Should | Quality-of-life or documented edge case; users hit it but have a workaround | T28 (catch-up after suspend), S26 (auto-update vs apt) |
| Could | Niche, env-specific, or graceful-degradation checks | T39 (`/desktop` CLI N/A), S22 (computer-use toggle absent on Linux) |
When in doubt, file as **Should**. Smoke and Critical mean release gates — be conservative about adding gates.
## Adding a new test
1. Pick the right surface file in `cases/` (or create one with prior buy-in if no existing surface fits — don't sprinkle new files lightly).
2. Use the next free ID: highest `T##` + 1 for cross-env, highest `S##` + 1 for env-specific. Don't reuse retired IDs.
3. Follow the standard structure: `**Severity:**`, `**Surface:**`, `**Applies to:**`, `**Steps:**`, `**Expected:**`, `**Diagnostics on failure:**`, `**References:**`.
4. Add the row to [`matrix.md`](./matrix.md) with all-`?` initial state.
5. Mention the new test in the PR description so reviewers know to read the spec.
For UI checklist additions, append rows to the relevant `ui/<surface>.md` table. UI rows don't need `T##` / `S##` IDs — the surface file + element name is the identity.
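
Step 2 can be approximated with a scan of the case files. This is a sketch under assumptions: `next_free_id` is a hypothetical helper, it treats any `T<digits>` or `S<digits>` token in `cases/*.md` as an ID in use, and it only protects retired IDs that are still mentioned somewhere in those files.

```bash
# Sketch only: next_free_id is a hypothetical helper name.
# Prints the next unused ID for prefix $1 (T or S) across $2/*.md.
next_free_id() (
    prefix=$1       # T for cross-env, S for env-specific
    cases_dir=$2
    # Collect every <prefix><digits> token that is not glued to a preceding
    # letter or digit, keep only the digits, and strip leading zeros so the
    # arithmetic below never reads a value such as 08 as octal.
    highest=$(cat "$cases_dir"/*.md \
        | grep -oE "(^|[^A-Za-z0-9])${prefix}[0-9]+" \
        | tr -cd '0-9\n' \
        | sed 's/^0*//' \
        | sort -n | tail -n 1)
    printf '%s%02d\n' "$prefix" "$(( ${highest:-0} + 1 ))"
)

# Usage:
#   next_free_id T docs/testing/cases
```

Check the result against [`matrix.md`](./matrix.md) before using it, since an ID retired from the case files may still have a matrix row.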
## Automated runs
The harness at [`tools/test-harness/`](../../tools/test-harness/) drives any
test with a `runner:` field. As of 2026-04-30, that's T01, T03, T04, T17.
### Invoking a sweep
```sh
cd tools/test-harness
npm install # first time only
ROW=KDE-W ./orchestrator/sweep.sh
```
Output:
- `results/results-${ROW}-${DATE}/junit.xml` — the JUnit summary (one
testsuite per `.spec.ts` file, with the test's annotations preserved as
metadata).
- `results/results-${ROW}-${DATE}/test-output/<test>/` — per-test
attachments (screenshots, launcher log, session env, frame extents,
click-attempt diagnostics, etc.). Captured on every run, not just on
failure (Decision 7).
- `results/results-${ROW}-${DATE}/html/` — Playwright's HTML report.
- `results/results-${ROW}-${DATE}.tar.zst` — bundled artifact for
off-machine inspection (when `zstd` is available).
`sweep.sh` prints a summary line at the end:
```
summary: tests=4 failures=0 errors=0 skipped=1
```
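
If a wrapper script needs to gate on that line, a sketch like the following would do. `summary_ok` is a hypothetical helper; it assumes the `failures=` and `errors=` fields keep the format shown above and treats an unparseable line as a failed sweep.

```bash
# Sketch only: summary_ok is a hypothetical helper name.
# Succeeds when the sweep summary line in $1 reports no failures or errors.
summary_ok() (
    line=$1
    failures=$(printf '%s\n' "$line" | sed -n 's/.*failures=\([0-9][0-9]*\).*/\1/p')
    errors=$(printf '%s\n' "$line" | sed -n 's/.*errors=\([0-9][0-9]*\).*/\1/p')
    # A line we cannot parse counts as a failed sweep, not a clean one.
    [ -n "$failures" ] && [ -n "$errors" ] || exit 2
    [ "$failures" -eq 0 ] && [ "$errors" -eq 0 ]
)

# Usage:
#   summary_ok 'summary: tests=4 failures=0 errors=0 skipped=1' && echo clean
```

Skipped tests do not fail the gate here, because a skip can mean either N/A or untested and needs a human decision.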
### Translating results to the matrix
JUnit `<failure>` → `✗`, `<error>` (harness broke) → `?`, `<skipped>` →
`-` (when intentionally not applicable) or stays `?` (when the test
couldn't reach an assertion, the common case for renderer tests that need
sign-in or selectors that haven't been tuned). For now this mapping is
manual: open `junit.xml`, update `matrix.md` cells, commit. A
`render-matrix.sh` to do this automatically is on the to-do list.
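
Until that script exists, the mapping can be sketched as a small lookup. Everything here is an assumption for illustration: `junit_to_cell` is a hypothetical helper, the `na` reason argument is invented to carry the "intentionally not applicable" judgment, and a passing testcase is assumed to map to `✓`.

```bash
# Sketch only: junit_to_cell is a hypothetical helper name.
# Maps a JUnit outcome ($1) and optional skip reason ($2) to a matrix cell.
junit_to_cell() (
    outcome=$1        # passed | failure | error | skipped
    reason=${2:-}     # for skipped: "na" when intentionally not applicable
    case $outcome in
        passed)  echo '✓' ;;
        failure) echo '✗' ;;
        error)   echo '?' ;;    # harness broke, so the cell stays untested
        skipped) if [ "$reason" = na ]; then echo '-'; else echo '?'; fi ;;
        *)       echo "unknown outcome: $outcome" >&2; exit 1 ;;
    esac
)

# Usage:
#   junit_to_cell skipped na
```

The skip reason stays a manual input because `junit.xml` alone cannot tell an N/A skip from one that never reached an assertion.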
### Coexistence with manual tests
Tests without a `runner:` continue to flow through the manual loop above.
The matrix doesn't distinguish automated from manual cells — a `✓` is a
`✓` regardless of how it was produced. The `runner:` field on each case
makes the source-of-truth explicit per-test.
### Path through the CDP auth gate (why this works)
The shipped Electron exits if `--remote-debugging-port` is on argv
without a valid `CLAUDE_CDP_AUTH` token. Both `_electron.launch()` and
`chromium.connectOverCDP()` inject that flag. The harness sidesteps the
gate by spawning Electron clean and attaching the Node inspector via
`SIGUSR1` at runtime — same code path as `Developer → Enable Main
Process Debugger`. From there, main-process JS evaluation reaches the
renderer through `webContents.executeJavaScript()`. Full writeup:
[`automation.md`](./automation.md#the-cdp-auth-gate-and-the-runtime-attach-workaround-that-beats-it).
### Wayland-mode sweep
Default backend is X11-via-XWayland (matches `launcher-common.sh`'s
default). To sweep the suite under native Wayland, set
`CLAUDE_HARNESS_USE_WAYLAND=1`:
```sh
CLAUDE_HARNESS_USE_WAYLAND=1 ROW=KDE-W ./orchestrator/sweep.sh
```
Every `launchClaude()` call then swaps to the Wayland flag set
(`--ozone-platform=wayland` + WaylandWindowDecorations / IME /
text-input-version=3, mirroring `scripts/launcher-common.sh:132-139`) and
exports `CLAUDE_USE_WAYLAND=1` + `GDK_BACKEND=wayland` into the spawn
env. Per-launch overrides via `launchClaude({ extraEnv })` still win,
so a single test can opt back to X11 inside a Wayland-mode sweep.
Caveat: T04 (`_NET_FRAME_EXTENTS` xprop check) only works under
XWayland — native-Wayland sessions have no X11 client list, so T04
will skip with a "no X11 client list" diagnostic.
## Grounding sweep
Separate from the test sweep. Where the test sweep verifies *upstream
Linux compat behavior* against case specs, the grounding sweep
verifies *the specs themselves* against upstream behavior — making
sure the Steps and Expected fields haven't bit-rotted past what the
shipped build actually does. Run on every upstream `CLAUDE_DESKTOP_VERSION`
bump.
### Static pass
For each file under [`cases/`](./cases/), confirm every test's
`**Code anchors:**` field still resolves and the Steps/Expected match
behavior. The convention is documented in
[`cases/README.md`](./cases/README.md#anchor-scope) — anchors are
either upstream code (`build-reference/app-extracted/.vite/build/`),
wrapper scripts (`scripts/`), v7 walker inventory, or out-of-scope
(CLI binary, server-rendered SPA).
When a test drifts, edit Steps/Expected in place. When a feature is
gone from the build, prepend
`> **⚠ Missing in build X.Y.Z** — <note>. Re-verify after next
upstream bump.` under the test heading.
### Runtime pass
Run [`tools/test-harness/grounding-probe.ts`](../../tools/test-harness/grounding-probe.ts)
against the live build:
```sh
cd tools/test-harness
npm run grounding-probe -- --launch --include-synthetic \
--out ../../docs/testing/cases-grounding-runtime.json
```
Captures runtime state for tests where static greps can't disambiguate
(IPC handler registry, `globalShortcut.isRegistered()` for known
accelerators, `app.getLoginItemSettings()`, `safeStorage`,
`autoUpdater.getFeedURL()`, SNI tray registration, AX-tree fingerprint
of whatever's on screen). Output is keyed by test ID — diff against
the previous version's capture to spot drift the static pass missed.
Surfaces inside modals or popups (T22 PR toolbar, T26 preset list,
T31 side chat, T32 slash menu) need the surface open at probe time.
Open the relevant view in the running app before re-running with
`--port 9229` (attach mode).

5
tools/test-harness/.gitignore vendored Normal file

@@ -0,0 +1,5 @@
node_modules/
results/
*.log
.DS_Store
package-lock.json


@@ -0,0 +1,474 @@
# Linux Compatibility Test Harness
In-VM (or on-host) Playwright + DBus runner for the test cases under
[`docs/testing/cases/`](../../docs/testing/cases/). See
[`docs/testing/automation.md`](../../docs/testing/automation.md) for the
architecture, decisions, and rationale.
## Status
Seventy-four specs wired (36 cross-env T-tests, 33 env-specific S-tests,
5 H-prefix harness self-tests).
| Test | What it checks | Layer |
|------|----------------|-------|
| [T01](../../docs/testing/cases/launch.md#t01--app-launch) | X11 window with our pid appears within 15s; title matches `/claude/i` | L2 (xprop) |
| [T02](../../docs/testing/cases/launch.md#t02--doctor-health-check) | `claude-desktop --doctor` exits 0 | spawn probe |
| [T03](../../docs/testing/cases/tray-and-window-chrome.md#t03--tray-icon-present) | A `StatusNotifierItem` is registered by the claude-desktop pid, and exactly one such item exists (no rebuild-race duplicates) | L2 (DBus) |
| [T04](../../docs/testing/cases/tray-and-window-chrome.md#t04--window-decorations-draw) | Window has `_NET_FRAME_EXTENTS` (sum > 0) and a "Claude" title | L2 (xprop) |
| [T05](../../docs/testing/cases/shortcuts-and-input.md#t05--claude-url-handler) | `xdg-open 'claude://...'` delivers via `app.on('second-instance')` to the running app | spawn + L1 hook |
| [T06](../../docs/testing/cases/shortcuts-and-input.md#t06--quick-entry-global-shortcut-unfocused) | `globalShortcut.isRegistered('Ctrl+Alt+Space')` returns true after `mainVisible` | L1 |
| [T07](../../docs/testing/cases/tray-and-window-chrome.md#t07--in-app-topbar) | Five topbar buttons render with non-zero rects (uses `seedFromHost` for hermetic auth) | L1 + DOM |
| [T08](../../docs/testing/cases/tray-and-window-chrome.md#t08--close-x-hides-to-tray) | `win.close()` fires the wrapper interceptor; window hidden, proc alive | L1 |
| [T09](../../docs/testing/cases/platform-integration.md#t09--autostart-via-xdg) | `setLoginItemSettings({ openAtLogin })` writes/removes `$XDG_CONFIG_HOME/autostart/claude-desktop.desktop` | L1 + filesystem |
| [T10](../../docs/testing/cases/platform-integration.md#t10--cowork-integration) | After H04-style spawn detection, `kill -9` the daemon and confirm a *different* pid respawns within ~20s (Patch 6 cooldown + retry) | pgrep delta + spawn delta |
| [T11](../../docs/testing/cases/extensibility.md#t11--plugin-install) | Plugin-install code path fingerprints present in bundled `index.js` | file probe |
| [T11_runtime](../../docs/testing/cases/extensibility.md#t11--plugin-install) | After `seedFromHost` + `userLoaded`, the install-flow eipc surface (`installPlugin`, `uninstallPlugin`, `updatePlugin`, `listInstalledPlugins`, `LocalPlugins/getPlugins` — five-suffix presence probe) is registered on the claude.ai webContents AND BOTH read-side handlers across the two impl objects are callable through the renderer-side wrapper: `CustomPlugins/listInstalledPlugins([])` returns array shape (drives Manage plugins panel), `LocalPlugins/getPlugins()` returns array shape (reads `~/.claude/plugins/installed_plugins.json` per case-doc :465822) — Tier 2 reframe of T11 (case-doc anchor :507181) | L1 (eipc registry + invoke) |
| [T12](../../docs/testing/cases/platform-integration.md#t12--webgl-warn-only) | `app.getGPUFeatureStatus()` returns a populated object; renderer reached visible | L1 |
| [T13](../../docs/testing/cases/launch.md#t13--doctor-reports-correct-package-format) | `--doctor` does not false-flag rpm/deb installs as missing-dpkg AppImage | spawn + stdout grep |
| [T14a](../../docs/testing/cases/launch.md#t14--multi-instance-behavior) | `requestSingleInstanceLock` + `'second-instance'` strings in bundled `index.js` (file probe) | file probe |
| [T14b](../../docs/testing/cases/launch.md#t14--multi-instance-behavior) | Second invocation under same isolation exits cleanly; primary pid stays alive (runtime probe) | spawn delta + pgrep |
| [T16](../../docs/testing/cases/code-tab-foundations.md#t16--code-tab-loads) | After `seedFromHost` + `userLoaded`, `CodeTab.activate()` resolves and ≥1 compact pill renders (env pill = Code-body mounted) | L1 + AX-tree |
| [T17](../../docs/testing/cases/code-tab-foundations.md#t17--folder-picker-opens) | After `seedFromHost` + `userLoaded`, Code df-pill → env pill → Local → Select folder → Open folder triggers `dialog.showOpenDialog` (mock installed via `installOpenDialogMock`); skips cleanly when host has no signed-in Claude config | L1 + AX-tree |
| [T18](../../docs/testing/cases/code-tab-foundations.md#t18--drag-and-drop-files-into-prompt) | Bundled `mainView.js` preload contains the path-resolution bridge fingerprints: `getPathForFile` (2× — property key + the `webUtils.getPathForFile(` call, both at case-doc :9267), `webUtils`, `filePickers`, and the `claudeAppSettings` `contextBridge.exposeInMainWorld` namespace (case-doc :9552) — pins the load-bearing wiring without faking OS-level XDND drag (xdotool can't put file URIs on the X11 selection; Wayland needs per-compositor IPC + libei) | file probe |
| [T19](../../docs/testing/cases/code-tab-foundations.md#t19--integrated-terminal) | After `seedFromHost` + `userLoaded`, the integrated-terminal eipc surface (`startShellPty`, `writeShellPty`, `stopShellPty`, `resizeShellPty`, `getShellPtyBuffer` — five-suffix presence probe) is registered on the claude.ai webContents AND the foundational `LocalSessions/getAll` returns array shape (Tier 2 reframe of the case-doc T19 case; case-doc anchors are write-side `startShellPty` etc. so reframe asserts the FULL terminal IPC surface registers + a stateless read-side surrogate is invocable) | L1 (eipc registry + invoke) |
| [T20](../../docs/testing/cases/code-tab-foundations.md#t20--file-pane-opens-and-saves) | After `seedFromHost` + `userLoaded`, the file-pane eipc surface (`readSessionFile`, `writeSessionFile`, `pickSessionFile` — three-suffix presence probe) is registered on the claude.ai webContents AND the foundational `LocalSessions/getAll` returns array shape (Tier 2 reframe of the case-doc T20 case; the case-doc's `readSessionFile` anchor is read-side but needs (sessionId, path) args not constructible from a fresh isolation, so the registration probe + foundational `getAll` invocation is the strongest non-destructive Tier 2 layer) | L1 (eipc registry + invoke) |
| [T21](../../docs/testing/cases/code-tab-workflow.md#t21--dev-server-preview-pane) | After `seedFromHost` + `userLoaded`, the preview-pane eipc surface (`getConfiguredServices`, `startFromConfig`, `stopServer`, `getAutoVerify`, `capturePreviewScreenshot` — five-suffix presence probe) is registered on the claude.ai webContents AND BOTH case-doc-anchored read-side handlers are callable through the renderer-side wrapper: `getConfiguredServices(cwd)` returns array shape, `getAutoVerify(cwd)` returns boolean shape (Tier 2 reframe of the case-doc T21 case; cwd validator is `typeof cwd === 'string'` only, smoke-tested session 11) | L1 (eipc registry + invoke) |
| [T22](../../docs/testing/cases/code-tab-workflow.md#t22--pr-monitoring-via-gh) | Bundled `index.js` contains `LocalSessions_$_getPrChecks` eipc channel name *and* `gh CLI not found in PATH` Linux-fallthrough throw site (Tier 1 fingerprint) | file probe |
| [T22b](../../docs/testing/cases/code-tab-workflow.md#t22--pr-monitoring-via-gh) | After `seedFromHost` + `userLoaded`, the `LocalSessions_$_getPrChecks` eipc handler is registered on the claude.ai webContents (`webContents.ipc._invokeHandlers` — Tier 2 runtime probe sibling of T22, strictly stronger than the bundle-string fingerprint) | L1 (eipc registry) |
| [T23](../../docs/testing/cases/code-tab-handoff.md#t23--desktop-notifications-fire) | Firing `new Notification({title})` from main reaches the session bus's `org.freedesktop.Notifications.Notify` (observed via `dbus-monitor`) | L1 + DBus subprocess |
| [T24](../../docs/testing/cases/code-tab-handoff.md#t24--open-in-external-editor) | After `installOpenExternalMock` mirroring T25's pattern, `evalInMain` calls `shell.openExternal('vscode://file/...')`; mock records the URL verbatim, no real editor launch | L1 (mocked egress) |
| [T25](../../docs/testing/cases/code-tab-handoff.md#t25--show-in-files--file-manager) | After `installShowItemInFolderMock` mirroring T17's dialog-mock pattern, `evalInMain` calls `shell.showItemInFolder(<synthetic path>)`; mock records the call verbatim, no throw — no host side effect | L1 (mocked egress) |
| [T26](../../docs/testing/cases/routines.md#t26--routines-page-renders) | After `seedFromHost` + `userLoaded`, click "Routines" sidebar AX button; assert "New routine" / "All" / "Calendar" anchor renders | L1 + AX-tree |
| [T27](../../docs/testing/cases/routines.md#t27--scheduled-task-fires-and-notifies) | After `seedFromHost` + `userLoaded`, both Cowork and CCD `getAllScheduledTasks` eipc handlers are registered AND callable through the renderer-side wrapper, returning array shape — Tier 2 reframe of the case-doc T27 case | L1 (eipc invoke) |
| [T30](../../docs/testing/cases/code-tab-workflow.md#t30--auto-archive-on-pr-merge) | Bundled `index.js` colocates the auto-archive sweep cadence (`300*1e3`, `3600*1e3`, `AutoArchiveEngine`) with the `ccAutoArchiveOnPrClose` gate key (single-regex multi-string fingerprint) | file probe |
| [T31](../../docs/testing/cases/code-tab-workflow.md#t31--side-chat-opens) | Bundled `index.js` contains all three side-chat eipc channel names (`startSideChat`, `sendSideChatMessage`, `stopSideChat`) — load-bearing trio | file probe |
| [T31b](../../docs/testing/cases/code-tab-workflow.md#t31--side-chat-opens) | After `seedFromHost` + `userLoaded`, all three side-chat eipc handlers (`startSideChat`, `sendSideChatMessage`, `stopSideChat`) are registered on the claude.ai webContents — load-bearing trio (Tier 2 runtime sibling of T31) | L1 (eipc registry) |
| [T32](../../docs/testing/cases/code-tab-workflow.md#t32--slash-command-menu) | Bundled `index.js` contains `LocalSessions_$_getSupportedCommands` eipc channel + `slashCommands` schema field | file probe |
| [T33](../../docs/testing/cases/extensibility.md#t33--plugin-browser) | Bundled `index.js` contains `CustomPlugins_$_listMarketplaces` and `CustomPlugins_$_listAvailablePlugins` eipc channel names (browser populate flow) | file probe |
| [T33b](../../docs/testing/cases/extensibility.md#t33--plugin-browser) | After `seedFromHost` + `userLoaded`, both plugin-browser eipc handlers (`listMarketplaces`, `listAvailablePlugins`) are registered on the claude.ai webContents — load-bearing pair (Tier 2 runtime sibling of T33) | L1 (eipc registry) |
| [T33c](../../docs/testing/cases/extensibility.md#t33--plugin-browser) | After `seedFromHost` + `userLoaded`, both plugin-browser eipc handlers (`listMarketplaces`, `listAvailablePlugins`) are callable through the renderer-side wrapper with `args = [[]]` (empty `egressAllowedDomains`), each returning array shape — Tier 2 invocation upgrade of T33b, strictly stronger than registration alone | L1 (eipc invoke) |
| [T35](../../docs/testing/cases/extensibility.md#t35--mcp-server-config-picked-up) | Bundled `index.js` contains the four-needle MCP-config separation fingerprint: `claude_desktop_config.json` (chat-tab path), `.claude.json` + `.mcp.json` (Code-tab loaders), `"user","project","local"` (settingSources triple Code-session passes to the agent SDK) — pins per-tab separation without launch | file probe |
| [T35b](../../docs/testing/cases/extensibility.md#t35--mcp-server-config-picked-up) | After `seedFromHost` + `userLoaded`, the `claude.settings/MCP/getMcpServersConfig` eipc handler is registered AND callable through the renderer-side wrapper, returning a non-array object (Tier 2 runtime sibling of T35, strictly stronger than the bundle-string fingerprint) | L1 (eipc invoke) |
| [T36](../../docs/testing/cases/extensibility.md#t36--hooks-fire) | Bundled `index.js` contains the hooks runtime fingerprint: `hook_started` / `hook_progress` / `hook_response` (single-occurrence Verbose-transcript runtime emits) plus `PreToolUse` / `UserPromptSubmit` registry tokens — pins the runtime hook-fire path the case-doc Verbose-transcript claim hangs on | file probe |
| [T37](../../docs/testing/cases/extensibility.md#t37--claudemd-memory-loads) | Bundled `index.js` contains `[GlobalMemory] Copied CLAUDE.md` log line + `CLAUDE.md` filename literal + `CLAUDE_CONFIG_DIR` env-var token (memory-loading wiring) | file probe |
| [T37b](../../docs/testing/cases/extensibility.md#t37--claudemd-memory-loads) | After `seedFromHost` + `userLoaded`, the `claude.web/CoworkMemory/readGlobalMemory` eipc handler is registered AND callable through the renderer-side wrapper, returning the documented `string \| null` shape (Tier 2 runtime sibling of T37) | L1 (eipc invoke) |
| [T38](../../docs/testing/cases/code-tab-handoff.md#t38--continue-in-ide) | Bundled `index.js` contains `LocalSessions_$_openInEditor` eipc channel name (Tier 1 fingerprint) | file probe |
| [T38b](../../docs/testing/cases/code-tab-handoff.md#t38--continue-in-ide) | After `seedFromHost` + `userLoaded`, the `LocalSessions_$_openInEditor` eipc handler is registered on the claude.ai webContents (Tier 2 runtime sibling of T38) | L1 (eipc registry) |
| H01 | CDP auth gate exits with code 1 when spawned with `--remote-debugging-port` and no `CLAUDE_CDP_AUTH` token | spawn probe |
| H02 | `frame-fix-wrapper.js` + `frame-fix-entry.js` injected into `app.asar` (Proxy + main-field reference) | file probe |
| H03 | Build-pipeline patch fingerprints all present in `app.asar` (KDE gate, frame-fix inject, tray, cowork, claude-code) | file probe |
| H04 | cowork daemon spawns under app and exits with app — soft-skips on rows where it isn't gated to spawn | pgrep delta |
| H05 | UI-drift canary against the AX-tree fingerprint walker (requires `CLAUDE_TEST_USE_HOST_CONFIG=1`) | L1 (AX) |
| [S01](../../docs/testing/cases/distribution.md#s01--appimage-launches-without-manual-libfuse2t64) | AppImage launches without `libfuse.so.2` complaint (skips on non-AppImage rows) | spawn + stderr grep |
| [S02](../../docs/testing/cases/distribution.md#s02--xdg_current_desktopubuntugnome-prefix-form-doesnt-break-de-detection) | No strict `==` equality against `XDG_CURRENT_DESKTOP` in launcher / patches (regression detector) | source-tree probe |
| [S03](../../docs/testing/cases/distribution.md#s03--deb-install-pulls-runtime-deps) | `Depends:` field reported by `dpkg-query` is non-empty (known-failing regression detector: `deb.sh` currently emits no `Depends:`) | dpkg-query |
| [S04](../../docs/testing/cases/distribution.md#s04--rpm-install-pulls-runtime-deps) | `rpm -qR` lists at least one non-`rpmlib(...)` requirement (known-failing regression detector: `rpm.sh` currently sets `AutoReqProv: no`) | rpm -qR |
| [S05](../../docs/testing/cases/distribution.md#s05--doctor-recognises-dnf-installed-package-doesnt-false-flag-as-appimage) | Doctor does not false-flag rpm-installed package (skips when `rpm -qf` doesn't claim the binary) | spawn + stdout grep |
| [S07](../../docs/testing/cases/shortcuts-and-input.md#s07--claude_use_waylandvar) | Under `CLAUDE_HARNESS_USE_WAYLAND=1`, spawned Electron has `--ozone-platform=wayland` on argv | argv probe |
| [S08](../../docs/testing/cases/tray-and-window-chrome.md#s08--tray-icon-doesnt-duplicate-after-nativetheme-update) | `setImage`-based in-place fast-path injected by `tray.sh` (KDE-only, file probe) | file probe |
| [S09](../../docs/testing/cases/shortcuts-and-input.md#s09--quick-window-patch-runs-only-on-kde-post-406-gate) | KDE-gate string present in bundled `index.js` (patch ran at build) | file probe |
| [S10](../../docs/testing/cases/shortcuts-and-input.md#s10--quick-entry-popup-is-transparent-no-opaque-square-frame) | KDE-W only — popup runtime `getBackgroundColor() === '#00000000'` after Quick Entry opens (regression-detector against electron#50213 if bundled Electron in 41.0.4-bisect-window) | L1 + ydotool |
| [S11](../../docs/testing/cases/shortcuts-and-input.md#s11--quick-entry-shortcut-fires-from-any-focus-on-wayland-mutter-xwayland-key-grab) | GNOME-X / Ubu-X only (X11-side regression detector) — spawn xterm marker, `xdotool windowfocus` to it, verify `_NET_ACTIVE_WINDOW` shifted, fire `Ctrl+Alt+Space` via ydotool, assert popup visible. Wayland-side mutter regression (#404) is a primitive gap — needs Wayland-native focus injection (libei) | L1 + xdotool focus + ydotool shortcut |
| S12 | `--enable-features=GlobalShortcutsPortal` in Electron argv (GNOME-W only — currently a known-failing regression detector) | argv probe |
| [S14](../../docs/testing/cases/shortcuts-and-input.md#s14--global-shortcuts-via-xdg-portal-work-on-niri) | Niri only — spawn `foot` marker, `niri msg action focus-window` to it, verify `niri msg --json focused-window` shifted, fire `Ctrl+Alt+Space` via ydotool, assert popup visible. Currently known-failing detector for the Niri portal `BindShortcuts` path (parallels S12's GNOME-W detector) | L1 + niri msg focus + ydotool shortcut |
| [S15](../../docs/testing/cases/distribution.md#s15--appimage-extraction---appimage-extract-works-as-documented-fallback) | `--appimage-extract` exits 0; `squashfs-root/AppRun --version` runs without FUSE error | spawn + filesystem |
| [S16](../../docs/testing/cases/distribution.md#s16--appimage-mount-cleans-up-on-app-exit) | `mount(8)` shows new `.mount_claude` while app is up; gone within 10s of close | mount delta |
| [S17](../../docs/testing/cases/platform-integration.md#s17--app-launched-from-desktop-inherits-shell-path) | Shell-path-worker overlays user's login-shell PATH onto a deliberately-scrubbed env | L1 + utilityProcess |
| [S19](../../docs/testing/cases/routines.md#s19--claude_config_dir-redirects-scheduled-task-storage) | `extraEnv: { CLAUDE_CONFIG_DIR }` reaches main-process `process.env`; `cE()`-equivalent resolves under the override path | L1 + extraEnv |
| [S21](../../docs/testing/cases/routines.md#s21--lid-close-still-suspends-per-os-policy) | No `handle-lid-switch` / `HandleLidSwitch` strings in bundle (lid policy deferred to OS) | asar absence probe |
| [S22](../../docs/testing/cases/platform-integration.md#s22--computer-use-toggle-absent-or-visibly-disabled-on-linux) | `new Set(["darwin","win32"])` platform gate present; no 2-element Set pairing linux (file-probe form) | asar regex |
| [S25](../../docs/testing/cases/platform-integration.md#s25--mobile-pairing-survives-linux-session-restart) | `safeStorage.encryptString → file → app restart → file → safeStorage.decryptString` round-trips the same plaintext (skips when `isEncryptionAvailable === false`) | L1 + shared isolation handle |
| [S26](../../docs/testing/cases/distribution.md#s26--auto-update-is-disabled-when-installed-via-aptdnf) | `setFeedURL` present + project suppression marker present (currently fails — gated on #567) | asar fingerprint |
| [S27](../../docs/testing/cases/extensibility.md#s27--plugins-install-per-user) | `installed_plugins.json` + homedir resolver present; no `*/plugins` system paths in bundle | asar fingerprint |
| [S28](../../docs/testing/cases/extensibility.md#s28--worktree-creation-surfaces-clear-error-on-read-only-mounts) | Bundled `index.js` contains the worktree permission classifier expression (`"Permission denied" \|\| "Access is denied" \|\| "could not lock config file" → "permission-denied"`) plus the `Failed to create git worktree:` log line | asar fingerprint |
| [S29](../../docs/testing/cases/shortcuts-and-input.md#s29--quick-entry-popup-is-created-lazily-on-first-shortcut-press-closed-to-tray-sanity) | Popup opens when main is hidden-to-tray (lazy-create sanity) | L1 |
| [S30](../../docs/testing/cases/shortcuts-and-input.md#s30--quick-entry-shortcut-becomes-a-no-op-after-full-app-exit) | No new claude-desktop pid spawns after post-exit shortcut press | pgrep delta + ydotool |
| [S31](../../docs/testing/cases/shortcuts-and-input.md#s31--quick-entry-submit-makes-the-new-chat-reachable-from-any-main-window-state) | Submit reaches new chat from visible / minimized / hidden-to-tray (QE-7/8/9) | L1 + ydotool |
| S32 | GNOME mutter stale-`isFocused()` regression (GNOME-W/Ubu-W only — known-failing today) | L1 + ydotool |
| [S33](../../docs/testing/cases/shortcuts-and-input.md#s33--quick-entry-transparent-rendering-tracked-against-bundled-electron-version) | Captures bundled Electron version against the #370 / electron#50213 bisect threshold | file read |
| [S34](../../docs/testing/cases/shortcuts-and-input.md#s34--quick-entry-shortcut-focuses-fullscreen-main-window-instead-of-showing-popup) | Popup does **not** appear when main is fullscreen (upstream contract) | L1 + ydotool |
| [S35](../../docs/testing/cases/shortcuts-and-input.md#s35--quick-entry-popup-position-is-persisted-across-invocations-and-across-app-restarts) | Popup position persists across invocations *and* across app restart (two-launch test) | L1 + shared isolation handle + ydotool |
| S36 | Multi-monitor fallback — skip-on-single-monitor with documented `fixme` for the disconnect orchestration | display probe |
| S37 | Main-window destroy unreachable on Linux per close-to-tray override — documented skip | — |
These specs exercise the substrate primitives in `lib/`:
- `xprop` shell-outs (T01, T04)
- `dbus-next` (T03)
- `dbus-monitor` subprocess eavesdrop (T23)
- Node-inspector runtime-attach (T07/T16/T17/T26/S10/S29-S35/T05-T14b L1 specs)
- `app.asar` content reads (S08/S09/S21/S22/S26/S27/S28/T11/T14a/T18/T22/T30/T31/T32/T33/T35/T36/T37/T38/H02/H03/S33; mostly `index.js`, T18 reads `mainView.js`)
- `/proc/$pid/cmdline` reads (S07/S12)
- pgrep-based pid deltas (T10/T14b/H04/S16/S30)
- `mount(8)` parsing (S16)
- source-tree probes against `scripts/launcher-common.sh` (S02)
- `dpkg-query` / `rpm -qR` / `rpm -qf` calls (S03/S04/S05/T13)
- `safeStorage.encryptString` round-trip across two launches (S25)
- `extraEnv` precedence over isolation env (S19)
- the `lib/electron-mocks.ts` mock-then-call helpers: `installOpenDialogMock` (T17), `installShowItemInFolderMock` (T25), `installOpenExternalMock` (T24)
- the `lib/input.ts` focus-shifter (`focusOtherWindow` + `spawnMarkerWindow` for S11; X11 only, `WaylandFocusUnavailable` thrown on native Wayland) and its Niri-native sibling `lib/input-niri.ts` (`niri msg --json` for the focus-injection + readback chain, `foot --title` for the marker window; `NiriIpcUnavailable` thrown off-Niri; consumed by S14)
- the `lib/eipc.ts` registry walker (`getEipcChannels` / `waitForEipcChannel` / `waitForEipcChannels` against `webContents.ipc._invokeHandlers`; opaque on the UUID, suffix-matched against case-doc anchors; consumed by T19 / T20 / T22b / T31b / T33b / T38b) plus its session 8 invoke surface (`invokeEipcChannel`, which calls a registered handler through the renderer-side wrapper at `window['claude.<scope>'].<Iface>.<method>`; consumed by T19 / T20 / T27 / T33c / T35b / T37b)
- the `lib/ax.ts` AX-tree substrate (`snapshotAx` for one-shot reads + `waitForAxNode` / `waitForAxNodes` for predicate-based polling, plus re-exports of `RawElement` / `AxNode` / `axTreeToSnapshot` / `waitForAxTreeStable` from `explore/walker.ts` so consumers stay inside `lib/`; consumed by `claudeai.ts` page-objects + T26). The extraction was threshold-driven: it happened in session 13 once T26 had to duplicate the formerly-private `snapshotAx` from `claudeai.ts`. Session 14 migrated `activateTab` from a one-shot snapshot to `waitForAxNode` polling, which fixes the T16 `no AX-tree button with accessibleName="Code" found` failure mode where the Code button hadn't rendered yet at click time, and converted `CodeTab.activate`'s post-click `findCompactPills` retry loop to `waitForAxNodes`.
- the `createIsolation({ seedFromHost: true })` primitive that lets login-required tests run hermetically against a copy of the host's signed-in auth state (T07, T11_runtime, T16, T17, T19, T20, T21, T22b, T26, T27, T31b, T33b, T33c, T35b, T37b, T38b). Session 15 migrated T17 from the legacy `CLAUDE_TEST_USE_HOST_CONFIG=1` / `isolation: null` shape to `seedFromHost`, fixing a pre-existing 60s spec-timeout flake where the unauth'd default isolation polled `userLoaded` past Playwright's spec budget. Session 16 verified the migration end-to-end: `seedFromHost` clones the host's signed-in config, `waitForReady('userLoaded')` resolves to a post-login URL, and the session-14 `CodeTab.activate({ timeout: 15_000 })` succeeds. T17 now reaches a new failure mode at the next chain step (`openFolderPicker` after `selectLocal`: the `Select folder…` pill doesn't render on the `/epitaxy` workspace route; it likely needs `/new` context, deferred for a future session).
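The `/proc/$pid/cmdline` reads behind S07/S12 come down to splitting a NUL-separated buffer and checking for a flag. A minimal sketch, assuming hypothetical helper names (`parseCmdline`, `hasFlag`); the actual `lib/argv.ts` API may look different:

```ts
// Hypothetical helpers for illustration; not the lib/argv.ts API.

/** Split the raw contents of /proc/<pid>/cmdline into argv entries. */
function parseCmdline(raw: string): string[] {
  // Arguments are NUL-separated and the buffer usually ends with a trailing NUL.
  const parts = raw.split("\0");
  if (parts.length > 0 && parts[parts.length - 1] === "") parts.pop();
  return parts;
}

/** True when argv carries `flag` either bare or in `flag=value` form. */
function hasFlag(argv: string[], flag: string): boolean {
  return argv.some((arg) => arg === flag || arg.startsWith(flag + "="));
}
```

Matching on `flag` or `flag=` (rather than a plain substring) keeps `--ozone-platform` from matching an unrelated flag that merely shares a prefix.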
Note on eipc channels: the `LocalSessions_$_*` and `CustomPlugins_$_*`
channel names referenced in the case-doc Code anchors don't register
through Electron's *global* `ipcMain.handle()` registry (which only
carries 3 chat-tab MCP-bridge handlers). They DO register through
Electron's stdlib `IpcMainImpl` — just on the per-`webContents` IPC
scope (`webContents.ipc._invokeHandlers`, Electron 17+) rather than
the global one. The framing is
`$eipc_message$_<UUID>_$_<scope>_$_<iface>_$_<method>` (UUID stable
across builds at `c0eed8c9-…`); 117 `LocalSessions_*` + 16
`CustomPlugins_*` + 50+ other interfaces register on the claude.ai
webContents. T22 / T31 / T33 / T38 ship as Tier 1 fingerprints
against the bundled channel-name strings; T22b / T31b / T33b / T38b
are the runtime registry-presence siblings (strictly stronger,
require `seedFromHost`). T27 / T33c / T35b / T37b go one step
further — they invoke the resolved handlers through the renderer-
side wrapper at `window['claude.<scope>'].<Iface>.<method>`. T19 /
T20 are first-runtime-probe siblings of case-doc tests whose anchors
are write-side handlers (`startShellPty` / `writeSessionFile`); they
ship a five-suffix / three-suffix registration probe over the
case-doc-anchored write-side surface plus a single foundational
read-side `LocalSessions/getAll` invocation as the read-side
surrogate (case-doc connection: integrated terminal and file pane
both bind to LocalSessions; `getAll` proves the LocalSessions impl
object is reachable through the renderer wrapper). T21 and
T11_runtime extend the dual-invocation pattern: when a case-doc has
read-side anchors with resolvable arg shapes, invoke the case-doc-
anchored handlers directly rather than through a foundational
surrogate (T21: `getConfiguredServices` array + `getAutoVerify`
boolean on a single Launch impl object; T11_runtime: cross-impl-
object dual invocation — `CustomPlugins/listInstalledPlugins` array
+ `LocalPlugins/getPlugins` array — proves the install plumbing
crosses both interfaces intact, strictly stronger than single-
interface coverage). All wrapper
invocations use the wrapper exposed by `mainView.js` via
`contextBridge.exposeInMainWorld` after a top-frame + origin gate
(`Qc()`: claude.ai / claude.com / preview.* / localhost). Calling
through the wrapper carries an honest `senderFrame` for the inlined
`le()` / `Vi()` per-handler origin gate, so the test surface matches
real attack surface. T33c also
demonstrates the schema-rev path: when invocation rejects with
`Argument "<name>" at position N ... failed to pass validation`,
the verbatim rejection string is the cheapest grep target back to
the inline hand-rolled validator block (bundle bytes 5013601 /
5018821 for the two CustomPlugins methods). See `lib/eipc.ts` for
both surfaces.
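The channel framing and the suffix match described above can be sketched as follows. This is an illustration under assumptions, not the `lib/eipc.ts` implementation: `parseEipcChannel` and `findBySuffix` are hypothetical names, and the UUID is treated as an opaque token.

```ts
// Hypothetical parser for the framing
// $eipc_message$_<UUID>_$_<scope>_$_<iface>_$_<method>; not the lib/eipc.ts API.
interface EipcChannel {
  uuid: string;
  scope: string;
  iface: string;
  method: string;
}

const EIPC_PREFIX = "$eipc_message$_";
const EIPC_SEP = "_$_";

/** Parse a framed channel name; returns null when the shape differs. */
function parseEipcChannel(channel: string): EipcChannel | null {
  if (!channel.startsWith(EIPC_PREFIX)) return null;
  const parts = channel.slice(EIPC_PREFIX.length).split(EIPC_SEP);
  if (parts.length !== 4 || parts.some((p) => p === "")) return null;
  const [uuid, scope, iface, method] = parts as [string, string, string, string];
  return { uuid, scope, iface, method };
}

/** Suffix match that never inspects the UUID, e.g. `CoworkMemory_$_readGlobalMemory`. */
function findBySuffix(channels: string[], suffix: string): string | undefined {
  return channels.find((c) => c.endsWith(EIPC_SEP + suffix));
}
```

Anchoring the suffix on the `_$_` separator means a probe for `getAll` would not accidentally match a longer method name that merely ends in `getAll`.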
Per-row pass/skip counts depend on which sweep runs against the row.
The Quick Entry runners (S29-S35) all share the same primitive set
(`installInterceptor()` + `openAndWaitReady()` + scenario-specific
state setup).
## Prerequisites
On the host or VM running the sweep:
- Node.js ≥ 20
- `claude-desktop` installed (deb / rpm / AppImage), reachable via `claude-desktop` on `PATH` or `CLAUDE_DESKTOP_LAUNCHER` env var
- `xprop` (for L2 window queries — `dnf install xorg-x11-utils` on Fedora; `apt install x11-utils` on Debian/Ubuntu)
- `zstd` (optional — used to bundle results)
### Quick Entry runners (S29-S37, future QE-*)
Quick Entry tests inject the OS-level shortcut via `ydotool` /
`/dev/uinput`. One-time setup per host or VM:
```sh
# Install the binary + daemon
sudo dnf install -y ydotool # or: sudo apt install ydotool
# Make ydotoold's socket world-writable so the test runner reaches it
sudo mkdir -p /etc/systemd/system/ydotool.service.d
sudo tee /etc/systemd/system/ydotool.service.d/override.conf <<'EOF'
[Service]
ExecStart=
ExecStart=/usr/bin/ydotoold --socket-perm=0666
EOF
sudo systemctl daemon-reload
sudo systemctl enable --now ydotool.service
```
After this, `ydotool key 29:1 29:0` (Ctrl tap) should exit 0. The
runner sets `YDOTOOL_SOCKET=/tmp/.ydotool_socket` automatically;
override the env var if your daemon binds elsewhere.
ydotool **cannot** drive portal-grabbed shortcuts (kernel uinput
events vs compositor portal grabs) — those tests stay manual until
libei adoption broadens. See [`docs/testing/automation.md`](../../docs/testing/automation.md#input-injection--ydotool-now-libei-next).
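For reference, the `Ctrl+Alt+Space` chord can be expressed in the same `code:state` form as the `ydotool key 29:1 29:0` smoke check. A minimal sketch, assuming a hypothetical `chordArgs` helper (the harness's own wrapper in `lib/quickentry.ts` may build its arguments differently); the key codes are the standard Linux input-event codes:

```ts
// Linux input-event key codes (linux/input-event-codes.h).
const KEY_LEFTCTRL = 29;
const KEY_LEFTALT = 56;
const KEY_SPACE = 57;

/**
 * Hypothetical helper: build `ydotool key` arguments that press the keys
 * in order and release them in reverse order.
 */
function chordArgs(keys: number[]): string[] {
  const down = keys.map((k) => `${k}:1`);
  const up = [...keys].reverse().map((k) => `${k}:0`);
  return ["key", ...down, ...up];
}
```

The resulting array can be handed to a process spawner along with `YDOTOOL_SOCKET` in the environment; releasing in reverse order mirrors how a person lets go of a modifier chord.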
## Install
```sh
cd tools/test-harness
npm install
```
`package-lock.json` is gitignored for now; commit it once the dep set is settled.
## Run
```sh
# Full sweep against the locally installed claude-desktop
ROW=KDE-W ./orchestrator/sweep.sh
# Single test
npx playwright test src/runners/T01_app_launch.spec.ts
# Headed (watch the app launch in front of you)
npx playwright test --headed
# Run the full suite under native Wayland instead of X11/XWayland
CLAUDE_HARNESS_USE_WAYLAND=1 npm test
# Grounding probe — dump runtime state for the case-doc grounding sweep
npm run grounding-probe -- --launch --include-synthetic \
--out ../../docs/testing/cases-grounding-runtime.json
```
Results land at `results/results-${ROW}-${DATE}/`:
```
results/results-KDE-W-20260430T143000Z/
├── junit.xml # JUnit summary (matrix-regen input)
├── html/ # Playwright HTML report
└── test-output/ # Per-test attachments (screenshots, logs, etc.)
```
A bundled `results-${ROW}-${DATE}.tar.zst` sits next to the dir if `zstd`
is installed.
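The directory name follows the `results-${ROW}-${DATE}` pattern with a compact UTC timestamp. A small sketch of that naming, assuming the timestamp shape shown in the example above; `compactUtcStamp` and `resultsDirName` are hypothetical names, and `sweep.sh` may derive the value differently:

```ts
/** Format a Date as a compact UTC stamp such as 20260430T143000Z. */
function compactUtcStamp(d: Date): string {
  // toISOString() yields e.g. 2026-04-30T14:30:00.000Z; drop separators and millis.
  return d.toISOString().replace(/[-:]/g, "").replace(/\.\d{3}Z$/, "Z");
}

/** Hypothetical helper mirroring the results-${ROW}-${DATE} naming. */
function resultsDirName(row: string, d: Date): string {
  return `results-${row}-${compactUtcStamp(d)}`;
}
```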
## Environment variables
| Var | Default | Purpose |
|-----|---------|---------|
| `ROW` | `KDE-W` | Matrix row label, propagated into the bundle name and per-test annotations. Drives `skipUnlessRow()` in spec files |
| `CLAUDE_DESKTOP_LAUNCHER` | `claude-desktop` (PATH lookup) | Path to the launcher / Electron binary Playwright spawns |
| `CLAUDE_DESKTOP_ELECTRON` | probed | Override the resolved Electron binary path (skips deb/rpm install probing) |
| `CLAUDE_DESKTOP_APP_ASAR` | probed | Override the resolved `app.asar` path |
| `CLAUDE_TEST_USE_HOST_CONFIG` | unset | When `1`, opt out of per-test isolation and use the host's real `~/.config/Claude`. Required for tests that need a signed-in claude.ai (S31, future submit-side QE runners). **Side effect:** these tests write to your real account — chats / settings persist |
| `CLAUDE_HARNESS_USE_WAYLAND` | unset | When `1`, every runner spawns Electron with the native-Wayland backend (`--ozone-platform=wayland` + sibling flags from `launcher-common.sh`) instead of the default X11-via-XWayland. `CLAUDE_USE_WAYLAND=1` is also exported into the spawn env for in-app paths that read it. Per-launch overrides via `launchClaude({ extraEnv })` still win |
| `YDOTOOL_SOCKET` | `/tmp/.ydotool_socket` | Path to the `ydotoold` socket. Override only if the daemon binds elsewhere |
| `OUTPUT_DIR` | `./results` | Where bundles land |
| `RESULTS_DIR` | per-run derived | Single-run output dir (set by `sweep.sh`; usually you don't set this manually) |
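The defaults in the table can be read as a single resolution step over the process environment. A minimal sketch, assuming a hypothetical `resolveHarnessEnv` helper and `HarnessEnv` shape; only the variable names and defaults come from the table, and the probed variables (`CLAUDE_DESKTOP_ELECTRON`, `CLAUDE_DESKTOP_APP_ASAR`) are left out because their fallback is a filesystem probe:

```ts
// Hypothetical shape; the harness may not centralize env handling like this.
interface HarnessEnv {
  row: string;
  launcher: string;
  useHostConfig: boolean;
  useWayland: boolean;
  ydotoolSocket: string;
  outputDir: string;
}

/** Apply the documented defaults to an environment map such as process.env. */
function resolveHarnessEnv(env: Record<string, string | undefined>): HarnessEnv {
  return {
    row: env.ROW ?? "KDE-W",
    launcher: env.CLAUDE_DESKTOP_LAUNCHER ?? "claude-desktop",
    // Both toggles are opt-in and only the literal "1" enables them.
    useHostConfig: env.CLAUDE_TEST_USE_HOST_CONFIG === "1",
    useWayland: env.CLAUDE_HARNESS_USE_WAYLAND === "1",
    ydotoolSocket: env.YDOTOOL_SOCKET ?? "/tmp/.ydotool_socket",
    outputDir: env.OUTPUT_DIR ?? "./results",
  };
}
```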
### Per-test isolation default
`launchClaude()` creates a fresh `XDG_CONFIG_HOME` / `CLAUDE_CONFIG_DIR`
under `$TMPDIR/claude-test-*` for every launch and removes it on
`close()`. This is the default to prevent state leaks between tests
(SingletonLock collisions, persisted Quick Entry positions, etc. —
see Decision 1 in [`docs/testing/automation.md`](../../docs/testing/automation.md)).
Three isolation modes:
- **`launchClaude()`** — default, fresh per-launch isolation.
- **`launchClaude({ isolation })`** — pass a shared `Isolation` handle
to launch the same app twice with persistent state (e.g. S35
position-memory across restart).
- **`launchClaude({ isolation: null })`** — opt out entirely; share
the host's `~/.config/Claude`. Used by tests gated on
`CLAUDE_TEST_USE_HOST_CONFIG` for signed-in claude.ai access.
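The default mode amounts to a throwaway config root that is created before launch and removed on close. A minimal sketch of that idea, assuming a hypothetical `createIsolationSketch` helper; the directory layout under the temp root (`config/Claude`) is an assumption, and the real `lib/isolation.ts` handles more (for example `seedFromHost`):

```ts
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Hypothetical shape for illustration; not the harness's Isolation type.
interface IsolationSketch {
  root: string;
  env: { XDG_CONFIG_HOME: string; CLAUDE_CONFIG_DIR: string };
  cleanup(): void;
}

/** Create a fresh claude-test-* sandbox under the system temp dir. */
function createIsolationSketch(): IsolationSketch {
  const root = fs.mkdtempSync(path.join(os.tmpdir(), "claude-test-"));
  // Assumed layout: <root>/config as XDG_CONFIG_HOME, <root>/config/Claude as the app dir.
  const configHome = path.join(root, "config");
  const claudeDir = path.join(configHome, "Claude");
  fs.mkdirSync(claudeDir, { recursive: true });
  return {
    root,
    env: { XDG_CONFIG_HOME: configHome, CLAUDE_CONFIG_DIR: claudeDir },
    cleanup: () => fs.rmSync(root, { recursive: true, force: true }),
  };
}
```

Sharing one such handle across two launches, and calling `cleanup()` only after the second, is the pattern the shared-isolation mode describes.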
## Layout
```
tools/test-harness/
├── package.json
├── tsconfig.json
├── playwright.config.ts
├── src/
│ ├── lib/ # shared helpers
│ │ ├── electron.ts # spawn + isolation + inspector attach
│ │ ├── inspector.ts # Node-inspector RPC client (SIGUSR1 path)
│ │ ├── dbus.ts # dbus-next session-bus + helpers
│ │ ├── sni.ts # StatusNotifierWatcher / Item
│ │ ├── wm.ts # xprop wrappers (X11 + XWayland)
│ │ ├── env.ts # XDG_CURRENT_DESKTOP / SESSION_TYPE branching
│ │ ├── row.ts # skipUnlessRow / skipOnRow primitives
│ │ ├── isolation.ts # per-test XDG_CONFIG_HOME sandbox
│ │ ├── argv.ts # /proc/$pid/cmdline reader + flag check
│ │ ├── asar.ts # in-place app.asar reads (no temp extract)
│ │ ├── quickentry.ts # Quick Entry domain wrapper (popup, MainWindow, ydotool)
│ │ ├── claudeai.ts # claude.ai renderer UI domain (CodeTab, dialog mock, atoms)
│ │ ├── electron-mocks.ts # mock-then-call helpers (dialog/showItemInFolder/openExternal)
│ │ ├── input.ts # focus-shifter primitive (X11 only — xdotool + xprop verify; spawnMarkerWindow xterm)
│ │ ├── input-niri.ts # focus-shifter primitive (Niri only — niri msg --json verify; spawnMarkerWindow foot)
│ │ ├── eipc.ts # eipc-channel registry walker (per-webContents IPC scope; suffix-matched, UUID-opaque)
│ │ ├── retry.ts # poll-until-true with timeout
│ │ └── diagnostics.ts # launcher log, --doctor, session env
│ └── runners/ # one .spec.ts per test ID
│ ├── T01_app_launch.spec.ts
│ ├── T03_tray_icon_present.spec.ts
│ ├── T04_window_decorations.spec.ts
│ ├── T17_folder_picker.spec.ts
│ ├── S09_quick_window_patch_only_kde.spec.ts
│ ├── S12_global_shortcuts_portal_flag.spec.ts
│ ├── S29_quick_entry_lazy_create_closed_to_tray.spec.ts
│ ├── S30_quick_entry_noop_after_app_exit.spec.ts
│ ├── S31_quick_entry_submit_reaches_new_chat.spec.ts
│ ├── S32_quick_entry_submit_gnome_stale_isfocused.spec.ts
│ ├── S33_electron_version_capture.spec.ts
│ ├── S34_shortcut_focuses_fullscreen_main.spec.ts
│ ├── S35_quick_entry_position_persisted_across_restarts.spec.ts
│ ├── S36_quick_entry_fallback_to_primary_display.spec.ts
│ ├── S37_quick_entry_popup_after_main_destroy.spec.ts
│ ├── H01_cdp_gate_canary.spec.ts
│ ├── H02_frame_fix_wrapper_present.spec.ts
│ ├── H03_patch_fingerprints.spec.ts
│ └── H04_cowork_daemon_lifecycle.spec.ts
├── probe.ts # one-off renderer-DOM probe (debugger on :9229)
├── grounding-probe.ts # case-grounding runtime capture (see "Grounding probe" below)
└── orchestrator/
└── sweep.sh # row-aware harness invocation
```
H-prefix specs are harness self-tests: they validate the harness's
preconditions and the build pipeline's invariants (CDP gate alive,
patches landed, daemon lifecycle clean). They are cheap and run in <1s
each, except H04, which launches the app.
## How L1 testing works (the SIGUSR1 path)
The shipped Electron has a CDP auth gate that exits the app whenever
`--remote-debugging-port` or `--remote-debugging-pipe` is on argv and a
valid `CLAUDE_CDP_AUTH` token isn't in env. Both Playwright's
`_electron.launch()` and `chromium.connectOverCDP()` inject the gated
flag, so both are blocked.
The gate doesn't check `--inspect` or runtime `SIGUSR1`, which is the
same code path as the in-app `Developer → Enable Main Process Debugger`
menu item. So:
1. `launchClaude()` spawns Electron with no debug-port flags (gate
asleep) and waits for the X11 window.
2. `app.attachInspector()` sends `SIGUSR1` to the pid; Node's inspector
opens on port 9229.
3. `lib/inspector.ts` connects via WebSocket and exposes
`evalInMain(body)` and `evalInRenderer(urlFilter, js)` for tests.
From the inspector you can:
- Drive the renderer via `webContents.executeJavaScript()`
- Install main-process mocks (e.g. `dialog.showOpenDialog` for T17)
- Inspect any Electron API state
Two gotchas worth knowing:
- `BrowserWindow.getAllWindows()` returns an empty array because frame-fix-wrapper
  substitutes the BrowserWindow class. Use `webContents.getAllWebContents()`
  instead; it works correctly and includes both the shell window and the
  embedded claude.ai BrowserView.
- `Runtime.evaluate` with `awaitPromise: true` returns empty objects for
awaited Promise resolutions. `inspector.evalInMain<T>()` returns
`JSON.stringify(value)` from the IIFE and parses on the caller side
to dodge this.
Full writeup with rationale and tradeoffs:
[`docs/testing/automation.md` "The CDP auth gate"](../../docs/testing/automation.md#the-cdp-auth-gate-and-the-runtime-attach-workaround-that-beats-it).
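The `JSON.stringify` workaround from the second gotcha can be sketched as a pair of helpers: one wraps the caller's body so the evaluated expression resolves to a string, the other parses that string back on the caller side. `wrapForEval` and `parseEvalResult` are hypothetical names, and the exact wrapping in `lib/inspector.ts` may differ.

```ts
/**
 * Hypothetical helper: wrap a function body so the evaluated expression
 * resolves to a JSON string instead of a live object.
 */
function wrapForEval(body: string): string {
  // `?? null` keeps an undefined result serializable.
  return `(async () => JSON.stringify((await (async () => { ${body} })()) ?? null))()`;
}

/** Parse the JSON string produced by the wrapped expression. */
function parseEvalResult<T>(raw: string): T {
  return JSON.parse(raw) as T;
}
```

The wrapped string would be sent as the `expression` of a `Runtime.evaluate` call with `awaitPromise: true`; because the promise resolves to a plain string, the empty-object problem does not arise.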
## Grounding probe
`grounding-probe.ts` is a separate entry-point — not a Playwright spec —
that connects to a live Claude Desktop and dumps the runtime state
backing the load-bearing claims in
[`docs/testing/cases/`](../../docs/testing/cases/). It exists because
static grep against the 546k-line beautified bundle has known blind
spots (lazy `import()`s, dynamic handler tables, conditional wiring),
and some claims (S26 autoUpdater gate, S20 powerSaveBlocker path) can
only be verified at runtime.
```sh
# Self-contained: launchClaude() + capture + tear down
npm run grounding-probe -- --launch
# Plus the one synthetic probe (powerSaveBlocker start+stop)
npm run grounding-probe -- --launch --include-synthetic
# Attach to an already-running app (manual --inspect=9229 setup)
npm run grounding-probe -- --port 9229 --out /tmp/probe.json
```
Output is keyed by test ID — see the file's header comment for the
full table. Diff captures across upstream version bumps to spot
behavior drift the static sweep would miss. Surfaces inside modals
or popups (T22 PR toolbar, T26 preset list, T31 side chat, T32 slash
menu) need the surface open at probe time — the AX-tree fingerprint
is a snapshot of what's currently on screen.
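Diffing two captures keyed by test ID can be as simple as comparing serialized values per key. A minimal sketch, assuming each capture is a flat JSON object keyed by test ID; `diffCaptures` is a hypothetical helper and not part of `grounding-probe.ts`:

```ts
type Capture = Record<string, unknown>;

interface CaptureDiff {
  added: string[];
  removed: string[];
  changed: string[];
}

/** Compare two captures keyed by test ID and report which IDs differ. */
function diffCaptures(before: Capture, after: Capture): CaptureDiff {
  const added = Object.keys(after).filter((id) => !(id in before)).sort();
  const removed = Object.keys(before).filter((id) => !(id in after)).sort();
  // Serialized comparison is key-order-sensitive; fine for output from the same probe.
  const changed = Object.keys(before)
    .filter((id) => id in after && JSON.stringify(before[id]) !== JSON.stringify(after[id]))
    .sort();
  return { added, removed, changed };
}
```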
## Known limitations
- **T04** uses `xprop` (no `xdotool` dependency — walks `_NET_CLIENT_LIST` + `_NET_WM_PID`). Works on X11 native and KDE Wayland (XWayland), **not** on native-Wayland sessions where the app is running through Ozone-Wayland directly. Per Decision 6, project default is X11; native-Wayland window-state queries are deferred until those tests get added.
- **T17** is shallow — it intercepts `dialog.showOpenDialog` at the Electron main process level. The integration question "does Claude make the right *portal* call?" is a v2 concern; portal-level mocking via `dbus-next` is sketched in [`docs/testing/automation.md`](../../docs/testing/automation.md) but requires displacing the running portal service or running under `dbus-run-session`.
- **`render-matrix.sh`** isn't here yet. `sweep.sh` prints a summary; the `matrix.md` regen step from JUnit is the next addition.
- **No CI wrapper.** Decision 4: the harness is invocable from CI but sweeps run from the dev box for the first ~20 tests.
## Adding a test
1. Pick the `T##` / `S##` from [`docs/testing/cases/`](../../docs/testing/cases/).
2. Drop `src/runners/T##_short_name.spec.ts`. Use the existing runners as templates — match the layer (L1 / L2) to the test's assertion shape.
3. First line of the test body: `skipUnlessRow(testInfo, ['KDE-W', ...])`. JUnit `<skipped>` → matrix `-`, never `✗` for a row that doesn't apply.
4. Tag the test with `severity` and `surface` annotations so the JUnit output carries them.
5. Capture diagnostics via `testInfo.attach()` — these become Decision 7 "always-on" captures regardless of pass/fail. For tests that need richer state on failure, wrap your scenarios in a results-collector and attach a single JSON dump (S31's pattern).
6. No fixed `sleep`s. Use `retryUntil` or Playwright's auto-wait.
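
To illustrate step 6, here is a sketch of the polling shape a `retryUntil`-style helper typically has. `retryUntilSketch` and its options are hypothetical and written for this example; check the harness's own `retryUntil` for its real signature before relying on any of these names.

```typescript
// Hypothetical polling helper; the harness's retryUntil may differ.
interface RetryOptions {
  timeoutMs: number;
  intervalMs: number;
}

/** Poll `probe` until it yields a non-null value or the deadline passes. */
async function retryUntilSketch<T>(
  probe: () => T | null | undefined | Promise<T | null | undefined>,
  { timeoutMs, intervalMs }: RetryOptions,
): Promise<T> {
  const deadline = Date.now() + timeoutMs;
  for (;;) {
    const value = await probe();
    if (value !== null && value !== undefined) return value;
    if (Date.now() >= deadline) {
      throw new Error(`condition not met within ${timeoutMs} ms`);
    }
    // Short bounded wait between polls, instead of one fixed sleep up front.
    await new Promise<void>((resolve) => setTimeout(resolve, intervalMs));
  }
}

// Usage: the condition becomes true on the third poll.
let polls = 0;
const demo = retryUntilSketch<string>(() => (++polls >= 3 ? 'ready' : null), {
  timeoutMs: 1000,
  intervalMs: 10,
});
demo.then((value) => console.log(value, polls)); // ready 3
```

Compared with a fixed sleep, the test returns as soon as the condition holds and fails with an explicit timeout message when it never does.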
### Hooking Electron — read this before reaching for `BrowserWindow`
`scripts/frame-fix-wrapper.js` returns the `electron` module wrapped
in a `Proxy` whose `get` trap returns a closure-captured
`PatchedBrowserWindow`. **Constructor-level wraps don't work** — your
`electron.BrowserWindow = WrappedCtor` write lands on the underlying
module but the Proxy keeps returning `PatchedBrowserWindow` on
read, so the wrap is bypassed. The reliable hook is at the
**prototype-method level**:
```ts
// in inspector.evalInMain(...)
const proto = electron.BrowserWindow.prototype;
const orig = proto.loadFile;
proto.loadFile = function(filePath, ...rest) {
// record `this` + filePath; identify popups by filePath suffix
return orig.call(this, filePath, ...rest);
};
```
This captures every instance regardless of subclass identity.
Construction-time options (`transparent: true`, `frame: false`,
etc.) aren't observable through this hook — use runtime
equivalents instead (`getBackgroundColor()`, `getContentBounds()`
vs `getBounds()`, `isAlwaysOnTop()`). `lib/quickentry.ts` is the
worked example.
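
The bypass can be reproduced without Electron. The sketch below uses stand-in classes (`Original`, `Patched`, `WrappedCtor`) and a plain object in place of the `electron` module; all of these names are hypothetical and only mimic the shape described above, not the actual contents of `frame-fix-wrapper.js`.

```typescript
// Stand-ins for the real classes; names are illustrative only.
class Original {
  load(path: string): string {
    return `original:${path}`;
  }
}
class Patched extends Original {}

// Mimics the wrapper: reads of BrowserWindow always return the
// closure-captured class, whatever was written to the target.
const realModule: Record<string, unknown> = { BrowserWindow: Original };
const wrapped = new Proxy(realModule, {
  get(target, prop, receiver) {
    if (prop === 'BrowserWindow') return Patched;
    return Reflect.get(target, prop, receiver);
  },
});

// Constructor-level wrap: the write lands on the target object,
// but the next read still goes through the get trap.
class WrappedCtor extends Patched {}
wrapped.BrowserWindow = WrappedCtor;
const ctorWrapBypassed = wrapped.BrowserWindow === Patched;

// Prototype-level hook: patch the method on whatever class the Proxy returns.
const calls: string[] = [];
const proto = (wrapped.BrowserWindow as typeof Patched).prototype;
const orig = proto.load;
proto.load = function (this: Original, path: string): string {
  calls.push(path); // record the call, then defer to the original
  return orig.call(this, path);
};

const result = new Patched().load('quick-entry.html');
console.log(ctorWrapBypassed, calls, result);
```

Because the hook lives on the prototype, it applies to instances created before or after the patch, which is why it does not depend on intercepting construction.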

@@ -0,0 +1,309 @@
// Probe to verify whether the eipc channel registry (LocalSessions_$_*,
// CustomPlugins_$_*) is reachable from main via webContents.ipc._invokeHandlers
// instead of the empty-on-this-build globalThis.ipcMain._invokeHandlers.
//
// Run from tools/test-harness against a running claude-desktop with the
// main-process debugger enabled (Developer → Enable Main Process Debugger
// in the app menu, or `claude-desktop` was launched with --inspect):
// npx tsx eipc-registry-probe.ts
//
// Useful states to probe (re-run to compare):
// * fresh launch — whichever tab opens by default
// * /epitaxy with a Code session open
// * /chats with a chat thread open
// * cowork tab loaded
// The per-interface breakdown surfaces which interfaces register lazily
// vs eagerly — useful for designing the lib/eipc.ts primitive's wait
// semantics.
//
// Non-destructive — read-only enumeration of handler keys. Doesn't invoke
// anything, doesn't register anything, doesn't mutate state.
import { InspectorClient } from './src/lib/inspector.js';
import { writeFileSync } from 'node:fs';
interface InterfaceCount {
scope: string;
iface: string;
count: number;
sampleMethods: string[];
}
interface PerWcReport {
id: number;
url: string;
type: string;
hasIpc: boolean;
hasInvokeHandlers: boolean;
totalHandlers: number;
framedCount: number;
unframedCount: number;
scopes: string[];
byInterface: InterfaceCount[];
unframedSample: string[];
}
async function main() {
const client = await InspectorClient.connect(9229);
// Confirm globalThis.ipcMain._invokeHandlers is empty (or near-empty)
// — that's session 3's finding and we want it on the record alongside
// the per-wc reading for contrast.
const ipcMainReport = await client.evalInMain<{
hasIpcMain: boolean;
ipcMainKeys: string[];
ipcMainCount: number;
}>(`
const electron = process.mainModule.require('electron');
const ipcMain = electron.ipcMain;
const map = ipcMain && ipcMain._invokeHandlers;
if (!map) {
return { hasIpcMain: !!ipcMain, ipcMainKeys: [], ipcMainCount: 0 };
}
const keys = (typeof map.keys === 'function')
? Array.from(map.keys())
: Object.keys(map);
return {
hasIpcMain: true,
ipcMainKeys: keys,
ipcMainCount: keys.length,
};
`);
// Per-webContents enumeration with full framing parse:
// $eipc_message$_<UUID>_$_<scope>_$_<interface>_$_<method>
// Scope examples: claude.settings, claude.web, claude.app_internal.
// Interface examples: GlobalShortcut, LocalSessions, CustomPlugins.
// We group by scope.iface to show which feature areas are populated
// on each webContents — what registers eagerly vs on-tab-load.
const perWcReports = await client.evalInMain<PerWcReport[]>(`
const { webContents } = process.mainModule.require('electron');
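// Lazy groups split on the literal "_$_" separator, so scopes that
// contain underscores (claude.app_internal) still parse as framed.
const re = /^\\$eipc_message\\$_[0-9a-f-]+_\\$_(.+?)_\\$_(.+?)_\\$_(.+)$/;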
const all = webContents.getAllWebContents();
const out = [];
for (const w of all) {
const ipc = w.ipc;
const invokeMap = ipc && ipc._invokeHandlers;
let keys = [];
let hasInvokeHandlers = false;
if (invokeMap) {
hasInvokeHandlers = true;
if (typeof invokeMap.keys === 'function') {
keys = Array.from(invokeMap.keys());
} else {
keys = Object.keys(invokeMap);
}
}
const groups = new Map();
const scopes = new Set();
let framedCount = 0;
let unframedCount = 0;
const unframedSample = [];
for (const k of keys) {
const m = re.exec(k);
if (!m) {
unframedCount++;
if (unframedSample.length < 8) unframedSample.push(k);
continue;
}
framedCount++;
const scope = m[1];
const iface = m[2];
const method = m[3];
scopes.add(scope);
const groupKey = scope + '/' + iface;
let g = groups.get(groupKey);
if (!g) {
g = { scope, iface, count: 0, sampleMethods: [] };
groups.set(groupKey, g);
}
g.count++;
if (g.sampleMethods.length < 4) g.sampleMethods.push(method);
}
const byInterface = Array.from(groups.values())
.sort((a, b) => b.count - a.count);
out.push({
id: w.id,
url: w.getURL(),
type: w.getType ? w.getType() : 'unknown',
hasIpc: !!ipc,
hasInvokeHandlers,
totalHandlers: keys.length,
framedCount,
unframedCount,
scopes: Array.from(scopes).sort(),
byInterface,
unframedSample,
});
}
return out;
`);
// For each case-doc anchored channel, find which webContents (if any)
// hosts it. The framing prefix `$eipc_message$_<UUID>_$_claude.web_$_`
// is build-stable per session 2's T38 finding, so we match by suffix.
const expected = [
// T22 — gh PR check monitoring
'LocalSessions_$_getPrChecks',
// T31 — side chat trio
'LocalSessions_$_startSideChat',
'LocalSessions_$_sendSideChatMessage',
'LocalSessions_$_stopSideChat',
// T33 — plugin browser
'CustomPlugins_$_listMarketplaces',
'CustomPlugins_$_listAvailablePlugins',
// T38 — Continue in IDE
'LocalSessions_$_openInEditor',
];
const expectedReport = await client.evalInMain<
Array<{ suffix: string; foundOn: number[]; matchedKeys: string[] }>
>(`
const { webContents } = process.mainModule.require('electron');
const expected = ${JSON.stringify(expected)};
const all = webContents.getAllWebContents();
const out = [];
for (const suffix of expected) {
const foundOn = [];
const matchedKeys = [];
for (const w of all) {
const ipc = w.ipc;
const invokeMap = ipc && ipc._invokeHandlers;
if (!invokeMap) continue;
const keys = (typeof invokeMap.keys === 'function')
? Array.from(invokeMap.keys())
: Object.keys(invokeMap);
for (const k of keys) {
if (k.endsWith(suffix)) {
if (!foundOn.includes(w.id)) foundOn.push(w.id);
if (!matchedKeys.includes(k)) matchedKeys.push(k);
}
}
}
out.push({ suffix, foundOn, matchedKeys });
}
return out;
`);
// Snapshot the framing UUID(s) — useful to confirm build-stability
// across the per-wc registries (session 2 noted it as build-stable
// `c0eed8c9-...`).
const framingReport = await client.evalInMain<{
uuidsSeen: string[];
samplesPerUuid: Record<string, string[]>;
}>(`
const { webContents } = process.mainModule.require('electron');
const re = /^\\$eipc_message\\$_([0-9a-f-]+)_\\$_/;
const uuidsSeen = new Set();
const samples = {};
for (const w of webContents.getAllWebContents()) {
const ipc = w.ipc;
const invokeMap = ipc && ipc._invokeHandlers;
if (!invokeMap) continue;
const keys = (typeof invokeMap.keys === 'function')
? Array.from(invokeMap.keys())
: Object.keys(invokeMap);
for (const k of keys) {
const m = re.exec(k);
if (!m) continue;
const uuid = m[1];
uuidsSeen.add(uuid);
if (!samples[uuid]) samples[uuid] = [];
if (samples[uuid].length < 3) samples[uuid].push(k);
}
}
return {
uuidsSeen: Array.from(uuidsSeen),
samplesPerUuid: samples,
};
`);
console.log('=== globalThis.ipcMain._invokeHandlers (session 3 baseline) ===');
console.log(JSON.stringify(ipcMainReport, null, 2));
console.log('\n=== Per-webContents IPC registries ===');
console.log(JSON.stringify(perWcReports, null, 2));
console.log('\n=== Expected case-doc-anchored channel resolution ===');
console.log(JSON.stringify(expectedReport, null, 2));
console.log('\n=== Framing UUID(s) observed ===');
console.log(JSON.stringify(framingReport, null, 2));
// Cross-webContents per-interface deltas — useful when comparing
// "fresh launch" vs "after navigating to /epitaxy" vs "after opening
// cowork tab". Lists every (scope, iface) seen anywhere with the
// per-wc breakdown of which has it.
const interfaceAcrossWcs = (() => {
const matrix = new Map<string, Map<number, number>>();
for (const wc of perWcReports) {
for (const g of wc.byInterface) {
const key = `${g.scope}/${g.iface}`;
let row = matrix.get(key);
if (!row) {
row = new Map();
matrix.set(key, row);
}
row.set(wc.id, g.count);
}
}
const out: Array<{
interfaceKey: string;
perWc: Record<string, number>;
total: number;
}> = [];
for (const [key, row] of matrix) {
const perWc: Record<string, number> = {};
let total = 0;
for (const [wcId, count] of row) {
perWc[`wc${wcId}`] = count;
total += count;
}
out.push({ interfaceKey: key, perWc, total });
}
out.sort((a, b) => b.total - a.total);
return out;
})();
console.log('\n=== Interface presence across webContents ===');
console.log(JSON.stringify(interfaceAcrossWcs, null, 2));
const totalAll = perWcReports.reduce((a, r) => a + r.totalHandlers, 0);
const totalFramed = perWcReports.reduce((a, r) => a + r.framedCount, 0);
const totalUnframed = perWcReports.reduce((a, r) => a + r.unframedCount, 0);
const expectedFound = expectedReport.filter((e) => e.foundOn.length > 0).length;
const totalDistinctInterfaces = new Set(
perWcReports.flatMap((r) => r.byInterface.map((g) => `${g.scope}/${g.iface}`)),
).size;
console.log('\n=== Summary ===');
console.log(JSON.stringify({
webContentsCount: perWcReports.length,
webContentsUrls: perWcReports.map((r) => `wc${r.id}: ${r.url}`),
ipcMainHandlerCount: ipcMainReport.ipcMainCount,
perWcTotalHandlerCount: totalAll,
perWcFramedCount: totalFramed,
perWcUnframedCount: totalUnframed,
distinctInterfacesAcrossAllWcs: totalDistinctInterfaces,
expectedSuffixesFound: `${expectedFound} / ${expected.length}`,
framingUuidsObserved: framingReport.uuidsSeen.length,
}, null, 2));
const out = {
ipcMainReport,
perWcReports,
expectedReport,
framingReport,
interfaceAcrossWcs,
};
writeFileSync('/tmp/eipc-registry-probe.json', JSON.stringify(out, null, 2));
console.log('\nFull dump → /tmp/eipc-registry-probe.json');
client.close();
process.exit(0);
}
main().catch((err) => {
console.error('probe failed:', err);
process.exit(1);
});

@@ -0,0 +1,468 @@
// Grounding probe — dumps Claude Desktop runtime state that backs the
// load-bearing claims in docs/testing/cases/. Output is keyed by
// test-ID so the next grounding sweep can diff captures across
// upstream versions.
//
// Two modes:
// - attach (default): connect to an already-running app on port 9229
// (manual `--inspect=9229` run, or a launchClaude() instance that
// called attachInspector()).
// - --launch: spin up a fresh isolated instance via launchClaude(),
// capture, tear down. Self-contained — usable in CI.
//
// Mostly read-only; --include-synthetic enables short-lived state
// changes (powerSaveBlocker start+stop) to close API-only gaps.
//
// Captures, keyed by test ID:
// T01 app metadata, webContents count
// T03 SNI / tray registration via DBus (KDE StatusNotifierWatcher)
// T06 globalShortcut.isRegistered() for known accelerators
// T09 app.getLoginItemSettings()
// T22 AX fingerprint (PR toolbar — open the surface before probing)
// T23 Notification.isSupported()
// T24 IPC channels matching /external|editor|openIn/i
// T26 AX fingerprint (Routines page — open before probing)
// T31 AX fingerprint (side chat — open before probing)
// T32 AX fingerprint (slash menu — type "/" before probing)
// T38 IPC channels matching /external|editor|openIn/i (editor handoff)
// S18 safeStorage.isEncryptionAvailable() + backend
// S20 powerSaveBlocker (gated by --include-synthetic)
// S22 process.platform (Computer Use gate)
// S25 safeStorage (cowork trusted-device token)
// S26 autoUpdater.getFeedURL() — empirical answer to the structural-
// open claim that static analysis couldn't resolve
//
// Usage:
// cd tools/test-harness
// npx tsx grounding-probe.ts # attach :9229
// npx tsx grounding-probe.ts --launch # self-contained
// npx tsx grounding-probe.ts --launch --include-synthetic
// npx tsx grounding-probe.ts --out ../../docs/testing/cases-grounding-runtime.json
// npx tsx grounding-probe.ts --port 9229 --out path/to/file.json
//
// Extending: add a section in capture() with a `client.evalInMain`
// dump targeting whatever runtime state your new test cares about,
// then map the result into `tests[<id>]`.
import { writeFileSync } from 'node:fs';
import { InspectorClient } from './src/lib/inspector.js';
import { launchClaude } from './src/lib/electron.js';
// dbus-next is loaded lazily inside captureSni() — importing here would
// pull in a session-bus connection on environments without one (CI
// containers, sshfs, etc.) and break the probe before it ever runs.
// Accelerators we expect to be registered on Linux. T06 = Quick Entry
// default. S31/S32 — fullscreen + cmd-K dispatch. Extend per case docs.
const KNOWN_ACCELERATORS = [
'Alt+Space',
'Ctrl+Alt+Space',
'CommandOrControl+Shift+L',
];
interface AxFingerprintNode {
role: string;
name: string;
hasPopup: boolean;
}
interface GroundingCapture {
capturedAt: string;
appVersion: string;
appPath: string;
isPackaged: boolean;
platform: string;
// Cross-test corpus — useful as a denormalized source the per-test
// entries reference by index/key. Keep these flat so jq queries
// don't need to walk a nested tree.
ipcInvokeChannels: string[];
ipcOnChannels: string[];
webContents: Array<{ id: number; url: string; type: string }>;
// Reduced AX tree of the current claude.ai webContents, shared by
// every test entry that names a renderer-side surface. Stored once
// at the top level rather than copied per-test — diff stability
// matters more than per-test isolation here.
axFingerprint: AxFingerprintNode[];
// Per-test bag — extend as new probes land. Each entry is the
// runtime state the test's load-bearing claim depends on, in a
// shape that's easy to diff across captures. Renderer-side tests
// reference $.axFingerprint via { axFingerprintRef: true }.
tests: Record<string, unknown>;
// Probe-level diagnostics — what we tried and couldn't capture.
// Surfaced so the grounding sweep can flag uncovered surfaces.
gaps: string[];
}
interface CaptureOptions {
includeSynthetic: boolean;
}
async function capture(
client: InspectorClient,
opts: CaptureOptions,
): Promise<GroundingCapture> {
const gaps: string[] = [];
// App metadata — every test references at least one of these.
const appMeta = await client.evalInMain<{
appVersion: string;
appPath: string;
isPackaged: boolean;
appReady: boolean;
platform: string;
}>(`
const { app } = process.mainModule.require('electron');
return {
appVersion: app.getVersion(),
appPath: app.getAppPath(),
isPackaged: app.isPackaged,
appReady: app.isReady(),
platform: process.platform,
};
`);
// IPC handler registry. Every claude.web_* channel registers via
// ipcMain.handle() (invoke side) or ipcMain.on() (fire-and-forget).
// Private API — surfaces shift across Electron versions; tolerate
// both shapes.
const ipc = await client.evalInMain<{ invoke: string[]; on: string[] }>(`
const { ipcMain } = process.mainModule.require('electron');
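// _invokeHandlers is a Map on current Electron; fall back to
// plain-object keys so the probe tolerates either shape.
const handlers = ipcMain._invokeHandlers;
const invoke = !handlers ? []
: typeof handlers.keys === 'function' ? Array.from(handlers.keys())
: Object.keys(handlers);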
const on = ipcMain.eventNames ? ipcMain.eventNames().map(String) : [];
return { invoke, on };
`);
// WebContents inventory — proves which BrowserViews / BrowserWindows
// exist at probe time. Note: BrowserWindow.getAllWindows() returns
// 0 because frame-fix-wrapper substitutes the class (see
// inspector.ts header comment) — webContents registry stays intact.
const webContents = await client.evalInMain<
Array<{ id: number; url: string; type: string }>
>(`
const { webContents } = process.mainModule.require('electron');
return webContents.getAllWebContents().map(w => ({
id: w.id,
url: w.getURL(),
type: w.getType ? w.getType() : 'unknown',
}));
`);
// Global shortcuts — T06, S31/S32 reference these. isRegistered()
// is the canonical runtime probe; matches the case-doc claim about
// what's bound at startup.
const accelerators = await client.evalInMain<
Array<{ accelerator: string; registered: boolean }>
>(`
const { globalShortcut } = process.mainModule.require('electron');
const list = ${JSON.stringify(KNOWN_ACCELERATORS)};
return list.map(a => ({
accelerator: a,
registered: globalShortcut.isRegistered(a),
}));
`);
// Autostart resolution — T09. On Linux Electron's openAtLogin is a
// documented no-op; our wrapper installs an XDG Autostart shim
// (frame-fix-wrapper.js:376). The empirical check confirms which
// path is active.
const loginItems = await client.evalInMain<{
openAtLogin: boolean;
wasOpenedAtLogin?: boolean;
executableWillLaunchAtLogin?: boolean;
}>(`
const { app } = process.mainModule.require('electron');
return app.getLoginItemSettings();
`);
// safeStorage — S18 (env-config encryption) + S25 (cowork trusted-
// device token). Linux backend is libsecret; availability gates
// whether tokens persist or stall.
const safeStorage = await client.evalInMain<{
available: boolean;
backend: string;
}>(`
const { safeStorage } = process.mainModule.require('electron');
let backend = 'unknown';
try {
if (safeStorage.getSelectedStorageBackend) {
backend = safeStorage.getSelectedStorageBackend();
}
} catch (_) { /* older Electron — backend not exposed */ }
return {
available: safeStorage.isEncryptionAvailable(),
backend,
};
`);
// autoUpdater feedURL — S26. The case doc claims the gate is open
// by construction (lii() returns true on Linux when packaged).
// Accidental coverage from Electron's Linux autoUpdater being
// unimplemented saves us from real download attempts. This probe
// puts that on the record empirically.
const autoUpdater = await client.evalInMain<{
feedURL: string | null;
feedURLError: string | null;
}>(`
const { autoUpdater } = process.mainModule.require('electron');
let feedURL = null, feedURLError = null;
try {
feedURL = autoUpdater.getFeedURL ? autoUpdater.getFeedURL() : null;
} catch (e) {
feedURLError = String(e && e.message);
}
return { feedURL, feedURLError };
`);
// Tray — T03. We can't enumerate Tray instances via public API,
// but we can confirm Notification support is alive (T23 prerequisite).
const notifications = await client.evalInMain<{ supported: boolean }>(`
const { Notification } = process.mainModule.require('electron');
return { supported: Notification.isSupported() };
`);
// Powermonitor / suspend inhibit — S20. powerSaveBlocker has no
// public enumeration API. Synthetic probe (gated behind
// --include-synthetic) starts a blocker, reads isStarted, stops
// immediately. Brief inhibit (~ms) is harmless; what we get back
// is empirical proof the API path is alive on this host. Doesn't
// verify the case-doc claim that `keepAwakeEnabled` setting toggles
// trigger this — that requires correlating settings IO with the
// `PhA` Set at index.js:241897, which depends on minified-name
// stability and is left to the next sweep.
let powerSaveBlocker: {
apiAvailable: boolean;
startWorks: boolean;
idType: string;
probeError: string | null;
} | null = null;
if (opts.includeSynthetic) {
powerSaveBlocker = await client.evalInMain(`
const { powerSaveBlocker } = process.mainModule.require('electron');
let id = null, started = false, probeError = null;
try {
id = powerSaveBlocker.start('prevent-app-suspension');
started = powerSaveBlocker.isStarted(id);
} catch (e) {
probeError = String(e && e.message);
} finally {
if (id !== null) {
try { powerSaveBlocker.stop(id); } catch (_) {}
}
}
return {
apiAvailable: true,
startWorks: started,
idType: typeof id,
probeError,
};
`);
} else {
gaps.push(
'S20: powerSaveBlocker not probed (skip-synthetic). ' +
'Re-run with --include-synthetic to confirm API path.',
);
}
// Editor handoff scheme registry — T24/T38. Static case anchor
// (`Mtt` at index.js:463902) names the registry; variable is
// minified, so we identify by IPC handler name pattern instead.
// The case doc claims schemes vscode/cursor/zed/windsurf are wired
// up on Linux (xcode is darwin-only). The IPC channel that calls
// `shell.openExternal('<scheme>://file/<encoded-path>:<line>')`
// will be one of these matches.
const editorIpcChannels = [
...ipc.invoke.filter((c) => /external|editor|openIn/i.test(c)),
...ipc.on.filter((c) => /external|editor|openIn/i.test(c)),
];
// Renderer AX fingerprint — T22/T26/T31/T32. `getAccessibleTree`
// snapshots whatever's *currently on screen*. To anchor surfaces
// inside modals/popups (preset list, slash menu, side chat, PR
// toolbar), open the surface in the running app before probe time.
// Reduced form (role+name+hasPopup) keeps the output grep-able and
// avoids re-shipping ui-inventory.json's full schema.
const claudeAi = webContents.find((w) => w.url.includes('claude.ai'));
let axFingerprint: AxFingerprintNode[] = [];
if (claudeAi) {
try {
const tree = await client.getAccessibleTree('claude.ai');
axFingerprint = tree
.filter((n) => !n.ignored && n.role && n.name)
.map((n) => ({
role: n.role!.value,
name: n.name!.value,
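// CDP names this AX property `hasPopup`; compare case-insensitively
// so either spelling from the tree loader matches.
hasPopup: !!n.properties?.find((p) => String(p.name).toLowerCase() === 'haspopup'),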
}))
.filter((n) => n.name.length > 0);
} catch (e) {
gaps.push(
`renderer-ax: getAccessibleTree threw: ${e instanceof Error ? e.message : String(e)}`,
);
}
} else {
gaps.push(
'renderer-ax: no claude.ai webContents at probe time. ' +
'Sign in to the app before re-running to capture renderer state.',
);
}
// Tray / SNI registration — T03. Linux tray icons register against
// org.kde.StatusNotifierWatcher (KDE protocol used by GNOME's
// AppIndicator extension too). We can attribute an SNI item to the
// app's pid via `findItemByPid`. Lazily imported because dbus-next
// connects on first call to getSessionBus(), and we want
// non-DBus environments to still get a partial probe rather than
// hard-fail.
const ourPid = await client.evalInMain<number>('return process.pid;');
let sni: {
ourPid: number;
registeredItem: { service: string; objectPath: string } | null;
probeError: string | null;
} = { ourPid, registeredItem: null, probeError: null };
try {
const sniLib = await import('./src/lib/sni.js');
const dbusLib = await import('./src/lib/dbus.js');
try {
sni.registeredItem = await sniLib.findItemByPid(ourPid);
} finally {
await dbusLib.disconnectBus();
}
} catch (e) {
sni.probeError = e instanceof Error ? e.message : String(e);
}
// T22 PR toolbar / T31 side chat / T32 slash menu — these surfaces
// are now captured if the user has the relevant view open at probe
// time (see `axFingerprint` above). Empty fingerprint at idle is
// expected; flag here only if the renderer was reachable but the
// captured tree was empty (which would suggest the AX walker hit
// a permission gate or was disabled).
if (claudeAi && axFingerprint.length === 0) {
gaps.push(
'renderer-ax: claude.ai webContents present but AX tree empty. ' +
'Either Accessibility was not enabled or the page is mid-load.',
);
}
gaps.push(
'T39 /desktop: lives in the upstream `claude` CLI binary, not the ' +
'Electron asar — not reachable from this probe.',
);
return {
capturedAt: new Date().toISOString(),
appVersion: appMeta.appVersion,
appPath: appMeta.appPath,
isPackaged: appMeta.isPackaged,
platform: appMeta.platform,
ipcInvokeChannels: ipc.invoke,
ipcOnChannels: ipc.on,
webContents,
axFingerprint,
tests: {
T01: { appReady: appMeta.appReady, webContentsCount: webContents.length },
T03: sni,
T06: { accelerators },
T09: loginItems,
T22: { axFingerprintRef: true, count: axFingerprint.length },
T23: notifications,
T24: { editorIpcChannels },
T26: { axFingerprintRef: true, count: axFingerprint.length },
T31: { axFingerprintRef: true, count: axFingerprint.length },
T32: { axFingerprintRef: true, count: axFingerprint.length },
T38: { editorIpcChannels },
S18: safeStorage,
S20: powerSaveBlocker,
S22: {
platform: appMeta.platform,
expectedDisabledOnLinux: appMeta.platform === 'linux',
},
S25: safeStorage,
S26: {
...autoUpdater,
isPackaged: appMeta.isPackaged,
platform: appMeta.platform,
note: 'Gate is structurally open; saved by Electron autoUpdater being unimplemented on Linux.',
},
},
gaps,
};
}
interface ParsedArgs {
port: number;
out: string;
launch: boolean;
includeSynthetic: boolean;
}
function parseArgs(argv: string[]): ParsedArgs {
const flags = new Set<string>();
const args = new Map<string, string>();
for (let i = 2; i < argv.length; i++) {
const tok = argv[i];
if (!tok || !tok.startsWith('--')) continue;
const key = tok.replace(/^--/, '');
const next = argv[i + 1];
if (next && !next.startsWith('--')) {
args.set(key, next);
i++;
} else {
flags.add(key);
}
}
return {
port: Number(args.get('port') ?? 9229),
out: args.get('out') ?? '/tmp/grounding-probe.json',
launch: flags.has('launch'),
includeSynthetic: flags.has('include-synthetic'),
};
}
async function main() {
const parsed = parseArgs(process.argv);
const { out, launch, includeSynthetic } = parsed;
let client: InspectorClient;
let cleanup: () => Promise<void>;
if (launch) {
// Self-contained: fresh isolation per run, tear down on exit.
// 'mainVisible' is the lowest level that gives us the inspector
// without waiting on claude.ai network load. Sufficient for the
// main-process probes in capture(); the AX fingerprint still needs
// a loaded claude.ai webContents and is reported under `gaps` when
// none exists at probe time.
const app = await launchClaude();
const ready = await app.waitForReady('mainVisible');
client = ready.inspector;
cleanup = async () => {
client.close();
await app.close();
};
} else {
client = await InspectorClient.connect(parsed.port);
cleanup = async () => {
client.close();
};
}
try {
const result = await capture(client, { includeSynthetic });
writeFileSync(out, JSON.stringify(result, null, 2));
console.log(
`grounding-probe: wrote ${out} ` +
`(${result.ipcInvokeChannels.length} invoke channels, ` +
`${result.webContents.length} webContents, ` +
`${result.axFingerprint.length} ax nodes, ` +
`${result.gaps.length} gaps` +
`${launch ? ', --launch' : ''}` +
`${includeSynthetic ? ', synthetic' : ''})`,
);
} finally {
await cleanup();
}
}
main().catch((err) => {
console.error('grounding-probe failed:', err);
process.exit(1);
});

@@ -0,0 +1,108 @@
#!/usr/bin/env bash
# sweep.sh — run a test sweep for a row.
#
# Usage:
# ROW=KDE-W ./orchestrator/sweep.sh
# CLAUDE_DESKTOP_LAUNCHER=/usr/bin/claude-desktop ROW=KDE-W ./orchestrator/sweep.sh
#
# Output bundle layout:
# results/results-${ROW}-${DATE}/
# ├── junit.xml
# ├── html/ (Playwright HTML report)
# └── test-output/ (per-test attachments)
set -uo pipefail
script_dir="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
readonly script_dir
harness_dir="$(dirname "$script_dir")"
readonly harness_dir
readonly row="${ROW:-KDE-W}"
date_str="$(date -u +%Y%m%dT%H%M%SZ)"
readonly date_str
readonly bundle_id="results-${row}-${date_str}"
readonly results_root="${OUTPUT_DIR:-${harness_dir}/results}"
readonly bundle_dir="${results_root}/${bundle_id}"
mkdir -p "$bundle_dir"
cd "$harness_dir" || exit 1
# Backend banner. CLAUDE_HARNESS_USE_WAYLAND=1 flips every runner from
# the default X11/XWayland backend to native Wayland — see the
# "Environment variables" table in tools/test-harness/README.md.
if [[ "${CLAUDE_HARNESS_USE_WAYLAND:-}" == '1' ]]; then
printf 'sweep: native Wayland backend (CLAUDE_HARNESS_USE_WAYLAND=1)\n' >&2
fi
# Fast-fail prereq checks — only matter when the sweep includes
# Quick Entry runners (S31, future S29/S30/S32/S34/S35/S37 +
# T06 / QE-* additions). Skip with QE_PREREQ_CHECK=0 if running
# a sweep that excludes those.
if [[ "${QE_PREREQ_CHECK:-1}" == "1" ]]; then
if ! command -v ydotool >/dev/null 2>&1; then
printf 'sweep: ydotool not on PATH — Quick Entry runners will skip.\n' >&2
printf ' install: dnf install ydotool / apt install ydotool\n' >&2
printf ' to suppress this check: QE_PREREQ_CHECK=0\n' >&2
fi
socket="${YDOTOOL_SOCKET:-/tmp/.ydotool_socket}"
if [[ ! -S "$socket" ]]; then
printf 'sweep: ydotoold socket missing at %s — daemon not running.\n' \
"$socket" >&2
printf ' start: sudo systemctl start ydotool.service\n' >&2
printf ' see tools/test-harness/README.md "Quick Entry runners" for one-time setup\n' >&2
fi
fi
ROW="$row" \
RESULTS_DIR="$bundle_dir" \
npx playwright test
rc=$?
# Bundle into tar.zst for orchestrator pickup. Best-effort — keep the
# uncompressed dir even if zstd is unavailable.
if command -v zstd >/dev/null 2>&1; then
tar --zstd -cf "${results_root}/${bundle_id}.tar.zst" \
-C "$results_root" "$bundle_id" 2>/dev/null \
&& printf 'bundle: %s/%s.tar.zst\n' "$results_root" "$bundle_id"
fi
printf 'row=%s exit=%d dir=%s\n' "$row" "$rc" "$bundle_dir"
# Quick summary if junit.xml landed. Prefer Node so we sum across all
# <testsuite> elements (grep+head only saw the first suite, undercounting
# multi-suite reports). Fall back to the legacy grep path when node isn't
# on PATH so the harness stays usable on minimal images.
if [[ -f "${bundle_dir}/junit.xml" ]]; then
if command -v node >/dev/null 2>&1; then
read -r tests failures errors skipped \
< <(node -e "$(cat <<'EOF'
const fs = require('fs');
const xml = fs.readFileSync(process.argv[1], 'utf8');
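// Require whitespace after "testsuite" so a <testsuites> root element
// that carries its own totals is not added to the per-suite sum.
const sumAttr = (a) => Array.from(
xml.matchAll(new RegExp(`<testsuite\\s[^>]*\\b${a}="(\\d+)"`, 'g'))
).reduce((s, m) => s + parseInt(m[1], 10), 0);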
console.log([
sumAttr('tests'), sumAttr('failures'),
sumAttr('errors'), sumAttr('skipped'),
].join(' '));
EOF
)" "${bundle_dir}/junit.xml")
printf 'summary: tests=%s failures=%s errors=%s skipped=%s\n' \
"$tests" "$failures" "$errors" "$skipped"
elif command -v grep >/dev/null 2>&1; then
tests="$(grep -oP 'tests="\K\d+' "${bundle_dir}/junit.xml" \
| head -1 || printf '?')"
failures="$(grep -oP 'failures="\K\d+' "${bundle_dir}/junit.xml" \
| head -1 || printf '?')"
errors="$(grep -oP 'errors="\K\d+' "${bundle_dir}/junit.xml" \
| head -1 || printf '?')"
skipped="$(grep -oP 'skipped="\K\d+' "${bundle_dir}/junit.xml" \
| head -1 || printf '?')"
printf 'summary: tests=%s failures=%s errors=%s skipped=%s\n' \
"$tests" "$failures" "$errors" "$skipped"
fi
fi
exit "$rc"

@@ -0,0 +1,26 @@
{
"name": "claude-desktop-debian-test-harness",
"version": "0.0.1",
"private": true,
"description": "Linux compatibility test harness for claude-desktop-debian",
"type": "module",
"engines": {
"node": ">=20"
},
"scripts": {
"test": "playwright test",
"sweep": "bash orchestrator/sweep.sh",
"typecheck": "tsc --noEmit",
"grounding-probe": "npx tsx grounding-probe.ts"
},
"devDependencies": {
"@playwright/test": "^1.48.0",
"@types/node": "^20.16.0",
"playwright": "^1.48.0",
"typescript": "^5.6.0"
},
"dependencies": {
"@electron/asar": "^3.2.10",
"dbus-next": "^0.10.2"
}
}

@@ -0,0 +1,25 @@
/// <reference types="node" />
import { defineConfig } from '@playwright/test';
const resultsDir = process.env.RESULTS_DIR ?? './results/local';
export default defineConfig({
testDir: './src/runners',
testMatch: /.*\.spec\.ts$/,
fullyParallel: false,
workers: 1,
retries: process.env.CI ? 1 : 0,
forbidOnly: !!process.env.CI,
timeout: 60_000,
expect: { timeout: 10_000 },
outputDir: `${resultsDir}/test-output`,
reporter: [
['list'],
['junit', { outputFile: `${resultsDir}/junit.xml` }],
['html', { outputFolder: `${resultsDir}/html`, open: 'never' }],
],
use: {
trace: 'retain-on-failure',
screenshot: 'only-on-failure',
},
});

tools/test-harness/probe.ts

@@ -0,0 +1,163 @@
// Standalone probe that connects to a running claude-desktop with the
// main process debugger enabled (port 9229) and dumps renderer-DOM
// shapes useful for designing reusable abstractions in lib/claudeai.ts.
//
// Run from tools/test-harness:
// npx tsx probe.ts
//
// Non-destructive — observes only, doesn't click anything.
import { InspectorClient } from './src/lib/inspector.js';
import { writeFileSync } from 'node:fs';
async function main() {
const client = await InspectorClient.connect(9229);
const webContentsList = await client.evalInMain<
Array<{ id: number; url: string; type: string }>
>(`
const { webContents } = process.mainModule.require('electron');
return webContents.getAllWebContents().map(w => ({
id: w.id,
url: w.getURL(),
type: w.getType ? w.getType() : 'unknown',
}));
`);
const target = webContentsList.find((w) => w.url.includes('claude.ai'));
if (!target) {
console.error('No claude.ai webContents — open the app to a logged-in state first.');
console.error('webContents observed:', webContentsList);
process.exit(1);
}
console.log('=== webContents ===');
console.log(JSON.stringify(webContentsList, null, 2));
console.log('Targeting:', target.url, `(id=${target.id})`);
// All "pill"-shape buttons on the page.
const pills = await client.evalInRenderer<{
dfPills: Array<{ ariaLabel: string | null; text: string; visible: boolean; classSig: string }>;
menuButtons: Array<{
ariaLabel: string | null;
text: string;
expanded: boolean;
truncateMaxW: string | null;
classSig: string;
}>;
summary: { totalButtons: number; ariaHaspopupMenu: number; dfPills: number };
}>(
'claude.ai',
`
(() => {
const buttons = Array.from(document.querySelectorAll('button'));
const dfPills = buttons
.filter(b => /\\bdf-pill\\b/.test(b.className))
.map(b => ({
ariaLabel: b.getAttribute('aria-label'),
text: (b.textContent || '').trim().slice(0, 80),
visible: !!b.getClientRects().length,
classSig: b.className.slice(0, 120),
}));
const menuButtons = buttons
.filter(b => b.getAttribute('aria-haspopup') === 'menu')
.map(b => {
const truncSpan = b.querySelector('span.truncate');
const maxW = truncSpan
? (truncSpan.className.match(/max-w-\\[[^\\]]+\\]/) || [null])[0]
: null;
return {
ariaLabel: b.getAttribute('aria-label'),
text: (b.textContent || '').trim().slice(0, 80),
expanded: b.getAttribute('aria-expanded') === 'true',
truncateMaxW: maxW,
classSig: b.className.slice(0, 120),
};
});
return {
dfPills,
menuButtons,
summary: {
totalButtons: buttons.length,
ariaHaspopupMenu: menuButtons.length,
dfPills: dfPills.length,
},
};
})()
`,
);
console.log('\n=== Pills summary ===');
console.log(JSON.stringify(pills.summary, null, 2));
console.log('\n=== df-pill buttons ===');
console.log(JSON.stringify(pills.dfPills, null, 2));
console.log('\n=== aria-haspopup=menu buttons (sample) ===');
console.log(JSON.stringify(pills.menuButtons.slice(0, 10), null, 2));
// Currently open menu (if any) — items, structure.
const openMenu = await client.evalInRenderer<{
menuPresent: boolean;
ariaLabelledBy: string | null;
items: Array<{ role: string; text: string; ariaChecked: string | null; disabled: boolean }>;
} | null>(
'claude.ai',
`
(() => {
const menu = document.querySelector('[role=menu][data-open]') || document.querySelector('[role=menu]');
if (!menu) return null;
const items = Array.from(menu.querySelectorAll('[role=menuitem], [role=menuitemradio], [role=menuitemcheckbox]'))
.map(el => ({
role: el.getAttribute('role') || '',
text: (el.textContent || '').trim().slice(0, 80),
ariaChecked: el.getAttribute('aria-checked'),
disabled: el.hasAttribute('data-disabled') || el.getAttribute('aria-disabled') === 'true',
}));
return {
menuPresent: true,
ariaLabelledBy: menu.getAttribute('aria-labelledby'),
items,
};
})()
`,
);
console.log('\n=== Currently open menu ===');
console.log(openMenu ? JSON.stringify(openMenu, null, 2) : 'no menu open');
// URL and basic page state.
const pageState = await client.evalInRenderer<{
url: string;
title: string;
readyState: string;
hasComposer: boolean;
hasSidebar: boolean;
}>(
'claude.ai',
`
(() => ({
url: location.href,
title: document.title,
readyState: document.readyState,
hasComposer: !!document.querySelector('[data-testid*=composer], textarea[placeholder*=Reply], textarea[placeholder*=Message]'),
hasSidebar: !!document.querySelector('nav, [role=navigation]'),
}))()
`,
);
console.log('\n=== Page state ===');
console.log(JSON.stringify(pageState, null, 2));
const out = { webContentsList, pills, openMenu, pageState };
writeFileSync('/tmp/claude-probe.json', JSON.stringify(out, null, 2));
console.log('\nFull dump → /tmp/claude-probe.json');
client.close();
process.exit(0);
}
main().catch((err) => {
console.error('probe failed:', err);
process.exit(1);
});

@@ -0,0 +1,44 @@
// Read a process's argv from /proc/<pid>/cmdline.
//
// /proc/<pid>/cmdline is a single string of NUL-terminated args, so a
// trailing NUL is normally present (see proc(5)); strip it before
// splitting. Used by QE-6 / S12 to verify the launcher appended the
// right Electron flags, and by future flag-presence tests (Decision 6
// Wayland-default Smoke, S07 CLAUDE_USE_WAYLAND, etc.).
//
// readPidArgv returns null if the read fails, typically because the
// process is gone. Callers usually want to retry until the pid
// stabilizes.
import { readFile } from 'node:fs/promises';
export async function readPidArgv(pid: number): Promise<string[] | null> {
try {
const raw = await readFile(`/proc/${pid}/cmdline`, 'utf8');
// Strip trailing NUL if present, then split. Empty argv is
// theoretically possible (kernel threads); preserve it.
const trimmed = raw.endsWith('\0') ? raw.slice(0, -1) : raw;
return trimmed.length === 0 ? [] : trimmed.split('\0');
} catch {
return null;
}
}
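// Illustrative sketch, not part of the harness API: the same
// trim-then-split rule as readPidArgv, applied to an in-memory string
// so the NUL handling can be checked without a live pid.
// `splitCmdlineSketch` is a hypothetical name used only here.
function splitCmdlineSketch(raw: string): string[] {
  // Drop a single trailing NUL, then split on the remaining NULs.
  const trimmed = raw.endsWith('\0') ? raw.slice(0, -1) : raw;
  return trimmed.length === 0 ? [] : trimmed.split('\0');
}
// For example, splitCmdlineSketch('electron\0--no-sandbox\0') gives
// ['electron', '--no-sandbox'], and an empty string gives [].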
export function argvHasFlag(argv: string[], flag: string): boolean {
  // An argv entry matches when any of these hold:
  //   - it equals `flag` exactly (`--foo`, or `--enable-features=Foo`
  //     verbatim; also covers a bare flag whose value sits in the next
  //     argv slot);
  //   - `flag` is a bare key and the entry is `key=<anything>`;
  //   - `flag` is `key=value` and `value` is one member of the entry's
  //     comma-separated list, so `--enable-features=Foo` matches
  //     `--enable-features=Foo,Bar`.
  // Split on the first `=` only: `split('=', 2)` would truncate a
  // value that itself contains `=`.
  const eq = flag.indexOf('=');
  const key = eq >= 0 ? flag.slice(0, eq) : null;
  const val = eq >= 0 ? flag.slice(eq + 1) : null;
  for (const arg of argv) {
    if (arg === flag) return true;
    if (arg.startsWith(`${flag}=`)) return true;
    if (key !== null && val !== null && arg.startsWith(`${key}=`)) {
      const values = arg.slice(key.length + 1).split(',');
      if (values.includes(val)) return true;
    }
  }
  return false;
}

@@ -0,0 +1,55 @@
// Read files out of the installed app.asar without on-disk extraction.
//
// Used by QE-19 / S09 (verify the KDE-gate string is in the bundled
// JS) and by future patch-sanity tests for tray.sh / cowork.sh /
// claude-code.sh patches. Reading via @electron/asar avoids the
// `npx asar extract /tmp/inspect-installed` dance — same outcome, no
// temp tree, JSON-grepable from inside a TS spec.
//
// Path resolution mirrors lib/electron.ts:resolveInstall(): respect
// CLAUDE_DESKTOP_APP_ASAR if set, otherwise probe the deb and rpm
// install locations.
import { extractFile, listPackage } from '@electron/asar';
import { existsSync } from 'node:fs';
const DEFAULT_ASAR_PATHS = [
'/usr/lib/claude-desktop/app.asar',
'/opt/Claude/resources/app.asar',
'/usr/lib/claude-desktop/node_modules/electron/dist/resources/app.asar',
'/opt/Claude/node_modules/electron/dist/resources/app.asar',
];
export function resolveAsarPath(): string {
const env = process.env.CLAUDE_DESKTOP_APP_ASAR;
if (env) return env;
for (const candidate of DEFAULT_ASAR_PATHS) {
if (existsSync(candidate)) return candidate;
}
throw new Error(
'Could not locate app.asar. Set CLAUDE_DESKTOP_APP_ASAR or install ' +
'the deb/rpm package.',
);
}
export function readAsarFile(filename: string, asarPath?: string): string {
const archive = asarPath ?? resolveAsarPath();
const buf = extractFile(archive, filename);
return buf.toString('utf8');
}
export function asarContains(
filename: string,
needle: string | RegExp,
asarPath?: string,
): boolean {
const contents = readAsarFile(filename, asarPath);
return typeof needle === 'string'
? contents.includes(needle)
: needle.test(contents);
}
export function listAsar(asarPath?: string): string[] {
const archive = asarPath ?? resolveAsarPath();
return listPackage(archive, { isPack: false });
}
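// Illustrative sketch, not part of the harness API: a variant of the
// string-or-RegExp check in asarContains that resets `lastIndex`
// first. RegExp.prototype.test is stateful for /g and /y patterns, so
// a shared regex constant reused across calls can otherwise alternate
// between true and false on the same input. `containsNeedleSketch` is
// a hypothetical name used only here.
function containsNeedleSketch(
  contents: string,
  needle: string | RegExp,
): boolean {
  if (typeof needle === 'string') return contents.includes(needle);
  needle.lastIndex = 0; // no effect on non-global, non-sticky patterns
  return needle.test(contents);
}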

@@ -0,0 +1,440 @@
// AX-tree loading + traversal primitives — shared substrate for any
// test that reads from Chromium's accessibility tree.
//
// Why this exists
// ---------------
// Sessions 1-12 grew two parallel AX consumers without consolidating
// the loading shape:
//
// 1. `lib/claudeai.ts` page-objects (CodeTab.activate, openPill,
// clickMenuItem, findCompactPills) carry a private `snapshotAx`
// that gates on `waitForAxTreeStable` then calls
// `inspector.getAccessibleTree('claude.ai')` and converts via
// `axTreeToSnapshot`. Every page-object that polls for a node
// rolls its own retryUntil/while loop around that helper.
//
// 2. `src/runners/T26_routines_page_renders.spec.ts` re-implemented
// the same `snapshotAx` shape inline because the claudeai.ts
// version isn't exported. Its leading comment explicitly noted
// this was "premature abstraction" at 1 consumer; with 2 it is
// threshold-driven extraction.
//
// Plus the user reports recurring flake in tests that use the AX tree:
// queries fire before the relevant subtree is mounted, and individual
// specs each pick their own retryUntil budget. The proposed
// `waitForAxNode` primitive collapses the snapshot+find+retry shape
// into one helper with a single tunable budget per consumer, reducing
// both the surface area for budget drift and the duplication.
//
// What this primitive does
// ------------------------
// - `snapshotAx(inspector, opts)` — single AX tree read with the
// stability gate. Replaces the duplicated implementations in
// `claudeai.ts` (private) and `T26_routines_page_renders.spec.ts`
// (inlined). `opts.fast` skips the stability gate for inside-poll
// callers (matches the existing claudeai.ts contract).
// - `waitForAxNode(inspector, predicate, opts)` — repeatedly snapshot
// the AX tree and return the first element matching `predicate`,
// subject to a timeout. Built against the loops in `CodeTab.activate`
// (poll for compact pills), `openPill` (poll for menu items),
// `clickMenuItem` (poll for matching menuitem), and T26's pre/post-
// click anchor scans. The predicate carries the discrimination
// logic the caller already had inline; the primitive owns the
// stability-gate + retry loop.
// - Owns the AX-snapshot substrate: `RawElement`, `axTreeToSnapshot`,
// and `waitForAxTreeStable`. These are the runner-facing surface for
// converting Chromium's `Accessibility.getFullAXTree` output into
// a flat snapshot the page-objects and specs can search.
//
// Scope boundaries
// ----------------
// This is NOT a "wait for surface rendered" registry. The plan-doc
// proposal mentioned `waitForRenderedSurface(client, surfaceKey)`
// with a registry of named surface anchors — that's still
// speculative (no consumer asks for it). When a third consumer
// emerges that already knows it wants a named surface anchor (e.g.
// "the Code tab body has mounted"), promote the relevant claudeai.ts
// page-object into a registry entry. Today, `waitForAxNode` with a
// predicate covers every observed callsite.
//
// This is also NOT a CSS-querySelector primitive. T07 polls the DOM
// via `document.querySelector('[data-testid=...]')` for the topbar;
// that's a different abstraction (DOM, not AX) with no extraction
// signal yet — leave it inline in T07 until a second consumer
// surfaces.
import type { AxNode, InspectorClient } from './inspector.js';
import { retryUntil, sleep } from './retry.js';
export type { AxNode } from './inspector.js';
// Outermost-to-innermost AX ancestor chain. `walkLandmarkAncestors`
// (in lib/claudeai.ts) filters this to the landmark / grouping subset
// for fingerprint paths.
interface RawAncestor {
role: string | null;
name: string | null;
}
export interface RawElement {
// Per-element data sourced from Chromium's accessibility tree.
// `computedRole` is `AxNode.role.value` — the platform-computed role
// rather than the tag-derived one, so `<button role="link">` is a
// link.
computedRole: string;
// Accessible name as the AX tree computed it. Single source of
// truth for the leaf's identity — there is no separate aria-label
// / text-content fallback.
accessibleName: string | null;
// `!ignored` from the AX tree. The walker filters ignored nodes
// out at snapshot construction time, so this is always true post-
// filter; kept on the type so resolver-side code can still gate
// on it without special-casing AX-derived inputs.
visible: boolean;
// Any landmark dialog / alertdialog ancestor in the AX path.
insideModalDialog: boolean;
// Outermost-to-innermost AX ancestor chain (excluding the element
// itself and any ignored nodes).
ancestors: RawAncestor[];
// Among the parent AX node's non-ignored children that share this
// element's computed role, where does it sit and how many siblings
// of that role exist?
siblingPosition: number;
siblingTotal: number;
// `AxNode.backendDOMNodeId`. Required for the click path
// (`DOM.resolveNode` → `Runtime.callFunctionOn`); null only on AX
// nodes that don't back a DOM element (which won't reach this
// list, since interactive ARIA roles always do).
backendDOMNodeId: number | null;
// AX `haspopup` token (`<button aria-haspopup="menu">` →
// `'menu'`). null when the property is absent or its value is the
// literal string `'false'`. Surfaced for claudeai.ts page-objects,
// which use it to discriminate menu triggers from ordinary action
// buttons that happen to share an accessible name.
hasPopup: string | null;
}
// Roles we treat as "interactive leaves" — emitted to the snapshot
// and used as queue seeds. Expressed in AX-role terms so
// `<button role="link">` shows up as `link`, which is what AX reports.
const INTERACTIVE_AX_ROLES = new Set<string>([
'button',
'link',
'menuitem',
'menuitemradio',
'menuitemcheckbox',
'tab',
'option',
]);
// Roles that indicate a dialog ancestor; any such ancestor flips
// `insideModalDialog`.
const DIALOG_AX_ROLES = new Set<string>(['dialog', 'alertdialog']);
// Pull the AX `hasPopup` token out of `node.properties[]`. CDP
// exposes it as `{ name: 'hasPopup', value: { type: 'token', value:
// 'menu' } }` on supporting elements (note the camelCase — the
// underlying ARIA attribute is `aria-haspopup` lowercase, but
// Chromium's AXProperty name is `hasPopup`). Absent properties array,
// missing entry, or the literal string `'false'` all collapse to
// `null` so consumers don't have to special-case those.
function readHasPopup(node: AxNode): string | null {
const props = node.properties;
if (!Array.isArray(props)) return null;
for (const p of props) {
if (p?.name !== 'hasPopup') continue;
const v = p.value?.value;
if (typeof v !== 'string') return null;
if (v === '' || v === 'false') return null;
return v;
}
return null;
}
// `axTreeToSnapshot` adapts CDP's `Accessibility.getFullAXTree`
// output into the RawElement shape the rest of the harness consumes.
// Filtering rules:
// - `ignored` nodes are dropped from emission and from sibling
// counts (they're not exposed to assistive tech and we don't want
// to drill into them either). Their children remain visible to
// the ancestor walk via the raw tree links.
// - Only nodes whose `role.value` is in `INTERACTIVE_AX_ROLES` get
// emitted as elements. Everything else (RootWebArea, generics,
// paragraphs) shows up only as ancestors.
export function axTreeToSnapshot(nodes: AxNode[]): RawElement[] {
const byId = new Map<string, AxNode>();
for (const n of nodes) byId.set(n.nodeId, n);
const childrenById = new Map<string, AxNode[]>();
for (const n of nodes) {
if (n.parentId === undefined) continue;
let arr = childrenById.get(n.parentId);
if (!arr) {
arr = [];
childrenById.set(n.parentId, arr);
}
arr.push(n);
}
const ancestorName = (n: AxNode): string | null => {
const v = n.name?.value;
return v && v.trim().length > 0 ? v : null;
};
const out: RawElement[] = [];
for (const node of nodes) {
if (node.ignored === true) continue;
const role = node.role?.value;
if (!role || !INTERACTIVE_AX_ROLES.has(role)) continue;
const accessibleName = ancestorName(node);
const ancestors: RawAncestor[] = [];
let modal = false;
{
let pid = node.parentId;
while (pid !== undefined) {
const p = byId.get(pid);
if (!p) break;
if (p.ignored !== true) {
const arole = p.role?.value ?? null;
ancestors.push({ role: arole, name: ancestorName(p) });
if (arole && DIALOG_AX_ROLES.has(arole)) modal = true;
}
pid = p.parentId;
}
}
ancestors.reverse();
let siblingPosition = 0;
let siblingTotal = 1;
if (node.parentId !== undefined) {
const sibs = (childrenById.get(node.parentId) ?? []).filter(
(c) => c.ignored !== true && c.role?.value === role,
);
const idx = sibs.indexOf(node);
if (idx >= 0) {
siblingPosition = idx;
siblingTotal = Math.max(sibs.length, 1);
}
}
out.push({
computedRole: role,
accessibleName,
visible: true,
insideModalDialog: modal,
ancestors,
siblingPosition,
siblingTotal,
backendDOMNodeId: node.backendDOMNodeId ?? null,
hasPopup: readHasPopup(node),
});
}
return out;
}
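// Wait for the AX tree to stop growing/shrinking. The loop returns once
// two consecutive reads match the previous node count (three equal reads
// in a row), which we take to mean Chromium has finished computing the
// accessibility tree for the current DOM. On timeout it returns the last
// observed count rather than throwing. Used by the seed phase because: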
// 1. `Accessibility.enable` is implicit on the first
// `getFullAXTree` call, and the very first tree is often a
// partial computation.
// 2. claude.ai's SPA mounts ~58s after the renderer signals
// `claudeAi` ready; a snapshot taken too early reliably sees an
// empty surface.
// Cheap to call (≥800ms when already stable, on the order of seconds
// when not).
export async function waitForAxTreeStable(
inspector: InspectorClient,
opts: { timeoutMs?: number; pollMs?: number; minNodes?: number } = {},
): Promise<number> {
const timeoutMs = opts.timeoutMs ?? 30000;
const pollMs = opts.pollMs ?? 400;
const minNodes = opts.minNodes ?? 1;
const deadline = Date.now() + timeoutMs;
let prevSize = -1;
let stableReads = 0;
let lastSize = 0;
while (Date.now() < deadline) {
const nodes = await inspector.getAccessibleTree('claude.ai');
lastSize = nodes.length;
if (lastSize === prevSize && lastSize >= minNodes) {
stableReads += 1;
if (stableReads >= 2) return lastSize;
} else {
stableReads = 0;
prevSize = lastSize;
}
if (Date.now() < deadline) await sleep(pollMs);
}
return lastSize;
}
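// Illustrative sketch, not part of the harness API: the stability
// loop above with the AX read replaced by an injected `readCount`
// callback, so the "two matches in a row" rule can be exercised
// without a live inspector. `waitForStableCountSketch` and its
// parameters are hypothetical names used only here.
async function waitForStableCountSketch(
  readCount: () => Promise<number>,
  opts: { timeoutMs?: number; pollMs?: number; minNodes?: number } = {},
): Promise<number> {
  const deadline = Date.now() + (opts.timeoutMs ?? 30_000);
  const pollMs = opts.pollMs ?? 400;
  const minNodes = opts.minNodes ?? 1;
  let prev = -1;
  let stable = 0;
  let last = 0;
  while (Date.now() < deadline) {
    last = await readCount();
    if (last === prev && last >= minNodes) {
      stable += 1;
      if (stable >= 2) return last;
    } else {
      stable = 0;
      prev = last;
    }
    await new Promise((resolve) => setTimeout(resolve, pollMs));
  }
  return last;
}
// With reads of 3, 5, 5, 5 the sketch returns 5 on the fourth read:
// the first 5 resets the counter, the next two are the matching pair.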
export interface SnapshotAxOptions {
// Skip the upfront `waitForAxTreeStable` gate. Default false —
// i.e. callers gate by default. Pass true inside polling loops
// where the gate fights the loop: each iteration would block
// waiting for "no node-count change" even when the change we're
// polling for is exactly the AX tree updating.
//
// `waitForAxNode` itself uses fast=true on every iteration after
// gating once at the start; consumers calling `snapshotAx` from
// inside a hand-rolled loop should do the same.
fast?: boolean;
// AX-stability gate budget when `fast` is false. Default 10000ms
// — matches the existing claudeai.ts/T26 inline implementations.
// Increase for cold-cache cases on slow machines.
stabilityTimeoutMs?: number;
// Renderer URL filter for `inspector.getAccessibleTree`. Default
// 'claude.ai'. Tests against a different webContents (find_in_page,
// main_window) can override but the AX tree on those is much
// simpler — `claude.ai` is the only one current consumers care
// about.
urlFilter?: string;
}
// Single AX-tree read, returning the walker's flat RawElement[]
// snapshot. Identical contract to the private `snapshotAx` formerly in
// `claudeai.ts` and the inlined one formerly in T26 — extracted here
// so both consumers share an implementation.
//
// Cost: ~800ms when the stability gate hits "stable" on the first
// pair of reads (interior-loop fast=true callers skip this); a few
// seconds on cold-cache. The AX tree itself is comparatively cheap
// to fetch and convert (~50-100ms).
export async function snapshotAx(
inspector: InspectorClient,
opts: SnapshotAxOptions = {},
): Promise<RawElement[]> {
if (!opts.fast) {
await waitForAxTreeStable(inspector, {
minNodes: 1,
timeoutMs: opts.stabilityTimeoutMs ?? 10_000,
});
}
const url = opts.urlFilter ?? 'claude.ai';
const nodes: AxNode[] = await inspector.getAccessibleTree(url);
return axTreeToSnapshot(nodes);
}
export interface WaitForAxNodeOptions {
// Total budget for the polling loop. Default 5000ms — matches the
// claudeai.ts / T26 callsites that the primitive replaces. Override
// upward for cold-cache or post-click cases (T26 uses 10s post-
// click; CodeTab.activate uses 5s default but T16 passes 15s).
timeoutMs?: number;
// Per-iteration interval. Default 200ms — matches the existing
// inline retryUntil({ interval: 200 }) calls. The AX tree fetch
// itself dominates the loop cost; a shorter interval gives no
// throughput benefit and a longer one delays the resolution.
intervalMs?: number;
// Renderer URL filter passed through to `snapshotAx`. Default
// 'claude.ai'.
urlFilter?: string;
// Whether to gate on `waitForAxTreeStable` once before entering
// the poll loop. Default true. When the caller has just mutated
// the page (e.g. clicked a button and is waiting for the
// resulting menu to render) the upfront stability gate is what
// keeps the first iteration from racing the in-flight render.
// After the upfront gate, every iteration uses fast=true so the
// loop iterates without re-blocking on stability.
stabilityGate?: boolean;
// AX-stability gate budget for the upfront `waitForAxTreeStable`
// when `stabilityGate` is true. Default 5000ms. Independent from
// the outer poll budget — the gate is a hard precondition, not
// part of the find loop.
stabilityTimeoutMs?: number;
}
// Poll the AX tree until the predicate matches a node, or the budget
// runs out. Returns the matched RawElement on success, null on
// timeout.
//
// The predicate runs over RawElement (the walker-snapshot shape) so
// callers can use the same `el.computedRole === 'button' &&
// el.accessibleName === 'Code'` form they already have inline. The
// helper does NOT click the matched node — callers receive the
// RawElement and can pass `el.backendDOMNodeId` to
// `inspector.clickByBackendNodeId` if a click follows. Keeping click
// out of the find primitive lets composite consumers (e.g. "find then
// click then poll for the menu") chain cleanly.
//
// On timeout, returns null. Callers that want a hard fail with a
// diagnostic should pattern-match `if (!found) throw new Error(...)`
// — the primitive doesn't throw because some specs surface
// missing-node as a clean fail with a JSON snapshot attachment
// rather than an uncaught timeout.
//
// There is no label/name parameter: a consumer that wraps a throw
// around the null return should describe what it was looking for in
// its own error message so failure logs read meaningfully.
export async function waitForAxNode(
inspector: InspectorClient,
predicate: (el: RawElement) => boolean,
opts: WaitForAxNodeOptions = {},
): Promise<RawElement | null> {
const stabilityGate = opts.stabilityGate ?? true;
if (stabilityGate) {
await waitForAxTreeStable(inspector, {
minNodes: 1,
timeoutMs: opts.stabilityTimeoutMs ?? 5_000,
});
}
return retryUntil(
async () => {
const elements = await snapshotAx(inspector, {
fast: true,
urlFilter: opts.urlFilter,
});
return elements.find(predicate) ?? null;
},
{
timeout: opts.timeoutMs ?? 5_000,
interval: opts.intervalMs ?? 200,
},
);
}
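// Illustrative sketch, not part of the harness API: the
// snapshot + find + retry shape of waitForAxNode with both the
// snapshot source and the retry helper stubbed out. lib/retry.ts is
// not shown here, so `retryUntilSketch` is an assumed stand-in for
// `retryUntil` (resolve with the first non-null result, or null once
// the budget is spent); `AxLeafSketch` and `findLeafSketch` are
// likewise hypothetical names.
interface AxLeafSketch {
  computedRole: string;
  accessibleName: string | null;
}
async function retryUntilSketch<T>(
  fn: () => Promise<T | null>,
  opts: { timeout: number; interval: number },
): Promise<T | null> {
  const deadline = Date.now() + opts.timeout;
  for (;;) {
    const value = await fn();
    if (value !== null) return value;
    if (Date.now() >= deadline) return null;
    await new Promise((resolve) => setTimeout(resolve, opts.interval));
  }
}
async function findLeafSketch(
  snapshot: () => Promise<AxLeafSketch[]>,
  predicate: (el: AxLeafSketch) => boolean,
  timeoutMs = 5_000,
): Promise<AxLeafSketch | null> {
  return retryUntilSketch(
    async () => (await snapshot()).find(predicate) ?? null,
    { timeout: timeoutMs, interval: 200 },
  );
}
// A caller that wants a hard failure wraps the null return itself:
//   const codeTab = await findLeafSketch(snap, isCodeButton);
//   if (!codeTab) throw new Error('Code tab button not found');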
// Same shape as `waitForAxNode` but returns every match rather than
// the first. Useful for consumers that want to enumerate all menu
// items or all compact pills after a stability point — the
// findCompactPills caller in claudeai.ts is a one-shot snapshot
// today, but if a consumer needs to wait for "at least one compact
// pill" plus enumerate the resulting set, this avoids a second
// round-trip.
//
// Returns the non-empty array of matches on success, null on timeout
// when no element ever matched. A successful call with zero matches
// is impossible by construction: the loop only resolves once the
// post-filter array is non-empty.
export async function waitForAxNodes(
inspector: InspectorClient,
predicate: (el: RawElement) => boolean,
opts: WaitForAxNodeOptions = {},
): Promise<RawElement[] | null> {
const stabilityGate = opts.stabilityGate ?? true;
if (stabilityGate) {
await waitForAxTreeStable(inspector, {
minNodes: 1,
timeoutMs: opts.stabilityTimeoutMs ?? 5_000,
});
}
return retryUntil(
async () => {
const elements = await snapshotAx(inspector, {
fast: true,
urlFilter: opts.urlFilter,
});
const matches = elements.filter(predicate);
return matches.length > 0 ? matches : null;
},
{
timeout: opts.timeoutMs ?? 5_000,
interval: opts.intervalMs ?? 200,
},
);
}

@@ -0,0 +1,397 @@
// claude.ai renderer-UI domain wrapper — single point of coupling to
// upstream's accessibility tree for tests that drive the renderer.
//
// Why centralize: claude.ai's UI ships from a different release train
// than the Electron shell, so any cross-spec drift would be an N-file
// fix. Confining the discovery here means the rest of the harness can
// speak in domain verbs (`activate('Code')`, `openEnvPill()`, …) and
// we only retune one file when upstream drifts.
//
// Discovery substrate is Chromium's accessibility tree
// (`Accessibility.getFullAXTree` over CDP), shared with the v7 walker.
// Reading from AX rather than the DOM means the page-objects survive
// tailwind class regeneration and React-tree restructuring as long as
// the platform-computed role + accessible name + ancestor landmarks
// stay stable. See docs/learnings/test-harness-ax-tree-walker.md for
// the gotchas (AX-enable async lag, post-click stability gating, list
// virtualization).
//
// Discrimination shapes used:
// - Top-level tabs: `role: 'button'` whose accessibleName matches
// the literal tab label ('Chat' | 'Cowork' | 'Code'). The
// `df-pill` tailwind anchor and `aria-label` selector are gone —
// the AX-computed name is the durable contract.
// - Compact pills (the env pill on Code, the "Select folder…" pill
// after Local is chosen): `role: 'button'` with
// `hasPopup === 'menu'`, scoped away from the cowork sidebar by
// filtering out per-row `^More options for ` triggers. The visible
// label is the button's accessibleName.
// - Menu items: any of `menuitem` / `menuitemradio` /
// `menuitemcheckbox` (collected as MENU_ITEM_ROLES below).
import type { InspectorClient } from './inspector.js';
import {
snapshotAx,
waitForAxNode,
waitForAxNodes,
waitForAxTreeStable,
} from './ax.js';
import { retryUntil, sleep } from './retry.js';
// All three CDP-exposed menu-item variants. Caller code wants to treat
// them uniformly — radios and checkboxes are still "items in an open
// menu the user can pick".
const MENU_ITEM_ROLES = new Set<string>([
'menuitem',
'menuitemradio',
'menuitemcheckbox',
]);
// AccessibleName patterns that indicate a per-row trigger button on
// the cowork sidebar (~70+ of them on a busy account). They share the
// same `hasPopup: 'menu'` signal as the compact pills we actually
// want, so excluding them by name is the load-bearing discriminator.
const ROW_MORE_OPTIONS_RE = /^More options for /;
// `snapshotAx` and the stability gate are now in `lib/ax.ts` —
// extracted there in session 13 once T26 had to redefine the same
// helper inline (two consumers = threshold-driven extraction). Page-
// objects below import via the lib aliases; consumers outside this
// file should reach for `lib/ax.ts` directly rather than re-importing
// through `lib/claudeai.ts`.
// One of the three top-level pills. Click is fire-and-forget — the
// router rerenders the tab body inline (no URL change on Code), so
// callers must poll for whatever signal indicates *their* next step is
// ready (e.g. CodeTab.activate polls for the env pill).
//
// AX-tree match: `role: 'button'` with the literal tab name as the
// accessible name. The visible label and aria-label happen to coincide
// today, and the AX-computed name follows the same cascade — pinning
// to the name keeps the page-object durable across the tailwind
// regenerations that motivated the migration.
//
// Pre-click polling budget. Up to session 13, this was a one-shot
// snapshot — if the tab button hadn't rendered yet when activateTab
// was called, the function returned `{ clicked: false }` immediately.
// Session 13's `waitForAxNode` substrate makes "wait for the button to
// appear" a one-line shape-only change. Default 5000ms matches the
// `lib/ax.ts` defaults; callers that previously relied on the no-retry
// shape pass `timeout: 0` (e.g. via `waitForAxNode`'s timeoutMs) to
// keep the old behaviour, though no caller currently does so. T16
// passes 15s through `CodeTab.activate({ timeout })`; that budget is
// still spent on the post-click pill poll, and the pre-click polling
// budget is independent.
export async function activateTab(
inspector: InspectorClient,
name: 'Chat' | 'Cowork' | 'Code',
opts: { timeout?: number } = {},
): Promise<{ clicked: boolean }> {
const target = await waitForAxNode(
inspector,
(el) =>
el.computedRole === 'button' && el.accessibleName === name,
{ timeoutMs: opts.timeout ?? 5_000 },
);
if (!target || target.backendDOMNodeId === null) {
return { clicked: false };
}
await inspector.clickByBackendNodeId('claude.ai', target.backendDOMNodeId);
return { clicked: true };
}
// A "compact pill" — the React component used by both the env pill and
// the "Select folder…" pill. AX shape: `role: 'button'` with
// `hasPopup === 'menu'`, scoped away from cowork sidebar row triggers
// (`/^More options for /`). The tailwind `max-w-[Npx]` field used to
// be carried as a diagnostic in v6; that signal isn't in the AX tree
// (and it was tailwind-specific, exactly the kind of thing the
// migration was meant to drop), so it's gone — callers only used it
// in error messages.
export interface CompactPill {
text: string;
}
export async function findCompactPills(
inspector: InspectorClient,
): Promise<CompactPill[]> {
const elements = await snapshotAx(inspector);
return elements
.filter(
(el) =>
el.computedRole === 'button' &&
el.hasPopup === 'menu' &&
el.accessibleName !== null &&
el.accessibleName.length > 0 &&
!ROW_MORE_OPTIONS_RE.test(el.accessibleName),
)
.map((el) => ({ text: el.accessibleName as string }));
}
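// Illustrative sketch, not part of the harness API: the compact-pill
// discriminator from findCompactPills as a standalone predicate over a
// minimal element shape. `PillCandidateSketch`, `isCompactPillSketch`
// and `ROW_TRIGGER_SKETCH_RE` are hypothetical names; the regex
// mirrors ROW_MORE_OPTIONS_RE above.
interface PillCandidateSketch {
  computedRole: string;
  hasPopup: string | null;
  accessibleName: string | null;
}
const ROW_TRIGGER_SKETCH_RE = /^More options for /;
function isCompactPillSketch(el: PillCandidateSketch): boolean {
  return (
    el.computedRole === 'button' &&
    el.hasPopup === 'menu' &&
    el.accessibleName !== null &&
    el.accessibleName.length > 0 &&
    !ROW_TRIGGER_SKETCH_RE.test(el.accessibleName)
  );
}
// A menu-trigger button named 'Local' passes; a sidebar row trigger
// named 'More options for Foo' and a plain 'Local' action button
// (hasPopup null) do not.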
// Open a compact pill whose accessibleName matches `labelPattern`.
// Discrimination: `role: 'button'` AND `hasPopup === 'menu'` AND the
// AX-computed name passes the regex. The hasPopup gate is what stops
// us trial-clicking action buttons that happen to share text with a
// pill — the pill always carries an aria-haspopup contract (it opens
// a popover) while a same-named action button does not.
//
// Polls the AX tree post-click for the menu to render (any role in
// MENU_ITEM_ROLES). Returns the rendered menu item names so the caller
// can validate without a second snapshot round-trip.
export async function openPill(
inspector: InspectorClient,
labelPattern: RegExp,
opts: { timeout?: number } = {},
): Promise<{ opened: boolean; items: string[] }> {
const timeout = opts.timeout ?? 5000;
const elements = await snapshotAx(inspector);
const target = elements.find(
(el) =>
el.computedRole === 'button' &&
el.hasPopup === 'menu' &&
el.accessibleName !== null &&
labelPattern.test(el.accessibleName),
);
if (!target || target.backendDOMNodeId === null) {
return { opened: false, items: [] };
}
await inspector.clickByBackendNodeId('claude.ai', target.backendDOMNodeId);
// Menu render is async and the AX tree lags DOM by hundreds of ms
// (see docs/learnings/test-harness-ax-tree-walker.md §1). Gate
// once on stability post-click, then poll fast — re-gating on every
// iteration would burn 800ms+ each cycle waiting for "no change"
// when what we want is "menuitems appear".
await waitForAxTreeStable(inspector, { minNodes: 1, timeoutMs: 5_000 });
const deadline = Date.now() + timeout;
while (Date.now() < deadline) {
const post = await snapshotAx(inspector, { fast: true });
const items = post.filter((el) => MENU_ITEM_ROLES.has(el.computedRole));
if (items.length > 0) {
return {
opened: true,
items: items.map((el) => (el.accessibleName ?? '').slice(0, 80)),
};
}
await sleep(100);
}
return { opened: false, items: [] };
}
// Click any menuitem (any of MENU_ITEM_ROLES) whose accessibleName
// matches `textPattern`. Caller opens the menu first. Polls the AX
// snapshot — menu render is async and the AX tree lags DOM by
// hundreds of ms.
//
// Returns the matched item's text and the full item list at the time
// of the match — the second is useful for diagnostics when `clicked`
// is null.
export async function clickMenuItem(
inspector: InspectorClient,
textPattern: RegExp,
opts: { timeout?: number } = {},
): Promise<{ clicked: string | null; items: string[] }> {
const timeout = opts.timeout ?? 1500;
// Caller has just opened a menu — gate once on stability so the
// first iteration sees the populated tree, then poll fast for the
// match. Same shape as openPill's post-click handling.
await waitForAxTreeStable(inspector, { minNodes: 1, timeoutMs: 5_000 });
const deadline = Date.now() + timeout;
let lastItemNames: string[] = [];
while (Date.now() < deadline) {
const elements = await snapshotAx(inspector, { fast: true });
const items = elements.filter((el) =>
MENU_ITEM_ROLES.has(el.computedRole),
);
lastItemNames = items.map((el) => (el.accessibleName ?? '').slice(0, 80));
const match = items.find(
(el) =>
el.accessibleName !== null && textPattern.test(el.accessibleName),
);
if (match && match.backendDOMNodeId !== null) {
const text = (match.accessibleName ?? '').slice(0, 80);
await inspector.clickByBackendNodeId(
'claude.ai',
match.backendDOMNodeId,
);
return { clicked: text, items: lastItemNames };
}
await sleep(100);
}
return { clicked: null, items: lastItemNames };
}
// Dispatch an Escape keydown to the document. Used by openEnvPill's
// trial-click loop to dismiss the menu when the wrong pill was hit.
// We dispatch on document because the popover trigger may not have
// retained focus.
export async function pressEscape(inspector: InspectorClient): Promise<void> {
await inspector.evalInRenderer<null>(
'claude.ai',
`(() => {
document.dispatchEvent(new KeyboardEvent('keydown', {
key: 'Escape', code: 'Escape', keyCode: 27, which: 27,
bubbles: true, cancelable: true,
}));
return null;
})()`,
);
}
// Code tab domain operations. Instance-shaped (carries the inspector)
// to match QuickEntry / MainWindow in quickentry.ts.
//
// Only valid after the renderer has loaded a logged-in claude.ai page;
// callers should `app.waitForReady('userLoaded')` first. activate()
// itself doesn't repeat that check — it would just fail to find the
// Code button on /login, which surfaces as a clear error.
export class CodeTab {
constructor(private readonly inspector: InspectorClient) {}
// Click the Code tab, then poll up to `timeout` for at least one
// compact pill to render. The env pill rendering is the cheapest
// signal that the Code-tab body has mounted and is interactive —
// the URL doesn't change (route stays `/new` etc.), so we can't
// anchor on navigation. Throws on miss with the candidate count for
// triage.
//
// Session 14 migration: the pre-click `activateTab` call now polls
// up to `opts.timeout` for the Code button itself to appear
// (previously a one-shot snapshot, the T16 failure mode). The same
// `timeout` value is passed to each phase separately rather than
// shared, so the worst case is roughly twice `timeout`; in practice
// the click resolves in well under a second when the Code button is
// present, so the post-click pill poll dominates the wait.
async activate(opts: { timeout?: number } = {}): Promise<void> {
const timeout = opts.timeout ?? 5000;
const result = await activateTab(this.inspector, 'Code', { timeout });
if (!result.clicked) {
throw new Error(
'CodeTab.activate: no AX-tree button with accessibleName="Code" found',
);
}
// Post-click: poll the AX tree for at least one compact pill.
// `waitForAxNodes` carries the snapshot+filter+sleep loop
// formerly hand-rolled here, with the same per-iteration cadence
// (200ms) and overall budget. Predicate matches `findCompactPills`
// — `role: 'button'` + `hasPopup: 'menu'` + non-empty
// accessibleName + not a per-row "More options for X" trigger.
const ready = await waitForAxNodes(
this.inspector,
(el) =>
el.computedRole === 'button' &&
el.hasPopup === 'menu' &&
el.accessibleName !== null &&
el.accessibleName.length > 0 &&
!ROW_MORE_OPTIONS_RE.test(el.accessibleName),
{ timeoutMs: timeout, intervalMs: 200 },
);
if (!ready) {
throw new Error(
`CodeTab.activate: no compact pill rendered within ${timeout}ms ` +
`after clicking Code — tab body may not have mounted`,
);
}
}
// Open the env pill (the compact pill whose menu contains a `^Local`
// menuitemradio). Trial-click strategy: for each compact pill, try
// opening it and check for the Local item. If absent, dismiss with
// Escape and try the next. Necessary because nothing in the DOM
// distinguishes the env pill from a future second compact pill at
// rest — only the menu contents disambiguate.
//
// Returns the matched pill's label text and the rendered menu
// items. Throws if no candidate yields a Local-bearing menu.
async openEnvPill(): Promise<{ pillText: string; items: string[] }> {
const pills = await findCompactPills(this.inspector);
if (pills.length === 0) {
throw new Error(
'CodeTab.openEnvPill: no compact pills on the page — ' +
'did you call activate() first?',
);
}
// Iterate by label rather than DOM index so we can use openPill
// with an exact-text anchor — avoids re-querying ordinals after
// each Escape (the DOM may shift).
for (const pill of pills) {
const labelRe = new RegExp(`^${escapeRegExp(pill.text)}$`);
const opened = await openPill(this.inspector, labelRe, { timeout: 1500 });
if (!opened.opened) continue;
const hasLocal = opened.items.some((t) => /^Local\b/.test(t));
if (hasLocal) {
return { pillText: pill.text, items: opened.items };
}
await pressEscape(this.inspector);
// Brief settle so the next openPill doesn't race the popover
// teardown. 150ms matches the original T17 implementation.
await sleep(150);
}
throw new Error(
`CodeTab.openEnvPill: probed ${pills.length} compact pill(s), ` +
`none yielded a menu containing /^Local\\b/`,
);
}
// Click the `^Local` menuitemradio inside the (already-open) env-pill
// menu. textContent reads "Local, environment settings, right arrow"
// because of the SR-only suffix; we anchor on /^Local\b/.
async selectLocal(): Promise<void> {
const result = await clickMenuItem(this.inspector, /^Local\b/);
if (!result.clicked) {
throw new Error(
`CodeTab.selectLocal: no /^Local\\b/ item in the open menu. ` +
`Items: ${JSON.stringify(result.items)}`,
);
}
}
// Full chain: open env pill → Local → wait for the "Select folder…"
// pill to render → open it → click "Open folder…". After this
// resolves, dialog.showOpenDialog has been invoked (the caller
// installs the mock first and polls getOpenDialogCalls to confirm).
//
// Each step throws on its own miss with enough metadata to tell
// which selector decayed; the caller can wrap the whole chain in
// try/catch for partial-state attachment.
async openFolderPicker(): Promise<void> {
await this.openEnvPill();
await this.selectLocal();
// The Select-folder pill renders after Local is chosen. Same
// CompactPill shape — anchor on the leading "Select folder"
// text. 4s budget matches the T17 wait that proved sufficient
// in practice on KDE-W.
const selectOpened = await retryUntil(
async () => {
const r = await openPill(this.inspector, /^Select folder/, {
timeout: 1000,
});
return r.opened ? r : null;
},
{ timeout: 4000, interval: 200 },
);
if (!selectOpened) {
throw new Error(
'CodeTab.openFolderPicker: "Select folder…" pill did not ' +
'open within 4s after Local was clicked',
);
}
// The Select-folder menu has a "Recent" group (radios — clicking
// reuses the past path silently, no dialog) followed by
// "Open folder…" (menuitem — fires the picker). Click the
// menuitem variant explicitly; clickMenuItem matches all
// menuitem* roles, so the leading-text anchor is what
// disambiguates here.
const openClicked = await clickMenuItem(this.inspector, /^Open folder/);
if (!openClicked.clicked) {
throw new Error(
`CodeTab.openFolderPicker: no /^Open folder/ menuitem in ` +
`the Select-folder menu. Items: ${JSON.stringify(openClicked.items)}`,
);
}
}
}
// Standard "escape regex special chars in a literal string" helper.
// Used to build an exact-match RegExp from a captured pill label.
function escapeRegExp(s: string): string {
return s.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}
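// Illustration only, not part of this commit: why `openEnvPill`
// escapes the captured label before building its exact-match RegExp.
// The sample label is invented; the helper body is copied from
// `escapeRegExp` above under a different name.

```typescript
// Same body as escapeRegExp above, renamed to mark it as a sketch.
function escapeRegExpSketch(s: string): string {
  return s.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Invented pill label containing regex metacharacters.
const pillLabel = 'Select folder… (v1.2)';

// Escaped: parentheses and the dot are matched literally.
const exactRe = new RegExp(`^${escapeRegExpSketch(pillLabel)}$`);

// Unescaped: "(v1.2)" becomes a capture group and "." a wildcard.
const naiveRe = new RegExp(`^${pillLabel}$`);

const exactMatchesLabel = exactRe.test(pillLabel);
const naiveMatchesLabel = naiveRe.test(pillLabel);
const naiveMatchesOther = naiveRe.test('Select folder… v1x2');
```

// The unescaped pattern fails on the label it was built from and
// accepts a different string, which is the failure the helper prevents.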

@@ -0,0 +1,40 @@
import { sessionBus, type MessageBus, type ClientInterface } from 'dbus-next';
let cached: MessageBus | null = null;
export function getSessionBus(): MessageBus {
if (!cached) {
cached = sessionBus();
}
return cached;
}
export async function disconnectBus(): Promise<void> {
if (cached) {
cached.disconnect();
cached = null;
}
}
// dbus-next exposes interface methods as dynamic properties typed loosely. Cast
// at the call site rather than re-typing every D-Bus interface we touch.
type DynamicMethod = (...args: unknown[]) => Promise<unknown>;
export function method(iface: ClientInterface, name: string): DynamicMethod {
const fn = (iface as unknown as Record<string, DynamicMethod | undefined>)[name];
if (typeof fn !== 'function') {
throw new Error(`D-Bus method ${name} not found on interface`);
}
return fn.bind(iface);
}
export async function getConnectionPid(connectionName: string): Promise<number> {
const bus = getSessionBus();
const proxy = await bus.getProxyObject(
'org.freedesktop.DBus',
'/org/freedesktop/DBus',
);
const iface = proxy.getInterface('org.freedesktop.DBus');
const result = await method(iface, 'GetConnectionUnixProcessID')(connectionName);
return result as number;
}
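// Illustration only, not part of this commit: the lookup-and-bind
// pattern used by `method()` above, exercised against a plain object
// instead of a real dbus-next `ClientInterface`. `lookupMethod`,
// `fakeIface`, and the PID table are all invented for the sketch.

```typescript
type DynamicMethodSketch = (...args: unknown[]) => Promise<unknown>;

// Look a method up by name, fail loudly if absent, and bind `this`.
function lookupMethod(iface: object, methodName: string): DynamicMethodSketch {
  const fn = (iface as unknown as Record<string, unknown>)[methodName];
  if (typeof fn !== 'function') {
    throw new Error(`D-Bus method ${methodName} not found on interface`);
  }
  return (fn as DynamicMethodSketch).bind(iface);
}

// Stand-in interface object; the method body relies on `this`, so an
// unbound reference to it would fail at call time.
const fakeIface = {
  pidTable: new Map<string, number>([[':1.42', 4242]]),
  async GetConnectionUnixProcessID(conn: unknown): Promise<unknown> {
    return this.pidTable.get(String(conn));
  },
};

const boundLookup = lookupMethod(fakeIface, 'GetConnectionUnixProcessID');

let missingMethodError = '';
try {
  lookupMethod(fakeIface, 'NoSuchMethod');
} catch (err) {
  missingMethodError = (err as Error).message;
}
```

// Binding at lookup time is what lets call sites write
// `method(iface, 'Name')(...)` without carrying `iface` around.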

@@ -0,0 +1,65 @@
import { readFile } from 'node:fs/promises';
import { homedir } from 'node:os';
import { join } from 'node:path';
import { execFile } from 'node:child_process';
import { promisify } from 'node:util';
const exec = promisify(execFile);
const LAUNCHER_LOG = join(
homedir(),
'.cache/claude-desktop-debian/launcher.log',
);
export async function readLauncherLog(): Promise<string | null> {
try {
return await readFile(LAUNCHER_LOG, 'utf8');
} catch {
return null;
}
}
export interface DoctorResult {
output: string;
exitCode: number | null;
}
export async function runDoctor(launcher?: string): Promise<DoctorResult> {
const bin = launcher ?? process.env.CLAUDE_DESKTOP_LAUNCHER ?? 'claude-desktop';
try {
const { stdout, stderr } = await exec(bin, ['--doctor'], { timeout: 15_000 });
return {
output: `${stdout}\n${stderr}`.trim(),
exitCode: 0,
};
} catch (err) {
// --doctor may exit non-zero if checks fail; still return the output
// and the actual exit code so T02/T13/S05 can assert against it.
const e = err as { stdout?: string; stderr?: string; code?: number };
const combined = `${e.stdout ?? ''}\n${e.stderr ?? ''}`.trim();
return {
output: combined,
exitCode: typeof e.code === 'number' ? e.code : null,
};
}
}
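// Illustration only, not part of this commit: how the catch branch of
// `runDoctor` normalizes a failed `execFile` call. The error objects
// below are hand-written stand-ins; the assumption is that a non-zero
// exit yields a numeric `code`, while a spawn failure yields a string
// such as 'ENOENT'.

```typescript
// Hypothetical subset of the error shape runDoctor casts to.
interface ExecFailureSketch {
  stdout?: string;
  stderr?: string;
  code?: number | string | null;
}

interface DoctorResultSketch {
  output: string;
  exitCode: number | null;
}

// Mirrors the catch branch: join the streams, keep only numeric codes.
function toDoctorResult(err: ExecFailureSketch): DoctorResultSketch {
  return {
    output: `${err.stdout ?? ''}\n${err.stderr ?? ''}`.trim(),
    exitCode: typeof err.code === 'number' ? err.code : null,
  };
}

// A doctor run whose checks failed: output survives, exit code is 1.
const failedChecks = toDoctorResult({
  stdout: 'check A: FAIL',
  stderr: '',
  code: 1,
});

// A launcher that could not be spawned: no output, no exit code.
const missingBinary = toDoctorResult({ code: 'ENOENT' });
```

// A spec can therefore tell "doctor ran and reported failures"
// (numeric exitCode) from "doctor never ran" (null exitCode).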
export function captureSessionEnv(): Record<string, string> {
const keys = [
'XDG_SESSION_TYPE',
'XDG_CURRENT_DESKTOP',
'WAYLAND_DISPLAY',
'DISPLAY',
'GDK_BACKEND',
'QT_QPA_PLATFORM',
'OZONE_PLATFORM',
'ELECTRON_OZONE_PLATFORM_HINT',
'CLAUDE_DESKTOP_LAUNCHER',
];
const out: Record<string, string> = {};
for (const k of keys) {
const v = process.env[k];
if (v !== undefined) out[k] = v;
}
return out;
}

@@ -0,0 +1,413 @@
// "eipc" channel-registry primitive — runtime discovery of the custom
// `$eipc_message$_<UUID>_$_<scope>_$_<iface>_$_<method>` handlers
// registered on each per-webContents IPC scope.
//
// Why this exists
// ---------------
// Sessions 2-6 of the runner-implementation work treated the eipc
// registry as unreachable from main: the standard Electron
// `ipcMain._invokeHandlers` map only carries 3 chat-tab MCP-bridge
// handlers (`list-mcp-servers`, `connect-to-mcp-server`,
// `request-open-mcp-settings`); the 700+ `claude.web_$_*` /
// `claude.settings_$_*` etc. channels were assumed to be closure-
// local. Session 3's `globalThis` walk came up empty, which kept
// T22/T31/T33/T38 stuck as Tier 1 asar fingerprints rather than
// runtime registry probes.
//
// Session 7 found the missing piece: handlers DO go through
// Electron's stdlib `IpcMainImpl` — just not the GLOBAL `ipcMain`
// instance. Each `webContents` has its own `webContents.ipc` (per-
// `WebContents` IPC scope, introduced in Electron 17+), and that's
// where every `e.ipc.handle("$eipc_message$_..._$_<scope>_$_<iface>_$_<method>", fn)`
// call lands. Verified empirically against a debugger-attached
// running Claude:
// - find_in_page wc: 78 handlers (settings/find-in-page only)
// - main_window wc: 79 handlers (settings/title-bar only)
// - claude.ai wc: 490 handlers (full surface — including
// 117 LocalSessions, 16 CustomPlugins)
// - global ipcMain: 3 handlers (the chat-tab MCP-bridge trio)
//
// All `claude.web_$_*` interfaces (LocalSessions, CustomPlugins,
// CoworkSpaces, CoworkArtifacts, CoworkMemory, ClaudeCode, etc.)
// register on the claude.ai webContents. They're sticky across route
// changes — once registered (during webContents init), they don't
// deregister when the user navigates between /chats and /epitaxy.
// So the wait-for-channel poll just needs claude.ai to be alive +
// finished initial handler registration, NOT a specific route.
//
// What this primitive does
// ------------------------
// Read-only enumeration via `getEipcChannels` / `findEipcChannel` /
// `waitForEipcChannel(s)`. This enables handler PRESENCE checks
// (T22b / T31b / T33b / T38b), which are strictly stronger than the
// asar fingerprint: a handler registered at runtime is a handler
// that actually wired up, not just a string in the bundle.
//
// Plus `invokeEipcChannel` (session 8 addition) — calls a registered
// handler through the renderer-side wrapper at `window['claude.<scope>']
// .<Iface>.<method>(...args)`. The wrapper is exposed by `mainView.js`
// preload via `contextBridge.exposeInMainWorld` after a frame + origin
// gate (top-level frame, origin in `{claude.ai, claude.com,
// preview.claude.ai, preview.claude.com, localhost}`). Because the
// `inspector.evalInRenderer('claude.ai', ...)` path runs inside the
// claude.ai renderer, the wrapper is present and the synthesized
// `IpcMainInvokeEvent` carries an honest `senderFrame` — the alternative
// of pulling the function out of `_invokeHandlers` and synthesizing a
// fake event with `senderFrame.url = 'https://claude.ai/'` works (the
// gates are duck-typed structural checks) but spoofs a security-relevant
// claim. Going through the wrapper keeps the test surface aligned with
// real attack surface.
//
// `invokeEipcChannel` is read-by-default but doesn't enforce a
// read-only allowlist — the safety property is that consumers pass
// case-doc-anchored suffixes verbatim, which limits the blast radius
// to whatever the case doc said the test should poke. Don't pass
// `start*` / `set*` / `write*` / `run*` / `openIn*` suffixes; those
// mutate user state.
//
// Framing opacity
// ---------------
// The `$eipc_message$_<UUID>_$_<scope>_$_<iface>_$_<method>` framing
// has been UUID-stable across builds (session 2 noted
// `c0eed8c9-c94a-4931-8cc3-3a08694e9863`; session 7 confirmed it's
// still that, single UUID across all 647 per-wc handlers). The
// primitive does not pin the UUID — match by suffix so a future
// build that rotates the UUID doesn't silently break every consuming
// spec. Suffix matching is also what the case-doc anchors use
// (`LocalSessions_$_getPrChecks` etc.), so consumers can pass the
// case-doc string verbatim.
import { retryUntil } from './retry.js';
import type { InspectorClient } from './inspector.js';
// One handler entry on a webContents. `suffix` is the part after the
// UUID — `<scope>_$_<iface>_$_<method>` — useful for dedup / display.
// `fullKey` is the full registry key including the framing prefix and
// UUID, kept for diagnostic attachments where the raw form matters
// (drift detection, regression triage). `webContentsId` lets a caller
// disambiguate when a future scope registers the same suffix on
// multiple webContents (today only `claude.settings/*` does this and
// every wc gets the same set; non-issue for current consumers).
export interface EipcChannel {
suffix: string;
fullKey: string;
webContentsId: number;
webContentsUrl: string;
}
export interface GetEipcChannelsOptions {
// Substring match on `webContents.getURL()`. Default: 'claude.ai'.
// Pass an empty string to enumerate every webContents.
urlFilter?: string;
// Optional scope filter — e.g. 'claude.web' to drop settings-
// scope handlers. Matched against the segment immediately after
// the UUID. Empty / undefined returns all scopes.
scope?: string;
// Optional interface filter — e.g. 'LocalSessions'. Matched
// against the segment after the scope. Empty / undefined returns
// all interfaces.
iface?: string;
}
// Internal: shape returned by the inspector eval below. Kept private
// so the `EipcChannel` interface above is the public type contract.
interface RawEntry {
wcId: number;
wcUrl: string;
fullKey: string;
}
// Enumerate every eipc-framed handler key registered on every matching
// webContents. The UUID is opaque to the caller — only the suffix
// (`<scope>_$_<iface>_$_<method>`) is exposed via the EipcChannel
// type. Filtering by `scope` / `iface` happens after the inspector
// eval (the eval keeps its filter set minimal so a single eval call
// covers every consumer's needs).
//
// Returns an empty array when no matching webContents exists (e.g.
// the spec called this before claude.ai loaded). Callers that need
// a "wait until present" semantic should use `waitForEipcChannel`
// instead.
export async function getEipcChannels(
inspector: InspectorClient,
opts: GetEipcChannelsOptions = {},
): Promise<EipcChannel[]> {
const urlFilter = opts.urlFilter ?? 'claude.ai';
const raw = await inspector.evalInMain<RawEntry[]>(`
const { webContents } = process.mainModule.require('electron');
const urlFilter = ${JSON.stringify(urlFilter)};
const out = [];
for (const wc of webContents.getAllWebContents()) {
const url = wc.getURL();
if (urlFilter && !url.includes(urlFilter)) continue;
const ipc = wc.ipc;
const map = ipc && ipc._invokeHandlers;
if (!map) continue;
const keys = (typeof map.keys === 'function')
? Array.from(map.keys())
: Object.keys(map);
for (const k of keys) {
out.push({ wcId: wc.id, wcUrl: url, fullKey: k });
}
}
return out;
`);
// Match the framing prefix and capture the suffix. Anything that
// doesn't match (e.g. a non-eipc handler that snuck onto a wc
// scope) gets filtered out — only eipc-framed entries are part of
// this primitive's contract.
const re = /^\$eipc_message\$_[0-9a-f-]+_\$_(.+)$/;
const out: EipcChannel[] = [];
for (const entry of raw) {
const m = re.exec(entry.fullKey);
if (!m) continue;
const suffix = m[1]!;
if (opts.scope) {
// Suffix shape: `<scope>_$_<iface>_$_<method>`. Anchor at
// the start so 'claude.web' matches but 'web' doesn't
// match `claude.settings` etc.
if (!suffix.startsWith(`${opts.scope}_$_`)) continue;
}
if (opts.iface) {
// Interface segment is after the scope — search for
// `_$_<iface>_$_` in the suffix. Anchored separators
// avoid accidentally matching a method name that happens
// to contain the iface string.
if (!suffix.includes(`_$_${opts.iface}_$_`)) continue;
}
out.push({
suffix,
fullKey: entry.fullKey,
webContentsId: entry.wcId,
webContentsUrl: entry.wcUrl,
});
}
return out;
}
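// Illustration only, not part of this commit: the post-eval half of
// `getEipcChannels` (framing regex plus scope / iface filters) run on
// hand-built keys. `framedKey`, `parseSuffix`, and `keepSuffix` are
// names invented here, and the `claude.settings` suffix is made up;
// the UUID and the `LocalSessions_$_getPrChecks` suffix come from the
// comments in this file.

```typescript
// Framing regex copied from getEipcChannels above.
const FRAMING_RE = /^\$eipc_message\$_[0-9a-f-]+_\$_(.+)$/;

// UUID quoted in the header comment; the primitive never pins it.
const SAMPLE_UUID = 'c0eed8c9-c94a-4931-8cc3-3a08694e9863';

function framedKey(suffix: string): string {
  return '$eipc_message$_' + SAMPLE_UUID + '_$_' + suffix;
}

const sampleKeys: string[] = [
  framedKey('claude.web_$_LocalSessions_$_getPrChecks'),
  framedKey('claude.settings_$_TitleBar_$_getState'), // made-up suffix
  'list-mcp-servers', // unframed key: dropped by the regex
];

// Returns the part after the UUID, or null for non-eipc keys.
function parseSuffix(fullKey: string): string | null {
  const m = FRAMING_RE.exec(fullKey);
  return m?.[1] ?? null;
}

// Scope is anchored at the start; iface is anchored by separators.
function keepSuffix(suffix: string, scope?: string, iface?: string): boolean {
  if (scope && !suffix.startsWith(`${scope}_$_`)) return false;
  if (iface && !suffix.includes(`_$_${iface}_$_`)) return false;
  return true;
}

const parsedSuffixes = sampleKeys
  .map(parseSuffix)
  .filter((s): s is string => s !== null);

const webOnly = parsedSuffixes.filter((s) => keepSuffix(s, 'claude.web'));
const bareWeb = parsedSuffixes.filter((s) => keepSuffix(s, 'web'));
const localSessions = parsedSuffixes.filter((s) =>
  keepSuffix(s, undefined, 'LocalSessions'),
);
```

// The `bareWeb` case shows the start anchor at work: 'web' alone
// selects nothing, so a partial scope cannot match by accident.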
export interface FindEipcChannelOptions {
// Substring match on `webContents.getURL()`. Default: 'claude.ai'.
urlFilter?: string;
}
// Locate the first registered handler whose suffix ends with
// `caseDocSuffix`. Designed so callers can pass the case-doc-anchored
// string verbatim — e.g. `LocalSessions_$_getPrChecks`. Returns null
// when no match exists (caller decides whether to fail, skip, or
// retry).
//
// This is a one-shot lookup with no polling (async only because of
// the inspector round-trip); for the populate-on-init wait, use
// `waitForEipcChannel`, which wraps this in a retryUntil.
export async function findEipcChannel(
inspector: InspectorClient,
caseDocSuffix: string,
opts: FindEipcChannelOptions = {},
): Promise<EipcChannel | null> {
const channels = await getEipcChannels(inspector, {
urlFilter: opts.urlFilter,
});
for (const ch of channels) {
if (ch.suffix.endsWith(caseDocSuffix)) return ch;
}
return null;
}
export interface WaitForEipcChannelOptions {
urlFilter?: string;
// Total budget for the poll. Default 15s — the claude.ai
// webContents' initial handler registration completes within a
// second of `userLoaded` on the dev box, so 15s leaves wide
// margin for slow-cache cases.
timeoutMs?: number;
intervalMs?: number;
}
// Poll until the named channel is registered, or the budget runs out.
// Use this when the spec has just reached `waitForReady('userLoaded')`:
// the claude.ai webContents may exist but its handlers might not have
// finished registering yet. The poll is cheap (one inspector eval per
// tick + a string scan) so the default interval can be aggressive.
//
// Returns the EipcChannel on success, null on timeout. Callers that
// want a hard fail on timeout should `expect(channel, '...').not.toBeNull()`
// — the primitive doesn't throw because some specs want to surface
// missing-handler as a clean fail with diagnostics rather than an
// uncaught timeout.
export async function waitForEipcChannel(
inspector: InspectorClient,
caseDocSuffix: string,
opts: WaitForEipcChannelOptions = {},
): Promise<EipcChannel | null> {
return retryUntil(
() => findEipcChannel(inspector, caseDocSuffix, opts),
{
timeout: opts.timeoutMs ?? 15_000,
interval: opts.intervalMs ?? 250,
},
);
}
// Convenience: resolve a list of case-doc suffixes in one round-trip.
// Returns a Map keyed by the input suffix so callers can iterate the
// expected list and report per-suffix presence. Missing suffixes have
// `null` values.
//
// Single inspector call by design — the `getEipcChannels` cost is
// dominated by the eval round-trip, not the in-process filtering, so
// batching is strictly cheaper than N calls to `findEipcChannel`.
export async function findEipcChannels(
inspector: InspectorClient,
caseDocSuffixes: readonly string[],
opts: FindEipcChannelOptions = {},
): Promise<Map<string, EipcChannel | null>> {
const channels = await getEipcChannels(inspector, {
urlFilter: opts.urlFilter,
});
const out = new Map<string, EipcChannel | null>();
for (const suffix of caseDocSuffixes) {
const hit = channels.find((c) => c.suffix.endsWith(suffix));
out.set(suffix, hit ?? null);
}
return out;
}
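// Illustration only, not part of this commit: the batch resolution in
// `findEipcChannels`, reduced to plain strings, plus a helper that
// lists what is still missing. `resolveSuffixes`, `missingSuffixes`,
// and every `SideChat_$_*` name are invented for the sketch; the real
// function resolves to `EipcChannel` objects rather than strings.

```typescript
// Map each wanted suffix to the first registered suffix ending with it.
function resolveSuffixes(
  registered: readonly string[],
  wanted: readonly string[],
): Map<string, string | null> {
  const out = new Map<string, string | null>();
  for (const w of wanted) {
    out.set(w, registered.find((s) => s.endsWith(w)) ?? null);
  }
  return out;
}

// Keys whose value is still null, in insertion order.
function missingSuffixes(resolved: Map<string, string | null>): string[] {
  return Array.from(resolved.entries())
    .filter(([, hit]) => hit === null)
    .map(([wanted]) => wanted);
}

// Made-up registry state: two of three trio members registered so far.
const registeredNow = [
  'claude.web_$_SideChat_$_start',
  'claude.web_$_SideChat_$_send',
];
const wantedTrio = ['SideChat_$_start', 'SideChat_$_send', 'SideChat_$_stop'];

const partialState = resolveSuffixes(registeredNow, wantedTrio);
const stillMissing = missingSuffixes(partialState);
```

// Keying the Map by the input suffix keeps per-suffix reporting
// trivial: the caller iterates its own expected list and reads back
// either a hit or null.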
// Wait until ALL of the listed suffixes are registered, or the budget
// runs out. Useful for trios like T31's side-chat (start/send/stop) —
// the trio is load-bearing as a unit; partial registration is a fail.
//
// Returns the resolved Map on full success. On timeout, returns the
// last-observed Map (some entries may be null) so callers can surface
// the partial state in their diagnostic attachment before failing.
export async function waitForEipcChannels(
inspector: InspectorClient,
caseDocSuffixes: readonly string[],
opts: WaitForEipcChannelOptions = {},
): Promise<Map<string, EipcChannel | null>> {
let lastSnapshot = new Map<string, EipcChannel | null>();
const result = await retryUntil(
async () => {
const snap = await findEipcChannels(
inspector,
caseDocSuffixes,
opts,
);
lastSnapshot = snap;
for (const v of snap.values()) if (v === null) return null;
return snap;
},
{
timeout: opts.timeoutMs ?? 15_000,
interval: opts.intervalMs ?? 250,
},
);
return result ?? lastSnapshot;
}
export interface InvokeEipcChannelOptions {
// Renderer URL filter. Default 'claude.ai' — the only webContents
// whose origin passes the wrapper-exposure gate (`Qc()` in
// `mainView.js`: `https://claude.ai`, `https://claude.com`,
// preview.*, localhost). The `find_in_page` and `main_window`
// webContents register `claude.settings/*` handlers in their
// per-wc IPC scope but their renderers run from `file://`, so
// `window['claude.settings']` is never exposed there and invocation
// through them would need a different (main-side, fake-event)
// approach not implemented in this primitive.
urlFilter?: string;
// Inspector eval timeout. Default = InspectorClient.defaultTimeoutMs
// (30s). Read-only handlers like `getMcpServersConfig` /
// `readGlobalMemory` / `getAllScheduledTasks` return well within
// 1s on a warm app; the 30s budget is for cold-cache cases.
timeoutMs?: number;
}
// Invoke an eipc handler through the renderer-side wrapper at
// `window['claude.<scope>'].<Iface>.<method>(...args)`. The suffix is
// resolved against the per-wc registry first (same matching rules as
// `findEipcChannel` — accepts both fully-qualified
// `claude.web_$_LocalSessions_$_getPrChecks` and the more concise
// `LocalSessions_$_getPrChecks`) and the scope/iface/method triplet is
// pulled from the resolved full suffix.
//
// Why through the renderer wrapper, not a direct main-side call:
// handlers register via `e.ipc.handle(framedName, async (event, args)
// => { if (!le(event)) throw ...; return A.<method>(args); })` — the
// origin gate is inlined at registration time (variants `le`/`Vi`/`mm`
// in the bundle, all duck-typed structural checks against
// `event.senderFrame.url` and `event.senderFrame.parent === null`).
// Pulling the function out of `_invokeHandlers` and calling it with a
// synthesized event whose `senderFrame.url` is `'https://claude.ai/'`
// works (the gate is structural, not `instanceof`-checked) but spoofs
// the gate's security claim. The wrapper IS at claude.ai, so the
// synthesized event carries an honest senderFrame and the test surface
// matches real attack surface.
//
// Errors:
// - "no handler registered with suffix": the registry walk returned
// nothing matching. Same shape as `findEipcChannel` returning null;
// call `waitForEipcChannel` first if your spec needs the
// populate-on-init poll.
// - "eipc namespace missing in renderer: claude.<scope>": the wrapper
// isn't exposed on this renderer. Either the urlFilter selected a
// webContents whose origin failed `Qc()`, or the build flipped the
// scope's exposure gate. Check `evalInRenderer(urlFilter,
// 'Object.keys(window).filter(k => k.startsWith("claude."))')`.
// - String-form rejection from the renderer eval: the gate / arg-
// validator / result-validator inside the handler closure rejected.
// The framed channel name appears in the error message — use it to
// pinpoint which handler rejected.
//
// Args are JSON-marshaled into the renderer eval. Return value is
// JSON-deserialized via `evalInRenderer`'s `executeJavaScript` path.
// Non-JSON-serializable handler returns (Date, Buffer, circular refs)
// would be mangled on the way back through this primitive. None of
// the current Tier 2 case-doc consumers return such shapes; flag it
// if a future one does.
export async function invokeEipcChannel<T = unknown>(
inspector: InspectorClient,
caseDocSuffix: string,
args: readonly unknown[] = [],
opts: InvokeEipcChannelOptions = {},
): Promise<T> {
const urlFilter = opts.urlFilter ?? 'claude.ai';
const channel = await findEipcChannel(inspector, caseDocSuffix, {
urlFilter,
});
if (!channel) {
throw new Error(
`invokeEipcChannel: no handler registered with suffix ` +
`'${caseDocSuffix}' on a webContents matching ` +
`'${urlFilter}'`,
);
}
// Full suffix is `<scope>_$_<iface>_$_<method>`. Scope contains a
// dot (e.g. claude.web) but the `_$_` separator is unambiguous —
// a 3-part split gives [scope, iface, method] cleanly.
const parts = channel.suffix.split('_$_');
if (parts.length !== 3) {
throw new Error(
`invokeEipcChannel: bad suffix shape '${channel.suffix}' ` +
`(expected '<scope>_$_<iface>_$_<method>')`,
);
}
const [scope, iface, method] = parts;
const argsJson = JSON.stringify(args);
const js = `(async () => {
const ns = window[${JSON.stringify(scope)}];
if (!ns) throw new Error(
'eipc namespace missing in renderer: ' + ${JSON.stringify(scope)}
);
const ifaceObj = ns[${JSON.stringify(iface)}];
if (!ifaceObj) throw new Error(
'eipc interface missing: ' + ${JSON.stringify(iface)} +
' (under ' + ${JSON.stringify(scope)} + ')'
);
const fn = ifaceObj[${JSON.stringify(method)}];
if (typeof fn !== 'function') throw new Error(
'eipc method not a function: ' + ${JSON.stringify(method)} +
' (under ' + ${JSON.stringify(scope)} + '.' + ${JSON.stringify(iface)} + ')'
);
return await fn.apply(ifaceObj, ${argsJson});
})()`;
return inspector.evalInRenderer<T>(urlFilter, js, opts.timeoutMs);
}
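// Illustration only, not part of this commit: how a resolved full
// suffix maps onto the renderer-side wrapper path that
// `invokeEipcChannel` evaluates. `splitEipcSuffix` and
// `describeWrapperCall` are names invented here; the output string is
// a human-readable description, not the exact script the primitive
// sends.

```typescript
interface SuffixPartsSketch {
  scope: string;
  iface: string;
  methodName: string;
}

// Split '<scope>_$_<iface>_$_<method>' and reject any other shape.
function splitEipcSuffix(suffix: string): SuffixPartsSketch {
  const parts = suffix.split('_$_');
  const [scope, iface, methodName] = parts;
  if (
    parts.length !== 3 ||
    scope === undefined ||
    iface === undefined ||
    methodName === undefined
  ) {
    throw new Error(`bad suffix shape '${suffix}'`);
  }
  return { scope, iface, methodName };
}

// JSON.stringify turns the scope and args into safe JS literals.
function describeWrapperCall(suffix: string, args: readonly unknown[]): string {
  const { scope, iface, methodName } = splitEipcSuffix(suffix);
  return (
    'window[' + JSON.stringify(scope) + '].' + iface + '.' + methodName +
    '(...' + JSON.stringify(args) + ')'
  );
}

const wrapperCall = describeWrapperCall(
  'claude.web_$_LocalSessions_$_getPrChecks',
  [],
);

// The concise case-doc form has only two segments, so it must be
// resolved against the registry before splitting.
let conciseFormError = '';
try {
  splitEipcSuffix('LocalSessions_$_getPrChecks');
} catch (err) {
  conciseFormError = (err as Error).message;
}
```

// This is why the primitive calls `findEipcChannel` first: only the
// resolved `channel.suffix` is guaranteed to carry the scope segment.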

@@ -0,0 +1,206 @@
// Mock-then-call helpers for side-effecting Electron module APIs.
//
// Tests that exercise an Electron egress whose real invocation would
// touch the host system (open a file manager, launch an editor, show a
// dialog) install a recorder mock first, then invoke the API via
// `inspector.evalInMain` and assert against the recorded calls. The
// pattern strengthens "didn't throw" probes into "the egress was
// reached + the args flowed through verbatim", with no host side
// effect.
//
// Each helper:
// - is idempotent within an Electron lifecycle (guarded by a
// globalThis flag so re-installation in retry loops is a no-op),
// - records `{ ts, ...args }` into a globalThis call list,
// - returns a value matching the real API's documented contract
// (void / Promise<boolean> / canned dialog result).
//
// The companion `get*Calls()` reader returns `[]` if the mock was
// never installed (rather than throwing) so pre-install reads in
// retry loops are cheap.
//
// Extracted from `lib/claudeai.ts` once the third helper landed
// (T17 dialog → T25 showItemInFolder → T24 openExternal). These
// helpers are not claude.ai-domain — they're generic Electron module
// patches — so the extraction keeps `claudeai.ts` focused on the AX-
// tree page-objects and gives future mock-then-call tests an obvious
// home to add to.
//
// Caller pattern: see `runners/T17_folder_picker.spec.ts`,
// `runners/T25_show_item_in_folder_no_throw.spec.ts`,
// `runners/T24_open_in_editor_no_throw.spec.ts`.
import type { InspectorClient } from './inspector.js';
// ----- dialog.showOpenDialog -----------------------------------------
// Replace dialog.showOpenDialog with a mock that records every call
// and returns a canned result. Idempotent: re-installing within the
// same Electron lifecycle is a no-op (guarded by
// globalThis.__claudeAiDialogMockInstalled), so a different
// `cannedResult` passed on a later call is silently ignored. Mirrors
// the shape of QuickEntry.installInterceptor (quickentry.ts:86) so
// callers across libs feel consistent.
//
// The first BrowserWindow positional arg is optional in Electron's
// API, so the mock handles both `showOpenDialog(opts)` and
// `showOpenDialog(window, opts)` shapes.
export async function installOpenDialogMock(
inspector: InspectorClient,
cannedResult: { canceled: boolean; filePaths: string[] } = {
canceled: false,
filePaths: ['/tmp/claude-test-folder'],
},
): Promise<void> {
const canned = JSON.stringify(cannedResult);
await inspector.evalInMain<null>(`
if (globalThis.__claudeAiDialogMockInstalled) return null;
const { dialog } = process.mainModule.require('electron');
globalThis.__claudeAiDialogCalls = [];
const original = dialog.showOpenDialog.bind(dialog);
dialog.showOpenDialog = async function(...args) {
const browserWindowArg = args[0]
&& typeof args[0] === 'object'
&& args[0].constructor
&& args[0].constructor.name === 'BrowserWindow';
const opts = browserWindowArg ? args[1] : args[0];
globalThis.__claudeAiDialogCalls.push({
ts: Date.now(),
nargs: args.length,
title: opts && opts.title,
properties: opts && opts.properties,
});
return ${canned};
};
void original;
globalThis.__claudeAiDialogMockInstalled = true;
return null;
`);
}
export interface OpenDialogCall {
ts: number;
nargs: number;
title?: string;
properties?: string[];
}
// Read the recorded call list. Returns [] if the mock was never
// installed (rather than throwing) — pre-install reads in retry
// loops stay cheap.
export async function getOpenDialogCalls(
inspector: InspectorClient,
): Promise<OpenDialogCall[]> {
return await inspector.evalInMain<OpenDialogCall[]>(
`return globalThis.__claudeAiDialogCalls || []`,
);
}
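// Usage sketch (illustrative, not taken from a runner): install the
// mock, trigger the dialog, then assert on the recorded calls. The
// trigger here is a direct main-process call through evalInMain,
// which is an assumption for brevity; a real runner would more
// likely drive the UI instead. The title and properties values are
// made up.
//
//   await installOpenDialogMock(inspector);
//   await inspector.evalInMain<null>(`
//     const { dialog } = process.mainModule.require('electron');
//     void dialog.showOpenDialog({ title: 'Pick', properties: ['openDirectory'] });
//     return null;
//   `);
//   const calls = await getOpenDialogCalls(inspector);
//   // calls[0].nargs === 1 and calls[0].title === 'Pick', because the
//   // mock pushes its record synchronously before it resolves.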
// ----- shell.showItemInFolder ----------------------------------------
// Replace electron.shell.showItemInFolder with a mock that records
// every call without performing the underlying DBus FileManager1 /
// xdg-open dispatch. Same idempotency-flag pattern as
// installOpenDialogMock.
//
// Why mock vs. invoke real: `showItemInFolder` is fire-and-forget on
// Linux (returns void, no success signal). Invoking it for real opens
// the host's actual file manager — fine in a click-chain test, but
// disruptive when the assertion is just "the JS-level call is
// reachable + accepts a path arg + the IPC layer terminates here".
// The mock keeps the same assertion shape with no host side effect.
export async function installShowItemInFolderMock(
inspector: InspectorClient,
): Promise<void> {
await inspector.evalInMain<null>(`
if (globalThis.__claudeAiShowItemMockInstalled) return null;
const { shell } = process.mainModule.require('electron');
globalThis.__claudeAiShowItemCalls = [];
const original = shell.showItemInFolder.bind(shell);
shell.showItemInFolder = function(fullPath) {
globalThis.__claudeAiShowItemCalls.push({
ts: Date.now(),
path: typeof fullPath === 'string' ? fullPath : String(fullPath),
});
// Return undefined like the real method — callers don't
// inspect the return value.
};
void original;
globalThis.__claudeAiShowItemMockInstalled = true;
return null;
`);
}
export interface ShowItemInFolderCall {
ts: number;
path: string;
}
export async function getShowItemInFolderCalls(
inspector: InspectorClient,
): Promise<ShowItemInFolderCall[]> {
return await inspector.evalInMain<ShowItemInFolderCall[]>(
`return globalThis.__claudeAiShowItemCalls || []`,
);
}
// ----- shell.openExternal --------------------------------------------
// Replace electron.shell.openExternal with a mock that records every
// call without performing the underlying xdg-open / scheme-handler
// dispatch. Same idempotency-flag pattern as installOpenDialogMock /
// installShowItemInFolderMock.
//
// Why mock vs. invoke real: `shell.openExternal` is the single egress
// for all URL-scheme handoffs (browser, OAuth callback, editor URL
// schemes like `vscode://file/<path>`). Invoking it for real on a
// host with the matching scheme handler installed launches the target
// app (e.g. a full VS Code window) — fine in a click-chain test,
// disruptive when the assertion is just "the JS-level call is
// reachable + the URL flowed through verbatim". The mock keeps the
// same assertion shape with no host side effect.
//
// Unlike `showItemInFolder`, `openExternal` is asynchronous: Electron
// documents it as returning `Promise<void>`. The mock is therefore an
// async function, so callers that `await` it still get a resolved
// Promise. The canned boolean it resolves with is a harness
// convenience for callers that inspect the value, not part of
// Electron's documented contract.
export async function installOpenExternalMock(
inspector: InspectorClient,
cannedResult: boolean = true,
): Promise<void> {
const canned = JSON.stringify(cannedResult);
await inspector.evalInMain<null>(`
if (globalThis.__claudeAiOpenExternalMockInstalled) return null;
const { shell } = process.mainModule.require('electron');
globalThis.__claudeAiOpenExternalCalls = [];
const original = shell.openExternal.bind(shell);
shell.openExternal = async function(url, options) {
globalThis.__claudeAiOpenExternalCalls.push({
ts: Date.now(),
url: typeof url === 'string' ? url : String(url),
options: options,
});
// Resolve with the canned value. Being async, the mock returns
// a Promise like the real method, so callers that await the
// result keep working.
return ${canned};
};
void original;
globalThis.__claudeAiOpenExternalMockInstalled = true;
return null;
`);
}
export interface OpenExternalCall {
ts: number;
url: string;
options?: unknown;
}
export async function getOpenExternalCalls(
inspector: InspectorClient,
): Promise<OpenExternalCall[]> {
return await inspector.evalInMain<OpenExternalCall[]>(
`return globalThis.__claudeAiOpenExternalCalls || []`,
);
}
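// Sketch of the idempotency guard in practice (illustrative; the
// step that makes the app call shell.openExternal is left abstract):
//
//   await installOpenExternalMock(inspector, true);
//   // ... app calls shell.openExternal once ...
//   await installOpenExternalMock(inspector, false); // no-op: flag already set
//   const calls = await getOpenExternalCalls(inspector);
//   // The earlier call is still recorded: the guard returns before
//   // __claudeAiOpenExternalCalls is reset, and the second
//   // cannedResult (false) is ignored for the same reason.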

@@ -0,0 +1,515 @@
import { spawn, execFile, type ChildProcess } from 'node:child_process';
import { existsSync, readlinkSync, rmSync } from 'node:fs';
import { homedir } from 'node:os';
import { dirname, join } from 'node:path';
import { promisify } from 'node:util';
import { sleep, retryUntil } from './retry.js';
import { findX11WindowByPid } from './wm.js';
import { InspectorClient } from './inspector.js';
import { createIsolation, type Isolation } from './isolation.js';
import { MainWindow, waitForUserLoaded } from './quickentry.js';
const exec = promisify(execFile);
export interface LaunchOptions {
extraEnv?: Record<string, string>;
args?: string[];
// Pass an existing Isolation to share config across multiple
// launches in one test (e.g. S35 position-memory across restart).
// Pass `null` to opt out of isolation entirely (legacy: shares
// ~/.config/Claude with the host). Default: a fresh isolation per
// launch, cleaned up on close().
isolation?: Isolation | null;
}
// Tiered readiness levels for waitForReady(). Higher levels include
// every check from lower levels. Pick the lowest level a test
// actually needs:
// - 'window' X11 window mapped (no inspector, no renderer state)
// - 'mainVisible' main shell BrowserWindow.isVisible() === true
// - 'claudeAi' any claude.ai webContents reachable (may be /login)
// - 'userLoaded' claude.ai URL past /login (lHn() precondition; the
// tightest gate before exercising QE submit paths)
export type ReadyLevel = 'window' | 'mainVisible' | 'claudeAi' | 'userLoaded';
export interface WaitForReadyOptions {
// Overall budget across all levels. Each step consumes from the
// remaining budget. Default 90_000ms covers the userLoaded path
// (~5-10s startup + main visible + 30s claude.ai load + login
// nav) with margin. Override down for cheaper levels.
timeout?: number;
}
export interface WindowReady {
wid: string;
}
export interface MainVisibleReady extends WindowReady {
inspector: InspectorClient;
}
export interface ClaudeAiReady extends MainVisibleReady {
// First claude.ai webContents URL observed. Absent if claude.ai
// never loaded within the budget — caller can treat as a skip
// (host likely not signed in).
claudeAiUrl?: string;
}
export interface UserLoadedReady extends ClaudeAiReady {
// claude.ai URL past /login. Absent if the renderer never
// navigated past the login page within the budget.
postLoginUrl?: string;
}
// Maps each level to the precise return shape its callers see.
// Conditional type rather than overloads because the implementation
// is a single closure with a union return — overloads would require
// either an unsafe cast or function-declaration overloads, both
// noisier than this.
export type ReadyResultFor<L extends ReadyLevel> =
L extends 'window' ? WindowReady :
L extends 'mainVisible' ? MainVisibleReady :
L extends 'claudeAi' ? ClaudeAiReady :
L extends 'userLoaded' ? UserLoadedReady :
never;
export interface ClaudeApp {
process: ChildProcess;
pid: number;
isolation: Isolation | null;
// Populated on close(). When the spawned Electron exits with
// non-zero `code` and was NOT killed by us (`signal === null`),
// this carries the data so a runner can `testInfo.attach()` the
// crash info without us coupling electron.ts to Playwright APIs
// or breaking the existing `await app.close()` sites that ignore
// the return value. Stays null while the proc is still running.
lastExitInfo: { code: number | null; signal: NodeJS.Signals | null } | null;
close(): Promise<void>;
waitForX11Window(timeoutMs?: number): Promise<string>;
attachInspector(timeoutMs?: number): Promise<InspectorClient>;
// Tiered "is the app ready for the kind of work this test does"
// helper. See ReadyLevel for what each level checks. Throws on
// timeout for 'window' / 'mainVisible' (hard-fail levels). For
// 'claudeAi' / 'userLoaded', returns with the corresponding field
// (claudeAiUrl, postLoginUrl) absent on timeout so callers can
// `testInfo.skip()` rather than fail when the host isn't signed in.
waitForReady<L extends ReadyLevel>(
level: L,
opts?: WaitForReadyOptions,
): Promise<ReadyResultFor<L>>;
}
// CDP auth gate: index.pre.js has
// uF(process.argv) && !qL() && process.exit(1);
// where uF matches --remote-debugging-port / --remote-debugging-pipe on argv
// and qL validates a token in CLAUDE_CDP_AUTH against a hardcoded ed25519
// public key (signed payload `${timestamp_ms}.${base64(userDataDir)}`,
// 5-minute TTL). Both Playwright's _electron.launch() and
// chromium.connectOverCDP() inject --remote-debugging-port=0 and trip the
// gate. Signing key is upstream's; we can't forge tokens.
//
// Workaround: the gate doesn't check --inspect or runtime SIGUSR1 (the
// "Developer → Enable Main Process Debugger" menu's code path). So we
// spawn without any debug-port flags (gate stays asleep), wait for the
// X11 window to appear, then send SIGUSR1 to attach the Node inspector at
// runtime. From there lib/inspector.ts gives us main-process JS eval,
// which reaches the renderer via webContents.executeJavaScript() and
// supports main-process mocks (e.g. dialog.showOpenDialog for T17).
// Default backend: X11 via XWayland. Mirrors launcher-common.sh's
// build_electron_args() X11 branch (the launcher itself isn't invoked
// because we spawn Electron directly to keep CLAUDE_CDP_AUTH out of
// the picture — see the SIGUSR1 attach comment above).
const LAUNCHER_INJECTED_FLAGS_X11 = [
'--disable-features=CustomTitlebar',
'--ozone-platform=x11',
'--no-sandbox',
];
// Native-Wayland backend, opted into by CLAUDE_HARNESS_USE_WAYLAND=1.
// Mirrors launcher-common.sh's Wayland branch (lines 132-135). Tests
// that need to drive the app under native Wayland (#226 follow-ups,
// future S07 sweep) flip the harness-level switch and every runner
// inherits this without per-spec changes.
const LAUNCHER_INJECTED_FLAGS_WAYLAND = [
'--disable-features=CustomTitlebar',
'--enable-features=UseOzonePlatform,WaylandWindowDecorations',
'--ozone-platform=wayland',
'--enable-wayland-ime',
'--wayland-text-input-version=3',
'--no-sandbox',
];
const LAUNCHER_INJECTED_ENV: Record<string, string> = {
ELECTRON_FORCE_IS_PACKAGED: 'true',
ELECTRON_USE_SYSTEM_TITLE_BAR: '1',
};
// Top-level opt-in: when CLAUDE_HARNESS_USE_WAYLAND=1, every
// launchClaude() call swaps the X11 flag set for the Wayland one and
// also exports CLAUDE_USE_WAYLAND=1 into the spawn env (so any in-app
// path that reads the launcher var stays consistent). Caller-supplied
// extraEnv still wins — a single test can override per-launch.
function harnessUseWayland(): boolean {
return process.env.CLAUDE_HARNESS_USE_WAYLAND === '1';
}
const DEFAULT_INSTALL_PATHS = [
{
electron: '/usr/lib/claude-desktop/node_modules/electron/dist/electron',
asar: '/usr/lib/claude-desktop/node_modules/electron/dist/resources/app.asar',
},
{
electron: '/opt/Claude/node_modules/electron/dist/electron',
asar: '/opt/Claude/node_modules/electron/dist/resources/app.asar',
},
];
interface AppPaths {
electron: string;
asar: string;
}
// Per-launch state needed by the SIGINT/SIGTERM cleanup. Tracks the
// child proc + isolation root so a Ctrl-C through Playwright doesn't
// leak Electron processes or the per-launch tmpdir. Stored separately
// from ClaudeApp so the signal handler doesn't reach into closure
// internals — `proc` and `root` are everything cleanup needs.
interface ActiveLaunch {
proc: ChildProcess;
// Isolation root to remove on signal. null when caller opted out
// (`isolation: null`) or supplied a shared handle (`ownsIsolation`
// false — that handle's lifetime is the test's, not ours).
root: string | null;
}
const activeLaunches = new Set<ActiveLaunch>();
let signalHandlersInstalled = false;
// Install once across every launch in the test process. Handler is
// synchronous: SIGKILL each spawned proc, rmSync each owned isolation
// root, then re-emit the signal so Playwright's own teardown still
// runs and the process actually exits. Installing a SIGINT/SIGTERM
// listener replaces Node's default exit-on-signal behavior, so
// without the re-emit the process would stay alive.
//
// Only owns processes/dirs from this module, not anything Playwright
// itself spawned, so the cleanup is safe to run in parallel with
// Playwright's teardown.
function ensureSignalHandlers(): void {
if (signalHandlersInstalled) return;
signalHandlersInstalled = true;
const cleanup = (signal: NodeJS.Signals) => {
for (const launch of activeLaunches) {
try {
launch.proc.kill('SIGKILL');
} catch {
// proc may already be dead
}
if (launch.root) {
try {
rmSync(launch.root, { recursive: true, force: true });
} catch {
// best-effort — tmpdir cleanup is not load-bearing
}
}
}
activeLaunches.clear();
// Re-emit so default disposition runs. Removing our handler
// first prevents an infinite loop.
process.removeListener('SIGINT', sigintHandler);
process.removeListener('SIGTERM', sigtermHandler);
process.kill(process.pid, signal);
};
const sigintHandler = () => cleanup('SIGINT');
const sigtermHandler = () => cleanup('SIGTERM');
process.on('SIGINT', sigintHandler);
process.on('SIGTERM', sigtermHandler);
}
function resolveInstall(): AppPaths {
const envBin = process.env.CLAUDE_DESKTOP_ELECTRON;
const envAsar = process.env.CLAUDE_DESKTOP_APP_ASAR;
if (envBin && envAsar) return { electron: envBin, asar: envAsar };
for (const candidate of DEFAULT_INSTALL_PATHS) {
if (existsSync(candidate.electron) && existsSync(candidate.asar)) {
return candidate;
}
}
throw new Error(
'Could not locate claude-desktop install. Set CLAUDE_DESKTOP_ELECTRON ' +
'and CLAUDE_DESKTOP_APP_ASAR, or install the deb/rpm package.',
);
}
// Mirrors the pre-launch cleanup in launcher-common.sh (cleanup_orphaned_
// cowork_daemon + cleanup_stale_lock + cleanup_stale_cowork_socket).
//
// When `configDir` is provided (isolated test mode), the SingletonLock
// path is relative to that dir rather than ~/.config/Claude — the host
// config is left untouched.
export async function cleanupPreLaunch(configDir?: string): Promise<void> {
try {
await exec('pkill', ['-f', 'cowork-vm-service\\.js']);
} catch {
// pkill returns non-zero when no matches; that's fine.
}
const lockPath = configDir
? join(configDir, 'SingletonLock')
: join(homedir(), '.config/Claude/SingletonLock');
try {
const target = readlinkSync(lockPath);
const pidMatch = target.match(/-(\d+)$/);
if (pidMatch && !existsSync(`/proc/${pidMatch[1]}`)) {
rmSync(lockPath, { force: true });
}
} catch {
// Lock doesn't exist or isn't a symlink — both fine.
}
const sockPath = join(
process.env.XDG_RUNTIME_DIR ?? '/tmp',
'cowork-vm-service.sock',
);
if (existsSync(sockPath)) {
try {
rmSync(sockPath, { force: true });
} catch {
// Stale socket may already be gone.
}
}
}
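// Worked example of the SingletonLock check above. The hostname and
// pid are made up; the `<hostname>-<pid>` target shape is what the
// regex in cleanupPreLaunch assumes:
//
//   readlinkSync(lockPath)       -> 'myhost-12345'
//   target.match(/-(\d+)$/)?.[1] -> '12345'
//   existsSync('/proc/12345')    -> false, so the lock is stale and
//                                   gets removed; true leaves it alone.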
export async function launchClaude(opts: LaunchOptions = {}): Promise<ClaudeApp> {
// Isolation default: create a fresh per-launch sandbox unless the
// caller passed `null` (legacy ~/.config/Claude) or supplied a
// pre-existing handle (shared across multiple launches in one test).
let isolation: Isolation | null;
let ownsIsolation = false;
if (opts.isolation === null) {
isolation = null;
} else if (opts.isolation) {
isolation = opts.isolation;
} else {
isolation = await createIsolation();
ownsIsolation = true;
}
await cleanupPreLaunch(isolation?.configDir);
const { electron: electronBin, asar } = resolveInstall();
const appDir = dirname(dirname(dirname(dirname(electronBin))));
const useWayland = harnessUseWayland();
const launcherFlags = useWayland
? LAUNCHER_INJECTED_FLAGS_WAYLAND
: LAUNCHER_INJECTED_FLAGS_X11;
// CLAUDE_USE_WAYLAND only when the harness-level gate is on.
// Spread BEFORE opts.extraEnv so a single test can override.
const waylandEnv: Record<string, string> = useWayland
? { CLAUDE_USE_WAYLAND: '1', GDK_BACKEND: 'wayland' }
: {};
const proc = spawn(
electronBin,
[...launcherFlags, asar, ...(opts.args ?? [])],
{
cwd: appDir,
env: {
...process.env,
...LAUNCHER_INJECTED_ENV,
...(isolation?.env ?? {}),
...waylandEnv,
...opts.extraEnv,
CI: '1',
} as Record<string, string>,
stdio: 'ignore',
detached: false,
},
);
if (!proc.pid) {
if (ownsIsolation && isolation) await isolation.cleanup();
throw new Error('Failed to spawn Electron — no pid');
}
// Register signal handlers + add this launch to the active set so a
// Ctrl-C through Playwright SIGKILLs the Electron child and (if we
// own the tmpdir) rmSync's the isolation root. Owned-isolation
// signal cleanup uses dirname(configHome) — Isolation doesn't
// expose `root`, but createIsolation builds configHome as
// `<root>/config`, so the parent dir is the tmpdir to remove.
ensureSignalHandlers();
const isolationRoot =
ownsIsolation && isolation ? dirname(isolation.configHome) : null;
const launchEntry: ActiveLaunch = { proc, root: isolationRoot };
activeLaunches.add(launchEntry);
// Single-slot inspector tracking. Only one inspector ever attaches
// per launch (SIGUSR1 opens port 9229; reusing the port across
// re-attaches isn't supported). Stored so close() can release the
// WebSocket even if the runner forgets — previously every runner
// did `inspector.close(); finally app.close();` and the WS leaked
// when an `expect()` between those threw.
let trackedInspector: InspectorClient | null = null;
const waitForX11Window = async (timeoutMs = 15_000): Promise<string> => {
const wid = await retryUntil(
async () => findX11WindowByPid(proc.pid!),
{ timeout: timeoutMs, interval: 250 },
);
if (!wid) {
throw new Error(
`X11 window for pid ${proc.pid} did not appear within ${timeoutMs}ms`,
);
}
return wid;
};
const attachInspector = async (timeoutMs = 15_000): Promise<InspectorClient> => {
// Send SIGUSR1 to open the Node inspector at runtime — same code
// path as Developer → Enable Main Process Debugger menu item.
// Then poll http://127.0.0.1:9229/json/list until it answers.
process.kill(proc.pid!, 'SIGUSR1');
const start = Date.now();
let lastErr: unknown = null;
while (Date.now() - start < timeoutMs) {
try {
const client = await InspectorClient.connect(9229);
trackedInspector = client;
return client;
} catch (err) {
lastErr = err;
await sleep(250);
}
}
throw new Error(
`Inspector did not become ready on port 9229 within ${timeoutMs}ms: ${
lastErr instanceof Error ? lastErr.message : String(lastErr)
}`,
);
};
const waitForReady = async (
level: ReadyLevel,
opts: WaitForReadyOptions = {},
): Promise<WindowReady | MainVisibleReady | ClaudeAiReady | UserLoadedReady> => {
const overall = opts.timeout ?? 90_000;
const start = Date.now();
// Each step uses the remaining overall budget rather than
// a fixed per-step timeout. If startup is slow, downstream
// steps still get whatever's left; if startup is fast, the
// later steps inherit the unused margin.
const remaining = () => Math.max(0, overall - (Date.now() - start));
const wid = await waitForX11Window(remaining());
if (level === 'window') return { wid };
const inspector = await attachInspector(remaining());
// 'mainVisible' — the main shell BrowserWindow has been
// shown. MainWindow.getState() resolves the window via
// claude.ai webContents, so this poll implicitly also
// requires that webContents to exist; the explicit
// 'claudeAi' step below is for the URL-list signal that
// some tests want even when window visibility is incidental.
const mainWin = new MainWindow(inspector);
const visibleState = await retryUntil(
async () => {
const s = await mainWin.getState();
return s && s.visible ? s : null;
},
{ timeout: remaining(), interval: 250 },
);
if (!visibleState) {
throw new Error(
`waitForReady('${level}'): main window did not become ` +
`visible within ${overall}ms`,
);
}
if (level === 'mainVisible') return { wid, inspector };
// 'claudeAi' — a claude.ai-domain webContents exists in
// the registry. May still be on /login. Soft-fails on
// timeout: returns without claudeAiUrl so the caller
// can skip (host likely not signed in).
const claudeAiUrl = await retryUntil(
async () => {
const all = await inspector.evalInMain<{ url: string }[]>(`
const { webContents } = process.mainModule.require('electron');
return webContents.getAllWebContents().map(w => ({ url: w.getURL() }));
`);
return all.find((w) => w.url.includes('claude.ai'))?.url ?? null;
},
{ timeout: remaining(), interval: 500 },
);
if (!claudeAiUrl) {
return { wid, inspector };
}
if (level === 'claudeAi') return { wid, inspector, claudeAiUrl };
// 'userLoaded' — URL past /login. Necessary precondition
// for upstream's lHn() (`!user.isLoggedOut`) returning
// true, which gates Ko.show() in the shortcut handler.
// NOT sufficient on its own — main-process user state
// loads on a separate timeline from the renderer URL,
// so QE submit paths still need openAndWaitReady's
// retry loop on top of this.
const postLoginUrl =
(await waitForUserLoaded(inspector, remaining())) ?? undefined;
return { wid, inspector, claudeAiUrl, postLoginUrl };
};
const app: ClaudeApp = {
process: proc,
pid: proc.pid,
isolation,
lastExitInfo: null,
async close() {
// Drop the inspector first — InspectorClient.close() is now
// idempotent (see lib/inspector.ts) so the runner-side
// `inspector.close()` calls keep working even when this
// fires too. Wrapped in try/catch because a thrown ws.close
// shouldn't block the proc/iso cleanup below.
if (trackedInspector) {
try {
trackedInspector.close();
} catch {
// already closed
}
trackedInspector = null;
}
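if (proc.exitCode === null && proc.signalCode === null) {
// Subscribe before signalling so the exit event can't be missed.
const exited = new Promise<void>((resolve) => proc.once('exit', () => resolve()));
proc.kill('SIGTERM');
await Promise.race([exited, sleep(5000)]);
if (proc.exitCode === null && proc.signalCode === null) {
proc.kill('SIGKILL');
// Bounded wait for the exit event: exitCode/signalCode are
// only set once the child has actually exited, so without
// this lastExitInfo below would record null/null.
await Promise.race([exited, sleep(1000)]);
}
}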
// Capture exit info BEFORE iso cleanup. Runners can attach
// app.lastExitInfo to testInfo when non-null + signal === null
// (we didn't kill it, so a non-zero code means a real crash).
app.lastExitInfo = {
code: proc.exitCode,
signal: proc.signalCode,
};
activeLaunches.delete(launchEntry);
if (ownsIsolation && isolation) {
await isolation.cleanup();
}
},
waitForX11Window,
attachInspector,
// TS can't verify a closure with a union return matches the
// generic conditional signature, even though the runtime
// branches do produce the right shape per level. The cast
// preserves the public contract.
waitForReady: waitForReady as ClaudeApp['waitForReady'],
};
return app;
}
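// Usage sketch (illustrative only; how a runner reacts to the
// soft-fail levels is an assumption, and the 60s budget is an
// arbitrary example):
//
//   const app = await launchClaude();
//   try {
//     const ready = await app.waitForReady('userLoaded', { timeout: 60_000 });
//     if (!ready.postLoginUrl) {
//       // Host likely not signed in: treat as a skip, not a failure.
//       return;
//     }
//     // ... exercise the app through ready.inspector ...
//   } finally {
//     // close() also releases the tracked inspector and, for an
//     // owned isolation, removes the per-launch tmpdir.
//     await app.close();
//   }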

@@ -0,0 +1,30 @@
export interface DesktopEnv {
desktop: string;
sessionType: string;
isWayland: boolean;
isX11: boolean;
isKDE: boolean;
isGNOME: boolean;
isSWAY: boolean;
isHYPR: boolean;
isNIRI: boolean;
row: string;
}
export function getEnv(): DesktopEnv {
const desktop = process.env.XDG_CURRENT_DESKTOP ?? '';
const sessionType = process.env.XDG_SESSION_TYPE ?? '';
const upper = desktop.toUpperCase();
return {
desktop,
sessionType,
isWayland: sessionType === 'wayland',
isX11: sessionType === 'x11',
isKDE: upper.includes('KDE'),
isGNOME: upper.includes('GNOME'),
isSWAY: upper.includes('SWAY'),
isHYPR: upper.includes('HYPRLAND'),
isNIRI: upper.includes('NIRI'),
row: process.env.ROW ?? 'KDE-W',
};
}
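// Example (env values are illustrative): with
//   XDG_CURRENT_DESKTOP=KDE  XDG_SESSION_TYPE=wayland  (ROW unset)
// getEnv() returns
//   { desktop: 'KDE', sessionType: 'wayland', isWayland: true,
//     isX11: false, isKDE: true, isGNOME: false, isSWAY: false,
//     isHYPR: false, isNIRI: false, row: 'KDE-W' }
// The desktop checks are case-insensitive substring matches, so a
// compound value such as 'ubuntu:GNOME' still sets isGNOME.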

@@ -0,0 +1,111 @@
// Detect-and-kill any running Claude Desktop process owned by the
// current user. Used before seeding a hermetic isolation from the
// host config, because Cookies (SQLite) and Local Storage / IndexedDB
// (LevelDB) all hold writer locks while the host app is running — a
// naive cp would either copy a torn page or fail outright on the
// LevelDB LOCK file.
//
// SIGTERM first, wait up to 5s for graceful exit, SIGKILL survivors.
// Loud stderr output: the user needs to know we're force-quitting
// their app so they can blame us, not Claude Desktop, when their
// unsaved chat draft disappears.
import { execFile } from 'node:child_process';
import { promisify } from 'node:util';
import { sleep } from './retry.js';
const exec = promisify(execFile);
// Patterns that match host installs (deb, rpm, AppImage, dev tree).
// argv-based via `pgrep -f`: matches the installed binary path or
// the mounted AppImage path. The harness's own launches always set
// XDG_CONFIG_HOME to a tmpdir, so they wouldn't be confused with
// the host even if the patterns overlapped — but kill runs BEFORE
// our launch, so at this moment there's nothing of ours to confuse.
const HOST_PROCESS_PATTERNS = [
'/usr/lib/claude-desktop/',
'/opt/Claude/',
'\\.mount_[Cc]laude',
'/usr/bin/claude-desktop',
];
// Graceful-exit budget, shared by every signalled pid (one deadline
// for the whole batch). Electron flushes LevelDB + checkpoints the
// SQLite WAL on SIGTERM; 5s covers a typical shutdown with margin.
const SIGTERM_GRACE_MS = 5_000;
const POLL_INTERVAL_MS = 200;
interface HostProcess {
pid: number;
argv: string;
}
async function findHostProcesses(): Promise<HostProcess[]> {
const pattern = HOST_PROCESS_PATTERNS.join('|');
try {
const { stdout } = await exec('pgrep', ['-af', pattern]);
return stdout
.split('\n')
.filter(Boolean)
.map((line) => {
const space = line.indexOf(' ');
const pid = Number(space === -1 ? line : line.slice(0, space));
const argv = space === -1 ? '' : line.slice(space + 1);
return { pid, argv };
})
.filter((p) => Number.isFinite(p.pid) && p.pid !== process.pid);
} catch {
// pgrep returns 1 when nothing matches — happy path.
return [];
}
}
function isAlive(pid: number): boolean {
try {
// Signal 0: existence check, no signal delivered.
process.kill(pid, 0);
return true;
} catch {
return false;
}
}
export async function killHostClaude(): Promise<void> {
const procs = await findHostProcesses();
if (procs.length === 0) return;
process.stderr.write(
`host-claude: ${procs.length} running Claude process(es) found; ` +
'sending SIGTERM (auth-state seed needs writer-lock release):\n',
);
for (const { pid, argv } of procs) {
process.stderr.write(` pid=${pid} ${argv.slice(0, 120)}\n`);
try {
process.kill(pid, 'SIGTERM');
} catch {
// Race: already exited between pgrep and now.
}
}
const deadline = Date.now() + SIGTERM_GRACE_MS;
while (Date.now() < deadline) {
if (!procs.some((p) => isAlive(p.pid))) return;
await sleep(POLL_INTERVAL_MS);
}
const survivors = procs.filter((p) => isAlive(p.pid));
if (survivors.length === 0) return;
process.stderr.write(
`host-claude: ${survivors.length} survived SIGTERM; sending SIGKILL:\n`,
);
for (const { pid } of survivors) {
process.stderr.write(` pid=${pid}\n`);
try {
process.kill(pid, 'SIGKILL');
} catch {
// Race: already exited.
}
}
// Final beat so /proc entries clear before the seed copy starts.
await sleep(POLL_INTERVAL_MS);
}

@@ -0,0 +1,393 @@
// Focus-shifter primitive for "Quick Entry shortcut fires from any
// focus" (S14) on Niri sessions — the Wayland-native sibling of
// lib/input.ts. The runner needs to (a) spawn a sacrificial window
// with a known title, (b) shove keyboard focus to it, then (c) press
// the global shortcut and observe whether the QE popup appears
// regardless of focus.
//
// Niri only — by design.
// - There is no portable focus-injection on native Wayland. Each
// compositor exposes a different IPC: niri msg here, swaymsg for
// Sway, hyprctl for Hyprland, riverctl for River. The libei-based
// "input emulation" portal is the long-term cross-compositor
// answer but isn't widely deployed (KDE/GNOME are getting it,
// niri/sway/hypr are not yet). We pay one file per compositor
// until a second consumer surfaces the dispatcher need; a
// hypothetical lib/input-wayland.ts would just switch on
// XDG_CURRENT_DESKTOP and delegate. With only S14 consuming this,
// a dispatcher would be ceremony.
// - lib/input.ts (X11) and this file are independent: they don't
// share a focus-id type — niri window IDs are u64 numerics, X11
// WIDs are hex strings. Callers handle one or the other based on
// session detection; nothing crosses the boundary.
//
// Why niri msg --json over plain text: the niri wiki explicitly
// contracts the JSON output as stable while the plain-text form is
// described as unstable / human-readable-only. A test harness that
// regex-greps human-readable IPC output is one niri release away
// from a quiet break.
//
// Why we verify post-focus via niri msg focused-window: niri msg
// action focus-window exits 0 even when the focus didn't actually
// land (the action queues into the compositor and a competing input
// event or a closing window can race it). The only honest answer is
// to read focused-window back out and compare IDs. This mirrors
// lib/input.ts's xprop-readback paragraph but for niri's IPC. ~3s
// budget covers slow compositor paths; anything beyond is a refusal
// not a slow ack — surface as an error so S14 sees it.
//
// Why foot for the marker terminal: it's the niri-default in many
// distros (Fedora niri spin, several Arch derivatives), accepts
// --title <T> verbatim with no de-escaping surprises, and ships in
// most niri setups so a single binary covers the common case. We
// deliberately don't fall back to alacritty / kitty — the X11
// primitive uses xterm-only and the simplicity is worth more than
// the marginal robustness; an environment without foot can install
// it the same way an X11 environment without xterm installs xterm.
//
// Why detached:false on the marker spawn: keep the foot child in the
// parent's process group so the OS cleans it up if the test crashes.
// (Session 5 recon sketched detached:true; lib/input.ts uses
// detached:false and is the safer pattern — a leaked terminal past a
// crashed test run is worse than a marker that dies cleanly with its
// parent.)
//
// No fixed sleeps. The verification poll uses retryUntil so a fast
// compositor finishes in ~50ms while a slow one gets the full budget.
import { execFile } from 'node:child_process';
import { promisify } from 'node:util';
import { retryUntil } from './retry.js';
const exec = promisify(execFile);
// Caller catches this and calls test.skip() — it's an environment
// gap (not a Niri session, or niri msg not on PATH), not a
// regression. Subclassing Error gives consumers a clean
// `instanceof` check without parsing message strings.
export class NiriIpcUnavailable extends Error {
constructor(message?: string) {
super(
message ??
'niri msg IPC unavailable: either this is not a Niri ' +
'session (XDG_CURRENT_DESKTOP !== "niri") or the ' +
'`niri` binary is missing from PATH. Install the ' +
'`niri-ipc` / `niri` package, or skip on this row.',
);
this.name = 'NiriIpcUnavailable';
}
}
// Mirrors lib/input.ts's XdotoolUnavailable — the install command is
// the actually-useful part of the error. Consumers should usually
// skip rather than fail; the absence of foot is an environment
// configuration issue, not a Claude Desktop regression.
export class FootUnavailable extends Error {
constructor(message?: string) {
super(
message ??
'foot binary not found on PATH. Install with ' +
'`dnf install foot` / `apt install foot`.',
);
this.name = 'FootUnavailable';
}
}
// Single source of truth for the Niri / not-Niri branch. Pure env
// check, no process spawn — matches the simplicity of isX11Session()
// in lib/input.ts. A `niri msg version` probe would be more
// authoritative (catches the case where someone manually overrides
// XDG_CURRENT_DESKTOP) but adds a fork-per-call cost that's
// disproportionate to how rare the override is in practice.
//
// The literal string 'niri' is the value niri itself sets in
// XDG_CURRENT_DESKTOP per its own documentation; we trust that and
// nothing else (no case-folding, no startswith).
export function isNiriSession(): boolean {
return process.env.XDG_CURRENT_DESKTOP === 'niri';
}
// Niri's --json output for several IPC calls is wrapped in a
// Result-style envelope: `{"Ok": <payload>}`. Newer/older niri
// versions sometimes return the bare payload. Defensively unwrap one
// layer of `.Ok` if present, then return the payload as-is. Returns
// null if the input is null/undefined.
function unwrapOk(value: unknown): unknown {
if (value === null || value === undefined) return null;
if (typeof value === 'object' && value !== null && 'Ok' in value) {
return (value as { Ok: unknown }).Ok;
}
return value;
}
// Shape of a niri window row, restricted to the fields we use. The
// real schema has more (workspace_id, is_floating, etc.) — we don't
// commit to those.
interface NiriWindow {
id: number;
title: string | null;
app_id: string | null;
is_focused?: boolean;
}
// Read the currently-focused niri window via `niri msg --json
// focused-window`.
//
// Returns null on:
// - Non-Niri session (gated out by isNiriSession()).
// - niri binary missing / spawn ENOENT — analogous to lib/input.ts
// returning null on xprop spawn failure rather than throwing.
// focusOtherWindow's poll fails through to its own timeout.
// - JSON parse failure or unexpected shape (defensive — should
// not happen against a healthy niri but the cost of a null
// return is one re-poll).
// - No focused window (e.g. all workspaces empty).
export async function getFocusedWindowId(): Promise<number | null> {
if (!isNiriSession()) return null;
let stdout: string;
try {
({ stdout } = await exec('niri', [
'msg',
'--json',
'focused-window',
]));
} catch {
return null;
}
const trimmed = stdout.trim();
if (!trimmed) return null;
let parsed: unknown;
try {
parsed = JSON.parse(trimmed);
} catch {
return null;
}
// Two known wrappings: `{Ok: {FocusedWindow: <window>}}` (older)
// and the bare window object (newer). Try unwrapping in order.
const okUnwrapped = unwrapOk(parsed);
let candidate: unknown = okUnwrapped;
if (
typeof okUnwrapped === 'object' &&
okUnwrapped !== null &&
'FocusedWindow' in okUnwrapped
) {
candidate = (okUnwrapped as { FocusedWindow: unknown }).FocusedWindow;
}
if (
typeof candidate !== 'object' ||
candidate === null ||
!('id' in candidate)
) {
return null;
}
const id = (candidate as { id: unknown }).id;
if (typeof id !== 'number' || !Number.isFinite(id)) return null;
return id;
}
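// Sketch (illustrative only, not part of the harness API): the two
// payload shapes getFocusedWindowId tolerates, run through the same
// unwrap order on literal JSON. `extractFocusedIdSketch` and the
// sample payloads are hypothetical; real niri output carries more
// fields than `id`.
function extractFocusedIdSketch(raw: string): number | null {
    let parsed: unknown;
    try {
        parsed = JSON.parse(raw);
    } catch {
        return null;
    }
    // Peel `{Ok: ...}` first, then `{FocusedWindow: ...}`, each only
    // when the key is actually present.
    let candidate: unknown = parsed;
    for (const key of ['Ok', 'FocusedWindow']) {
        if (
            typeof candidate === 'object' &&
            candidate !== null &&
            key in candidate
        ) {
            candidate = (candidate as Record<string, unknown>)[key];
        }
    }
    if (typeof candidate !== 'object' || candidate === null) return null;
    const id = (candidate as { id?: unknown }).id;
    return typeof id === 'number' && Number.isFinite(id) ? id : null;
}
// extractFocusedIdSketch('{"Ok":{"FocusedWindow":{"id":7}}}') and
// extractFocusedIdSketch('{"id":7}') both yield 7; a null
// FocusedWindow or malformed JSON yields null.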
// Resolve a window title to its niri ID via `niri msg --json
// windows`. The list is `Vec<Window>`; we filter on title match AND
// app_id !== 'Claude' so we never accidentally pick the test target
// itself. Returns null on zero matches; returns the first match's
// ID on multi-match (mirrors xdotool's first-match behavior in
// lib/input.ts).
async function resolveWindowIdByTitle(
title: string,
): Promise<number | null> {
const { stdout } = await exec('niri', ['msg', '--json', 'windows']);
const trimmed = stdout.trim();
if (!trimmed) return null;
let parsed: unknown;
try {
parsed = JSON.parse(trimmed);
} catch {
return null;
}
// Same Ok-wrapping defense as getFocusedWindowId.
const unwrapped = unwrapOk(parsed);
if (!Array.isArray(unwrapped)) return null;
for (const row of unwrapped as NiriWindow[]) {
if (
row &&
typeof row === 'object' &&
typeof row.id === 'number' &&
row.title === title &&
row.app_id !== 'Claude'
) {
return row.id;
}
}
return null;
}
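// Sketch of the selection rule above applied to a literal windows
// list. The rows and the helper name are made up for illustration;
// only the `id` / `title` / `app_id` fields come from the NiriWindow
// shape declared in this file.
function pickMarkerIdSketch(rows: unknown, title: string): number | null {
    if (!Array.isArray(rows)) return null;
    type Row = { id?: unknown; title?: unknown; app_id?: unknown } | null;
    for (const row of rows as Row[]) {
        if (
            row &&
            typeof row.id === 'number' &&
            row.title === title &&
            row.app_id !== 'Claude'
        ) {
            return row.id;
        }
    }
    return null;
}
const sampleRowsSketch = [
    { id: 1, title: 'marker', app_id: 'Claude' }, // skipped: the test target
    { id: 2, title: 'marker', app_id: 'foot' }, // first eligible match
    { id: 3, title: 'marker', app_id: 'foot' }, // ignored: later match
];
// pickMarkerIdSketch(sampleRowsSketch, 'marker') yields 2.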
// Shift Niri focus to the first window whose title matches `title`
// and whose app_id is not 'Claude' (so we never target Claude's own
// window), then verify the shift actually took.
//
// Throws:
// - NiriIpcUnavailable when not a Niri session, or niri binary
// missing.
// - Plain Error when no window matches (caller's bug — forgot to
// spawn the marker, or used the wrong title).
// - Plain Error when niri msg returns 0 but focused-window never
// reflects the focus change within ~3s (compositor refused the
// activation; this is the diagnostic path S14 wants surfaced,
// not swallowed).
export async function focusOtherWindow(title: string): Promise<void> {
if (!isNiriSession()) {
throw new NiriIpcUnavailable();
}
let targetId: number | null;
try {
targetId = await resolveWindowIdByTitle(title);
} catch (err) {
const e = err as { code?: string | number };
if (e.code === 'ENOENT') throw new NiriIpcUnavailable();
throw err;
}
if (targetId === null) {
throw new Error(
`focusOtherWindow: no Niri window matches title ${JSON.stringify(title)} ` +
'(with app_id != "Claude"). Did the marker window finish ' +
'mapping? Caller should await spawnMarkerWindow + a short ' +
'readiness poll before calling focusOtherWindow.',
);
}
try {
await exec('niri', [
'msg',
'action',
'focus-window',
'--id',
String(targetId),
]);
} catch (err) {
const e = err as { code?: string | number };
if (e.code === 'ENOENT') throw new NiriIpcUnavailable();
throw err;
}
const matched = await retryUntil(
async () => {
const active = await getFocusedWindowId();
return active === targetId ? true : null;
},
{ timeout: 3_000, interval: 100 },
);
if (!matched) {
throw new Error(
'focusOtherWindow: niri msg action focus-window returned 0 ' +
`but focused-window never settled to id=${targetId} ` +
`for title ${JSON.stringify(title)}. Compositor may have ` +
'refused the activation request.',
);
}
}
// Handle returned from spawnMarkerWindow. Lifecycle is owned by the
// caller — the test that spawned it must kill() in afterEach (or
// equivalent), otherwise the foot terminal leaks past the test run.
export interface MarkerWindow {
pid: number;
title: string;
kill(): Promise<void>;
}
// Spawn a long-lived foot terminal with a known title, suitable as
// a focus target on a Niri session. Backgrounded with detached:false
// so the child shares the test process's group and receives signals
// sent to that group. A crashed parent does not by itself reap the
// child; kill() and the `sleep 600` cap are the actual cleanup.
//
// Throws FootUnavailable if foot isn't on PATH (both at spawn-throw
// time AND via the 'error' event, mirroring lib/input.ts's redundant
// ENOENT handling — Node delivers ENOENT through different paths
// across versions).
export async function spawnMarkerWindow(
title: string,
): Promise<MarkerWindow> {
const { spawn } = await import('node:child_process');
let child;
try {
// `sleep 600` keeps the foot terminal alive for 10min — longer
// than any reasonable single test, short enough that a leaked
// terminal self-cleans within the sweep. foot's --title sets
// the window title field that niri's windows list reports.
child = spawn('foot', ['--title', title, '-e', 'sleep', '600'], {
detached: false,
stdio: 'ignore',
});
} catch (err) {
const e = err as { code?: string | number };
if (e.code === 'ENOENT') {
throw new FootUnavailable();
}
throw err;
}
const earlyError = await new Promise<Error | null>((resolve) => {
const onError = (err: Error) => {
child.removeListener('spawn', onSpawn);
resolve(err);
};
const onSpawn = () => {
child.removeListener('error', onError);
resolve(null);
};
child.once('error', onError);
child.once('spawn', onSpawn);
});
if (earlyError) {
const e = earlyError as Error & { code?: string | number };
if (e.code === 'ENOENT') {
throw new FootUnavailable();
}
throw earlyError;
}
const pid = child.pid;
if (typeof pid !== 'number') {
throw new Error(
'spawnMarkerWindow: child.pid was undefined after spawn',
);
}
let killed = false;
const kill = async (): Promise<void> => {
if (killed) return;
killed = true;
if (child.exitCode !== null || child.signalCode !== null) {
return;
}
// SIGTERM with a short grace period before SIGKILL. foot
// honors SIGTERM cleanly; the SIGKILL fallback is for the
// pathological "child wedged in a syscall" case.
const exited = new Promise<void>((resolve) => {
child.once('exit', () => resolve());
});
try {
child.kill('SIGTERM');
} catch {
// Process may have died between the check and the kill.
}
const graceMs = 500;
const timedOut = await Promise.race([
exited.then(() => false),
new Promise<boolean>((resolve) =>
setTimeout(() => resolve(true), graceMs),
),
]);
if (timedOut) {
try {
child.kill('SIGKILL');
} catch {
// Already dead.
}
await exited;
}
};
return { pid, title, kill };
}

@@ -0,0 +1,346 @@
// Focus-shifter primitive for "Quick Entry shortcut fires from any focus"
// (S11, S14). The runner needs to (a) spawn a sacrificial window with
// a known title, (b) shove keyboard focus to it, then (c) press the
// global shortcut and observe whether the QE popup appears regardless
// of focus.
//
// X11 only — by design.
// - There is no portable focus-injection on native Wayland. Each
// compositor exposes its own IPC (swaymsg, hyprctl, niri msg)
// and the libei-based "input emulation" portal isn't
// universally honored. Rather than bake a per-compositor matrix
// into the harness, runners on native Wayland rows must skip
// this test entirely. WaylandFocusUnavailable is the signal.
// - Wayland-with-XWayland (KDE-W default, Ubu-W default, GNOME-W
// when XDG_SESSION_TYPE=x11 is forced) is *not* an X11 session
// for our purposes — the WAYLAND-SIDE windows xdotool can't see
// are exactly the windows S11/S14 care about. The single source
// of truth is XDG_SESSION_TYPE === 'x11'. Anything else: skip.
//
// Why xdotool over xprop+wmctrl-equivalent: xdotool ships
// `search --name <regex> windowfocus` as one atomic call. Doing it
// with raw xprop means walking _NET_CLIENT_LIST, fetching _NET_WM_NAME
// per WID, picking a match, then sending an _NET_ACTIVE_WINDOW
// ClientMessage — which xprop can't generate, only read. wmctrl can,
// but adds a second binary dependency for no win.
//
// Why we verify post-focus via xprop: xdotool exits 0 even when
// focus didn't actually shift. Some compositors (mutter under
// XWayland-forced mode notably) accept the WM_TAKE_FOCUS / SetInputFocus
// pair and then quietly refuse the activation. The only honest
// answer is to read _NET_ACTIVE_WINDOW back out and compare WIDs.
// xdotool prints decimal WIDs; xprop prints `0x...` hex. We
// normalize to lowercase 0x-prefixed hex with leading zeros stripped.
//
// No fixed sleeps. The verification poll uses retryUntil so a fast
// compositor finishes in ~50ms while a slow one gets the full budget.
import { execFile } from 'node:child_process';
import { promisify } from 'node:util';
import { retryUntil } from './retry.js';
const exec = promisify(execFile);
// Caller catches this and calls test.skip() — it's an environment gap,
// not a regression. Subclassing Error gives consumers a clean
// `instanceof` check without parsing message strings.
export class WaylandFocusUnavailable extends Error {
constructor(message?: string) {
super(
message ??
'focusOtherWindow: native Wayland session — no portable ' +
'focus-injection path. Skip on this row.',
);
this.name = 'WaylandFocusUnavailable';
}
}
// Mirrors quickentry.ts's ensureYdotool message style — the install
// command is the actually-useful part of the error. Consumers should
// usually skip rather than fail; the absence of xdotool is an
// environment configuration issue, not a Claude Desktop regression.
export class XdotoolUnavailable extends Error {
constructor(message?: string) {
super(
message ??
'xdotool binary not found on PATH. Install with ' +
'`dnf install xdotool` / `apt install xdotool`.',
);
this.name = 'XdotoolUnavailable';
}
}
// Single source of truth for the X11/Wayland branch. Every other
// function in this file calls this — do not duplicate the env check.
//
// XDG_SESSION_TYPE is set by logind. Possible values per spec are
// `x11`, `wayland`, `tty`, `mir`, `unspecified`. We only trust the
// literal string `x11` — anything else, including missing, returns
// false. That means an unset env var on a real X11 box returns false
// here; that's the correct conservative default since we can't
// verify the assumption.
export function isX11Session(): boolean {
return process.env.XDG_SESSION_TYPE === 'x11';
}
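// Sketch: how a runner might fold the session checks into a single
// run/skip decision. `chooseFocusBackendSketch` is hypothetical and
// takes the environment as a parameter so it can be exercised without
// touching process.env. Assumption: the X11 and Niri primitives stay
// in separate modules and the runner picks between them.
type FocusBackendSketch = 'x11' | 'niri' | 'skip';
function chooseFocusBackendSketch(
    env: Record<string, string | undefined>,
): FocusBackendSketch {
    // Literal comparisons only: no case-folding, no prefix matching.
    if (env.XDG_SESSION_TYPE === 'x11') return 'x11';
    if (env.XDG_CURRENT_DESKTOP === 'niri') return 'niri';
    return 'skip';
}
// An unset XDG_SESSION_TYPE falls through to 'skip', which matches the
// conservative default described above.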
// Normalize a WID to lowercase 0x-prefixed hex with leading zeros
// stripped after the prefix. Accepts decimal (xdotool stdout) or hex
// (xprop stdout, with or without 0x). Returns null on parse failure.
//
// Examples:
// '94371842' → '0x5a00002'
// '0x05a00002' → '0x5a00002'
// '0X5A00002' → '0x5a00002'
function normalizeWid(raw: string): string | null {
const s = raw.trim();
if (!s) return null;
const isHex = /^0x/i.test(s);
const n = isHex ? parseInt(s, 16) : parseInt(s, 10);
if (!Number.isFinite(n) || n <= 0) return null;
return '0x' + n.toString(16);
}
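// Sketch: why the normalization matters. xdotool reports decimal and
// xprop reports `0x...` hex for the same window, so a raw string
// compare never matches. `sameWidSketch` is a hypothetical helper that
// compares the two numerically; the sample values mirror the examples
// above.
function sameWidSketch(xdotoolWid: string, xpropLine: string): boolean {
    const m = xpropLine.match(/window id #\s*(0x[0-9a-fA-F]+)/);
    if (!m || !m[1]) return false;
    const fromXdotool = Number.parseInt(xdotoolWid.trim(), 10);
    const fromXprop = Number.parseInt(m[1], 16);
    return (
        Number.isFinite(fromXdotool) &&
        fromXdotool > 0 &&
        fromXdotool === fromXprop
    );
}
// sameWidSketch('94371842',
//     '_NET_ACTIVE_WINDOW(WINDOW): window id # 0x05a00002') is true;
// a `window id # 0x0` line (no active window) is false.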
// Read the currently-focused X11 window via _NET_ACTIVE_WINDOW.
//
// Returns null on:
// - Native Wayland (xprop may still respond via XWayland but the
// value is meaningless for native-Wayland clients — they don't
// appear in the X11 active-window list at all). Returning null
// here lets focusOtherWindow's poll fail through to its own
// timeout, but in practice native-Wayland rows are gated out
// earlier by isX11Session().
// - xprop missing / spawn failure.
// - Output that doesn't match the documented format (defensive —
// this should never happen on a real EWMH-compliant WM but the
// cost of a null return is one re-poll).
export async function getFocusedWindowId(): Promise<string | null> {
if (!isX11Session()) return null;
let stdout: string;
try {
({ stdout } = await exec('xprop', [
'-root',
'_NET_ACTIVE_WINDOW',
]));
} catch {
return null;
}
// Documented format:
// _NET_ACTIVE_WINDOW(WINDOW): window id # 0x5a00002
const m = stdout.match(/window id #\s*(0x[0-9a-fA-F]+)/);
if (!m || !m[1]) return null;
return normalizeWid(m[1]);
}
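// Resolve a window title to its WID via xdotool. xdotool prints one
// decimal WID per matching line and we take the first. Zero matches
// return null (the caller turns that into a descriptive Error);
// multi-match is silently resolved to the first, mirroring xdotool's
// own windowfocus behavior.
async function resolveWindowIdByTitle(
title: string,
): Promise<string | null> {
let stdout: string;
try {
({ stdout } = await exec('xdotool', ['search', '--name', title]));
} catch (err) {
// Assumption: `xdotool search` exits non-zero when nothing matches.
// A numeric exit code with empty stdout is treated as "no match";
// ENOENT (string code) and anything with output still propagate.
const e = err as { code?: string | number; stdout?: string };
if (typeof e.code !== 'number' || (e.stdout ?? '').trim()) throw err;
return null;
}
const lines = stdout
.split('\n')
.map((l) => l.trim())
.filter(Boolean);
if (lines.length === 0) return null;
const first = lines[0];
if (!first) return null;
return normalizeWid(first);
}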
// Shift X11 focus to the first window whose title matches `title`,
// then verify the shift actually took.
//
// Throws:
// - WaylandFocusUnavailable on native Wayland.
// - XdotoolUnavailable when xdotool isn't on PATH.
// - Plain Error when no window matches the title (caller's bug —
// forgot to spawn the marker, or used the wrong title).
// - Plain Error when xdotool reports success but xprop never
// reflects the focus change within ~3s (compositor refused the
// activation; this is the diagnostic path S11/S14 actually want
// to surface, not swallow).
export async function focusOtherWindow(title: string): Promise<void> {
if (!isX11Session()) {
throw new WaylandFocusUnavailable();
}
// Resolve target WID first so we know what to verify against.
// Combining this with `windowfocus` would save a roundtrip but
// would also make the post-focus comparison impossible.
let targetWid: string | null;
try {
targetWid = await resolveWindowIdByTitle(title);
} catch (err) {
const e = err as { code?: string | number };
if (e.code === 'ENOENT') throw new XdotoolUnavailable();
throw err;
}
if (!targetWid) {
throw new Error(
`focusOtherWindow: no X11 window matches title ${JSON.stringify(title)}. ` +
'Did the marker window finish mapping? Caller should ' +
'await spawnMarkerWindow + a short readiness poll before ' +
'calling focusOtherWindow.',
);
}
// Send the focus request. xdotool's windowfocus issues a
// SetInputFocus, which is best-effort; the verify-via-xprop
// step below is the actual assertion.
try {
await exec('xdotool', ['search', '--name', title, 'windowfocus']);
} catch (err) {
const e = err as { code?: string | number };
if (e.code === 'ENOENT') throw new XdotoolUnavailable();
throw err;
}
// Poll _NET_ACTIVE_WINDOW until it matches the target. ~3s budget
// covers slow compositor activation paths (mutter cold-path is
// the worst observed, ~800ms). Anything beyond 3s is a refusal,
// not a slow ack — surface as an error so S11/S14 see it.
const matched = await retryUntil(
async () => {
const active = await getFocusedWindowId();
return active === targetWid ? true : null;
},
{ timeout: 3_000, interval: 100 },
);
if (!matched) {
throw new Error(
`focusOtherWindow: xdotool windowfocus returned 0 but ` +
`_NET_ACTIVE_WINDOW never settled to ${targetWid} ` +
`for title ${JSON.stringify(title)}. Compositor may ` +
'have refused the activation request.',
);
}
}
// Handle returned from spawnMarkerWindow. Lifecycle is owned by the
// caller — the test that spawned it must kill() in afterEach (or
// equivalent), otherwise the xterm leaks past the test run.
export interface MarkerWindow {
pid: number;
title: string;
kill(): Promise<void>;
}
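// Sketch of the ownership rule above: wrap the marker's lifetime in
// try/finally so kill() runs even when the test body throws.
// `withMarkerSketch` and its `spawnFn` parameter are hypothetical; in
// a runner, spawnFn would be spawnMarkerWindow and the same cleanup
// would usually sit in afterEach instead.
async function withMarkerSketch<T>(
    spawnFn: (title: string) => Promise<{ kill(): Promise<void> }>,
    title: string,
    body: () => Promise<T>,
): Promise<T> {
    const marker = await spawnFn(title);
    try {
        return await body();
    } finally {
        // Runs on success and on throw; kill() is idempotent, so a
        // second call from afterEach is harmless.
        await marker.kill();
    }
}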
// Spawn a long-lived xterm with a known title, suitable as a focus
// target. Backgrounded with detached:false so the child shares the
// test process's group and receives signals sent to that group. A
// crashed parent does not by itself reap the child; kill() and the
// `sleep 600` cap are the actual cleanup.
//
// Why xterm: it's the lowest-common-denominator X11 terminal — every
// X11 row has it (or can install it via the standard package). It
// honors -title verbatim (no de-escaping surprises) and -e accepts
// a single command without argv parsing quirks. Alternatives like
// `xclock` / `xeyes` either don't accept arbitrary titles or are
// missing on minimal Fedora installs.
//
// Throws if xterm isn't on PATH. Caller's responsibility to fall
// back or skip; we don't carry an `XtermUnavailable` class because
// the consumer decision tree is identical to "skip on missing
// xdotool" and the message is self-explanatory.
export async function spawnMarkerWindow(
title: string,
): Promise<MarkerWindow> {
// Lazy import so the module loads cleanly on Wayland rows that
// never call this function. (Top-level imports of node:child_process
// are already paid for by execFile, so this is mostly stylistic.)
const { spawn } = await import('node:child_process');
let child;
try {
// `sleep 600` keeps the xterm alive for 10min — longer than
// any reasonable single test, short enough that a leaked
// xterm self-cleans within the sweep. -hold not used: we
// want the window to die when sleep dies.
child = spawn('xterm', ['-title', title, '-e', 'sleep', '600'], {
detached: false,
stdio: 'ignore',
});
} catch (err) {
const e = err as { code?: string | number };
if (e.code === 'ENOENT') {
throw new Error(
'xterm binary not found on PATH. Install with ' +
'`dnf install xterm` / `apt install xterm`. ' +
'Required by the focus-shift test path; consumers ' +
'should skip when this throws.',
);
}
throw err;
}
// Surface asynchronous spawn failures: ENOENT usually arrives via
// the 'error' event rather than the throw above.
const earlyError = await new Promise<Error | null>((resolve) => {
const onError = (err: Error) => {
child.removeListener('spawn', onSpawn);
resolve(err);
};
const onSpawn = () => {
child.removeListener('error', onError);
resolve(null);
};
child.once('error', onError);
child.once('spawn', onSpawn);
});
if (earlyError) {
const e = earlyError as Error & { code?: string | number };
if (e.code === 'ENOENT') {
throw new Error(
'xterm binary not found on PATH. Install with ' +
'`dnf install xterm` / `apt install xterm`.',
);
}
throw earlyError;
}
const pid = child.pid;
if (typeof pid !== 'number') {
// Shouldn't happen after a successful 'spawn' event, but
// the type system doesn't know that.
throw new Error('spawnMarkerWindow: child.pid was undefined after spawn');
}
let killed = false;
const kill = async (): Promise<void> => {
if (killed) return;
killed = true;
if (child.exitCode !== null || child.signalCode !== null) {
return; // already exited
}
// SIGTERM with a short grace period before SIGKILL. xterm
// honors SIGTERM cleanly; the SIGKILL fallback is for the
// pathological "child wedged in a syscall" case.
const exited = new Promise<void>((resolve) => {
child.once('exit', () => resolve());
});
try {
child.kill('SIGTERM');
} catch {
// Process may have died between the check and the kill.
}
const graceMs = 500;
const timedOut = await Promise.race([
exited.then(() => false),
new Promise<boolean>((resolve) =>
setTimeout(() => resolve(true), graceMs),
),
]);
if (timedOut) {
try {
child.kill('SIGKILL');
} catch {
// Already dead.
}
await exited;
}
};
return { pid, title, kill };
}

@@ -0,0 +1,327 @@
// Node-inspector client for Electron's main process.
//
// Why this exists: the shipped Electron has an authenticated-CDP gate
// (see lib/electron.ts) that exits the app whenever
// --remote-debugging-port is on argv. The gate doesn't check --inspect /
// SIGUSR1, so we can attach the Node inspector at runtime — same code
// path as the in-app "Developer → Enable Main Process Debugger" menu.
//
// From the inspector we can evaluate arbitrary JS in the main process,
// which gives us:
// - Electron API access (app, webContents, dialog, BrowserView)
// - Renderer access via webContents.executeJavaScript()
// - Main-process mocks (e.g. dialog.showOpenDialog for T17)
//
// Caveat: `BrowserWindow.getAllWindows()` returns 0 because frame-fix-
// wrapper substitutes the BrowserWindow class and the substitution
// breaks the static registry. Use `webContents.getAllWebContents()`
// instead — that registry stays intact.
interface PendingCall {
resolve: (value: unknown) => void;
reject: (err: Error) => void;
timer: ReturnType<typeof setTimeout>;
}
// CDP accessibility-tree node shape (subset). The full AX tree is a flat
// array of these with parent/child links carried by id refs. We surface
// the value-bearing fields the v7 walker + claudeai.ts page-objects
// actually consume; remaining CDP fields (ignoredReasons,
// frameId, …) are accessible via the string-keyed bag.
export interface AxValue {
type: string;
value?: unknown;
}
export interface AxProperty {
name: string;
value: AxValue;
}
export interface AxNode {
nodeId: string;
parentId?: string;
childIds?: string[];
backendDOMNodeId?: number;
role?: { type: string; value: string };
name?: { type: string; value: string };
// AX state/relation properties (`haspopup`, `expanded`, `modal`,
// `checked`, `disabled`, …). claudeai.ts reads `haspopup` to
// discriminate menu-trigger buttons from action buttons that
// happen to share an accessible name.
properties?: AxProperty[];
ignored?: boolean;
[k: string]: unknown;
}
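// Sketch: walking the flat AX array the way the `haspopup` note above
// describes. Helper names and sample data are hypothetical; field
// names follow the AxNode / AxProperty shapes declared here. The
// property name is matched case-insensitively because the exact
// casing CDP emits is an assumption worth verifying on this build.
interface AxNodeSketch {
    nodeId: string;
    role?: { type: string; value: string };
    name?: { type: string; value: string };
    properties?: Array<{
        name: string;
        value: { type: string; value?: unknown };
    }>;
    ignored?: boolean;
}
function axPropertySketch(node: AxNodeSketch, prop: string): unknown {
    const wanted = prop.toLowerCase();
    return node.properties?.find((p) => p.name.toLowerCase() === wanted)
        ?.value.value;
}
// Menu triggers and plain action buttons can share role + name; the
// popup property is what tells them apart.
function findMenuTriggerSketch(
    nodes: AxNodeSketch[],
    name: string,
): AxNodeSketch | undefined {
    return nodes.find((n) => {
        if (n.ignored || n.role?.value !== 'button') return false;
        if (n.name?.value !== name) return false;
        const popup = axPropertySketch(n, 'haspopup');
        return popup !== undefined && popup !== false && popup !== 'false';
    });
}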
export class InspectorClient {
// why: 30s default for send() timeouts. "Slow but not stuck."
// Lower defaults break legitimately-slow operations like initial
// page-load on a cold app or a chunky DOM snapshot; higher defaults
// turn renderer-side hangs (blocked event loop, modal trapping focus,
// network-bound script stalled) into invisible silent freezes.
// Consumers can override per-call (timeoutMs arg) or process-wide
// (set the static InspectorClient.defaultTimeoutMs; send() reads it
// at call time, so the change applies to every instance).
static defaultTimeoutMs = 30000;
private ws: WebSocket;
private nextId = 0;
private pending = new Map<number, PendingCall>();
// Idempotency flag for close(). Runners + electron.ts close() may
// both call this on the same instance (intentionally — see
// electron.ts launchClaude tracking comment); the flag guarantees
// a second call is a true no-op rather than a redundant ws.close().
private closed = false;
private constructor(ws: WebSocket) {
this.ws = ws;
this.ws.addEventListener('message', (ev) => this.handleMessage(ev));
}
static async connect(port: number): Promise<InspectorClient> {
const meta = await fetch(`http://127.0.0.1:${port}/json/list`).then((r) =>
r.json(),
) as Array<{ webSocketDebuggerUrl: string }>;
if (!meta.length) {
throw new Error(`Inspector at ${port} has no debuggee`);
}
const url = meta[0]!.webSocketDebuggerUrl;
const ws = new WebSocket(url);
await new Promise<void>((resolve, reject) => {
ws.addEventListener('open', () => resolve(), { once: true });
ws.addEventListener(
'error',
(e) => reject(new Error(`inspector ws error: ${e.type}`)),
{ once: true },
);
});
const client = new InspectorClient(ws);
await client.send('Runtime.enable');
await client.send('Runtime.runIfWaitingForDebugger');
return client;
}
private handleMessage(ev: MessageEvent): void {
const msg = JSON.parse(typeof ev.data === 'string' ? ev.data : '{}') as {
id?: number;
error?: unknown;
result?: unknown;
};
if (msg.id !== undefined && this.pending.has(msg.id)) {
const { resolve, reject, timer } = this.pending.get(msg.id)!;
this.pending.delete(msg.id);
clearTimeout(timer);
if (msg.error) {
reject(new Error(JSON.stringify(msg.error)));
} else {
resolve(msg.result);
}
}
}
// why: every pending call gets a timer. When the renderer event loop
// is blocked (modal focus trap, network-bound script stalled, DOM
// snapshot too large) the CDP reply never arrives and the promise
// would hang forever. We reject with a clear "method=X" error and
// drop the pending entry (no leak), but we deliberately do NOT
// close the websocket — a single hung eval shouldn't tear down the
// connection; the next call may succeed.
send(
method: string,
params: Record<string, unknown> = {},
timeoutMs?: number,
): Promise<unknown> {
const id = ++this.nextId;
const ms = timeoutMs ?? InspectorClient.defaultTimeoutMs;
return new Promise((resolve, reject) => {
const timer = setTimeout(() => {
if (this.pending.delete(id)) {
reject(
new Error(
`inspector.send timed out after ${ms}ms (method=${method})`,
),
);
}
}, ms);
this.pending.set(id, { resolve, reject, timer });
this.ws.send(JSON.stringify({ id, method, params }));
});
}
// Evaluate an async function body in the main process. The body is
// wrapped in an async IIFE, so it must `return X` to produce a value;
// a body with no `return` yields null. Returns the JSON-parsed
// value. JSON-stringification inside the IIFE dodges the inspector's
// Promise-result deep-marshaling quirks (returnByValue produces empty
// objects for awaited Promise resolutions on this build).
//
// Bare `require` is NOT a global in the CDP eval scope — go through
// `process.mainModule.require('electron'|'node:fs'|…)` instead.
async evalInMain<T = unknown>(body: string, timeoutMs?: number): Promise<T> {
const expression =
'globalThis.__r = (async () => { ' +
'const __v = await (async () => { ' +
body +
' })(); ' +
'return JSON.stringify(__v === undefined ? null : __v); ' +
'})(); globalThis.__r;';
const result = (await this.send(
'Runtime.evaluate',
{
expression,
awaitPromise: true,
returnByValue: true,
},
timeoutMs,
)) as { result?: { value?: unknown }; exceptionDetails?: unknown };
if (result.exceptionDetails) {
throw new Error(
`evalInMain threw: ${JSON.stringify(result.exceptionDetails)}`,
);
}
const v = result.result?.value;
if (typeof v !== 'string') {
throw new Error(
`evalInMain expected JSON string, got ${JSON.stringify(result.result)}`,
);
}
return JSON.parse(v) as T;
}
// Convenience: evaluate JS in a specific webContents (renderer).
// `urlFilter` selects which webContents (substring match on getURL()).
async evalInRenderer<T = unknown>(
urlFilter: string,
js: string,
timeoutMs?: number,
): Promise<T> {
const escaped = JSON.stringify(js);
const result = await this.evalInMain<T>(
`
const { webContents } = process.mainModule.require('electron');
const all = webContents.getAllWebContents();
const target = all.find(w => w.getURL().includes(${JSON.stringify(urlFilter)}));
if (!target) {
throw new Error('no webContents matching: ' + ${JSON.stringify(urlFilter)});
}
return await target.executeJavaScript(${escaped});
`,
timeoutMs,
);
return result;
}
// Query the renderer's full accessibility tree via Chrome DevTools
// Protocol's `Accessibility.getFullAXTree`. Reachable from main
// process JS (this client connects to Node's debugger, not Chromium's
// — but webContents.debugger gives us full CDP access from there).
//
// `urlFilter` selects which webContents to attach to (substring match
// on getURL()). Idempotent attach: reusing the same webContents
// across calls won't double-attach. Caller is responsible for AX
// cost — full-tree latency on large surfaces may be ≥100ms; use a
// scoped subtree query for those.
async getAccessibleTree(
urlFilter: string,
timeoutMs?: number,
): Promise<AxNode[]> {
const result = await this.evalInMain<{ nodes: AxNode[] }>(
`
const { webContents } = process.mainModule.require('electron');
const all = webContents.getAllWebContents();
const target = all.find(w => w.getURL().includes(${JSON.stringify(urlFilter)}));
if (!target) {
throw new Error('no webContents matching: ' + ${JSON.stringify(urlFilter)});
}
if (!target.debugger.isAttached()) {
target.debugger.attach('1.3');
}
try {
await target.debugger.sendCommand('Accessibility.enable');
} catch (err) {
// Already-enabled is benign; surface anything else.
if (!String(err && err.message).includes('already enabled')) {
throw err;
}
}
const r = await target.debugger.sendCommand(
'Accessibility.getFullAXTree',
);
return r;
`,
timeoutMs,
);
return result.nodes;
}
// Resolve the AX-tree-supplied backendNodeId to a renderer-side
// JS object handle, then invoke `.click()` on it. This is the
// click-path counterpart to `getAccessibleTree`: capture identifies
// nodes by backendDOMNodeId, click consumes the same id without any
// selector reconstruction. `DOM.resolveNode` handles cross-frame
// nodes natively, and `Runtime.callFunctionOn` runs in the node's
// own execution context — so the click dispatches against the right
// document even when the target sits in an iframe.
async clickByBackendNodeId(
urlFilter: string,
backendNodeId: number,
timeoutMs?: number,
): Promise<void> {
await this.evalInMain<null>(
`
const { webContents } = process.mainModule.require('electron');
const all = webContents.getAllWebContents();
const target = all.find(w => w.getURL().includes(${JSON.stringify(urlFilter)}));
if (!target) {
throw new Error('no webContents matching: ' + ${JSON.stringify(urlFilter)});
}
if (!target.debugger.isAttached()) {
target.debugger.attach('1.3');
}
const resolved = await target.debugger.sendCommand(
'DOM.resolveNode',
{ backendNodeId: ${backendNodeId} },
);
const objectId = resolved && resolved.object && resolved.object.objectId;
if (!objectId) {
throw new Error(
'clickByBackendNodeId: DOM.resolveNode returned no objectId for ' +
${backendNodeId},
);
}
try {
await target.debugger.sendCommand('Runtime.callFunctionOn', {
objectId,
functionDeclaration: 'function() { this.click(); }',
});
} finally {
try {
await target.debugger.sendCommand('Runtime.releaseObject', {
objectId,
});
} catch (_) {
// Releasing a stale handle is benign.
}
}
return null;
`,
timeoutMs,
);
}
close(): void {
if (this.closed) return;
this.closed = true;
// Drain pending timers + reject in-flight promises so callers
// don't hang on close. Without this an outstanding send() keeps
// the event loop alive past close().
for (const [, pending] of this.pending) {
clearTimeout(pending.timer);
pending.reject(new Error('inspector closed'));
}
this.pending.clear();
try {
this.ws.close();
} catch {
// already closed
}
}
}
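// Sketch: the wrapper evalInMain builds, reproduced as a pure function
// so the `return` requirement is visible. `wrapMainBodySketch` is a
// hypothetical name; the string layout mirrors evalInMain above, and
// evaluating it locally with eval only approximates what the
// inspector does with Runtime.evaluate.
function wrapMainBodySketch(body: string): string {
    return (
        'globalThis.__r = (async () => { ' +
        'const __v = await (async () => { ' +
        body +
        ' })(); ' +
        'return JSON.stringify(__v === undefined ? null : __v); ' +
        '})(); globalThis.__r;'
    );
}
// wrapMainBodySketch('return 1 + 1;') evaluates to a Promise of the
// string "2". A body with no `return` yields "null", because the
// inner function resolves to undefined.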

@@ -0,0 +1,158 @@
// Per-test config isolation.
//
// Decision 1 in docs/testing/automation.md calls for hermetic
// XDG_CONFIG_HOME / CLAUDE_CONFIG_DIR per test (S19 is the underlying
// primitive). Without it, persisted state leaks between tests:
// SingletonLock from one run blocks the next; S35's saved
// quickWindowPosition contaminates S29's closed-to-tray sanity; etc.
//
// Shape: each call to `createIsolation()` builds a fresh config root
// under $TMPDIR/claude-test-<random>/ and returns the env vars to merge
// into the spawned app, plus a teardown that removes the dir. Pass the
// same handle to multiple `launchClaude({ isolation })` calls when a
// test needs to launch the same app twice with shared state (e.g. S35
// position-memory across restart).
//
// `seedFromHost: true` extends this for tests that need the host's
// signed-in auth state (U01). The host directory itself stays
// untouched after the kill+copy: the test runs hermetically against
// a copy of just the auth-relevant files, and the tmpdir is rm -rf'd
// on cleanup so secrets never persist past the test process.
import { cp, mkdir, mkdtemp, rm, stat } from 'node:fs/promises';
import { homedir, tmpdir } from 'node:os';
import { join } from 'node:path';
import { killHostClaude } from './host-claude.js';
export interface Isolation {
configHome: string;
configDir: string;
cacheHome: string;
dataHome: string;
env: Record<string, string>;
cleanup(): Promise<void>;
}
export interface CreateIsolationOptions {
// When true: kill any running host Claude (LevelDB / SQLite hold
// writer locks while it runs), then copy the auth-relevant subset
// of $XDG_CONFIG_HOME/Claude into the new configDir. The host
// config never gets mutated by the test; secrets never leave the
// per-launch tmpdir.
seedFromHost?: boolean;
}
// Allowlist of relative paths under ~/.config/Claude/ that carry auth
// or first-launch UI state. Everything else is deliberately
// regenerated fresh in the tmpdir:
// - Cache/, Code Cache/, GPUCache/, Dawn*Cache/ — cheap to rebuild
// - blob_storage/, Crashpad/, logs/ — irrelevant to auth
// - SingletonLock, SingletonCookie, SingletonSocket — block startup
// - .org.chromium.Chromium.* — host-specific lock turds
// - claude-code-sessions/, claude-code-vm/, local-agent-mode-sessions/
// — large, account-specific, not needed for renderer auth
//
// Cookies + Local State are the auth-cookie pair (the latter holds
// the os_crypt key wrapper on platforms that need it). IndexedDB +
// Local Storage hold the renderer-side auth context that claude.ai's
// route guards check before redirecting to /login — cookies alone
// leave you bouncing back to login.
const SEED_PATHS = [
'Cookies',
'Cookies-journal',
'Local State',
'Local Storage',
'IndexedDB',
'Session Storage',
'WebStorage',
'SharedStorage',
'Network Persistent State',
'config.json',
'claude_desktop_config.json',
'developer_settings.json',
];
async function exists(path: string): Promise<boolean> {
try {
await stat(path);
return true;
} catch {
return false;
}
}
async function seedAuthFromHost(targetConfigDir: string): Promise<void> {
const hostConfigHome =
process.env.XDG_CONFIG_HOME ?? join(homedir(), '.config');
const hostClaudeDir = join(hostConfigHome, 'Claude');
if (!(await exists(hostClaudeDir))) {
throw new Error(
`seedFromHost: host config dir not found at ${hostClaudeDir}. ` +
'Sign into Claude Desktop on this machine first, then re-run.',
);
}
await mkdir(targetConfigDir, { recursive: true });
let copied = 0;
for (const rel of SEED_PATHS) {
const src = join(hostClaudeDir, rel);
if (!(await exists(src))) continue;
const dst = join(targetConfigDir, rel);
await cp(src, dst, {
recursive: true,
preserveTimestamps: true,
errorOnExist: false,
});
copied++;
}
if (copied === 0) {
throw new Error(
`seedFromHost: ${hostClaudeDir} exists but contains none of the ` +
'expected auth files. Open Claude Desktop, sign in, fully close, ' +
'and re-run.',
);
}
}
export async function createIsolation(
opts: CreateIsolationOptions = {},
): Promise<Isolation> {
const root = await mkdtemp(join(tmpdir(), 'claude-test-'));
const configHome = join(root, 'config');
const configDir = join(configHome, 'Claude');
const cacheHome = join(root, 'cache');
const dataHome = join(root, 'data');
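if (opts.seedFromHost) {
// Order matters: kill before copy. While the host app runs,
// LevelDB holds a LOCK file in IndexedDB/Local Storage that
// makes the directory unreadable to a second process, and
// SQLite Cookies has WAL pages that may not be checkpointed.
await killHostClaude();
try {
await seedAuthFromHost(configDir);
} catch (err) {
// Don't leak a partially seeded tmpdir (it may already hold
// copied auth files) when seeding fails.
await rm(root, { recursive: true, force: true });
throw err;
}
}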
const env: Record<string, string> = {
XDG_CONFIG_HOME: configHome,
XDG_CACHE_HOME: cacheHome,
XDG_DATA_HOME: dataHome,
// CLAUDE_CONFIG_DIR is honored by launcher-common.sh and by
// the app itself for picking the persisted-settings location.
CLAUDE_CONFIG_DIR: configDir,
};
return {
configHome,
configDir,
cacheHome,
dataHome,
env,
async cleanup() {
await rm(root, { recursive: true, force: true });
},
};
}
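// A minimal sketch of how the `env` from createIsolation() might be
// layered over the parent environment when spawning the app. The
// helper names, the sample root, and the merge order are assumptions
// for illustration; how launchClaude() actually merges is not shown
// in this file.

```typescript
// Hypothetical stand-in for the `env` field of an Isolation handle.
// Paths use plain string joins so the sketch needs no imports.
function sketchIsolationEnv(root: string): Record<string, string> {
  const configHome = `${root}/config`;
  return {
    XDG_CONFIG_HOME: configHome,
    XDG_CACHE_HOME: `${root}/cache`,
    XDG_DATA_HOME: `${root}/data`,
    CLAUDE_CONFIG_DIR: `${configHome}/Claude`,
  };
}

// Isolation values are spread last so they override anything the
// parent process already had set.
function mergeSpawnEnv(
  parentEnv: Record<string, string | undefined>,
  isolationEnv: Record<string, string>,
): Record<string, string | undefined> {
  return { ...parentEnv, ...isolationEnv };
}

const sketchEnv = mergeSpawnEnv(
  { HOME: '/home/tester', XDG_CONFIG_HOME: '/home/tester/.config' },
  sketchIsolationEnv('/tmp/claude-test-abc123'),
);
console.log(sketchEnv['XDG_CONFIG_HOME']);
console.log(sketchEnv['CLAUDE_CONFIG_DIR']);
```

// Spreading in the opposite order would let a host XDG_CONFIG_HOME
// win, which is the leak the isolation layer exists to prevent.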

@@ -0,0 +1,656 @@
// Quick Entry domain wrapper — single point of coupling to upstream's
// main-process structure for QE-* tests.
//
// Why centralize: upstream symbol names (Ko for popup, ut for main, h1
// for the visibility check) drift between releases per CLAUDE.md's
// "Working with Minified JavaScript" notes. If this lookup logic lives
// in 12 separate spec files, every release becomes a 12-file fix. If
// it lives here, it's one fix.
//
// Discovery strategy: don't rely on minified symbol names. Anchor on
// stable, observable shape instead:
// - Popup webContents = the one whose URL is a file:// path under
// `quick_window/` (see isPopupUrl()).
// - Popup BrowserWindow = the instance captured by
// installInterceptor() whose loadFile target is `quick-window.html`.
// - Main BrowserWindow = the one whose webContents URL contains
// "claude.ai".
//
// Shortcut injection: ydotool through /dev/uinput. Works on X11,
// XWayland, and native Wayland with portal-grabbed shortcuts (KDE-W,
// Ubu-W, KDE-X). Does NOT work where the OS-level grab itself is broken
// (#404 GNOME-W); that's the behavior under test, not a tool gap.
// installInterceptor() only records BrowserWindow refs. It does not
// open the popup, so every test still needs the shortcut as its
// trigger. For the closeout sweep the assumption is that ydotool is
// present and the OS grab works on the row under test. S11/S12
// explicitly test the grab path; everything else assumes it.
import { execFile } from 'node:child_process';
import { readFile } from 'node:fs/promises';
import { homedir } from 'node:os';
import { join } from 'node:path';
import { promisify } from 'node:util';
import type { InspectorClient } from './inspector.js';
import { retryUntil, sleep } from './retry.js';
const exec = promisify(execFile);
export interface WebContentsInfo {
id: number;
url: string;
}
export interface BrowserWindowState {
visible: boolean;
minimized: boolean;
fullScreen: boolean;
focused: boolean;
bounds: { x: number; y: number; width: number; height: number };
}
// Linux key codes for the upstream default Ctrl+Alt+Space accelerator.
// Override via constructor option for tests that exercise a remapped
// shortcut.
const DEFAULT_KEY_SEQUENCE = [
'29:1', // LEFTCTRL down
'56:1', // LEFTALT down
'57:1', // SPACE down
'57:0', // SPACE up
'56:0', // LEFTALT up
'29:0', // LEFTCTRL up
];
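// A small sketch of how a remapped accelerator could be turned into
// the `code:state` strings that `ydotool key` expects. The helper
// name is hypothetical and not part of the harness; it only encodes
// the press-in-order, release-in-reverse pattern of the default
// sequence.

```typescript
// Hypothetical helper (not part of the harness): builds the argument
// list for `ydotool key` from Linux input key codes.
function chordToKeySequence(keyCodes: number[]): string[] {
  // Press each key in order, then release in reverse order.
  const downs = keyCodes.map((code) => `${code}:1`);
  const ups = keyCodes.slice().reverse().map((code) => `${code}:0`);
  return downs.concat(ups);
}

// 29 = LEFTCTRL, 56 = LEFTALT, 57 = SPACE, the default chord.
const ctrlAltSpace = chordToKeySequence([29, 56, 57]);
console.log(ctrlAltSpace.join(' ')); // 29:1 56:1 57:1 57:0 56:0 29:0
```

// The result can be passed as the second QuickEntry constructor
// argument when a test exercises a remapped shortcut.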
export class QuickEntry {
constructor(
private readonly inspector: InspectorClient,
private readonly keySeq: string[] = DEFAULT_KEY_SEQUENCE,
) {}
// Capture BrowserWindow refs by hooking prototype methods, not the
// constructor.
//
// Why prototype-level: scripts/frame-fix-wrapper.js returns the
// electron module wrapped in a Proxy whose `get` trap returns a
// closure-captured PatchedBrowserWindow. A constructor-level wrap
// (`electron.BrowserWindow = Wrapped`) writes to the underlying
// module but the Proxy keeps returning PatchedBrowserWindow on
// reads, so the wrap is bypassed entirely. Hooking
// `BrowserWindow.prototype.loadFile` instead captures every
// instance regardless of which subclass it was constructed
// through — Patched, frame-fix-wrapped, or plain.
//
// The popup is identified by its loadFile target:
// `.vite/renderer/quick_window/quick-window.html`
// (build-reference index.js:515443).
async installInterceptor(): Promise<void> {
await this.inspector.evalInMain<null>(`
if (globalThis.__qeInterceptorInstalled) return null;
const electron = process.mainModule.require('electron');
const proto = electron.BrowserWindow.prototype;
globalThis.__qeWindows = [];
const origLoadFile = proto.loadFile;
proto.loadFile = function(filePath, ...rest) {
try {
const url = String(filePath || '');
globalThis.__qeWindows.push({
ref: this,
loadedFile: url,
});
} catch (e) { /* recording must never throw */ }
return origLoadFile.call(this, filePath, ...rest);
};
const origLoadURL = proto.loadURL;
proto.loadURL = function(url, ...rest) {
try {
globalThis.__qeWindows.push({
ref: this,
loadedFile: String(url || ''),
});
} catch (e) {}
return origLoadURL.call(this, url, ...rest);
};
globalThis.__qeInterceptorInstalled = true;
return null;
`);
}
// The popup is the BrowserWindow whose loadFile target ends with
// `quick-window.html`. Stable path — upstream uses it verbatim
// (build-reference index.js:515443).
private popupSelector(): string {
return `(w => {
if (!w || !w.ref || w.ref.isDestroyed()) return false;
const f = String(w.loadedFile || '');
return f.indexOf('quick-window.html') !== -1
|| f.indexOf('quick_window/') !== -1;
})`;
}
async listWebContents(): Promise<WebContentsInfo[]> {
return await this.inspector.evalInMain<WebContentsInfo[]>(`
const { webContents } = process.mainModule.require('electron');
return webContents.getAllWebContents().map(w => ({
id: w.id, url: w.getURL(),
}));
`);
}
// Find the popup by URL: a file:// page under `quick_window/` (see
// isPopupUrl). The claude.ai webContents never matches.
async getPopupWebContents(): Promise<WebContentsInfo | null> {
const all = await this.listWebContents();
const popup = all.find((w) => isPopupUrl(w.url));
return popup ?? null;
}
// Send the configured accelerator via ydotool. Errors out (caller
// can catch + skip) if ydotool isn't on PATH.
//
// YDOTOOL_SOCKET is honored from the parent env; defaults to
// /tmp/.ydotool_socket (the path the shipped systemd unit uses
// after the override drop-in). Without YDOTOOL_SOCKET, the client
// probes /run/user/$UID/.ydotool_socket — a location the daemon
// doesn't bind to, so the call fails confusingly.
async openViaShortcut(): Promise<void> {
await ensureYdotool();
await exec('ydotool', ['key', ...this.keySeq], {
env: {
...process.env,
YDOTOOL_SOCKET:
process.env.YDOTOOL_SOCKET ?? '/tmp/.ydotool_socket',
} as Record<string, string>,
});
}
// openViaShortcut + waitForPopupReady, with retry for the
// upstream-only-shows-when-logged-in race (build-reference
// index.js:515604: `function lHn() { return !user.isLoggedOut; }`).
// On a fresh launch, the renderer URL flips past /login before
// the main-process user object is populated; the first shortcut
// constructs the popup but skips show(). A second shortcut after
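// a brief settle hits the populated-user path. The delay runs only
// between attempts, and waitForPopupReady can take up to twice
// perAttemptMs (webContents wait, then visibility wait), so the worst
// case is about `attempts * 2 * perAttemptMs + (attempts - 1) * retryDelayMs`.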
async openAndWaitReady(opts: {
attempts?: number;
perAttemptMs?: number;
retryDelayMs?: number;
} = {}): Promise<void> {
const attempts = opts.attempts ?? 3;
const perAttemptMs = opts.perAttemptMs ?? 8_000;
const retryDelayMs = opts.retryDelayMs ?? 1_500;
let lastErr: unknown = null;
for (let i = 0; i < attempts; i++) {
await this.openViaShortcut();
try {
await this.waitForPopupReady(perAttemptMs);
return;
} catch (err) {
lastErr = err;
if (i < attempts - 1) await sleep(retryDelayMs);
}
}
throw new Error(
`openAndWaitReady: popup never became ready after ${attempts} ` +
`shortcut presses. Last error: ` +
(lastErr instanceof Error ? lastErr.message : String(lastErr)),
);
}
// Wait for the popup webContents to appear after openViaShortcut().
async waitForPopup(timeoutMs = 5000): Promise<WebContentsInfo> {
const wc = await retryUntil(
async () => this.getPopupWebContents(),
{ timeout: timeoutMs, interval: 100 },
);
if (!wc) {
throw new Error(
`Quick Entry popup webContents did not appear within ${timeoutMs}ms`,
);
}
return wc;
}
// Wait for the popup to become hidden (the upstream "submit
// accepted" signal). Upstream reuses the popup BrowserWindow
// across invocations — Ko stays alive, only the visibility
// toggles — so checking webContents existence would never
// resolve. Read isVisible() on the captured BrowserWindow ref
// instead.
async waitForPopupClosed(timeoutMs = 5000): Promise<void> {
const closed = await retryUntil(
async () => {
const state = await this.getPopupState();
if (!state) return true; // destroyed → closed
return state.visible ? null : true;
},
{ timeout: timeoutMs, interval: 100 },
);
if (!closed) {
throw new Error(
`Quick Entry popup did not become hidden within ${timeoutMs}ms`,
);
}
}
// Read live properties of the popup BrowserWindow. Replaces the
// previous getPopupConstructionArgs — construction-time options
// aren't observable through the prototype-method hook, but every
// upstream-relevant signal has a runtime equivalent. Frame state
// uses `getContentBounds() vs getBounds()` (frameless windows
// have equal content + frame bounds). Transparent uses the
// background color (popup is `#00000000`).
async getPopupRuntimeProps(): Promise<{
frameless: boolean;
transparent: boolean;
alwaysOnTop: boolean;
backgroundColor: string;
} | null> {
// `skipTaskbar` was previously reported here but BrowserWindow
// has no isSkipTaskbar() getter; the field hardcoded `false`
// regardless of how the popup was constructed, which is
// misleading. Dropped — no current spec consumes it. If a
// future test needs it, capture via a setSkipTaskbar wrap in
// installInterceptor() rather than faking a getter.
return await this.inspector.evalInMain(`
const wins = globalThis.__qeWindows || [];
const isPopup = ${this.popupSelector()};
const popup = wins.find(isPopup);
if (!popup || !popup.ref || popup.ref.isDestroyed()) return null;
const w = popup.ref;
const bounds = w.getBounds();
const content = w.getContentBounds();
const bg = (w.getBackgroundColor && w.getBackgroundColor()) || '';
return {
frameless: bounds.width === content.width && bounds.height === content.height,
transparent: bg === '#00000000' || bg === '#0000',
alwaysOnTop: w.isAlwaysOnTop(),
backgroundColor: bg,
};
`);
}
// Read the popup BrowserWindow's runtime visibility / bounds /
// focus / fullscreen state. Used by waitForPopupReady and
// waitForPopupClosed; the popup is reused across invocations
// (Ko stays alive, only visibility toggles), so isVisible() is
// the right "open vs closed" signal — not webContents existence.
async getPopupState(): Promise<(BrowserWindowState & { alwaysOnTop: boolean }) | null> {
return await this.inspector.evalInMain(`
const wins = globalThis.__qeWindows || [];
const isPopup = ${this.popupSelector()};
const popup = wins.find(isPopup);
if (!popup || !popup.ref || popup.ref.isDestroyed()) return null;
const w = popup.ref;
return {
visible: w.isVisible(),
minimized: w.isMinimized(),
fullScreen: w.isFullScreen(),
focused: w.isFocused(),
bounds: w.getBounds(),
alwaysOnTop: w.isAlwaysOnTop(),
};
`);
}
// Wait for the popup to be fully ready for input — meaning:
// (a) BrowserWindow has been show()n (isVisible === true),
// which only fires after upstream's `ready-to-show` event,
// which is after React's mount + first-pass effects, which
// is when document.addEventListener('keydown', ...) gets
// attached;
// (b) the textarea exists in the DOM.
// Without (a), first-time-mount typing fires keydown into a
// document with no listener and the submit silently drops.
async waitForPopupReady(timeoutMs = 5000): Promise<void> {
const popup = await this.waitForPopup(timeoutMs);
let lastState: unknown = null;
const ready = await retryUntil(
async () => {
const state = await this.getPopupState();
const dom = await this.inspector
.evalInMain<{
readyState: string;
hasTextarea: boolean;
} | null>(
`
const { webContents } = process.mainModule.require('electron');
const wc = webContents.fromId(${popup.id});
if (!wc || wc.isDestroyed()) return null;
return await wc.executeJavaScript(\`(() => ({
readyState: document.readyState,
hasTextarea: !!(document.querySelector('textarea')
|| document.querySelector('[contenteditable="true"]')),
}))()\`);
`,
)
.catch(() => null);
lastState = { state, dom };
if (!state || !state.visible) return null;
return dom && dom.hasTextarea ? dom : null;
},
{ timeout: timeoutMs, interval: 100 },
);
if (!ready) {
throw new Error(
`Popup did not become visible with a textarea within ${timeoutMs}ms. ` +
`Last observed: ${JSON.stringify(lastState)}`,
);
}
}
// Type a prompt into the popup's textarea and submit. The popup is
// a React app with a textarea + send button. React keeps its own
// record of the input's last value, so a plain `el.value = ...`
// followed by an input event looks like a no-op and onChange never
// fires. Calling the native HTMLTextAreaElement value setter and then
// dispatching `input` is the standard React-friendly path.
//
// Waits for the textarea to exist before dispatching — first-time
// lazy popup creation needs the React mount to complete, otherwise
// the input event lands before any state listener and upstream
// drops the submit as empty.
async typeAndSubmit(text: string): Promise<void> {
await this.waitForPopupReady();
const popup = await this.getPopupWebContents();
if (!popup) throw new Error('popup vanished after waitForPopupReady');
const popupId = popup.id;
await this.inspector.evalInMain<null>(`
const { webContents } = process.mainModule.require('electron');
const wc = webContents.fromId(${popupId});
if (!wc) throw new Error('popup webContents ${popupId} gone');
await wc.executeJavaScript(${JSON.stringify(typeAndSubmitJs(text))});
return null;
`);
}
// Read the persisted popup position (S35) directly from the
// on-disk store. electron-store defaults to `config.json` under the
// app's userData dir; for claude-desktop that's
// `${configDir}/config.json`, where configDir already ends in
// `Claude` (or `~/.config/Claude/config.json` when no isolation is
// in play). Reading the file beats the
// previous globalThis-walk: that probe matched any object with
// .get/.set returning a `quickWindowPosition` value, which is
// fragile against unrelated minified objects coincidentally
// matching the shape.
//
// Optional `configDir` keeps the call backward-compatible — pass
// `app.isolation?.configDir` from runners under per-test isolation,
// omit it to fall back to the host's `~/.config/Claude`.
async getStoredPosition(configDir?: string): Promise<unknown | null> {
const storePath = configDir
? join(configDir, 'config.json')
: join(homedir(), '.config/Claude/config.json');
try {
const raw = await readFile(storePath, 'utf8');
const parsed = JSON.parse(raw) as { quickWindowPosition?: unknown };
return parsed.quickWindowPosition ?? null;
} catch {
// File missing (never saved) or unreadable — both null.
return null;
}
}
}
// Upstream loads the popup via
// loadFile('.vite/renderer/quick_window/quick-window.html')
// (build-reference index.js:515443). Anchor on that exact path. Fall
// back to a broader 'quick_window/' substring if upstream renames just
// the HTML file.
export function isPopupUrl(url: string): boolean {
if (!url.startsWith('file://')) return false;
if (url.includes('claude.ai')) return false;
if (url.includes('quick_window/quick-window.html')) return true;
if (url.includes('/quick_window/')) return true;
return false;
}
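// A quick illustration of which URLs the popup check accepts. The
// function below is a copy of the same checks written with indexOf,
// and the install prefix `/opt/example-app` is a made-up path used
// only for the samples.

```typescript
// Sketch copy of the popup URL check; not the exported function.
function isPopupUrlSketch(url: string): boolean {
  if (url.indexOf('file://') !== 0) return false;
  if (url.indexOf('claude.ai') !== -1) return false;
  if (url.indexOf('quick_window/quick-window.html') !== -1) return true;
  return url.indexOf('/quick_window/') !== -1;
}

const popupUrlSamples: string[] = [
  // Exact upstream path: accepted.
  'file:///opt/example-app/.vite/renderer/quick_window/quick-window.html',
  // HTML file renamed but directory kept: accepted by the fallback.
  'file:///opt/example-app/.vite/renderer/quick_window/index.html',
  // A different renderer directory: rejected.
  'file:///opt/example-app/.vite/renderer/main_window/index.html',
  // Not a file:// URL: rejected.
  'https://claude.ai/new',
];

for (const sampleUrl of popupUrlSamples) {
  console.log(`${isPopupUrlSketch(sampleUrl)}  ${sampleUrl}`);
}
```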
// React-friendly value setter. document.activeElement isn't reliable
// because the popup may not have focus on construction; we walk the
// DOM for the only textarea (or contenteditable).
function typeAndSubmitJs(text: string): string {
const escaped = JSON.stringify(text);
return `
(async () => {
const input = document.querySelector('textarea')
|| document.querySelector('[contenteditable="true"]');
if (!input) throw new Error('no textarea/contenteditable in popup DOM');
input.focus();
if (input.tagName === 'TEXTAREA') {
const setter = Object.getOwnPropertyDescriptor(
HTMLTextAreaElement.prototype, 'value'
).set;
setter.call(input, ${escaped});
input.dispatchEvent(new Event('input', { bubbles: true }));
} else {
input.textContent = ${escaped};
input.dispatchEvent(new InputEvent('input', { bubbles: true, data: ${escaped} }));
}
// Submit via Enter keydown — popup binds its own keyhandler
// (renderer-side per the closeout doc).
input.dispatchEvent(new KeyboardEvent('keydown', {
key: 'Enter', code: 'Enter', keyCode: 13, which: 13,
bubbles: true, cancelable: true,
}));
input.dispatchEvent(new KeyboardEvent('keyup', {
key: 'Enter', code: 'Enter', keyCode: 13, which: 13,
bubbles: true,
}));
})()
`;
}
// Main-window state manipulation. Used by QE-7/8/9/10/11 to set the
// precondition (minimized, hidden-to-tray, fullscreen, etc.) before
// triggering Quick Entry.
//
// All methods walk webContents to find the claude.ai-hosting
// BrowserWindow via BrowserWindow.fromWebContents(). The
// `BrowserWindow.getAllWindows()` registry is broken by frame-fix-
// wrapper (see lib/inspector.ts gotchas) but `fromWebContents` uses a
// different code path and remains reliable.
export class MainWindow {
constructor(private readonly inspector: InspectorClient) {}
async setState(action: 'minimize' | 'hide' | 'show' | 'restore' | 'fullScreen' | 'unFullScreen' | 'focus' | 'close'): Promise<void> {
await this.inspector.evalInMain<null>(`
const { webContents, BrowserWindow } = process.mainModule.require('electron');
const main = webContents.getAllWebContents().find(w => w.getURL().includes('claude.ai'));
if (!main) throw new Error('no claude.ai webContents — main not yet loaded');
const win = BrowserWindow.fromWebContents(main);
if (!win) throw new Error('no BrowserWindow for claude.ai webContents');
switch (${JSON.stringify(action)}) {
case 'minimize': win.minimize(); break;
case 'hide': win.hide(); break;
case 'show': win.show(); break;
case 'restore': win.restore(); break;
case 'fullScreen': win.setFullScreen(true); break;
case 'unFullScreen':win.setFullScreen(false); break;
case 'focus': win.focus(); break;
// 'close' fires the BrowserWindow 'close' event so
// frame-fix-wrapper.js:178-185 (the close-to-tray
// interceptor) and the upstream before-quit flow
// run as they would on a real X-button click. NOT
// the same as 'hide' — that bypasses the wrapper.
// T08 asserts on this distinction.
case 'close': win.close(); break;
}
return null;
`);
// Compositor-side state changes are async — small settle.
await sleep(150);
}
async getState(): Promise<BrowserWindowState | null> {
return await this.inspector.evalInMain(`
const { webContents, BrowserWindow } = process.mainModule.require('electron');
const main = webContents.getAllWebContents().find(w => w.getURL().includes('claude.ai'));
if (!main) return null;
const win = BrowserWindow.fromWebContents(main);
if (!win || win.isDestroyed()) return null;
return {
visible: win.isVisible(),
minimized: win.isMinimized(),
fullScreen: win.isFullScreen(),
focused: win.isFocused(),
bounds: win.getBounds(),
};
`);
}
}
// Wait for the claude.ai user object to be loaded — the precondition
// for upstream's lHn() (`!user.isLoggedOut`) returning true. The
// shortcut handler calls Ko.show() only when lHn() is true; if the
// renderer hasn't finished loading the user yet, the popup gets
// constructed and ready-to-show fires, but show() is silently
// skipped (build-reference index.js:515604). The user object is
// available once the renderer has navigated past the login page —
// e.g. /new, /chat/<uuid>, /code, /projects.
//
// Returns the post-login URL on success. Returns null on timeout —
// caller can decide to skip vs fail.
//
// Anchored at the host root and bounded with a path-terminator class so
// only `/login`, `/auth`, `/sign-in` etc. as the *first* path segment
// match. The previous unanchored `/\/(login|auth|sign[-_]?in)/i` also
// caught longer segments like `/authorize` and any URL containing
// `/login` or `/auth` further down the path.
const LOGIN_URL_RE =
/^https?:\/\/[^/]+\/(login|auth|sign[-_]?in)(?:[/?#]|$)/i;
export async function waitForUserLoaded(
inspector: InspectorClient,
timeoutMs = 30_000,
): Promise<string | null> {
return await retryUntil(
async () => {
const urls = await inspector.evalInMain<string[]>(`
const { webContents } = process.mainModule.require('electron');
return webContents.getAllWebContents()
.filter(w => w.getURL().includes('claude.ai'))
.map(w => w.getURL());
`);
const postLogin = urls.find(
(u) => !LOGIN_URL_RE.test(u) && u.includes('claude.ai'),
);
return postLogin ?? null;
},
{ timeout: timeoutMs, interval: 250 },
);
}
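// A short check of the login-URL pattern against a few sample URLs,
// next to an unanchored variant for contrast. Both regexes are copied
// into the sketch under new names; the sample URLs are illustrative
// and not taken from real traffic.

```typescript
const LOGIN_URL_RE_SKETCH =
  /^https?:\/\/[^/]+\/(login|auth|sign[-_]?in)(?:[/?#]|$)/i;
const UNANCHORED_LOGIN_RE = /\/(login|auth|sign[-_]?in)/i;

function describeLoginMatch(url: string): string {
  const anchored = LOGIN_URL_RE_SKETCH.test(url);
  const unanchored = UNANCHORED_LOGIN_RE.test(url);
  return `anchored=${anchored} unanchored=${unanchored} ${url}`;
}

const loginUrlSamples: string[] = [
  'https://claude.ai/login', // first segment is /login
  'https://claude.ai/sign-in?next=%2Fnew', // query string terminates the segment
  'https://claude.ai/authorize', // longer segment, only the unanchored form hits
  'https://claude.ai/chat/abc/login', // /login deeper in the path
  'https://claude.ai/new', // post-login page
];

for (const sampleUrl of loginUrlSamples) {
  console.log(describeLoginMatch(sampleUrl));
}
```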
// Wait for a new chat session to load in the claude.ai webContents.
// Returns the URL once a /chat/<uuid> path is reached. This is the
// network-coupled half of the layered submit assertion (S31): a slow
// claude.ai or a network blip can fail this independently of any QE
// regression. Callers should treat its failure as a Should-level
// failure, not a Critical one.
const CHAT_URL_RE = /\/chat\/[0-9a-f-]{8,}/i;
export async function waitForNewChat(
inspector: InspectorClient,
timeoutMs = 15_000,
): Promise<string | null> {
return await retryUntil(
async () => {
const all = await inspector.evalInMain<{ url: string }[]>(`
const { webContents } = process.mainModule.require('electron');
return webContents.getAllWebContents()
.filter(w => w.getURL().includes('claude.ai'))
.map(w => ({ url: w.getURL() }));
`);
const match = all.find((w) => CHAT_URL_RE.test(w.url));
return match ? match.url : null;
},
{ timeout: timeoutMs, interval: 250 },
);
}
// Local-only assertion half: did the popup-side IPC fire with the
// right payload? Records the popup's `requestDismissWithPayload` IPC
// messages with a parallel listener on the main side. Start it before
// typeAndSubmit without awaiting, then await the returned promise
// after submitting: it installs the listener first and then polls,
// resolving with the captured payload (or null on timeout).
export async function captureSubmitIpc(
inspector: InspectorClient,
timeoutMs = 5000,
): Promise<{ text: string } | null> {
await inspector.evalInMain<null>(`
if (!globalThis.__qeIpcInstalled) {
const { ipcMain } = process.mainModule.require('electron');
globalThis.__qeIpcCalls = [];
// Channel names are minified-stable: requestDismiss /
// requestDismissWithPayload (closeout doc index.js:515409).
const channels = ['requestDismissWithPayload', 'requestDismiss'];
for (const ch of channels) {
// Best-effort: register a parallel listener that records
// invocations without disturbing the original handler.
// ipcMain.on only sees messages sent with ipcRenderer.send;
// channels served through ipcMain.handle are not observed.
ipcMain.on(ch, (_event, payload) => {
globalThis.__qeIpcCalls.push({ channel: ch, payload, ts: Date.now() });
});
}
globalThis.__qeIpcInstalled = true;
}
return null;
`);
return await retryUntil(
async () => {
const calls = await inspector.evalInMain<
{ channel: string; payload: unknown; ts: number }[]
>(`return globalThis.__qeIpcCalls || []`);
const submit = calls.find(
(c) =>
c.channel === 'requestDismissWithPayload' &&
c.payload != null &&
typeof c.payload === 'object',
);
if (!submit) return null;
const p = submit.payload as Record<string, unknown>;
const text =
typeof p.text === 'string'
? p.text
: typeof p.prompt === 'string'
? p.prompt
: typeof p.value === 'string'
? p.value
: '';
return { text };
},
{ timeout: timeoutMs, interval: 100 },
);
}
async function ensureYdotool(): Promise<void> {
try {
// `ydotool` with no args exits 1 and prints the help text — that
// confirms the binary works without sending input. Avoid
// `ydotool --help` which is rejected as an unknown command.
await exec('ydotool', [], {
env: {
...process.env,
YDOTOOL_SOCKET:
process.env.YDOTOOL_SOCKET ?? '/tmp/.ydotool_socket',
} as Record<string, string>,
});
} catch (err) {
const e = err as { code?: string | number; stderr?: string };
// exit 1 with usage help is normal — only fail on ENOENT (no
// binary) or stderr socket errors.
const stderr = (e.stderr ?? '').toString();
if (e.code === 'ENOENT') {
throw new Error(
'ydotool binary not found on PATH. Install with ' +
'`dnf install ydotool` / `apt install ydotool`.',
);
}
if (stderr.includes('failed to connect socket')) {
throw new Error(
'ydotoold socket not reachable. Start the daemon ' +
'(`sudo systemctl start ydotool.service`) and ensure ' +
'YDOTOOL_SOCKET points at its bind path. Underlying: ' +
stderr.trim(),
);
}
// Any other non-zero exit (notably exit 1 with usage) is fine.
}
}

@@ -0,0 +1,27 @@
export interface RetryOptions {
timeout?: number;
interval?: number;
message?: string;
}
export async function retryUntil<T>(
fn: () => Promise<T | null | undefined>,
options: RetryOptions = {},
): Promise<T | null> {
const timeout = options.timeout ?? 10_000;
const interval = options.interval ?? 250;
const start = Date.now();
while (Date.now() - start < timeout) {
const result = await fn();
if (result !== null && result !== undefined) {
return result;
}
await sleep(interval);
}
return null;
}
export function sleep(ms: number): Promise<void> {
return new Promise((resolve) => setTimeout(resolve, ms));
}

@@ -0,0 +1,48 @@
// Row-aware skip primitive.
//
// Spec files declare which matrix rows they apply to. Anything else is
// skipped (not failed) so the JUnit run carries `<skipped>` →
// `matrix.md` cell `-`. See Decision 1 in docs/testing/automation.md
// for the JUnit-to-cell mapping.
//
// Usage in a runner:
// skipUnlessRow(testInfo, ['KDE-W', 'GNOME-W', 'Ubu-W']);
//
// The skip reason is auto-formatted from the row list, so the caller
// doesn't have to write one for the dashboard.
import type { TestInfo } from '@playwright/test';
import { getEnv } from './env.js';
export type Row =
| 'KDE-W'
| 'KDE-X'
| 'GNOME-W'
| 'GNOME-X'
| 'Ubu-W'
| 'Ubu-X'
| 'COSMIC'
| 'Sway'
| 'Niri'
| 'Hypr-O'
| 'Hypr-N'
| 'i3';
export function currentRow(): string {
return getEnv().row;
}
export function skipUnlessRow(testInfo: TestInfo, allowed: Row[]): void {
const row = currentRow();
if (allowed.includes(row as Row)) return;
testInfo.skip(
true,
`row ${row} not in [${allowed.join(', ')}] — applies-to mismatch`,
);
}
export function skipOnRow(testInfo: TestInfo, blocked: Row[]): void {
const row = currentRow();
if (!blocked.includes(row as Row)) return;
testInfo.skip(true, `row ${row} excluded`);
}
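// A Playwright-free sketch of the skipUnlessRow decision, useful for
// seeing what reason string a given row produces. The helper and the
// narrowed row type are hypothetical; the real function calls
// testInfo.skip() and appends a suffix to the reason.

```typescript
type RowSketch = 'KDE-W' | 'KDE-X' | 'GNOME-W' | 'Ubu-W' | 'Sway' | 'i3';

// Returns null when the test applies to the row, otherwise the
// reason that would be reported for the skip.
function skipReasonFor(row: string, allowed: RowSketch[]): string | null {
  if (allowed.indexOf(row as RowSketch) !== -1) return null;
  return `row ${row} not in [${allowed.join(', ')}]`;
}

const waylandRows: RowSketch[] = ['KDE-W', 'GNOME-W', 'Ubu-W'];
console.log(skipReasonFor('KDE-W', waylandRows)); // null
console.log(skipReasonFor('i3', waylandRows)); // row i3 not in [KDE-W, GNOME-W, Ubu-W]
```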

@@ -0,0 +1,53 @@
import { getSessionBus, getConnectionPid, method } from './dbus.js';
import type { Variant } from 'dbus-next';
const WATCHER_DEST = 'org.kde.StatusNotifierWatcher';
const WATCHER_PATH = '/StatusNotifierWatcher';
const ITEM_IFACE = 'org.kde.StatusNotifierItem';
export interface SniItem {
service: string;
objectPath: string;
}
export async function listRegisteredItems(): Promise<SniItem[]> {
const bus = getSessionBus();
const proxy = await bus.getProxyObject(WATCHER_DEST, WATCHER_PATH);
const props = proxy.getInterface('org.freedesktop.DBus.Properties');
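// Properties.Get takes an interface name first; the watcher's
// interface shares its name with the bus name in WATCHER_DEST.
const result = await method(props, 'Get')(
WATCHER_DEST,
'RegisteredStatusNotifierItems',
);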
const variant = result as Variant<string[]>;
return variant.value.map(parseItemAddress);
}
export async function findItemByPid(pid: number): Promise<SniItem | null> {
const items = await listRegisteredItems();
for (const item of items) {
try {
const itemPid = await getConnectionPid(item.service);
if (itemPid === pid) {
return item;
}
} catch {
// connection may have gone away mid-iteration; skip
}
}
return null;
}
export async function activateItem(item: SniItem): Promise<void> {
const bus = getSessionBus();
const proxy = await bus.getProxyObject(item.service, item.objectPath);
const iface = proxy.getInterface(ITEM_IFACE);
await method(iface, 'Activate')(0, 0);
}
function parseItemAddress(raw: string): SniItem {
const slash = raw.indexOf('/');
if (slash === -1) {
return { service: raw, objectPath: '/StatusNotifierItem' };
}
return { service: raw.slice(0, slash), objectPath: raw.slice(slash) };
}
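// A standalone sketch of the address split, showing the two shapes a
// registered item string can take. The function is a renamed copy of
// the parser; the sample bus names are invented for illustration.

```typescript
interface SniItemSketch {
  service: string;
  objectPath: string;
}

function parseItemAddressSketch(raw: string): SniItemSketch {
  const slash = raw.indexOf('/');
  if (slash === -1) {
    // Bare bus name: fall back to the conventional object path.
    return { service: raw, objectPath: '/StatusNotifierItem' };
  }
  return { service: raw.slice(0, slash), objectPath: raw.slice(slash) };
}

// "<bus name><object path>" form.
const sniWithPath = parseItemAddressSketch(':1.42/org/example/Tray');
// Bare bus name form.
const sniBareName = parseItemAddressSketch('org.kde.StatusNotifierItem-4077-1');

console.log(sniWithPath.service, sniWithPath.objectPath);
console.log(sniBareName.service, sniBareName.objectPath);
```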

@@ -0,0 +1,71 @@
import { execFile } from 'node:child_process';
import { promisify } from 'node:util';
const exec = promisify(execFile);
export interface FrameExtents {
left: number;
right: number;
top: number;
bottom: number;
}
export async function findX11WindowByPid(pid: number): Promise<string | null> {
// Walk _NET_CLIENT_LIST and match on _NET_WM_PID. Pure xprop, no
// xdotool dependency — Electron's main window will surface here once
// the WM has accepted it.
const ids = await listClientWindows();
let firstMatch: string | null = null;
for (const id of ids) {
const wmPid = await getWindowPid(id);
if (wmPid !== pid) continue;
const title = await getWindowProperty(id, '_NET_WM_NAME');
if (title) return id;
if (!firstMatch) firstMatch = id;
}
return firstMatch;
}
async function listClientWindows(): Promise<string[]> {
try {
const { stdout } = await exec('xprop', ['-root', '_NET_CLIENT_LIST']);
// _NET_CLIENT_LIST(WINDOW): window id # 0x1234, 0x5678, ...
const m = stdout.match(/#\s*(.+)$/m);
if (!m) return [];
return m[1]!.split(',').map((s) => s.trim()).filter(Boolean);
} catch {
return [];
}
}
async function getWindowPid(windowId: string): Promise<number | null> {
const raw = await getWindowProperty(windowId, '_NET_WM_PID');
if (!raw) return null;
const n = parseInt(raw, 10);
return Number.isNaN(n) ? null : n;
}
export async function getFrameExtents(windowId: string): Promise<FrameExtents | null> {
const raw = await getWindowProperty(windowId, '_NET_FRAME_EXTENTS');
if (!raw) return null;
const nums = raw.split(',').map((s) => parseInt(s.trim(), 10));
if (nums.length !== 4 || nums.some(Number.isNaN)) return null;
return { left: nums[0]!, right: nums[1]!, top: nums[2]!, bottom: nums[3]! };
}
export async function getWindowTitle(windowId: string): Promise<string | null> {
const raw = await getWindowProperty(windowId, '_NET_WM_NAME');
if (!raw) return null;
const m = raw.match(/^"(.*)"$/s);
return m ? m[1]! : raw;
}
async function getWindowProperty(windowId: string, prop: string): Promise<string | null> {
try {
const { stdout } = await exec('xprop', ['-id', windowId, prop]);
const m = stdout.match(/=\s*(.+)$/m);
return m ? m[1]!.trim() : null;
} catch {
return null;
}
}
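// Illustrative sketch, not part of the original module: the two parsing
// steps above (getWindowProperty, then getFrameExtents) applied to one
// raw xprop line. `sketchParseFrameExtents` is a hypothetical name, and
// the sample lines in the trailing comments are assumed output shapes,
// not captured from a real session.
function sketchParseFrameExtents(
  xpropStdout: string,
): { left: number; right: number; top: number; bottom: number } | null {
  // Same regex as getWindowProperty: everything after `=` on the
  // first line that contains one.
  const m = xpropStdout.match(/=\s*(.+)$/m);
  if (!m || m[1] === undefined) return null;
  const nums = m[1].split(',').map((s) => parseInt(s.trim(), 10));
  if (nums.length !== 4 || nums.some(Number.isNaN)) return null;
  // EWMH orders _NET_FRAME_EXTENTS as left, right, top, bottom.
  const [left, right, top, bottom] = nums as [number, number, number, number];
  return { left, right, top, bottom };
}
// '_NET_FRAME_EXTENTS(CARDINAL) = 0, 0, 37, 0' parses to a 37px top border.
// A line with no `=` (property unset) or fewer than four values gives null.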

@@ -0,0 +1,184 @@
import { test, expect } from '@playwright/test';
import { spawn } from 'node:child_process';
import { existsSync } from 'node:fs';
import { dirname } from 'node:path';
import { createIsolation } from '../lib/isolation.js';
// H-prefix runners are HARNESS self-tests — they validate the test
// harness's preconditions and the build pipeline's invariants, distinct
// from T-tests (upstream test cases) and S-tests (doc-spec entries).
// They tend to be cheap (file probes, exit-code assertions) and exist
// to catch silent drift in the things our other tests assume.
//
// H01 — CDP auth gate canary.
//
// The whole L1 strategy (lib/electron.ts:96-110) hinges on the fact
// that the shipped Electron exits the app whenever
// `--remote-debugging-port` / `--remote-debugging-pipe` is on argv
// without a valid CLAUDE_CDP_AUTH token. If upstream removes that
// gate, every L1 test silently weakens — Playwright's
// `_electron.launch()` (which always injects --remote-debugging-port=0)
// would start working again, but our SIGUSR1-attach pathway would
// keep "passing" without exercising the contract it was built for.
//
// This canary spawns the bundled Electron directly with
// --remote-debugging-port=0 and NO auth token, then asserts the
// process exits with code 1 (the gate's `process.exit(1)` per
// lib/electron.ts:96-97) and was not killed by signal. Timeout
// without exit means the gate is gone.
//
// Spawn-only — no app stays running, no inspector attach, no
// X11 window probe. Pure exit-code observation under isolation
// so the host config never sees the failed launch.
//
// Row-independent: the gate's Linux behavior is the same on every
// row we ship to. Don't `skipUnlessRow`.
// DEFAULT_INSTALL_PATHS mirror lib/electron.ts:123-132 — kept inline
// rather than importing resolveInstall() so this canary can run even
// if a future change to electron.ts breaks the import surface (the
// canary should be the LEAST coupled spec to any moving part).
const DEFAULT_INSTALL_PATHS: { electron: string; asar: string }[] = [
{
electron: '/usr/lib/claude-desktop/node_modules/electron/dist/electron',
asar: '/usr/lib/claude-desktop/node_modules/electron/dist/resources/app.asar',
},
{
electron: '/opt/Claude/node_modules/electron/dist/electron',
asar: '/opt/Claude/node_modules/electron/dist/resources/app.asar',
},
];
function resolveInstallInline(): { electron: string; asar: string } {
const envBin = process.env.CLAUDE_DESKTOP_ELECTRON;
const envAsar = process.env.CLAUDE_DESKTOP_APP_ASAR;
if (envBin && envAsar) return { electron: envBin, asar: envAsar };
for (const candidate of DEFAULT_INSTALL_PATHS) {
if (existsSync(candidate.electron) && existsSync(candidate.asar)) {
return candidate;
}
}
throw new Error(
'Could not locate claude-desktop install. Set CLAUDE_DESKTOP_ELECTRON ' +
'and CLAUDE_DESKTOP_APP_ASAR, or install the deb/rpm package.',
);
}
test.setTimeout(30_000);
test('H01 — CDP auth gate fires on --remote-debugging-port without token', async ({}, testInfo) => {
testInfo.annotations.push({ type: 'severity', description: 'Critical' });
testInfo.annotations.push({ type: 'surface', description: 'CDP auth gate' });
const { electron: electronBin, asar } = resolveInstallInline();
const appDir = dirname(dirname(dirname(dirname(electronBin))));
// Fresh isolation — the gate trips before any persisted state is
// touched, but if anything sneaks past `process.exit(1)` we'd
// rather it write to /tmp than ~/.config/Claude.
const isolation = await createIsolation();
const start = Date.now();
// Raw spawn — no LAUNCHER_INJECTED_FLAGS, no isolation env beyond
// what we set explicitly. The OPPOSITE of launchClaude(): we WANT
// the debug-port flag on argv so the gate fires.
const argv = [
'--remote-debugging-port=0',
asar,
];
// Build env: scrub CLAUDE_CDP_AUTH so a developer who set it
// locally doesn't accidentally pass the gate. Keep the rest of
// the parent env so Electron's normal load path (DISPLAY,
// XDG_RUNTIME_DIR, etc.) still works up to the gate check.
const env: Record<string, string> = {};
for (const [k, v] of Object.entries(process.env)) {
if (v !== undefined) env[k] = v;
}
delete env.CLAUDE_CDP_AUTH;
for (const [k, v] of Object.entries(isolation.env)) {
env[k] = v;
}
const proc = spawn(electronBin, argv, {
cwd: appDir,
env,
stdio: 'ignore',
detached: false,
});
let exitCode: number | null = null;
let signalCode: NodeJS.Signals | null = null;
let timedOut = false;
try {
await Promise.race([
new Promise<void>((resolve) => {
proc.once('exit', (code, signal) => {
exitCode = code;
signalCode = signal;
resolve();
});
}),
new Promise<void>((resolve) => {
setTimeout(() => {
timedOut = true;
resolve();
}, 10_000);
}),
]);
} finally {
// If the gate didn't fire we have a live Electron — kill it
// hard so the test environment isn't polluted by a running
// app pointed at the host's display.
if (proc.exitCode === null && proc.signalCode === null) {
proc.kill('SIGKILL');
await new Promise<void>((resolve) => {
proc.once('exit', () => resolve());
setTimeout(() => resolve(), 2_000);
});
}
await isolation.cleanup();
}
const elapsedMs = Date.now() - start;
await testInfo.attach('spawn-argv', {
body: JSON.stringify([electronBin, ...argv], null, 2),
contentType: 'application/json',
});
await testInfo.attach('exit-info', {
body: JSON.stringify(
{
exitCode,
signalCode,
timedOut,
elapsedMs,
note:
'Gate fires via process.exit(1) (lib/electron.ts:96-107). ' +
'exitCode=1, signalCode=null is the canonical signature.',
},
null,
2,
),
contentType: 'application/json',
});
if (timedOut) {
throw new Error(
'CDP gate did not fire: the app stayed running with ' +
'--remote-debugging-port and no auth token, so the gate may ' +
'have been removed (lib/electron.ts:96-107). The L1 test ' +
'strategy depends on this gate being present.',
);
}
expect(
exitCode,
'gate exits with code 1 (process.exit(1) in index.pre.js)',
).toBe(1);
expect(
signalCode,
'process exited via gate, not killed by signal',
).toBe(null);
});
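// Illustrative sketch, not part of the original spec: the exit-vs-timeout
// race used above, factored into a helper. `sketchWaitForExit`,
// `SketchExitSource` and `SketchExitOutcome` are hypothetical names; the
// structural type stands in for the small part of ChildProcess that the
// race relies on. Unlike the inline version, this one clears its timer
// once the process exits.
interface SketchExitSource {
  once(
    event: 'exit',
    listener: (code: number | null, signal: string | null) => void,
  ): unknown;
}
type SketchExitOutcome =
  | { timedOut: false; code: number | null; signal: string | null }
  | { timedOut: true };
function sketchWaitForExit(
  proc: SketchExitSource,
  timeoutMs: number,
): Promise<SketchExitOutcome> {
  return new Promise((resolve) => {
    const timer = setTimeout(() => resolve({ timedOut: true }), timeoutMs);
    proc.once('exit', (code, signal) => {
      clearTimeout(timer);
      resolve({ timedOut: false, code, signal });
    });
  });
}
// The gate's canonical signature would then read as
// { timedOut: false, code: 1, signal: null }.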

@@ -0,0 +1,145 @@
import { test, expect } from '@playwright/test';
import { listAsar, readAsarFile, resolveAsarPath } from '../lib/asar.js';
// H02 — frame-fix-wrapper presence (file probe).
//
// The wrapper at scripts/frame-fix-wrapper.js is the linchpin of every
// Linux frame fix (close-to-tray, autostart shim, KWin child-bounds
// jiggle, AZERTY Ctrl+Q). It's injected by patch_app_asar in
// scripts/patches/app-asar.sh:18-49: the script copies the wrapper
// into the asar root, writes a frame-fix-entry.js shim that requires
// it, then rewrites package.json's `main` to point at the shim.
//
// If any of those steps silently breaks (missing source file, asar
// pack failure, package.json rewrite skipped), the app reverts to
// upstream's frameless-window behavior on every Linux row and our
// test harness's hook patterns (CLAUDE.md "Hooking Electron")
// stop matching what's loaded. S09 only covers the quick-window
// patch; nothing else asserts the wrapper landed at all.
//
// Four checks, ordered cheapest-first:
// 1. Both files exist in the asar manifest.
// 2. frame-fix-wrapper.js contains `Proxy(` (the Proxy pattern is
// the entire reason the wrapper works — see CLAUDE.md and
// lib/quickentry.ts:75-81).
// 3. frame-fix-entry.js requires the wrapper.
// 4. package.json's `main` references frame-fix-entry (substring,
// not exact, since patches don't always preserve `.js`).
//
// Pure file probe — no app launch. Fast (<1s). Row-independent.
test('H02 — frame-fix-wrapper.js + frame-fix-entry.js injected into app.asar', async ({}, testInfo) => {
testInfo.annotations.push({ type: 'severity', description: 'Critical' });
testInfo.annotations.push({
type: 'surface',
description: 'Frame fix wrapper injection',
});
const asarPath = resolveAsarPath();
await testInfo.attach('asar-path', {
body: asarPath,
contentType: 'text/plain',
});
// 1. Manifest probe. listAsar returns full paths inside the
// archive (e.g. '/frame-fix-wrapper.js' or 'frame-fix-wrapper.js'
// depending on @electron/asar's normalization). Use endsWith
// so either form matches.
const manifest = listAsar(asarPath);
const frameFixFiles = manifest.filter(
(p) =>
p.endsWith('frame-fix-wrapper.js') ||
p.endsWith('frame-fix-entry.js'),
);
const wrapperPresent = frameFixFiles.some((p) =>
p.endsWith('frame-fix-wrapper.js'),
);
const entryPresent = frameFixFiles.some((p) =>
p.endsWith('frame-fix-entry.js'),
);
await testInfo.attach('frame-fix-files', {
body: JSON.stringify(
{
found: frameFixFiles,
wrapperPresent,
entryPresent,
},
null,
2,
),
contentType: 'application/json',
});
expect(
wrapperPresent,
'frame-fix-wrapper.js is present in app.asar manifest',
).toBe(true);
expect(
entryPresent,
'frame-fix-entry.js is present in app.asar manifest',
).toBe(true);
// 2. Wrapper contents — the Proxy pattern is the load-bearing
// structure (see scripts/frame-fix-wrapper.js:491-506 and
// CLAUDE.md "Frame Fix Wrapper" section). A wrapper without
// a Proxy is a stub that doesn't intercept anything.
const wrapper = readAsarFile('frame-fix-wrapper.js', asarPath);
const proxyPresent = wrapper.includes('Proxy(');
expect(
proxyPresent,
'frame-fix-wrapper.js uses the Proxy() pattern (CLAUDE.md "Frame Fix Wrapper")',
).toBe(true);
// 3. Entry shim — it must require the wrapper, otherwise it's
// not actually loading any of the patches.
const entry = readAsarFile('frame-fix-entry.js', asarPath);
const entryRequiresWrapper =
entry.includes("require('./frame-fix-wrapper") ||
entry.includes('require("./frame-fix-wrapper');
expect(
entryRequiresWrapper,
'frame-fix-entry.js requires ./frame-fix-wrapper',
).toBe(true);
// 4. package.json `main`: patch_app_asar in app-asar.sh:40-49
// rewrites pkg.main to 'frame-fix-entry.js'. Substring match
// on 'frame-fix-entry' tolerates patches that change the
// shim's extension or otherwise rename it while keeping
// that stem.
const pkgJsonRaw = readAsarFile('package.json', asarPath);
let mainEntry = '';
try {
const parsed = JSON.parse(pkgJsonRaw) as { main?: unknown };
if (typeof parsed.main === 'string') mainEntry = parsed.main;
} catch (err) {
throw new Error(
'package.json in app.asar is not valid JSON: ' +
(err instanceof Error ? err.message : String(err)),
);
}
await testInfo.attach('package-main', {
body: JSON.stringify({ main: mainEntry }, null, 2),
contentType: 'application/json',
});
expect(
mainEntry.includes('frame-fix-entry'),
'package.json `main` references frame-fix-entry (app-asar.sh:40-49)',
).toBe(true);
await testInfo.attach('evidence', {
body: JSON.stringify(
{
wrapperPresent,
entryPresent,
proxyPresent,
entryRequiresWrapper,
mainEntry,
},
null,
2,
),
contentType: 'application/json',
});
});

@@ -0,0 +1,161 @@
import { test, expect } from '@playwright/test';
import { readAsarFile, resolveAsarPath } from '../lib/asar.js';
// H03 — build pipeline patch fingerprints (file probe).
//
// scripts/patches/*.sh layers a stack of regex-based mutations onto
// the bundled JS at build time. Each patch lands a distinctive
// string somewhere in the asar; if a patch silently skips (anchor
// regex misses, idempotency guard short-circuits the wrong way,
// build orchestrator drops the call), that string is absent and
// the patch's behavior is gone.
//
// S09 already covers quick-window.sh. This test consolidates the
// rest into one manifest so future drift is observable in a single
// JSON dump. Fingerprints are pinned to STRINGS THE PATCH INJECTS
// (not strings the patch matches against), so an upstream rename
// of the matched site doesn't false-positive a passing patch.
//
// Pure file probe — no app launch. Fast (<1s). Row-independent.
interface PatchEntry {
patch: string;
fingerprint: string;
file: string;
// One-line note explaining where the fingerprint comes from
// in the patch script — surfaced in the attached manifest so
// future maintainers can tie a failure back to the right
// scripts/patches/*.sh:LINE.
source: string;
}
const MANIFEST: PatchEntry[] = [
{
patch: 'quick-window.sh',
fingerprint: 'XDG_CURRENT_DESKTOP',
file: '.vite/build/index.js',
source:
'patches/quick-window.sh injects an XDG_CURRENT_DESKTOP env-var ' +
'gate; same fingerprint S09 asserts.',
},
{
patch: 'app-asar.sh (frame-fix injection)',
fingerprint: 'frame-fix-entry',
file: 'package.json',
source:
'patches/app-asar.sh:40-49 rewrites package.json main to ' +
"'frame-fix-entry.js'.",
},
{
patch: 'tray.sh (startup-delay nativeTheme guard)',
fingerprint: '_trayStartTime',
file: '.vite/build/index.js',
source:
'patches/tray.sh:67-69 injects `let _trayStartTime=Date.now();` ' +
"into the nativeTheme `on('updated')` handler. Variable name " +
'is unique to our patch — upstream never declares it.',
},
{
patch: 'cowork.sh (Linux daemon quit handler)',
fingerprint: 'cowork-linux-daemon-shutdown',
file: '.vite/build/index.js',
source:
'patches/cowork.sh:602-605 registers a Linux-only quit handler ' +
"with name:'cowork-linux-daemon-shutdown'. Distinctive string " +
'unique to the patch.',
},
{
patch: 'claude-code.sh (Linux platform branch)',
fingerprint: 'linux-arm64',
file: '.vite/build/index.js',
source:
'patches/claude-code.sh:20-24 injects `linux-arm64` / `linux-x64` ' +
'platform-bundle branches into getHostPlatform. Upstream throws ' +
'on Linux; the string is absent without the patch.',
},
];
// TODOs intentionally left where a stable fingerprint isn't easy:
// - tray.sh has multiple sub-patches (icon selection, in-place
// update, menu-bar default). _trayStartTime above covers the
// menu-handler patch reliably; the in-place update patch
// anchors on a generated name like `${TRAY_VAR}.setImage(...)`
// where TRAY_VAR is minifier-renamed every release, so no
// fingerprint there is stable enough to assert without a
// second extraction step. Acceptable: the menu-handler
// fingerprint is upstream of the in-place patch in the same
// subsystem, so a missing _trayStartTime implies a much
// bigger build problem anyway.
test('H03 — build pipeline patch fingerprints present in app.asar', async ({}, testInfo) => {
testInfo.annotations.push({ type: 'severity', description: 'Critical' });
testInfo.annotations.push({
type: 'surface',
description: 'Build pipeline patch fingerprints',
});
const asarPath = resolveAsarPath();
await testInfo.attach('asar-path', {
body: asarPath,
contentType: 'text/plain',
});
// Read each unique file once, then check fingerprints against
// the cached contents. Saves repeated asar extraction for
// patches that share a target file.
const fileCache = new Map<string, string>();
const results: {
patch: string;
fingerprint: string;
file: string;
source: string;
found: boolean;
}[] = [];
for (const entry of MANIFEST) {
let contents = fileCache.get(entry.file);
if (contents === undefined) {
try {
contents = readAsarFile(entry.file, asarPath);
fileCache.set(entry.file, contents);
} catch (err) {
// File missing — record as a "not found" result so
// the manifest dump shows the failure shape rather
// than aborting on the first hiccup.
results.push({
patch: entry.patch,
fingerprint: entry.fingerprint,
file: entry.file,
source:
entry.source +
' [READ ERROR: ' +
(err instanceof Error ? err.message : String(err)) +
']',
found: false,
});
continue;
}
}
results.push({
patch: entry.patch,
fingerprint: entry.fingerprint,
file: entry.file,
source: entry.source,
found: contents.includes(entry.fingerprint),
});
}
// Always attach the manifest — passing tests should still
// surface the verified fingerprints so future drift is visible
// without re-running with -v.
await testInfo.attach('patch-manifest', {
body: JSON.stringify(results, null, 2),
contentType: 'application/json',
});
const missing = results.filter((r) => !r.found);
expect(
missing,
'every expected patch fingerprint is present in the bundled app.asar',
).toEqual([]);
});
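// Illustrative sketch, not part of the original spec: the manifest loop
// above reduced to a pure function over in-memory file contents, which
// makes the pass/fail rule easy to see. `sketchMissingPatches` is a
// hypothetical name. A file absent from the map counts as a miss,
// mirroring the READ ERROR branch in the test.
function sketchMissingPatches(
  entries: { patch: string; file: string; fingerprint: string }[],
  files: Map<string, string>,
): string[] {
  return entries
    .filter((e) => !(files.get(e.file) ?? '').includes(e.fingerprint))
    .map((e) => e.patch);
}
// An empty result corresponds to the `toEqual([])` assertion above.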

@@ -0,0 +1,205 @@
import { test, expect } from '@playwright/test';
import { execFile } from 'node:child_process';
import { promisify } from 'node:util';
import { launchClaude } from '../lib/electron.js';
import { skipUnlessRow } from '../lib/row.js';
import { sleep } from '../lib/retry.js';
import { captureSessionEnv } from '../lib/diagnostics.js';
const exec = promisify(execFile);
// H04 — cowork daemon spawn / cleanup contract.
//
// docs/learnings/cowork-vm-daemon.md describes the contract that
// patches/cowork.sh implements: the app's auto-launch path
// (cowork.sh:262-362) forks cowork-vm-service.js as a detached
// child on first VM-service connection attempt, and the Linux
// quit handler registered at cowork.sh:584-633 SIGTERMs that
// daemon on app exit. No existing test asserts that contract
// end-to-end. If the auto-launch regresses, the app falls back
// to "VM service not running" errors silently; if the quit
// handler regresses, daemons leak across app sessions and
// pollute the next launch's socket binding.
//
// Shape: pgrep baseline (must be empty after launchClaude's
// cleanupPreLaunch — see lib/electron.ts:160-191), launch with
// isolation, wait for mainVisible, poll for a daemon pid, then
// close + verify cleanup.
//
// The daemon spawn is conditional — cowork.sh:265 anchors on
// 'VM service not running. The service failed to start.' which
// only fires when something in the renderer triggers a VM
// connection. On a freshly-launched app that never hits the
// Cowork tab, the daemon may legitimately not appear within
// the budget. Treat that as `testInfo.skip` rather than a fail.
//
// Row-gated to the same set as the QE tests — daemon is a Linux
// thing, gating mirrors S30.
const PGREP_PATTERN = 'cowork-vm-service\\.js';
async function pgrepPids(pattern: string): Promise<Set<number>> {
try {
const { stdout } = await exec('pgrep', ['-f', pattern], {
timeout: 5_000,
});
return new Set(
stdout
.split('\n')
.map((l) => parseInt(l.trim(), 10))
.filter((n) => !Number.isNaN(n)),
);
} catch (err) {
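// pgrep exits 1 with empty stdout when nothing matches; treat
// that as the empty set. Any other failure (timeout, pgrep
// missing) is not rethrown: we salvage whatever pids reached
// stdout, which may also be the empty set.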
const e = err as { code?: number; stdout?: string };
if (e.code === 1) return new Set();
const out = e.stdout ?? '';
return new Set(
out
.split('\n')
.map((l) => parseInt(l.trim(), 10))
.filter((n) => !Number.isNaN(n)),
);
}
}
test.setTimeout(60_000);
test('H04 — cowork daemon spawns under app, exits with app', async ({}, testInfo) => {
testInfo.annotations.push({ type: 'severity', description: 'Should' });
testInfo.annotations.push({
type: 'surface',
description: 'Cowork daemon lifecycle',
});
skipUnlessRow(testInfo, ['KDE-W', 'GNOME-W', 'Ubu-W', 'KDE-X', 'GNOME-X']);
await testInfo.attach('session-env', {
body: JSON.stringify(captureSessionEnv(), null, 2),
contentType: 'application/json',
});
// Baseline — launchClaude's cleanupPreLaunch (lib/electron.ts:160-191)
// pkills any leftover cowork daemon before spawning, so a stray
// pid here would mean the cleanup itself is broken.
const baselinePids = await pgrepPids(PGREP_PATTERN);
await testInfo.attach('baseline-pids', {
body: JSON.stringify(
{
pids: Array.from(baselinePids),
note:
'cleanupPreLaunch should leave this empty before launch. ' +
'Non-empty here is a bug in lib/electron.ts:160-191.',
},
null,
2,
),
contentType: 'application/json',
});
const useHostConfig = process.env.CLAUDE_TEST_USE_HOST_CONFIG === '1';
const app = await launchClaude({
isolation: useHostConfig ? null : undefined,
});
let daemonPid: number | null = null;
let lingeringPids: number[] = [];
try {
// mainVisible — main shell up; the daemon spawn is gated on
// renderer activity (cowork.sh:262-362) which can begin
// asynchronously after the shell paints. Lower readiness
// levels race the spawn window.
await app.waitForReady('mainVisible');
// Poll up to 15s for a new daemon pid. cowork.sh's auto-
// launch only fires when the renderer attempts a VM service
// connection; on a passive launch (no Cowork tab interaction)
// the daemon may legitimately not appear in this window.
const start = Date.now();
while (Date.now() - start < 15_000) {
const pids = await pgrepPids(PGREP_PATTERN);
const newPids = Array.from(pids).filter(
(p) => !baselinePids.has(p),
);
if (newPids.length > 0) {
daemonPid = newPids[0]!;
break;
}
await sleep(500);
}
if (daemonPid === null) {
await testInfo.attach('skip-reason', {
body: JSON.stringify(
{
reason:
'cowork daemon not spawned within 15s of mainVisible',
note:
'Auto-launch in cowork.sh:262-362 is gated on a VM ' +
'service connection attempt from the renderer; on a ' +
'passive launch with no Cowork-tab interaction it may ' +
'legitimately not fire. Not a regression on its own.',
},
null,
2,
),
contentType: 'application/json',
});
testInfo.skip(
true,
'cowork daemon not spawned by this build — gating in ' +
'cowork.sh:262-362 may have suppressed it on a passive launch',
);
return;
}
await testInfo.attach('daemon-spawned', {
body: JSON.stringify(
{
pid: daemonPid,
elapsedMs: Date.now() - start,
},
null,
2,
),
contentType: 'application/json',
});
} finally {
await app.close();
}
// Quit handler (cowork.sh:584-633) waits up to 10s for the
// daemon to exit after SIGTERM. Give it a 5s settle window —
// graceful exit is the common case, but on a slow runner the
// kill loop's poll cadence (200ms × 50) can stretch. Re-pgrep
// after the wait.
await sleep(5_000);
const postExitPids = await pgrepPids(PGREP_PATTERN);
lingeringPids = Array.from(postExitPids).filter(
(p) => p === daemonPid || !baselinePids.has(p),
);
await testInfo.attach('post-exit-pgrep', {
body: JSON.stringify(
{
baseline: Array.from(baselinePids),
postExit: Array.from(postExitPids),
lingering: lingeringPids,
note:
'Lingering daemon pids after app.close() indicate the ' +
'Linux quit handler in cowork.sh:584-633 did not run, ' +
'or its 10s SIGTERM-then-noop loop completed without ' +
'the daemon actually exiting (escalate to SIGKILL upstream).',
},
null,
2,
),
contentType: 'application/json',
});
expect(
lingeringPids,
'no cowork-vm-service daemon lingers 5s after app.close()',
).toEqual([]);
});

@@ -0,0 +1,356 @@
import { test, expect } from '@playwright/test';
import { spawn, execFile } from 'node:child_process';
import { existsSync, statSync } from 'node:fs';
import { mkdtemp, open, rm } from 'node:fs/promises';
import { promisify } from 'node:util';
import { tmpdir } from 'node:os';
import { join } from 'node:path';
const exec = promisify(execFile);
// S01 — AppImage launches without manual `libfuse2t64` install.
//
// Per docs/testing/cases/distribution.md S01: on Ubuntu 24.04+ the
// project AppImage currently fails with `dlopen(): error loading
// libfuse.so.2` unless the user manually installs `libfuse2t64`.
// The case-doc anchor (scripts/packaging/appimage.sh:226) notes the
// upstream `appimagetool` runtime is bundled as-is — no FUSE shim,
// no postinst dep declaration, no clear error message. CI papers
// over the gap by `apt install libfuse2`-ing before exec
// (.github/workflows/test-artifacts.yml:47).
//
// Assertion shape:
// 1. Locate an AppImage. Skip cleanly if not running from one.
// 2. Spawn the AppImage with a brief grace window. Capture stderr.
// 3. Assert stderr does NOT contain `libfuse.so.2` (or the broader
// `dlopen` failure pattern that the AppImage runtime emits when
// FUSE is missing).
// 4. Kill the proc — we don't need a full launch, just the FUSE
// load attempt which happens before any squashfs mount.
//
// Why a runtime spawn rather than a static probe: the failure mode
// is `dlopen()` of libfuse.so.2 inside the AppImage runtime ELF
// itself, not anything our scripts produce. Only a real spawn on
// the target host exercises that dynamic loader path.
//
// Approach choice: we do NOT use `--appimage-version`. That flag is
// handled by the AppImage runtime BEFORE any FUSE mount, so it
// would exit 0 even on a host missing libfuse2 and silently pass
// the test. Instead we let the runtime reach its mount step, watch
// stderr for the dlopen error (which fires within ~100ms when the
// lib is absent), then kill before the Electron child has a chance
// to persist anything.
//
// Isolation: we spawn with a temp `XDG_CONFIG_HOME` / `HOME`-adjacent
// override so even if Electron does come up briefly before we kill
// it, nothing lands in `~/.config/Claude`.
//
// Row gating: this isn't matrix-row-driven — it's install-method-
// driven. The harness's `ROW` env doesn't carry "is this row's
// install an AppImage?", so we detect at runtime via launcher path
// + magic-byte sniff. Skip when the local install isn't AppImage.
interface AppImageProbeResult {
path: string | null;
reason: string;
}
// AppImages are ELF executables with a squashfs image appended. The
// ELF header carries the AppImage magic at offset 8: `AI\x02` for
// type 2 (the format our build emits) or `AI\x01` for type 1. The
// magic is also visible to `file`, but checking ELF + extension +
// magic inline is cheap enough.
async function probeAppImagePath(): Promise<AppImageProbeResult> {
const explicit = process.env.CLAUDE_DESKTOP_LAUNCHER;
const candidates: string[] = [];
if (explicit) candidates.push(explicit);
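// Fallback search: project test-build dir holds AppImages from
// `./build.sh --build appimage`. Resolve relative to this spec
// (tools/test-harness/src/runners/, four levels below the repo
// root) so the search works regardless of CWD or checkout path.
// Assumes the harness runs as an ES module, so import.meta.url
// is available.
const projectRoot = decodeURIComponent(
new URL('../../../..', import.meta.url).pathname,
).replace(/\/$/, '');
const testBuildDir = `${projectRoot}/test-build`;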
if (existsSync(testBuildDir)) {
try {
const fs = await import('node:fs/promises');
const entries = await fs.readdir(testBuildDir);
for (const entry of entries) {
if (entry.endsWith('.AppImage')) {
candidates.push(`${testBuildDir}/${entry}`);
}
}
} catch {
// best-effort
}
}
for (const candidate of candidates) {
if (!existsSync(candidate)) continue;
try {
const st = statSync(candidate);
if (!st.isFile()) continue;
// Quick filename hint: skip the magic-byte read entirely
// for unambiguous .AppImage suffixes.
if (candidate.endsWith('.AppImage')) {
return { path: candidate, reason: 'matched .AppImage suffix' };
}
// Magic-byte sniff: ELF (`\x7fELF`) at offset 0, AppImage
// type marker `AI\x02` at offset 8.
const fh = await open(candidate, 'r');
try {
const buf = Buffer.alloc(12);
await fh.read(buf, 0, 12, 0);
const elf = buf.subarray(0, 4).toString('hex') === '7f454c46';
const aiMagic = buf.subarray(8, 11);
const isAppImage =
elf &&
aiMagic[0] === 0x41 &&
aiMagic[1] === 0x49 &&
(aiMagic[2] === 0x01 || aiMagic[2] === 0x02);
if (isAppImage) {
return {
path: candidate,
reason: 'matched AppImage magic bytes',
};
}
} finally {
await fh.close();
}
} catch {
// fall through to next candidate
}
}
return {
path: null,
reason:
'no AppImage found via CLAUDE_DESKTOP_LAUNCHER or ' +
`${testBuildDir}/*.AppImage`,
};
}
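// Illustrative sketch, not part of the original spec: the header check
// from probeAppImagePath as a pure function over the first bytes of a
// file. `sketchIsAppImageHeader` is a hypothetical name; it takes a
// Uint8Array so a Node Buffer can be passed directly.
function sketchIsAppImageHeader(header: Uint8Array): boolean {
  if (header.length < 11) return false;
  // ELF magic `\x7fELF` at offset 0.
  const isElf =
    header[0] === 0x7f &&
    header[1] === 0x45 &&
    header[2] === 0x4c &&
    header[3] === 0x46;
  // `AI` at offsets 8-9, then the type byte (1 or 2) at offset 10.
  const hasAiMagic =
    header[8] === 0x41 &&
    header[9] === 0x49 &&
    (header[10] === 0x01 || header[10] === 0x02);
  return isElf && hasAiMagic;
}
// A plain ELF binary has zero padding at offsets 8-10 and is rejected.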
async function captureFuseDpkg(): Promise<string> {
// Best-effort context capture for the case-doc's listed
// "Diagnostics on failure". `dpkg -l` is Debian-only — we still
// run it and let it fail cleanly on RPM hosts (the empty/error
// output is itself diagnostic).
try {
const { stdout, stderr } = await exec(
'sh',
['-c', 'dpkg -l 2>&1 | grep -i fuse || true'],
{ timeout: 5_000 },
);
return `${stdout}${stderr}`.trim() || '(no fuse-related dpkg entries)';
} catch (err) {
const e = err as { stdout?: string; stderr?: string; code?: number };
return (
`dpkg query failed (exit ${e.code ?? '?'})\n` +
`${(e.stdout ?? '').trim()}\n` +
`${(e.stderr ?? '').trim()}`
).trim();
}
}
// Matches the dlopen failure pattern the AppImage runtime prints
// when libfuse2 is missing. The case-doc lists `libfuse.so.2` as the
// canonical token; we also flag the broader `dlopen` + `fuse`
// combination so a future runtime that changes the wording without
// fixing the underlying bug still trips the test.
function fuseFailureFound(stderr: string): { found: boolean; match?: string } {
const lower = stderr.toLowerCase();
if (lower.includes('libfuse.so.2')) {
return { found: true, match: 'libfuse.so.2' };
}
// Both 'dlopen' and 'fuse' on the same line of stderr — wider net
// for future-proofing.
for (const line of stderr.split('\n')) {
const ll = line.toLowerCase();
if (ll.includes('dlopen') && ll.includes('fuse')) {
return { found: true, match: line.trim() };
}
}
return { found: false };
}
test.setTimeout(30_000);
test('S01 — AppImage launches without manual libfuse2t64', async ({}, testInfo) => {
testInfo.annotations.push({ type: 'severity', description: 'Critical' });
testInfo.annotations.push({
type: 'surface',
description: 'Distribution / AppImage',
});
const probe = await probeAppImagePath();
await testInfo.attach('appimage-probe', {
body: JSON.stringify(probe, null, 2),
contentType: 'application/json',
});
if (!probe.path) {
test.skip(true, `S01 only applies to AppImage installs: ${probe.reason}`);
return;
}
const appimagePath = probe.path;
// Always-on context: dpkg fuse state. Cheap, useful for triage
// regardless of pass/fail.
const dpkgFuse = await captureFuseDpkg();
await testInfo.attach('dpkg-fuse', {
body: dpkgFuse,
contentType: 'text/plain',
});
// Per-test sandbox so a brief Electron child doesn't pollute the
// host's ~/.config/Claude. We don't use launchClaude()'s isolation
// because it spawns the bundled Electron directly (bypassing the
// AppImage runtime's FUSE mount, which is exactly what we're
// trying to exercise here).
const sandboxRoot = await mkdtemp(join(tmpdir(), 'claude-s01-'));
const sandboxConfig = join(sandboxRoot, 'config');
const sandboxHome = join(sandboxRoot, 'home');
let exitCode: number | null = null;
let signalCode: NodeJS.Signals | null = null;
let timedOutBeforeFuseSignal = false;
const stderrChunks: Buffer[] = [];
const stdoutChunks: Buffer[] = [];
const start = Date.now();
try {
const proc = spawn(appimagePath, [], {
cwd: sandboxRoot,
env: {
...process.env,
HOME: sandboxHome,
XDG_CONFIG_HOME: sandboxConfig,
XDG_DATA_HOME: join(sandboxRoot, 'data'),
XDG_CACHE_HOME: join(sandboxRoot, 'cache'),
// Surface FUSE mount errors loudly; the AppImage runtime
// honours this for its diagnostic output.
APPIMAGE_DEBUG: '1',
},
stdio: ['ignore', 'pipe', 'pipe'],
detached: false,
});
proc.stderr?.on('data', (chunk: Buffer) => stderrChunks.push(chunk));
proc.stdout?.on('data', (chunk: Buffer) => stdoutChunks.push(chunk));
// Race three outcomes:
// (a) process exits on its own (FUSE failure exits ~100-300ms)
// (b) we observed a FUSE error in stderr — kill early
// (c) timeout: app probably mounted fine and is starting up,
// in which case absence of FUSE error in stderr is a PASS
const fuseSignal = new Promise<'fuse-error'>((resolve) => {
const checkInterval = setInterval(() => {
const soFar = Buffer.concat(stderrChunks).toString('utf8');
if (fuseFailureFound(soFar).found) {
clearInterval(checkInterval);
resolve('fuse-error');
}
}, 100);
proc.once('exit', () => clearInterval(checkInterval));
});
const exitSignal = new Promise<'exit'>((resolve) => {
proc.once('exit', (code, signal) => {
exitCode = code;
signalCode = signal;
resolve('exit');
});
});
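let timeoutHandle: NodeJS.Timeout | undefined;
const timeoutSignal = new Promise<'timeout'>((resolve) => {
timeoutHandle = setTimeout(() => {
timedOutBeforeFuseSignal = true;
resolve('timeout');
}, 8_000);
});
const winner = await Promise.race([
fuseSignal,
exitSignal,
timeoutSignal,
]);
// Cancel the timeout once the race settles so a late firing can't
// flip timedOutBeforeFuseSignal after an early exit or FUSE error
// (the kill/backstop waits below can outlast the 8s mark).
clearTimeout(timeoutHandle);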
// Whatever happened, kill the process so we don't leave
// Electron running. SIGTERM first, SIGKILL backstop.
if (proc.exitCode === null && proc.signalCode === null) {
proc.kill('SIGTERM');
await Promise.race([
new Promise<void>((resolve) =>
proc.once('exit', (code, signal) => {
exitCode = code;
signalCode = signal;
resolve();
}),
),
new Promise<void>((resolve) => setTimeout(resolve, 3_000)),
]);
if (proc.exitCode === null && proc.signalCode === null) {
proc.kill('SIGKILL');
await new Promise<void>((resolve) => {
proc.once('exit', (code, signal) => {
exitCode = code;
signalCode = signal;
resolve();
});
setTimeout(() => resolve(), 2_000);
});
}
}
await testInfo.attach('race-winner', {
body: winner,
contentType: 'text/plain',
});
} finally {
await rm(sandboxRoot, { recursive: true, force: true }).catch(() => {});
}
const elapsedMs = Date.now() - start;
const stderrFull = Buffer.concat(stderrChunks).toString('utf8');
const stdoutFull = Buffer.concat(stdoutChunks).toString('utf8');
const stderrTail =
stderrFull.length > 4096 ? stderrFull.slice(-4096) : stderrFull;
const stdoutTail =
stdoutFull.length > 4096 ? stdoutFull.slice(-4096) : stdoutFull;
const fuseCheck = fuseFailureFound(stderrFull);
await testInfo.attach('appimage-path', {
body: appimagePath,
contentType: 'text/plain',
});
await testInfo.attach('exit-info', {
body: JSON.stringify(
{
exitCode,
signalCode,
timedOutBeforeFuseSignal,
elapsedMs,
fuseFailureMatch: fuseCheck.match ?? null,
},
null,
2,
),
contentType: 'application/json',
});
await testInfo.attach('stderr-tail-4k', {
body: stderrTail || '(empty)',
contentType: 'text/plain',
});
await testInfo.attach('stdout-tail-4k', {
body: stdoutTail || '(empty)',
contentType: 'text/plain',
});
expect(
fuseCheck.found,
`AppImage stderr should not report a libfuse.so.2 dlopen failure ` +
`(matched: ${fuseCheck.match ?? 'n/a'}). The case-doc S01 ` +
`scenario fails on Ubuntu 24.04 unless libfuse2t64 is manually ` +
`installed; see scripts/packaging/appimage.sh:226 for the ` +
`upstream-runtime-as-is build choice.`,
).toBe(false);
});
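
The S01 runner above relies on a `fuseFailureFound` helper that is defined outside this file. As a rough illustration only, a matcher with the same `{ found, match }` shape could look like the sketch below. The name `fuseFailureFoundSketch` and the pattern list are assumptions made for this example (the two strings are the messages the upstream AppImage type-2 runtime prints when `libfuse.so.2` cannot be loaded); the harness's real helper may match differently.

```typescript
// Hypothetical stand-in for the harness's fuseFailureFound helper.
// The pattern list is an assumption, not the harness's actual list.
interface FuseCheck {
  found: boolean;
  match?: string;
}

const FUSE_FAILURE_PATTERNS: RegExp[] = [
  /dlopen\(\): error loading libfuse\.so\.2/,
  /AppImages require FUSE to run/,
];

// Return the first FUSE-failure fragment found in the captured stderr.
function fuseFailureFoundSketch(stderr: string): FuseCheck {
  for (const re of FUSE_FAILURE_PATTERNS) {
    const m = re.exec(stderr);
    if (m) return { found: true, match: m[0] };
  }
  return { found: false };
}

// Illustrative stderr from a host without libfuse2.
const sampleStderr = [
  'dlopen(): error loading libfuse.so.2',
  '',
  'AppImages require FUSE to run.',
].join('\n');

console.log(fuseFailureFoundSketch(sampleStderr).found);
console.log(fuseFailureFoundSketch('').found);
```

Returning the matched fragment alongside the boolean is what lets the runner surface `fuseCheck.match` in both the `exit-info` attachment and the assertion message.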

@@ -0,0 +1,184 @@
import { test, expect } from '@playwright/test';
import { existsSync, readFileSync } from 'node:fs';
import { join, resolve } from 'node:path';
// S02 — XDG_CURRENT_DESKTOP detection uses substring match.
//
// Backs S02 in docs/testing/cases/distribution.md.
//
// Ubuntu sets XDG_CURRENT_DESKTOP=ubuntu:GNOME (colon-separated,
// distro-prefixed). A naive `== "GNOME"` (or POSIX `= "GNOME"`)
// equality check misses Ubuntu and silently disables every DE-gated
// branch on those rows. The expected pattern is a substring/glob
// match (case-insensitive) over the colon-separated value:
//
// launcher-common.sh:38-44 → desktop="${XDG_CURRENT_DESKTOP,,}"
// [[ "$desktop" == *niri* ]]
// quick-window.sh:34-35 → (process.env.XDG_CURRENT_DESKTOP||"")
// .toLowerCase().includes("kde")
// quick-window.sh:117-118 → same shape, injected into index.js
//
// This is a source-tree regression detector: if a future change
// rewrites either gate to a strict-equality form, the runner trips.
// It does NOT assert the presence of any specific good pattern (the
// case doc anchors describe several different shapes — niri glob,
// KDE includes(), runtime JS gate); it asserts the *absence* of the
// bad ones.
//
// Pure file probe — no app launch. Fast (<1s). Row-independent.
//
// Path resolution probes, in order:
// 1. $CLAUDE_DESKTOP_REPO_ROOT/scripts (override)
// 2. ../../scripts relative to cwd (dev worktree, where the harness
// runs from tools/test-harness/)
// 3. /usr/lib/claude-desktop/scripts (deb/rpm install layout)
// If none resolve, the test skips with a reason.
interface BadHit {
file: string;
line: number;
text: string;
}
function resolveScriptsDir(): string | null {
const env = process.env.CLAUDE_DESKTOP_REPO_ROOT;
if (env) {
const p = join(env, 'scripts');
if (
existsSync(join(p, 'launcher-common.sh')) &&
existsSync(join(p, 'patches', 'quick-window.sh'))
) {
return p;
}
}
// Dev worktree probe — tools/test-harness lives two dirs deep,
// so cwd/../../scripts is the repo's scripts/ when tests are run
// from tools/test-harness/.
const devProbe = resolve(process.cwd(), '..', '..', 'scripts');
if (
existsSync(join(devProbe, 'launcher-common.sh')) &&
existsSync(join(devProbe, 'patches', 'quick-window.sh'))
) {
return devProbe;
}
// Installed path (deb/rpm).
const installedProbe = '/usr/lib/claude-desktop/scripts';
if (
existsSync(join(installedProbe, 'launcher-common.sh')) &&
existsSync(join(installedProbe, 'patches', 'quick-window.sh'))
) {
return installedProbe;
}
return null;
}
// Bad patterns: shell + JS strict-equality forms against
// XDG_CURRENT_DESKTOP. Each regex is intentionally narrow so the
// expected substring/glob shapes don't false-positive:
//
// - Shell `[[ "$XDG_CURRENT_DESKTOP" == "GNOME" ]]` — bash strict
// equality with a *literal* RHS (no glob `*`). The `*niri*`
// glob form is fine and must NOT match.
// - Shell `[ "$XDG_CURRENT_DESKTOP" = "GNOME" ]` — POSIX strict
// equality.
// - JS `process.env.XDG_CURRENT_DESKTOP === "GNOME"` (and `==`).
//
// The bash and JS patterns cover the variable on either side of the
// operator, so `"GNOME" == "$XDG_CURRENT_DESKTOP"` is also caught.
// The POSIX `[ ... ]` pattern only covers the variable on the left.
//
// `lowered` form (`"${XDG_CURRENT_DESKTOP,,}" == *niri*`) uses a
// glob and is allowed; the bad-RHS regexes require the literal to
// have no `*` wildcards inside the quotes.
const BAD_PATTERNS: { name: string; re: RegExp }[] = [
{
// bash [[ ... == "literal" ]] with XDG_CURRENT_DESKTOP on
// either side. RHS literal contains no `*` (glob-free).
name: 'bash [[ == ]] strict equality (no glob)',
re: /\[\[[^\]]*\$\{?XDG_CURRENT_DESKTOP[^\]]*==\s*"[^"*]*"[^\]]*\]\]/,
},
{
name: 'bash [[ == ]] strict equality, var on right (no glob)',
re: /\[\[[^\]]*==\s*"\$\{?XDG_CURRENT_DESKTOP[^\]]*\]\]/,
},
{
// POSIX [ ... = "literal" ] with XDG_CURRENT_DESKTOP.
name: 'POSIX [ = ] strict equality',
re: /\[\s+[^\]]*\$\{?XDG_CURRENT_DESKTOP[^\]]*=\s*"[^"]*"[^\]]*\]/,
},
{
// JS strict equality (=== or ==) against a string literal.
// Either single or double quotes; either side of the operator.
name: 'JS === / == strict equality',
re: /process\.env\.XDG_CURRENT_DESKTOP\s*===?\s*['"][^'"]*['"]|['"][^'"]*['"]\s*===?\s*process\.env\.XDG_CURRENT_DESKTOP/,
},
];
function scanFile(absPath: string): BadHit[] {
const text = readFileSync(absPath, 'utf8');
const lines = text.split('\n');
const hits: BadHit[] = [];
for (let i = 0; i < lines.length; i++) {
const line = lines[i] ?? '';
// Cheap pre-filter: only check lines mentioning the env var.
if (!line.includes('XDG_CURRENT_DESKTOP')) continue;
for (const { re } of BAD_PATTERNS) {
if (re.test(line)) {
hits.push({
file: absPath,
line: i + 1,
text: line.length > 200 ? line.slice(0, 200) + '…' : line,
});
break;
}
}
}
return hits;
}
test('S02 — XDG_CURRENT_DESKTOP detection uses substring match, not strict ==', async ({}, testInfo) => {
testInfo.annotations.push({ type: 'severity', description: 'Should' });
testInfo.annotations.push({
type: 'surface',
description: 'Distribution / desktop detection',
});
const scriptsDir = resolveScriptsDir();
if (!scriptsDir) {
test.skip(
true,
'No accessible scripts/ dir (set CLAUDE_DESKTOP_REPO_ROOT or install deb/rpm)',
);
return;
}
await testInfo.attach('scripts-dir', {
body: scriptsDir,
contentType: 'text/plain',
});
const targets = [
join(scriptsDir, 'launcher-common.sh'),
join(scriptsDir, 'patches', 'quick-window.sh'),
];
await testInfo.attach('files-checked', {
body: JSON.stringify(targets, null, 2),
contentType: 'application/json',
});
const allHits: BadHit[] = [];
for (const t of targets) {
allHits.push(...scanFile(t));
}
await testInfo.attach('bad-pattern-hits', {
body: JSON.stringify(allHits, null, 2),
contentType: 'application/json',
});
expect(
allHits,
// eslint-disable-next-line max-len
'No strict-equality checks against XDG_CURRENT_DESKTOP: they would miss ubuntu:GNOME. Use a substring/glob match (case-insensitive) instead.',
).toEqual([]);
});
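
To make the S02 rationale concrete, the sketch below contrasts strict equality with the case-insensitive substring shape that the header comment attributes to quick-window.sh. The helper name `desktopIncludes` is hypothetical and exists only for this illustration; it is not part of the harness or the launcher scripts.

```typescript
// Hypothetical helper mirroring the
// (process.env.XDG_CURRENT_DESKTOP || "").toLowerCase().includes(...)
// shape quoted in the S02 header comment.
function desktopIncludes(
  xdgCurrentDesktop: string | undefined,
  needle: string,
): boolean {
  return (xdgCurrentDesktop ?? '').toLowerCase().includes(needle.toLowerCase());
}

// Ubuntu's colon-separated, distro-prefixed value.
const ubuntuValue: string = 'ubuntu:GNOME';

console.log(ubuntuValue === 'GNOME'); // false: strict equality misses Ubuntu
console.log(desktopIncludes(ubuntuValue, 'gnome')); // true
console.log(desktopIncludes(undefined, 'kde')); // false: unset variable is safe
```

The substring form tolerates both the distro prefix and case differences, which is why the runner only flags the strict-equality shapes.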

@@ -0,0 +1,158 @@
import { test, expect } from '@playwright/test';
import { execFile } from 'node:child_process';
import { promisify } from 'node:util';
import { captureSessionEnv } from '../lib/diagnostics.js';
const exec = promisify(execFile);
// S03 — DEB control file declares runtime dependencies.
//
// Per docs/testing/cases/distribution.md S03:
// Expected: All transitive runtime deps are declared in the package
// and pulled by APT. First launch succeeds without manual `apt
// install` of any extra package.
//
// Code anchor: scripts/packaging/deb.sh:185-197 — the DEBIAN/control
// file emits Package/Version/Section/Priority/Architecture/Maintainer/
// Description fields and **no `Depends:` line**, with the inline
// comment at :181-183 ("No external dependencies are required at
// runtime"). The case-doc treats this as a regression: Critical
// surface, expected contract is "deps declared", current state is
// "deps absent". So this runner is a regression detector — marked
// `test.fail()` while the case-doc gap is open. The expected
// failure reports green; the day `scripts/packaging/deb.sh:185-197`
// emits a `Depends:` line the assertion passes, which flips the
// `.fail()` to red and prompts a case-doc update + `.fail()` removal.
//
// Layer: pure spawn probe. `dpkg-query -W -f='${Depends}'
// claude-desktop` reads the field straight out of dpkg's status db,
// so we don't need to know where the .deb lives in apt's cache or
// how the package was originally fetched.
//
// Skip behaviour: if dpkg-query can't be spawned (dpkg not
// installed) or exits non-zero (claude-desktop not in dpkg's db),
// the package isn't deb-managed on this host and S03 has nothing
// to assert against.
//
// Subtlety on mixed-tooling hosts: a Fedora/RPM box that also has
// `dpkg` installed for cross-distro dev can wind up with a stale
// `claude-desktop` entry in dpkg's status db (matching the field
// shape from a previous deb install). dpkg-query exits 0 in that
// case and we still run the assertion — the field shape we read is
// authoritative for what a current deb install would look like, so
// it's a valid signal even if the binary on PATH is the rpm one.
test.fail('S03 — DEB control file declares runtime dependencies', async (
{},
testInfo,
) => {
testInfo.annotations.push({
type: 'severity',
description: 'Critical',
});
testInfo.annotations.push({
type: 'surface',
description: 'Distribution / DEB packaging',
});
await testInfo.attach('session-env', {
body: JSON.stringify(captureSessionEnv(), null, 2),
contentType: 'application/json',
});
// Read the Depends field from dpkg's status db. If dpkg-query
// itself isn't installed (ENOENT) or the package isn't in the db
// (exit 1), skip — S03 only applies to deb-managed installs.
let dependsField: string;
let pkgVersion = '';
try {
const { stdout } = await exec(
'dpkg-query',
['-W', '-f=${Depends}', 'claude-desktop'],
{ timeout: 5_000 },
);
dependsField = stdout.trim();
} catch (err) {
const e = err as { stderr?: string; code?: number | string };
await testInfo.attach('dpkg-query-error', {
body: JSON.stringify(
{
code: e.code ?? null,
stderr: (e.stderr ?? '').trim(),
},
null,
2,
),
contentType: 'application/json',
});
test.skip(
true,
'S03 only applies to deb-installed claude-desktop ' +
'(dpkg-query missing or package not in dpkg db)',
);
return;
}
// Capture the full Depends payload, version, and resolved binary
// path as evidence regardless of pass/fail. Per Decision 7 these
// are always-on attachments.
try {
const { stdout } = await exec(
'dpkg-query',
['-W', '-f=${Version}', 'claude-desktop'],
{ timeout: 5_000 },
);
pkgVersion = stdout.trim();
} catch {
// Version probe is best-effort — Depends-field result above
// already proves the package is in the db.
}
let installPath = '';
try {
const { stdout } = await exec('which', ['claude-desktop'], {
timeout: 5_000,
});
installPath = stdout.trim();
} catch {
// `which` fails when the launcher isn't on PATH (e.g. dpkg
// has a stale record but the binary's been removed). Capture
// the empty string and let the Depends assertion run.
}
await testInfo.attach('depends-field', {
body: dependsField,
contentType: 'text/plain',
});
await testInfo.attach('package-version', {
body: pkgVersion,
contentType: 'text/plain',
});
await testInfo.attach('install-path', {
body: installPath,
contentType: 'text/plain',
});
await testInfo.attach('evidence', {
body: JSON.stringify(
{
dependsField,
dependsLength: dependsField.length,
packageVersion: pkgVersion,
installPath,
},
null,
2,
),
contentType: 'application/json',
});
// Core S03 assertion. Upstream contract: a Critical-severity
// runtime install pulls all transitive deps via APT, which
// requires the control file to declare them. Empty Depends ==
// regression against scripts/packaging/deb.sh:185-197.
expect(
dependsField,
'DEBIAN/control Depends: field is non-empty per upstream ' +
'contract (case-doc S03 — currently fails until ' +
'scripts/packaging/deb.sh:185-197 emits a Depends line)',
).not.toBe('');
});
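
The expected-failure reasoning in the S03 header (a failing assertion reports green, a passing one flips to red) can be summarised as a small truth table. The sketch below is a simplified model of that reporting behaviour, written for illustration; `RunFacts` and `reportedOutcome` are hypothetical names and this is not Playwright's implementation.

```typescript
// Simplified model of how a run is reported, assuming the semantics
// described in the S03 header comment. Names are hypothetical.
type Reported = 'passed' | 'failed' | 'skipped';

interface RunFacts {
  skipped: boolean; // test.skip() fired before the assertion ran
  assertionPassed: boolean; // did the expect(...) hold?
  markedFail: boolean; // test was declared via test.fail(...)
}

function reportedOutcome(run: RunFacts): Reported {
  if (run.skipped) return 'skipped';
  if (run.markedFail) {
    // Expected failure: the outcome is inverted.
    return run.assertionPassed ? 'failed' : 'passed';
  }
  return run.assertionPassed ? 'passed' : 'failed';
}

// Today: Depends is empty, so the assertion fails and the run is green.
console.log(
  reportedOutcome({ skipped: false, assertionPassed: false, markedFail: true }),
);
// After deb.sh declares deps: the assertion passes and the run goes red,
// prompting removal of the .fail() marker.
console.log(
  reportedOutcome({ skipped: false, assertionPassed: true, markedFail: true }),
);
```

On hosts where the package is not deb-managed, the skip path wins regardless of the marker, which matches the runner's skip-before-assert ordering.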

@@ -0,0 +1,232 @@
import { test, expect } from '@playwright/test';
import { execFile } from 'node:child_process';
import { promisify } from 'node:util';
const exec = promisify(execFile);
// S04 — RPM install via DNF pulls all required runtime deps.
//
// Mirror of S03 for the RPM/DNF branch. Case-doc:
// docs/testing/cases/distribution.md#s04--rpm-install-via-dnf-pulls-all-required-runtime-deps
//
// Severity: Critical. Surface: DNF repository / dependency
// declarations. Applies to KDE-W, KDE-X, GNOME, Sway, i3, Niri (any
// RPM-based distro).
//
// Case-doc anchors `scripts/packaging/rpm.sh:188` (`AutoReqProv: no`
// disables RPM's auto-dep generation; the spec declares no
// `Requires:`) and `:194-198` (strip + build-id disabled because
// Electron binaries don't tolerate them — bundled approach).
//
// **Regression-detector shape.** The assertion direction is "Requires
// has at least one declared runtime dep" — i.e. at least one line in
// `rpm -qR claude-desktop` that isn't an `rpmlib(...)` capability and
// isn't a `%post`/`%postun` interpreter path (`/bin/sh` etc). Today
// that filter empties out, so the spec is marked `test.fail()` while
// the case-doc gap is open: the expected failure reports green. When
// upstream `rpm.sh` flips `AutoReqProv: on` (or declares an explicit
// `Requires:` block) the assertion passes, which flips the `.fail()`
// to red and prompts a case-doc update + `.fail()` removal.
//
// `rpm -qR` always emits `rpmlib(CompressedFileNames)`,
// `rpmlib(FileDigests)`, `rpmlib(PayloadFilesHavePrefix)`, and
// `rpmlib(PayloadIsZstd)` regardless of spec content — those are
// satisfied by the rpm runtime itself, not by declared deps. Bare
// interpreter paths like `/bin/sh` come from scriptlet detection on
// the spec's `%post` / `%postun`, not from declared library deps.
// Both get filtered out so the assertion is strictly "did anyone
// declare a runtime dep, by hand or via AutoReqProv".
//
// Skip cleanly when:
// - `rpm` isn't on PATH (Debian/Ubuntu host, AppImage-only host).
// - `rpm -q claude-desktop` says the package isn't rpm-installed
// (deb host with rpm tooling for cross-distro dev, AppImage extract).
//
// Layer: spawn probe + stdout parse. No app launch. Row-independent
// in shape, but only meaningful on RPM-based rows.
interface ProbeResult {
cmd: string;
exitCode: number | null;
stdout: string;
stderr: string;
}
async function probe(
bin: string,
args: string[],
): Promise<ProbeResult> {
const cmd = `${bin} ${args.join(' ')}`;
try {
const { stdout, stderr } = await exec(bin, args, {
timeout: 5_000,
});
return {
cmd,
exitCode: 0,
stdout: stdout.trim(),
stderr: stderr.trim(),
};
} catch (err) {
const e = err as {
stdout?: string;
stderr?: string;
code?: number | string;
};
// String codes (e.g. 'ENOENT' when the binary is missing) carry
// no numeric exit status, so they map to null.
const code = typeof e.code === 'number' ? e.code : null;
return {
cmd,
exitCode: code,
stdout: (e.stdout ?? '').trim(),
stderr: (e.stderr ?? '').trim(),
};
}
}
function formatProbe(p: ProbeResult): string {
const tail = [
p.stdout && `stdout: ${p.stdout}`,
p.stderr && `stderr: ${p.stderr}`,
]
.filter(Boolean)
.join('\n');
return `$ ${p.cmd} (exit ${p.exitCode ?? '?'})\n${tail}`.trim();
}
// `rpm -qR` lines we don't count as "declared runtime deps":
// - `rpmlib(...)` capabilities — auto-emitted by rpm regardless of
// the spec, satisfied by the rpm runtime itself.
// - Bare interpreter paths (`/bin/sh`, `/bin/bash`, `/usr/bin/env`)
// — picked up from the spec's scriptlets (`%post` / `%postun`),
// not from declared library deps.
function isAutoEmittedRequire(line: string): boolean {
const trimmed = line.trim();
if (!trimmed) return true;
if (trimmed.startsWith('rpmlib(')) return true;
// Strip a trailing version constraint ("/bin/sh >= 1.0") before
// matching so the shape is just the capability/path.
const head = trimmed.split(/\s+/)[0] ?? '';
if (
head === '/bin/sh' ||
head === '/bin/bash' ||
head === '/usr/bin/env' ||
head === '/usr/bin/sh' ||
head === '/usr/bin/bash'
) {
return true;
}
return false;
}
test.fail('S04 — RPM package declares runtime requirements', async (
{},
testInfo,
) => {
testInfo.annotations.push({
type: 'severity',
description: 'Critical',
});
testInfo.annotations.push({
type: 'surface',
description: 'DNF repository / dependency declarations',
});
// Skip cleanly on hosts without rpm tooling.
const rpmWhich = await probe('which', ['rpm']);
await testInfo.attach('which-rpm', {
body: formatProbe(rpmWhich),
contentType: 'text/plain',
});
if (rpmWhich.exitCode !== 0 || !rpmWhich.stdout) {
test.skip(
true,
'S04 only applies to rpm-installed claude-desktop ' +
'(rpm not on PATH)',
);
return;
}
// Resolve installed package version. `rpm -q` returns non-zero if
// the package isn't installed via rpm (Debian/AppImage host with
// rpm tooling, etc) — that's the second skip path.
const rpmQ = await probe('rpm', ['-q', 'claude-desktop']);
await testInfo.attach('rpm-q', {
body: formatProbe(rpmQ),
contentType: 'text/plain',
});
if (rpmQ.exitCode !== 0) {
test.skip(
true,
'S04 only applies to rpm-installed claude-desktop ' +
'(rpm -q claude-desktop returned non-zero)',
);
return;
}
// Capture install path for the diagnostics bundle. Failure here
// isn't a skip — `which` not finding `claude-desktop` on a host
// where `rpm -q claude-desktop` succeeds is unusual but harmless
// for the assertion shape.
const whichClaude = await probe('which', ['claude-desktop']);
await testInfo.attach('which-claude-desktop', {
body: formatProbe(whichClaude),
contentType: 'text/plain',
});
const rpmRequires = await probe('rpm', ['-qR', 'claude-desktop']);
await testInfo.attach('rpm-qR', {
body: formatProbe(rpmRequires),
contentType: 'text/plain',
});
expect(
rpmRequires.exitCode,
`rpm -qR claude-desktop must succeed on an rpm-installed host`,
).toBe(0);
const allLines = rpmRequires.stdout
.split('\n')
.map((l) => l.trim())
.filter((l) => l.length > 0);
const declaredRequires = allLines.filter(
(l) => !isAutoEmittedRequire(l),
);
await testInfo.attach('requires-classified', {
body: JSON.stringify(
{
all: allLines,
declared: declaredRequires,
declaredCount: declaredRequires.length,
},
null,
2,
),
contentType: 'application/json',
});
// Core S04 assertion. Per case-doc "Expected": "All transitive
// runtime deps are declared in the RPM and pulled by DNF." A
// non-empty `declaredRequires` is the minimum signal — it doesn't
// prove the *full* set is declared, but it proves the spec moved
// off `AutoReqProv: no` with no manual `Requires:` (the current
// state per scripts/packaging/rpm.sh:188).
//
// Marked `test.fail()` at the test definition: today this fails
// by design (regression-detector state), and the expected failure
// reports green. When scripts/packaging/rpm.sh starts declaring
// runtime deps (manual Requires lines, AutoReqProv flip, or both)
// the assertion passes, which flips `.fail()` to red — the signal
// to update the case-doc and remove the annotation.
expect(
declaredRequires.length,
`rpm -qR claude-desktop should report at least one declared ` +
`runtime requirement (non-rpmlib(...), non-interpreter). ` +
`Currently empty per scripts/packaging/rpm.sh:188 ` +
`(\`AutoReqProv: no\`, no \`Requires:\`).`,
).toBeGreaterThan(0);
});
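
As a worked example of the S04 filter, the sketch below runs a compact re-implementation of the auto-emitted-line check over two sample `rpm -qR` outputs. The sample lines are illustrative, not captured from a real install, and the `libgtk-3.so.0()(64bit)` entry is a purely hypothetical future dependency; `isAutoEmitted` and `declaredRequiresOf` are names chosen for this example.

```typescript
// Interpreter paths that rpm records for scriptlets, per the S04 comments.
const INTERPRETER_PATHS = new Set<string>([
  '/bin/sh',
  '/bin/bash',
  '/usr/bin/env',
  '/usr/bin/sh',
  '/usr/bin/bash',
]);

// True for lines that do not count as a declared runtime dependency.
function isAutoEmitted(line: string): boolean {
  const trimmed = line.trim();
  if (!trimmed) return true;
  if (trimmed.startsWith('rpmlib(')) return true;
  // Drop any trailing version constraint before comparing.
  const head = trimmed.split(/\s+/)[0] ?? '';
  return INTERPRETER_PATHS.has(head);
}

function declaredRequiresOf(rpmQrStdout: string): string[] {
  return rpmQrStdout
    .split('\n')
    .map((l) => l.trim())
    .filter((l) => l.length > 0 && !isAutoEmitted(l));
}

// Illustrative output for a spec with AutoReqProv disabled.
const todayOutput = [
  '/bin/sh',
  'rpmlib(CompressedFileNames) <= 3.0.4-1',
  'rpmlib(FileDigests) <= 4.6.0-1',
  'rpmlib(PayloadFilesHavePrefix) <= 4.0-1',
  'rpmlib(PayloadIsZstd) <= 5.4.18-1',
].join('\n');

// Hypothetical output once a runtime dependency is declared.
const futureOutput = `${todayOutput}\nlibgtk-3.so.0()(64bit)`;

console.log(declaredRequiresOf(todayOutput)); // []
console.log(declaredRequiresOf(futureOutput)); // [ 'libgtk-3.so.0()(64bit)' ]
```

With the first sample the filtered list is empty, so the `toBeGreaterThan(0)` assertion fails as expected under `test.fail()`; a single surviving line is enough to flip it.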

@@ -0,0 +1,201 @@
import { test, expect } from '@playwright/test';
import { execFile } from 'node:child_process';
import { promisify } from 'node:util';
import {
runDoctor,
captureSessionEnv,
} from '../lib/diagnostics.js';
const exec = promisify(execFile);
// S05 — Doctor recognises rpm-installed claude-desktop, doesn't
// false-flag as AppImage.
//
// Per docs/testing/cases/distribution.md S05 (sibling of T13 in
// launch.md — same surface, intentional matrix overlap):
//
// * Steps: on a Fedora/Nobara/RPM-based distro with claude-desktop
// installed via dnf, run `claude-desktop --doctor` and look for the
// install-method line.
// * Expected: doctor detects rpm install (e.g. via `rpm -qf` against
// the binary path) and reports it cleanly. No `not found via dpkg
// (AppImage?)` warning.
// * Currently: scripts/doctor.sh's install-method probe is gated on
// `command -v dpkg-query` and has no `rpm -qf` branch. Case-doc
// anchors the block as :290-299; the actual lines in the file as of
// runner-write time are :353-362 (drift noted, see report). On
// RPM-only hosts (no dpkg-query) the entire block is skipped — no
// install-method line is printed at all. On hosts with both
// dpkg-query installed AND an rpm-installed claude-desktop, the
// `_warn 'claude-desktop not found via dpkg (AppImage?)'` branch
// fires only if dpkg-query comes up empty. (Anecdotally on some
// Fedora hosts dpkg-query returns a stale Version string against
// `claude-desktop` — in that case the PASS path runs and the
// warning is suppressed for the wrong reason, but S05 still
// passes by the letter of the assertion.)
//
// Scope split vs T13:
//
// * T13 (launch.md) covers all rows: detect rpm OR deb, assert no
// false-flag for whichever owns the binary. Skips on AppImage /
// hand-built / undetectable installs.
// * S05 (this file) is RPM-only: skips when `rpm -qf` doesn't claim
// the binary, regardless of whether dpkg owns it. The matrix wants
// both cells filled; the overlap is intentional: S05 reports its
// own result on Fedora rows even when T13's broader gating skips
// (e.g. if `rpm` is missing from PATH, T13 falls through to the
// `unknown` branch and skips, while S05 skips for the same reason
// but reports it separately).
//
// Layer: spawn probe + stdout grep. Doesn't touch the running app
// instance; doctor is `--doctor`-gated and exits without launching
// Electron.
//
// Diagnostics on failure (per case-doc): full --doctor output,
// `rpm -qf $(which claude-desktop)`, the doctor source line that
// decides the format. Captured unconditionally as attachments so
// post-hoc triage from a JUnit-only run is possible.
const FALSE_FLAG_FRAGMENT = 'not found via dpkg (AppImage?)';
interface ProbeResult {
cmd: string;
exitCode: number | null;
stdout: string;
stderr: string;
}
async function probe(
bin: string,
args: string[],
): Promise<ProbeResult> {
const cmd = `${bin} ${args.join(' ')}`;
try {
const { stdout, stderr } = await exec(bin, args, {
timeout: 5_000,
});
return {
cmd,
exitCode: 0,
stdout: stdout.trim(),
stderr: stderr.trim(),
};
} catch (err) {
const e = err as {
stdout?: string;
stderr?: string;
// Numeric exit status, or a string such as 'ENOENT' when the
// binary can't be spawned.
code?: number | string;
};
return {
cmd,
exitCode: typeof e.code === 'number' ? e.code : null,
stdout: (e.stdout ?? '').trim(),
stderr: (e.stderr ?? '').trim(),
};
}
}
function formatProbe(p: ProbeResult): string {
const tail = [
p.stdout && `stdout: ${p.stdout}`,
p.stderr && `stderr: ${p.stderr}`,
]
.filter(Boolean)
.join('\n');
return `$ ${p.cmd} (exit ${p.exitCode ?? '?'})\n${tail}`.trim();
}
test('S05 — Doctor recognises rpm install, no dpkg false-flag', async (
{},
testInfo,
) => {
testInfo.annotations.push({
type: 'severity',
description: 'Should',
});
testInfo.annotations.push({
type: 'surface',
description: 'CLI / --doctor',
});
// Applies to RPM-based rows per case-doc (KDE-W, KDE-X, GNOME,
// Sway, i3, Niri). Rather than gating on the ROW env var, gate on
// the actual install method — the assertion has no signal on
// non-rpm hosts regardless of how the matrix labels them.
await testInfo.attach('session-env', {
body: JSON.stringify(captureSessionEnv(), null, 2),
contentType: 'application/json',
});
const launcher =
process.env.CLAUDE_DESKTOP_LAUNCHER ?? 'claude-desktop';
const whichProbe = await probe('which', [launcher]);
await testInfo.attach('which-claude-desktop', {
body: formatProbe(whichProbe),
contentType: 'text/plain',
});
const installPath =
whichProbe.stdout.split('\n')[0]?.trim() ?? '';
if (whichProbe.exitCode !== 0 || !installPath) {
test.skip(
true,
`claude-desktop not reachable on PATH ` +
`(launcher='${launcher}'); rpm-install probe needs ` +
`a resolvable binary`,
);
return;
}
// Detect rpm install. `rpm -qf` returns 0 + the owning package's
// NEVRA when the file is rpm-managed, non-zero otherwise. We also
// run `rpm -q claude-desktop` to surface the package metadata
// independent of which file `which` resolved (helpful when the
// launcher is a wrapper script that shadows the real binary).
const rpmFile = await probe('rpm', ['-qf', installPath]);
const rpmPkg = await probe('rpm', ['-q', 'claude-desktop']);
await testInfo.attach('rpm-qf', {
body: formatProbe(rpmFile),
contentType: 'text/plain',
});
await testInfo.attach('rpm-q-claude-desktop', {
body: formatProbe(rpmPkg),
contentType: 'text/plain',
});
if (rpmFile.exitCode !== 0) {
// Not rpm-installed. S05's assertion only has signal on RPM
// rows; on deb / AppImage / hand-built / undetectable installs
// this is a clean skip (T13 covers the deb-side mirror).
test.skip(
true,
`S05 only applies to rpm-installed claude-desktop; ` +
`rpm -qf ${installPath} returned ` +
`exit ${rpmFile.exitCode ?? '?'} ` +
`(stderr: ${rpmFile.stderr || '<empty>'})`,
);
return;
}
const result = await runDoctor(launcher);
await testInfo.attach('doctor-output', {
body: result.output,
contentType: 'text/plain',
});
await testInfo.attach('doctor-exit-code', {
body: String(result.exitCode),
contentType: 'text/plain',
});
// Core S05 assertion: doctor must NOT print the dpkg false-flag
// warning for an rpm-installed copy. T02 already asserts the
// exit-code contract (`doctor exits 0`) — don't duplicate that
// here; S05 is purely about the install-method line.
expect(
result.output,
`doctor must not false-flag rpm install ` +
`(${rpmFile.stdout || 'rpm-owned'} at ${installPath}) ` +
`as missing-dpkg AppImage`,
).not.toContain(FALSE_FLAG_FRAGMENT);
});
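
The three host situations described in the S05 header comment can be modelled as a small decision function. This is a sketch of the prose only, assuming an rpm-installed claude-desktop; it is not a port of scripts/doctor.sh, and `HostFacts`, `currentDoctorInstallLine`, and `s05WouldPass` are hypothetical names.

```typescript
// What the doctor's install-method block prints today, per the S05
// header comment. Hypothetical model, not the shell script itself.
type InstallLine = 'none' | 'dpkg-ok' | 'dpkg-warning';

interface HostFacts {
  hasDpkgQuery: boolean; // `command -v dpkg-query` succeeds
  dpkgKnowsPackage: boolean; // dpkg-query returns a Version string
}

function currentDoctorInstallLine(host: HostFacts): InstallLine {
  // RPM-only host: the whole block is skipped, nothing is printed.
  if (!host.hasDpkgQuery) return 'none';
  // dpkg-query present: a stale entry takes the PASS path, an empty
  // result prints the "not found via dpkg (AppImage?)" warning.
  return host.dpkgKnowsPackage ? 'dpkg-ok' : 'dpkg-warning';
}

// S05 only asserts that the warning fragment is absent.
function s05WouldPass(host: HostFacts): boolean {
  return currentDoctorInstallLine(host) !== 'dpkg-warning';
}

console.log(s05WouldPass({ hasDpkgQuery: false, dpkgKnowsPackage: false })); // true
console.log(s05WouldPass({ hasDpkgQuery: true, dpkgKnowsPackage: false })); // false
console.log(s05WouldPass({ hasDpkgQuery: true, dpkgKnowsPackage: true })); // true
```

The model shows why S05 can pass "for the wrong reason" on hosts with a stale dpkg entry: the assertion checks for the absence of the warning, not for a positive rpm detection line.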

@@ -0,0 +1,167 @@
import { test, expect } from '@playwright/test';
import { launchClaude } from '../lib/electron.js';
import { skipUnlessRow } from '../lib/row.js';
import { readPidArgv, argvHasFlag } from '../lib/argv.js';
import { readLauncherLog, captureSessionEnv } from '../lib/diagnostics.js';
import { retryUntil } from '../lib/retry.js';
// S07 — `CLAUDE_USE_WAYLAND=1` opt-in path works.
//
// Backs S07 in docs/testing/cases/shortcuts-and-input.md.
//
// Case-doc anchors:
// scripts/launcher-common.sh:28-29 — `CLAUDE_USE_WAYLAND=1` opt-in
// (sets `use_x11_on_wayland=false`, taking the native-Wayland
// branch in build_electron_args).
// scripts/launcher-common.sh:100-111 — native-Wayland Electron flags:
// `--enable-features=UseOzonePlatform,WaylandWindowDecorations`,
// `--ozone-platform=wayland`, `--enable-wayland-ime`,
// `--wayland-text-input-version=3`, plus `GDK_BACKEND=wayland`.
//
// What this asserts: when the harness's Wayland mode is engaged
// (`CLAUDE_HARNESS_USE_WAYLAND=1`), the spawned Electron's argv
// contains `--ozone-platform=wayland` and
// `--enable-features=UseOzonePlatform`. That mirrors the launcher's
// CLAUDE_USE_WAYLAND=1 branch, which emits the same flag set (see
// LAUNCHER_INJECTED_FLAGS_WAYLAND in src/lib/electron.ts:134-141).
// The `CLAUDE_USE_WAYLAND=1` export into the spawn env is part of
// the same harness contract but is not checked by this runner.
//
// Gating choice — harness-mode vs launcher-script:
//
// The harness deliberately bypasses the launcher script (CDP-gate
// reasons — see lib/electron.ts:102-117), so it constructs its own
// flag set. Setting `extraEnv: { CLAUDE_USE_WAYLAND: '1' }` would
// only affect the child env, not the harness's flag selector. To
// exercise the Wayland branch end-to-end the harness exposes
// `CLAUDE_HARNESS_USE_WAYLAND=1`, which:
// 1. swaps to LAUNCHER_INJECTED_FLAGS_WAYLAND (the same flag
// set the launcher's Wayland branch emits), and
// 2. exports `CLAUDE_USE_WAYLAND=1` + `GDK_BACKEND=wayland` into
// the child env.
//
// This test asserts that contract. When CLAUDE_HARNESS_USE_WAYLAND
// is unset we skip — the harness's X11 default doesn't model the
// CLAUDE_USE_WAYLAND opt-in path. Run the suite with
// `CLAUDE_HARNESS_USE_WAYLAND=1 npx playwright test ...` to
// activate the assertion.
//
// Row gate: native-Wayland-capable rows only. KDE-W is intentionally
// included even though the case-doc Applies-to lists wlroots rows
// (Sway/Niri/Hypr) — KDE Plasma Wayland can also run native Wayland
// when CLAUDE_USE_WAYLAND=1 is set, and KDE-W is the harness's CI
// row, so we want this to be exercisable there.
test.setTimeout(45_000);
test('S07 — CLAUDE_USE_WAYLAND opt-in surfaces in Electron argv', async ({}, testInfo) => {
testInfo.annotations.push({ type: 'severity', description: 'Should' });
testInfo.annotations.push({
type: 'surface',
description: 'Display backend / Wayland opt-in',
});
skipUnlessRow(testInfo, [
'Sway',
'Niri',
'Hypr-O',
'Hypr-N',
'GNOME-W',
'KDE-W',
]);
if (process.env.CLAUDE_HARNESS_USE_WAYLAND !== '1') {
test.skip(
true,
'S07 requires CLAUDE_HARNESS_USE_WAYLAND=1 (the harness ' +
'Wayland-mode that mirrors the launcher CLAUDE_USE_WAYLAND ' +
'branch). Re-run with the env set.',
);
return;
}
await testInfo.attach('session-env', {
body: JSON.stringify(captureSessionEnv(), null, 2),
contentType: 'application/json',
});
await testInfo.attach('harness-env', {
body: JSON.stringify(
{
CLAUDE_HARNESS_USE_WAYLAND:
process.env.CLAUDE_HARNESS_USE_WAYLAND ?? null,
CLAUDE_USE_WAYLAND: process.env.CLAUDE_USE_WAYLAND ?? null,
},
null,
2,
),
contentType: 'application/json',
});
const useHostConfig = process.env.CLAUDE_TEST_USE_HOST_CONFIG === '1';
const app = await launchClaude({
isolation: useHostConfig ? null : undefined,
});
try {
// Don't waitForX11Window — under native Wayland the app is
// going through Ozone-Wayland directly, no XWayland window
// appears. /proc/$pid/cmdline is populated by exec(), so we
// just need the spawned Electron to stay alive long enough
// to read it. Poll for non-null + non-empty argv.
const argv = await retryUntil(
async () => {
const a = await readPidArgv(app.pid);
return a && a.length > 0 ? a : null;
},
{ timeout: 15_000, interval: 250 },
);
await testInfo.attach('electron-argv', {
body: JSON.stringify(argv, null, 2),
contentType: 'application/json',
});
expect(argv, 'could read /proc/$pid/cmdline').not.toBeNull();
// Launcher log is only populated when the launcher script
// runs; the harness spawns Electron directly. Capture the
// log if it happens to exist (host-leftover from an earlier
// real-launcher run) for diagnostic context only.
const log = await readLauncherLog();
if (log) {
const tail = log.split('\n').slice(-50).join('\n');
await testInfo.attach('launcher-log-tail', {
body: tail,
contentType: 'text/plain',
});
}
const ozoneWayland = argvHasFlag(argv ?? [], '--ozone-platform=wayland');
const useOzone = argvHasFlag(
argv ?? [],
'--enable-features=UseOzonePlatform',
);
await testInfo.attach('flag-presence', {
body: JSON.stringify(
{
'--ozone-platform=wayland': ozoneWayland,
'--enable-features=UseOzonePlatform': useOzone,
note:
'When CLAUDE_HARNESS_USE_WAYLAND=1 the harness ' +
'must emit the same Electron flag set as the ' +
'launcher script\'s CLAUDE_USE_WAYLAND=1 branch.',
},
null,
2,
),
contentType: 'application/json',
});
expect(
ozoneWayland,
'spawned Electron has --ozone-platform=wayland on argv',
).toBe(true);
expect(
useOzone,
'spawned Electron has --enable-features=UseOzonePlatform ' +
'(co-emitted with the wayland ozone flag)',
).toBe(true);
} finally {
await app.close();
}
});
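
The argv assertions in S07 rely on `readPidArgv` and `argvHasFlag` from `../lib/argv.js`, whose bodies are not part of this hunk. As a rough illustration only, the sketch below shows one way such helpers could work on a Linux `/proc` filesystem. The names `parseCmdline`, `readPidArgvSketch` and `argvHasFlagSketch` are hypothetical, and the comma-splitting of `--name=a,b` values is an assumption rather than confirmed behaviour of the real `argvHasFlag`.

```typescript
import { readFile } from 'node:fs/promises';

// Hypothetical stand-ins for the helpers in lib/argv.ts; the real ones may differ.

// /proc/<pid>/cmdline stores arguments separated by NUL bytes, normally
// with one trailing NUL after the last argument.
function parseCmdline(raw: string): string[] {
  if (raw.length === 0) return [];
  const trimmed = raw.endsWith('\0') ? raw.slice(0, -1) : raw;
  return trimmed.split('\0');
}

// Returns null when the process is gone or its argv is not populated yet,
// which is the shape a polling caller can retry on.
async function readPidArgvSketch(pid: number): Promise<string[] | null> {
  try {
    const argv = parseCmdline(await readFile(`/proc/${pid}/cmdline`, 'utf8'));
    return argv.length > 0 ? argv : null;
  } catch {
    return null;
  }
}

// Assumption: a flag written as `--name=value` also counts as present when
// `value` is one entry of a comma-separated list on the same switch.
function argvHasFlagSketch(argv: readonly string[], flag: string): boolean {
  const eq = flag.indexOf('=');
  if (eq === -1) return argv.includes(flag);
  const prefix = flag.slice(0, eq + 1);
  const wanted = flag.slice(eq + 1);
  return argv.some(
    (arg) =>
      arg.startsWith(prefix) &&
      arg.slice(prefix.length).split(',').includes(wanted),
  );
}
```

Splitting on commas matters for switches such as `--enable-features=`, where several feature names can share a single argument; an exact string comparison would miss those.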

View File

@@ -0,0 +1,129 @@
import { test, expect } from '@playwright/test';
import { readAsarFile, resolveAsarPath } from '../lib/asar.js';
import { skipUnlessRow } from '../lib/row.js';
// S08 — Tray rebuild-race fast-path injected (file probe).
//
// Backs the static side of S08 in
// docs/testing/cases/tray-and-window-chrome.md. T03 already covers the
// runtime SNI-count assertion (post-`nativeTheme.themeSource` toggle:
// exactly one StatusNotifierItem stays registered). This spec is the
// complementary build-time fingerprint — verifies that
// `patch_tray_inplace_update` in scripts/patches/tray.sh actually
// landed in the bundled `index.js`, so a silent regex miss in the
// patch script (idempotency guard short-circuits, anchor regex drifts
// against minifier churn, etc.) is observable without having to wait
// for a runtime tray-duplication failure on KDE.
//
// Fingerprint: literal `.setImage(` substring in
// `.vite/build/index.js`.
//
// Why this is load-bearing and stable:
//
// - Pristine upstream (`build-reference/app-extracted/.vite/build/
// index.js`) contains zero `.setImage(` occurrences. The tray
// constructs exclusively via `new <EL>.Tray(<EL>.nativeImage
// .createFromPath(...))` and never re-images in place. (Verified
// by `grep -cE '\.setImage\s*\(' index.js` → 0.)
// - The injected fast-path emitted by `patch_tray_inplace_update`
// (scripts/patches/tray.sh:212-217) calls
// `<TRAY_VAR>.setImage(<EL_VAR>.nativeImage.createFromPath(
// <PATH_VAR>))` — that is the entire point of the fast-path
// (skip destroy + recreate, update the existing Tray's image in
// place so the SNI registration stays put on KDE Plasma).
// - The Electron API name `setImage` is not a minified local —
// it's a method on `Tray.prototype` and stays literal across
// upstream version bumps regardless of the bundler's variable
// renaming. So the fingerprint is robust to the same minifier
// churn that forces tray.sh to extract `tray_var` / `electron_var`
// / `path_var` dynamically.
// - Idempotency marker in tray.sh:174-180 keys on the same literal
// post-rename `setImage(<EL>.nativeImage.createFromPath(<PATH>))`
// sequence; presence of `.setImage(` therefore tracks 1:1 with
// the patch's own self-detection.
//
// Why not the other candidates considered:
//
// - `_trayStartTime`: already covered by H03 for the prior tray.sh
// sub-patch (`patch_tray_menu_handler`). H03's note explicitly
// calls out that the in-place update sub-patch needs its own
// fingerprint, which is what S08 supplies here.
// - `process.platform!=="darwin"`: appears 50+ times in the
// minified bundle (every Electron-on-Linux / -on-Windows
// branch). Not distinctive.
// - `setContextMenu` count >= 2: works (upstream has exactly one
// occurrence; patched bundle has two — fast-path + slow-path),
// but is brittle to any future upstream code that calls
// `setContextMenu` for an unrelated reason. `.setImage(`
// presence-only is stricter and simpler.
//
// Pure file probe — no app launch. Fast (<1s). Row-gated to KDE
// (case-doc Applies-to: KDE-W, KDE-X) since the underlying SNI
// rebuild race only manifests on KDE Plasma's `systemtray` widget;
// other DEs handle UnregisterItem/Register sequencing without the
// duplicate-icon visual artifact, so the fast-path is a should-have
// there but the assertion isn't load-bearing for the row.
test('S08 — Tray rebuild-race fast-path injected (file probe)', async ({}, testInfo) => {
skipUnlessRow(testInfo, ['KDE-W', 'KDE-X']);
testInfo.annotations.push({ type: 'severity', description: 'Should' });
testInfo.annotations.push({
type: 'surface',
description: 'Tray icon / KDE rebuild race',
});
const asarPath = resolveAsarPath();
await testInfo.attach('asar-path', {
body: asarPath,
contentType: 'text/plain',
});
const indexJs = readAsarFile('.vite/build/index.js', asarPath);
// `.setImage(` is the patch-injected literal. Match-count is
// surfaced for diagnostics: 0 = patch missed, 1+ = patch landed.
// (We don't pin to exactly 1 — if upstream ever ships a
// legitimate second `.setImage(` site, the patch's fast-path is
// still present and S08 should still pass.)
const setImageCount = (indexJs.match(/\.setImage\s*\(/g) ?? []).length;
const fastPathPresent = setImageCount > 0;
// Bonus diagnostic signal: the slow-path destroy+recreate block
// is preserved by the patch (it stays in place for initial-
// creation and tray-disable cases; see tray.sh:182-188 and
// docs/learnings/tray-rebuild-race.md "The fix"). So a healthy
// patched bundle has >= 2 `setContextMenu` calls (fast path +
// slow path) and >= 1 `.setImage(` call (fast path). Pristine
// upstream has exactly 1 `setContextMenu` and 0 `.setImage(`.
const setContextMenuCount = (
indexJs.match(/\.setContextMenu\s*\(/g) ?? []
).length;
await testInfo.attach('fingerprint-evidence', {
body: JSON.stringify(
{
file: '.vite/build/index.js',
fingerprint: '.setImage(',
setImageCount,
setContextMenuCount,
fastPathPresent,
source:
'scripts/patches/tray.sh:212-217 (patch_tray_inplace_update) ' +
'injects `<TRAY>.setImage(<EL>.nativeImage.' +
'createFromPath(<PATH>))` before the destroy+recreate ' +
'block. Upstream never calls .setImage on the tray, ' +
'so non-zero count == patch landed.',
},
null,
2,
),
contentType: 'application/json',
});
expect(
fastPathPresent,
'app.asar contains the in-place `.setImage(` call injected by ' +
'patch_tray_inplace_update (scripts/patches/tray.sh)',
).toBe(true);
});
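
The fingerprint logic in S08 can be tried without a real `app.asar`. The sketch below runs the same two regexes over two synthetic strings; `countMatches`, `fingerprintTrayPatch`, and both sample bundles are made up for illustration and are not taken from the real bundle or from `tray.sh`.

```typescript
// Count non-overlapping matches of `pattern` in `source`. The pattern is
// rebuilt with only the global flag, so other flags on the input are dropped.
function countMatches(source: string, pattern: RegExp): number {
  return (source.match(new RegExp(pattern.source, 'g')) ?? []).length;
}

interface TrayFingerprint {
  setImageCount: number;
  setContextMenuCount: number;
  fastPathPresent: boolean;
}

// Same presence-only rule as S08: any `.setImage(` means the patch landed.
function fingerprintTrayPatch(bundle: string): TrayFingerprint {
  const setImageCount = countMatches(bundle, /\.setImage\s*\(/);
  const setContextMenuCount = countMatches(bundle, /\.setContextMenu\s*\(/);
  return { setImageCount, setContextMenuCount, fastPathPresent: setImageCount > 0 };
}

// Synthetic stand-ins for the minified bundle (illustrative only).
const pristineSample =
  't=new E.Tray(E.nativeImage.createFromPath(p));t.setContextMenu(m);';
const patchedSample =
  'if(t){t.setImage(E.nativeImage.createFromPath(p));t.setContextMenu(m)}else{' +
  pristineSample +
  '}';

console.log(fingerprintTrayPatch(pristineSample));
console.log(fingerprintTrayPatch(patchedSample));
```

On these samples the pristine string yields zero `.setImage(` matches and one `setContextMenu`, while the patched string yields one and two respectively. Those are the counts the comments in this spec describe for upstream and patched bundles.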

View File

@@ -0,0 +1,47 @@
import { test, expect } from '@playwright/test';
import { readAsarFile, resolveAsarPath } from '../lib/asar.js';
// S09 — Quick window patch runs only on KDE (post-#406 gate).
// Backs QE-19 in docs/testing/quick-entry-closeout.md.
//
// The patch in scripts/patches/quick-window.sh injects an
// `(process.env.XDG_CURRENT_DESKTOP||"").toLowerCase().includes("kde")`
// gate into the bundled JS. The string `XDG_CURRENT_DESKTOP` shows up
// in app.asar's index.js if and only if the patch ran at build time.
// The patch ships in every build; the KDE-vs-non-KDE branch is
// decided at runtime by the env-var check.
//
// Pure file probe — no app launch. Fast (<1s).
//
// Runtime gate effectiveness is verified implicitly by S31 passing
// on KDE (popup-show works through the patched code path) and the
// upstream-equivalent path running on non-KDE rows.
test('S09 — Quick window patch runs only on KDE (post-#406 gate)', async ({}, testInfo) => {
testInfo.annotations.push({ type: 'severity', description: 'Critical' });
testInfo.annotations.push({ type: 'surface', description: 'Patch gate' });
const asarPath = resolveAsarPath();
await testInfo.attach('asar-path', {
body: asarPath,
contentType: 'text/plain',
});
const indexJs = readAsarFile('.vite/build/index.js', asarPath);
// The gate string is the build-time fingerprint of the patch. If
// the patch didn't run, the bundled JS won't contain it.
const gatePresent = indexJs.includes('XDG_CURRENT_DESKTOP');
// Bonus signal: the lowercase `kde` literal that the injected gate
// compares against. This is a bare substring match, so it is
// diagnostic only and never asserted. If both are present, the
// patch's full payload most likely landed.
const patchedComment = indexJs.includes('kde');
// Attach the evidence before asserting so it survives a failure.
await testInfo.attach('gate-evidence', {
body: JSON.stringify({ gatePresent, patchedComment }, null, 2),
contentType: 'application/json',
});
expect(
gatePresent,
'app.asar contains the XDG_CURRENT_DESKTOP gate string injected by quick-window.sh',
).toBe(true);
});
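
S09 only proves that the gate string is in the bundle. For the runtime side, the sketch below wraps the expression quoted in the header comment in a small function so its behaviour on a few `XDG_CURRENT_DESKTOP` values can be checked. `EnvLike` and `isKdeDesktop` are hypothetical names; the expression itself is the one the comment attributes to `quick-window.sh`.

```typescript
// Minimal env shape so the check can be exercised without touching process.env.
type EnvLike = Record<string, string | undefined>;

// Same expression as the gate described above:
// (process.env.XDG_CURRENT_DESKTOP||"").toLowerCase().includes("kde")
function isKdeDesktop(env: EnvLike): boolean {
  return (env['XDG_CURRENT_DESKTOP'] || '').toLowerCase().includes('kde');
}

// A few sample values; the colon-separated form is how some sessions
// report more than one desktop name.
const samples: EnvLike[] = [
  { XDG_CURRENT_DESKTOP: 'KDE' },
  { XDG_CURRENT_DESKTOP: 'ubuntu:GNOME' },
  {},
];
for (const env of samples) {
  console.log(env['XDG_CURRENT_DESKTOP'] ?? '(unset)', isKdeDesktop(env));
}
```

Because the check is a case-insensitive substring match, an unset variable falls through to the non-KDE branch rather than throwing.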

View File

@@ -0,0 +1,122 @@
import { test, expect } from '@playwright/test';
import { launchClaude } from '../lib/electron.js';
import { skipUnlessRow } from '../lib/row.js';
import { QuickEntry } from '../lib/quickentry.js';
import { captureSessionEnv } from '../lib/diagnostics.js';
// S10 — Quick Entry popup is transparent (no opaque square frame).
// Backs the KDE-W row of S10 in
// docs/testing/cases/shortcuts-and-input.md.
//
// Upstream constructs the popup BrowserWindow with
// transparent: true, backgroundColor: "#00000000", frame: false
// at build-reference index.js:515380, 515383, 515381. On KDE Plasma
// Wayland the compositor honours the alpha channel and the popup
// renders with a transparent background; on broken-Electron versions
// (electron/electron#50213, the 41.0.4-41.x.y bisect window per
// @noctuum on #370) the alpha is dropped and an opaque square frame
// shows behind the rounded prompt UI.
//
// Construction-time options aren't observable through the prototype-
// method hook in lib/quickentry.ts (the Proxy from frame-fix-wrapper
// returns the closure-captured PatchedBrowserWindow on `electron.
// BrowserWindow` reads — see the doc-comment on
// QuickEntry.installInterceptor and CLAUDE.md "Test harness Electron
// hooks" learning). Runtime-side, `getBackgroundColor()` reflects
// what the BrowserWindow was actually constructed with — so we read
// it via getPopupRuntimeProps() and assert
// transparent === true && backgroundColor in {'#00000000','#0000'}
// matching the predicate in lib/quickentry.ts:266.
//
// Gated to KDE-W: other KDE rows (KDE-X) don't have the same
// compositor / Electron-Wayland concern that the case-doc S10
// surfaces. If S10 fails on a host whose bundled Electron is in the
// 41.0.4-41.x.y window, that's the upstream regression — see S33 for
// the version-capture half. Don't wrap in skip on failure; surface
// it as a regression-detector signal.
test.setTimeout(60_000);
test('S10 — Quick Entry popup is transparent (no opaque square frame)', async ({}, testInfo) => {
testInfo.annotations.push({ type: 'severity', description: 'Should' });
testInfo.annotations.push({
type: 'surface',
description: 'Quick Entry window (KDE Wayland)',
});
skipUnlessRow(testInfo, ['KDE-W']);
await testInfo.attach('session-env', {
body: JSON.stringify(captureSessionEnv(), null, 2),
contentType: 'application/json',
});
const useHostConfig = process.env.CLAUDE_TEST_USE_HOST_CONFIG === '1';
const app = await launchClaude({
isolation: useHostConfig ? null : undefined,
});
await testInfo.attach('isolation', {
body: JSON.stringify(
{
useHostConfig,
configDir: app.isolation?.configDir ?? null,
},
null,
2,
),
contentType: 'application/json',
});
try {
// Main needs to be up before the shortcut can lazily construct
// the popup — the popup-show path reads renderer state via
// upstream's lHn() user-loaded check (see openAndWaitReady's
// retry-loop comment in lib/quickentry.ts).
const { inspector } = await app.waitForReady('mainVisible');
const qe = new QuickEntry(inspector);
await qe.installInterceptor();
// Fire the OS shortcut and wait for the popup BrowserWindow to
// be visible with its textarea mounted — same handshake S29
// uses. If ydotool isn't reachable, openAndWaitReady throws
// the install-instructions error from ensureYdotool — that
// surfaces as a clear test failure (acceptable per the
// case-doc; not wrapped in a skip).
await qe.openAndWaitReady();
const props = await qe.getPopupRuntimeProps();
await testInfo.attach('popup-runtime-props', {
body: JSON.stringify(props, null, 2),
contentType: 'application/json',
});
expect(
props,
'getPopupRuntimeProps returned null — interceptor did not ' +
'capture the popup BrowserWindow ref',
).not.toBeNull();
// Predicate matches lib/quickentry.ts:266 — '#00000000' is the
// canonical 8-digit form Electron returns for the upstream
// construction value, '#0000' is the short form some Electron
// builds normalise to. Either is acceptable.
expect(
props!.backgroundColor === '#00000000'
|| props!.backgroundColor === '#0000',
`popup backgroundColor must be transparent (#00000000 or ` +
`#0000), got ${JSON.stringify(props!.backgroundColor)}. ` +
`If the bundled Electron is in the 41.0.4-41.x.y window ` +
`(see S33), this is the electron#50213 regression ` +
`tracked under issue #370.`,
).toBe(true);
expect(
props!.transparent,
'popup transparent flag (derived from backgroundColor) is ' +
'false — opaque square frame would render behind the ' +
'rounded prompt UI',
).toBe(true);
inspector.close();
} finally {
await app.close();
}
});
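
The two-value predicate asserted in S10 is small enough to isolate. The sketch below restates it as a pure function, assuming the runtime value is the string `getBackgroundColor()` reports. `isTransparentBackground` is a hypothetical name and is not the helper in `lib/quickentry.ts`.

```typescript
// Accepts the 8-digit form and the short form named in the comments above.
// Anything else, including a missing value, counts as opaque.
function isTransparentBackground(
  backgroundColor: string | null | undefined,
): boolean {
  return backgroundColor === '#00000000' || backgroundColor === '#0000';
}

// Builds the kind of message S10 reports when the predicate fails.
function describeBackground(backgroundColor: string | null | undefined): string {
  return isTransparentBackground(backgroundColor)
    ? 'transparent'
    : `opaque (got ${JSON.stringify(backgroundColor ?? null)})`;
}

console.log(describeBackground('#00000000'));
console.log(describeBackground('#FFFFFFFF'));
```

The match is deliberately exact, as in the spec: a colour with a non-zero alpha digit is treated as opaque, not normalised.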

View File

@@ -0,0 +1,262 @@
import { test, expect } from '@playwright/test';
import { launchClaude } from '../lib/electron.js';
import { skipUnlessRow } from '../lib/row.js';
import { QuickEntry } from '../lib/quickentry.js';
import {
focusOtherWindow,
getFocusedWindowId,
spawnMarkerWindow,
WaylandFocusUnavailable,
XdotoolUnavailable,
type MarkerWindow,
} from '../lib/input.js';
import { captureSessionEnv, readLauncherLog } from '../lib/diagnostics.js';
import { sleep } from '../lib/retry.js';
// S11 — Quick Entry shortcut fires from any focus on Wayland
// (mutter XWayland key-grab). Backs the S11 row in
// docs/testing/cases/shortcuts-and-input.md (severity: Critical).
//
// What this catches vs what it doesn't
// ------------------------------------
// The case-doc's load-bearing concern is the GNOME-W mutter
// XWayland key-grab regression — issue #404 — where mutter under
// native Wayland refuses to honour the XWayland-side global key
// grab, so the shortcut becomes focus-bound. This spec CANNOT
// detect that regression: there is no portable focus-injection
// path on native Wayland (each compositor exposes its own IPC
// and the libei input-emulation portal isn't universally
// honored). The lib/input.ts focus-shifter primitive throws
// `WaylandFocusUnavailable` on native Wayland rows by design —
// see its leading comment for the full reasoning. The Wayland-
// side regression detector is a primitive-gap; it stays manual
// until libei adoption broadens.
//
// What this spec DOES catch is a regression in the X11-side of
// the global-shortcut path (the side that currently works on
// GNOME-X / Ubu-X — `🔧` and `✅` respectively in the matrix).
// If the X11 grab broke on those rows, S11 would catch it. So
// this is a regression detector on a CURRENTLY-PASSING path,
// unlike S12 which is a currently-failing detector for the
// `--enable-features=GlobalShortcutsPortal` wiring.
//
// Row gate
// --------
// Case-doc applies-to is "GNOME, Ubu" (both W and X variants),
// but the focus-shifter primitive is X11-only, gated strictly on
// `XDG_SESSION_TYPE === 'x11'`. Wayland rows can't be exercised
// here — they would either skip via the row gate or trip
// `WaylandFocusUnavailable` from the primitive. So the runner's
// row gate is the X11 subset only: GNOME-X, Ubu-X. The Wayland
// rows for S11 stay manual / matrix-cell-from-doc until a
// libei-based primitive lands.
test.setTimeout(60_000);
test('S11 — Quick Entry shortcut fires from any focus (X11 path)', async ({}, testInfo) => {
skipUnlessRow(testInfo, ['GNOME-X', 'Ubu-X']);
testInfo.annotations.push({ type: 'severity', description: 'Critical' });
testInfo.annotations.push({
type: 'surface',
description: 'Quick Entry / global shortcut',
});
// Single-shot diagnostic record. We attach this once at the
// end (or on early throw) rather than spreading five separate
// attachments — mirrors S31's results shape so matrix-regen
// has one well-known JSON to scrape per spec.
const diag: {
sessionEnv: Record<string, string>;
markerTitle: string | null;
activeWidBeforeFocus: string | null;
activeWidAfterFocus: string | null;
popupState: unknown;
openError: string | null;
focusError: string | null;
launcherLogTail: string | null;
} = {
sessionEnv: captureSessionEnv(),
markerTitle: null,
activeWidBeforeFocus: null,
activeWidAfterFocus: null,
popupState: null,
openError: null,
focusError: null,
launcherLogTail: null,
};
const attachDiag = async () => {
await testInfo.attach('s11-diagnostics', {
body: JSON.stringify(diag, null, 2),
contentType: 'application/json',
});
};
const useHostConfig = process.env.CLAUDE_TEST_USE_HOST_CONFIG === '1';
const app = await launchClaude({
isolation: useHostConfig ? null : undefined,
});
let marker: MarkerWindow | null = null;
try {
// `mainVisible` is the cheapest level that gives us a
// registered global shortcut. Upstream registers via
// globalShortcut.register early in main-process startup
// (build-reference index.js:499416), but we still want
// the main window mapped so the popup-construction path
// has something to anchor to.
const { inspector } = await app.waitForReady('mainVisible');
const qe = new QuickEntry(inspector);
await qe.installInterceptor();
// Capture pre-focus active WID for the diagnostic record.
// On a healthy X11 session this is the Claude main window
// (we just `mainVisible`-readied it). If null, xprop is
// missing or _NET_ACTIVE_WINDOW is unset — neither is a
// blocker for the test, just less useful diagnostics.
diag.activeWidBeforeFocus = await getFocusedWindowId();
// Marker title is unique-per-test to avoid colliding with
// any leftover xterm from a previous run (xterm exits its
// `sleep 600` after 10min so leaks are bounded, but a
// re-run inside that window would otherwise match the
// stale window).
const markerTitle =
`claude-test-s11-marker-${testInfo.testId}-${Date.now()}`;
diag.markerTitle = markerTitle;
try {
marker = await spawnMarkerWindow(markerTitle);
} catch (err) {
// Most likely cause: xterm not on PATH. The primitive
// throws a plain Error with the install hint. Skip
// rather than fail — this is an environment gap.
const msg = err instanceof Error ? err.message : String(err);
diag.focusError = `spawnMarkerWindow: ${msg}`;
await attachDiag();
testInfo.skip(
true,
'xterm not installed; required for the focus-shift target. ' +
`Underlying: ${msg}`,
);
return;
}
// `focusOtherWindow` calls `xdotool search --name <title>`
// once and throws if there are zero matches; only the
// post-focus _NET_ACTIVE_WINDOW verification has its own
// retry. So we need a brief readiness poll for the marker
// window to actually map into the X tree before we attempt
// the focus shift — and the focus shift itself must
// eventually succeed within the budget.
//
// We capture the LAST error (rather than rethrowing on the
// first) so the diagnostic carries the real cause if every
// attempt fails. WaylandFocusUnavailable / XdotoolUnavailable
// are sticky — they won't change between retries — so we
// short-circuit out on the first occurrence and skip.
let focusOk = false;
let lastFocusErr: unknown = null;
let earlySkipReason: string | null = null;
const focusBudgetMs = 5_000;
const focusStart = Date.now();
while (Date.now() - focusStart < focusBudgetMs) {
try {
await focusOtherWindow(markerTitle);
focusOk = true;
break;
} catch (err) {
lastFocusErr = err;
if (err instanceof WaylandFocusUnavailable) {
earlySkipReason =
'WaylandFocusUnavailable on a row that was ' +
'supposed to be X11-gated. Check XDG_SESSION_TYPE.';
break;
}
if (err instanceof XdotoolUnavailable) {
earlySkipReason =
'xdotool not installed; required for the ' +
'focus-shift step. ' +
(err instanceof Error ? err.message : String(err));
break;
}
// "no X11 window matches" (marker not mapped yet) or
// "compositor refused activation" — both can resolve on
// retry. Brief pause then loop.
await sleep(100);
}
}
if (earlySkipReason) {
diag.focusError =
lastFocusErr instanceof Error
? lastFocusErr.message
: String(lastFocusErr);
await attachDiag();
testInfo.skip(true, earlySkipReason);
return;
}
if (!focusOk) {
const msg =
lastFocusErr instanceof Error
? lastFocusErr.message
: String(lastFocusErr);
diag.focusError = msg;
diag.launcherLogTail = await readLauncherLog();
await attachDiag();
throw new Error(
`focusOtherWindow failed within ${focusBudgetMs}ms: ${msg}`,
);
}
// At this point focus is on the marker xterm. Capture the
// post-focus active WID — should equal the marker's WID,
// not Claude's. (We don't have a clean way to fetch the
// marker's WID independently here without re-running
// xdotool; the value-vs-pre comparison in the diagnostic
// is sufficient evidence of the shift.)
diag.activeWidAfterFocus = await getFocusedWindowId();
// Now press the global shortcut. The whole point of S11:
// even though the marker xterm holds focus (and Claude
// does not), the OS-level grab should fire the popup.
try {
await qe.openAndWaitReady();
} catch (err) {
diag.openError = err instanceof Error ? err.message : String(err);
diag.popupState = await qe.getPopupState();
diag.launcherLogTail = await readLauncherLog();
await attachDiag();
throw err;
}
const popupState = await qe.getPopupState();
diag.popupState = popupState;
diag.launcherLogTail = await readLauncherLog();
await attachDiag();
// Single critical assertion: popup exists AND is visible
// after the shortcut press from non-Claude focus. A null
// state means the BrowserWindow was never constructed —
// the X11 grab didn't fire. visible === false means it
// constructed but show() was suppressed (the upstream
// lHn() short-circuit, or a regression in the visibility
// flow). Either is a fail for S11's contract.
expect(
popupState && popupState.visible,
'Quick Entry popup is visible after shortcut press from ' +
'non-Claude focus (X11 path)',
).toBe(true);
} finally {
// Marker xterm cleanup is idempotent. Always run it before
// app.close() so the kill still happens even if app.close()
// itself throws.
if (marker) {
await marker.kill().catch(() => {
// best-effort — process may already be dead
});
}
await app.close();
}
});
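
The focus loop in S11 follows a reusable pattern: retry within a time budget, remember the last error, and stop at once on errors that cannot change between attempts. The sketch below shows that pattern as a generic helper. `retryWithinBudget`, `RetryOutcome` and the `isSticky` callback are hypothetical names and are not part of `lib/retry.ts`.

```typescript
type RetryOutcome =
  | { ok: true; attempts: number }
  | { ok: false; attempts: number; lastError: unknown; sticky: boolean };

// Retries `attempt` until it resolves, the budget runs out, or `isSticky`
// says the error will not change on retry (for example a missing binary).
async function retryWithinBudget(
  attempt: () => Promise<void>,
  opts: {
    budgetMs: number;
    intervalMs: number;
    isSticky: (err: unknown) => boolean;
  },
): Promise<RetryOutcome> {
  const start = Date.now();
  let attempts = 0;
  let lastError: unknown = null;
  while (Date.now() - start < opts.budgetMs) {
    attempts += 1;
    try {
      await attempt();
      return { ok: true, attempts };
    } catch (err) {
      // Keep the most recent error so diagnostics carry the real cause.
      lastError = err;
      if (opts.isSticky(err)) {
        return { ok: false, attempts, lastError, sticky: true };
      }
      await new Promise<void>((resolve) => setTimeout(resolve, opts.intervalMs));
    }
  }
  return { ok: false, attempts, lastError, sticky: false };
}
```

A caller would map `sticky: true` to a skip (environment gap) and `sticky: false` to a thrown failure, which is the split S11 makes between `earlySkipReason` and the `focusOtherWindow failed within ...` error.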

View File

@@ -0,0 +1,95 @@
import { test, expect } from '@playwright/test';
import { launchClaude } from '../lib/electron.js';
import { skipUnlessRow } from '../lib/row.js';
import { readPidArgv, argvHasFlag } from '../lib/argv.js';
import { readLauncherLog, captureSessionEnv } from '../lib/diagnostics.js';
// S12 — `--enable-features=GlobalShortcutsPortal` launcher flag
// wired up for GNOME Wayland. Backs QE-6 in
// docs/testing/quick-entry-closeout.md.
//
// On GNOME Wayland, mutter no longer honors XWayland-side key grabs,
// so the Quick Entry global shortcut fails from unfocused state
// (#404). The fix is to route global shortcuts through XDG Desktop
// Portal: pass `--enable-features=GlobalShortcutsPortal` to Electron
// from the launcher when XDG_CURRENT_DESKTOP includes GNOME and
// XDG_SESSION_TYPE is wayland.
//
// As of writing, this fix is NOT implemented. The test asserts the
// fix's signature (the flag is in the spawned Electron's argv) and
// will therefore FAIL on GNOME-W until the launcher patch lands.
// That's intentional — it's the regression detector, not a smoke
// test. Once the patch is in, this becomes a Critical green cell.
//
// Row gate: GNOME Wayland only (GNOME-W, Ubu-W). All other rows skip with `-`.
test.setTimeout(45_000);
test('S12 — --enable-features=GlobalShortcutsPortal launcher flag wired up for GNOME Wayland', async ({}, testInfo) => {
testInfo.annotations.push({ type: 'severity', description: 'Critical' });
testInfo.annotations.push({
type: 'surface',
description: 'Launcher flag wiring',
});
skipUnlessRow(testInfo, ['GNOME-W', 'Ubu-W']);
await testInfo.attach('session-env', {
body: JSON.stringify(captureSessionEnv(), null, 2),
contentType: 'application/json',
});
const useHostConfig = process.env.CLAUDE_TEST_USE_HOST_CONFIG === '1';
const app = await launchClaude({
isolation: useHostConfig ? null : undefined,
});
try {
await app.waitForX11Window(15_000);
const argv = await readPidArgv(app.pid);
await testInfo.attach('electron-argv', {
body: JSON.stringify(argv, null, 2),
contentType: 'application/json',
});
expect(argv, 'could read /proc/$pid/cmdline').not.toBeNull();
// Launcher log carries a stable line — see
// scripts/launcher-common.sh:98, 102 — that says which backend
// was selected. Capture it for diagnostic context.
const log = await readLauncherLog();
if (log) {
const tail = log.split('\n').slice(-50).join('\n');
await testInfo.attach('launcher-log-tail', {
body: tail,
contentType: 'text/plain',
});
}
const present = argvHasFlag(
argv ?? [],
'--enable-features=GlobalShortcutsPortal',
);
await testInfo.attach('flag-presence', {
body: JSON.stringify(
{
flag: '--enable-features=GlobalShortcutsPortal',
present,
note:
'On GNOME Wayland this flag must be present for ' +
'#404 to be closeable. Until the launcher patch ' +
'lands, this test fails as a regression detector.',
},
null,
2,
),
contentType: 'application/json',
});
expect(
present,
'--enable-features=GlobalShortcutsPortal is in Electron argv on GNOME Wayland',
).toBe(true);
} finally {
await app.close();
}
});
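
The header comment of S12 describes the condition under which the launcher would add the portal flag, but that launcher change does not exist yet. Purely as an illustration of the stated rule, the sketch below expresses it in TypeScript. `wantsGlobalShortcutsPortal` and `withPortalFlag` are hypothetical, the case-insensitive matching is an assumption, and the real launcher is a shell script that may be structured quite differently.

```typescript
type EnvLike = Record<string, string | undefined>;

const PORTAL_FLAG = '--enable-features=GlobalShortcutsPortal';

// Rule from the comment above: XDG_CURRENT_DESKTOP includes GNOME and
// XDG_SESSION_TYPE is wayland. Lower-casing both values is an assumption.
function wantsGlobalShortcutsPortal(env: EnvLike): boolean {
  const desktop = (env['XDG_CURRENT_DESKTOP'] ?? '').toLowerCase();
  const session = (env['XDG_SESSION_TYPE'] ?? '').toLowerCase();
  return desktop.includes('gnome') && session === 'wayland';
}

// Returns a new argv with the flag appended when the rule applies and the
// flag is not already there. The input array is never mutated.
function withPortalFlag(argv: readonly string[], env: EnvLike): string[] {
  if (!wantsGlobalShortcutsPortal(env) || argv.includes(PORTAL_FLAG)) {
    return [...argv];
  }
  return [...argv, PORTAL_FLAG];
}

console.log(
  withPortalFlag(['electron', 'app.asar'], {
    XDG_CURRENT_DESKTOP: 'ubuntu:GNOME',
    XDG_SESSION_TYPE: 'wayland',
  }),
);
```

This sketch appends a separate switch for simplicity. A real launcher would likely need to merge the feature name into any existing `--enable-features=` list instead, since a second switch may not combine with the first.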

View File

@@ -0,0 +1,266 @@
import { test, expect } from '@playwright/test';
import { launchClaude } from '../lib/electron.js';
import { skipUnlessRow } from '../lib/row.js';
import { QuickEntry } from '../lib/quickentry.js';
import {
focusOtherWindow,
getFocusedWindowId,
spawnMarkerWindow,
NiriIpcUnavailable,
FootUnavailable,
type MarkerWindow,
} from '../lib/input-niri.js';
import { captureSessionEnv, readLauncherLog } from '../lib/diagnostics.js';
import { sleep } from '../lib/retry.js';
// S14 — Quick Entry shortcut fires from any focus on Niri
// (XDG portal BindShortcuts path). Backs the S14 row in
// docs/testing/cases/shortcuts-and-input.md (severity: Critical
// for Niri users).
//
// What this catches vs what it doesn't
// ------------------------------------
// On Niri the launcher special-cases the app to native Wayland
// (`scripts/launcher-common.sh:41-44`), so upstream's
// `globalShortcut.register` (`index.js:499416`) routes through
// Electron's `xdg-desktop-portal` `BindShortcuts` path inside
// Chromium rather than an X11 grab. The case-doc records this
// path as currently failing on Niri:
// `Failed to call BindShortcuts (error code 5)`. So this spec
// is a known-failing detector — the shape mirrors S12's
// `--enable-features=GlobalShortcutsPortal` GNOME-W detector:
// the assertion encodes the contract, and the test will start
// passing automatically once the upstream / portal-side issue
// is resolved on Niri without any spec edit.
//
// The user-visible symptom (Quick Entry shortcut doesn't fire
// on Niri) is the same as #404 (mutter XWayland key-grab on
// GNOME-W) but the root cause is different: Niri is a Smithay-based
// Wayland compositor with no XWayland by default, so the X11-side
// `lib/input.ts` focus-shifter cannot exercise this path.
// `lib/input-niri.ts` is the substrate — `niri msg --json`
// for the focus-injection + readback chain, `foot --title` for
// the Wayland-native marker window. The mutter / GNOME-W
// regression detector remains a separate primitive gap (libei
// when broadly available, or a per-compositor mutter-IPC
// primitive — neither shipped).
//
// Row gate
// --------
// Niri only. Other Wayland rows (KDE-W, GNOME-W, Ubu-W) each
// need their own compositor IPC and stay manual / matrix-cell-
// from-doc until a libei-based primitive lands.
test.setTimeout(60_000);
test('S14 — Quick Entry shortcut fires from any focus (Niri Wayland path)', async ({}, testInfo) => {
skipUnlessRow(testInfo, ['Niri']);
testInfo.annotations.push({ type: 'severity', description: 'Critical' });
testInfo.annotations.push({
type: 'surface',
description: 'XDG Desktop Portal BindShortcuts',
});
// Single-shot diagnostic record. We attach this once at the
// end (or on early throw) rather than spreading five separate
// attachments — mirrors S31's results shape so matrix-regen
// has one well-known JSON to scrape per spec.
const diag: {
sessionEnv: Record<string, string>;
markerTitle: string | null;
activeWidBeforeFocus: number | null;
activeWidAfterFocus: number | null;
popupState: unknown;
openError: string | null;
focusError: string | null;
launcherLogTail: string | null;
} = {
sessionEnv: captureSessionEnv(),
markerTitle: null,
activeWidBeforeFocus: null,
activeWidAfterFocus: null,
popupState: null,
openError: null,
focusError: null,
launcherLogTail: null,
};
const attachDiag = async () => {
await testInfo.attach('s14-diagnostics', {
body: JSON.stringify(diag, null, 2),
contentType: 'application/json',
});
};
const useHostConfig = process.env.CLAUDE_TEST_USE_HOST_CONFIG === '1';
const app = await launchClaude({
isolation: useHostConfig ? null : undefined,
});
let marker: MarkerWindow | null = null;
try {
// `mainVisible` is the cheapest level that gives us a
// registered global shortcut. Upstream registers via
// globalShortcut.register early in main-process startup
// (build-reference index.js:499416), but we still want
// the main window mapped so the popup-construction path
// has something to anchor to.
const { inspector } = await app.waitForReady('mainVisible');
const qe = new QuickEntry(inspector);
await qe.installInterceptor();
// Capture pre-focus active window id for the diagnostic
// record. On a healthy Niri session this is the Claude
// main window (we just `mainVisible`-readied it). If
// null, `niri msg` is unavailable or there is no focused
// window — neither blocks the test, just less useful
// diagnostics.
diag.activeWidBeforeFocus = await getFocusedWindowId();
// Marker title is unique-per-test to avoid colliding with
// any leftover foot from a previous run (foot exits its
// `sleep 600` after 10min so leaks are bounded, but a
// re-run inside that window would otherwise match the
// stale window).
const markerTitle =
`claude-test-s14-marker-${testInfo.testId}-${Date.now()}`;
diag.markerTitle = markerTitle;
try {
marker = await spawnMarkerWindow(markerTitle);
} catch (err) {
// Most likely cause: foot not on PATH. The primitive
// throws `FootUnavailable` with the install hint. Skip
// rather than fail — this is an environment gap.
const msg = err instanceof Error ? err.message : String(err);
diag.focusError = `spawnMarkerWindow: ${msg}`;
await attachDiag();
testInfo.skip(
true,
'foot not installed; required for the focus-shift target. ' +
`Underlying: ${msg}`,
);
return;
}
// `focusOtherWindow` queries `niri msg --json windows`
// once and throws if there are zero matches; only the
// post-focus focused-window verification has its own
// retry. So we need a brief readiness poll for the
// marker window to actually appear in the niri window
// list before we attempt the focus shift — and the focus
// shift itself must eventually succeed within the budget.
//
// We capture the LAST error (rather than rethrowing on
// the first) so the diagnostic carries the real cause if
// every attempt fails. NiriIpcUnavailable / FootUnavailable
// are sticky — they won't change between retries — so we
// short-circuit out on the first occurrence and skip.
let focusOk = false;
let lastFocusErr: unknown = null;
let earlySkipReason: string | null = null;
const focusBudgetMs = 5_000;
const focusStart = Date.now();
while (Date.now() - focusStart < focusBudgetMs) {
try {
await focusOtherWindow(markerTitle);
focusOk = true;
break;
} catch (err) {
lastFocusErr = err;
if (err instanceof NiriIpcUnavailable) {
earlySkipReason =
'NiriIpcUnavailable on a row that was ' +
'supposed to be Niri-gated. Check NIRI_SOCKET / ' +
'`niri msg` availability.';
break;
}
if (err instanceof FootUnavailable) {
earlySkipReason =
'foot not installed; required for the ' +
'focus-shift step. ' +
(err instanceof Error ? err.message : String(err));
break;
}
// "no window matches" (marker not yet listed by
// niri) or "focus-window action did not stick" —
// both can resolve on retry. Brief pause then loop.
await sleep(100);
}
}
if (earlySkipReason) {
diag.focusError =
lastFocusErr instanceof Error
? lastFocusErr.message
: String(lastFocusErr);
await attachDiag();
testInfo.skip(true, earlySkipReason);
return;
}
if (!focusOk) {
const msg =
lastFocusErr instanceof Error
? lastFocusErr.message
: String(lastFocusErr);
diag.focusError = msg;
diag.launcherLogTail = await readLauncherLog();
await attachDiag();
throw new Error(
`focusOtherWindow failed within ${focusBudgetMs}ms: ${msg}`,
);
}
// At this point focus is on the marker foot. Capture the
// post-focus focused-window id — should equal the
// marker's id, not Claude's. (We don't have a clean way
// to fetch the marker's id independently here without
// re-running `niri msg`; the value-vs-pre comparison in
// the diagnostic is sufficient evidence of the shift.)
diag.activeWidAfterFocus = await getFocusedWindowId();
// Now press the global shortcut. The whole point of S14:
// even though the marker foot holds focus (and Claude
// does not), the portal-routed BindShortcuts grab should
// fire the popup. Currently known-failing per case-doc
// S14 (`Failed to call BindShortcuts (error code 5)`).
try {
await qe.openAndWaitReady();
} catch (err) {
diag.openError = err instanceof Error ? err.message : String(err);
diag.popupState = await qe.getPopupState();
diag.launcherLogTail = await readLauncherLog();
await attachDiag();
throw err;
}
const popupState = await qe.getPopupState();
diag.popupState = popupState;
diag.launcherLogTail = await readLauncherLog();
await attachDiag();
// Single critical assertion: popup exists AND is visible
// after the shortcut press from non-Claude focus. A null
// state means the BrowserWindow was never constructed —
// the portal grab didn't fire. visible === false means
// it constructed but show() was suppressed (the upstream
// lHn() short-circuit, or a regression in the visibility
// flow). Either is a fail for S14's contract.
expect(
popupState && popupState.visible,
'Quick Entry popup is visible after shortcut press from ' +
'non-Claude focus (Niri Wayland path)',
).toBe(true);
} finally {
// Marker foot cleanup is idempotent. Always run before
// app.close() so the kill happens even if the spec
// throws between the two.
if (marker) {
await marker.kill().catch(() => {
// best-effort — process may already be dead
});
}
await app.close();
}
});

@@ -0,0 +1,367 @@
import { test, expect } from '@playwright/test';
import { spawn } from 'node:child_process';
import { existsSync, statSync } from 'node:fs';
import { mkdtemp, open, readdir, rm } from 'node:fs/promises';
import { tmpdir } from 'node:os';
import { join } from 'node:path';
// S15 — AppImage `--appimage-extract` fallback works as documented.
//
// Per docs/testing/cases/distribution.md S15: on FUSE-less hosts the
// AppImage runtime ships an extract fallback. Running the AppImage
// with `--appimage-extract` should drop a `squashfs-root/` into the
// CWD with a working `AppRun` inside, runnable without FUSE. The
// case-doc anchors point at scripts/packaging/appimage.sh:282/:312
// (built with stock `appimagetool`, which always supports
// `--appimage-extract`) and the AppRun script at
// scripts/packaging/appimage.sh:70-118; CI exercises the same path
// (tests/test-artifact-appimage.sh:36-44).
//
// Assertion shape:
// 1. Locate an AppImage. Skip cleanly if not running from one.
// 2. mkdtemp a work dir, spawn `<AppImage> --appimage-extract` with
// that dir as CWD. Assert exit 0.
// 3. Assert `squashfs-root/AppRun` exists.
// 4. Spawn `squashfs-root/AppRun --version` with a 5s timeout. The
// case-doc accepts "exit 0 or doesn't immediately fail" — we
// treat anything that didn't crash with a FUSE/dlopen error
// within the window as a pass; clean exit 0 is the strongest
// signal.
// 5. rm the extracted tree in `finally`.
//
// AppImage detection mirrors S01's inline probe (probe
// CLAUDE_DESKTOP_LAUNCHER, fall back to <repo>/test-build/*.AppImage,
// verify ELF magic + AppImage type marker). Inline rather than
// extracted to a shared lib — only two callers today, and the
// canary-style runners benefit from being decoupled from moving
// helper surfaces.
interface AppImageProbeResult {
path: string | null;
reason: string;
}
// AppImages are ELF executables that carry a magic marker at byte
// offset 8 of the file: `AI\x02` for type 2 (the squashfs-based format
// our build emits) or `AI\x01` for type 1.
async function probeAppImagePath(): Promise<AppImageProbeResult> {
const explicit = process.env.CLAUDE_DESKTOP_LAUNCHER;
const candidates: string[] = [];
if (explicit) candidates.push(explicit);
const projectRoot = '/home/aaddrick/source/claude-desktop-debian';
const testBuildDir = `${projectRoot}/test-build`;
if (existsSync(testBuildDir)) {
try {
const entries = await readdir(testBuildDir);
for (const entry of entries) {
if (entry.endsWith('.AppImage')) {
candidates.push(`${testBuildDir}/${entry}`);
}
}
} catch {
// best-effort
}
}
for (const candidate of candidates) {
if (!existsSync(candidate)) continue;
try {
const st = statSync(candidate);
if (!st.isFile()) continue;
if (candidate.endsWith('.AppImage')) {
return { path: candidate, reason: 'matched .AppImage suffix' };
}
const fh = await open(candidate, 'r');
try {
const buf = Buffer.alloc(12);
await fh.read(buf, 0, 12, 0);
const elf = buf.subarray(0, 4).toString('hex') === '7f454c46';
const aiMagic = buf.subarray(8, 11);
const isAppImage =
elf &&
aiMagic[0] === 0x41 &&
aiMagic[1] === 0x49 &&
(aiMagic[2] === 0x01 || aiMagic[2] === 0x02);
if (isAppImage) {
return {
path: candidate,
reason: 'matched AppImage magic bytes',
};
}
} finally {
await fh.close();
}
} catch {
// fall through to next candidate
}
}
return {
path: null,
reason:
'no AppImage found via CLAUDE_DESKTOP_LAUNCHER or ' +
`${testBuildDir}/*.AppImage`,
};
}
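// Illustrative sketch, not wired into the runner: the header check that
// probeAppImagePath() performs above, restated as a pure function so
// the byte layout is easier to follow. The name
// `looksLikeAppImageHeader` is hypothetical, and the sketch assumes the
// caller has already read the first bytes of the candidate file.
function looksLikeAppImageHeader(header: Uint8Array): boolean {
    // Need bytes 0..10: ELF magic (0-3) plus the marker (8-10).
    if (header.length < 11) return false;
    // ELF magic: 0x7f 'E' 'L' 'F' at offset 0.
    const isElf =
        header[0] === 0x7f &&
        header[1] === 0x45 &&
        header[2] === 0x4c &&
        header[3] === 0x46;
    // AppImage marker: 'A' 'I' at offset 8, then the type byte.
    const hasMarker = header[8] === 0x41 && header[9] === 0x49;
    const typeByte = header[10];
    return isElf && hasMarker && (typeByte === 0x01 || typeByte === 0x02);
}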
interface SpawnResult {
exitCode: number | null;
signalCode: NodeJS.Signals | null;
stdout: string;
stderr: string;
timedOut: boolean;
elapsedMs: number;
}
async function runWithTimeout(
cmd: string,
args: string[],
cwd: string,
timeoutMs: number,
): Promise<SpawnResult> {
const start = Date.now();
const proc = spawn(cmd, args, {
cwd,
env: process.env,
stdio: ['ignore', 'pipe', 'pipe'],
detached: false,
});
const stdoutChunks: Buffer[] = [];
const stderrChunks: Buffer[] = [];
proc.stdout?.on('data', (c: Buffer) => stdoutChunks.push(c));
proc.stderr?.on('data', (c: Buffer) => stderrChunks.push(c));
let exitCode: number | null = null;
let signalCode: NodeJS.Signals | null = null;
let timedOut = false;
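// Keep a handle on the timer so it can be cleared once the race
// settles; otherwise it stays pending for the full timeoutMs.
let timer: ReturnType<typeof setTimeout> | undefined;
await Promise.race([
new Promise<void>((resolve) => {
proc.once('exit', (code, signal) => {
exitCode = code;
signalCode = signal;
resolve();
});
// A failed spawn (ENOENT, EACCES) emits 'error' and may never
// emit 'exit'. Without a listener Node rethrows it as an
// uncaught exception, so record it and let the caller's
// exit-code assertion report the failure.
proc.once('error', (err) => {
stderrChunks.push(Buffer.from(`spawn error: ${err.message}\n`));
resolve();
});
}),
new Promise<void>((resolve) => {
timer = setTimeout(() => {
timedOut = true;
resolve();
}, timeoutMs);
}),
]);
clearTimeout(timer);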
if (proc.exitCode === null && proc.signalCode === null) {
proc.kill('SIGTERM');
await Promise.race([
new Promise<void>((resolve) =>
proc.once('exit', (code, signal) => {
exitCode = code;
signalCode = signal;
resolve();
}),
),
new Promise<void>((resolve) => setTimeout(resolve, 2_000)),
]);
if (proc.exitCode === null && proc.signalCode === null) {
proc.kill('SIGKILL');
await new Promise<void>((resolve) => {
proc.once('exit', (code, signal) => {
exitCode = code;
signalCode = signal;
resolve();
});
setTimeout(() => resolve(), 1_000);
});
}
}
return {
exitCode,
signalCode,
stdout: Buffer.concat(stdoutChunks).toString('utf8'),
stderr: Buffer.concat(stderrChunks).toString('utf8'),
timedOut,
elapsedMs: Date.now() - start,
};
}
function tail(s: string, n: number): string {
if (s.length <= n) return s;
return s.slice(-n);
}
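// Illustrative sketch, not wired into the runner: the acceptance rule
// the S15 test below applies to the `AppRun --version` result, written
// as one pure predicate. `AppRunOutcomeSketch` and
// `isAcceptableAppRunOutcome` are hypothetical names; the fields are
// assumed to mirror the matching SpawnResult fields.
interface AppRunOutcomeSketch {
    exitCode: number | null;
    timedOut: boolean;
    stdout: string;
    stderr: string;
}
function isAcceptableAppRunOutcome(outcome: AppRunOutcomeSketch): boolean {
    // Hard fail: the extracted tree must not need FUSE at all.
    const stderrLower = outcome.stderr.toLowerCase();
    const fuseFailure =
        stderrLower.includes('libfuse.so.2') ||
        (stderrLower.includes('dlopen') && stderrLower.includes('fuse'));
    if (fuseFailure) return false;
    // Soft pass: clean exit, or still alive at the timeout after
    // printing something that looks like a version string.
    const versionSeen =
        /\d+\.\d+\.\d+/.test(outcome.stdout) ||
        /\d+\.\d+\.\d+/.test(outcome.stderr);
    return outcome.exitCode === 0 || (outcome.timedOut && versionSeen);
}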
test.setTimeout(60_000);
test('S15 — AppImage --appimage-extract fallback works', async ({}, testInfo) => {
// Case-doc S15 lists Severity: Could. Surface label is the harness
// taxonomy ("Distribution / AppImage extract") rather than the
// case-doc's free-text "AppImage runtime / FUSE-less fallback".
testInfo.annotations.push({ type: 'severity', description: 'Could' });
testInfo.annotations.push({
type: 'surface',
description: 'Distribution / AppImage extract',
});
const probe = await probeAppImagePath();
await testInfo.attach('appimage-probe', {
body: JSON.stringify(probe, null, 2),
contentType: 'application/json',
});
if (!probe.path) {
test.skip(true, `S15 only applies to AppImage installs: ${probe.reason}`);
return;
}
const appImagePath = probe.path;
await testInfo.attach('appimage-path', {
body: appImagePath,
contentType: 'text/plain',
});
// mkdtemp so the extract tree lands in $TMPDIR, not the harness
// CWD. `--appimage-extract` writes `squashfs-root/` relative to
// CWD, so we just spawn with cwd = the temp dir.
const extractDir = await mkdtemp(join(tmpdir(), 'claude-s15-'));
const squashRoot = join(extractDir, 'squashfs-root');
const appRun = join(squashRoot, 'AppRun');
await testInfo.attach('extract-dir', {
body: extractDir,
contentType: 'text/plain',
});
try {
// Step 1: extraction. 30s budget — extracting ~200MB of
// squashfs to disk is well under that on any modern host.
const extract = await runWithTimeout(
appImagePath,
['--appimage-extract'],
extractDir,
30_000,
);
await testInfo.attach('extract-exit', {
body: JSON.stringify(
{
exitCode: extract.exitCode,
signalCode: extract.signalCode,
timedOut: extract.timedOut,
elapsedMs: extract.elapsedMs,
},
null,
2,
),
contentType: 'application/json',
});
await testInfo.attach('extract-stderr-tail-4k', {
body: tail(extract.stderr, 4096) || '(empty)',
contentType: 'text/plain',
});
await testInfo.attach('extract-stdout-tail-4k', {
body: tail(extract.stdout, 4096) || '(empty)',
contentType: 'text/plain',
});
expect(
extract.exitCode,
`AppImage --appimage-extract should exit 0 ` +
`(stderr tail: ${tail(extract.stderr, 256)})`,
).toBe(0);
expect(
extract.signalCode,
'extraction process should not be killed by signal',
).toBe(null);
// Step 2: assert squashfs-root/AppRun exists.
const appRunExists = existsSync(appRun);
await testInfo.attach('apprun-exists', {
body: JSON.stringify(
{
path: appRun,
exists: appRunExists,
squashfsRootExists: existsSync(squashRoot),
},
null,
2,
),
contentType: 'application/json',
});
expect(
appRunExists,
`squashfs-root/AppRun should exist after extract at ${appRun}`,
).toBe(true);
// Step 3: spawn `AppRun --version` with a 5s timeout. AppRun
// is a wrapper script (scripts/packaging/appimage.sh:70-118)
// that hands off to the real Electron entry — `--version`
// is the cheapest probe that exercises the full launch path
// without bringing up a window. The case-doc accepts "exit 0
// or doesn't immediately fail"; a clean exit 0 is best, but
// we also flag obvious FUSE / dlopen errors as failures.
const apprun = await runWithTimeout(
appRun,
['--version'],
squashRoot,
5_000,
);
await testInfo.attach('apprun-exit', {
body: JSON.stringify(
{
exitCode: apprun.exitCode,
signalCode: apprun.signalCode,
timedOut: apprun.timedOut,
elapsedMs: apprun.elapsedMs,
},
null,
2,
),
contentType: 'application/json',
});
await testInfo.attach('apprun-stderr-tail-4k', {
body: tail(apprun.stderr, 4096) || '(empty)',
contentType: 'text/plain',
});
await testInfo.attach('apprun-stdout-tail-4k', {
body: tail(apprun.stdout, 4096) || '(empty)',
contentType: 'text/plain',
});
// Hard fail on the cardinal "didn't run at all" patterns: a
// FUSE / dlopen complaint here would mean the extract path
// ALSO depends on FUSE (which would defeat its purpose).
const stderrLower = apprun.stderr.toLowerCase();
const fuseFailure =
stderrLower.includes('libfuse.so.2') ||
(stderrLower.includes('dlopen') && stderrLower.includes('fuse'));
expect(
fuseFailure,
`AppRun --version stderr should not show a FUSE/dlopen ` +
`failure (the extract fallback exists precisely to avoid ` +
`FUSE). stderr tail: ${tail(apprun.stderr, 256)}`,
).toBe(false);
// Soft acceptance: exit 0 is canonical. The only other accepted
// outcome is a process that was still alive at the 5s timeout
// (runWithTimeout then kills it) and had already printed a
// version string on stdout or stderr. A plain non-zero exit
// fails, even when a version was printed.
const versionLooksOk =
/\d+\.\d+\.\d+/.test(apprun.stdout) ||
/\d+\.\d+\.\d+/.test(apprun.stderr);
const acceptableTimeout = apprun.timedOut && versionLooksOk;
expect(
apprun.exitCode === 0 || acceptableTimeout,
`AppRun --version should exit 0 or print a version before ` +
`timeout. exit=${apprun.exitCode} signal=${apprun.signalCode} ` +
`timedOut=${apprun.timedOut} ` +
`stdoutHasVersion=${versionLooksOk}`,
).toBe(true);
} finally {
await rm(extractDir, { recursive: true, force: true }).catch(() => {});
}
});

@@ -0,0 +1,326 @@
import { test, expect } from '@playwright/test';
import { spawn, execFile } from 'node:child_process';
import { existsSync, statSync } from 'node:fs';
import { open, mkdtemp, rm } from 'node:fs/promises';
import { promisify } from 'node:util';
import { tmpdir } from 'node:os';
import { join } from 'node:path';
import { retryUntil, sleep } from '../lib/retry.js';
import { captureSessionEnv } from '../lib/diagnostics.js';
const exec = promisify(execFile);
// S16 — AppImage mount cleans up on app exit.
//
// Per docs/testing/cases/distribution.md S16: launching the AppImage
// produces a `/tmp/.mount_claude*` FUSE mount; quitting cleanly should
// remove it. CLAUDE.md "Common Gotchas" documents
// `pkill -9 -f "mount_claude"` as the manual recovery for stale mounts
// after force-quit. The case-doc anchor notes mount lifecycle is owned
// by upstream `appimagetool`'s runtime, not this repo — we assert
// upstream behaviour as a regression detector.
//
// IMPORTANT: `lib/electron.ts:launchClaude()` bypasses the AppImage
// runtime. It spawns the bundled Electron binary directly with
// `app.asar` as an argument (see electron.ts:312-328 and
// DEFAULT_INSTALL_PATHS at :157-166), so no FUSE mount ever appears.
// Using launchClaude here would make the cleanup check trivially true
// on any host. To exercise the real `appimagetool` runtime + FUSE
// mount path, we spawn the AppImage directly via
// `child_process.spawn`, the same shape as S01.
//
// Readiness signal: rather than waiting for an X11 window (the AppImage
// re-execs itself + spawns Electron children, so `_NET_WM_PID` matching
// against our spawn pid is unreliable), we poll for the `.mount_claude`
// entry to appear in `mount(8)` output — the FUSE mount is the runtime's
// first user-visible side-effect and happens within ~100ms on a healthy
// host. That same signal is what we ultimately assert on, so it
// doubles as the readiness check and the post-launch baseline delta.
const MOUNT_TOKEN = '.mount_claude';
interface AppImageProbeResult {
path: string | null;
reason: string;
}
// Mirrors S01's probe: AppImages are ELF executables with the
// `AI\x02` (type 2) or `AI\x01` (type 1) magic at offset 8.
async function probeAppImagePath(): Promise<AppImageProbeResult> {
const explicit = process.env.CLAUDE_DESKTOP_LAUNCHER;
const candidates: string[] = [];
if (explicit) candidates.push(explicit);
const projectRoot = '/home/aaddrick/source/claude-desktop-debian';
const testBuildDir = `${projectRoot}/test-build`;
if (existsSync(testBuildDir)) {
try {
const fs = await import('node:fs/promises');
const entries = await fs.readdir(testBuildDir);
for (const entry of entries) {
if (entry.endsWith('.AppImage')) {
candidates.push(`${testBuildDir}/${entry}`);
}
}
} catch {
// best-effort
}
}
for (const candidate of candidates) {
if (!existsSync(candidate)) continue;
try {
const st = statSync(candidate);
if (!st.isFile()) continue;
if (candidate.endsWith('.AppImage')) {
return { path: candidate, reason: 'matched .AppImage suffix' };
}
const fh = await open(candidate, 'r');
try {
const buf = Buffer.alloc(12);
await fh.read(buf, 0, 12, 0);
const elf = buf.subarray(0, 4).toString('hex') === '7f454c46';
const aiMagic = buf.subarray(8, 11);
const isAppImage =
elf &&
aiMagic[0] === 0x41 &&
aiMagic[1] === 0x49 &&
(aiMagic[2] === 0x01 || aiMagic[2] === 0x02);
if (isAppImage) {
return {
path: candidate,
reason: 'matched AppImage magic bytes',
};
}
} finally {
await fh.close();
}
} catch {
// fall through
}
}
return {
path: null,
reason:
'no AppImage found via CLAUDE_DESKTOP_LAUNCHER or ' +
`${testBuildDir}/*.AppImage`,
};
}
interface MountSnapshot {
count: number;
lines: string[];
}
async function snapshotClaudeMounts(): Promise<MountSnapshot> {
const { stdout } = await exec('mount', [], { timeout: 5_000 });
const lines = stdout
.split('\n')
.filter((line) => line.includes(MOUNT_TOKEN));
return { count: lines.length, lines };
}
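// Illustrative sketch, not wired into the runner: the baseline-delta
// filter this spec applies to `mount` output in several places,
// factored into one pure function. `newLinesSinceBaseline` is a
// hypothetical name; it assumes `mountOutput` is the raw stdout of
// `mount` and `baselineLines` is the `lines` array of an earlier
// snapshot.
function newLinesSinceBaseline(
    mountOutput: string,
    token: string,
    baselineLines: readonly string[],
): string[] {
    const known = new Set(baselineLines);
    // Keep lines that mention the token and were not in the baseline.
    return mountOutput
        .split('\n')
        .filter((line) => line.includes(token) && !known.has(line));
}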
test.setTimeout(60_000);
test('S16 — AppImage mount cleans up on app exit', async ({}, testInfo) => {
testInfo.annotations.push({ type: 'severity', description: 'Should' });
testInfo.annotations.push({
type: 'surface',
description: 'Distribution / AppImage mount',
});
await testInfo.attach('session-env', {
body: JSON.stringify(captureSessionEnv(), null, 2),
contentType: 'application/json',
});
const probe = await probeAppImagePath();
await testInfo.attach('appimage-probe', {
body: JSON.stringify(probe, null, 2),
contentType: 'application/json',
});
if (!probe.path) {
test.skip(true, `S16 only applies to AppImage installs: ${probe.reason}`);
return;
}
const appimagePath = probe.path;
// Baseline: any pre-existing claude mounts on this host. Should be
// zero on a clean host, but if a previous run leaked a mount we
// want to delta against it rather than fail spuriously here.
const baseline = await snapshotClaudeMounts();
await testInfo.attach('baseline-mounts', {
body: JSON.stringify(baseline, null, 2),
contentType: 'application/json',
});
// Per-test sandbox so the briefly-launched Electron child doesn't
// pollute the host's ~/.config/Claude. Same shape as S01 — we
// can't use launchClaude()'s isolation because it bypasses the
// AppImage runtime altogether.
const sandboxRoot = await mkdtemp(join(tmpdir(), 'claude-s16-'));
const sandboxConfig = join(sandboxRoot, 'config');
const sandboxHome = join(sandboxRoot, 'home');
let postLaunch: MountSnapshot | null = null;
let postClose: MountSnapshot | null = null;
let newMountLines: string[] = [];
let proc: ReturnType<typeof spawn> | null = null;
let cleanShutdown = false;
try {
proc = spawn(appimagePath, [], {
cwd: sandboxRoot,
env: {
...process.env,
HOME: sandboxHome,
XDG_CONFIG_HOME: sandboxConfig,
XDG_DATA_HOME: join(sandboxRoot, 'data'),
XDG_CACHE_HOME: join(sandboxRoot, 'cache'),
},
stdio: ['ignore', 'ignore', 'ignore'],
detached: false,
});
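// A failed spawn also emits an async 'error' event; with no
// listener Node rethrows it as an uncaught exception. The missing
// pid below is what reports the failure.
proc.once('error', () => {});
if (!proc.pid) {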
throw new Error('Failed to spawn AppImage — no pid');
}
// Wait for the FUSE mount to appear. retryUntil polls every
// 200ms; on a healthy host the mount lands in <500ms. 15s is
// generous slack for slow VMs / heavily-loaded hosts.
const mountAppeared = await retryUntil(
async () => {
const snap = await snapshotClaudeMounts();
const fresh = snap.lines.filter(
(line) => !baseline.lines.includes(line),
);
return fresh.length > 0 ? snap : null;
},
{ timeout: 15_000, interval: 200 },
);
if (!mountAppeared) {
// Capture diagnostics before bailing — same shape we'd
// attach on the assertion failure path.
postLaunch = await snapshotClaudeMounts();
await testInfo.attach('post-launch-mounts', {
body: JSON.stringify(postLaunch, null, 2),
contentType: 'application/json',
});
throw new Error(
`AppImage runtime did not produce a ${MOUNT_TOKEN} mount ` +
`within 15s of spawn. Either the runtime failed (check ` +
`for libfuse2 — see S01) or upstream changed the mount ` +
`token.`,
);
}
postLaunch = mountAppeared;
newMountLines = postLaunch.lines.filter(
(line) => !baseline.lines.includes(line),
);
await testInfo.attach('post-launch-mounts', {
body: JSON.stringify(
{
...postLaunch,
newSinceBaseline: newMountLines,
},
null,
2,
),
contentType: 'application/json',
});
// Case-doc step 2: "Quit the app cleanly". `app.close()`-style
// SIGTERM to the AppImage process. Per CLAUDE.md "Common
// Gotchas", killing only the main proc may leave Electron
// children alive holding the mount — so we follow the SIGTERM
// with a `pkill -f mount_claude` SIGKILL backstop if the mount
// hasn't unwound after the settle window.
proc.kill('SIGTERM');
await Promise.race([
new Promise<void>((resolve) => {
proc!.once('exit', () => resolve());
}),
sleep(8_000),
]);
cleanShutdown = proc.exitCode !== null || proc.signalCode !== null;
// Post-close: poll for the mount to disappear BEFORE the
// `finally` block's pkill backstop runs. Polling after the
// SIGKILL would let a leaked mount be force-cleared first, and
// the lingering-mount assertion would then pass vacuously.
// Upstream's runtime unmounts on its own when all children
// exit; the case-doc gives it ~10s. retryUntil with 200ms polls
// keeps the typical-case settle to ~500ms while leaving
// headroom for slow hosts.
const cleanedUp = await retryUntil(
async () => {
const snap = await snapshotClaudeMounts();
const lingering = snap.lines.filter(
(line) => !baseline.lines.includes(line),
);
return lingering.length === 0 ? snap : null;
},
{ timeout: 10_000, interval: 200 },
);
postClose = cleanedUp ?? (await snapshotClaudeMounts());
} finally {
// Whatever happened above, force-clear any leftover claude
// processes so the next test starts clean. This mirrors the
// `pkill -9 -f "mount_claude"` recovery from CLAUDE.md.
if (proc && proc.exitCode === null && proc.signalCode === null) {
try {
proc.kill('SIGKILL');
} catch {
// already dead
}
}
try {
await exec('pkill', ['-9', '-f', 'mount_claude'], {
timeout: 5_000,
});
} catch {
// pkill exits 1 when nothing matches; that is the success
// case for cleanup (the SIGTERM path already worked).
}
await rm(sandboxRoot, { recursive: true, force: true }).catch(() => {});
}
const lingeringMounts = (postClose?.lines ?? []).filter(
(line) => !baseline.lines.includes(line),
);
await testInfo.attach('post-close-mounts', {
body: JSON.stringify(
{
...postClose,
lingeringSinceBaseline: lingeringMounts,
cleanShutdown,
note:
'Lingering mounts after SIGTERM + 10s settle indicate the ' +
'AppImage runtime did not unmount on child exit. CLAUDE.md ' +
'documents `pkill -9 -f "mount_claude"` as the manual ' +
'recovery; this test asserts that the recovery path is ' +
'NOT needed for a clean SIGTERM shutdown.',
},
null,
2,
),
contentType: 'application/json',
});
expect(
newMountLines.length,
`AppImage spawn should produce at least one new ${MOUNT_TOKEN} mount ` +
`(baseline ${baseline.count}, post-launch ${postLaunch?.count ?? 0})`,
).toBeGreaterThan(0);
expect(
lingeringMounts,
`No ${MOUNT_TOKEN} mount should linger after app exit + 10s settle. ` +
`Stale mounts indicate the upstream appimagetool runtime's ` +
`unmount-on-exit handler did not fire (or an Electron child is ` +
`still alive holding the mount — see CLAUDE.md "Killing the app" ` +
`gotcha).`,
).toEqual([]);
});

@@ -0,0 +1,233 @@
import { test, expect } from '@playwright/test';
import { launchClaude } from '../lib/electron.js';
import { captureSessionEnv } from '../lib/diagnostics.js';
// S17 — App launched from `.desktop` inherits shell-profile PATH.
//
// Upstream's shell-path-worker (`shellPathWorker.js`) is forked at
// `app.on('ready')` and runs the user's login shell with `-l -i`,
// printing PATH between sentinels (mac-style env inheritance, now
// applied on Linux too — see index.js:259300 for SLr() / NLr() and
// shellPathWorker.js:205 for extractPathFromShell()).
//
// We launch the app with a deliberately-scrubbed PATH so the
// worker's contribution is visible against a clean baseline. We
// CANNOT just read `process.env.PATH` afterwards: the merge in
// FX() (`index.js:259311`) is gated on `process.env[A] === void 0`,
// so a caller-provided PATH is never overwritten by the worker.
// The bundled f2t module is closure-scoped and not reachable from
// outside.
//
// Workaround: from the inspector we re-fork the same shell-path
// worker via `utilityProcess.fork`, mirroring NLr() exactly, and
// observe the worker's `envResult` message. That gives us the
// worker's resolved PATH directly — same machinery the app uses,
// but with an observable result port.
// Scrubbed baseline: enough system paths for Electron to find its
// helper binaries (zygote, GPU, sandbox shim) but with no user-profile
// entries (`~/.local/bin`, `~/.npm-global/bin`, `~/bin`, `~/.cargo/bin`,
// etc.). Going tighter (e.g. `/usr/bin:/bin`) starves the renderer of
// system tools and the main window never reports visible.
const SCRUBBED_PATH = '/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin';
interface WorkerResult {
ok: boolean;
path?: string;
error?: string;
durationMs: number;
}
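// Illustrative sketch, not wired into the runner: the segment diff the
// S17 test below computes between the scrubbed baseline and the
// worker-resolved PATH, as a pure function. `pathSegmentsAdded` is a
// hypothetical name; it assumes both inputs are colon-separated PATH
// strings.
function pathSegmentsAdded(baseline: string, settled: string): string[] {
    const baselineSet = new Set(baseline.split(':'));
    // Drop empty segments (e.g. from a trailing ':') and anything the
    // baseline already contained.
    return settled
        .split(':')
        .filter((segment) => segment.length > 0 && !baselineSet.has(segment));
}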
test('S17 — App inherits shell-profile PATH on `.desktop` invocation', async ({}, testInfo) => {
// App startup (~5-10s) + inspector attach (~1s) + login-shell PATH
// extraction (1-3s; can be 5s on a cold zsh w/ oh-my-zsh) + slack.
test.setTimeout(150_000);
testInfo.annotations.push({ type: 'severity', description: 'Critical' });
testInfo.annotations.push({
type: 'surface',
description: 'Shell PATH / shell-path worker',
});
// Worker is gated on SHELL existing + pointing at a real binary
// (`shellPathWorker.js:187` getSafeShell()). On hosts without a
// SHELL we have nothing to assert — skip rather than false-fail.
if (!process.env.SHELL) {
testInfo.skip(true, 'SHELL unset on host — shell-path worker has no shell to fork');
return;
}
await testInfo.attach('host-session-env', {
body: JSON.stringify(
{
...captureSessionEnv(),
SHELL: process.env.SHELL,
HOME: process.env.HOME,
},
null,
2,
),
contentType: 'application/json',
});
await testInfo.attach('scrubbed-path', {
body: SCRUBBED_PATH,
contentType: 'text/plain',
});
const app = await launchClaude({
extraEnv: { PATH: SCRUBBED_PATH },
});
try {
const { inspector } = await app.waitForReady('mainVisible');
// Capture what the main process sees as PATH right after
// startup. By the FX()-merge contract this should equal the
// scrubbed value (caller-provided PATH wins over worker
// merge); we attach it for diagnostic completeness so a
// future regression where the merge starts overwriting is
// visible against this anchor.
const mainProcessPath = await inspector.evalInMain<string>(`
return process.env.PATH || '';
`);
await testInfo.attach('main-process-path', {
body: mainProcessPath,
contentType: 'text/plain',
});
// Fork the shell-path worker the app ships with, mirroring
// NLr() at index.js:259349. utilityProcess.fork + a
// MessageChannelMain pair, init the worker, request
// 'getEnvironment', read back envResult.env.PATH. The
// worker runs the user's login shell which can take 1-3s on
// a cold zsh — budget 10s to absorb that plus fork latency.
// One bounded shot, no retry: a worker hang or dead-spawn
// here is a real failure, not a transient.
const workerResult = await inspector.evalInMain<WorkerResult>(
`
const path = process.mainModule.require('node:path');
const fs = process.mainModule.require('node:fs');
const { utilityProcess, MessageChannelMain } =
process.mainModule.require('electron');
const workerPath = path.join(
process.resourcesPath,
'app.asar',
'.vite',
'build',
'shell-path-worker',
'shellPathWorker.js',
);
if (!fs.existsSync(workerPath)) {
return {
ok: false,
error: 'worker not found at ' + workerPath,
durationMs: 0,
};
}
const start = Date.now();
return await new Promise((resolve) => {
let done = false;
const child = utilityProcess.fork(workerPath, [], {
serviceName: 'S17 shell-path probe',
});
const { port1, port2 } = new MessageChannelMain();
const finish = (v) => {
if (done) return;
done = true;
clearTimeout(timer);
try { port1.close(); } catch (_) {}
try { child.kill(); } catch (_) {}
resolve({ ...v, durationMs: Date.now() - start });
};
const timer = setTimeout(() => finish({
ok: false,
error: 'worker probe timed out after 10000ms',
}), 10000);
port1.on('message', (e) => {
if (e.data && e.data.type === 'envResult') {
finish({
ok: true,
path: (e.data.env && e.data.env.PATH) || '',
});
} else if (e.data && e.data.type === 'error') {
finish({ ok: false, error: e.data.message });
}
});
port1.start();
child.once('spawn', () => {
child.postMessage({ type: 'init' }, [port2]);
port1.postMessage({ type: 'getEnvironment' });
});
child.once('exit', (code) => {
finish({
ok: false,
error: 'worker exited before envResult, code=' + code,
});
});
});
`,
15_000,
);
await testInfo.attach('worker-result', {
body: JSON.stringify(workerResult, null, 2),
contentType: 'application/json',
});
expect(
workerResult.ok,
`shell-path worker fork succeeded (error=${workerResult.error})`,
).toBe(true);
const settledPath = workerResult.path ?? '';
await testInfo.attach('settled-path', {
body: settledPath,
contentType: 'text/plain',
});
// Diff the segments so the failure log shows exactly what
// the worker contributed (or didn't).
const scrubbedSet = new Set(SCRUBBED_PATH.split(':'));
const settledSegments = settledPath.split(':').filter(Boolean);
const added = settledSegments.filter((s) => !scrubbedSet.has(s));
await testInfo.attach('path-diff', {
body: JSON.stringify(
{
scrubbed: SCRUBBED_PATH.split(':'),
settled: settledSegments,
added,
},
null,
2,
),
contentType: 'application/json',
});
// If the host's shell rc adds nothing to PATH (clean
// install, no profile customisations) the worker has
// nothing to surface and the assertion below would
// false-fail. Skip with a clear note rather than fail.
if (settledPath === SCRUBBED_PATH || added.length === 0) {
testInfo.skip(
true,
'host shell profile contributes no PATH additions ' +
'beyond the scrubbed baseline — worker has nothing to ' +
'extract on this host',
);
return;
}
expect(
settledPath,
'worker-resolved PATH expanded beyond the scrubbed baseline',
).not.toBe(SCRUBBED_PATH);
expect(
added.length,
'worker added at least one PATH segment from shell profile',
).toBeGreaterThan(0);
} finally {
await app.close();
}
});

@@ -0,0 +1,156 @@
import { test, expect } from '@playwright/test';
import { mkdtemp, rm } from 'node:fs/promises';
import { tmpdir } from 'node:os';
import { join } from 'node:path';
import { launchClaude } from '../lib/electron.js';
import { captureSessionEnv } from '../lib/diagnostics.js';
// S19 — `CLAUDE_CONFIG_DIR` redirects scheduled-task storage.
//
// Backs S19 in docs/testing/cases/routines.md.
//
// Case-doc anchors:
// build-reference/app-extracted/.vite/build/index.js:283107 — `cE()`
// resolves `process.env.CLAUDE_CONFIG_DIR ?? ~/.claude` (with a
// `~` / `~/` / `~\` expansion shim).
// build-reference/app-extracted/.vite/build/index.js:283118 — `Tce()`
// returns `${cE()}/scheduled-tasks`, the directory the
// scheduled-tasks substrate writes into.
// build-reference/app-extracted/.vite/build/index.js:488317, :509032 —
// call sites that pass `taskFilesDir: Tce()` into the
// scheduled-tasks substrate.
//
// Tier 2 reframe: the full flow (login + create a scheduled task and
// read its SKILL.md off disk) is Tier 3. Tier 2's slice is the
// env-propagation half: confirm `CLAUDE_CONFIG_DIR` from `extraEnv`
// actually reaches the main process's `process.env`. If that contract
// breaks, `cE()` falls back to `~/.claude` and every Tier-3
// path-redirection assertion built on top of it silently regresses.
//
// We also opportunistically eval the resolver fingerprint inline (the
// same expression `cE()` and `Tce()` compute) and assert the synthetic
// resolved path lives under our test dir. This is a runtime echo, not
// an introspection of the bundled symbols (`cE` / `Tce` are minified
// closure-locals — not reachable from `globalThis`); the static
// fingerprint of those functions is covered by the asar-grep style
// probes (S26 / S27 family). A future regression where the env stops
// propagating shows up as a hard failure here even though the bundled
// resolver is unchanged.
//
// extraEnv-vs-isolation env precedence: `lib/electron.ts` spreads
// `opts.extraEnv` AFTER `isolation?.env` (line ~317-323), so the
// override here wins over the default isolation's
// `CLAUDE_CONFIG_DIR=<tmp>/config/Claude`. Confirmed by reading
// electron.ts before writing this runner.
//
// No row gate — applies to all rows.
interface ResolverProbe {
homedir: string;
envValue: string | null;
resolvedConfigDir: string;
resolvedScheduledTasksDir: string;
}
test.setTimeout(60_000);
test('S19 — CLAUDE_CONFIG_DIR from extraEnv reaches main process', async ({}, testInfo) => {
testInfo.annotations.push({ type: 'severity', description: 'Could' });
testInfo.annotations.push({
type: 'surface',
description: 'Config dir env var',
});
await testInfo.attach('session-env', {
body: JSON.stringify(captureSessionEnv(), null, 2),
contentType: 'application/json',
});
// Dedicated tmpdir for this test's CLAUDE_CONFIG_DIR override —
// disjoint from the default-isolation tmpdir so a future regression
// where the override path silently falls back to the isolation dir
// is caught (the two paths differ by their tmpdir prefix).
const testDir = await mkdtemp(join(tmpdir(), 'claude-s19-'));
await testInfo.attach('test-config-dir', {
body: testDir,
contentType: 'text/plain',
});
const app = await launchClaude({
extraEnv: { CLAUDE_CONFIG_DIR: testDir },
});
try {
const { inspector } = await app.waitForReady('mainVisible');
// Half 1: env propagation. The bundled `cE()` resolver reads
// `process.env.CLAUDE_CONFIG_DIR` directly — if this doesn't
// equal what we passed in `extraEnv`, every downstream path
// resolution inherits the wrong root.
const observed = await inspector.evalInMain<string | null>(`
return process.env.CLAUDE_CONFIG_DIR ?? null;
`);
await testInfo.attach('observed-claude-config-dir', {
body: observed ?? '(unset)',
contentType: 'text/plain',
});
expect(
observed,
'main process sees CLAUDE_CONFIG_DIR === <test-dir> ' +
'(extraEnv must win over default isolation env)',
).toBe(testDir);
// Half 2: synthetic resolver echo. Re-implement `cE()` /
// `Tce()` in the inspector — same expression the bundled
// code uses, computed against the live main-process env and
// homedir. Captures both the env-propagation fact AND the
// path shape Tce() actually produces, so a future regression
// where someone reroutes scheduled-tasks under a sibling
// folder (e.g. `${cE()}/tasks/`) is visible here.
const probe = await inspector.evalInMain<ResolverProbe>(`
const os = process.mainModule.require('node:os');
const path = process.mainModule.require('node:path');
const envValue = process.env.CLAUDE_CONFIG_DIR ?? null;
const homedir = os.homedir();
const resolveConfigDir = () => {
const e = envValue;
if (
e === '~' ||
(e != null && e.startsWith('~/')) ||
(e != null && e.startsWith('~\\\\'))
) {
return path.join(homedir, e.slice(1));
}
return e ?? path.join(homedir, '.claude');
};
const resolvedConfigDir = resolveConfigDir();
return {
homedir,
envValue,
resolvedConfigDir,
resolvedScheduledTasksDir: path.join(
resolvedConfigDir,
'scheduled-tasks',
),
};
`);
await testInfo.attach('resolver-probe', {
body: JSON.stringify(probe, null, 2),
contentType: 'application/json',
});
expect(
probe.resolvedConfigDir,
'cE()-equivalent resolves to the test dir',
).toBe(testDir);
expect(
probe.resolvedScheduledTasksDir,
'Tce()-equivalent resolves under the test dir',
).toBe(join(testDir, 'scheduled-tasks'));
} finally {
await app.close();
await rm(testDir, { recursive: true, force: true });
}
});
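// Sketch (not part of the harness): the Half 2 echo above can also be
// written as a pure function, which makes the `~` expansion shim
// testable without launching the app. The names `joinPosix`,
// `resolveConfigDir` and `resolveScheduledTasksDir` are hypothetical,
// and `joinPosix` is a simplified stand-in for `path.posix.join`
// (it only handles `/`). Treat this as an illustration of the
// expression under those assumptions, not as the bundled `cE()` /
// `Tce()`.

```typescript
// Simplified stand-in for path.posix.join: trims duplicate slashes at
// the seam only. Assumption: POSIX-style paths.
function joinPosix(base: string, rest: string): string {
  const head = base.replace(/\/+$/, '');
  const tail = rest.replace(/^\/+/, '');
  if (tail === '') return head === '' ? '/' : head;
  return `${head}/${tail}`;
}

// Mirrors the expression the runner evaluates in the main process:
// expand a leading `~`, `~/` or `~\`, otherwise use the env value,
// otherwise fall back to `<homedir>/.claude`.
function resolveConfigDir(envValue: string | null, homedir: string): string {
  if (
    envValue !== null &&
    (envValue === '~' ||
      envValue.startsWith('~/') ||
      envValue.startsWith('~\\'))
  ) {
    return joinPosix(homedir, envValue.slice(1));
  }
  return envValue ?? joinPosix(homedir, '.claude');
}

// Same shape as the Tce()-equivalent: a fixed child of the config dir.
function resolveScheduledTasksDir(configDir: string): string {
  return joinPosix(configDir, 'scheduled-tasks');
}

const home = '/home/tester'; // made-up homedir for the example
console.log(resolveConfigDir(null, home));
console.log(resolveConfigDir('~/cfg', home));
console.log(
  resolveScheduledTasksDir(resolveConfigDir('/tmp/claude-s19-abc', home)),
);
```

// Keeping the homedir as a parameter (instead of calling
// `os.homedir()` inside) is what lets the fallback branch be checked
// with a fixed value.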

@@ -0,0 +1,75 @@
import { test, expect } from '@playwright/test';
import { readAsarFile, resolveAsarPath } from '../lib/asar.js';
// S21 — Lid-close still suspends per OS policy (absence probe).
//
// S20 covers the positive side: "Keep computer awake" calls
// powerSaveBlocker.start('prevent-app-suspension'), which Electron
// maps to a logind inhibit lock with what='idle:sleep'. S21 is the
// negative complement — the app must NOT install any
// `handle-lid-switch` override, otherwise lid-close stops invoking
// logind's `HandleLidSwitch=suspend` policy.
//
// Per the case-doc Code anchors:
// "no `handle-lid-switch` / `HandleLidSwitch` token anywhere in
// `index.js` (verified via grep -nE 'lid|HandleLidSwitch|handle-lid'
// index.js)"
//
// We assert the lowercase D-Bus form (`handle-lid-switch`) and the
// systemd-config form (`HandleLidSwitch`) are both absent. If either
// surfaces in a future bundle, that's a regression worth flagging:
// logind uses the former as an inhibitor-lock `what` token (D-Bus
// side) and the latter as a logind.conf setting (config side), so
// any mention in the bundle implies the app started reasoning about
// lid behavior on its own.
//
// Pure file probe — no app launch. Fast (<1s). Row-independent
// (applies to all laptop hosts; desktops still pass trivially since
// the bundle is identical across rows).
test('S21 — App does not handle lid-switch (file probe / absence)', async ({}, testInfo) => {
testInfo.annotations.push({ type: 'severity', description: 'Critical' });
testInfo.annotations.push({
type: 'surface',
description: 'Suspend inhibitor scope',
});
const asarPath = resolveAsarPath();
await testInfo.attach('asar-path', {
body: asarPath,
contentType: 'text/plain',
});
const indexJs = readAsarFile('.vite/build/index.js', asarPath);
// Two absence checks — the D-Bus inhibit-what form (lowercase,
// hyphenated) and the systemd-logind config-property form. The
// case-doc grep covers both.
const lowerForm = 'handle-lid-switch';
const upperForm = 'HandleLidSwitch';
const lowerPresent = indexJs.includes(lowerForm);
const upperPresent = indexJs.includes(upperForm);
await testInfo.attach('lid-switch-probe', {
body: JSON.stringify(
{
file: '.vite/build/index.js',
checks: [
{ needle: lowerForm, present: lowerPresent },
{ needle: upperForm, present: upperPresent },
],
},
null,
2,
),
contentType: 'application/json',
});
expect(
lowerPresent,
'no `handle-lid-switch` string in bundle (lid-close defers to OS)',
).toBe(false);
expect(
upperPresent,
'no `HandleLidSwitch` string in bundle (lid-close defers to OS)',
).toBe(false);
});
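// Sketch (hypothetical helper, not in the harness): the two
// `includes()` checks above generalize to a small absence probe. The
// `absenceProbe` name and the extra case-insensitive pattern are
// assumptions; the pattern is one possible way to also catch
// spellings such as `handle_lid_switch` or `handleLidSwitch`, which
// the two literal needles would miss.

```typescript
interface NeedleCheck {
  needle: string;
  present: boolean;
}

interface AbsenceReport {
  checks: NeedleCheck[];
  looseMatch: string | null;
  allAbsent: boolean;
}

// Loose spelling net: hyphen, underscore, or no separator, any case.
const LID_SWITCH_LOOSE_RE = /handle[-_]?lid[-_]?switch/i;

// Reports each literal needle plus the first loose-pattern hit, if any.
function absenceProbe(source: string, needles: string[]): AbsenceReport {
  const checks = needles.map((needle) => ({
    needle,
    present: source.includes(needle),
  }));
  const loose = LID_SWITCH_LOOSE_RE.exec(source);
  const looseMatch = loose === null ? null : loose[0];
  return {
    checks,
    looseMatch,
    allAbsent: checks.every((c) => !c.present) && looseMatch === null,
  };
}

// Invented snippets standing in for bundle text.
const cleanBundle = 'powerSaveBlocker.start("prevent-app-suspension")';
const dirtyBundle = 'inhibit("handle_lid_switch")';
const lidNeedles = ['handle-lid-switch', 'HandleLidSwitch'];
console.log(absenceProbe(cleanBundle, lidNeedles).allAbsent);
console.log(absenceProbe(dirtyBundle, lidNeedles).looseMatch);
```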

@@ -0,0 +1,97 @@
import { test, expect } from '@playwright/test';
import { readAsarFile, resolveAsarPath } from '../lib/asar.js';
// S22 — Computer-use toggle is absent or visibly disabled on Linux.
//
// This spec is the **Tier 1 file-level fingerprint** for S22. The
// full surface check — actually walking Settings → Desktop app →
// General and asserting the toggle either doesn't render or renders
// disabled with a "not supported on Linux" hint — is Tier 3 (AX-tree
// form) and lives elsewhere. Here we only verify the upstream
// platform-gate string still exists in the bundle: if it disappears
// or starts including "linux", the gate has changed shape and any
// downstream UI assertion is built on sand.
//
// Per the case-doc Code anchor (platform-integration.md S22):
// `qDA = new Set(["darwin", "win32"])` excludes Linux from the
// computer-use platform set; `TF()` (the master enable check)
// short-circuits to false when `qDA.has(process.platform)` is
// false.
//
// The minified identifier (`qDA` here) rotates between releases —
// we DON'T pin it. Instead we match the stable shape:
// /new Set\(\[\s*"darwin"\s*,\s*"win32"\s*\]\)/
// which tolerates both the no-space minified form
// (`new Set(["darwin","win32"])`) and the with-space beautified form
// (`new Set(["darwin", "win32"])`) the same way our patch-script
// regexes have to.
//
// We also assert that no 2-element Set pairs `"linux"` with
// `"darwin"` or `"win32"` (in either order). Note that
// PLATFORM_SET_RE itself pins the `"darwin", "win32"` order, so an
// upstream re-order fails the positive match and needs a re-ground.
//
// Pure file probe — no app launch. Fast (<1s). Row-independent.
const PLATFORM_SET_RE =
/new Set\(\[\s*"darwin"\s*,\s*"win32"\s*\]\)/;
test('S22 — Computer-use platform gate excludes linux (file probe)', async ({}, testInfo) => {
testInfo.annotations.push({ type: 'severity', description: 'Should' });
testInfo.annotations.push({
type: 'surface',
description: 'Computer use / platform gate',
});
const asarPath = resolveAsarPath();
await testInfo.attach('asar-path', {
body: asarPath,
contentType: 'text/plain',
});
const indexJs = readAsarFile('.vite/build/index.js', asarPath);
const match = indexJs.match(PLATFORM_SET_RE);
const found = match !== null;
// If one member of the 2-element gate was swapped for "linux",
// that's a real behavior change, so flag it. We sniff for a
// 2-element `new Set([...])` that pairs "linux" with darwin or
// win32, in either order.
//
// Note: a 3-element Set `["darwin","win32","linux"]` exists
// elsewhere in the bundle for an unrelated feature (telemetry /
// platform-allowlist scope), so we don't flag that shape here —
// the computer-use gate is specifically the 2-element one per
// the case-doc anchor.
const linuxPairedRe =
/new Set\(\[\s*"(?:linux"\s*,\s*"(?:darwin|win32)|(?:darwin|win32)"\s*,\s*"linux)"\s*\]\)/;
const linuxPaired = linuxPairedRe.test(indexJs);
await testInfo.attach('platform-gate-probe', {
body: JSON.stringify(
{
file: '.vite/build/index.js',
regex: PLATFORM_SET_RE.source,
found,
matchSnippet: match ? match[0] : null,
linuxPaired,
},
null,
2,
),
contentType: 'application/json',
});
expect(
found,
'app.asar contains a `new Set(["darwin","win32"])` platform ' +
'gate (computer-use excludes Linux)',
).toBe(true);
expect(
linuxPaired,
'no 2-element `new Set([..., "linux", ...])` platform gate ' +
'exists (would mean upstream re-enabled computer-use ' +
'on Linux)',
).toBe(false);
});
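// Sketch (hypothetical, not in the harness): PLATFORM_SET_RE pins
// the `"darwin", "win32"` order. One order-insensitive alternative,
// assuming the gate stays a `new Set([...])` of double-quoted string
// literals, is to extract every such Set and compare members as
// sets. `extractStringSets` and `sameMembers` are illustrative
// names, and the sample text is invented.

```typescript
// Collect the members of every `new Set(["a","b",...])` whose
// elements are all double-quoted string literals.
function extractStringSets(source: string): string[][] {
  const setRe = /new Set\(\[\s*((?:"[^"]*"\s*,\s*)*"[^"]*")\s*\]\)/g;
  const found: string[][] = [];
  let m: RegExpExecArray | null;
  while ((m = setRe.exec(source)) !== null) {
    const literals = m[1].match(/"[^"]*"/g) ?? [];
    // Strip the surrounding quotes from each literal.
    found.push(literals.map((lit) => lit.slice(1, -1)));
  }
  return found;
}

// True when `actual` has exactly the members of `expected`, in any order.
function sameMembers(actual: string[], expected: string[]): boolean {
  return (
    actual.length === expected.length &&
    expected.every((e) => actual.includes(e))
  );
}

const sampleSource =
  'const a=new Set(["win32", "darwin"]);' +
  'const b=new Set(["darwin","win32","linux"]);';
const stringSets = extractStringSets(sampleSource);
const gatePresent = stringSets.some((s) =>
  sameMembers(s, ['darwin', 'win32']),
);
const linuxInTwoElementSet = stringSets.some(
  (s) => s.length === 2 && s.includes('linux'),
);
console.log({ gatePresent, linuxInTwoElementSet });
```

// The trade-off: this accepts a re-ordered gate silently, whereas the
// pinned regex turns a re-order into a visible failure.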

@@ -0,0 +1,207 @@
import { test, expect } from '@playwright/test';
import { join } from 'node:path';
import { launchClaude } from '../lib/electron.js';
import { createIsolation, type Isolation } from '../lib/isolation.js';
import { captureSessionEnv } from '../lib/diagnostics.js';
// S25 — Mobile pairing survives Linux session restart (Tier 2 slice).
//
// Full S25 (case-doc platform-integration.md:250) is a Tier 3 mobile-
// pairing flow needing a paired phone. The Linux-side persistence
// half is independently testable: upstream caches the trusted-device
// token via `safeStorage.encryptString` (libsecret on Linux) so a
// successful pair survives restart without re-enrolling. The
// load-bearing contract on Linux is "encrypt-decrypt round-trip is
// stable across an Electron process restart against the system
// keyring backend." That's what this runner exercises.
//
// Code anchors (case-doc S25):
// - index.js:511984 — ZEe = "coworkTrustedDeviceToken" electron-
// store key for the trusted-device token.
// - index.js:511989 — oYn() writes via safeStorage.encryptString
// (libsecret on Linux); aYn() (:512003) decrypts on read.
// - index.js:512022 — gYn() re-enrolls via POST /api/auth/
// trusted_devices only when there's no cached token.
//
// Approach: bypass electron-store entirely. The store is incidental —
// what's load-bearing is that the keyring resolves the same encryption
// key between launches. We:
// 1. Fresh isolation handle (clean state — no seedFromHost; this
// isn't an auth test).
// 2. Launch 1, check safeStorage.isEncryptionAvailable() (skip if
// false — common on headless rows / no keyring backend).
// 3. Encrypt a known plaintext via safeStorage.encryptString, write
// the ciphertext bytes to ${configDir}/test-token.bin, close.
// 4. Launch 2, read ${configDir}/test-token.bin, decrypt via
// safeStorage.decryptString, assert decrypted text equals
// plaintext.
// 5. Cleanup the isolation handle (we own it — passing it to
// launchClaude doesn't transfer ownership).
//
// Why compare decrypted plaintext, not ciphertext: the round trip
// is the contract. The ciphertext bytes are an implementation
// detail of the safeStorage backend (key derivation, IV handling,
// version prefix), so this runner does not pin them.
const PLAINTEXT = 'S25-trusted-device-token-' + Date.now();
const TOKEN_FILE_NAME = 'test-token.bin';
// Two launches at ~60s each plus settle / waitForReady budget.
test.setTimeout(180_000);
test('S25 — safeStorage token round-trip survives app restart', async ({}, testInfo) => {
testInfo.annotations.push({ type: 'severity', description: 'Should' });
testInfo.annotations.push({
type: 'surface',
description: 'Dispatch pairing persistence',
});
await testInfo.attach('session-env', {
body: JSON.stringify(captureSessionEnv(), null, 2),
contentType: 'application/json',
});
// Fresh isolation, shared across both launches. No seedFromHost:
// the keyring backend belongs to the user session, not to the
// config dir, so a signed-out clean isolation still exercises the
// same code path.
const isolation: Isolation = await createIsolation();
const tokenFile = join(isolation.configDir, TOKEN_FILE_NAME);
let encryptionAvailable = false;
let cipherLen = 0;
try {
// Launch 1: encrypt + write.
const app1 = await launchClaude({ isolation });
try {
const { inspector } = await app1.waitForReady('mainVisible');
encryptionAvailable = await inspector.evalInMain<boolean>(`
const { safeStorage } = process.mainModule.require('electron');
return safeStorage.isEncryptionAvailable();
`);
await testInfo.attach('encryption-available-launch1', {
body: JSON.stringify({ encryptionAvailable }, null, 2),
contentType: 'application/json',
});
if (!encryptionAvailable) {
testInfo.skip(
true,
'safeStorage.isEncryptionAvailable() === false — no ' +
'keyring backend on this row (libsecret/kwallet/' +
'gnome-keyring not running, or running headless)',
);
return;
}
// Encrypt + write to tokenFile inside the main process. The
// ciphertext never crosses the inspector boundary (evalInMain
// returns JSON, and Buffers serialize as { type, data }), so we
// only return its length and the path.
const writeResult = await inspector.evalInMain<{
cipherLen: number;
path: string;
}>(`
const { safeStorage } = process.mainModule.require('electron');
const fs = process.mainModule.require('node:fs');
const cipher = safeStorage.encryptString(${JSON.stringify(PLAINTEXT)});
fs.mkdirSync(${JSON.stringify(isolation.configDir)}, {
recursive: true,
});
fs.writeFileSync(${JSON.stringify(tokenFile)}, cipher);
return { cipherLen: cipher.length, path: ${JSON.stringify(tokenFile)} };
`);
cipherLen = writeResult.cipherLen;
await testInfo.attach('encrypt-and-write', {
body: JSON.stringify(
{
plaintextPreview: PLAINTEXT,
tokenFile: writeResult.path,
cipherLen,
},
null,
2,
),
contentType: 'application/json',
});
// Sanity check: in-session round-trip. Catches the case where
// safeStorage reports available but the backend is broken
// (e.g. locked keyring with no unlock prompt). Without this,
// a backend failure would surface as a launch-2 read error
// that's harder to distinguish from a cross-restart break.
const inSessionRoundTrip = await inspector.evalInMain<string>(`
const { safeStorage } = process.mainModule.require('electron');
const fs = process.mainModule.require('node:fs');
const cipher = fs.readFileSync(${JSON.stringify(tokenFile)});
return safeStorage.decryptString(cipher);
`);
expect(
inSessionRoundTrip,
'in-session encrypt+decrypt round-trip works',
).toBe(PLAINTEXT);
inspector.close();
} finally {
await app1.close();
}
// Launch 2: read + decrypt with the same isolation handle.
const app2 = await launchClaude({ isolation });
let decrypted: string | null = null;
try {
const { inspector } = await app2.waitForReady('mainVisible');
const stillAvailable = await inspector.evalInMain<boolean>(`
const { safeStorage } = process.mainModule.require('electron');
return safeStorage.isEncryptionAvailable();
`);
await testInfo.attach('encryption-available-launch2', {
body: JSON.stringify({ stillAvailable }, null, 2),
contentType: 'application/json',
});
expect(
stillAvailable,
'safeStorage still available on launch 2',
).toBe(true);
decrypted = await inspector.evalInMain<string>(`
const { safeStorage } = process.mainModule.require('electron');
const fs = process.mainModule.require('node:fs');
const cipher = fs.readFileSync(${JSON.stringify(tokenFile)});
return safeStorage.decryptString(cipher);
`);
await testInfo.attach('decrypt-after-restart', {
body: JSON.stringify(
{
tokenFile,
cipherLen,
decrypted,
match: decrypted === PLAINTEXT,
},
null,
2,
),
contentType: 'application/json',
});
inspector.close();
} finally {
await app2.close();
}
expect(
decrypted,
'safeStorage.decryptString returned a value after restart',
).not.toBeNull();
expect(
decrypted,
'decrypted plaintext matches what was written before restart — ' +
'keyring backend resolved the same encryption key across ' +
'process restart',
).toBe(PLAINTEXT);
} finally {
await isolation.cleanup();
}
});

@@ -0,0 +1,265 @@
import { test, expect } from '@playwright/test';
import { execFile } from 'node:child_process';
import { promisify } from 'node:util';
import { readAsarFile, resolveAsarPath } from '../lib/asar.js';
const exec = promisify(execFile);
// S26 — Auto-update is disabled when installed via apt/dnf.
//
// Per docs/testing/cases/distribution.md S26:
// Expected: when installed via the project's APT or DNF repo, the
// in-app auto-update path is suppressed. The app does not download
// replacement binaries (which would race the package manager).
// Updates flow through `apt upgrade` / `dnf upgrade` only. AppImage
// installs may continue to self-update or punt to the user.
//
// The case-doc explicitly flags this as **Missing in build 1.5354.0**:
// no project-side suppression of the upstream auto-update path exists.
// The launcher exports `ELECTRON_FORCE_IS_PACKAGED=true`
// (scripts/launcher-common.sh:249), upstream's Linux gate (`lii()` at
// build-reference/.../index.js:508761-508774) returns true, and the
// code path proceeds to `hA.autoUpdater.setFeedURL(...)` +
// `.checkForUpdates()` unconditionally. The only reason it doesn't
// hit the network today is Electron's Linux `autoUpdater` being
// unimplemented — a happy accident, not a contract. Tracked at
// https://github.com/aaddrick/claude-desktop-debian/issues/567 with
// two candidate fixes (frame-fix-wrapper hook vs. gating
// ELECTRON_FORCE_IS_PACKAGED on package format).
//
// **Regression-detector shape.** This runner pins the current state
// of the bundle so the failing assertion flips to passing the moment
// the project ships a suppression patch (the fix for issue #567 or a
// successor):
//
// 1. Sanity assertion (passes today): `setFeedURL` is present in
// the bundled main-process JS. This proves the upstream
// auto-update code path we'd need to suppress is actually in
// the bundle — without it, the rest of the test would be
// vacuously true.
//
// 2. Suppression assertion (fails today): a project-injected
// suppression marker is present in the bundle. No such marker
// exists yet. The expected fingerprint shape (per the
// issue-#567 thread) is one of:
// - `cdd-disable-auto-update` — an injected comment / sentinel
// string we'd add alongside a no-op patch.
// - `frame-fix-wrapper`-side autoUpdater interception — would
// live in scripts/frame-fix-wrapper.js (not the asar JS
// itself), but the wrapper module is already covered by H02
// for general presence.
// - A `disableAutoUpdates: !0`-shaped override in the bundle
// coming from a new patch in scripts/patches/*.sh.
// We probe for any of these and require at least one to land.
// When a suppression patch ships, update MARKERS below with the
// actual fingerprint so this assertion stays a working drift
// detector instead of becoming a stale TODO.
//
// **Skip behaviour.** Case-doc scopes this to "all DEB/RPM rows" —
// AppImage installs are explicitly carved out ("AppImage installs
// may continue to self-update or punt to the user"). We detect deb
// or rpm install via `dpkg-query -W claude-desktop` and `rpm -q
// claude-desktop`; if neither succeeds, we skip. On hosts where
// both succeed (mixed-tooling dev box), we run — the assertion
// shape is purely about what's in the bundle, not about which
// package manager owns the on-disk binary.
//
// Layer: pure file probe (asar read) + spawn probes for install
// detection. No app launch.
interface ProbeResult {
cmd: string;
exitCode: number | null;
stdout: string;
stderr: string;
}
async function probe(
bin: string,
args: string[],
): Promise<ProbeResult> {
const cmd = `${bin} ${args.join(' ')}`;
try {
const { stdout, stderr } = await exec(bin, args, {
timeout: 5_000,
});
return {
cmd,
exitCode: 0,
stdout: stdout.trim(),
stderr: stderr.trim(),
};
} catch (err) {
const e = err as {
stdout?: string;
stderr?: string;
code?: number | string;
};
const code =
typeof e.code === 'number' ? e.code : null;
return {
cmd,
exitCode: code,
stdout: (e.stdout ?? '').trim(),
stderr: (e.stderr ?? '').trim(),
};
}
}
// Candidate suppression-marker fingerprints. None present today;
// any one of these going green flips the assertion to passing. When
// the fix for issue #567 (or its successor) lands, prune this list
// down to the actual marker so the test is a clean drift detector
// going forward.
//
// We deliberately don't match `disableAutoUpdates` alone — that
// string is ALREADY in the bundle as the enterprise-policy MDM key
// (index.js:140737, :140830 etc), so its presence proves nothing.
// The markers below are shapes that only appear if the project
// injected them.
const SUPPRESSION_MARKERS: { needle: string; rationale: string }[] = [
{
needle: 'cdd-disable-auto-update',
rationale:
'sentinel comment a future scripts/patches/*.sh would ' +
'inject alongside a no-op autoUpdater patch',
},
{
needle: 'cdd-no-auto-update',
rationale:
'alternative sentinel shape consistent with ' +
'cdd-cowork-* / cdd-tray-* naming used elsewhere',
},
{
needle: 'autoUpdater is disabled by claude-desktop-debian',
rationale:
'human-readable log line a frame-fix-wrapper.js-side ' +
'autoUpdater no-op hook would emit on first call',
},
];
test('S26 — Auto-update is disabled when installed via apt/dnf', async (
{},
testInfo,
) => {
testInfo.annotations.push({
type: 'severity',
description: 'Critical',
});
testInfo.annotations.push({
type: 'surface',
description: 'Distribution / auto-update suppression',
});
// Detect install method. S26 only applies to deb/rpm-installed
// hosts per case-doc "Applies to: All DEB/RPM rows".
const dpkgProbe = await probe('dpkg-query', [
'-W',
'-f=${Version}',
'claude-desktop',
]);
const rpmProbe = await probe('rpm', ['-q', 'claude-desktop']);
await testInfo.attach('install-probes', {
body: JSON.stringify(
{
dpkg: {
cmd: dpkgProbe.cmd,
exitCode: dpkgProbe.exitCode,
stdout: dpkgProbe.stdout,
stderr: dpkgProbe.stderr,
},
rpm: {
cmd: rpmProbe.cmd,
exitCode: rpmProbe.exitCode,
stdout: rpmProbe.stdout,
stderr: rpmProbe.stderr,
},
},
null,
2,
),
contentType: 'application/json',
});
const debInstalled = dpkgProbe.exitCode === 0 && !!dpkgProbe.stdout;
const rpmInstalled = rpmProbe.exitCode === 0 && !!rpmProbe.stdout;
const installMethod = debInstalled
? 'deb'
: rpmInstalled
? 'rpm'
: 'none';
await testInfo.attach('install-method', {
body: installMethod,
contentType: 'text/plain',
});
if (!debInstalled && !rpmInstalled) {
test.skip(
true,
'S26 only applies to deb/rpm-installed claude-desktop ' +
'(case-doc scopes to APT/DNF rows; AppImage installs ' +
'are explicitly carved out)',
);
return;
}
const asarPath = resolveAsarPath();
await testInfo.attach('asar-path', {
body: asarPath,
contentType: 'text/plain',
});
const indexJs = readAsarFile('.vite/build/index.js', asarPath);
// Sanity assertion: the upstream autoUpdater code path is in the
// bundle. If `setFeedURL` ever disappears (upstream rewrite,
// module rename), this whole test is vacuous and should be
// re-grounded against the new shape before re-asserting on the
// suppression direction.
const setFeedURLCount = (
indexJs.match(/setFeedURL/g) ?? []
).length;
// Probe each candidate suppression marker.
const markerResults = SUPPRESSION_MARKERS.map((m) => ({
needle: m.needle,
rationale: m.rationale,
found: indexJs.includes(m.needle),
}));
const anyMarkerFound = markerResults.some((r) => r.found);
await testInfo.attach('bundle-evidence', {
body: JSON.stringify(
{
file: '.vite/build/index.js',
setFeedURLOccurrences: setFeedURLCount,
suppressionMarkers: markerResults,
anyMarkerFound,
},
null,
2,
),
contentType: 'application/json',
});
expect(
setFeedURLCount,
'app.asar contains the upstream `setFeedURL` autoUpdater code ' +
'path (sanity check — the thing S26 expects suppressed). ' +
'If this drops to 0 the test is vacuous; re-ground against ' +
'the new bundle shape.',
).toBeGreaterThan(0);
// Core S26 assertion. Today: fails by design — no project-side
// suppression has shipped (#567 open). Flips to passing once a
// suppression patch lands and one of SUPPRESSION_MARKERS matches.
expect(
anyMarkerFound,
'app.asar contains a project-injected auto-update suppression ' +
'marker (deb/rpm installs must not race the package ' +
'manager). Currently absent per case-doc S26 / issue #567 ' +
'— upstream autoUpdater is unhooked on Linux, suppression ' +
'is "accidental" and depends on Electron leaving Linux ' +
'autoUpdater unimplemented.',
).toBe(true);
});
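// Sketch (hypothetical refactor, not in the harness): the install
// detection and marker scan above are pure decisions over probe
// output and bundle text, so they can be pulled into functions and
// exercised with synthetic inputs. `classifyInstall`, `scanMarkers`
// and the `ExitInfo` shape are illustrative names, and the sample
// probe output is made up for the example.

```typescript
type InstallMethod = 'deb' | 'rpm' | 'none';

// Minimal subset of the runner's ProbeResult needed for the decision.
interface ExitInfo {
  exitCode: number | null;
  stdout: string;
}

// A probe counts as "installed" only on exit 0 with non-empty stdout;
// deb wins when both package managers report the package.
function classifyInstall(dpkg: ExitInfo, rpm: ExitInfo): InstallMethod {
  const ok = (p: ExitInfo): boolean => p.exitCode === 0 && p.stdout !== '';
  if (ok(dpkg)) return 'deb';
  if (ok(rpm)) return 'rpm';
  return 'none';
}

// Returns the subset of needles that occur in the bundle text.
function scanMarkers(bundle: string, needles: string[]): string[] {
  return needles.filter((needle) => bundle.includes(needle));
}

// Made-up probe results: dpkg-query unavailable, rpm reports a version.
const dpkgMissing: ExitInfo = { exitCode: null, stdout: '' };
const rpmPresent: ExitInfo = { exitCode: 0, stdout: '1.0.0' };
console.log(classifyInstall(dpkgMissing, rpmPresent));
console.log(
  scanMarkers('a.setFeedURL(u); /* cdd-no-auto-update */', [
    'cdd-disable-auto-update',
    'cdd-no-auto-update',
  ]),
);
```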

@@ -0,0 +1,121 @@
import { test, expect } from '@playwright/test';
import { readAsarFile, resolveAsarPath } from '../lib/asar.js';
// S27 — Plugins install per-user, not into system paths (file probe).
//
// Tier 1 file-level signal that plugin storage is rooted under the
// user's `~/.claude` tree, not under `/usr/share/...` or any other
// system-managed prefix. The full Tier 3 form — install a plugin
// end-to-end and `find /usr -newer /tmp/marker -name '*claude*'` to
// prove nothing landed system-wide — still lives in the case doc as
// a manual step. This spec catches the failure mode where an
// upstream refactor switches the resolver to a system path; the
// runtime install would still need a Tier 3 to catch a path that
// only diverges once the install actually runs.
//
// Three fingerprints, all targeting the SAME plugin storage code
// path documented in extensibility.md S27 anchors at
// `:283107` cE() and `:465815` dx() / `:465821` `installed_plugins.json`:
//
// 1. `installed_plugins.json` is in the bundle. This is the
// idempotency record that `dx()` (= `cE() + "/plugins"`) sits
// atop. Sibling assertion to T11; same surface, narrower claim.
// 2. The bundle contains a homedir+".claude" resolver pattern
// (matches `cE()` at :283107 — `homedir(), ".claude"` paired
// string-literally regardless of the minified function name).
// Anchors the per-user claim independent of `cE()`'s rotating
// identifier.
// 3. The bundle contains NO `/usr/share/claude/plugins`,
// `/usr/share/claude-desktop/plugins`, `/etc/claude/plugins`,
// or `/var/lib/claude/plugins` strings. (`/etc/claude-code`
// and `/etc/claude/vertex-sa.json` exist for unrelated
// subsystems — managed-settings lookup at :465788 and Vertex
// AI fallback at :139930 — neither is the plugin store. The
// forbidden list is scoped to `*/plugins` to avoid matching
// those.)
//
// The "per-user" claim is structural: (1) confirms the bundle ships
// plugin storage at all, (2) confirms the resolver is homedir-based,
// (3) rules out the obvious system-path alternatives. Together they
// pin the Tier 1 surface; runtime confirmation stays in the case doc.
//
// Pure file probe — no app launch. Fast (<1s). Row-independent.
test('S27 — plugins install path resolves to ~/.claude, not system paths', async ({}, testInfo) => {
testInfo.annotations.push({ type: 'severity', description: 'Should' });
testInfo.annotations.push({
type: 'surface',
description: 'Plugin install / per-user storage',
});
const asarPath = resolveAsarPath();
await testInfo.attach('asar-path', {
body: asarPath,
contentType: 'text/plain',
});
const indexJs = readAsarFile('.vite/build/index.js', asarPath);
// (1) Plugin storage is in the bundle at all. `installed_plugins.json`
// is the idempotency record stored under `dx()`; sibling
// fingerprint to T11 plugin-install, narrower claim here.
const installedPluginsRecord = indexJs.includes(
'installed_plugins.json',
);
// (2) Homedir-based resolver pattern — matches `cE()` at :283107
// (`homedir(), ".claude"`) without depending on the minified
// function name `cE`, which rotates every release. The regex
// tolerates the import alias on `os` (`yi` / `zc` etc.) by
// anchoring on the call shape `<ident>.homedir(),<ws>".claude"`.
const homedirResolverRe = /\.homedir\(\)\s*,\s*"\.claude"/;
const homedirResolverPresent = homedirResolverRe.test(indexJs);
// (3) No system-path plugin store. Scoped to `*/plugins` so that
// unrelated /etc/claude-code (managed-settings) and
// /etc/claude/vertex-sa.json (Vertex AI fallback) don't trip
// this — neither is on the plugin install code path.
const FORBIDDEN_SYSTEM_PATHS = [
'/usr/share/claude/plugins',
'/usr/share/claude-desktop/plugins',
'/usr/lib/claude/plugins',
'/usr/lib/claude-desktop/plugins',
'/usr/local/share/claude/plugins',
'/etc/claude/plugins',
'/etc/claude-desktop/plugins',
'/var/lib/claude/plugins',
'/opt/Claude/plugins',
'/opt/claude-desktop/plugins',
];
const systemPathHits = FORBIDDEN_SYSTEM_PATHS.filter((p) =>
indexJs.includes(p),
);
await testInfo.attach('s27-evidence', {
body: JSON.stringify(
{
installedPluginsRecord,
homedirResolverPresent,
homedirResolverRegex: homedirResolverRe.source,
systemPathHits,
forbiddenChecked: FORBIDDEN_SYSTEM_PATHS,
},
null,
2,
),
contentType: 'application/json',
});
expect(
installedPluginsRecord,
'app.asar contains `installed_plugins.json` (plugin storage record at extensibility.md S27 anchor :465821)',
).toBe(true);
expect(
homedirResolverPresent,
'app.asar contains a `homedir(), ".claude"` resolver pattern (cE() at extensibility.md S27 anchor :283107)',
).toBe(true);
expect(
systemPathHits,
'app.asar contains no `*/plugins` system-path strings (S27 per-user-only invariant)',
).toEqual([]);
});
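// Sketch (synthetic inputs, not real bundle text): the resolver
// regex from check (2) is meant to survive both minified and
// beautified output and a rotating `os` alias. The sample strings
// below are invented for illustration; `yi` and `zc` are just the
// alias examples mentioned in the comment above.

```typescript
const HOMEDIR_RESOLVER_RE = /\.homedir\(\)\s*,\s*"\.claude"/;

// Invented snippets shaped like a minified and a beautified call site.
const minifiedSample = 'return e??yi.join(zc.homedir(),".claude")';
const beautifiedSample = 'return path.join(os.homedir(), ".claude");';
// A single-quoted literal is NOT matched: the regex pins double quotes.
const singleQuotedSample = "path.join(os.homedir(), '.claude')";

for (const sampleText of [
  minifiedSample,
  beautifiedSample,
  singleQuotedSample,
]) {
  console.log(HOMEDIR_RESOLVER_RE.test(sampleText), sampleText);
}
```

// If a future bundler emits single-quoted or template literals, the
// regex would need a quote class, which is worth knowing before
// reading a failure here as a real resolver change.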

@@ -0,0 +1,161 @@
import { test, expect } from '@playwright/test';
import { readAsarFile, resolveAsarPath } from '../lib/asar.js';
import { captureSessionEnv } from '../lib/diagnostics.js';
// S28 — Worktree creation surfaces clear error on read-only mounts
// (file-probe form).
//
// Per docs/testing/cases/extensibility.md S28: when a project sits on
// a read-only mount and the user tries to start a parallel session,
// worktree creation must fail with a clear error pointing at the
// read-only mount — no silent loss, no parent-repo corruption. The
// case-doc anchor (`build-reference/.../index.js:462760` `Sbn()`) is
// the classifier that buckets the underlying git error into
// `"permission-denied"` for the read-only-mount taxonomy.
//
// **Tier reclassification.** A Tier 2 inspector-eval against `Sbn()`
// with a synthetic error would be the natural shape. In practice
// `Sbn` is a closure-local in the bundled main process, not
// reachable from the inspector without an IPC surface that calls
// into it, and no such surface is exposed by the case-doc anchors.
// So we drop one tier further: a
// pure asar fingerprint that pins the classifier's input strings and
// output bucket together with the worktree-failure log line they're
// wired into. If upstream reshapes the classifier (renames the bucket,
// drops one of the input matches, or unwires the worktree path from
// the bucketing call), this test fails — which is exactly the drift
// signal the higher-tier form would catch via a synthetic error.
//
// The full Tier 3 surface — actual read-only mount, parallel session,
// dialog text scrape — stays in the case doc as a manual repro.
//
// Fingerprint shape (single regex matches all four strings together
// in the same `Sbn()` return expression, identifier-agnostic):
//
// <id>.includes("Permission denied") ||
// <id>.includes("Access is denied") ||
// <id>.includes("could not lock config file")
// ? "permission-denied"
//
// where `<id>` is `e` in the beautified source but rotates between
// releases. We anchor on the call shape and the literal strings, not
// the identifier. Whitespace is tolerated to handle both the
// minified runtime form and the beautified build-reference form.
//
// Sibling assertion: the `Failed to create git worktree:` log line
// (case-doc anchor :462928, `R.error("Failed to create git worktree:
// …")`) is present in the same file. This is the call site whose
// error string Sbn() classifies — without it, the classifier exists
// in isolation and the contract S28 cares about (read-only mount →
// permission-denied bucket on the worktree creation path) is broken.
//
// Pure file probe — no app launch. Fast (<1s). Row-independent.
const PERMISSION_DENIED_CLASSIFIER_RE =
/(\w+)\.includes\(\s*"Permission denied"\s*\)\s*\|\|\s*\1\.includes\(\s*"Access is denied"\s*\)\s*\|\|\s*\1\.includes\(\s*"could not lock config file"\s*\)\s*\?\s*"permission-denied"/;
const WORKTREE_FAILURE_LOG_RE =
/Failed to create git worktree:/;
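// A minimal sketch of why the single regex above tolerates both bundle
// forms. The sample strings are invented for illustration (they are not
// copied from the upstream bundle), and `SAMPLE_CLASSIFIER_RE` only
// repeats the pattern so the snippet stands alone.

```typescript
const SAMPLE_CLASSIFIER_RE =
  /(\w+)\.includes\(\s*"Permission denied"\s*\)\s*\|\|\s*\1\.includes\(\s*"Access is denied"\s*\)\s*\|\|\s*\1\.includes\(\s*"could not lock config file"\s*\)\s*\?\s*"permission-denied"/;

// Hypothetical minified form: no whitespace, identifier `e`.
const sampleMinified =
  'e.includes("Permission denied")||e.includes("Access is denied")||' +
  'e.includes("could not lock config file")?"permission-denied":"unknown"';

// Hypothetical beautified form: line breaks and indentation, identifier `t`.
const sampleBeautified = [
  'return t.includes("Permission denied") ||',
  '    t.includes("Access is denied") ||',
  '    t.includes("could not lock config file")',
  '    ? "permission-denied"',
  '    : "unknown";',
].join('\n');

// Mixed receivers must NOT match: the `\1` backreference requires the
// same identifier on all three `.includes(...)` calls.
const sampleMixedIds =
  'a.includes("Permission denied")||b.includes("Access is denied")||' +
  'a.includes("could not lock config file")?"permission-denied":"unknown"';

const classifierSampleResults = {
  minified: SAMPLE_CLASSIFIER_RE.test(sampleMinified),
  beautified: SAMPLE_CLASSIFIER_RE.test(sampleBeautified),
  mixedIds: SAMPLE_CLASSIFIER_RE.test(sampleMixedIds),
};
```

// The backreference is what makes the match identifier-agnostic without
// letting three unrelated `.includes(...)` calls satisfy it.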
test('S28 — worktree permission-denied classifier wired to git worktree failure path (file probe)', async ({}, testInfo) => {
testInfo.annotations.push({ type: 'severity', description: 'Could' });
testInfo.annotations.push({
type: 'surface',
description: 'Worktree permission classifier',
});
await testInfo.attach('session-env', {
body: JSON.stringify(captureSessionEnv(), null, 2),
contentType: 'application/json',
});
const asarPath = resolveAsarPath();
await testInfo.attach('asar-path', {
body: asarPath,
contentType: 'text/plain',
});
const indexJs = readAsarFile('.vite/build/index.js', asarPath);
// (1) Classifier shape — all three input strings + the
// "permission-denied" output bucket appear in the same
// expression. The single regex enforces clustering: the three
// `<id>.includes(...)` calls are joined by `||` and resolve to
// `"permission-denied"` in the same ternary, so we don't need a
// separate proximity window check — the regex IS the cluster
// condition.
const classifierMatch = indexJs.match(PERMISSION_DENIED_CLASSIFIER_RE);
const classifierFound = classifierMatch !== null;
// Surrounding context for the diagnostic attachment — ~200 chars
// either side of the match so a future failure shows what the
// upstream-reshaped classifier looks like.
let classifierContext: string | null = null;
if (classifierMatch && classifierMatch.index !== undefined) {
const start = Math.max(0, classifierMatch.index - 200);
const end = Math.min(
indexJs.length,
classifierMatch.index + classifierMatch[0].length + 200,
);
classifierContext = indexJs.slice(start, end);
}
// (2) The classifier's call site — the `Failed to create git
// worktree:` log line at case-doc anchor :462928. Without this,
// the classifier exists in isolation and S28's contract
// (read-only mount → permission-denied bucket on the worktree
// creation path) is unwired.
const worktreeFailureLogPresent =
WORKTREE_FAILURE_LOG_RE.test(indexJs);
// (3) Sanity: the bucket name itself appears in the bundle. This
// is implied by (1) but we surface it as a separate count so a
// future failure that drops only the regex match is
// distinguishable from one that drops the bucket entirely.
const bucketOccurrences = (
indexJs.match(/"permission-denied"/g) ?? []
).length;
await testInfo.attach('s28-evidence', {
body: JSON.stringify(
{
file: '.vite/build/index.js',
classifierRegex: PERMISSION_DENIED_CLASSIFIER_RE.source,
classifierFound,
classifierMatchSnippet: classifierMatch
? classifierMatch[0]
: null,
classifierContext,
worktreeFailureLogRegex: WORKTREE_FAILURE_LOG_RE.source,
worktreeFailureLogPresent,
permissionDeniedBucketOccurrences: bucketOccurrences,
},
null,
2,
),
contentType: 'application/json',
});
expect(
classifierFound,
'app.asar contains the permission-denied classifier shape ' +
'(`<id>.includes("Permission denied") || ... || ' +
'<id>.includes("could not lock config file") ? ' +
'"permission-denied"`) per extensibility.md S28 anchor :462760',
).toBe(true);
expect(
worktreeFailureLogPresent,
'app.asar contains the `Failed to create git worktree:` log ' +
'line (extensibility.md S28 anchor :462928) — the call site ' +
'whose error string the classifier buckets',
).toBe(true);
expect(
bucketOccurrences,
'app.asar contains the `"permission-denied"` bucket name (sanity ' +
'check — implied by classifier match but surfaced separately ' +
'so a future regression can distinguish a regex-shape change ' +
'from a bucket rename)',
).toBeGreaterThan(0);
});
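// The context-window step in the test above can be sketched as a pure
// helper. `sampleContextAround` and the sample haystack are hypothetical
// names introduced here for illustration; the test itself inlines the
// same clamping arithmetic with a 200-character pad.

```typescript
// Return `pad` characters either side of a regex match, clamped to the
// bounds of the searched string. Returns null when the match carries no
// index (for example, a match produced with the `g` flag).
function sampleContextAround(
  haystack: string,
  match: RegExpMatchArray,
  pad: number,
): string | null {
  if (match.index === undefined) return null;
  const start = Math.max(0, match.index - pad);
  const end = Math.min(
    haystack.length,
    match.index + match[0].length + pad,
  );
  return haystack.slice(start, end);
}

const sampleHaystack = 'aaaaa<<NEEDLE>>bbbbb';
const sampleMatch = sampleHaystack.match(/NEEDLE/);

// Three characters of context on each side.
const sampleNarrow = sampleMatch
  ? sampleContextAround(sampleHaystack, sampleMatch, 3)
  : null;

// A pad larger than the string clamps to the whole string.
const sampleWide = sampleMatch
  ? sampleContextAround(sampleHaystack, sampleMatch, 1000)
  : null;
```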


@@ -0,0 +1,183 @@
import { test, expect } from '@playwright/test';
import { launchClaude } from '../lib/electron.js';
import { skipUnlessRow } from '../lib/row.js';
import { QuickEntry, MainWindow } from '../lib/quickentry.js';
import type { InspectorClient } from '../lib/inspector.js';
import { captureSessionEnv } from '../lib/diagnostics.js';
// S29 — Quick Entry popup is created lazily on first shortcut press
// (closed-to-tray sanity), and the BrowserWindow is reused across
// subsequent presses. Backs QE-4 in
// docs/testing/quick-entry-closeout.md.
//
// Upstream constructs the popup BrowserWindow lazily on first
// shortcut invocation (`if (!Ko || ...) Ko = new BrowserWindow(...)`
// near index.js:515375), so the popup does not need a pre-existing
// main window. This test verifies that when the main window has
// been hidden-to-tray (no window mapped on the desktop), the
// shortcut still successfully creates and shows the popup.
//
// Reuse half: after the first press constructs Ko, every later press
// must hit `Ko.show()` rather than `new BrowserWindow(...)`. The
// interceptor records every `loadFile` call, so a fresh
// construction would push a SECOND entry into `__qeWindows` matching
// the popup selector. We assert the count stays at 1 across the
// hide / re-press cycle. See lib/quickentry.ts:215 for the
// "Ko stays alive" comment.
//
// Subset of S31's QE-9 case but standalone for the closeout matrix
// — S31 covers submit-side correctness, this covers popup-creation
// correctness.
test.setTimeout(60_000);
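// A simplified model of the lazy-create-then-reuse contract described
// above. `SamplePopup` and `samplePressShortcut` are hypothetical
// stand-ins, not upstream code; the real branch constructs an Electron
// BrowserWindow, which this sketch replaces with a plain class.

```typescript
class SamplePopup {
  visible = false;
  destroyed = false;
  show(): void {
    this.visible = true;
  }
  hide(): void {
    this.visible = false;
  }
}

let samplePopupRef: SamplePopup | null = null;
let sampleConstructions = 0;

// Construct the popup only when no live instance exists, then show it.
function samplePressShortcut(): SamplePopup {
  let popup = samplePopupRef;
  if (!popup || popup.destroyed) {
    popup = new SamplePopup();
    samplePopupRef = popup;
    sampleConstructions++;
  }
  popup.show();
  return popup;
}

// First press constructs; hide; second press must reuse.
const sampleFirst = samplePressShortcut();
sampleFirst.hide();
const sampleSecond = samplePressShortcut();
```

// Counting constructions in this model corresponds to counting popup
// entries in `__qeWindows` in the real test.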
test('S29 — Quick Entry popup is created lazily on first shortcut press (closed-to-tray sanity)', async ({}, testInfo) => {
testInfo.annotations.push({ type: 'severity', description: 'Critical' });
testInfo.annotations.push({
type: 'surface',
description: 'Quick Entry popup lifecycle',
});
skipUnlessRow(testInfo, ['KDE-W', 'GNOME-W', 'Ubu-W', 'KDE-X', 'GNOME-X']);
await testInfo.attach('session-env', {
body: JSON.stringify(captureSessionEnv(), null, 2),
contentType: 'application/json',
});
const useHostConfig = process.env.CLAUDE_TEST_USE_HOST_CONFIG === '1';
const app = await launchClaude({
isolation: useHostConfig ? null : undefined,
});
try {
// Wait for main to fully load before hiding it. Without this,
// the inspector probe races the initial `show()` and the
// state we capture isn't representative.
const { inspector } = await app.waitForReady('mainVisible');
const qe = new QuickEntry(inspector);
const mainWin = new MainWindow(inspector);
await qe.installInterceptor();
// Hide-to-tray. The project's frame-fix-wrapper turns the X-button
// close into hide(); we replicate that explicitly so the test
// doesn't depend on simulating window-manager close.
await mainWin.setState('hide');
const hiddenState = await mainWin.getState();
await testInfo.attach('main-state-after-hide', {
body: JSON.stringify(hiddenState, null, 2),
contentType: 'application/json',
});
expect(
hiddenState && !hiddenState.visible,
'main window is not visible after hide-to-tray',
).toBe(true);
// Confirm popup does NOT yet exist (we never triggered the
// shortcut). This is the lazy-creation precondition.
const beforeShortcut = await qe.getPopupWebContents();
expect(
beforeShortcut,
'popup webContents does not exist before first shortcut press',
).toBeNull();
// Trigger Quick Entry. The popup should be lazily constructed
// and made visible even though no main window is mapped.
await qe.openAndWaitReady();
const popupState = await qe.getPopupState();
await testInfo.attach('popup-state-first-press', {
body: JSON.stringify(popupState, null, 2),
contentType: 'application/json',
});
expect(
popupState && popupState.visible,
'popup is visible after first shortcut press from closed-to-tray',
).toBe(true);
// Reuse precondition: exactly one popup-shaped entry sits in
// `__qeWindows` after the first press. The interceptor pushes
// on every loadFile/loadURL, so anything beyond 1 means the
// popup was constructed more than once already.
const popupCountAfterFirst = await countPopupWindows(inspector);
await testInfo.attach('popup-window-count-after-first', {
body: JSON.stringify({ count: popupCountAfterFirst }, null, 2),
contentType: 'application/json',
});
expect(
popupCountAfterFirst,
'exactly one popup BrowserWindow recorded after first shortcut press',
).toBe(1);
// Dismiss the popup directly via the captured ref — no need to
// involve the OS shortcut grab a second time for the dismiss
// step. waitForPopupClosed reads `isVisible()` on the same ref,
// which flips false as soon as `hide()` returns.
await inspector.evalInMain<null>(`
const wins = globalThis.__qeWindows || [];
const popup = wins.find(w => {
if (!w || !w.ref || w.ref.isDestroyed()) return false;
const f = String(w.loadedFile || '');
return f.indexOf('quick-window.html') !== -1
|| f.indexOf('quick_window/') !== -1;
});
if (popup && popup.ref && !popup.ref.isDestroyed()) {
popup.ref.hide();
}
return null;
`);
await qe.waitForPopupClosed(5_000);
// Second shortcut press. Upstream's lazy-init branch must take
// the existing-Ko path here; if it instead constructed a new
// BrowserWindow, `__qeWindows` would gain a second
// quick-window.html entry and the count below would jump to 2.
await qe.openAndWaitReady();
const popupStateSecond = await qe.getPopupState();
await testInfo.attach('popup-state-second-press', {
body: JSON.stringify(popupStateSecond, null, 2),
contentType: 'application/json',
});
expect(
popupStateSecond && popupStateSecond.visible,
'popup is visible after second shortcut press (reuse path)',
).toBe(true);
const popupCountAfterSecond = await countPopupWindows(inspector);
await testInfo.attach('popup-window-count-after-second', {
body: JSON.stringify({ count: popupCountAfterSecond }, null, 2),
contentType: 'application/json',
});
expect(
popupCountAfterSecond,
'popup BrowserWindow is reused — second shortcut press did not ' +
'construct a new window (regression guard for the lifecycle ' +
'comment in lib/quickentry.ts)',
).toBe(1);
inspector.close();
} finally {
await app.close();
}
});
// Count entries in `globalThis.__qeWindows` whose loadFile target
// matches the popup selector. Mirrors the private popupSelector in
// lib/quickentry.ts — kept inline rather than exposing a new helper
// because this is the only caller and the shape is one line.
async function countPopupWindows(inspector: InspectorClient): Promise<number> {
return await inspector.evalInMain<number>(`
const wins = globalThis.__qeWindows || [];
let n = 0;
for (const w of wins) {
if (!w || !w.ref || w.ref.isDestroyed()) continue;
const f = String(w.loadedFile || '');
if (f.indexOf('quick-window.html') !== -1
|| f.indexOf('quick_window/') !== -1) {
n++;
}
}
return n;
`);
}
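// The counting logic above runs as a string inside the main process. A
// host-side sketch of the same filter follows, assuming a simplified
// record shape. `SampleRecordedWindow` is hypothetical: it flattens
// `ref.isDestroyed()` into a boolean so the snippet runs without
// Electron.

```typescript
interface SampleRecordedWindow {
  loadedFile: string | null;
  destroyed: boolean;
}

// Same two substrings the in-process selector looks for.
function sampleIsPopupFile(file: string): boolean {
  return (
    file.indexOf('quick-window.html') !== -1 ||
    file.indexOf('quick_window/') !== -1
  );
}

// Count live entries whose loaded file matches the popup selector.
function sampleCountPopups(
  entries: Array<SampleRecordedWindow | null>,
): number {
  let n = 0;
  for (const w of entries) {
    if (!w || w.destroyed) continue;
    if (sampleIsPopupFile(String(w.loadedFile || ''))) n++;
  }
  return n;
}

const samplePopupCount = sampleCountPopups([
  { loadedFile: '/app/index.html', destroyed: false },
  { loadedFile: '/app/quick-window.html', destroyed: false },
  { loadedFile: '/app/quick-window.html', destroyed: true },
  null,
]);
```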


@@ -0,0 +1,278 @@
import { test, expect } from '@playwright/test';
import { execFile } from 'node:child_process';
import { lstatSync } from 'node:fs';
import { join } from 'node:path';
import { promisify } from 'node:util';
import { launchClaude } from '../lib/electron.js';
import { skipUnlessRow } from '../lib/row.js';
import { QuickEntry } from '../lib/quickentry.js';
import { sleep } from '../lib/retry.js';
import { captureSessionEnv } from '../lib/diagnostics.js';
import { listRegisteredItems } from '../lib/sni.js';
import { getConnectionPid } from '../lib/dbus.js';
const exec = promisify(execFile);
// S30 — Quick Entry shortcut becomes a no-op after full app exit.
// Backs QE-5 in docs/testing/quick-entry-closeout.md.
//
// Electron unregisters the global shortcut on app exit; the
// shortcut becomes a system-level no-op. The failure mode this
// test guards against is "ghost respawn" — where some part of the
// system (autostart, lingering daemon) starts a new instance in
// response to the keypress.
//
// After app.close() the inspector is gone; verification is
// pgrep-based: baseline whatever app.asar processes remain after
// close, then assert that no NEW app.asar process appears in a 3s
// window after injection.
//
// Beyond the ghost-respawn delta, this test also asserts a clean
// shutdown: no leftover cowork-vm-service pid, no SNI item still
// registered against launchedPid, and (under isolation) no
// SingletonLock symlink left behind in the per-test config dir.
// These come BEFORE the post-exit shortcut press so the order is
// "did exit clean → did the keypress respawn anything" — both
// failure shapes are observable from the same fixture.
test.setTimeout(45_000);
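// A small sketch of why the SingletonLock check mentioned above needs
// lstat rather than an existence check. The temp directory and the
// dangling link target are made up for illustration; only the Node
// `node:fs` calls are real. Assumes a platform where unprivileged
// symlink creation works (Linux, which is the harness target).

```typescript
import {
  existsSync,
  lstatSync,
  mkdtempSync,
  rmSync,
  symlinkSync,
} from 'node:fs';
import { tmpdir } from 'node:os';
import { join } from 'node:path';

// True when a directory entry exists at `path`, even if it is a symlink
// whose target is missing.
function sampleLinkPresent(path: string): boolean {
  try {
    lstatSync(path);
    return true;
  } catch {
    return false;
  }
}

const sampleDir = mkdtempSync(join(tmpdir(), 'lock-sketch-'));
const sampleLock = join(sampleDir, 'SingletonLock');

// Dangling symlink: the target does not exist.
symlinkSync('no-such-target-for-sketch', sampleLock);

const sampleExistsFollowed = existsSync(sampleLock); // follows the link
const sampleExistsLstat = sampleLinkPresent(sampleLock); // inspects the link

rmSync(sampleDir, { recursive: true, force: true });
```

// With a stale target the two probes disagree, which is the
// broken-but-present case the leak check is meant to catch.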
async function pgrepPids(pattern: string): Promise<Set<number>> {
try {
const { stdout } = await exec('pgrep', ['-f', pattern], {
timeout: 5_000,
});
return new Set(
stdout
.split('\n')
.map((l) => parseInt(l.trim(), 10))
.filter((n) => !Number.isNaN(n)),
);
} catch (err) {
// pgrep exits 1 when no matches, with empty stdout. Treat
// that as the empty set; everything else (missing binary,
// timeout, bad pattern) propagates.
const e = err as { code?: number | string | null };
if (e.code === 1) return new Set();
throw err;
}
}
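// The stdout parsing and the before/after delta used by this spec can
// be sketched as two pure helpers. `sampleParsePidLines` and
// `sampleNewPids` are hypothetical names for illustration; the spec
// inlines the same logic.

```typescript
// Parse pgrep-style output (one pid per line) into a set, ignoring
// blank lines and anything that is not a number.
function sampleParsePidLines(stdout: string): Set<number> {
  return new Set(
    stdout
      .split('\n')
      .map((line) => parseInt(line.trim(), 10))
      .filter((n) => !Number.isNaN(n)),
  );
}

// Pids present in `current` that were not in `baseline`.
function sampleNewPids(
  baseline: Set<number>,
  current: Set<number>,
): number[] {
  return Array.from(current).filter((pid) => !baseline.has(pid));
}

const samplePids = sampleParsePidLines('1234\n  5678 \n\nnot-a-pid\n');
const sampleDelta = sampleNewPids(
  new Set([100, 200]),
  new Set([200, 300]),
);
```

// Asserting on the delta rather than on an empty process list is what
// lets leftover renderer or zygote processes coexist with a passing run.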
test('S30 — Quick Entry shortcut becomes a no-op after full app exit', async ({}, testInfo) => {
testInfo.annotations.push({ type: 'severity', description: 'Should' });
testInfo.annotations.push({
type: 'surface',
description: 'Global shortcut unregistration',
});
skipUnlessRow(testInfo, ['KDE-W', 'GNOME-W', 'Ubu-W', 'KDE-X', 'GNOME-X']);
await testInfo.attach('session-env', {
body: JSON.stringify(captureSessionEnv(), null, 2),
contentType: 'application/json',
});
const useHostConfig = process.env.CLAUDE_TEST_USE_HOST_CONFIG === '1';
const app = await launchClaude({
isolation: useHostConfig ? null : undefined,
});
const launchedPid = app.pid;
// Need an inspector handle just long enough to confirm a working
// shortcut registration. We use it to verify the popup CAN open
// before exit, so the post-exit no-op result is meaningful.
try {
// mainVisible covers main-shell readiness — triggering the
// shortcut before show() races the popup-show flow (loadFile
// + ready-to-show + show()) and the popup never becomes
// visible.
const { inspector } = await app.waitForReady('mainVisible');
const qe = new QuickEntry(inspector);
await qe.installInterceptor();
// Confirm shortcut is wired by invoking it and waiting for
// the popup to appear. openAndWaitReady retries through the
// upstream lHn() race (build-reference index.js:515604) where
// the first shortcut after main-visible is sometimes too
// early for the user object to have populated.
await qe.openAndWaitReady();
inspector.close();
} catch (err) {
await testInfo.attach('preflight-error', {
body: err instanceof Error ? err.stack ?? err.message : String(err),
contentType: 'text/plain',
});
await app.close();
throw new Error(
'Preflight failed: shortcut did not produce a popup. Cannot ' +
'verify post-exit no-op without a working pre-exit baseline.',
);
}
// Full exit. close() sends SIGTERM then SIGKILL after 5s. Note:
// renderer / zygote child processes may linger briefly after the
// main process exits — they're harmless leftovers, not "ghost
// respawns." The spec's regression target is "no NEW process
// from the shortcut," so we baseline whatever's left before
// injecting and assert the delta.
await app.close();
// Give the kernel a moment to reap.
await sleep(500);
const baselinePids = await pgrepPids('app\\.asar');
await testInfo.attach('baseline-pids-after-close', {
body: JSON.stringify(
{
launchedPid,
pidsRemaining: Array.from(baselinePids),
note:
'leftover renderer/zygote processes are harmless; the ' +
'regression target is "no NEW pid spawned by the ' +
'shortcut press", asserted as a delta below.',
},
null,
2,
),
contentType: 'application/json',
});
// Closeout leak checks. These probe "did the app exit clean"
// rather than "did the post-exit shortcut respawn anything" —
// distinct failure shapes, observed from the same fixture.
// Run BEFORE the shortcut injection so a respawn can't taint
// any of these signals.
// (a) No leftover cowork-vm-service pids. Pre-launch cleanup
// pkills these (cleanupPreLaunch in lib/electron.ts); a clean
// shutdown should have already torn them down.
const coworkPids = await pgrepPids('cowork-vm-service\\.js');
const coworkPidsRemaining = Array.from(coworkPids);
// (b) SNI item is deregistered. The connection should be gone
// post-exit, so getConnectionPid against the formerly-owned
// service may throw with NameHasNoOwner — treat that as "not
// present", which is the desired state.
let sniItemPresent = false;
try {
const items = await listRegisteredItems();
for (const item of items) {
try {
const pid = await getConnectionPid(item.service);
if (pid === launchedPid) {
sniItemPresent = true;
break;
}
} catch {
// owner gone — that's "not present" for this item
}
}
} catch {
// watcher itself may not be running on this row; absence
// of a watcher means nothing's registered, which is fine.
}
// (c) SingletonLock symlink is removed (isolation only).
// Under CLAUDE_TEST_USE_HOST_CONFIG the host owns its lock;
// don't probe it. Use lstatSync because SingletonLock is a
// symlink whose target may be stale — existsSync would follow
// the link and miss broken-but-present cases.
let singletonLockPresent = false;
if (app.isolation) {
const lockPath = join(app.isolation.configDir, 'SingletonLock');
try {
lstatSync(lockPath);
singletonLockPresent = true;
} catch {
// ENOENT — clean
}
}
await testInfo.attach('closeout-leak-check', {
body: JSON.stringify(
{
coworkPidsRemaining,
sniItemPresent,
singletonLockPresent,
launchedPid,
isolationConfigDir: app.isolation?.configDir ?? null,
useHostConfig,
},
null,
2,
),
contentType: 'application/json',
});
expect(
coworkPidsRemaining,
'no cowork-vm-service pids remain after app.close()',
).toEqual([]);
expect(
sniItemPresent,
'no SNI item still registered against launchedPid after app.close()',
).toBe(false);
expect(
singletonLockPresent,
'no SingletonLock symlink remains under isolation configDir after app.close()',
).toBe(false);
// Inject the shortcut. ydotool injects at the kernel (uinput)
// level, so the keys go out regardless of who's listening. We
// can't use
// QuickEntry.openViaShortcut here — that's a class method that
// exists for tests with a live inspector — so we shell out
// directly. Same key sequence (Ctrl+Alt+Space).
try {
await exec(
'ydotool',
['key', '29:1', '56:1', '57:1', '57:0', '56:0', '29:0'],
{
env: {
...process.env,
YDOTOOL_SOCKET:
process.env.YDOTOOL_SOCKET ?? '/tmp/.ydotool_socket',
} as Record<string, string>,
timeout: 5_000,
},
);
} catch (err) {
await testInfo.attach('ydotool-error', {
body: err instanceof Error ? err.message : String(err),
contentType: 'text/plain',
});
throw err;
}
// Wait through the window during which a respawn could occur.
await sleep(3_000);
const postShortcutPids = await pgrepPids('app\\.asar');
const newPids = Array.from(postShortcutPids).filter(
(p) => !baselinePids.has(p),
);
await testInfo.attach('post-shortcut-pgrep', {
body: JSON.stringify(
{
baseline: Array.from(baselinePids),
postShortcut: Array.from(postShortcutPids),
newPids,
note: 'A non-empty newPids set indicates a ghost respawn — ' +
'autostart, service-supervisor, or the OS shortcut ' +
'binding launching a fresh instance.',
},
null,
2,
),
contentType: 'application/json',
});
expect(
newPids,
'no NEW claude-desktop pid appears 3s after post-exit shortcut press',
).toEqual([]);
});
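// The ydotool argument list in the test above follows a
// press-in-order, release-in-reverse pattern. A sketch of a builder for
// that shape follows; `sampleChordArgs` is a hypothetical helper, and
// the keycodes are the Linux input event codes for left Ctrl (29),
// left Alt (56) and Space (57).

```typescript
// Build `ydotool key` arguments for a chord: every key goes down in the
// given order (`code:1`), then comes back up in reverse order (`code:0`).
function sampleChordArgs(keycodes: number[]): string[] {
  const down = keycodes.map((code) => `${code}:1`);
  const up = [...keycodes].reverse().map((code) => `${code}:0`);
  return ['key', ...down, ...up];
}

const sampleCtrlAltSpace = sampleChordArgs([29, 56, 57]);
```

// Releasing in reverse order keeps the modifiers held for the full
// duration of the Space press.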


@@ -0,0 +1,201 @@
import { test, expect } from '@playwright/test';
import { launchClaude } from '../lib/electron.js';
import { skipUnlessRow } from '../lib/row.js';
import {
QuickEntry,
MainWindow,
waitForNewChat,
} from '../lib/quickentry.js';
import { sleep } from '../lib/retry.js';
import { captureSessionEnv } from '../lib/diagnostics.js';
// S31 — Quick Entry submit makes the new chat reachable from any
// main-window state. Backs QE-7, QE-8, QE-9, QE-10 in
// docs/testing/quick-entry-closeout.md (covers #393 close-out).
//
// Layered assertion per the closeout's "test as black-box" guidance:
// - LOCAL (Critical): popup opens after the shortcut AND popup
// closes within ~5s of submit. Per QE-13, upstream silently
// drops <3-char inputs without dismissing the popup, so
// "popup closed" is the upstream-defined "submit accepted"
// signal — pure local, no minified-symbol introspection.
// - NETWORK (Should-not-Critical): a /chat/<uuid> URL loaded into
// the claude.ai webContents within 15s. Coupled to claude.ai
// reachability + chat-creation API latency; a failure here on
// its own does NOT block the row.
//
// Sign-in: requires real signed-in claude.ai state. Default isolation
// gives a fresh CLAUDE_CONFIG_DIR with no auth tokens, so set
// CLAUDE_TEST_USE_HOST_CONFIG=1 to share ~/.config/Claude with the
// host (which carries the signed-in account on test VMs). The runner
// skips with a clear message if claude.ai never loads.
//
// QE-10 (workspace) requires WM-specific helpers (wmctrl / swaymsg /
// kdotool) and is deferred — see TODO at the bottom.
const useHostConfig = process.env.CLAUDE_TEST_USE_HOST_CONFIG === '1';
// 3 scenarios × (~5s open + ~10s submit + up to 15s nav) + 30s startup
// fits in ~120s realistically. Bump the per-test budget so we don't
// race the global default.
test.setTimeout(180_000);
test('S31 — Quick Entry submit reaches new chat from any main-window state', async ({}, testInfo) => {
testInfo.annotations.push({ type: 'severity', description: 'Critical' });
testInfo.annotations.push({
type: 'surface',
description: 'Quick Entry submit / main window',
});
skipUnlessRow(testInfo, ['KDE-W', 'GNOME-W', 'Ubu-W', 'KDE-X', 'GNOME-X']);
await testInfo.attach('session-env', {
body: JSON.stringify(captureSessionEnv(), null, 2),
contentType: 'application/json',
});
const app = await launchClaude({ isolation: useHostConfig ? null : undefined });
try {
// claudeAi level: main visible AND a claude.ai webContents
// exists. Soft-fails (claudeAiUrl absent) when claude.ai
// never loads — typically the not-signed-in case.
const { inspector, claudeAiUrl } = await app.waitForReady('claudeAi');
if (!claudeAiUrl) {
testInfo.skip(
true,
'claude.ai webContents never loaded — likely not signed in. ' +
'Set CLAUDE_TEST_USE_HOST_CONFIG=1 to share host config.',
);
return;
}
const qe = new QuickEntry(inspector);
const mainWin = new MainWindow(inspector);
await qe.installInterceptor();
// Each scenario sets a precondition, then submits a prompt.
// Run them in sequence on the same app instance — the sweep
// pattern has no implicit cross-test cleanup, but the popup
// dismisses cleanly between submits and the main window is
// always returned to a known state.
const scenarios: Array<{ id: string; setup: () => Promise<void> }> = [
{
id: 'QE-7 visible-and-focused',
setup: async () => {
await mainWin.setState('show');
await mainWin.setState('focus');
},
},
{
id: 'QE-8 minimized',
setup: async () => {
await mainWin.setState('show');
await mainWin.setState('minimize');
},
},
{
id: 'QE-9 hidden-to-tray',
setup: async () => {
await mainWin.setState('hide');
},
},
// QE-10 (different workspace) deferred — see TODO below.
// QE-11 / QE-12 (Dash-pinned vs not) is GNOME-only and
// belongs in S32, not here.
];
const results: Array<{
id: string;
popupOpened: boolean;
popupClosed: boolean;
navUrl: string | null;
}> = [];
for (const sc of scenarios) {
const prompt = `s31-${sc.id.split(' ')[0]}-${Date.now()}`;
console.log(`[S31] scenario ${sc.id} → prompt "${prompt}"`);
await sc.setup();
await sleep(250);
// Open popup. ydotool sends the OS-level shortcut; the popup
// should appear within a couple of seconds even with main
// hidden/minimized (closeout doc S29 covers the lazy-create
// path).
let popupOpened = false;
try {
await qe.openAndWaitReady();
popupOpened = true;
} catch (err) {
console.log(
`[S31] ${sc.id} popup-open failed: ${
err instanceof Error ? err.message : String(err)
}`,
);
}
let popupClosed = false;
let navUrl: string | null = null;
if (popupOpened) {
await qe.typeAndSubmit(prompt);
try {
await qe.waitForPopupClosed(8_000);
popupClosed = true;
} catch (err) {
console.log(
`[S31] ${sc.id} popup-close failed: ${
err instanceof Error ? err.message : String(err)
}`,
);
}
navUrl = await waitForNewChat(inspector, 15_000);
}
results.push({ id: sc.id, popupOpened, popupClosed, navUrl });
// Reset main window before next scenario.
await mainWin.setState('show').catch(() => {});
await mainWin.setState('restore').catch(() => {});
}
await testInfo.attach('s31-results', {
body: JSON.stringify(results, null, 2),
contentType: 'application/json',
});
// Critical: popup must open and submit must be accepted (popup
// dismisses) in every scenario. Together these verify the
// shortcut → popup → submit pathway is intact end-to-end on
// the local side.
for (const r of results) {
expect(r.popupOpened, `popup opened for ${r.id}`).toBe(true);
expect(r.popupClosed, `popup closed (submit accepted) for ${r.id}`).toBe(true);
}
// Should-not-Critical assertion — network nav. If claude.ai
// flakes, mark the row Should rather than Critical fail. We
// surface this by only annotating, not failing, when nav misses.
const navMisses = results.filter((r) => !r.navUrl);
if (navMisses.length > 0) {
testInfo.annotations.push({
type: 'should-failure',
description:
`network nav missed for ${navMisses.map((r) => r.id).join(', ')}: ` +
'claude.ai reachability or chat-API latency; not a #393 regression on its own',
});
}
inspector.close();
} finally {
await app.close();
}
});
// TODO: QE-10 (different workspace). Needs WM-specific helpers:
// - X11: wmctrl -s <n> to switch workspace, wmctrl -i -r <wid> -t <n>
// to move main window
// - KDE Wayland: kdotool / kwin-mcp
// - GNOME Wayland: no scriptable workspace API; manual or skip
// - Sway/Hypr/Niri: native CLI (swaymsg, hyprctl, niri msg)
// Add as lib/workspace.ts when the first non-S31 test needs it too;
// premature now.
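// The layered LOCAL / NETWORK split used by S31 can be sketched as a
// pure triage step over the per-scenario results. `sampleTriage` and
// the sample data are hypothetical; the spec expresses the same split
// with `expect` calls for the Critical layer and an annotation for the
// network layer.

```typescript
interface SampleScenarioResult {
  id: string;
  popupOpened: boolean;
  popupClosed: boolean;
  navUrl: string | null;
}

interface SampleTriage {
  criticalFailures: string[]; // would fail the test
  navMisses: string[]; // would only be annotated
}

function sampleTriage(results: SampleScenarioResult[]): SampleTriage {
  return {
    // Local layer: popup must open and must close after submit.
    criticalFailures: results
      .filter((r) => !r.popupOpened || !r.popupClosed)
      .map((r) => r.id),
    // Network layer: a missing /chat/<uuid> URL is reported, not fatal.
    navMisses: results.filter((r) => !r.navUrl).map((r) => r.id),
  };
}

const sampleTriaged = sampleTriage([
  { id: 'QE-7', popupOpened: true, popupClosed: true, navUrl: '/chat/abc' },
  { id: 'QE-8', popupOpened: true, popupClosed: true, navUrl: null },
  { id: 'QE-9', popupOpened: true, popupClosed: false, navUrl: null },
]);
```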


@@ -0,0 +1,193 @@
import { test, expect } from '@playwright/test';
import { launchClaude } from '../lib/electron.js';
import { skipUnlessRow } from '../lib/row.js';
import {
QuickEntry,
MainWindow,
waitForNewChat,
} from '../lib/quickentry.js';
import { retryUntil } from '../lib/retry.js';
import { captureSessionEnv } from '../lib/diagnostics.js';
// S32 — Quick Entry submit on GNOME mutter doesn't trip Electron
// stale-isFocused. Backs QE-11 / QE-12 in
// docs/testing/quick-entry-closeout.md.
//
// Andrej730's #393 root cause: Electron's `BrowserWindow.isFocused()`
// returns stale-true on Linux mutter after `hide()`, which causes
// upstream's `h1() || ut.show()` short-circuit (index.js:515566) to
// skip `show()` — so submit creates a new chat session but the main
// window never reappears, and the chat is unreachable.
//
// Differs from S31 in TWO ways:
// 1. Row-gated to GNOME Wayland (KDE-W is excluded; the post-#406
// patch handles KDE specifically).
// 2. Adds two regression-detector assertions independent of S31:
// (a) the popup is not still visible after submit (the bug
// can also leave Ko on screen because the close-on-dismiss
// handler is downstream of the show() that short-circuits),
// (b) the main window becomes visible (the original symptom
// Andrej730 reported).
// Each assertion is a separate failure shape — popup-stuck and
// main-stuck can occur together or independently.
//
// Expected to FAIL on GNOME-W today until the fix lands (either
// widening the patch beyond KDE, or upstream Electron fixing
// isFocused() on Linux). That's the regression-detector use of this
// test: flip its cell to green once the fix is in.
test.setTimeout(180_000);
test('S32 — Quick Entry submit on GNOME mutter does not trip Electron stale-isFocused', async ({}, testInfo) => {
testInfo.annotations.push({ type: 'severity', description: 'Critical' });
testInfo.annotations.push({
type: 'surface',
description: 'Electron BrowserWindow.isFocused() on Linux',
});
skipUnlessRow(testInfo, ['GNOME-W', 'Ubu-W']);
await testInfo.attach('session-env', {
body: JSON.stringify(captureSessionEnv(), null, 2),
contentType: 'application/json',
});
const useHostConfig = process.env.CLAUDE_TEST_USE_HOST_CONFIG === '1';
const app = await launchClaude({
isolation: useHostConfig ? null : undefined,
});
try {
// claudeAi level — submit makes no sense before claude.ai
// loads. Soft-fails to skip when not signed in.
const { inspector, claudeAiUrl } = await app.waitForReady('claudeAi');
if (!claudeAiUrl) {
testInfo.skip(
true,
'claude.ai webContents never loaded — likely not signed in. ' +
'Set CLAUDE_TEST_USE_HOST_CONFIG=1 to share host config.',
);
return;
}
const qe = new QuickEntry(inspector);
const mainWin = new MainWindow(inspector);
await qe.installInterceptor();
// Reproduce the tray-only state Andrej730 traced.
await mainWin.setState('show');
await retryUntil(
async () => {
const s = await mainWin.getState();
return s && s.visible ? s : null;
},
{ timeout: 5_000, interval: 200 },
);
await mainWin.setState('hide');
const hidden = await mainWin.getState();
await testInfo.attach('main-state-hidden', {
body: JSON.stringify(hidden, null, 2),
contentType: 'application/json',
});
expect(hidden && !hidden.visible, 'main is hidden before submit').toBe(true);
// Submit a prompt. This is the moment the stale-isFocused
// bug bites — h1() returns true (because isFocused() lies),
// so show() is skipped, and main never reappears.
const prompt = `s32-${Date.now()}`;
await qe.openAndWaitReady();
await qe.typeAndSubmit(prompt);
// Capture popup-close outcome instead of swallowing it. The
// pre-fix S31 pattern catches-and-discards because S31 uses
// popupClosed as its Critical assertion already; here we
// want the boolean for an independent assertion below.
let popupClosed = false;
try {
await qe.waitForPopupClosed(8_000);
popupClosed = true;
} catch {
// timeout — leave popupClosed=false; the explicit popup-
// state assertion below will surface the regression shape.
}
// Popup-stuck assertion. The same short-circuit that skips
// `show()` for main can leave the popup on screen because
// the close-on-dismiss path (popup.hide()) sits downstream
// of the show() call that returned early. Treat either
// destroyed (state === null) or hidden (visible === false)
// as "popup not stuck."
const popupStateAfterSubmit = await qe.getPopupState();
await testInfo.attach('popup-state-after-submit', {
body: JSON.stringify(
{
popupClosed,
popupState: popupStateAfterSubmit,
},
null,
2,
),
contentType: 'application/json',
});
const popupNotVisible =
popupStateAfterSubmit === null || !popupStateAfterSubmit.visible;
expect(
popupNotVisible,
'popup is not visible after submit (regression detector ' +
'for the stale-isFocused short-circuit leaving Ko on screen)',
).toBe(true);
// Should signal — chat created (network).
const navUrl = await waitForNewChat(inspector, 15_000);
// Critical signal — main reappears. The stale-isFocused bug
// causes this to remain false even though submit physically
// succeeded.
const mainBecameVisible = await retryUntil(
async () => {
const s = await mainWin.getState();
return s && s.visible ? s : null;
},
{ timeout: 8_000, interval: 200 },
);
await testInfo.attach('s32-result', {
body: JSON.stringify(
{
navUrl,
popupClosed,
popupStateAfterSubmit,
mainBecameVisible: !!mainBecameVisible,
mainStateAfterSubmit: mainBecameVisible,
note: 'GNOME-W today is expected to show navUrl=set ' +
'AND mainBecameVisible=false until the fix lands.',
},
null,
2,
),
contentType: 'application/json',
});
expect(
mainBecameVisible,
'main window becomes visible after Quick Entry submit (no stale-isFocused short-circuit)',
).toBeTruthy();
// Reset. Re-show the main window so any post-test inspector
// activity sees a clean window.
await mainWin.setState('show').catch(() => {});
inspector.close();
} finally {
await app.close();
}
});
// Note on QE-12 (Dash-pinned vs not pinned): the closeout doc says
// the Dash distinction is empirical, not code-driven — upstream has
// no notion of Dash presence. So we only run the not-pinned case
// here (the harder repro from the #393 traces). If the not-pinned
// case green-cells, the pinned case will too. Adding a separate
// scenario for QE-12 specifically would require Dash-pin
// orchestration, which has no scriptable API on GNOME Wayland.
// Treat S32 as covering both QE-11 and QE-12 for the matrix.


@@ -0,0 +1,109 @@
import { test, expect } from '@playwright/test';
import { existsSync, readFileSync } from 'node:fs';
import { dirname, join } from 'node:path';
// S33 — Quick Entry transparent rendering tracked against bundled
// Electron version. Backs QE-18 in docs/testing/quick-entry-closeout.md.
//
// Per @noctuum's bisect on #370, Electron 41.0.4 introduced the
// transparency / opaque-square-frame regression on KDE Wayland. This
// test records the bundled Electron version per row so the matrix
// can correlate S10 outcomes with the version.
//
// Reads from electron/package.json rather than running
// `electron --version`. The bundled Electron binary auto-loads
// resources/app.asar relative to its own path, so `--version` is
// passed through as argv to Claude Desktop instead of being
// intercepted by Electron's flag parser. The package.json is
// canonical and avoids that whole class of issue.
const DEFAULT_ELECTRON_PATHS = [
'/usr/lib/claude-desktop/node_modules/electron/dist/electron',
'/opt/Claude/node_modules/electron/dist/electron',
];
function resolveElectronBin(): string {
const env = process.env.CLAUDE_DESKTOP_ELECTRON;
if (env) return env;
for (const candidate of DEFAULT_ELECTRON_PATHS) {
if (existsSync(candidate)) return candidate;
}
throw new Error(
'Could not locate the bundled Electron binary. Set ' +
'CLAUDE_DESKTOP_ELECTRON or install the deb/rpm package.',
);
}
// electron/package.json sits in the parent of the `dist/` directory
// that holds the binary, i.e. two path segments up from `dist/electron`.
function resolveElectronPkg(electronBin: string): string {
return join(dirname(electronBin), '..', 'package.json');
}
test('S33 — Quick Entry transparent rendering tracked against bundled Electron version', async ({}, testInfo) => {
testInfo.annotations.push({ type: 'severity', description: 'Should' });
testInfo.annotations.push({
type: 'surface',
description: 'Bundled Electron version',
});
const electronBin = resolveElectronBin();
const pkgPath = resolveElectronPkg(electronBin);
await testInfo.attach('electron-bin', {
body: electronBin,
contentType: 'text/plain',
});
await testInfo.attach('electron-package-json-path', {
body: pkgPath,
contentType: 'text/plain',
});
expect(
existsSync(pkgPath),
`electron/package.json exists at ${pkgPath}`,
).toBe(true);
const pkg = JSON.parse(readFileSync(pkgPath, 'utf8')) as {
version?: string;
name?: string;
};
expect(pkg.name, 'package.json is for the electron module').toMatch(
/^electron/,
);
const version = pkg.version ?? '';
await testInfo.attach('electron-version', {
body: version,
contentType: 'text/plain',
});
expect(version, 'package.json version is a non-empty semver').toMatch(
/^\d+\.\d+\.\d+/,
);
// Surface the #370 hypothesis check for matrix-regen.
const [major, minor, patch] = version
.split('.')
.map((n) => parseInt(n, 10));
const bisectThreshold =
major !== undefined &&
minor !== undefined &&
patch !== undefined &&
(major > 41 ||
(major === 41 && minor > 0) ||
(major === 41 && minor === 0 && patch >= 4));
await testInfo.attach('bisect-context', {
body: JSON.stringify(
{
version,
atOrAboveBisectThreshold: bisectThreshold,
bisectNote:
'electron/electron#50213; #370 expected to reproduce on >= 41.0.4 ' +
'until upstream ships a CSD-rendering fix',
},
null,
2,
),
contentType: 'application/json',
});
});
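
The bisect-threshold check in S33 compares the three version components by hand. A minimal sketch of the same comparison as a reusable helper follows. The names `parseTriple` and `isAtOrAbove` are hypothetical and not part of the harness. As in the spec, `parseInt` stops at the first non-digit, so a prerelease suffix on the patch component is ignored.

```typescript
// Hypothetical helpers, not part of the harness.
type Triple = [number, number, number];

// Parse "major.minor.patch" into numbers; returns null when any of the
// three components is missing or not numeric.
function parseTriple(version: string): Triple | null {
  const [major, minor, patch] = version
    .split('.')
    .map((n) => parseInt(n, 10));
  if (
    major === undefined ||
    minor === undefined ||
    patch === undefined ||
    Number.isNaN(major) ||
    Number.isNaN(minor) ||
    Number.isNaN(patch)
  ) {
    return null;
  }
  return [major, minor, patch];
}

// True when `version` is greater than or equal to `threshold`,
// comparing major first, then minor, then patch.
function isAtOrAbove(version: string, threshold: Triple): boolean {
  const parsed = parseTriple(version);
  if (!parsed) return false;
  for (let i = 0; i < 3; i++) {
    const a = parsed[i]!;
    const b = threshold[i]!;
    if (a !== b) return a > b;
  }
  return true; // all three components are equal
}
```

With a helper of this shape, the inline expression would reduce to `isAtOrAbove(version, [41, 0, 4])`. An unparseable version is treated as below the threshold, which is a choice of this sketch rather than something the spec states.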


@@ -0,0 +1,159 @@
import { test, expect } from '@playwright/test';
import { launchClaude } from '../lib/electron.js';
import { skipUnlessRow, currentRow } from '../lib/row.js';
import { QuickEntry, MainWindow } from '../lib/quickentry.js';
import { retryUntil, sleep } from '../lib/retry.js';
import { captureSessionEnv } from '../lib/diagnostics.js';
// S34 — Quick Entry shortcut focuses fullscreen main window instead
// of showing popup. Backs QE-1b in
// docs/testing/quick-entry-closeout.md.
//
// Upstream contract (build-reference index.js:525287-525290):
// `if (ut.isFullScreen()) { ut.focus(); ide(); } else { showPopup(); }`
// — when the main window is fullscreen, the shortcut focuses main
// instead of showing the popup. Intentional UX: assumes the user
// wants to interact with the existing fullscreen Claude rather than
// overlay a popup on it.
//
// Two-sided assertion: (1) popup does NOT become visible (the
// suppression half), and (2) main is focused + still fullscreen
// after the shortcut (the focus half). The original test only
// asserted (1); upstream's contract is `ut.focus(); ide()` not
// just "skip showPopup", so a test that asserts only suppression
// could pass even if the focus() call regressed silently.
//
// Compositors honor focus() on fullscreen windows unevenly:
// KDE-W / KDE-X are reliable, GNOME-W / Ubu-W routinely no-op
// focus requests on fullscreen surfaces (mutter "focus stealing
// prevention"). The focus assertion is hard on KDE rows and
// soft-fixme'd elsewhere — the suppression half still runs
// everywhere.
test.setTimeout(45_000);
test('S34 — Quick Entry shortcut focuses fullscreen main window instead of showing popup', async ({}, testInfo) => {
testInfo.annotations.push({ type: 'severity', description: 'Should' });
testInfo.annotations.push({
type: 'surface',
description: 'Shortcut behavior on fullscreen main',
});
skipUnlessRow(testInfo, ['KDE-W', 'GNOME-W', 'Ubu-W', 'KDE-X', 'GNOME-X']);
await testInfo.attach('session-env', {
body: JSON.stringify(captureSessionEnv(), null, 2),
contentType: 'application/json',
});
const useHostConfig = process.env.CLAUDE_TEST_USE_HOST_CONFIG === '1';
const app = await launchClaude({
isolation: useHostConfig ? null : undefined,
});
try {
// mainVisible — some compositors no-op setFullScreen on
// un-mapped windows, so wait for the main shell to be shown
// before driving fullscreen state.
const { inspector } = await app.waitForReady('mainVisible');
const qe = new QuickEntry(inspector);
const mainWin = new MainWindow(inspector);
await qe.installInterceptor();
await mainWin.setState('show');
await mainWin.setState('fullScreen');
// Compositor takes a moment to enter fullscreen.
const fullscreened = await retryUntil(
async () => {
const state = await mainWin.getState();
return state && state.fullScreen ? state : null;
},
{ timeout: 5_000, interval: 200 },
);
await testInfo.attach('main-fullscreen-state', {
body: JSON.stringify(fullscreened, null, 2),
contentType: 'application/json',
});
if (!fullscreened) {
testInfo.skip(
true,
"compositor did not honor setFullScreen — can't validate the fullscreen edge case",
);
return;
}
// Trigger the shortcut and verify the popup never becomes
// visible. We give it 3s — generous compared to a normal
// popup-open which is ~500ms.
await qe.openViaShortcut();
await sleep(3_000);
const popupState = await qe.getPopupState();
await testInfo.attach('popup-state-after-shortcut', {
body: JSON.stringify(popupState, null, 2),
contentType: 'application/json',
});
// Popup may not exist at all (preferred), or may exist but
// be hidden. Both satisfy the contract; only "popup is
// visible" is a regression.
if (popupState !== null) {
expect(
popupState.visible,
'popup BrowserWindow exists but is not visible while main is fullscreen',
).toBe(false);
}
// Focus half: upstream's contract is `ut.focus(); ide()` —
// not just "skip showPopup". Assert the focus side too.
const mainAfter = await mainWin.getState();
await testInfo.attach('main-state-after-shortcut', {
body: JSON.stringify(mainAfter, null, 2),
contentType: 'application/json',
});
// fullScreen is unconditional — the shortcut should never
// drop fullscreen state. (If main lost fullscreen, the
// shortcut went through the showPopup branch instead of
// the focus-and-ide branch — i.e. a different regression
// shape than "popup visible".)
expect(
mainAfter && mainAfter.fullScreen,
'main remains fullscreen after shortcut press (focus branch, not showPopup branch)',
).toBe(true);
// Focused is hard-asserted on KDE rows where focus() is
// reliable; soft-fixme on GNOME-derived rows where mutter
// routinely no-ops focus on fullscreen surfaces. The
// distinction is the compositor, not the upstream contract
// — upstream calls focus() either way.
const row = currentRow();
const hardFocusRows = ['KDE-W', 'KDE-X'];
const focusOk = !!(mainAfter && mainAfter.focused);
if (!focusOk) {
if (hardFocusRows.includes(row)) {
expect(
focusOk,
`main is focused after shortcut press on ${row} (focus() honored by KDE compositors)`,
).toBe(true);
} else {
testInfo.fixme(
true,
`main not focused after shortcut on ${row}; upstream contract ` +
`requires focus() but compositor honor on fullscreen ` +
`surfaces is best-effort outside KDE. mainAfter=` +
JSON.stringify(mainAfter),
);
}
}
// Restore before close so we don't leave the app in fullscreen
// state if the user is sharing config (CLAUDE_TEST_USE_HOST_CONFIG).
await mainWin.setState('unFullScreen').catch(() => {});
inspector.close();
} finally {
await app.close();
}
});
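
S34 treats a missing focus as a hard failure on KDE rows and as a soft fixme elsewhere. The decision can be sketched as a pure function, which makes the row policy easy to check in isolation. The name `focusVerdict` and the verdict labels are assumptions made for this illustration and do not exist in the harness.

```typescript
// Hypothetical sketch of S34's row-dependent focus policy.
type FocusVerdict = 'pass' | 'hard-fail' | 'soft-fixme';

// Rows where the spec hard-asserts focus after the shortcut.
const HARD_FOCUS_ROWS: readonly string[] = ['KDE-W', 'KDE-X'];

function focusVerdict(row: string, focused: boolean): FocusVerdict {
  // A focused main window satisfies the contract on every row.
  if (focused) return 'pass';
  // Not focused: fail hard on KDE rows, mark fixme on the others.
  return HARD_FOCUS_ROWS.includes(row) ? 'hard-fail' : 'soft-fixme';
}
```

In the spec, `'hard-fail'` corresponds to the `expect(...).toBe(true)` branch and `'soft-fixme'` to the `testInfo.fixme(...)` branch.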


@@ -0,0 +1,381 @@
import { test, expect } from '@playwright/test';
import { readFileSync, writeFileSync } from 'node:fs';
import { join } from 'node:path';
import { launchClaude } from '../lib/electron.js';
import { skipUnlessRow } from '../lib/row.js';
import { QuickEntry } from '../lib/quickentry.js';
import { createIsolation, type Isolation } from '../lib/isolation.js';
import { sleep } from '../lib/retry.js';
import { captureSessionEnv } from '../lib/diagnostics.js';
// S35 — Quick Entry popup position is persisted across invocations
// and across app restarts. Backs QE-22 in
// docs/testing/quick-entry-closeout.md.
//
// Upstream persists position via `an.set("quickWindowPosition", ...)`
// in the popup's `hide` handler (build-reference index.js:515468). On
// subsequent invocations the popup's construction reads the saved
// position from `an.get("quickWindowPosition")`. The test moves the
// popup to a known position, dismisses (triggering save), restarts
// the app with shared XDG_CONFIG_HOME, and verifies the popup
// reappears at the saved position — not the upstream default.
//
// Three-launch test:
// 1. open → move → dismiss → re-open → verify in-session memory
// 2. relaunch with same XDG_CONFIG_HOME → verify position persisted
// 3. wipe quickWindowPosition from on-disk config → relaunch →
// verify popup does NOT land at the cleared target (exact
// default coordinates depend on display geometry), proving
// the path is read-from-disk-not-just-memory
//
// The on-disk round-trip in (2) directly reads
// ${configDir}/Claude/config.json between launches to confirm the
// hide handler reached disk — distinct signal from "in-memory
// position survives restart" (an electron-store memory cache could
// in principle satisfy that without touching disk).
//
// All three launches share the same Isolation handle so
// XDG_CONFIG_HOME stays consistent across restarts. The first two
// calls don't own the handle, so close() leaves the dir intact for
// the next launch. The test owns cleanup.
const useHostConfig = process.env.CLAUDE_TEST_USE_HOST_CONFIG === '1';
// Three launches at ~60s each plus settle / waitForReady budget.
// 180s was tight for two; 240s gives the third a margin.
test.setTimeout(240_000);
test('S35 — Quick Entry popup position is persisted across invocations and across app restarts', async ({}, testInfo) => {
testInfo.annotations.push({ type: 'severity', description: 'Should' });
testInfo.annotations.push({
type: 'surface',
description: 'Popup placement memory',
});
skipUnlessRow(testInfo, ['KDE-W', 'GNOME-W', 'Ubu-W', 'KDE-X', 'GNOME-X']);
await testInfo.attach('session-env', {
body: JSON.stringify(captureSessionEnv(), null, 2),
contentType: 'application/json',
});
// In useHostConfig mode, the host's persisted state is shared
// across launches automatically. In default isolation mode, we
// pin a handle and pass it to every launch so XDG_CONFIG_HOME
// matches.
let isolation: Isolation | null = null;
if (!useHostConfig) {
isolation = await createIsolation();
}
// The position we'll move the popup to. Picked to be unambiguously
// distinct from any default — far from the bottom-center area
// where upstream's default placement lands.
const TARGET_X = 80;
const TARGET_Y = 80;
try {
// First launch: open popup, move, dismiss (save fires), re-open,
// confirm position1 matches TARGET. This is the in-session-
// memory half.
const app1 = await launchClaude({ isolation });
let position1: { x: number; y: number } | null = null;
try {
// userLoaded — Upstream's shortcut handler calls Ko.show()
// only when lHn() is true (`!user.isLoggedOut`); if the
// renderer hasn't loaded the user yet, the popup gets
// constructed but not shown.
const { inspector, postLoginUrl } = await app1.waitForReady('userLoaded');
if (!postLoginUrl) {
testInfo.skip(
true,
'claude.ai user did not load past /login within 30s — ' +
'CLAUDE_TEST_USE_HOST_CONFIG=1 needs a signed-in account',
);
return;
}
const qe = new QuickEntry(inspector);
await qe.installInterceptor();
// URL change is renderer-driven; the main-process user
// object that lHn() reads loads on a separate timeline.
// 3s margin is empirical — without it, the first shortcut
// hits before the auth state propagates and Ko.show() is
// silently skipped. openAndWaitReady's retry would catch
// this too, but eating one full attempt + retryDelayMs is
// slower than the upfront sleep.
await sleep(3_000);
await qe.openAndWaitReady();
// Move the popup. setPosition changes only x/y and leaves the
// popup's size alone; the constructor uses setBounds internally.
await inspector.evalInMain<null>(`
const wins = globalThis.__qeWindows || [];
const popup = wins.find(${popupSelectorJs()});
if (!popup || !popup.ref || popup.ref.isDestroyed()) {
throw new Error('popup ref unavailable for setPosition');
}
popup.ref.setPosition(${TARGET_X}, ${TARGET_Y});
return null;
`);
await sleep(150);
// Dismiss the popup — hide handler fires, save runs.
await inspector.evalInMain<null>(`
const wins = globalThis.__qeWindows || [];
const popup = wins.find(${popupSelectorJs()});
if (popup && popup.ref && !popup.ref.isDestroyed()) {
popup.ref.hide();
}
return null;
`);
await qe.waitForPopupClosed(5_000);
await sleep(300); // give the save handler time to write
// Re-open. Should appear at TARGET (in-session memory).
await qe.openAndWaitReady();
const state1 = await qe.getPopupState();
position1 = state1
? { x: state1.bounds.x, y: state1.bounds.y }
: null;
await testInfo.attach('position-after-move', {
body: JSON.stringify({ position1, target: { x: TARGET_X, y: TARGET_Y } }, null, 2),
contentType: 'application/json',
});
// Dismiss for clean exit.
await inspector.evalInMain<null>(`
const wins = globalThis.__qeWindows || [];
const popup = wins.find(${popupSelectorJs()});
if (popup && popup.ref && !popup.ref.isDestroyed()) {
popup.ref.hide();
}
return null;
`);
await qe.waitForPopupClosed(5_000);
await sleep(300);
inspector.close();
} finally {
await app1.close();
}
expect(
position1,
'popup position observable after first launch',
).not.toBeNull();
expect(
position1!.x,
'popup x matches target after move + re-open',
).toBe(TARGET_X);
expect(
position1!.y,
'popup y matches target after move + re-open',
).toBe(TARGET_Y);
// On-disk round-trip. Read config.json directly between
// launches to confirm the hide handler reached disk — distinct
// signal from "in-memory position survives restart" (an
// electron-store memory cache could in principle satisfy the
// post-restart assertion without ever flushing). Skipped under
// useHostConfig because we don't know the host's configDir.
if (isolation) {
const configPath = join(isolation.configDir, 'Claude/config.json');
let parsed: { quickWindowPosition?: { x?: number; y?: number } } = {};
let rawForAttach = '';
try {
rawForAttach = readFileSync(configPath, 'utf8');
parsed = JSON.parse(rawForAttach);
} catch (err) {
rawForAttach =
'<read error: ' +
(err instanceof Error ? err.message : String(err)) +
'>';
}
await testInfo.attach('config-json-after-launch1', {
body: JSON.stringify(
{
configPath,
parsed,
raw: rawForAttach.slice(0, 4_000),
},
null,
2,
),
contentType: 'application/json',
});
expect(
parsed.quickWindowPosition,
'quickWindowPosition key written to on-disk config.json by hide handler',
).toBeTruthy();
expect(
parsed.quickWindowPosition?.x,
'on-disk x matches TARGET_X',
).toBe(TARGET_X);
expect(
parsed.quickWindowPosition?.y,
'on-disk y matches TARGET_Y',
).toBe(TARGET_Y);
}
// Second launch: same XDG_CONFIG_HOME (or host config). Open
// popup; should appear at the saved position from the first
// launch's hide handler.
const app2 = await launchClaude({ isolation });
let position2: { x: number; y: number } | null = null;
try {
// userLoaded — same race as the first launch. Settings
// load is part of main's startup, so by the time the user
// has loaded, `an.get("quickWindowPosition")` returns the
// saved value.
const { inspector, postLoginUrl } = await app2.waitForReady('userLoaded');
if (!postLoginUrl) {
testInfo.skip(
true,
'claude.ai user did not load past /login within 30s on second launch',
);
return;
}
const qe = new QuickEntry(inspector);
await qe.installInterceptor();
await qe.openAndWaitReady();
const state2 = await qe.getPopupState();
position2 = state2
? { x: state2.bounds.x, y: state2.bounds.y }
: null;
await testInfo.attach('position-after-restart', {
body: JSON.stringify(
{
position1,
position2,
match: !!position2 && position2.x === position1!.x && position2.y === position1!.y,
},
null,
2,
),
contentType: 'application/json',
});
inspector.close();
} finally {
await app2.close();
}
expect(
position2,
'popup position observable after restart',
).not.toBeNull();
expect(
position2!.x,
'popup x persisted across restart',
).toBe(position1!.x);
expect(
position2!.y,
'popup y persisted across restart',
).toBe(position1!.y);
// Third launch: clear-and-default. Wipe the
// quickWindowPosition key from on-disk config and confirm
// the popup lands somewhere OTHER than TARGET. This proves
// the read path actually consults disk — if the popup still
// appeared at TARGET after the key was cleared, upstream
// would be sourcing position from somewhere we don't know
// about (env, hard-coded fallback shape, in-memory leak
// across the close/spawn boundary).
//
// Don't assert exact default coordinates — those depend on
// display geometry. Just assert "not the cleared target".
// Skipped under useHostConfig (no known configDir to mutate).
if (isolation) {
const configPath = join(isolation.configDir, 'Claude/config.json');
let beforeRaw = '';
try {
beforeRaw = readFileSync(configPath, 'utf8');
const parsed = JSON.parse(beforeRaw) as Record<string, unknown>;
delete parsed.quickWindowPosition;
writeFileSync(configPath, JSON.stringify(parsed, null, 2), 'utf8');
} catch (err) {
await testInfo.attach('config-clear-error', {
body:
'configPath=' + configPath + '\n' +
(err instanceof Error ? err.stack ?? err.message : String(err)),
contentType: 'text/plain',
});
throw err;
}
const app3 = await launchClaude({ isolation });
let position3: { x: number; y: number } | null = null;
try {
const { inspector, postLoginUrl } =
await app3.waitForReady('userLoaded');
if (!postLoginUrl) {
testInfo.skip(
true,
'claude.ai user did not load past /login on third launch',
);
return;
}
const qe = new QuickEntry(inspector);
await qe.installInterceptor();
await qe.openAndWaitReady();
const state3 = await qe.getPopupState();
position3 = state3
? { x: state3.bounds.x, y: state3.bounds.y }
: null;
await testInfo.attach('position-after-clear', {
body: JSON.stringify(
{
configPath,
beforeRawSnippet: beforeRaw.slice(0, 2_000),
target: { x: TARGET_X, y: TARGET_Y },
position3,
note:
'position3 should NOT equal target — that would ' +
'imply the read path bypassed disk.',
},
null,
2,
),
contentType: 'application/json',
});
inspector.close();
} finally {
await app3.close();
}
expect(
position3,
'popup position observable after third launch',
).not.toBeNull();
const matchedTarget =
!!position3 &&
position3.x === TARGET_X &&
position3.y === TARGET_Y;
expect(
matchedTarget,
'popup did NOT reappear at the cleared target — confirms ' +
'upstream reads position from disk, not an in-memory cache',
).toBe(false);
}
} finally {
if (isolation) await isolation.cleanup();
}
});
// The popup-selector logic is duplicated from quickentry.ts because
// it's a private method there; expressing it inline here keeps S35
// self-contained without making the helper public for one caller.
function popupSelectorJs(): string {
return `(w => {
if (!w || !w.ref || w.ref.isDestroyed()) return false;
const f = String(w.loadedFile || '');
return f.indexOf('quick-window.html') !== -1
|| f.indexOf('quick_window/') !== -1;
})`;
}
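
The third launch in S35 depends on removing one key from the on-disk config while leaving every other key intact. A minimal sketch of that transformation on the raw JSON text follows. The helper name `withoutKey` is hypothetical, and the `locale` key in the sample is invented for the illustration; only `quickWindowPosition` comes from the spec.

```typescript
// Hypothetical helper, not part of the harness: drop one top-level
// key from a JSON document and report whether it was present.
function withoutKey(
  rawJson: string,
  key: string,
): { raw: string; removed: boolean } {
  const parsed = JSON.parse(rawJson) as Record<string, unknown>;
  const removed = Object.prototype.hasOwnProperty.call(parsed, key);
  delete parsed[key];
  return { raw: JSON.stringify(parsed, null, 2), removed };
}

// Sample config text. The locale key is made up for this example.
const sampleConfig = JSON.stringify({
  locale: 'en-US',
  quickWindowPosition: { x: 80, y: 80 },
});

// The cleared document keeps locale and no longer has the position.
const cleared = withoutKey(sampleConfig, 'quickWindowPosition');
```

Returning `removed` lets a caller tell "key cleared" apart from "key was never written", which would otherwise let the third-launch assertion pass for the wrong reason. Reading and writing the file stays with the caller, as the spec does with `readFileSync` and `writeFileSync`.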


@@ -0,0 +1,90 @@
import { test } from '@playwright/test';
import { launchClaude } from '../lib/electron.js';
import { skipUnlessRow } from '../lib/row.js';
import { captureSessionEnv } from '../lib/diagnostics.js';
// S36 — Quick Entry popup falls back to primary display when saved
// monitor is gone. Backs QE-23 in
// docs/testing/quick-entry-closeout.md.
//
// Per the closeout doc § Mandatory matrix, this is "Skip when:
// Single-monitor VM or host." Active multi-monitor disconnect mid-
// test requires libvirt device-detach orchestration that's outside
// the harness today (and largely orthogonal — the failure mode is
// the popup landing at off-screen coordinates after a saved-monitor
// loss, which needs real disconnect, not just a state mock).
//
// This runner detects multi-monitor at launch time and:
// - skips with `-` if single-monitor (the closeout doc explicitly
// marks this row N/A in the dashboard for those hosts);
// - skips with `?` (testInfo.fixme, unimplemented) on multi-monitor
// hosts until the disconnect orchestration is built. JUnit
// <error> maps to `?` per the matrix.md legend, signaling
// "untested" rather than passing or failing.
//
// When implemented, the procedure is:
// 1. boot test VM with two displays attached
// 2. invoke QE on the secondary, save (S35 establishes the path)
// 3. detach the secondary display via libvirt
// 4. invoke QE
// 5. assert popup appears on the primary display via
// hA.screen.getDisplayMatching(bounds) === primary
test.setTimeout(45_000);
test('S36 — Quick Entry popup falls back to primary display when saved monitor is gone', async ({}, testInfo) => {
testInfo.annotations.push({ type: 'severity', description: 'Smoke' });
testInfo.annotations.push({
type: 'surface',
description: 'Multi-monitor placement',
});
skipUnlessRow(testInfo, ['KDE-W', 'GNOME-W', 'Ubu-W', 'KDE-X', 'GNOME-X']);
await testInfo.attach('session-env', {
body: JSON.stringify(captureSessionEnv(), null, 2),
contentType: 'application/json',
});
const useHostConfig = process.env.CLAUDE_TEST_USE_HOST_CONFIG === '1';
const app = await launchClaude({
isolation: useHostConfig ? null : undefined,
});
try {
await app.waitForX11Window(15_000);
const inspector = await app.attachInspector(15_000);
const displays = await inspector.evalInMain<
Array<{ id: number; bounds: { x: number; y: number; width: number; height: number } }>
>(`
const { screen } = process.mainModule.require('electron');
return screen.getAllDisplays().map(d => ({ id: d.id, bounds: d.bounds }));
`);
await testInfo.attach('displays', {
body: JSON.stringify(displays, null, 2),
contentType: 'application/json',
});
inspector.close();
if (displays.length < 2) {
testInfo.skip(
true,
'single-monitor host — S36 requires multi-monitor + libvirt ' +
'detach orchestration. Per quick-entry-closeout.md, mark `-` ' +
'in the dashboard for single-monitor rows.',
);
return;
}
// Multi-monitor host detected. Active disconnect mid-test isn't
// implemented yet — surface an explicit unimplemented status so
// the matrix shows `?` rather than a misleading green.
testInfo.fixme(
true,
`multi-monitor host (${displays.length} displays) — disconnect ` +
'orchestration not yet implemented. See spec body for the ' +
'required steps when adding it.',
);
} finally {
await app.close();
}
});
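
The failure mode S36 targets is a saved popup position that no longer falls on any attached display. A sketch of how the final assertion could decide that, given the `{ id, bounds }` records the spec already collects, is below. The names `containsPoint` and `displayForPosition` are hypothetical. This is not upstream's placement logic, which the spec describes as going through `screen.getDisplayMatching`.

```typescript
// Hypothetical sketch: find which display, if any, contains a point.
interface Bounds {
  x: number;
  y: number;
  width: number;
  height: number;
}

interface DisplayInfo {
  id: number;
  bounds: Bounds;
}

// Half-open containment: the right and bottom edges belong to the
// neighbouring display, so adjacent displays never both match.
function containsPoint(b: Bounds, x: number, y: number): boolean {
  return x >= b.x && x < b.x + b.width && y >= b.y && y < b.y + b.height;
}

// Returns the display holding the point, or null when the point is
// off-screen (for example after the saved monitor was detached).
function displayForPosition(
  displays: DisplayInfo[],
  pos: { x: number; y: number },
): DisplayInfo | null {
  return displays.find((d) => containsPoint(d.bounds, pos.x, pos.y)) ?? null;
}
```

A `null` result after the detach step would mean the saved coordinates are stale, so the popup is expected to appear on the primary display instead.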


@@ -0,0 +1,38 @@
import { test } from '@playwright/test';
import { skipUnlessRow } from '../lib/row.js';
// S37 — Quick Entry popup remains functional after main window
// destroy. Backs QE-24 in docs/testing/quick-entry-closeout.md.
//
// Per the closeout doc:
// "Likely unreachable on Linux without a debug build, due to
// project's hide-to-tray override of the X button. Mark `-`
// (N/A) on rows where the destroy path can't be triggered."
//
// On every supported Linux row, scripts/frame-fix-wrapper.js
// intercepts the X button to call hide() instead of close()/
// destroy() (the close-to-tray behavior). DevTools'
// `remote.getCurrentWindow().destroy()` would work in principle,
// but `remote` isn't exposed in modern Electron and adding it as
// a test-only patch is more invasive than this case is worth.
//
// All Linux rows skip this with the upstream-rationale message.
// If a non-Linux row is added later (FreeBSD?), revisit; the spec
// remains useful as the "what would happen if" reference.
test('S37 — Quick Entry popup remains functional after main window destroy', async ({}, testInfo) => {
testInfo.annotations.push({ type: 'severity', description: 'Should' });
testInfo.annotations.push({
type: 'surface',
description: 'Popup lifecycle independence from main window',
});
skipUnlessRow(testInfo, ['KDE-W', 'GNOME-W', 'Ubu-W', 'KDE-X', 'GNOME-X']);
testInfo.skip(
true,
'main-window destroy is unreachable on Linux without a debug ' +
'build (close-to-tray override intercepts the X button to ' +
'hide() rather than destroy()). Marked N/A in the matrix ' +
'per docs/testing/quick-entry-closeout.md QE-24.',
);
});


@@ -0,0 +1,39 @@
import { test, expect } from '@playwright/test';
import { launchClaude } from '../lib/electron.js';
import { captureSessionEnv } from '../lib/diagnostics.js';
import { getWindowTitle } from '../lib/wm.js';
test('T01 — App launch', async ({}, testInfo) => {
testInfo.annotations.push({ type: 'severity', description: 'Smoke' });
testInfo.annotations.push({ type: 'surface', description: 'App startup' });
await testInfo.attach('session-env', {
body: JSON.stringify(captureSessionEnv(), null, 2),
contentType: 'application/json',
});
const app = await launchClaude();
try {
// Anti-debug gate (see lib/electron.ts) prevents CDP / Playwright
// renderer access. We verify launch via the X11 window appearing —
// which simultaneously confirms (a) Electron started, (b) it picked
// the X11 backend (Decision 6: --ozone-platform=x11 was honored),
// and (c) the WM accepted the window.
const wid = await app.waitForX11Window(15_000);
expect(wid, 'X11 window appeared for claude-desktop pid').toBeTruthy();
await testInfo.attach('window-id', {
body: wid,
contentType: 'text/plain',
});
const title = await getWindowTitle(wid);
await testInfo.attach('window-title', {
body: title ?? '',
contentType: 'text/plain',
});
expect(title ?? '', 'window title contains "Claude"').toMatch(/claude/i);
} finally {
await app.close();
}
});


@@ -0,0 +1,124 @@
import { test, expect } from '@playwright/test';
import { execFile } from 'node:child_process';
import { promisify } from 'node:util';
import {
runDoctor,
captureSessionEnv,
} from '../lib/diagnostics.js';
const exec = promisify(execFile);
// T02 — Doctor health check.
//
// Run `claude-desktop --doctor` and assert exit code === 0. Per the
// case-doc (docs/testing/cases/launch.md T02): all checks should
// PASS / WARN with no FAIL, and the launcher exits 0. This is a
// short-lived spawn probe — `runDoctor()` shells out under a
// 15s timeout and returns `{ output, exitCode }` without touching
// the host's main app instance (doctor is a `--doctor`-gated branch
// that prints and exits, not a full Electron launch).
//
// Applies to all rows. No `skipUnlessRow()` — the doctor script
// (scripts/doctor.sh) runs identically on every distribution we
// ship (deb/rpm/AppImage); a row-specific FAIL there is a real T02
// failure, not a "doesn't apply" skip.
//
// Diagnostics on failure (per case-doc): full --doctor output, the
// install path (`which claude-desktop`), and package metadata
// (`dpkg -S` / `rpm -qf` against the binary). The output and session
// env are attached unconditionally; the locate / package-metadata
// probes only run when the assertion is about to fail, since they're
// noisy and only useful for triage.
async function captureWhich(bin: string): Promise<string> {
try {
const { stdout } = await exec('which', [bin], { timeout: 5_000 });
return stdout.trim();
} catch (err) {
const e = err as { stdout?: string; stderr?: string; code?: number };
return (
`which exited ${e.code ?? '?'}\n` +
`stdout: ${e.stdout ?? ''}\n` +
`stderr: ${e.stderr ?? ''}`
).trim();
}
}
async function capturePackageMetadata(path: string): Promise<string> {
if (!path) return 'no install path resolved';
const lines: string[] = [];
for (const cmd of [
['dpkg', ['-S', path]],
['rpm', ['-qf', path]],
] as [string, string[]][]) {
try {
const { stdout, stderr } = await exec(cmd[0], cmd[1], {
timeout: 5_000,
});
lines.push(
`$ ${cmd[0]} ${cmd[1].join(' ')}\n` +
`${stdout.trim()}${stderr.trim() ? `\n${stderr.trim()}` : ''}`,
);
} catch (err) {
const e = err as {
stdout?: string;
stderr?: string;
code?: number;
};
lines.push(
(
`$ ${cmd[0]} ${cmd[1].join(' ')} (exit ${e.code ?? '?'})\n` +
`${(e.stdout ?? '').trim()}\n` +
`${(e.stderr ?? '').trim()}`
).trim(),
);
}
}
return lines.join('\n\n');
}
test('T02 — Doctor exit code is 0', async ({}, testInfo) => {
testInfo.annotations.push({ type: 'severity', description: 'Critical' });
testInfo.annotations.push({
type: 'surface',
description: 'CLI / --doctor',
});
// Applies to all rows — no skipUnlessRow.
await testInfo.attach('session-env', {
body: JSON.stringify(captureSessionEnv(), null, 2),
contentType: 'application/json',
});
const result = await runDoctor();
await testInfo.attach('doctor-output', {
body: result.output,
contentType: 'text/plain',
});
await testInfo.attach('doctor-exit-code', {
body: String(result.exitCode),
contentType: 'text/plain',
});
if (result.exitCode !== 0) {
const launcher =
process.env.CLAUDE_DESKTOP_LAUNCHER ?? 'claude-desktop';
const whichOut = await captureWhich(launcher);
await testInfo.attach('which-claude-desktop', {
body: whichOut,
contentType: 'text/plain',
});
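// First line of `which` output is the resolved path; pass that
// to dpkg/rpm so package-metadata reflects what doctor actually
// inspected. When `which` fails, captureWhich returns an error
// report instead of a path, so accept only an absolute path.
const firstLine = whichOut.split('\n')[0]?.trim() ?? '';
const installPath = firstLine.startsWith('/') ? firstLine : '';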
const pkgMeta = await capturePackageMetadata(installPath);
await testInfo.attach('package-metadata', {
body: pkgMeta,
contentType: 'text/plain',
});
}
expect(result.exitCode, 'doctor exits with code 0').toBe(0);
});


@@ -0,0 +1,167 @@
import { test, expect } from '@playwright/test';
import { launchClaude } from '../lib/electron.js';
import {
findItemByPid,
listRegisteredItems,
type SniItem,
} from '../lib/sni.js';
import { disconnectBus, getConnectionPid } from '../lib/dbus.js';
import { retryUntil, sleep } from '../lib/retry.js';
// T03 — Tray icon present + tray-rebuild idempotency.
//
// Two assertions in one test, sharing the same launched app:
//
// 1. After startup, exactly ONE StatusNotifierItem on the session
// bus is owned by the claude-desktop pid. Presence-only would
// pass if the pid registered two items, which is the exact
// shape of the bug below.
// 2. After toggling `nativeTheme.themeSource`, still exactly ONE
// SNI item is owned by the pid. This guards the
// tray-rebuild-race fixed in scripts/patches/tray.sh:
// destroy()+sleep(250)+new Tray() can transiently leave two
// SNIs registered for the pid because KDE Plasma's systemtray
// observer reacts to UnregisterItem after the new Register
// call lands. See docs/learnings/tray-rebuild-race.md for the
// full timing story.
//
// The fast-path patch swaps destroy/recreate for in-place
// setImage/setContextMenu on the existing Tray, which never
// touches StatusNotifierWatcher registration — so the count
// stays at 1 across the toggle. If the patch ever regresses (or
// the rebuild path is reached for some other reason), the
// post-toggle count climbs and this test catches it.
test('T03 — Tray icon present (and rebuild leaves exactly one SNI)', async ({}, testInfo) => {
testInfo.annotations.push({ type: 'severity', description: 'Smoke' });
testInfo.annotations.push({
type: 'surface',
description: 'Tray / StatusNotifierItem',
});
testInfo.annotations.push({
type: 'surface',
description: 'Tray rebuild idempotency',
});
const app = await launchClaude();
try {
await app.waitForX11Window(15_000);
// Tray registration may lag the first window by a few hundred ms.
// Poll the SNI watcher until our pid shows up among registered items.
const ourItem = await retryUntil(
async () => findItemByPid(app.pid),
{ timeout: 15_000, interval: 500 },
);
expect(
ourItem,
'a StatusNotifierItem registered by claude-desktop pid was found',
).toBeTruthy();
if (ourItem) {
await testInfo.attach('sni-item', {
body: JSON.stringify(ourItem, null, 2),
contentType: 'application/json',
});
}
// Walk the full registry and count items owned by our pid.
// Presence-only (above) doesn't catch the duplicate-registration
// shape — we'd have found one and stopped.
const preToggleItems = await listRegisteredItems();
const preToggleOwners = await collectItemsForPid(preToggleItems, app.pid);
await testInfo.attach('sni-items-pre-toggle', {
body: JSON.stringify(preToggleOwners, null, 2),
contentType: 'application/json',
});
expect(
preToggleOwners.length,
'exactly one SNI item is owned by claude-desktop pid before theme toggle',
).toBe(1);
// Exercise the rebuild path. Flipping nativeTheme.themeSource is
// the user-visible trigger described in
// docs/learnings/tray-rebuild-race.md (Appearance → Colors /
// Plasma Style / Global Theme all funnel through
// nativeTheme::updated). The fast-path patch should keep the
// update in place; the unpatched slow path would destroy +
// recreate, transiently registering a second SNI.
const inspector = await app.attachInspector();
const originalThemeSource = await inspector.evalInMain<string>(`
const { nativeTheme } = process.mainModule.require('electron');
return nativeTheme.themeSource;
`);
const flipped = originalThemeSource === 'dark' ? 'light' : 'dark';
try {
await inspector.evalInMain<null>(`
const { nativeTheme } = process.mainModule.require('electron');
nativeTheme.themeSource = ${JSON.stringify(flipped)};
return null;
`);
// Settle window for any rebuild churn — the unpatched path
// has a built-in 250ms sleep between destroy() and new
// Tray(); 500ms covers that plus DBus signal propagation.
await sleep(500);
const postToggleItems = await listRegisteredItems();
const postToggleOwners = await collectItemsForPid(
postToggleItems,
app.pid,
);
await testInfo.attach('sni-items-post-toggle', {
body: JSON.stringify(
{
originalThemeSource,
flippedTo: flipped,
owners: postToggleOwners,
},
null,
2,
),
contentType: 'application/json',
});
expect(
postToggleOwners.length,
'exactly one SNI item is owned by claude-desktop pid after theme toggle ' +
'(tray-rebuild race regression — see docs/learnings/tray-rebuild-race.md)',
).toBe(1);
} finally {
// Reset themeSource so we don't leave the test host with a
// flipped theme override on the off-chance the isolation
// boundary leaks.
await inspector
.evalInMain<null>(`
const { nativeTheme } = process.mainModule.require('electron');
nativeTheme.themeSource = ${JSON.stringify(originalThemeSource)};
return null;
`)
.catch(() => {});
inspector.close();
}
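} finally {
// Disconnect the bus even if app.close() throws, so a failed
// teardown doesn't leave the DBus connection open.
try {
await app.close();
} finally {
await disconnectBus();
}
}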
});
// Walk the SNI item list and return only those whose owning DBus
// connection has the given pid. Mirrors findItemByPid but keeps every
// match instead of returning the first.
async function collectItemsForPid(
items: SniItem[],
pid: number,
): Promise<SniItem[]> {
const owned: SniItem[] = [];
for (const item of items) {
try {
const itemPid = await getConnectionPid(item.service);
if (itemPid === pid) owned.push(item);
} catch {
// connection may have gone away mid-iteration; skip
}
}
return owned;
}
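
The polling that T03 relies on (`retryUntil` from `../lib/retry.js`) is not part of this diff. As a rough sketch only, a helper with the same call shape might look like the block below. The name `pollUntil` and its internals are assumptions made for illustration; the only behaviour taken from the runners is that the probe's first usable result is returned and that a timeout yields `null`.

```typescript
// Hypothetical stand-in for the harness's retryUntil. This is NOT the
// real lib/retry.js implementation, only a sketch of the same idea.
interface PollOptions {
  timeout: number; // total budget in ms
  interval: number; // delay between attempts in ms
}

const pause = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

async function pollUntil<T>(
  probe: () => Promise<T | null | undefined>,
  { timeout, interval }: PollOptions,
): Promise<T | null> {
  const deadline = Date.now() + timeout;
  for (;;) {
    const value = await probe();
    // Assumption: null/undefined means "not yet"; anything else is a hit.
    if (value !== null && value !== undefined) return value;
    // Give up when the next attempt would start past the deadline.
    if (Date.now() + interval > deadline) return null;
    await pause(interval);
  }
}
```

With a helper like this, `await pollUntil(() => findItemByPid(pid), { timeout: 15_000, interval: 500 })` would mirror the presence probe above, with `null` signalling that the tray item never appeared.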

@@ -0,0 +1,47 @@
import { test, expect } from '@playwright/test';
import { launchClaude } from '../lib/electron.js';
import { getFrameExtents, getWindowTitle } from '../lib/wm.js';
test('T04 — Window decorations draw', async ({}, testInfo) => {
testInfo.annotations.push({ type: 'severity', description: 'Smoke' });
testInfo.annotations.push({
type: 'surface',
description: 'Window chrome',
});
// On KDE Wayland (Decision 6: project default is X11/XWayland), the app
// window is reachable via xprop. Native-Wayland window-state queries are a
// later iteration — see Still open #5 in docs/testing/automation.md.
const app = await launchClaude();
try {
const wid = await app.waitForX11Window(15_000);
expect(wid, 'X11 window for claude-desktop pid was found').toBeTruthy();
await testInfo.attach('window-id', {
body: wid,
contentType: 'text/plain',
});
const title = await getWindowTitle(wid);
expect(title ?? '', 'window title contains "Claude"').toMatch(/claude/i);
// _NET_FRAME_EXTENTS is set by the WM when it draws decorations.
// All-zero extents (or absent property) indicates an undecorated window.
const extents = await getFrameExtents(wid);
await testInfo.attach('frame-extents', {
body: JSON.stringify(extents, null, 2),
contentType: 'application/json',
});
expect(extents, 'window has _NET_FRAME_EXTENTS set by WM').toBeTruthy();
if (extents) {
const total =
extents.left + extents.right + extents.top + extents.bottom;
expect(total, 'sum of frame extents > 0 (window is decorated)')
.toBeGreaterThan(0);
}
} finally {
await app.close();
}
});
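
`getFrameExtents` lives in `../lib/wm.js` and is not shown in this diff. Assuming it shells out to something like `xprop -id <wid> _NET_FRAME_EXTENTS` (the comments above only say the window is reachable via xprop), the parsing step could be sketched as follows. `parseFrameExtents` and `isDecorated` are hypothetical names; the left, right, top, bottom ordering follows the EWMH definition of `_NET_FRAME_EXTENTS`.

```typescript
interface FrameExtents {
  left: number;
  right: number;
  top: number;
  bottom: number;
}

// Parse a line such as "_NET_FRAME_EXTENTS(CARDINAL) = 4, 4, 30, 4".
// Returns null when the property is absent (xprop prints "not found.").
function parseFrameExtents(xpropOutput: string): FrameExtents | null {
  const match =
    /_NET_FRAME_EXTENTS\(CARDINAL\)\s*=\s*(\d+),\s*(\d+),\s*(\d+),\s*(\d+)/.exec(
      xpropOutput,
    );
  if (!match) return null;
  return {
    left: Number(match[1]),
    right: Number(match[2]),
    top: Number(match[3]),
    bottom: Number(match[4]),
  };
}

// Same rule the T04 assertion applies: any non-zero extent counts as
// a decorated window; absent or all-zero extents do not.
function isDecorated(extents: FrameExtents | null): boolean {
  if (!extents) return false;
  return extents.left + extents.right + extents.top + extents.bottom > 0;
}
```

Keeping the parser separate from the `xprop` call makes the decorated/undecorated rule testable without a running X server.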

@@ -0,0 +1,214 @@
import { test, expect } from '@playwright/test';
import { execFile } from 'node:child_process';
import { promisify } from 'node:util';
import { launchClaude } from '../lib/electron.js';
import { killHostClaude } from '../lib/host-claude.js';
import { retryUntil } from '../lib/retry.js';
import { captureSessionEnv } from '../lib/diagnostics.js';
const exec = promisify(execFile);
// T05 — `claude://` URL delivers to the running app via xdg-open.
//
// Tier-3 delivery probe. The earlier Tier-2 attempt
// (`app.isDefaultProtocolClient('claude')`) doesn't work in the
// harness: ELECTRON_FORCE_IS_PACKAGED=true makes `app.getName()`
// resolve to `Claude`, so the runtime registration call is a no-op
// and the API can't tell us anything useful. Instead we drive the
// real OS path: install a `second-instance` listener in the main
// process, fire `xdg-open 'claude://test/<marker>'` from a separate
// process, and verify the URL appears in the captured argv.
//
// Routing: `xdg-open` resolves `x-scheme-handler/claude` to
// `claude-desktop.desktop` and execs claude-desktop. The new
// process calls `app.requestSingleInstanceLock()` (upstream
// build-reference/app-extracted/.vite/build/index.js:525162),
// loses to our running instance, and the primary's
// `app.on('second-instance', ...)` handler at index.js:525163-525172
// fires with the spawned child's argv. The URL is in that argv —
// `uPn(t)` extracts it and routes to `fCA(r)` → `bEe(...)`.
//
// Why isolation: null. xdg-open's spawn always lands under the
// user's `~/.config/Claude` (the SingletonLock path is fixed in
// `app.getPath('userData')`, derived from XDG_CONFIG_HOME at
// child-process spawn time — we can't influence the spawned
// child's env from here). For the SingletonLock collision to route
// the URL to OUR instance, OUR instance must hold the lock at
// `~/.config/Claude/SingletonLock`. Default isolation gives us a
// tmpdir lock, so xdg-open's child wouldn't collide with us — it'd
// either start as a fresh primary (if no host claude-desktop is
// running) or route to the host's actual claude-desktop. Sharing
// host config is the only way the second-instance hook fires.
//
// Side effect: this test runs against the real `~/.config/Claude`
// and any host claude-desktop must be killed first. The URL is a
// synthetic `claude://test/<marker>` that hits `bEe()`'s default
// branch (no Preview/Hotkey/DebugHandoff host match) — no
// navigation, no destructive side effect.
test.setTimeout(60_000);
test('T05 — claude:// URL delivers to running app via xdg-open', async ({}, testInfo) => {
testInfo.annotations.push({ type: 'severity', description: 'Smoke' });
testInfo.annotations.push({
type: 'surface',
description: 'URL scheme / protocol delivery',
});
// Skip cleanly when the prerequisites aren't on this host.
try {
await exec('which', ['xdg-open']);
} catch {
test.skip(true, 'xdg-open not available');
return;
}
const xdgMime = await exec('xdg-mime', [
'query',
'default',
'x-scheme-handler/claude',
])
.then((r) => r.stdout.trim())
.catch(() => '');
if (!xdgMime.includes('claude-desktop')) {
test.skip(
true,
`claude:// not registered as default scheme handler (xdg-mime: "${xdgMime}")`,
);
return;
}
await testInfo.attach('xdg-mime', {
body: xdgMime,
contentType: 'text/plain',
});
await testInfo.attach('session-env', {
body: JSON.stringify(captureSessionEnv(), null, 2),
contentType: 'application/json',
});
// xdg-open's spawned child binds the SingletonLock at
// `~/.config/Claude/SingletonLock`; we must hold that lock so
// the child loses and routes via second-instance instead of
// becoming a fresh primary. Kill any host instance first, then
// launch with `isolation: null` so OUR XDG_CONFIG_HOME matches
// the child's.
await killHostClaude();
const app = await launchClaude({ isolation: null });
const marker = `t05-${Date.now()}-${Math.random()
.toString(36)
.slice(2, 8)}`;
const url = `claude://test/${marker}`;
try {
const { inspector } = await app.waitForReady('mainVisible');
// Install a main-process hook that captures every
// second-instance payload into a global. The handler
// signature is (event, argv, cwd, additionalData) per the
// Electron docs and the upstream call site at
// index.js:525163.
await inspector.evalInMain<null>(`
const { app } = process.mainModule.require('electron');
global.__T05_argvCaptures = global.__T05_argvCaptures || [];
if (!global.__T05_handlerInstalled) {
app.on('second-instance', (event, argv, cwd) => {
global.__T05_argvCaptures.push({
argv,
cwd,
ts: Date.now(),
});
});
global.__T05_handlerInstalled = true;
}
return null;
`);
// Fire the URL from a separate process. xdg-open execs
// claude-desktop with the URL on argv; that child loses
// the SingletonLock to us and routes via second-instance.
// Capture exec output so a failure mode where xdg-open
// itself errored shows up in the attached diagnostics.
let xdgOpenStdout = '';
let xdgOpenStderr = '';
let xdgOpenError: string | null = null;
try {
const r = await exec('xdg-open', [url], { timeout: 10_000 });
xdgOpenStdout = r.stdout;
xdgOpenStderr = r.stderr;
} catch (err) {
const e = err as {
stdout?: string;
stderr?: string;
message?: string;
};
xdgOpenStdout = e.stdout ?? '';
xdgOpenStderr = e.stderr ?? '';
xdgOpenError = e.message ?? String(err);
}
// Poll the captured argv list until our marker shows up.
// 10s is generous: xdg-open returns immediately, the spawned
// claude-desktop reaches `app.on('ready', ...)` in ~2-4s on
// a warm cache, and its failed `requestSingleInstanceLock()`
// call fires second-instance in the primary (our instance)
// almost immediately.
interface Capture {
argv: string[];
cwd: string;
ts: number;
}
const captured = await retryUntil<Capture>(
async () => {
const dump = await inspector.evalInMain<Capture[]>(`
return global.__T05_argvCaptures || [];
`);
return (
dump.find((c) =>
(c.argv ?? []).some((a) => a.includes(marker)),
) ?? null
);
},
{ timeout: 10_000, interval: 250 },
);
const allCaptures = await inspector.evalInMain<Capture[]>(`
return global.__T05_argvCaptures || [];
`);
await testInfo.attach('marker', {
body: marker,
contentType: 'text/plain',
});
await testInfo.attach('url', {
body: url,
contentType: 'text/plain',
});
await testInfo.attach(
'xdg-open',
{
body: JSON.stringify(
{
stdout: xdgOpenStdout,
stderr: xdgOpenStderr,
error: xdgOpenError,
},
null,
2,
),
contentType: 'application/json',
},
);
await testInfo.attach('captured-second-instance', {
body: JSON.stringify(allCaptures, null, 2),
contentType: 'application/json',
});
expect(
captured,
`second-instance handler should fire with argv containing "${marker}"`,
).toBeTruthy();
} finally {
await app.close();
}
});
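
The marker check in T05 boils down to scanning captured `second-instance` payloads for an argv entry that carries the synthetic URL. A minimal sketch of that matching step, pulled out as pure functions, is shown below. `findCaptureWithMarker` and `findClaudeUrl` are hypothetical helper names, and this is not a reconstruction of upstream's minified URL extraction.

```typescript
interface Capture {
  argv: string[];
  cwd: string;
  ts: number;
}

// Return the first captured payload whose argv mentions the marker.
function findCaptureWithMarker(
  captures: readonly Capture[],
  marker: string,
): Capture | null {
  return (
    captures.find((c) => (c.argv ?? []).some((a) => a.includes(marker))) ??
    null
  );
}

// Pull the first claude:// URL out of an argv list, if there is one.
function findClaudeUrl(argv: readonly string[]): string | null {
  return argv.find((arg) => arg.startsWith('claude://')) ?? null;
}
```

Matching on a per-run marker rather than on the bare scheme keeps a stray `claude://` launch from another process from satisfying the assertion.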

@@ -0,0 +1,83 @@
import { test, expect } from '@playwright/test';
import { launchClaude } from '../lib/electron.js';
import { captureSessionEnv } from '../lib/diagnostics.js';
// T06 — Quick Entry global shortcut is registered after main visible.
//
// Tier 2 form of T06 (case-doc:
// docs/testing/cases/shortcuts-and-input.md#t06--quick-entry-global-shortcut-unfocused).
// The shortcut-delivery half (press → popup appears) is covered by
// S29 (lazy-create from tray), S30 (post-exit no-op), and S31 (submit
// reaches new chat). T06 here is purely the registration-state probe:
// after the app is visible, `globalShortcut.isRegistered(accelerator)`
// must return true. Registration succeeds even on portal-grabbed
// Wayland sessions; only delivery is portal-gated, so this assertion
// applies to all rows.
//
// Accelerator string is hardcoded to "Ctrl+Alt+Space" per the
// case-doc Code anchor (build-reference index.js:499376, the `ort`
// default accelerator: `"Ctrl+Alt+Space"` non-mac, `"Alt+Space"` on
// mac). Linux always takes the non-mac branch. A shortcut remapped
// via Settings would fail this test. By default the harness
// launches into a fresh isolated config (no remap), so that only
// matters when CLAUDE_TEST_USE_HOST_CONFIG=1 reuses the host config.
// 90s test timeout matches waitForReady's own default budget — main
// visibility on a fresh isolation can take ~30-50s on a cold cache
// (Electron unpack + claude.ai initial nav).
test.setTimeout(90_000);
test('T06 — Quick Entry global shortcut is registered after main visible', async ({}, testInfo) => {
testInfo.annotations.push({ type: 'severity', description: 'Critical' });
testInfo.annotations.push({
type: 'surface',
description: 'Quick Entry / global shortcut',
});
// No skipUnlessRow — applies to all rows. Registration succeeds
// even where delivery is portal-gated; T06's contract is the
// registration state alone.
await testInfo.attach('session-env', {
body: JSON.stringify(captureSessionEnv(), null, 2),
contentType: 'application/json',
});
const useHostConfig = process.env.CLAUDE_TEST_USE_HOST_CONFIG === '1';
const app = await launchClaude({
isolation: useHostConfig ? null : undefined,
});
try {
// mainVisible — registration happens during upstream's
// `app.on('ready')` chain (build-reference index.js:499416,
// 525287-525290), which lands before the main BrowserWindow
// becomes visible. Querying after mainVisible guarantees the
// register() call has run.
const { inspector } = await app.waitForReady('mainVisible');
const result = await inspector.evalInMain<{
accelerator: string;
isRegistered: boolean;
}>(`
const { globalShortcut } = process.mainModule.require('electron');
const accelerator = 'Ctrl+Alt+Space';
return {
accelerator,
isRegistered: globalShortcut.isRegistered(accelerator),
};
`);
await testInfo.attach('shortcut-registration', {
body: JSON.stringify(result, null, 2),
contentType: 'application/json',
});
expect(
result.isRegistered,
`globalShortcut.isRegistered('${result.accelerator}') is true ` +
'after main visible',
).toBe(true);
} finally {
await app.close();
}
});
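
The platform split described in the T06 header comment can be written down as a tiny function. This is only an illustration of the two values the comment quotes; `defaultQuickEntryAccelerator` is a hypothetical name, and upstream's minified default is not reproduced here.

```typescript
// Values taken from the T06 comment: "Alt+Space" on macOS,
// "Ctrl+Alt+Space" everywhere else (including Linux).
function defaultQuickEntryAccelerator(platform: string): string {
  return platform === 'darwin' ? 'Alt+Space' : 'Ctrl+Alt+Space';
}
```

A runner that ever needed to cover more than Linux could derive the accelerator from `process.platform` this way instead of hardcoding the string.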

@@ -0,0 +1,199 @@
import { test, expect } from '@playwright/test';
import { launchClaude } from '../lib/electron.js';
import { createIsolation, type Isolation } from '../lib/isolation.js';
import { captureSessionEnv } from '../lib/diagnostics.js';
import { retryUntil } from '../lib/retry.js';
// T07 — In-app topbar renders + clickable.
//
// Path: seed auth from the host's signed-in Claude Desktop config into
// a per-test tmpdir, launch the app against that hermetic config, wait
// for `userLoaded` (claude.ai past /login — the topbar is rendered by
// claude.ai's authenticated SPA, not the shell), then DOM-probe the
// topbar via the `data-testid="topbar-windows-menu"` anchor documented
// in docs/learnings/linux-topbar-shim.md.
//
// Side effect of `seedFromHost: true`: the host's running Claude
// Desktop is killed (SIGTERM, then SIGKILL on holdouts). This is
// required because LevelDB / SQLite hold writer locks, and copying
// while a writer is live could leave the seed copy with torn pages.
// The host config dir itself is left untouched: only an allowlisted
// subset is copied into the tmpdir, which is rm -rf'd on test close.
// See lib/isolation.ts for the allowlist and lib/host-claude.ts for
// the kill semantics.
interface TopbarButton {
ariaLabel: string;
testId: string | null;
rect: { x: number; y: number; w: number; h: number };
visible: boolean;
}
interface TopbarSnapshot {
found: boolean;
containerSelector: string | null;
buttonCount: number;
buttons: TopbarButton[];
}
test('T07 — In-app topbar renders with clickable buttons', async ({}, testInfo) => {
testInfo.annotations.push({ type: 'severity', description: 'Smoke' });
testInfo.annotations.push({
type: 'surface',
description: 'Window chrome / in-app topbar',
});
// No skipUnlessRow — T07 applies to all rows on PR #538 builds.
await testInfo.attach('session-env', {
body: JSON.stringify(captureSessionEnv(), null, 2),
contentType: 'application/json',
});
// Seed auth from host: kills any running host Claude (writer-lock
// release for LevelDB / SQLite), then copies the auth-relevant
// subset of ~/.config/Claude into a per-test tmpdir. The host
// config never gets mutated, and the tmpdir is rm -rf'd on
// app.close(). Skip cleanly when no signed-in host config is
// available — createIsolation throws with a clear message in that
// case (no host dir, or dir present but missing the auth files).
let isolation: Isolation;
try {
isolation = await createIsolation({ seedFromHost: true });
} catch (err) {
const msg = err instanceof Error ? err.message : String(err);
test.skip(true, `seedFromHost unavailable: ${msg}`);
return;
}
const app = await launchClaude({ isolation });
try {
// userLoaded gates on claude.ai URL past /login. With seeded
// auth this should fire well within the default budget on a
// warm cache; if the seed was stale and the renderer bounces
// to /login, postLoginUrl stays absent and we skip.
const ready = await app.waitForReady('userLoaded');
await testInfo.attach('claude-ai-url', {
body: ready.claudeAiUrl ?? '(no claude.ai webContents observed)',
contentType: 'text/plain',
});
if (!ready.postLoginUrl) {
test.skip(
true,
'seeded auth did not reach post-login URL — host config ' +
'may be stale (signed out, expired session, etc.)',
);
return;
}
await testInfo.attach('post-login-url', {
body: ready.postLoginUrl,
contentType: 'text/plain',
});
// Topbar probe: anchor on the `topbar-windows-menu` test id (the
// hamburger button — name reflects upstream's "this is for
// Windows" framing per linux-topbar-shim.md gate 3). Sibling
// buttons live in the same `div.absolute.top-0.inset-x-0`
// container per the click-state diagnostic in that learning.
// Falls back to `parentElement` if the closest() lookup misses
// (defensive: a Tailwind class regen could shift the container).
//
// Wrap in retryUntil because the renderer can still be mid-
// navigation when waitForReady('userLoaded') resolves (the gate
// polls URL only — it doesn't wait for SPA route settle), and a
// post-login client-side redirect during executeJavaScript
// surfaces as `Execution context was destroyed`. Each retry
// re-issues the eval against the now-current execution context.
const topbar = await retryUntil(
async () => {
try {
const r = await ready.inspector.evalInRenderer<TopbarSnapshot>(
'claude.ai',
`
(() => {
const menu = document.querySelector('[data-testid="topbar-windows-menu"]');
if (!menu) {
return { found: false, containerSelector: null, buttonCount: 0, buttons: [] };
}
const closest = menu.closest('div.absolute.top-0');
const container = closest ?? menu.parentElement;
if (!container) {
return { found: false, containerSelector: null, buttonCount: 0, buttons: [] };
}
const buttons = Array.from(container.querySelectorAll('button'));
return {
found: true,
containerSelector: closest
? 'div.absolute.top-0 (closest)'
: 'menu.parentElement (fallback)',
buttonCount: buttons.length,
buttons: buttons.map(b => {
const rect = b.getBoundingClientRect();
return {
ariaLabel: b.getAttribute('aria-label') ?? '',
testId: b.getAttribute('data-testid'),
rect: {
x: rect.x,
y: rect.y,
w: rect.width,
h: rect.height,
},
visible: rect.width > 0 && rect.height > 0,
};
}),
};
})()
`,
);
return r.found ? r : null;
} catch (err) {
// "Execution context was destroyed" during a route
// transition is benign — the next iteration runs
// against the new context.
const msg = err instanceof Error ? err.message : String(err);
if (msg.includes('context was destroyed')) return null;
throw err;
}
},
{ timeout: 15_000, interval: 500 },
);
if (!topbar) {
throw new Error(
'topbar probe never observed [data-testid="topbar-windows-menu"] ' +
'within 15s after userLoaded',
);
}
await testInfo.attach('topbar-snapshot', {
body: JSON.stringify(topbar, null, 2),
contentType: 'application/json',
});
expect(
topbar.found,
'data-testid="topbar-windows-menu" anchor was found in ' +
'claude.ai renderer (gate 3 / shim UA spoof active)',
).toBe(true);
// Case-doc lists five buttons (hamburger, sidebar toggle, search,
// back, forward) plus the Cowork ghost. The exact rendered count
// depends on whether the Cowork ghost is materialised at probe
// time, so assert the floor of five — the full button list is
// captured in the topbar-snapshot attachment for case-doc anchor
// refinement.
expect(
topbar.buttonCount,
'topbar container has at least 5 buttons',
).toBeGreaterThanOrEqual(5);
for (const btn of topbar.buttons) {
const id = btn.ariaLabel || btn.testId || '(unlabeled)';
expect(
btn.visible,
`topbar button "${id}" has non-zero bounding rect`,
).toBe(true);
}
} finally {
await app.close();
}
});
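
The error handling inside the T07 probe (treat "Execution context was destroyed" as "try again", rethrow everything else) can be factored into a small wrapper. The sketch below uses hypothetical names, `isContextDestroyed` and `probeOrNull`, and assumes the same substring match the runner uses.

```typescript
// True when the error looks like a renderer navigation tearing down
// the execution context mid-eval.
function isContextDestroyed(err: unknown): boolean {
  const msg = err instanceof Error ? err.message : String(err);
  return msg.includes('context was destroyed');
}

// Run a probe; map the benign navigation error to null so a polling
// loop retries, and let any other failure propagate.
async function probeOrNull<T>(probe: () => Promise<T>): Promise<T | null> {
  try {
    return await probe();
  } catch (err) {
    if (isContextDestroyed(err)) return null;
    throw err;
  }
}
```

Keeping the match narrow matters here: swallowing every error would turn a genuinely broken probe into a slow timeout instead of an immediate failure.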

@@ -0,0 +1,111 @@
import { test, expect } from '@playwright/test';
import { launchClaude } from '../lib/electron.js';
import { MainWindow } from '../lib/quickentry.js';
import { captureSessionEnv } from '../lib/diagnostics.js';
import { retryUntil } from '../lib/retry.js';
// T08 — Closing the main window hides to tray instead of quitting.
//
// On Linux, upstream's quit-on-last-window-closed handler at
// build-reference/app-extracted/.vite/build/index.js:525550-525552
// (`hA.app.on("window-all-closed", () => { Zr || Ap() })` — `Zr` is
// the darwin guard) would otherwise call into the quit path the
// first time the user clicks the X-button. PR #451 added an
// interceptor at scripts/frame-fix-wrapper.js:178-185:
// this.on('close', e => {
// if (!result.app._quittingIntentionally && !this.isDestroyed()) {
// e.preventDefault();
// this.hide();
// }
// });
// This close handler stands down once the `before-quit` handler at
// frame-fix-wrapper.js:370-374 sets `_quittingIntentionally = true`
// for the tray-Quit / Ctrl+Q / SIGTERM exits. So the X-button path
// takes the preventDefault + hide() branch; the tray-Quit path
// bypasses it.
//
// Test shape: launch, capture pre-state, fire `'close'` on the main
// BrowserWindow (MainWindow.setState('close') calls win.close(),
// which fires the same 'close' event the wrapper intercepts on a
// real X-button click), then assert the window flipped to invisible
// AND the Electron process is still running. The `'hide'` action
// would also flip visible:false but bypasses the wrapper — that's
// what S29 tests, and it deliberately does NOT exercise the
// regression-detection T08 cares about.
//
// Applies to all rows. No skipUnlessRow gate.
test.setTimeout(60_000);
test('T08 — Closing main window hides to tray, app stays alive', async ({}, testInfo) => {
testInfo.annotations.push({ type: 'severity', description: 'Smoke' });
testInfo.annotations.push({
type: 'surface',
description: 'Window chrome / close-to-tray',
});
await testInfo.attach('session-env', {
body: JSON.stringify(captureSessionEnv(), null, 2),
contentType: 'application/json',
});
const app = await launchClaude();
try {
const { inspector } = await app.waitForReady('mainVisible');
const mainWin = new MainWindow(inspector);
const before = await mainWin.getState();
await testInfo.attach('main-state-before-close', {
body: JSON.stringify(before, null, 2),
contentType: 'application/json',
});
expect(before, 'main window state reachable pre-close').toBeTruthy();
expect(before?.visible, 'main window visible before close').toBe(true);
// Fire the BrowserWindow 'close' event. The wrapper at
// frame-fix-wrapper.js:178-185 should preventDefault +
// hide() rather than letting the window destroy + the app
// quit via the 'window-all-closed' path.
await mainWin.setState('close');
// Poll for visible:false. The close-to-tray transition is
// synchronous in the wrapper's interceptor, but compositor
// side effects (unmap + isVisible() flip) can lag a beat —
// 5s is generous for the runtime check.
const after = await retryUntil(
async () => {
const s = await mainWin.getState();
return s && !s.visible ? s : null;
},
{ timeout: 5_000, interval: 200 },
);
await testInfo.attach('main-state-after-close', {
body: JSON.stringify(after, null, 2),
contentType: 'application/json',
});
await testInfo.attach('proc-state', {
body: JSON.stringify(
{
exitCode: app.process.exitCode,
signalCode: app.process.signalCode,
pid: app.pid,
},
null,
2,
),
contentType: 'application/json',
});
expect(after, 'main window state reachable post-close').toBeTruthy();
expect(after?.visible, 'main window hidden after close').toBe(false);
expect(
app.process.exitCode,
'app process did not quit (close-to-tray)',
).toBe(null);
expect(
app.process.signalCode,
'app process not killed by signal',
).toBe(null);
} finally {
await app.close();
}
});
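
The close-to-tray behaviour T08 guards can be modelled without Electron. The sketch below is a toy: `FakeWindow`, `VetoableEvent` and `installCloseToTray` are hypothetical names, and the window class only imitates the "a close handler may veto the close" contract. The guard condition mirrors the snippet quoted from frame-fix-wrapper.js in the header comment.

```typescript
interface VetoableEvent {
  defaultPrevented: boolean;
  preventDefault(): void;
}

// Toy window: close() destroys the window unless a handler vetoes it.
class FakeWindow {
  visible = true;
  destroyed = false;
  private handlers: Array<(e: VetoableEvent) => void> = [];

  onClose(handler: (e: VetoableEvent) => void): void {
    this.handlers.push(handler);
  }

  hide(): void {
    this.visible = false;
  }

  close(): void {
    const event: VetoableEvent = {
      defaultPrevented: false,
      preventDefault() {
        this.defaultPrevented = true;
      },
    };
    for (const handler of this.handlers) handler(event);
    if (!event.defaultPrevented) {
      this.visible = false;
      this.destroyed = true;
    }
  }
}

// Veto the close and hide instead, unless a real quit is under way.
function installCloseToTray(
  win: FakeWindow,
  state: { quittingIntentionally: boolean },
): void {
  win.onClose((e) => {
    if (!state.quittingIntentionally && !win.destroyed) {
      e.preventDefault();
      win.hide();
    }
  });
}
```

In this model an X-button close leaves the window hidden but alive, while setting `quittingIntentionally` first lets the same `close()` call destroy it, which is the split the T08 assertions check on the real app.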

@@ -0,0 +1,179 @@
import { test, expect } from '@playwright/test';
import { existsSync, readFileSync } from 'node:fs';
import { join } from 'node:path';
import { launchClaude } from '../lib/electron.js';
import { retryUntil } from '../lib/retry.js';
import { captureSessionEnv } from '../lib/diagnostics.js';
// T09 — Autostart via XDG.
//
// frame-fix-wrapper.js installs a setLoginItemSettings shim on Linux
// (Electron's openAtLogin is a no-op there — electron/electron#15198).
// The shim resolves $XDG_CONFIG_HOME/autostart/claude-desktop.desktop
// (falling back to ~/.config when the env var is unset/empty) and
// writes a spec-compliant [Desktop Entry] block on `openAtLogin: true`,
// unlinking it on `openAtLogin: false`.
//
// Default isolation gives a per-test XDG_CONFIG_HOME, so the autostart
// file lands inside the sandbox — no host-level cleanup needed.
//
// Code anchors:
// scripts/frame-fix-wrapper.js:566 — autostartPath construction
// scripts/frame-fix-wrapper.js:601 — buildAutostartContent()
// scripts/frame-fix-wrapper.js:627 — setLoginItemSettings shim
// Cold-start + waitForReady('mainVisible') alone has a 90s budget,
// so the default 60s test timeout is too tight. Two inspector evals
// add a few hundred ms each; 120s gives margin without masking real
// hangs.
test.setTimeout(120_000);
test('T09 — Autostart via XDG writes/removes desktop entry', async (
{},
testInfo,
) => {
testInfo.annotations.push({ type: 'severity', description: 'Should' });
testInfo.annotations.push({
type: 'surface',
description: 'Autostart / login item',
});
// All Linux rows — no skipUnlessRow.
await testInfo.attach('session-env', {
body: JSON.stringify(captureSessionEnv(), null, 2),
contentType: 'application/json',
});
const app = await launchClaude();
try {
await testInfo.attach('isolation-env', {
body: JSON.stringify(app.isolation?.env ?? null, null, 2),
contentType: 'application/json',
});
const xdgConfigHome = app.isolation?.env.XDG_CONFIG_HOME;
expect(
xdgConfigHome,
'isolation provides XDG_CONFIG_HOME',
).toBeTruthy();
const autostartPath = join(
xdgConfigHome!,
'autostart',
'claude-desktop.desktop',
);
await testInfo.attach('autostart-path', {
body: autostartPath,
contentType: 'text/plain',
});
// Don't gate on 'mainVisible' — that requires a claude.ai
// webContents to exist, which depends on network reachability
// and isn't relevant to the autostart shim (installed at
// frame-fix-wrapper module-load time, well before the renderer
// loads claude.ai). All we need is the inspector attached.
await app.waitForX11Window();
const inspector = await app.attachInspector();
// Sanity: file should not exist before the toggle. The shim only
// writes on explicit setLoginItemSettings calls.
const initiallyPresent = existsSync(autostartPath);
await testInfo.attach('initial-existence', {
body: String(initiallyPresent),
contentType: 'text/plain',
});
expect(
initiallyPresent,
'autostart file absent before any toggle',
).toBe(false);
// Capture the wrapper's view of XDG_CONFIG_HOME and shim binding.
// On failure this answers two questions immediately: did the env
// var propagate into the spawned process, and is the wrapper's
// setLoginItemSettings substitution still in place? If
// wrapperEnv.xdg is null but isolation-env had it set, the env
// didn't reach Electron, so diagnose at launchClaude. If isFn is
// true but the file never lands, the wrapper substitution is being
// undone (or the path-construction comment in this file is out of
// date).
const wrapperEnv = await inspector.evalInMain<{
xdg: string | null;
home: string;
isFn: boolean;
xdgKeys: string[];
}>(`
const os = process.mainModule.require('os');
const { app } = process.mainModule.require('electron');
return {
xdg: process.env.XDG_CONFIG_HOME ?? null,
home: os.homedir(),
isFn: typeof app.setLoginItemSettings === 'function',
xdgKeys: Object.keys(process.env).filter(k => k.startsWith('XDG_')),
};
`);
await testInfo.attach('wrapper-env', {
body: JSON.stringify(wrapperEnv, null, 2),
contentType: 'application/json',
});
// Toggle on.
await inspector.evalInMain<null>(`
const { app } = process.mainModule.require('electron');
app.setLoginItemSettings({ openAtLogin: true });
return null;
`);
// Filesystem write is synchronous in the shim, but the eval
// resolves before the Node fs.writeFileSync syscall settles
// against any FUSE-backed tmpdir. retryUntil returns null on
// timeout, so use a truthy sentinel to distinguish "found" from
// "timed out".
const enabled = await retryUntil(
async () => (existsSync(autostartPath) ? 'present' : null),
{ timeout: 3_000, interval: 100 },
);
await testInfo.attach('post-enable-existence', {
body: String(existsSync(autostartPath)),
contentType: 'text/plain',
});
expect(
enabled,
'autostart file written after openAtLogin: true',
).toBe('present');
const desktopEntry = readFileSync(autostartPath, 'utf8');
await testInfo.attach('desktop-entry', {
body: desktopEntry,
contentType: 'text/plain',
});
expect(
desktopEntry,
'desktop entry has [Desktop Entry] header',
).toMatch(/^\[Desktop Entry\]/m);
expect(desktopEntry, 'desktop entry has Type= line').toMatch(
/^Type=Application/m,
);
expect(desktopEntry, 'desktop entry has Exec= line').toMatch(/^Exec=.+/m);
expect(desktopEntry, 'desktop entry has Name= line').toMatch(/^Name=.+/m);
// Toggle off.
await inspector.evalInMain<null>(`
const { app } = process.mainModule.require('electron');
app.setLoginItemSettings({ openAtLogin: false });
return null;
`);
const disabled = await retryUntil(
async () => (!existsSync(autostartPath) ? 'gone' : null),
{ timeout: 3_000, interval: 100 },
);
await testInfo.attach('post-disable-existence', {
body: String(existsSync(autostartPath)),
contentType: 'text/plain',
});
expect(
disabled,
'autostart file removed after openAtLogin: false',
).toBe('gone');
} finally {
await app.close();
}
});
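
The path resolution and file shape that T09 asserts on can be sketched as two pure functions. This is an illustration under assumptions, not the shim from frame-fix-wrapper.js: `resolveAutostartPath` and `buildAutostartEntry` are hypothetical names, the `Name=Claude` value is a placeholder, and only the keys the test checks (`[Desktop Entry]`, `Type=Application`, `Exec=`, `Name=`) are emitted.

```typescript
// Resolve $XDG_CONFIG_HOME/autostart/claude-desktop.desktop, falling
// back to ~/.config when the variable is unset or empty.
function resolveAutostartPath(
  env: Record<string, string | undefined>,
  homeDir: string,
): string {
  const xdg = env.XDG_CONFIG_HOME;
  const configHome = xdg && xdg.length > 0 ? xdg : `${homeDir}/.config`;
  return `${configHome}/autostart/claude-desktop.desktop`;
}

// Build a minimal desktop entry. Sketch only: it does not apply the
// quoting the Desktop Entry spec requires for Exec values containing
// spaces or other reserved characters.
function buildAutostartEntry(execPath: string): string {
  return [
    '[Desktop Entry]',
    'Type=Application',
    'Name=Claude', // placeholder value, an assumption for this sketch
    `Exec=${execPath}`,
    '',
  ].join('\n');
}
```

Because the path is derived from the environment, a per-test `XDG_CONFIG_HOME` is enough to keep the file inside the sandbox, which is what lets T09 skip host-level cleanup.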

@@ -0,0 +1,392 @@
import { test, expect } from '@playwright/test';
import { execFile } from 'node:child_process';
import { promisify } from 'node:util';
import { launchClaude } from '../lib/electron.js';
import { skipUnlessRow } from '../lib/row.js';
import { sleep } from '../lib/retry.js';
import { captureSessionEnv } from '../lib/diagnostics.js';
const exec = promisify(execFile);
// T10 — cowork daemon respawn after kill.
//
// docs/testing/cases/platform-integration.md T10 covers two
// claims: the daemon spawns when Cowork needs it (asserted by
// H04), AND it respawns within the documented timeout if it
// crashes mid-session. This runner covers the second half.
//
// The respawn path is implemented by Patch 6 in
// scripts/patches/cowork.sh:244-362 (issue #408). The auto-launch
// gate uses a timestamp-based cooldown (`_lastSpawn`, 10s window)
// instead of a one-shot boolean specifically so the retry loop
// in kUe() (or its renamed successor) can re-fork the daemon
// after it dies. If the cooldown regresses back to a one-shot
// boolean, or the cooldown window grows past the renderer's
// retry budget, kill-then-respawn silently breaks and the user
// sees "VM service not running" until they restart the app.
//
// Trigger model: post-1.5354.0 the cowork client opens a
// persistent pipe at boot (zI/E$i happy path) and uses it for
// every subsequent RPC. After SIGKILL the persistent socket goes
// dead but no client code is in steady-state RPC traffic, so
// nothing fires the retry loop on its own. T10 has to drive
// traffic itself: invoking the ClaudeVM getRunningStatus eipc
// handler (from main, see Phase 3) forces the client to call
// zI() / kUe(), which sees the dead socket, hits the cooldown
// gate, and re-forks the daemon.
//
// Verification primitive: globalThis.__coworkDaemonPid is set
// by the patched fork code after each successful spawn (Patch 6
// in scripts/patches/cowork.sh). Polling that global is faster
// than pgrep and race-free; pgrep is still polled on every
// iteration as a cross-check and captured again on failure.
//
// Row gate matches H04 — daemon is Linux-only, gating mirrors the
// rest of the cowork lifecycle row set.
const PGREP_PATTERN = 'cowork-vm-service\\.js';
// Parse pgrep's newline-separated stdout into a set of pids.
function parsePids(stdout: string): Set<number> {
  return new Set(
    stdout
      .split('\n')
      .map((l) => parseInt(l.trim(), 10))
      .filter((n) => !Number.isNaN(n)),
  );
}
async function pgrepPids(pattern: string): Promise<Set<number>> {
  try {
    const { stdout } = await exec('pgrep', ['-f', pattern], {
      timeout: 5_000,
    });
    return parsePids(stdout);
  } catch (err) {
    // pgrep exits 1 with empty stdout when nothing matches: treat
    // that as the empty set. Any other failure (timeout, signal)
    // is not rethrown; salvage whatever pids reached stdout.
    const e = err as { code?: number; stdout?: string };
    if (e.code === 1) return new Set();
    return parsePids(e.stdout ?? '');
  }
}
test.setTimeout(90_000);
test('T10 — cowork daemon respawns after SIGKILL', async ({}, testInfo) => {
testInfo.annotations.push({ type: 'severity', description: 'Should' });
testInfo.annotations.push({
type: 'surface',
description: 'Cowork daemon respawn',
});
skipUnlessRow(testInfo, ['KDE-W', 'GNOME-W', 'Ubu-W', 'KDE-X', 'GNOME-X']);
await testInfo.attach('session-env', {
body: JSON.stringify(captureSessionEnv(), null, 2),
contentType: 'application/json',
});
// Baseline — launchClaude's cleanupPreLaunch (lib/electron.ts:160-191)
// pkills any leftover cowork daemon before spawning, so a stray
// pid here would mean the cleanup itself is broken.
const baselinePids = await pgrepPids(PGREP_PATTERN);
await testInfo.attach('baseline-pids', {
body: JSON.stringify(
{
pids: Array.from(baselinePids),
note:
'cleanupPreLaunch should leave this empty before launch. ' +
'Non-empty here is a bug in lib/electron.ts:160-191.',
},
null,
2,
),
contentType: 'application/json',
});
const useHostConfig = process.env.CLAUDE_TEST_USE_HOST_CONFIG === '1';
const app = await launchClaude({
isolation: useHostConfig ? null : undefined,
});
let daemonPid: number | null = null;
try {
// userLoaded — main shell up AND the renderer has navigated
// to a post-login URL. The boot-time daemon spawn happens
// well before this (cowork.sh:262-362 gates on early renderer
// activity), but Phase 3's `window['claude.web'].ClaudeVM`
// invocation requires the renderer to be on a post-login URL
// where the eipc wrapper is exposed. Pre-login pages don't
// expose `claude.web`, so RPC attempts get "Cannot find
// context with specified id" errors. Waiting for userLoaded
// once at the top guarantees the wrapper is reachable.
const { inspector } = await app.waitForReady('userLoaded');
// Phase 1: capture the original daemon pid. Same 15s window
// as H04 — if the daemon never spawned in the first place,
// there's nothing to kill, so skip with the same reason.
const spawnStart = Date.now();
while (Date.now() - spawnStart < 15_000) {
const pids = await pgrepPids(PGREP_PATTERN);
const newPids = Array.from(pids).filter(
(p) => !baselinePids.has(p),
);
if (newPids.length > 0) {
daemonPid = newPids[0]!;
break;
}
await sleep(500);
}
if (daemonPid === null) {
await testInfo.attach('skip-reason', {
body: JSON.stringify(
{
reason:
'cowork daemon not spawned within 15s of userLoaded',
note:
'Auto-launch in cowork.sh:262-362 is gated on a VM ' +
'service connection attempt from the renderer; on a ' +
'passive launch with no Cowork-tab interaction it may ' +
'legitimately not fire. Without an initial spawn there ' +
'is no daemon to kill, so the respawn assertion is ' +
'unreachable. Same skip path as H04.',
},
null,
2,
),
contentType: 'application/json',
});
testInfo.skip(
true,
'cowork daemon not spawned by this build — gating in ' +
'cowork.sh:262-362 may have suppressed it on a passive launch',
);
return;
}
const originalSpawnElapsedMs = Date.now() - spawnStart;
await testInfo.attach('original-spawn', {
body: JSON.stringify(
{
pid: daemonPid,
elapsedMs: originalSpawnElapsedMs,
},
null,
2,
),
contentType: 'application/json',
});
// Phase 2: SIGKILL the daemon. Try direct process.kill first;
// the daemon is forked by the Electron main process under the
// same uid as the test runner, so this should not need root.
// Shell-out fallback covers the unlikely case where direct
// kill fails (e.g. EPERM on a misconfigured runner).
const killTs = Date.now();
let killMethod = 'process.kill';
try {
process.kill(daemonPid, 'SIGKILL');
} catch (err) {
killMethod = 'execFile-kill-9';
await exec('kill', ['-9', String(daemonPid)], { timeout: 5_000 });
}
await testInfo.attach('kill', {
body: JSON.stringify(
{
killedPid: daemonPid,
killMethod,
killedAt: new Date(killTs).toISOString(),
},
null,
2,
),
contentType: 'application/json',
});
// Phase 3: drive the retry loop and poll for a NEW pid. The
// cooldown in cowork.sh:329-332 is 10s, so the new pid can't
// arrive earlier than 10s past the original `_lastSpawn`. The
// 30s budget gives 10s of cooldown headroom plus 20s for the
// renderer context to recover from any post-kill navigation
// (the dead VM service can trigger a re-render that throws
// "Cannot find context with specified id" on RPCs in flight),
// plus the fork + bind + exec round-trip for the new daemon.
//
// Each poll iteration: (1) fire the ClaudeVM getRunningStatus
// handler from main (best-effort; expect throws on post-kill
// navigations and on the first attempts before the cooldown
// gate opens) and (2) read globalThis.__coworkDaemonPid
// (set by the patched fork code after every successful spawn).
// pgrep is the cross-check.
const respawnStart = Date.now();
let respawnPid: number | null = null;
let rpcAttempts = 0;
let rpcFailures = 0;
let lastRpcError: string | null = null;
while (Date.now() - respawnStart < 30_000) {
// Drive a daemon RPC by invoking the eipc handler from
// MAIN directly. The renderer-wrapper path
// (window['claude.web'].ClaudeVM.getRunningStatus) is
// unreliable here because the dead VM service triggers
// a renderer re-render that throws "Cannot find context
// with specified id" on most calls. Calling the handler
// from main bypasses the renderer entirely; the handler
// internally goes through zI()/VsA()/kUe(), the latter
// of which sees ECONNREFUSED/ENOENT and hits the
// cooldown-gated fork. We forge a senderFrame.url to
// satisfy any origin-gated handlers (claude.web scope).
rpcAttempts++;
try {
await inspector.evalInMain(`
const { webContents } = process.mainModule.require('electron');
const wc = webContents.getAllWebContents().find(w => {
try { return w.getURL().includes('claude.ai'); }
catch { return false; }
});
if (!wc) return null;
const handlers = wc.ipc && wc.ipc._invokeHandlers;
if (!handlers || typeof handlers.keys !== 'function') return null;
const channel = Array.from(handlers.keys())
.find(k => k.endsWith('_$_ClaudeVM_$_getRunningStatus'));
if (!channel) return null;
const handler = handlers.get(channel);
if (typeof handler !== 'function') return null;
const fakeEvent = {
senderFrame: { url: 'https://claude.ai/' },
sender: wc,
};
try { await handler(fakeEvent); } catch (e) { /* expected */ }
return null;
`);
} catch (err) {
rpcFailures++;
lastRpcError = err instanceof Error ? err.message : String(err);
}
// Primary signal: the global pid changed.
let currentGlobalPid: number | null = null;
try {
currentGlobalPid = await inspector.evalInMain<number | null>(
`return globalThis.__coworkDaemonPid ?? null;`,
);
} catch {
// inspector momentarily unavailable — keep polling
}
if (
currentGlobalPid !== null &&
currentGlobalPid !== daemonPid &&
!baselinePids.has(currentGlobalPid)
) {
respawnPid = currentGlobalPid;
break;
}
// Cross-check via pgrep (covers the corner where the global
// is set but pgrep hasn't observed the new pid yet, or the
// global never gets updated for some reason).
const pids = await pgrepPids(PGREP_PATTERN);
const candidates = Array.from(pids).filter(
(p) => !baselinePids.has(p) && p !== daemonPid,
);
if (candidates.length > 0) {
respawnPid = candidates[0]!;
break;
}
await sleep(500);
}
const respawnElapsedMs = Date.now() - respawnStart;
if (respawnPid === null) {
const finalPids = await pgrepPids(PGREP_PATTERN);
let finalGlobalPid: number | null = null;
try {
finalGlobalPid = await inspector.evalInMain<number | null>(
`return globalThis.__coworkDaemonPid ?? null;`,
);
} catch {
// best-effort
}
await testInfo.attach('respawn-failure', {
body: JSON.stringify(
{
killedPid: daemonPid,
pgrepFinal: Array.from(finalPids),
globalDaemonPidFinal: finalGlobalPid,
rpcAttempts,
rpcFailures,
lastRpcError,
elapsedMs: respawnElapsedMs,
note:
'No new cowork-vm-service pid observed within 30s ' +
'of SIGKILL despite firing ClaudeVM.getRunningStatus ' +
'each iteration. Cooldown in cowork.sh:329-332 is 10s. ' +
'Possible regressions: cooldown reverted to a one-shot ' +
'boolean (issue #408), the retry loop no longer enters ' +
'the auto-launch branch on ECONNREFUSED/ENOENT, the ' +
'patched fork no longer assigns __coworkDaemonPid, or ' +
'ClaudeVM eipc no longer routes through the daemon ' +
'RPC (the trigger surface).',
},
null,
2,
),
contentType: 'application/json',
});
} else {
await testInfo.attach('respawn', {
body: JSON.stringify(
{
originalPid: daemonPid,
respawnPid,
rpcAttempts,
rpcFailures,
elapsedMs: respawnElapsedMs,
},
null,
2,
),
contentType: 'application/json',
});
}
expect(
respawnPid,
'cowork-vm-service respawns within 30s of SIGKILL',
).not.toBeNull();
expect(
respawnPid,
'respawn pid is distinct from the killed pid',
).not.toBe(daemonPid);
} finally {
await app.close();
// Best-effort cleanup confirmation. If anything still matches
// PGREP_PATTERN after close, attach it for diagnosis but don't
// fail — H04 is the runner that asserts the cleanup contract.
await sleep(2_000);
const postExitPids = await pgrepPids(PGREP_PATTERN);
const lingering = Array.from(postExitPids).filter(
(p) => !baselinePids.has(p),
);
await testInfo.attach('post-exit-pgrep', {
body: JSON.stringify(
{
baseline: Array.from(baselinePids),
postExit: Array.from(postExitPids),
lingering,
note:
'Informational. H04 owns the cleanup-after-close ' +
'assertion; this attachment is for cross-referencing ' +
'when respawn passes but cleanup regresses elsewhere.',
},
null,
2,
),
contentType: 'application/json',
});
}
});
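// Sketch, not the patched code: the difference between the one-shot
// boolean and the timestamp cooldown described in the T10 header.
// OneShotGate, CooldownGate and tryAcquire are hypothetical names; only
// the 10s window and the `_lastSpawn` idea come from the comments above.

```typescript
const COOLDOWN_MS = 10_000; // window quoted in the T10 comments

// One-shot gate: allows exactly one spawn for the process lifetime,
// so a daemon that dies mid-session is never re-forked.
class OneShotGate {
  private spawned = false;

  tryAcquire(): boolean {
    if (this.spawned) return false;
    this.spawned = true;
    return true;
  }
}

// Timestamp gate: allows a spawn whenever the cooldown has elapsed
// since the last one, so the retry loop can re-fork after a crash.
class CooldownGate {
  private lastSpawn = Number.NEGATIVE_INFINITY;
  private readonly cooldownMs: number;

  constructor(cooldownMs: number = COOLDOWN_MS) {
    this.cooldownMs = cooldownMs;
  }

  tryAcquire(now: number): boolean {
    if (now - this.lastSpawn < this.cooldownMs) return false;
    this.lastSpawn = now;
    return true;
  }
}
```

// Under these assumptions, a retry that arrives inside the window is
// refused and a later retry succeeds, which is why the test's poll
// budget has to be longer than the cooldown.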

@@ -0,0 +1,136 @@
import { test, expect } from '@playwright/test';
import { readAsarFile, resolveAsarPath } from '../lib/asar.js';
// T11 — Plugin install (Anthropic & Partners), file-level fingerprint.
//
// The full T11 case (Code-tab → + → Plugins → Add plugin → Install →
// landed under Manage plugins → re-install idempotent) is Tier 3:
// the install handshake hits the Anthropic API, which requires a
// signed-in claude.ai. Until that end-to-end runner exists, this
// spec is the cheap "install code path is wired into the bundle"
// signal — if these strings are missing, the upstream rename or
// refactor that removed them would silently break the Tier 3 flow
// the moment it gets written, and a build that ships without the
// install plumbing would pass the rest of the harness with zero
// indication anything is wrong.
//
// Two fingerprints, both pinned to STRING LITERALS the install code
// path itself emits/uses (not strings the path matches against):
//
// 1. `[CustomPlugins] installPlugin: attempting remote API install`
// — the log line emitted at index.js:507193 when the gate
// accepts and the remote-API branch fires (see case-doc T11
// Code anchors and docs/learnings/plugin-install.md "Install
// Gate"). If this string disappears, either the log was
// removed or the whole `installPlugin` IPC handler was
// restructured — both cases drift far enough from current
// behavior that the click-chain Tier 3 spec needs revisiting.
//
// 2. `installed_plugins.json` — the per-user idempotency record
// written under `dx()` (index.js:465822). T11's "re-install is
// idempotent" expectation rides on this file's read/write.
// It is also load-bearing for S27 (per-user storage): its absence
// from the bundle would mean the plumbing for both T11 and S27
// moved.
//
// Pure file probe — no app launch. Fast (<1s). Row-independent
// (the install code path is in the bundle regardless of desktop
// environment).
interface FingerprintEntry {
fingerprint: string;
file: string;
// Why this string is load-bearing for T11 — surfaced in the
// attached manifest so a future failure ties straight to the
// case-doc anchor that introduced it.
source: string;
}
const FINGERPRINTS: FingerprintEntry[] = [
{
fingerprint:
'[CustomPlugins] installPlugin: attempting remote API install',
file: '.vite/build/index.js',
source:
'index.js:507193 — log line on the remote-API branch of ' +
"the installPlugin gate (pluginSource === 'remote'). Case-doc " +
'T11 Code anchors; docs/learnings/plugin-install.md "Install Gate".',
},
{
fingerprint: 'installed_plugins.json',
file: '.vite/build/index.js',
source:
'index.js:465822 — per-user idempotency record under dx() ' +
"(`~/.claude/plugins/`). T11 step 4 ('re-install idempotent') " +
'and S27 (per-user storage) both ride on this path.',
},
];
test('T11 — Plugin install code path is wired (file probe)', async ({}, testInfo) => {
testInfo.annotations.push({ type: 'severity', description: 'Critical' });
testInfo.annotations.push({
type: 'surface',
description: 'Plugin install / extensibility',
});
// Applies to all rows — fingerprints are in the bundle,
// row-independent. Login-required end-to-end coverage of T11
// (gate → API → Manage plugins → idempotent re-install) lives
// in a Tier 3 follow-up; this is the cheap drift sentinel.
const asarPath = resolveAsarPath();
await testInfo.attach('asar-path', {
body: asarPath,
contentType: 'text/plain',
});
// Read each unique file once, then check fingerprints against
// the cached contents. Mirrors H03's manifest shape so future
// additions slot in without restructuring.
const fileCache = new Map<string, string>();
const results: {
fingerprint: string;
file: string;
source: string;
found: boolean;
}[] = [];
for (const entry of FINGERPRINTS) {
let contents = fileCache.get(entry.file);
if (contents === undefined) {
try {
contents = readAsarFile(entry.file, asarPath);
fileCache.set(entry.file, contents);
} catch (err) {
results.push({
fingerprint: entry.fingerprint,
file: entry.file,
source:
entry.source +
' [READ ERROR: ' +
(err instanceof Error ? err.message : String(err)) +
']',
found: false,
});
continue;
}
}
results.push({
fingerprint: entry.fingerprint,
file: entry.file,
source: entry.source,
found: contents.includes(entry.fingerprint),
});
}
await testInfo.attach('plugin-install-fingerprints', {
body: JSON.stringify(results, null, 2),
contentType: 'application/json',
});
const missing = results.filter((r) => !r.found);
expect(
missing,
'every plugin-install fingerprint is present in the bundled app.asar',
).toEqual([]);
});

@@ -0,0 +1,272 @@
import { test, expect } from '@playwright/test';
import { launchClaude } from '../lib/electron.js';
import { createIsolation, type Isolation } from '../lib/isolation.js';
import { captureSessionEnv } from '../lib/diagnostics.js';
import { invokeEipcChannel, waitForEipcChannels } from '../lib/eipc.js';
// T11 — Plugin install (Anthropic & Partners) IPC surface registered +
// install-flow read-side handlers invocable (Tier 2 reframe of the
// case-doc claim "Click Install → Anthropic & Partners plugin lands in
// Manage plugins → re-install is idempotent").
//
// Sibling to T11_plugin_install_fingerprint.spec.ts — the Tier 1 spec
// asserts the install code path's two case-doc string literals are in
// the bundle (`[CustomPlugins] installPlugin: attempting remote API
// install` at index.js:507193 and `installed_plugins.json` at :465822).
// This Tier 2 spec promotes from "the install code is in the bundle" to
// "the install handlers register at runtime AND the read-sides that
// drive Manage plugins / idempotency-record return the documented
// shapes". A half-applied refactor where the bundle still contains the
// strings but the handlers no longer register / the impl object is
// missing methods would pass Tier 1 and fail Tier 2.
//
// Backs T11 in docs/testing/cases/extensibility.md ("Plugin install
// (Anthropic & Partners)"). Session 7's per-interface registry walk
// listed CustomPlugins (16 methods) and LocalPlugins (15 methods) on
// the claude.ai webContents. Session 12's smoke-test against the
// debugger-attached running Claude confirmed:
// - `CustomPlugins.listInstalledPlugins(egressAllowedDomains)`
// accepts `[[]]` (empty allow-list) and returns `Array<…>` (length
// 0 on dev box's host config — no plugins installed).
// - `LocalPlugins.getPlugins()` accepts `[]` and returns `Array<…>`
// (length 0 on dev box — `~/.claude/plugins/installed_plugins.json`
// absent or empty). Same arg-validator-empty pattern as T19/T20's
// `LocalSessions.getAll`.
//
// Why both layers — registration AND invocation
// ---------------------------------------------
// Registration of the 5 install-flow suffixes proves the lifecycle is
// wired (install / uninstall / update + the two read-sides that drive
// the UX). Invocation of `listInstalledPlugins` (the CustomPlugins-
// side "Manage plugins" reader) and `getPlugins` (the LocalPlugins-
// side `~/.claude/plugins/installed_plugins.json` reader) proves both
// halves of the install flow's read-sides are reachable through the
// renderer wrapper and return arrays. Dual-invocation across two
// distinct interfaces (CustomPlugins + LocalPlugins) gives strictly
// stronger coverage than the single-interface T21 / T33c pattern —
// proves the install plumbing crosses both impl objects intact.
//
// Why these 5 registration suffixes
// ---------------------------------
// The plugin install case-doc maps to:
// 1. Click "Install" → `CustomPlugins.installPlugin` (case-doc
// anchor :507181, primary write-side).
// 2. "Lands in Manage plugins" → `CustomPlugins.listInstalledPlugins`
// (read-side, what populates the Manage plugins panel).
// 3. "Re-install is idempotent" → `installPlugin` again, with the
// idempotency mechanism backed by `LocalPlugins.getPlugins`
// reading `~/.claude/plugins/installed_plugins.json` (case-doc
// anchor :465822 + :465816).
// Plus the install-lifecycle complements `uninstallPlugin` and
// `updatePlugin` for register-only drift coverage — a build that ships
// `installPlugin` without its lifecycle siblings would be a half-
// applied refactor, and registration probes are cheap. All five must
// register; partial registration breaks the case-doc claim.
//
// Why these 2 invocation targets
// ------------------------------
// Both `CustomPlugins.listInstalledPlugins(egressAllowedDomains) →
// Array<Plugin>` and `LocalPlugins.getPlugins() → Array<Plugin>` are
// pure read-side handlers — no fs writes, no network egress, no
// process spawn. The empty `egressAllowedDomains = []` arg follows
// T33c's pattern (the safety property is that the empty allow-list
// blocks all network access if the underlying impl shells out to the
// CLI — for `listInstalledPlugins` the local-only path is used and
// the allow-list is effectively a no-op). `getPlugins` takes no args
// and reads `~/.claude/plugins/` directly. Mixed-arg-shape dual
// invocation is fine — same pattern as T21 (one handler takes a `cwd`
// string, another doesn't).
//
// Read-only by design — neither handler mutates user state. Dev-box
// observation: both return empty arrays (no plugins installed on the
// harness's `~/.claude/plugins/` tree).
//
// Skip semantics
// --------------
// `seedFromHost: true` is required — without a signed-in claude.ai,
// the renderer never reaches claude.ai origin and the CustomPlugins /
// LocalPlugins wrappers aren't exposed (mirrors T19 / T20 / T21 /
// T22b / T31b / T33b / T33c / T35b / T37b / T38b pattern).
test.setTimeout(90_000);
const EXPECTED_SUFFIXES = [
// case-doc anchor :507181 — primary install write-side
'CustomPlugins_$_installPlugin',
// install-lifecycle complement
'CustomPlugins_$_uninstallPlugin',
// install-lifecycle complement (re-install vs update path)
'CustomPlugins_$_updatePlugin',
// T11 step 3 — "lands in Manage plugins" read-side, also invoked
'CustomPlugins_$_listInstalledPlugins',
// T11 step 4 idempotency-record reader (case-doc :465822 / :465816),
// also invoked
'LocalPlugins_$_getPlugins',
] as const;
// `egressAllowedDomains` arg shape on CustomPlugins.listInstalledPlugins:
// positional `string[]` at position 0. Hand-rolled validator (NOT Zod)
// per session 9's CustomPlugins finding — `Array.isArray(r) && r.every(a
// => typeof a === "string")`. Empty array passes; the impl forwards the
// allow-list to any spawned subprocess, so `[]` is the "block all
// network egress" path. Smoke-tested session 12 against debugger-
// attached running Claude.
const LIST_INSTALLED_ARGS = [[]] as const;
test('T11 — Plugin install IPC surface + install-flow read-sides invocable', async (
{},
testInfo,
) => {
testInfo.annotations.push({ type: 'severity', description: 'Critical' });
testInfo.annotations.push({
type: 'surface',
description:
'Plugin install / extensibility (eipc registration + invocation)',
});
await testInfo.attach('session-env', {
body: JSON.stringify(captureSessionEnv(), null, 2),
contentType: 'application/json',
});
let isolation: Isolation;
try {
isolation = await createIsolation({ seedFromHost: true });
} catch (err) {
const msg = err instanceof Error ? err.message : String(err);
test.skip(true, `seedFromHost unavailable: ${msg}`);
return;
}
const app = await launchClaude({ isolation });
try {
const ready = await app.waitForReady('userLoaded');
await testInfo.attach('claude-ai-url', {
body: ready.claudeAiUrl ?? '(no claude.ai webContents observed)',
contentType: 'text/plain',
});
if (!ready.postLoginUrl) {
test.skip(
true,
'seeded auth did not reach post-login URL — host config ' +
'may be stale (signed out, expired session, etc.)',
);
return;
}
await testInfo.attach('post-login-url', {
body: ready.postLoginUrl,
contentType: 'text/plain',
});
const resolved = await waitForEipcChannels(
ready.inspector,
EXPECTED_SUFFIXES,
);
// Invoke `CustomPlugins.listInstalledPlugins` — array of
// installed plugins (CustomPlugins side, drives the Manage
// plugins panel). Plugin entries may include user-account-scoped
// metadata (workspace paths, plugin IDs that reveal org-internal
// marketplace pointers when the user is in an org); log length
// only, never bodies (mirrors T19/T20/T21/T33c/T37b's defensive
// default).
let listInstalledShape = 'not-invoked';
let listInstalledLength: number | null = null;
const listInstalledResult = await invokeEipcChannel<unknown>(
ready.inspector,
'CustomPlugins_$_listInstalledPlugins',
LIST_INSTALLED_ARGS,
);
if (Array.isArray(listInstalledResult)) {
listInstalledShape = `array(length=${listInstalledResult.length})`;
listInstalledLength = listInstalledResult.length;
} else if (listInstalledResult === null) {
listInstalledShape = 'null';
} else {
listInstalledShape = typeof listInstalledResult;
}
// Invoke `LocalPlugins.getPlugins` — array of locally-known
// plugins (LocalPlugins side, reads
// `~/.claude/plugins/installed_plugins.json` which is the
// idempotency record per case-doc anchor :465822). Same length-
// only logging.
let getPluginsShape = 'not-invoked';
let getPluginsLength: number | null = null;
const getPluginsResult = await invokeEipcChannel<unknown>(
ready.inspector,
'LocalPlugins_$_getPlugins',
[],
);
if (Array.isArray(getPluginsResult)) {
getPluginsShape = `array(length=${getPluginsResult.length})`;
getPluginsLength = getPluginsResult.length;
} else if (getPluginsResult === null) {
getPluginsShape = 'null';
} else {
getPluginsShape = typeof getPluginsResult;
}
const registration: Record<string, unknown> = {};
for (const suffix of EXPECTED_SUFFIXES) {
registration[suffix] = resolved.get(suffix) ?? null;
}
await testInfo.attach('t11-runtime', {
body: JSON.stringify(
{
expectedRegistrationSuffixes: EXPECTED_SUFFIXES,
registration,
invocations: [
{
suffix: 'CustomPlugins_$_listInstalledPlugins',
args: LIST_INSTALLED_ARGS,
responseShape: listInstalledShape,
responseLength: listInstalledLength,
},
{
suffix: 'LocalPlugins_$_getPlugins',
args: [],
responseShape: getPluginsShape,
responseLength: getPluginsLength,
},
],
},
null,
2,
),
contentType: 'application/json',
});
for (const suffix of EXPECTED_SUFFIXES) {
expect(
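// Map.get returns undefined for an absent key, which would pass
// not.toBeNull(); normalise to null so a missing channel fails.
resolved.get(suffix) ?? null,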
`[T11] eipc channel ending in '${suffix}' is registered on ` +
'the claude.ai webContents — load-bearing for the plugin ' +
'install flow (case-doc anchors index.js:507181 / :465816 / ' +
':465822)',
).not.toBeNull();
}
expect(
Array.isArray(listInstalledResult),
`[T11] CustomPlugins/listInstalledPlugins response is an array ` +
`(got ${listInstalledShape}) — drives the Manage plugins ` +
'panel readout; an array result (empty or non-empty) proves ' +
'the CustomPlugins impl object is reachable through the ' +
'renderer wrapper and the install-side listing endpoint is wired',
).toBe(true);
expect(
Array.isArray(getPluginsResult),
`[T11] LocalPlugins/getPlugins response is an array ` +
`(got ${getPluginsShape}) — reads the local plugin tree ` +
'(`~/.claude/plugins/installed_plugins.json` per case-doc ' +
':465822); an array result proves the LocalPlugins impl ' +
'object is reachable and the idempotency-record read path ' +
'is wired',
).toBe(true);
} finally {
await app.close();
}
});
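// Sketch, not harness code: how suffix-based channel resolution can
// report a missing registration as an explicit null. resolveBySuffix
// and missingSuffixes are hypothetical helpers, and this does not claim
// to match what waitForEipcChannels does internally. The channel
// prefix used in any example input is made up.

```typescript
// Map each expected suffix to the first registered channel that ends
// with it, or to null when nothing matches.
function resolveBySuffix(
  registered: readonly string[],
  suffixes: readonly string[],
): Map<string, string | null> {
  const resolved = new Map<string, string | null>();
  for (const suffix of suffixes) {
    const match = registered.find((channel) => channel.endsWith(suffix));
    resolved.set(suffix, match ?? null);
  }
  return resolved;
}

// List the suffixes that resolved to null, i.e. the handlers that
// never registered.
function missingSuffixes(resolved: Map<string, string | null>): string[] {
  return Array.from(resolved.entries())
    .filter(([, channel]) => channel === null)
    .map(([suffix]) => suffix);
}
```

// Storing null for every unmatched suffix keeps partial registration
// visible in both the attached JSON and the per-suffix assertion.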

@@ -0,0 +1,100 @@
import { test, expect } from '@playwright/test';
import { launchClaude } from '../lib/electron.js';
import { captureSessionEnv } from '../lib/diagnostics.js';
// T12 — WebGL warn-only on Linux: GPU acceleration may be limited
// (virtio-gpu in VMs, hybrid-GPU laptops, blocklisted drivers) but
// the app must still launch and render the main UI without
// crashing. Per the case-doc, the chrome://gpu page is informational
// — there's no hard "enabled" requirement, just "doesn't crash and
// no feature breaks".
//
// The case-doc steps point a human at chrome://gpu via DevTools.
// Automating chrome:// navigation against a live BrowserView is
// blocked by Electron's chrome-scheme guard, so this runner does the
// equivalent capture from the main process via
// `app.getGPUFeatureStatus()` (and `app.getGPUInfo('basic')` for
// vendor/renderer breadcrumbs). The hard signal is "we got past
// waitForReady('mainVisible') and read the status without the
// renderer dying"; the JSON capture is the matrix-regen artifact.
//
// Code anchors driving the assertion shape:
// - index.js:524809 — upstream gates `disableHardwareAcceleration`
// on a user toggle, never passes `--ignore-gpu-blocklist` /
// `--use-gl=*`, so chrome://gpu reflects Chromium's stock
// blocklist behaviour.
// - index.js:500571 — only `webgl:!1` override is scoped to the
// in-memory feedback popup; main UI does not disable WebGL.
//
// Applies to all rows. No skipUnlessRow gate.
// The default 60s test timeout is shorter than
// waitForReady('mainVisible')'s 90s budget, so raise it with some
// margin. Cold-start GPU initialisation on virtio-gpu /
// blocklisted-driver rows is the reason that budget exists.
test.setTimeout(120_000);
test('T12 — GPU feature status captured, no crash', async ({}, testInfo) => {
testInfo.annotations.push({ type: 'severity', description: 'Could' });
testInfo.annotations.push({
type: 'surface',
description: 'Platform integration / GPU',
});
await testInfo.attach('session-env', {
body: JSON.stringify(captureSessionEnv(), null, 2),
contentType: 'application/json',
});
const app = await launchClaude();
try {
// 'mainVisible' rather than 'window' because the load-bearing
// claim is "main UI rendered" — if the GPU stack were broken
// hard enough to block compositing, MainWindow.getState()
// wouldn't report visible:true and we'd fail here, before
// the GPU probe runs.
const { inspector } = await app.waitForReady('mainVisible');
const gpuStatus = await inspector.evalInMain<Record<string, string>>(`
const { app } = process.mainModule.require('electron');
return app.getGPUFeatureStatus();
`);
await testInfo.attach('gpu-feature-status', {
body: JSON.stringify(gpuStatus, null, 2),
contentType: 'application/json',
});
// `getGPUInfo('basic')` is async and returns vendor / device /
// driver fields. 'complete' is much heavier (full Chromium
// GPU diagnostic dump) and not needed for the matrix
// breadcrumb — 'basic' is the documented default for
// per-row capture.
const gpuInfo = await inspector.evalInMain<unknown>(`
const { app } = process.mainModule.require('electron');
return await app.getGPUInfo('basic');
`);
await testInfo.attach('gpu-info-basic', {
body: JSON.stringify(gpuInfo, null, 2),
contentType: 'application/json',
});
// Sanity assertion: `getGPUFeatureStatus()` returned a populated
// object. An empty result would mean the API itself broke (a
// real regression worth catching), distinct from any individual
// feature being blocklisted (which the case-doc explicitly
// allows on VM / hybrid-GPU rows).
//
// We deliberately do NOT assert any specific feature key is
// 'enabled' — case-doc T12 calls out that webgl/webgl2 may
// report blocklisted on virtio-gpu and hybrid GPUs and that's
// expected. Reaching this line at all means waitForReady
// already proved the renderer is alive; the JSON capture is
// the load-bearing artifact for matrix regen.
expect(
Object.keys(gpuStatus).length,
'app.getGPUFeatureStatus() returned a populated object',
).toBeGreaterThan(0);
} finally {
await app.close();
}
});
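// Sketch, not harness code: one way to split the T12 result into the
// hard signal (status object populated) and the warn-only part
// (features not enabled). summarizeGpuStatus is a hypothetical helper;
// treating every value that starts with "enabled" as enabled is an
// assumption about the status strings, not something the runner checks.

```typescript
type GpuFeatureStatus = Record<string, string>;

interface GpuSummary {
  populated: boolean; // hard signal: the API returned something
  notEnabled: string[]; // warn-only: features reported as not enabled
}

function summarizeGpuStatus(status: GpuFeatureStatus): GpuSummary {
  const keys = Object.keys(status);
  return {
    populated: keys.length > 0,
    notEnabled: keys
      .filter((key) => !(status[key] ?? '').startsWith('enabled'))
      .sort(),
  };
}
```

// Only `populated` would feed an assertion; `notEnabled` is the kind
// of list that belongs in an attachment for matrix regeneration.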

@@ -0,0 +1,218 @@
import { test, expect } from '@playwright/test';
import { execFile } from 'node:child_process';
import { promisify } from 'node:util';
import {
runDoctor,
captureSessionEnv,
} from '../lib/diagnostics.js';
const exec = promisify(execFile);
// T13 — Doctor reports correct package format.
//
// Per docs/testing/cases/launch.md T13 (mirror surface: S05 in
// distribution.md): on RPM-based distros, `claude-desktop --doctor`
// must NOT print `not found via dpkg (AppImage?)` for a copy that
// rpm owns. The doctor script's install-method probe is dpkg-only
// (scripts/doctor.sh — the `command -v dpkg-query` block around the
// `Installed version:` PASS / `not found via dpkg (AppImage?)` WARN
// emit; case-doc anchors that as :290-299 but the actual lines are
// :353-360 in the version of doctor.sh checked at runner-write time
// — see case-doc anchor drift note in the report). There is no
// corresponding `rpm -qf` / `rpm -q claude-desktop` branch, so a
// dnf-installed copy on a host that also has `dpkg-query` available
// will false-flag.
//
// Applies to all rows in principle, but the assertion only has
// signal when we can (a) reach `claude-desktop` on PATH and (b)
// detect an actual install method. AppImage rows and rows where the
// launcher isn't reachable get cleanly skipped — the case-doc says
// no skipUnlessRow(), but install-method-undetectable is its own
// skip condition.
//
// Layer: spawn probe + stdout grep. We shell out to `which`, then
// `rpm -qf` and `dpkg -S` against the resolved path — whichever
// returns 0 is the install method. If both fail the binary is not
// package-managed (treat as AppImage / hand-built; skip). If both
// succeed (mixed Debian + RPM tooling host), we still treat it as
// rpm-owned for the assertion shape: the warning we're guarding
// against is "false-flag as AppImage", which can only fire when
// dpkg returns empty.
const FALSE_FLAG_FRAGMENT =
'not found via dpkg (AppImage?)';
interface ProbeResult {
cmd: string;
exitCode: number | null;
stdout: string;
stderr: string;
}
async function probe(
bin: string,
args: string[],
): Promise<ProbeResult> {
const cmd = `${bin} ${args.join(' ')}`;
try {
const { stdout, stderr } = await exec(bin, args, {
timeout: 5_000,
});
return {
cmd,
exitCode: 0,
stdout: stdout.trim(),
stderr: stderr.trim(),
};
} catch (err) {
const e = err as {
stdout?: string;
stderr?: string;
code?: number;
};
return {
cmd,
exitCode: typeof e.code === 'number' ? e.code : null,
stdout: (e.stdout ?? '').trim(),
stderr: (e.stderr ?? '').trim(),
};
}
}
function formatProbe(p: ProbeResult): string {
const tail = [
p.stdout && `stdout: ${p.stdout}`,
p.stderr && `stderr: ${p.stderr}`,
]
.filter(Boolean)
.join('\n');
return `$ ${p.cmd} (exit ${p.exitCode ?? '?'})\n${tail}`.trim();
}
type InstallMethod = 'rpm' | 'deb' | 'unknown';
test('T13 — Doctor identifies package format correctly', async (
{},
testInfo,
) => {
testInfo.annotations.push({
type: 'severity',
description: 'Should',
});
testInfo.annotations.push({
type: 'surface',
description: 'CLI / --doctor',
});
// Applies to all rows per case-doc — no skipUnlessRow().
await testInfo.attach('session-env', {
body: JSON.stringify(captureSessionEnv(), null, 2),
contentType: 'application/json',
});
const launcher =
process.env.CLAUDE_DESKTOP_LAUNCHER ?? 'claude-desktop';
const whichProbe = await probe('which', [launcher]);
await testInfo.attach('which-claude-desktop', {
body: formatProbe(whichProbe),
contentType: 'text/plain',
});
const installPath = whichProbe.stdout.split('\n')[0]?.trim() ?? '';
if (whichProbe.exitCode !== 0 || !installPath) {
// No claude-desktop on PATH (and CLAUDE_DESKTOP_LAUNCHER
// either unset or pointing somewhere `which` can't resolve).
// Without a real binary path we can't probe rpm/dpkg, so skip
// — runDoctor() would still spawn, but the assertion
// has no signal.
test.skip(
true,
`claude-desktop not reachable on PATH ` +
`(launcher='${launcher}'); install-method probe ` +
`needs a resolvable binary`,
);
return;
}
const rpmProbe = await probe('rpm', ['-qf', installPath]);
const dpkgProbe = await probe('dpkg', ['-S', installPath]);
await testInfo.attach('rpm-qf', {
body: formatProbe(rpmProbe),
contentType: 'text/plain',
});
await testInfo.attach('dpkg-S', {
body: formatProbe(dpkgProbe),
contentType: 'text/plain',
});
let method: InstallMethod;
if (rpmProbe.exitCode === 0) {
// rpm-owned. If dpkg-S also returned 0 (mixed-tooling host
// like a Fedora box with dpkg installed for cross-distro
// dev), we still assert the rpm shape — the false-flag
// warning can only fire when dpkg-query comes up empty for
// `claude-desktop`. If both tools claim ownership the
// assertion still passes against `not found via dpkg`,
// which is what the case-doc cares about.
method = 'rpm';
} else if (dpkgProbe.exitCode === 0) {
method = 'deb';
} else {
method = 'unknown';
}
await testInfo.attach('detected-install-method', {
body: method,
contentType: 'text/plain',
});
if (method === 'unknown') {
// Neither rpm nor dpkg owns the binary — AppImage extract,
// hand-built install, or symlink to a mounted AppImage.
// Doctor's dpkg-only probe has nothing to assert against
// here; the "package format" question doesn't apply.
test.skip(
true,
`install method undetectable: rpm -qf and dpkg -S both ` +
`returned non-zero against ${installPath} ` +
`(AppImage / hand-built / non-package-managed)`,
);
return;
}
const result = await runDoctor(launcher);
await testInfo.attach('doctor-output', {
body: result.output,
contentType: 'text/plain',
});
await testInfo.attach('doctor-exit-code', {
body: String(result.exitCode),
contentType: 'text/plain',
});
if (method === 'rpm') {
// Core T13 / S05 assertion. scripts/doctor.sh has no rpm
// branch, so on a Fedora row one of two things happens: the
// dpkg-only block is skipped (no install-method line is
// printed, and this assertion passes), or, on hosts with
// dpkg-query installed but no dpkg record for claude-desktop,
// the false-flag warning fires and this assertion fails. The
// latter is what we guard against: the warning's literal text
// must not appear.
expect(
result.output,
`doctor must not false-flag rpm install as AppImage ` +
`(output contained '${FALSE_FLAG_FRAGMENT}')`,
).not.toContain(FALSE_FLAG_FRAGMENT);
} else {
// method === 'deb'. The dpkg-query branch should have
// produced an `Installed version:` PASS, not the AppImage
// false-flag. We assert only that the WARN is absent; if
// doctor printed it despite dpkg owning the binary, that's
// the deb-side regression of the same bug.
expect(
result.output,
`doctor must not warn 'not found via dpkg' for a ` +
`dpkg-installed copy at ${installPath}`,
).not.toContain(FALSE_FLAG_FRAGMENT);
}
});
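
The install-method decision in the T13 spec above reduces to a small truth table over the two probe exit codes. A minimal sketch of that table as a pure function, assuming the same precedence the spec uses (rpm wins when both tools claim the path). `classifyInstallMethod` is a hypothetical name and is not part of the harness:

```typescript
type InstallMethod = 'rpm' | 'deb' | 'unknown';

// Hypothetical helper, not part of the harness. An exit code of
// `null` stands for "probe could not run" (tool missing or timed
// out), mirroring ProbeResult.exitCode in the spec.
function classifyInstallMethod(
  rpmExit: number | null,
  dpkgExit: number | null,
): InstallMethod {
  // rpm takes precedence on mixed-tooling hosts where both succeed.
  if (rpmExit === 0) return 'rpm';
  if (dpkgExit === 0) return 'deb';
  return 'unknown';
}
```

Keeping the decision pure like this would let the precedence rule be checked without `rpm` or `dpkg` installed on the host.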

@@ -0,0 +1,95 @@
import { test, expect } from '@playwright/test';
import { asarContains, resolveAsarPath } from '../lib/asar.js';
// T14a — Single-instance lock + second-instance listener wired
// (file probe).
//
// T14 in docs/testing/cases/launch.md covers multi-instance
// behavior: a second invocation of `claude-desktop` should focus
// the existing window rather than spawning a fresh process. The
// case-doc anchors point at two upstream sites in the bundled
// main process:
//
// build-reference/app-extracted/.vite/build/index.js:525162-525173
// hA.app.requestSingleInstanceLock()
// ? hA.app.on("second-instance", (A, t, i) => {
// ...
// ut.isVisible() || ut.show(),
// ut.isMinimized() && ut.restore(),
// ut.focus());
// })
// : hA.app.quit();
//
// build-reference/app-extracted/.vite/build/index.js:525204-525207
// hA.app.on("ready", async () => {
// ...
// if (!Zr && !hA.app.requestSingleInstanceLock()) {
// R.info("Not main instance, returning early from app ready");
// return;
// }
//
// T14 is split across two specs:
//
// - T14a (this file, Tier 1) — file-level fingerprint. Verifies
// `requestSingleInstanceLock` and the `'second-instance'`
// listener event name exist in the bundled JS. Cheap (<1s),
// row-independent, no app launch. Catches an upstream rename or
// a future patch accidentally stripping the gate.
//
// - T14b (Tier 2, lands separately) — runtime second-launch
// behavior assertion: spawn the app, spawn it again, verify no
// new pid appears and the existing window gets focus. Needs a
// real launch + window-state probe + pgrep delta, which is why
// it's deferred to a later tier.
//
// Pure file probe. Tag matches T14's case-doc severity: Critical.
test('T14a — Single-instance lock + second-instance listener wired (file probe)', async ({}, testInfo) => {
testInfo.annotations.push({ type: 'severity', description: 'Critical' });
testInfo.annotations.push({
type: 'surface',
description: 'App lifecycle / single instance',
});
const asarPath = resolveAsarPath();
await testInfo.attach('asar-path', {
body: asarPath,
contentType: 'text/plain',
});
// Both fingerprints live in `.vite/build/index.js`. Probe with
// `asarContains` against the same archive twice — @electron/asar
// reads are cheap enough that a per-call read keeps the assertion
// shape simple without needing to cache.
const lockCallPresent = asarContains(
'.vite/build/index.js',
'requestSingleInstanceLock',
asarPath,
);
const secondInstanceListenerPresent = asarContains(
'.vite/build/index.js',
'second-instance',
asarPath,
);
await testInfo.attach('fingerprints', {
body: JSON.stringify(
{
lockCallPresent,
secondInstanceListenerPresent,
},
null,
2,
),
contentType: 'application/json',
});
expect(
lockCallPresent,
'app.asar contains requestSingleInstanceLock() — single-instance gate wired',
).toBe(true);
expect(
secondInstanceListenerPresent,
"app.asar contains 'second-instance' listener event name",
).toBe(true);
});
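
The T14a file probe above comes down to substring tests against the bundled JS. A minimal sketch of that check over an in-memory string, assuming the bundle text has already been read; reading it out of `app.asar` is left to the harness's `asarContains`, whose implementation is not shown in this diff. `missingFingerprints` and `T14_FINGERPRINTS` are hypothetical names:

```typescript
// Hypothetical constant: the two fingerprints the T14a spec probes for.
const T14_FINGERPRINTS: readonly string[] = [
  'requestSingleInstanceLock',
  'second-instance',
];

// Hypothetical helper, not part of lib/asar.js. Returns every
// needle that does NOT occur in `source`, so an empty array means
// all fingerprints are present.
function missingFingerprints(
  source: string,
  needles: readonly string[],
): string[] {
  return needles.filter((needle) => !source.includes(needle));
}
```

Returning the missing names rather than a single boolean would let a failure message say which fingerprint disappeared.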

@@ -0,0 +1,228 @@
import { test, expect } from '@playwright/test';
import { spawn } from 'node:child_process';
import { existsSync } from 'node:fs';
import { dirname } from 'node:path';
import { launchClaude } from '../lib/electron.js';
// T14b — Second invocation exits and focuses existing window
// (runtime pair of T14a's file-probe).
//
// docs/testing/cases/launch.md T14 expects: when the app is
// already running and a second invocation happens, the second
// invocation exits and the existing window receives focus — no
// new pid stays alive. Code anchors at
// build-reference/app-extracted/.vite/build/index.js:525162-525173
// (`hA.app.requestSingleInstanceLock()` + `app.on('second-instance', ...)`)
// and :525204-525207 (early-return in `app.on('ready', ...)` when the
// lock is lost — this is the path the second spawn takes to exit).
//
// Shape: launch the app under per-test isolation, then spawn a
// SECOND Electron with the SAME isolation env so both procs collide
// on the same SingletonLock under <configHome>/Claude. The second
// spawn should call `app.requestSingleInstanceLock()`, lose, hit
// the early-return in the `ready` handler and exit on its own. We
// observe via exit(code, signal) on the second proc, then re-check
// the primary pid is still alive via /proc/<pid>. Window focus on
// the primary is not asserted here.
//
// Replicating the install-resolution logic inline (mirrors H01) keeps
// this spec independent of `launchClaude`'s internal spawn shape.
// We do NOT want to call `launchClaude()` for the second invocation —
// that would attach a second inspector, fight signal handlers, and
// register a second cleanup. Raw `spawn()` is the right primitive:
// observe the gate fire, then walk away.
const DEFAULT_INSTALL_PATHS: { electron: string; asar: string }[] = [
{
electron: '/usr/lib/claude-desktop/node_modules/electron/dist/electron',
asar: '/usr/lib/claude-desktop/node_modules/electron/dist/resources/app.asar',
},
{
electron: '/opt/Claude/node_modules/electron/dist/electron',
asar: '/opt/Claude/node_modules/electron/dist/resources/app.asar',
},
];
function resolveInstallInline(): { electron: string; asar: string } {
const envBin = process.env.CLAUDE_DESKTOP_ELECTRON;
const envAsar = process.env.CLAUDE_DESKTOP_APP_ASAR;
if (envBin && envAsar) return { electron: envBin, asar: envAsar };
for (const candidate of DEFAULT_INSTALL_PATHS) {
if (existsSync(candidate.electron) && existsSync(candidate.asar)) {
return candidate;
}
}
throw new Error(
'Could not locate claude-desktop install. Set CLAUDE_DESKTOP_ELECTRON ' +
'and CLAUDE_DESKTOP_APP_ASAR, or install the deb/rpm package.',
);
}
function pidAlive(pid: number): boolean {
// /proc/<pid> existence is the cheapest liveness check on Linux.
// `process.kill(pid, 0)` would also work but throws on ESRCH which
// makes the call site noisier for no benefit here.
return existsSync(`/proc/${pid}`);
}
test.setTimeout(60_000);
test('T14b — Second invocation exits and focuses existing window', async ({}, testInfo) => {
testInfo.annotations.push({ type: 'severity', description: 'Critical' });
testInfo.annotations.push({
type: 'surface',
description: 'App lifecycle / single instance',
});
const start = Date.now();
const app = await launchClaude();
const firstPid = app.pid;
// Capture the isolation env up front — `app.close()` cleans up the
// tmpdir, so we need a snapshot to drive the second spawn while the
// primary is still running. `isolation` is null only when the caller
// passed `isolation: null`; the default constructs a fresh handle.
if (!app.isolation) {
throw new Error(
'T14b expects launchClaude default isolation; ' +
'app.isolation is null. Did the harness defaults change?',
);
}
const isolationEnv = { ...app.isolation.env };
let secondPid: number | null = null;
let secondExitCode: number | null = null;
let secondSignal: NodeJS.Signals | null = null;
let secondTimedOut = false;
let firstAliveAfter = false;
try {
await app.waitForReady('mainVisible');
// Build the second-spawn argv + env. Mirror launchClaude()'s
// LAUNCHER_INJECTED_FLAGS_X11 / LAUNCHER_INJECTED_ENV
// (lib/electron.ts:123-146) so both procs look the same to the
// SingletonLock check. The only difference is that this one
// is started after the first holds the lock.
const { electron: electronBin, asar } = resolveInstallInline();
const appDir = dirname(dirname(dirname(dirname(electronBin))));
const argv = [
'--disable-features=CustomTitlebar',
'--ozone-platform=x11',
'--no-sandbox',
asar,
];
const env: Record<string, string> = {};
for (const [k, v] of Object.entries(process.env)) {
if (v !== undefined) env[k] = v;
}
// SAME isolation env as the running primary. SingletonLock lives
// under <configHome>/Claude — both procs must point there for
// requestSingleInstanceLock() to collide.
for (const [k, v] of Object.entries(isolationEnv)) {
env[k] = v;
}
env.ELECTRON_FORCE_IS_PACKAGED = 'true';
env.ELECTRON_USE_SYSTEM_TITLE_BAR = '1';
env.CI = '1';
const proc = spawn(electronBin, argv, {
cwd: appDir,
env,
stdio: 'ignore',
detached: false,
});
secondPid = proc.pid ?? null;
if (!secondPid) {
throw new Error('Failed to spawn second Electron — no pid');
}
// 10s budget. The second-instance early-return path
// (index.js:525204-525207) fires on `app.on('ready', ...)`, which
// lands well within Electron startup (~2-4s on a warm cache). If
// we blow past 10s the gate didn't fire, so kill hard and fail.
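let budgetTimer: ReturnType<typeof setTimeout> | undefined;
await Promise.race([
new Promise<void>((resolve) => {
proc.once('exit', (code, signal) => {
secondExitCode = code;
secondSignal = signal;
resolve();
});
}),
new Promise<void>((resolve) => {
budgetTimer = setTimeout(() => {
secondTimedOut = true;
resolve();
}, 10_000);
}),
]);
// Cancel the budget timer once the race settles. Left armed, it
// would still fire 10s after spawn and flip secondTimedOut to
// true after a clean exit, tripping the timeout check below.
clearTimeout(budgetTimer);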
if (secondTimedOut && proc.exitCode === null && proc.signalCode === null) {
// Gate didn't fire — kill the rogue second proc so we don't
// leave two Electrons fighting over the same userData dir.
proc.kill('SIGKILL');
await new Promise<void>((resolve) => {
proc.once('exit', () => resolve());
setTimeout(() => resolve(), 2_000);
});
}
firstAliveAfter = pidAlive(firstPid);
} finally {
await app.close();
}
const elapsedMs = Date.now() - start;
await testInfo.attach('pids', {
body: JSON.stringify(
{
firstPid,
secondPid,
firstAliveAfterSecondSpawn: firstAliveAfter,
},
null,
2,
),
contentType: 'application/json',
});
await testInfo.attach('second-spawn-exit', {
body: JSON.stringify(
{
exitCode: secondExitCode,
signalCode: secondSignal,
timedOut: secondTimedOut,
elapsedMs,
note:
'Second instance is expected to exit on its own via the ' +
'early-return path in app.on("ready") at ' +
'build-reference/app-extracted/.vite/build/index.js:525204-525207 ' +
'when requestSingleInstanceLock() loses to the primary. ' +
'timedOut=true means the gate did not fire — second-instance ' +
'wiring may be broken.',
},
null,
2,
),
contentType: 'application/json',
});
if (secondTimedOut) {
throw new Error(
'Second-instance gate did not fire within 10s — second Electron ' +
'stayed alive under the same isolation as the primary. ' +
'requestSingleInstanceLock() / app.on("second-instance", ...) ' +
'wiring may be broken (index.js:525162-525173).',
);
}
expect(
secondSignal,
'second instance exited on its own, not by signal from us',
).toBe(null);
expect(
firstAliveAfter,
'primary pid still alive after the second spawn exited',
).toBe(true);
});
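
The `pidAlive` comment in the T14b spec above mentions `process.kill(pid, 0)` as an alternative to the `/proc/<pid>` check. A minimal sketch of that alternative, assuming a POSIX host; `pidAliveViaSignal` is a hypothetical name and is not part of the spec:

```typescript
// Hypothetical alternative to the /proc/<pid> check; not part of
// the spec. Works on POSIX hosts without relying on procfs.
function pidAliveViaSignal(pid: number): boolean {
  try {
    // Signal 0 runs the existence and permission checks without
    // delivering any signal to the target.
    process.kill(pid, 0);
    return true;
  } catch (err) {
    // EPERM means the process exists but belongs to another user.
    // ESRCH (or any other error) is treated as "not alive".
    return (err as { code?: string }).code === 'EPERM';
  }
}
```

The try/catch is the noise the spec's comment refers to; the trade-off is that this form does not depend on `/proc` being mounted.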
