test(harness): session 2 runners (10 new specs, 53% → 66% coverage)

Categories landed:
- B (seedFromHost-unlocked): T16 (Code tab loads), T26 (Routines page
  renders) — both promote Tier 3 → Tier 2 via the seedFromHost
  primitive shipped in session 1.
- A (Tier 2 single-launch deferred from session 1): T10 (Cowork daemon
  respawn after SIGKILL), S10 (KDE-W Quick Entry popup transparent),
  S25 (safeStorage round-trip across two launches with shared
  isolation handle).
- C (Tier 2 reframes): T23 (Notification reaches DBus via dbus-monitor
  subprocess), T25 (shell.showItemInFolder via mock-then-call —
  mirrors T17's installOpenDialogMock), T38 (openInEditor IPC handler
  registered probe via ipcMain._invokeHandlers), S19
  (CLAUDE_CONFIG_DIR extraEnv reaches main process).
- Tier 1 reclass: S28 (worktree permission classifier asar fingerprint
  — Sbn() is closure-local, not inspector-reachable).

Mechanism notes — see plan doc status section for full rationale:
- T23 uses dbus-monitor, not gdbus monitor (the latter only sees
  signals owned by a destination, not method calls to it).
- T38 inspects ipcMain._invokeHandlers for handler registration; the
  channel ends in $eipc_message$_<UUID>_$_claude.web_$_<name> with a
  build-stable UUID prefix — anchors on the suffix.
- T25 mock-then-call beats invoke-then-cleanup (no host file manager
  pop-up, stronger assertion).
- S25 compares decrypted plaintexts not ciphertexts (safeStorage on
  Linux uses random IVs).
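The T38 suffix-anchoring note reduces to a small predicate over registered
channel names. Sketch only — `matchesHandlerChannel` is a hypothetical helper,
not harness code, and the UUID segment is build-specific:

```typescript
// Hypothetical illustration of the T38 anchoring strategy: match a
// registered ipcMain invoke channel by its build-stable prefix and
// suffix, ignoring the per-build UUID segment in the middle.
function matchesHandlerChannel(channel: string, name: string): boolean {
  return (
    channel.startsWith('$eipc_message$_') &&
    channel.endsWith('_$_claude.web_$_' + name)
  );
}
```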

Co-Authored-By: Claude <claude@anthropic.com>
Author: aaddrick
Date:   2026-05-03 17:01:42 -04:00
Parent: 1f5702bc7b
Commit: fb5189fe45

10 changed files with 1830 additions and 0 deletions

@@ -0,0 +1,122 @@
import { test, expect } from '@playwright/test';
import { launchClaude } from '../lib/electron.js';
import { skipUnlessRow } from '../lib/row.js';
import { QuickEntry } from '../lib/quickentry.js';
import { captureSessionEnv } from '../lib/diagnostics.js';
// S10 — Quick Entry popup is transparent (no opaque square frame).
// Backs the KDE-W row of S10 in
// docs/testing/cases/shortcuts-and-input.md.
//
// Upstream constructs the popup BrowserWindow with
// transparent: true, backgroundColor: "#00000000", frame: false
// at build-reference index.js:515380, 515383, 515381. On KDE Plasma
// Wayland the compositor honours the alpha channel and the popup
// renders with a transparent background; on broken-Electron versions
// (electron/electron#50213, the 41.0.4-41.x.y bisect window per
// @noctuum on #370) the alpha is dropped and an opaque square frame
// shows behind the rounded prompt UI.
//
// Construction-time options aren't observable through the prototype-
// method hook in lib/quickentry.ts (the Proxy from frame-fix-wrapper
// returns the closure-captured PatchedBrowserWindow on `electron.
// BrowserWindow` reads — see the doc-comment on
// QuickEntry.installInterceptor and CLAUDE.md "Test harness Electron
// hooks" learning). Runtime-side, `getBackgroundColor()` reflects
// what the BrowserWindow was actually constructed with — so we read
// it via getPopupRuntimeProps() and assert
// transparent === true && backgroundColor in {'#00000000','#0000'}
// matching the predicate in lib/quickentry.ts:266.
//
// Gated to KDE-W: other KDE rows (KDE-X) don't have the same
// compositor / Electron-Wayland concern that the case-doc S10
// surfaces. If S10 fails on a host whose bundled Electron is in the
// 41.0.4-41.x.y window, that's the upstream regression — see S33 for
// the version-capture half. Don't wrap in skip on failure; surface
// it as a regression-detector signal.
test.setTimeout(60_000);
test('S10 — Quick Entry popup is transparent (no opaque square frame)', async ({}, testInfo) => {
  testInfo.annotations.push({ type: 'severity', description: 'Should' });
  testInfo.annotations.push({
    type: 'surface',
    description: 'Quick Entry window (KDE Wayland)',
  });
  skipUnlessRow(testInfo, ['KDE-W']);
  await testInfo.attach('session-env', {
    body: JSON.stringify(captureSessionEnv(), null, 2),
    contentType: 'application/json',
  });
  const useHostConfig = process.env.CLAUDE_TEST_USE_HOST_CONFIG === '1';
  const app = await launchClaude({
    isolation: useHostConfig ? null : undefined,
  });
  await testInfo.attach('isolation', {
    body: JSON.stringify(
      {
        useHostConfig,
        configDir: app.isolation?.configDir ?? null,
      },
      null,
      2,
    ),
    contentType: 'application/json',
  });
  try {
    // Main needs to be up before the shortcut can lazily construct
    // the popup — the popup-show path reads renderer state via
    // upstream's lHn() user-loaded check (see openAndWaitReady's
    // retry-loop comment in lib/quickentry.ts).
    const { inspector } = await app.waitForReady('mainVisible');
    const qe = new QuickEntry(inspector);
    await qe.installInterceptor();
    // Fire the OS shortcut and wait for the popup BrowserWindow to
    // be visible with its textarea mounted — same handshake S29
    // uses. If ydotool isn't reachable, openAndWaitReady throws
    // the install-instructions error from ensureYdotool — that
    // surfaces as a clear test failure (acceptable per the
    // case-doc; not wrapped in a skip).
    await qe.openAndWaitReady();
    const props = await qe.getPopupRuntimeProps();
    await testInfo.attach('popup-runtime-props', {
      body: JSON.stringify(props, null, 2),
      contentType: 'application/json',
    });
    expect(
      props,
      'getPopupRuntimeProps returned null — interceptor did not ' +
        'capture the popup BrowserWindow ref',
    ).not.toBeNull();
    // Predicate matches lib/quickentry.ts:266 — '#00000000' is the
    // canonical 8-digit form Electron returns for the upstream
    // construction value, '#0000' is the short form some Electron
    // builds normalise to. Either is acceptable.
    expect(
      props!.backgroundColor === '#00000000'
        || props!.backgroundColor === '#0000',
      `popup backgroundColor must be transparent (#00000000 or ` +
        `#0000), got ${JSON.stringify(props!.backgroundColor)}. ` +
        `If the bundled Electron is in the 41.0.4-41.x.y window ` +
        `(see S33), this is the electron#50213 regression ` +
        `tracked under issue #370.`,
    ).toBe(true);
    expect(
      props!.transparent,
      'popup transparent flag (derived from backgroundColor) is ' +
        'false — opaque square frame would render behind the ' +
        'rounded prompt UI',
    ).toBe(true);
    inspector.close();
  } finally {
    await app.close();
  }
});
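The two accepted forms in the predicate above differ only in digit count. A
standalone sketch makes the equivalence explicit — `expandHexColor` is a
hypothetical helper, not the lib/quickentry.ts implementation:

```typescript
// Hypothetical normaliser: expand a 4-digit #RGBA to the 8-digit
// #RRGGBBAA form so both shapes Electron can return compare equal.
function expandHexColor(color: string): string {
  const m = /^#([0-9a-fA-F]{4})$/.exec(color);
  if (!m) return color.toLowerCase();
  // Duplicate each hex digit: #RGBA → #RRGGBBAA.
  return '#' + [...m[1]].map((d) => d + d).join('').toLowerCase();
}

function isTransparentBackground(color: string): boolean {
  return expandHexColor(color) === '#00000000';
}
```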

@@ -0,0 +1,156 @@
import { test, expect } from '@playwright/test';
import { mkdtemp, rm } from 'node:fs/promises';
import { tmpdir } from 'node:os';
import { join } from 'node:path';
import { launchClaude } from '../lib/electron.js';
import { captureSessionEnv } from '../lib/diagnostics.js';
// S19 — `CLAUDE_CONFIG_DIR` redirects scheduled-task storage.
//
// Backs S19 in docs/testing/cases/routines.md.
//
// Case-doc anchors:
// build-reference/app-extracted/.vite/build/index.js:283107 — `cE()`
// resolves `process.env.CLAUDE_CONFIG_DIR ?? ~/.claude` (with a
// `~` / `~/` / `~\` expansion shim).
// build-reference/app-extracted/.vite/build/index.js:283118 — `Tce()`
// returns `${cE()}/scheduled-tasks`, the directory the
// scheduled-tasks substrate writes into.
// build-reference/app-extracted/.vite/build/index.js:488317, :509032 —
// call sites that pass `taskFilesDir: Tce()` into the
// scheduled-tasks substrate.
//
// Tier 2 reframe (per docs/testing/runner-implementation-plan.md S19):
// the full flow (login + create a scheduled task and read its SKILL.md
// off disk) is Tier 3. Tier 2's slice is the env-propagation half:
// confirm `CLAUDE_CONFIG_DIR` from `extraEnv` actually reaches the main
// process's `process.env`. If that contract breaks, `cE()` falls back
// to `~/.claude` and every Tier-3 path-redirection assertion built on
// top of it silently regresses.
//
// We also opportunistically eval the resolver fingerprint inline (the
// same expression `cE()` and `Tce()` compute) and assert the synthetic
// resolved path lives under our test dir. This is a runtime echo, not
// an introspection of the bundled symbols (`cE` / `Tce` are minified
// closure-locals — not reachable from `globalThis`); the static
// fingerprint of those functions is covered by the asar-grep style
// probes (S26 / S27 family). A future regression where the env stops
// propagating shows up as a hard failure here even though the bundled
// resolver is unchanged.
//
// extraEnv-vs-isolation env precedence: `lib/electron.ts` spreads
// `opts.extraEnv` AFTER `isolation?.env` (line ~317-323), so the
// override here wins over the default isolation's
// `CLAUDE_CONFIG_DIR=<tmp>/config/Claude`. Confirmed by reading
// electron.ts before writing this runner.
//
// No row gate — applies to all rows.
interface ResolverProbe {
  homedir: string;
  envValue: string | null;
  resolvedConfigDir: string;
  resolvedScheduledTasksDir: string;
}
test.setTimeout(60_000);
test('S19 — CLAUDE_CONFIG_DIR from extraEnv reaches main process', async ({}, testInfo) => {
  testInfo.annotations.push({ type: 'severity', description: 'Could' });
  testInfo.annotations.push({
    type: 'surface',
    description: 'Config dir env var',
  });
  await testInfo.attach('session-env', {
    body: JSON.stringify(captureSessionEnv(), null, 2),
    contentType: 'application/json',
  });
  // Dedicated tmpdir for this test's CLAUDE_CONFIG_DIR override —
  // disjoint from the default-isolation tmpdir so a future regression
  // where the override path silently falls back to the isolation dir
  // is caught (the two paths differ by their tmpdir prefix).
  const testDir = await mkdtemp(join(tmpdir(), 'claude-s19-'));
  await testInfo.attach('test-config-dir', {
    body: testDir,
    contentType: 'text/plain',
  });
  const app = await launchClaude({
    extraEnv: { CLAUDE_CONFIG_DIR: testDir },
  });
  try {
    const { inspector } = await app.waitForReady('mainVisible');
    // Half 1: env propagation. The bundled `cE()` resolver reads
    // `process.env.CLAUDE_CONFIG_DIR` directly — if this doesn't
    // equal what we passed in `extraEnv`, every downstream path
    // resolution inherits the wrong root.
    const observed = await inspector.evalInMain<string | null>(`
      return process.env.CLAUDE_CONFIG_DIR ?? null;
    `);
    await testInfo.attach('observed-claude-config-dir', {
      body: observed ?? '(unset)',
      contentType: 'text/plain',
    });
    expect(
      observed,
      'main process sees CLAUDE_CONFIG_DIR === <test-dir> ' +
        '(extraEnv must win over default isolation env)',
    ).toBe(testDir);
    // Half 2: synthetic resolver echo. Re-implement `cE()` /
    // `Tce()` in the inspector — same expression the bundled
    // code uses, computed against the live main-process env and
    // homedir. Captures both the env-propagation fact AND the
    // path shape Tce() actually produces, so a future regression
    // where someone reroutes scheduled-tasks under a sibling
    // folder (e.g. `${cE()}/tasks/`) is visible here.
    const probe = await inspector.evalInMain<ResolverProbe>(`
      const os = process.mainModule.require('node:os');
      const path = process.mainModule.require('node:path');
      const envValue = process.env.CLAUDE_CONFIG_DIR ?? null;
      const homedir = os.homedir();
      const resolveConfigDir = () => {
        const e = envValue;
        if (
          e === '~' ||
          (e != null && e.startsWith('~/')) ||
          (e != null && e.startsWith('~\\\\'))
        ) {
          return path.join(homedir, e.slice(1));
        }
        return e ?? path.join(homedir, '.claude');
      };
      const resolvedConfigDir = resolveConfigDir();
      return {
        homedir,
        envValue,
        resolvedConfigDir,
        resolvedScheduledTasksDir: path.join(
          resolvedConfigDir,
          'scheduled-tasks',
        ),
      };
    `);
    await testInfo.attach('resolver-probe', {
      body: JSON.stringify(probe, null, 2),
      contentType: 'application/json',
    });
    expect(
      probe.resolvedConfigDir,
      'cE()-equivalent resolves to the test dir',
    ).toBe(testDir);
    expect(
      probe.resolvedScheduledTasksDir,
      'Tce()-equivalent resolves under the test dir',
    ).toBe(join(testDir, 'scheduled-tasks'));
  } finally {
    await app.close();
    await rm(testDir, { recursive: true, force: true });
  }
});
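The resolver expression embedded in the eval above can also be exercised
standalone. This is a sketch of the same `cE()` / `Tce()`-equivalent logic,
under the assumption that the `~` shim behaves as the case-doc anchor
describes — not the bundled implementation:

```typescript
import { join } from 'node:path';

// cE()-equivalent: CLAUDE_CONFIG_DIR wins when set, with a ~ / ~/ / ~\
// expansion shim, falling back to ~/.claude.
function resolveConfigDir(envValue: string | null, homedir: string): string {
  if (
    envValue === '~' ||
    (envValue != null && envValue.startsWith('~/')) ||
    (envValue != null && envValue.startsWith('~\\'))
  ) {
    return join(homedir, envValue.slice(1));
  }
  return envValue ?? join(homedir, '.claude');
}

// Tce()-equivalent hangs the scheduled-tasks dir off the resolved root.
function resolveScheduledTasksDir(envValue: string | null, homedir: string): string {
  return join(resolveConfigDir(envValue, homedir), 'scheduled-tasks');
}
```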

@@ -0,0 +1,207 @@
import { test, expect } from '@playwright/test';
import { join } from 'node:path';
import { launchClaude } from '../lib/electron.js';
import { createIsolation, type Isolation } from '../lib/isolation.js';
import { captureSessionEnv } from '../lib/diagnostics.js';
// S25 — Mobile pairing survives Linux session restart (Tier 2 slice).
//
// Full S25 (case-doc platform-integration.md:250) is a Tier 3 mobile-
// pairing flow needing a paired phone. The Linux-side persistence
// half is independently testable: upstream caches the trusted-device
// token via `safeStorage.encryptString` (libsecret on Linux) so a
// successful pair survives restart without re-enrolling. The
// load-bearing contract on Linux is "encrypt-decrypt round-trip is
// stable across an Electron process restart against the system
// keyring backend." That's what this runner exercises.
//
// Code anchors (case-doc S25):
// - index.js:511984 — ZEe = "coworkTrustedDeviceToken" electron-
// store key for the trusted-device token.
// - index.js:511989 — oYn() writes via safeStorage.encryptString
// (libsecret on Linux); aYn() (:512003) decrypts on read.
// - index.js:512022 — gYn() re-enrolls via POST /api/auth/
// trusted_devices only when there's no cached token.
//
// Approach: bypass electron-store entirely. The store is incidental —
// what's load-bearing is that the keyring resolves the same encryption
// key between launches. We:
// 1. Fresh isolation handle (clean state — no seedFromHost; this
// isn't an auth test).
// 2. Launch 1, check safeStorage.isEncryptionAvailable() (skip if
// false — common on headless rows / no keyring backend).
// 3. Encrypt a known plaintext via safeStorage.encryptString, write
// the ciphertext bytes to ${configDir}/test-token.bin, close.
// 4. Launch 2, read ${configDir}/test-token.bin, decrypt via
// safeStorage.decryptString, assert decrypted text equals
// plaintext.
// 5. Cleanup the isolation handle (we own it — passing it to
// launchClaude doesn't transfer ownership).
//
// Why compare decrypted plaintext, not ciphertext: safeStorage on
// Linux uses libsecret-derived AES-128 with random IVs, so the same
// plaintext yields different ciphertext on re-encrypt. The round-
// trip is the contract — ciphertext equality isn't.
const PLAINTEXT = 'S25-trusted-device-token-' + Date.now();
const TOKEN_FILE_NAME = 'test-token.bin';
// Two launches at ~60s each plus settle / waitForReady budget.
test.setTimeout(180_000);
test('S25 — safeStorage token round-trip survives app restart', async ({}, testInfo) => {
  testInfo.annotations.push({ type: 'severity', description: 'Should' });
  testInfo.annotations.push({
    type: 'surface',
    description: 'Dispatch pairing persistence',
  });
  await testInfo.attach('session-env', {
    body: JSON.stringify(captureSessionEnv(), null, 2),
    contentType: 'application/json',
  });
  // Fresh isolation, shared across both launches. No seedFromHost —
  // the keyring backend is process-scoped, not config-scoped, so a
  // signed-out clean isolation still exercises the same code path.
  const isolation: Isolation = await createIsolation();
  const tokenFile = join(isolation.configDir, TOKEN_FILE_NAME);
  let encryptionAvailable = false;
  let cipherLen = 0;
  try {
    // Launch 1: encrypt + write.
    const app1 = await launchClaude({ isolation });
    try {
      const { inspector } = await app1.waitForReady('mainVisible');
      encryptionAvailable = await inspector.evalInMain<boolean>(`
        const { safeStorage } = process.mainModule.require('electron');
        return safeStorage.isEncryptionAvailable();
      `);
      await testInfo.attach('encryption-available-launch1', {
        body: JSON.stringify({ encryptionAvailable }, null, 2),
        contentType: 'application/json',
      });
      if (!encryptionAvailable) {
        testInfo.skip(
          true,
          'safeStorage.isEncryptionAvailable() === false — no ' +
            'keyring backend on this row (libsecret/kwallet/' +
            'gnome-keyring not running, or running headless)',
        );
        return;
      }
      // Encrypt + write to tokenFile inside the main process. The
      // ciphertext Buffer never crosses the inspector boundary —
      // evalInMain returns JSON, where Buffers serialize awkwardly
      // as { type, data } — so only scalar metadata comes back.
      const writeResult = await inspector.evalInMain<{
        cipherLen: number;
        path: string;
      }>(`
        const { safeStorage } = process.mainModule.require('electron');
        const fs = require('node:fs');
        const cipher = safeStorage.encryptString(${JSON.stringify(PLAINTEXT)});
        fs.mkdirSync(${JSON.stringify(isolation.configDir)}, {
          recursive: true,
        });
        fs.writeFileSync(${JSON.stringify(tokenFile)}, cipher);
        return { cipherLen: cipher.length, path: ${JSON.stringify(tokenFile)} };
      `);
      cipherLen = writeResult.cipherLen;
      await testInfo.attach('encrypt-and-write', {
        body: JSON.stringify(
          {
            plaintextPreview: PLAINTEXT,
            tokenFile: writeResult.path,
            cipherLen,
          },
          null,
          2,
        ),
        contentType: 'application/json',
      });
      // Sanity check: in-session round-trip. Catches the case where
      // safeStorage reports available but the backend is broken
      // (e.g. locked keyring with no unlock prompt). Without this,
      // a backend failure would surface as a launch-2 read error
      // that's harder to distinguish from a cross-restart break.
      const inSessionRoundTrip = await inspector.evalInMain<string>(`
        const { safeStorage } = process.mainModule.require('electron');
        const fs = require('node:fs');
        const cipher = fs.readFileSync(${JSON.stringify(tokenFile)});
        return safeStorage.decryptString(cipher);
      `);
      expect(
        inSessionRoundTrip,
        'in-session encrypt+decrypt round-trip works',
      ).toBe(PLAINTEXT);
      inspector.close();
    } finally {
      await app1.close();
    }
    // Launch 2: read + decrypt with the same isolation handle.
    const app2 = await launchClaude({ isolation });
    let decrypted: string | null = null;
    try {
      const { inspector } = await app2.waitForReady('mainVisible');
      const stillAvailable = await inspector.evalInMain<boolean>(`
        const { safeStorage } = process.mainModule.require('electron');
        return safeStorage.isEncryptionAvailable();
      `);
      await testInfo.attach('encryption-available-launch2', {
        body: JSON.stringify({ stillAvailable }, null, 2),
        contentType: 'application/json',
      });
      expect(
        stillAvailable,
        'safeStorage still available on launch 2',
      ).toBe(true);
      decrypted = await inspector.evalInMain<string>(`
        const { safeStorage } = process.mainModule.require('electron');
        const fs = require('node:fs');
        const cipher = fs.readFileSync(${JSON.stringify(tokenFile)});
        return safeStorage.decryptString(cipher);
      `);
      await testInfo.attach('decrypt-after-restart', {
        body: JSON.stringify(
          {
            tokenFile,
            cipherLen,
            decrypted,
            match: decrypted === PLAINTEXT,
          },
          null,
          2,
        ),
        contentType: 'application/json',
      });
      inspector.close();
    } finally {
      await app2.close();
    }
    expect(
      decrypted,
      'safeStorage.decryptString returned a value after restart',
    ).not.toBeNull();
    expect(
      decrypted,
      'decrypted plaintext matches what was written before restart — ' +
        'keyring backend resolved the same encryption key across ' +
        'process restart',
    ).toBe(PLAINTEXT);
  } finally {
    await isolation.cleanup();
  }
});
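The random-IV rationale in the header comment generalises beyond safeStorage;
a few lines of node:crypto show the property this runner leans on. Illustrative
only — AES-128-CBC with a random IV stands in for whatever libsecret-derived
scheme the real backend uses:

```typescript
import { createCipheriv, createDecipheriv, randomBytes } from 'node:crypto';

const key = randomBytes(16);

function encrypt(plain: string): Buffer {
  const iv = randomBytes(16); // fresh IV per call — the crux of the point
  const c = createCipheriv('aes-128-cbc', key, iv);
  return Buffer.concat([iv, c.update(plain, 'utf8'), c.final()]);
}

function decrypt(cipher: Buffer): string {
  const d = createDecipheriv('aes-128-cbc', key, cipher.subarray(0, 16));
  return Buffer.concat([d.update(cipher.subarray(16)), d.final()]).toString('utf8');
}

const a = encrypt('token');
const b = encrypt('token');
// a.equals(b) is false — same plaintext, different ciphertext — yet
// decrypt(a) and decrypt(b) both return 'token'. Asserting ciphertext
// equality across launches would fail even with a healthy keyring.
```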

@@ -0,0 +1,161 @@
import { test, expect } from '@playwright/test';
import { readAsarFile, resolveAsarPath } from '../lib/asar.js';
import { captureSessionEnv } from '../lib/diagnostics.js';
// S28 — Worktree creation surfaces clear error on read-only mounts
// (file-probe form).
//
// Per docs/testing/cases/extensibility.md S28: when a project sits on
// a read-only mount and the user tries to start a parallel session,
// worktree creation must fail with a clear error pointing at the
// read-only mount — no silent loss, no parent-repo corruption. The
// case-doc anchor (`build-reference/.../index.js:462760` `Sbn()`) is
// the classifier that buckets the underlying git error into
// `"permission-denied"` for the read-only-mount taxonomy.
//
// **Tier reclassification.** The runner-implementation-plan.md
// (S28 ~line 446) reframes S28 as a Tier 2 inspector-eval against
// `Sbn()` with a synthetic error. In practice `Sbn` is a closure-local
// in the bundled main process — not reachable from the inspector
// without an IPC surface that calls into it, and no such surface is
// exposed by the case-doc anchors. So we drop one tier further: a
// pure asar fingerprint that pins the classifier's input strings and
// output bucket together with the worktree-failure log line they're
// wired into. If upstream reshapes the classifier (renames the bucket,
// drops one of the input matches, or unwires the worktree path from
// the bucketing call), this test fails — which is exactly the drift
// signal the higher-tier form would catch via a synthetic error.
//
// The full Tier 3 surface — actual read-only mount, parallel session,
// dialog text scrape — stays in the case doc as a manual repro.
//
// Fingerprint shape (single regex matches all four strings together
// in the same `Sbn()` return expression, identifier-agnostic):
//
// <id>.includes("Permission denied") ||
// <id>.includes("Access is denied") ||
// <id>.includes("could not lock config file")
// ? "permission-denied"
//
// where `<id>` is `e` in the beautified source but rotates between
// releases. We anchor on the call shape and the literal strings, not
// the identifier. Whitespace is tolerated to handle both the
// minified runtime form and the beautified build-reference form.
//
// Sibling assertion: the `Failed to create git worktree:` log line
// (case-doc anchor :462928, `R.error("Failed to create git worktree:
// …")`) is present in the same file. This is the call site whose
// error string Sbn() classifies — without it, the classifier exists
// in isolation and the contract S28 cares about (read-only mount →
// permission-denied bucket on the worktree creation path) is broken.
//
// Pure file probe — no app launch. Fast (<1s). Row-independent.
const PERMISSION_DENIED_CLASSIFIER_RE =
  /(\w+)\.includes\(\s*"Permission denied"\s*\)\s*\|\|\s*\1\.includes\(\s*"Access is denied"\s*\)\s*\|\|\s*\1\.includes\(\s*"could not lock config file"\s*\)\s*\?\s*"permission-denied"/;
const WORKTREE_FAILURE_LOG_RE =
  /Failed to create git worktree:/;
test('S28 — worktree permission-denied classifier wired to git worktree failure path (file probe)', async ({}, testInfo) => {
  testInfo.annotations.push({ type: 'severity', description: 'Could' });
  testInfo.annotations.push({
    type: 'surface',
    description: 'Worktree permission classifier',
  });
  await testInfo.attach('session-env', {
    body: JSON.stringify(captureSessionEnv(), null, 2),
    contentType: 'application/json',
  });
  const asarPath = resolveAsarPath();
  await testInfo.attach('asar-path', {
    body: asarPath,
    contentType: 'text/plain',
  });
  const indexJs = readAsarFile('.vite/build/index.js', asarPath);
  // (1) Classifier shape — all three input strings + the
  // "permission-denied" output bucket appear in the same
  // expression. The single regex enforces clustering: the three
  // `<id>.includes(...)` calls are joined by `||` and resolve to
  // `"permission-denied"` in the same ternary, so we don't need a
  // separate proximity window check — the regex IS the cluster
  // condition.
  const classifierMatch = indexJs.match(PERMISSION_DENIED_CLASSIFIER_RE);
  const classifierFound = classifierMatch !== null;
  // Surrounding context for the diagnostic attachment — ~200 chars
  // either side of the match so a future failure shows what the
  // upstream-reshaped classifier looks like.
  let classifierContext: string | null = null;
  if (classifierMatch && classifierMatch.index !== undefined) {
    const start = Math.max(0, classifierMatch.index - 200);
    const end = Math.min(
      indexJs.length,
      classifierMatch.index + classifierMatch[0].length + 200,
    );
    classifierContext = indexJs.slice(start, end);
  }
  // (2) The classifier's call site — the `Failed to create git
  // worktree:` log line at case-doc anchor :462928. Without this,
  // the classifier exists in isolation and S28's contract
  // (read-only mount → permission-denied bucket on the worktree
  // creation path) is unwired.
  const worktreeFailureLogPresent =
    WORKTREE_FAILURE_LOG_RE.test(indexJs);
  // (3) Sanity: the bucket name itself appears in the bundle. This
  // is implied by (1) but we surface it as a separate count so a
  // future failure that drops only the regex match is
  // distinguishable from one that drops the bucket entirely.
  const bucketOccurrences = (
    indexJs.match(/"permission-denied"/g) ?? []
  ).length;
  await testInfo.attach('s28-evidence', {
    body: JSON.stringify(
      {
        file: '.vite/build/index.js',
        classifierRegex: PERMISSION_DENIED_CLASSIFIER_RE.source,
        classifierFound,
        classifierMatchSnippet: classifierMatch
          ? classifierMatch[0]
          : null,
        classifierContext,
        worktreeFailureLogRegex: WORKTREE_FAILURE_LOG_RE.source,
        worktreeFailureLogPresent,
        permissionDeniedBucketOccurrences: bucketOccurrences,
      },
      null,
      2,
    ),
    contentType: 'application/json',
  });
  expect(
    classifierFound,
    'app.asar contains the permission-denied classifier shape ' +
      '(`<id>.includes("Permission denied") || ... || ' +
      '<id>.includes("could not lock config file") ? ' +
      '"permission-denied"`) per extensibility.md S28 anchor :462760',
  ).toBe(true);
  expect(
    worktreeFailureLogPresent,
    'app.asar contains the `Failed to create git worktree:` log ' +
      'line (extensibility.md S28 anchor :462928) — the call site ' +
      'whose error string the classifier buckets',
  ).toBe(true);
  expect(
    bucketOccurrences,
    'app.asar contains the `"permission-denied"` bucket name (sanity ' +
      'check — implied by classifier match but surfaced separately ' +
      'so a future regression can distinguish a regex-shape change ' +
      'from a bucket rename)',
  ).toBeGreaterThan(0);
});
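The identifier-agnostic claim can be checked against synthetic snippets of
both source shapes the comment names. These strings are fabricated stand-ins,
not excerpts from the real bundle:

```typescript
const CLASSIFIER_RE =
  /(\w+)\.includes\(\s*"Permission denied"\s*\)\s*\|\|\s*\1\.includes\(\s*"Access is denied"\s*\)\s*\|\|\s*\1\.includes\(\s*"could not lock config file"\s*\)\s*\?\s*"permission-denied"/;

// Minified one-liner shape (identifier `e`):
const minified =
  'e.includes("Permission denied")||e.includes("Access is denied")||e.includes("could not lock config file")?"permission-denied":"unknown"';

// Beautified shape with newlines (identifier rotated to `t` — \s* spans
// the line breaks):
const beautified = [
  't.includes("Permission denied") ||',
  '  t.includes("Access is denied") ||',
  '  t.includes("could not lock config file")',
  '  ? "permission-denied"',
].join('\n');

// Mixed identifiers must NOT match — the \1 backreference pins one <id>:
const mixed =
  'a.includes("Permission denied")||b.includes("Access is denied")||a.includes("could not lock config file")?"permission-denied"';
```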

@@ -0,0 +1,294 @@
import { test, expect } from '@playwright/test';
import { execFile } from 'node:child_process';
import { promisify } from 'node:util';
import { launchClaude } from '../lib/electron.js';
import { skipUnlessRow } from '../lib/row.js';
import { sleep } from '../lib/retry.js';
import { captureSessionEnv } from '../lib/diagnostics.js';
const exec = promisify(execFile);
// T10 — cowork daemon respawn after kill.
//
// docs/testing/cases/platform-integration.md T10 covers two
// claims: the daemon spawns when Cowork needs it (asserted by
// H04), AND it respawns within the documented timeout if it
// crashes mid-session. This runner covers the second half.
//
// The respawn path is implemented by Patch 6 in
// scripts/patches/cowork.sh:244-362 (issue #408). The auto-launch
// gate uses a timestamp-based cooldown (`_lastSpawn`, 10s window)
// instead of a one-shot boolean specifically so the retry loop
// in kUe()/the renamed retry function can re-fork the daemon
// after it dies. If the cooldown regresses back to a one-shot
// boolean, or the cooldown window grows past the renderer's
// retry budget, kill-then-respawn silently breaks and the user
// sees "VM service not running" until they restart the app.
//
// Shape: same baseline + spawn detection as H04. Once a daemon
// pid is captured, SIGKILL it and `retryUntil`-poll pgrep for a
// distinct new pid (NOT in baseline AND NOT the killed pid)
// within 20s — 10s cooldown + 10s slack for the renderer's next
// retry tick to land. Fail with a pgrep-state attachment if no
// new pid appears.
//
// Row gate matches H04 — daemon is Linux-only, gating mirrors the
// rest of the cowork lifecycle row set.
const PGREP_PATTERN = 'cowork-vm-service\\.js';
async function pgrepPids(pattern: string): Promise<Set<number>> {
  try {
    const { stdout } = await exec('pgrep', ['-f', pattern], {
      timeout: 5_000,
    });
    return new Set(
      stdout
        .split('\n')
        .map((l) => parseInt(l.trim(), 10))
        .filter((n) => !Number.isNaN(n)),
    );
  } catch (err) {
    // pgrep exits 1 with empty stdout when no matches — treat that
    // as the empty set. Any other failure (e.g. timeout) falls back
    // to parsing whatever stdout was captured rather than aborting
    // the poll loop.
    const e = err as { code?: number; stdout?: string };
    if (e.code === 1) return new Set();
    const out = e.stdout ?? '';
    return new Set(
      out
        .split('\n')
        .map((l) => parseInt(l.trim(), 10))
        .filter((n) => !Number.isNaN(n)),
    );
  }
}
test.setTimeout(90_000);
test('T10 — cowork daemon respawns after SIGKILL', async ({}, testInfo) => {
  testInfo.annotations.push({ type: 'severity', description: 'Should' });
  testInfo.annotations.push({
    type: 'surface',
    description: 'Cowork daemon respawn',
  });
  skipUnlessRow(testInfo, ['KDE-W', 'GNOME-W', 'Ubu-W', 'KDE-X', 'GNOME-X']);
  await testInfo.attach('session-env', {
    body: JSON.stringify(captureSessionEnv(), null, 2),
    contentType: 'application/json',
  });
  // Baseline — launchClaude's cleanupPreLaunch (lib/electron.ts:160-191)
  // pkills any leftover cowork daemon before spawning, so a stray
  // pid here would mean the cleanup itself is broken.
  const baselinePids = await pgrepPids(PGREP_PATTERN);
  await testInfo.attach('baseline-pids', {
    body: JSON.stringify(
      {
        pids: Array.from(baselinePids),
        note:
          'cleanupPreLaunch should leave this empty before launch. ' +
          'Non-empty here is a bug in lib/electron.ts:160-191.',
      },
      null,
      2,
    ),
    contentType: 'application/json',
  });
  const useHostConfig = process.env.CLAUDE_TEST_USE_HOST_CONFIG === '1';
  const app = await launchClaude({
    isolation: useHostConfig ? null : undefined,
  });
  let daemonPid: number | null = null;
  try {
    // mainVisible — main shell up; the daemon spawn is gated on
    // renderer activity (cowork.sh:262-362) which can begin
    // asynchronously after the shell paints.
    await app.waitForReady('mainVisible');
    // Phase 1: capture the original daemon pid. Same 15s window
    // as H04 — if the daemon never spawned in the first place,
    // there's nothing to kill, so skip with the same reason.
    const spawnStart = Date.now();
    while (Date.now() - spawnStart < 15_000) {
      const pids = await pgrepPids(PGREP_PATTERN);
      const newPids = Array.from(pids).filter(
        (p) => !baselinePids.has(p),
      );
      if (newPids.length > 0) {
        daemonPid = newPids[0]!;
        break;
      }
      await sleep(500);
    }
    if (daemonPid === null) {
      await testInfo.attach('skip-reason', {
        body: JSON.stringify(
          {
            reason:
              'cowork daemon not spawned within 15s of mainVisible',
            note:
              'Auto-launch in cowork.sh:262-362 is gated on a VM ' +
              'service connection attempt from the renderer; on a ' +
              'passive launch with no Cowork-tab interaction it may ' +
              'legitimately not fire. Without an initial spawn there ' +
              'is no daemon to kill, so the respawn assertion is ' +
              'unreachable. Same skip path as H04.',
          },
          null,
          2,
        ),
        contentType: 'application/json',
      });
      testInfo.skip(
        true,
        'cowork daemon not spawned by this build — gating in ' +
          'cowork.sh:262-362 may have suppressed it on a passive launch',
      );
      return;
    }
    const originalSpawnElapsedMs = Date.now() - spawnStart;
    await testInfo.attach('original-spawn', {
      body: JSON.stringify(
        {
          pid: daemonPid,
          elapsedMs: originalSpawnElapsedMs,
        },
        null,
        2,
      ),
      contentType: 'application/json',
    });
    // Phase 2: SIGKILL the daemon. Try direct process.kill first;
    // the daemon is forked by the Electron main process under the
    // same uid as the test runner, so this should not need root.
    // Shell-out fallback covers the unlikely case where direct
    // kill fails (e.g. EPERM on a misconfigured runner).
    const killTs = Date.now();
    let killMethod = 'process.kill';
    try {
      process.kill(daemonPid, 'SIGKILL');
    } catch {
      killMethod = 'execFile-kill-9';
      await exec('kill', ['-9', String(daemonPid)], { timeout: 5_000 });
    }
    await testInfo.attach('kill', {
      body: JSON.stringify(
        {
          killedPid: daemonPid,
          killMethod,
          killedAt: new Date(killTs).toISOString(),
        },
        null,
        2,
      ),
      contentType: 'application/json',
    });
    // Phase 3: poll up to 20s for a NEW daemon pid. The cooldown
    // in cowork.sh:329-332 is 10s (`Date.now()-_lastSpawn>1e4`),
    // so a respawn cannot fire earlier than 10s after the original
    // spawn timestamp. We add 10s of slack for the renderer's
    // retry tick to land after the cooldown elapses.
    //
    // Predicate: a pid that's not in the original baseline AND
    // not the killed pid. The killed pid is excluded explicitly
    // so a kernel that hasn't yet reaped the zombie can't fool
    // pgrep into reporting "respawned" with the dead pid.
    const respawnStart = Date.now();
    let respawnPid: number | null = null;
    while (Date.now() - respawnStart < 20_000) {
      const pids = await pgrepPids(PGREP_PATTERN);
      const candidates = Array.from(pids).filter(
        (p) => !baselinePids.has(p) && p !== daemonPid,
      );
      if (candidates.length > 0) {
        respawnPid = candidates[0]!;
        break;
      }
      await sleep(500);
    }
    const respawnElapsedMs = Date.now() - respawnStart;
    if (respawnPid === null) {
      const finalPids = await pgrepPids(PGREP_PATTERN);
      await testInfo.attach('respawn-failure', {
        body: JSON.stringify(
          {
            killedPid: daemonPid,
            pgrepFinal: Array.from(finalPids),
            elapsedMs: respawnElapsedMs,
            note:
              'No new cowork-vm-service pid observed within 20s ' +
              'of SIGKILL. Cooldown in cowork.sh:329-332 is 10s; ' +
              'budget includes 10s of slack for the renderer retry ' +
              'tick. Possible regressions: cooldown reverted to a ' +
              'one-shot boolean (issue #408), retry loop no longer ' +
              're-enters the auto-launch branch on ECONNREFUSED, ' +
              'or the renderer stopped retrying VM connections ' +
              'after the daemon dropped its socket.',
          },
          null,
          2,
        ),
        contentType: 'application/json',
      });
    } else {
      await testInfo.attach('respawn', {
        body: JSON.stringify(
          {
            originalPid: daemonPid,
            respawnPid,
            elapsedMs: respawnElapsedMs,
          },
          null,
          2,
        ),
        contentType: 'application/json',
      });
    }
    expect(
      respawnPid,
      'cowork-vm-service respawns within 20s of SIGKILL',
    ).not.toBeNull();
    expect(
      respawnPid,
      'respawn pid is distinct from the killed pid',
    ).not.toBe(daemonPid);
  } finally {
    await app.close();
    // Best-effort cleanup confirmation. If anything still matches
    // PGREP_PATTERN after close, attach it for diagnosis but don't
    // fail — H04 is the runner that asserts the cleanup contract.
    await sleep(2_000);
    const postExitPids = await pgrepPids(PGREP_PATTERN);
    const lingering = Array.from(postExitPids).filter(
      (p) => !baselinePids.has(p),
    );
    await testInfo.attach('post-exit-pgrep', {
      body: JSON.stringify(
        {
          baseline: Array.from(baselinePids),
          postExit: Array.from(postExitPids),
          lingering,
          note:
            'Informational. H04 owns the cleanup-after-close ' +
            'assertion; this attachment is for cross-referencing ' +
            'when respawn passes but cleanup regresses elsewhere.',
        },
        null,
        2,
      ),
      contentType: 'application/json',
    });
  }
});
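The runner above leans on a `pgrepPids` helper that lives in a lib module outside this diff. A minimal sketch of its parsing half, assuming `pgrep -f` output (one pid per line, trailing newline); the helper name and split of spawn-vs-parse are assumptions, not the shipped implementation:

```typescript
// Hypothetical sketch of the parsing half of pgrepPids (the real helper
// is in a lib module not shown in this diff). `pgrep -f` prints one pid
// per line with a trailing newline; be defensive about stray
// non-numeric lines so a garbled pgrep never fabricates a pid.
function parsePgrepOutput(stdout: string): Set<number> {
  const pids = new Set<number>();
  for (const line of stdout.split('\n')) {
    const trimmed = line.trim();
    if (/^\d+$/.test(trimmed)) {
      pids.add(Number(trimmed));
    }
  }
  return pids;
}
```

Returning a `Set` makes the baseline-exclusion predicate in the respawn poll (`!baselinePids.has(p)`) an O(1) lookup.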


@@ -0,0 +1,119 @@
import { test, expect } from '@playwright/test';
import { launchClaude } from '../lib/electron.js';
import { createIsolation, type Isolation } from '../lib/isolation.js';
import { captureSessionEnv } from '../lib/diagnostics.js';
import { CodeTab, findCompactPills } from '../lib/claudeai.js';
// T16 — Code tab loads.
//
// Path: seed auth from the host's signed-in Claude Desktop config into
// a per-test tmpdir, launch the app against that hermetic config, wait
// for `userLoaded` (claude.ai past /login — the Code tab isn't
// reachable from /login), then click the Code tab via the AX-tree-
// backed CodeTab.activate() page-object. activate() polls for at
// least one compact pill (the env pill is the cheapest "Code-tab body
// mounted" signal — the URL doesn't change on Code-tab activation, so
// there's no navigation event to anchor on).
//
// Side effect of `seedFromHost: true`: the host's running Claude
// Desktop is killed (writer-lock release for LevelDB / SQLite); the
// host config dir itself is left untouched, only an allowlisted
// subset is copied into the per-test tmpdir which is rm -rf'd on
// app.close(). See lib/isolation.ts for the allowlist and
// lib/host-claude.ts for the kill semantics. Same pattern as T07.
test('T16 — Code tab loads', async ({}, testInfo) => {
testInfo.annotations.push({ type: 'severity', description: 'Smoke' });
testInfo.annotations.push({
type: 'surface',
description: 'Code tab — top-level UI',
});
// No skipUnlessRow — T16 applies to all rows.
await testInfo.attach('session-env', {
body: JSON.stringify(captureSessionEnv(), null, 2),
contentType: 'application/json',
});
// Seed auth from host (same handshake as T07). Skip cleanly when no
// signed-in host config is available — createIsolation throws with a
// clear message in that case (no host dir, or dir present but
// missing the auth files).
let isolation: Isolation;
try {
isolation = await createIsolation({ seedFromHost: true });
} catch (err) {
const msg = err instanceof Error ? err.message : String(err);
test.skip(true, `seedFromHost unavailable: ${msg}`);
return;
}
const app = await launchClaude({ isolation });
try {
// userLoaded gates on claude.ai URL past /login. With seeded
// auth this should fire well within the default budget on a
// warm cache; if the seed was stale and the renderer bounces
// to /login, postLoginUrl stays absent and we skip.
const ready = await app.waitForReady('userLoaded');
await testInfo.attach('claude-ai-url', {
body: ready.claudeAiUrl ?? '(no claude.ai webContents observed)',
contentType: 'text/plain',
});
if (!ready.postLoginUrl) {
test.skip(
true,
'seeded auth did not reach post-login URL — host config ' +
'may be stale (signed out, expired session, etc.)',
);
return;
}
await testInfo.attach('post-login-url', {
body: ready.postLoginUrl,
contentType: 'text/plain',
});
// Click the Code tab and wait for the Code-tab body to mount.
// CodeTab.activate() does the AX-tree click (role: button,
// accessibleName: "Code") then polls findCompactPills() — the
// env pill rendering is the cheapest signal that the Code-tab
// body is up and interactive. Throws on miss with the candidate
// count for triage. Generous timeout: the Code-tab body has
// more wiring than Chat, and on a cold cache the first
// activation can take a few seconds.
const codeTab = new CodeTab(ready.inspector);
try {
await codeTab.activate({ timeout: 15_000 });
} catch (err) {
// On miss, capture the post-click compact-pill snapshot so
// the failure log shows what (if anything) was on the page
// instead of just "no pills found".
const fallback = await findCompactPills(ready.inspector).catch(
() => [],
);
await testInfo.attach('compact-pills-on-failure', {
body: JSON.stringify(fallback, null, 2),
contentType: 'application/json',
});
throw err;
}
// Diagnostic: the post-activate compact pill list. The env pill
// being present is the assertion (encoded by activate() not
// throwing); the snapshot is captured for case-doc anchor
// refinement and drift detection.
const pills = await findCompactPills(ready.inspector);
await testInfo.attach('compact-pills', {
body: JSON.stringify(pills, null, 2),
contentType: 'application/json',
});
expect(
pills.length,
'at least one compact pill rendered after activating the Code tab ' +
'(env pill is the cheapest "Code-tab body mounted" signal)',
).toBeGreaterThan(0);
} finally {
await app.close();
}
});


@@ -0,0 +1,258 @@
import { test, expect } from '@playwright/test';
import { spawn, execFile } from 'node:child_process';
import { promisify } from 'node:util';
import { launchClaude } from '../lib/electron.js';
import { captureSessionEnv } from '../lib/diagnostics.js';
import { sleep } from '../lib/retry.js';
const exec = promisify(execFile);
// T23 — Desktop notification fires and reaches the session bus.
//
// Tier 2 reframe of the case-doc T23. The full case-doc claim is
// "trigger notification source (T27 scheduled task / T22 PR
// completion / S24 dispatch), observe notification appears in DE
// notification area" — that's Tier 3 because every source needs a
// signed-in account + extra fixtures. Here we collapse the question
// to "does Electron's Notification API on this build still hit
// org.freedesktop.Notifications.Notify on the session bus?" and
// answer it from the inspector with a unique-titled notification
// while a dbus-monitor subprocess records bus traffic.
//
// Code anchors (build-reference/app-extracted/.vite/build/index.js):
// :494456 — `new hA.Notification(r)` (backed by Electron's
// libnotify-equivalent on Linux: a DBus call to
// org.freedesktop.Notifications.Notify).
// :495110 — `showNotification(title, body, tag, navigateTo)` is
// the dispatcher; on Linux it routes through the
// Electron Notification path.
// We don't drive showNotification directly (it's behind minified
// internal modules) — using `electron.Notification` proves the
// underlying surface is reachable, which is the load-bearing claim.
//
// Why a subprocess for monitoring rather than dbus-next:
// - org.freedesktop.Notifications.Notify is a method *call*, not a
// signal. dbus-next's match-rule API is shaped for signals;
// observing method calls TO another connection requires
// `eavesdrop=true` and the broker may reject it. dbus-monitor
// handles the eavesdrop dance for us when broker policy allows.
// - The existing lib/dbus.ts session-bus connection is for
// well-known method calls (GetConnectionUnixProcessID etc.); the
// monitor is short-lived and easier to clean up as a subprocess.
//
// Why dbus-monitor and not gdbus monitor:
// - `gdbus monitor --dest <name>` only sees signals OWNED BY that
// destination (e.g. PropertiesChanged on the daemon), not
// method calls TO it. The Notify is a method call FROM Electron
// TO the daemon, so gdbus monitor can't observe it. dbus-monitor
// installs a real match rule with eavesdrop support.
//
// Skip rules (cleanly, not failures — these are environment shapes,
// not regressions):
// 1. `dbus-monitor` not on PATH (rare on desktop Linux but
// possible in stripped CI containers).
// 2. No owner for `org.freedesktop.Notifications` on the bus
// (no notification daemon registered — minimal session, CI
// runner without a notification daemon, etc.).
//
// No row gate — Notification is a generic Electron surface; every
// row should support it.
// The default 60s timeout is shorter than waitForReady's 90s budget
// alone, let alone our 5s monitor poll and subprocess teardown, so
// extend to 120s. Matches the T25 / T17 pattern.
test.setTimeout(120_000);
async function isOnPath(bin: string): Promise<boolean> {
try {
await exec('which', [bin], { timeout: 2_000 });
return true;
} catch {
return false;
}
}
async function notificationDaemonRegistered(): Promise<boolean> {
// `gdbus call` against o.f.DBus.NameHasOwner returns "(true,)" or
// "(false,)". Subprocess form keeps us off the shared lib/dbus.ts
// connection — this check runs before launchClaude and we don't
// want to warm up a bus connection just for one query.
try {
const { stdout } = await exec(
'gdbus',
[
'call',
'--session',
'--dest',
'org.freedesktop.DBus',
'--object-path',
'/org/freedesktop/DBus',
'--method',
'org.freedesktop.DBus.NameHasOwner',
'org.freedesktop.Notifications',
],
{ timeout: 5_000 },
);
return stdout.trim().startsWith('(true');
} catch {
return false;
}
}
test('T23 — notification reaches org.freedesktop.Notifications.Notify', async ({}, testInfo) => {
testInfo.annotations.push({ type: 'severity', description: 'Critical' });
testInfo.annotations.push({
type: 'surface',
description: 'Desktop notifications',
});
await testInfo.attach('session-env', {
body: JSON.stringify(captureSessionEnv(), null, 2),
contentType: 'application/json',
});
// Pre-flight skip checks — environment-shape, not regression.
if (!(await isOnPath('dbus-monitor'))) {
test.skip(
true,
'dbus-monitor not on PATH (install dbus-tools / dbus package); ' +
'cannot observe Notify method calls without it',
);
return;
}
if (!(await notificationDaemonRegistered())) {
test.skip(
true,
'no owner for org.freedesktop.Notifications on the session bus ' +
'(no notification daemon running) — environment limitation, ' +
'not a regression',
);
return;
}
const uniqueTitle = `T23-${Date.now()}-${Math.floor(Math.random() * 1e6)}`;
await testInfo.attach('unique-title', {
body: uniqueTitle,
contentType: 'text/plain',
});
// Spawn dbus-monitor BEFORE firing the notification so we can't
// race the Notify call. Match rule scopes us to just the Notify
// method on the Notifications interface — keeps the buffer small
// and avoids parsing unrelated bus chatter.
const matchRule =
"interface='org.freedesktop.Notifications',member='Notify'";
const monitor = spawn(
'dbus-monitor',
['--session', matchRule],
{ stdio: ['ignore', 'pipe', 'pipe'] },
);
let buffer = '';
monitor.stdout.on('data', (chunk: Buffer) => {
buffer += chunk.toString('utf8');
});
let stderr = '';
monitor.stderr.on('data', (chunk: Buffer) => {
stderr += chunk.toString('utf8');
});
// Give dbus-monitor ~250ms to install its match rule before
// firing. Without this, a fast Notify can arrive before the
// match rule is registered and we'd never see it.
await sleep(250);
const app = await launchClaude();
let observedAtMs: number | null = null;
let firedAtMs: number | null = null;
try {
const { inspector } = await app.waitForReady('mainVisible');
// `electron.Notification` is the public Electron API and on
// Linux thin-wraps libnotify (a DBus Notify call to the
// daemon). Firing .show() and returning immediately is fine — the
// bus call is fire-and-forget from JS's perspective, and we poll
// the monitor buffer below.
firedAtMs = Date.now();
await inspector.evalInMain<null>(`
const { Notification } = process.mainModule.require('electron');
const n = new Notification({
title: ${JSON.stringify(uniqueTitle)},
body: 'T23 harness probe — ignore me',
silent: true,
});
n.show();
return null;
`);
// Poll buffer for our unique title. 5s budget — notification
// daemons respond fast (sub-100ms typical); if we don't see
// it within 5s the call almost certainly didn't reach the bus.
const deadline = Date.now() + 5_000;
while (Date.now() < deadline) {
if (buffer.includes(uniqueTitle)) {
observedAtMs = Date.now();
break;
}
await sleep(100);
}
} finally {
// Tear down monitor + app in a deterministic order. Monitor
// first so a kill failure doesn't block the (longer) app
// teardown. SIGTERM is enough for dbus-monitor.
try {
monitor.kill('SIGTERM');
} catch {
// already dead
}
await app.close();
}
// Trim buffer to last ~5KB for the attachment. dbus-monitor's
// per-call output is ~600 bytes for a tiny payload, so 5KB is
// plenty of context (last ~8 calls) without bloating the report.
const TRIM = 5 * 1024;
const trimmedBuffer =
buffer.length > TRIM
? `…(${buffer.length - TRIM}b truncated)…\n` + buffer.slice(-TRIM)
: buffer;
const elapsedMs =
observedAtMs !== null && firedAtMs !== null
? observedAtMs - firedAtMs
: null;
await testInfo.attach('dbus-monitor-buffer', {
body: trimmedBuffer || '(empty)',
contentType: 'text/plain',
});
if (stderr) {
await testInfo.attach('dbus-monitor-stderr', {
body: stderr,
contentType: 'text/plain',
});
}
await testInfo.attach('observation', {
body: JSON.stringify(
{
uniqueTitle,
firedAtMs,
observedAtMs,
elapsedMsFireToObserve: elapsedMs,
bufferBytes: buffer.length,
monitorMatchRule: matchRule,
},
null,
2,
),
contentType: 'application/json',
});
expect(
observedAtMs,
'unique-titled Notify method call appeared on the session bus ' +
'within 5s of firing — see dbus-monitor-buffer attachment for ' +
'the captured trace',
).not.toBeNull();
});
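The poll above greps the raw buffer for the unique title, which is all the assertion needs. A stricter variant would parse dbus-monitor's method-call header lines; a sketch, assuming the common one-line header shape (field order can vary across dbus builds, so treat the format as illustrative):

```typescript
// Sketch of a stricter buffer check, assuming dbus-monitor's usual
// one-line method-call header (illustrative, not guaranteed stable):
//   method call time=... sender=:1.97 -> destination=:1.5 serial=12
//     path=/org/freedesktop/Notifications;
//     interface=org.freedesktop.Notifications; member=Notify
function countNotifyCalls(buffer: string): number {
  let count = 0;
  for (const line of buffer.split('\n')) {
    // `member=Notify\b` keeps NotificationClosed etc. from matching;
    // the 'method call' guard excludes signals on the same interface.
    if (line.includes('method call') && /member=Notify\b/.test(line)) {
      count++;
    }
  }
  return count;
}
```

This would let the observation attachment distinguish "our Notify plus background chatter" from "exactly one Notify", at the cost of coupling to dbus-monitor's output format.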


@@ -0,0 +1,113 @@
import { test, expect } from '@playwright/test';
import { launchClaude } from '../lib/electron.js';
import { captureSessionEnv } from '../lib/diagnostics.js';
import {
installShowItemInFolderMock,
getShowItemInFolderCalls,
} from '../lib/claudeai.js';
// T25 — `shell.showItemInFolder` is reachable from main, accepts a
// path arg, and the IPC layer terminates at it without throwing.
//
// Tier 2 reframe of the case-doc T25 ("Code tab → right-click → Show
// in Files opens system file manager with file pre-selected"). The
// full click-chain version is Tier 3 and lives elsewhere; here we
// just prove the JS-level egress at index.js:509431
// (`hA.shell.showItemInFolder(Tc(path))`) is callable from main.
//
// Mock-then-call shape (mirrors T17's installOpenDialogMock pattern):
// monkey-patch `shell.showItemInFolder` to record invocations
// without performing the DBus FileManager1 / xdg-open dispatch, then
// `evalInMain` calls it with a synthetic path. Assertion is the
// recorded calls list contains our path and the call didn't throw.
//
// Why mock instead of invoking real: `showItemInFolder` returns void
// on Linux and gives no success signal, so the only thing the
// real-call form actually tests is "the JS layer is reachable" —
// which the mock tests equally well, without a host-side file-manager
// pop-up firing during the run. The xdg-open layer is OS-dependent
// and out of scope for a JS-level regression detector.
//
// Applies to all rows. No skipUnlessRow gate.
test.setTimeout(120_000);
test('T25 — shell.showItemInFolder reachable, no throw', async ({}, testInfo) => {
testInfo.annotations.push({ type: 'severity', description: 'Should' });
testInfo.annotations.push({
type: 'surface',
description: 'Code tab — show in files',
});
await testInfo.attach('session-env', {
body: JSON.stringify(captureSessionEnv(), null, 2),
contentType: 'application/json',
});
// Synthetic path — the mock doesn't touch the filesystem, so a
// non-existent path is fine. A `/tmp/...` shape mirrors what the
// real IPC handler at index.js:509431 would receive after `Tc()`
// path normalisation.
const syntheticPath = '/tmp/claude-t25-show-in-files-target.txt';
await testInfo.attach('synthetic-path', {
body: syntheticPath,
contentType: 'text/plain',
});
const app = await launchClaude();
try {
// 'mainVisible' is the cheapest level that gives us an
// inspector + a known-good main process. `shell` is a static
// Electron module; doesn't depend on window/renderer state.
const { inspector } = await app.waitForReady('mainVisible');
await installShowItemInFolderMock(inspector);
const start = Date.now();
let threw: { message: string; stack?: string } | null = null;
try {
await inspector.evalInMain<null>(`
const { shell } = process.mainModule.require('electron');
shell.showItemInFolder(${JSON.stringify(syntheticPath)});
return null;
`);
} catch (err) {
threw = {
message: err instanceof Error ? err.message : String(err),
stack: err instanceof Error ? err.stack : undefined,
};
}
const elapsedMs = Date.now() - start;
const calls = await getShowItemInFolderCalls(inspector);
await testInfo.attach('show-item-in-folder-result', {
body: JSON.stringify(
{
path: syntheticPath,
elapsedMs,
threw,
calls,
},
null,
2,
),
contentType: 'application/json',
});
expect(
threw,
'shell.showItemInFolder(<path>) returned without throwing',
).toBeNull();
expect(
calls.length,
'mock recorded the showItemInFolder invocation',
).toBe(1);
expect(
calls[0]?.path,
'mock recorded the synthetic path arg verbatim',
).toBe(syntheticPath);
} finally {
await app.close();
}
});
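`installShowItemInFolderMock` lives in `../lib/claudeai.ts` and isn't part of this diff; the mock-then-call shape it implements reduces to a generic record-and-restore monkey-patch. A sketch under that assumption (the helper name and return shape here are illustrative, not the shipped API):

```typescript
// Generic record-and-restore monkey-patch, modelling the assumed shape
// of installShowItemInFolderMock (the real helper is in
// ../lib/claudeai.ts and is not shown in this diff).
function installRecorder<T extends Record<string, unknown>>(
  target: T,
  method: keyof T & string,
): { calls: unknown[][]; restore: () => void } {
  const original = target[method];
  const calls: unknown[][] = [];
  (target as Record<string, unknown>)[method] = (...args: unknown[]) => {
    // Record the arguments; deliberately skip the real side effect
    // (no file-manager pop-up during the run).
    calls.push(args);
  };
  return {
    calls,
    restore: () => {
      (target as Record<string, unknown>)[method] = original;
    },
  };
}
```

The same shape backs the "stronger assertion" claim in the commit message: the recorded `calls` list proves the exact argument arrived, where the invoke-then-cleanup form could only prove "didn't throw".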


@@ -0,0 +1,256 @@
import { test, expect } from '@playwright/test';
import { launchClaude } from '../lib/electron.js';
import { createIsolation, type Isolation } from '../lib/isolation.js';
import { captureSessionEnv } from '../lib/diagnostics.js';
import { retryUntil } from '../lib/retry.js';
import type { InspectorClient } from '../lib/inspector.js';
import {
type RawElement,
axTreeToSnapshot,
waitForAxTreeStable,
} from '../../explore/walker.js';
// T26 — Routines page renders.
//
// Path: seed auth from the host's signed-in Claude Desktop config into a
// per-test tmpdir, launch the app against that hermetic config, wait
// for `userLoaded` (claude.ai past /login — the sidebar Routines entry
// is rendered by claude.ai's authenticated SPA), find the
// `complementary > button[name="Routines"]` AX node, click it, then
// poll the post-click AX tree for one of the inventory's documented
// page anchors:
// - button[name="New routine"] (form trigger)
// - button[name="All"] or button[name="Calendar"] (list-view tabs)
//
// The complementary-landmark filter isn't needed at the click site —
// "Routines" is a unique accessibleName in the AX tree (verified
// against docs/testing/ui-inventory.json:244). The post-click anchors
// (`New routine`, `All`, `Calendar`) live under
// `main > region[name="Primary pane"]` and only render when the
// Routines page is mounted, so they're a good post-click signal.
//
// Schedule presets (Hourly/Daily/etc.), permission-mode picker, model
// picker, working-folder picker, and worktree toggle live inside the
// New-routine modal — out of T26's scope per the case-doc inventory
// note. Driving into the modal would belong in a sibling test.
interface AxAnchorMatch {
role: string;
name: string;
insideModalDialog: boolean;
}
interface AxAnchorSnapshot {
totalNodes: number;
totalInteractive: number;
matches: AxAnchorMatch[];
}
// Pull a flat AX snapshot via the same primitives lib/claudeai.ts uses.
// Inlined rather than importing a private helper from claudeai.ts —
// snapshotAx isn't exported there and adding a public wrapper for one
// runner is premature abstraction. `fast` skips the upfront stability
// gate so polling loops don't burn ~800ms per iteration.
async function snapshotAx(
inspector: InspectorClient,
opts: { fast?: boolean } = {},
): Promise<RawElement[]> {
if (!opts.fast) {
await waitForAxTreeStable(inspector, {
minNodes: 1,
timeoutMs: 10_000,
});
}
const nodes = await inspector.getAccessibleTree('claude.ai');
return axTreeToSnapshot(nodes);
}
// Find every interactive element whose role+accessibleName matches one
// of the supplied {role, name} pairs. Used both pre-click (to locate
// the Routines sidebar button) and post-click (to confirm the page
// rendered).
function findAnchors(
elements: RawElement[],
wanted: ReadonlyArray<{ role: string; name: string }>,
): AxAnchorMatch[] {
const out: AxAnchorMatch[] = [];
for (const el of elements) {
for (const w of wanted) {
if (el.computedRole !== w.role) continue;
if (el.accessibleName !== w.name) continue;
out.push({
role: el.computedRole,
name: el.accessibleName,
insideModalDialog: el.insideModalDialog,
});
}
}
return out;
}
test('T26 — Routines page renders', async ({}, testInfo) => {
testInfo.annotations.push({ type: 'severity', description: 'Critical' });
testInfo.annotations.push({
type: 'surface',
description: 'Routines page',
});
// No skipUnlessRow — T26 applies to all rows.
await testInfo.attach('session-env', {
body: JSON.stringify(captureSessionEnv(), null, 2),
contentType: 'application/json',
});
// Seed auth from host (kills any running host Claude to release
// LevelDB/SQLite writer locks before copy). Skip cleanly when no
// signed-in host config is available — same pattern as T07.
let isolation: Isolation;
try {
isolation = await createIsolation({ seedFromHost: true });
} catch (err) {
const msg = err instanceof Error ? err.message : String(err);
test.skip(true, `seedFromHost unavailable: ${msg}`);
return;
}
const app = await launchClaude({ isolation });
try {
const ready = await app.waitForReady('userLoaded');
await testInfo.attach('claude-ai-url', {
body: ready.claudeAiUrl ?? '(no claude.ai webContents observed)',
contentType: 'text/plain',
});
if (!ready.postLoginUrl) {
test.skip(
true,
'seeded auth did not reach post-login URL — host config ' +
'may be stale (signed out, expired session, etc.)',
);
return;
}
await testInfo.attach('post-login-url', {
body: ready.postLoginUrl,
contentType: 'text/plain',
});
// Pre-click probe: locate the sidebar Routines button. Wrapped in
// retryUntil + try/catch for "Execution context was destroyed"
// because the renderer can still be mid-navigation when
// waitForReady('userLoaded') resolves (URL-only gate; SPA route
// settle is separate). claude.ai's sidebar mounts a few hundred
// ms after the URL stabilises.
const preClick = await retryUntil(
async () => {
try {
const elements = await snapshotAx(ready.inspector);
const matches = findAnchors(elements, [
{ role: 'button', name: 'Routines' },
]);
if (matches.length === 0) return null;
return {
totalNodes: elements.length,
totalInteractive: elements.length,
matches,
} satisfies AxAnchorSnapshot;
} catch (err) {
const msg = err instanceof Error ? err.message : String(err);
if (msg.includes('context was destroyed')) return null;
throw err;
}
},
{ timeout: 15_000, interval: 500 },
);
await testInfo.attach('routines-sidebar-candidates', {
body: JSON.stringify(preClick, null, 2),
contentType: 'application/json',
});
if (!preClick) {
throw new Error(
'Routines sidebar button never appeared in the AX tree ' +
'within 15s after userLoaded',
);
}
// Re-walk the AX tree to grab the actual RawElement (with the
// backendDOMNodeId we need for the click) — preClick's
// AxAnchorMatch is a diagnostic projection.
const elementsForClick = await snapshotAx(ready.inspector);
const target = elementsForClick.find(
(el) =>
el.computedRole === 'button' &&
el.accessibleName === 'Routines',
);
if (!target || target.backendDOMNodeId === null) {
throw new Error(
'Routines button vanished between probe and click, or had ' +
'no backendDOMNodeId',
);
}
await ready.inspector.clickByBackendNodeId(
'claude.ai',
target.backendDOMNodeId,
);
// Post-click: gate once on AX-tree stability so the first poll
// iteration sees the populated page tree, then poll fast for any
// of the documented page anchors. Mirrors openPill's pattern in
// lib/claudeai.ts — re-gating on every iteration would burn
// ~800ms per cycle waiting for "no change" when what we want is
// "page anchors appear".
await waitForAxTreeStable(ready.inspector, {
minNodes: 1,
timeoutMs: 10_000,
});
const expected = [
{ role: 'button', name: 'New routine' },
{ role: 'button', name: 'All' },
{ role: 'button', name: 'Calendar' },
] as const;
const postClick = await retryUntil(
async () => {
try {
const elements = await snapshotAx(ready.inspector, {
fast: true,
});
const matches = findAnchors(elements, expected);
if (matches.length === 0) return null;
return {
totalNodes: elements.length,
totalInteractive: elements.length,
matches,
} satisfies AxAnchorSnapshot;
} catch (err) {
const msg = err instanceof Error ? err.message : String(err);
if (msg.includes('context was destroyed')) return null;
throw err;
}
},
{ timeout: 5_000, interval: 200 },
);
await testInfo.attach('routines-page-anchors', {
body: JSON.stringify(
postClick ?? { matches: [], note: 'no anchors observed' },
null,
2,
),
contentType: 'application/json',
});
expect(
postClick,
'one of [New routine | All | Calendar] appeared in the AX tree ' +
'within 5s after clicking the Routines sidebar button',
).not.toBeNull();
expect((postClick?.matches ?? []).length).toBeGreaterThan(0);
} finally {
await app.close();
}
});
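`retryUntil` is imported from `../lib/retry.js`, which isn't in this diff. From the call sites above it polls an async probe, treats `null` as "not yet", and resolves `null` on timeout rather than throwing (the runner checks `if (!preClick)` itself). A sketch under those assumptions:

```typescript
// Assumed shape of retryUntil (the real one lives in ../lib/retry.ts,
// not shown): poll `probe` every `interval` ms; the first non-null
// result wins; resolve null on timeout instead of throwing, so
// callers own the failure message.
async function retryUntil<T>(
  probe: () => Promise<T | null>,
  opts: { timeout: number; interval: number },
): Promise<T | null> {
  const deadline = Date.now() + opts.timeout;
  for (;;) {
    const result = await probe();
    if (result !== null) return result;
    if (Date.now() >= deadline) return null;
    await new Promise((r) => setTimeout(r, opts.interval));
  }
}
```

Resolving `null` rather than throwing is what lets the runner attach the pre/post-click AX diagnostics before deciding how to fail.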


@@ -0,0 +1,144 @@
import { test, expect } from '@playwright/test';
import { launchClaude } from '../lib/electron.js';
import { captureSessionEnv } from '../lib/diagnostics.js';
// T38 — `LocalSessions.openInEditor` IPC handler is registered in main.
//
// Backs T38 in docs/testing/cases/code-tab-handoff.md ("Continue in
// IDE" — click chooser → IDE opens at the working directory). Same
// IPC surface as T24 ("Open in external editor"); per the case-doc,
// the "Continue in" chooser UI is rendered server-side by claude.ai
// and absent from the local asar — only the IPC bridge is anchorable.
//
// Tier 2 reframe (per docs/testing/runner-implementation-plan.md
// T38, line ~419): the full click-chain (login + IDE installed +
// chooser interaction) is Tier 3. Tier 2's slice asserts that the
// `LocalSessions.openInEditor(path, editor, sshConfig, line)` IPC
// handler is wired up in the main process — i.e. the renderer can
// reach it. If the handler is unregistered (rename, refactor that
// drops the registration, missing module load), the renderer's
// invoke() would reject with "No handler registered for ..." and
// every Tier-3 path through the chooser silently regresses with no
// other signal in the bundle.
//
// Why introspect, not invoke:
// - Invoking the handler would call A.openInEditor(...) which
// terminates at shell.openExternal('vscode://file/...') (case-doc
// anchor index.js:464011). On a host with VS Code installed,
// that's a real side effect (editor launches). T25 accepts that
// trade-off because the file manager popup is a single window;
// here the side effect is launching a full editor app.
// - The Tier 2 contract is "wired up", not "doesn't throw on a
// synthetic path". Invoking would also trip the channel's origin
// validation (`le(i)` at index.js:68820 rejects non-claude.ai
// senders), so invoking from main wouldn't even reach the impl.
//
// Channel-name shape (anchor index.js:68816, verified against the
// bundled source):
// $eipc_message$_<UUID>_$_claude.web_$_LocalSessions_$_openInEditor
// The UUID (`c0eed8c9-c94a-4931-8cc3-3a08694e9863` in the current
// bundle) appears to be build-stable but is not guaranteed across
// releases, so we match on the suffix `LocalSessions_$_openInEditor`
// rather than the full string. Surfacing the prefix as a diagnostic
// helps spot if the IPC framing ever changes.
//
// `ipcMain._invokeHandlers` is a private Electron API (a Map of
// channel → async handler that backs `ipcMain.handle()` /
// `ipcRenderer.invoke()`). It exists in current Electron and is
// already relied on by `lib/quickentry.ts` (captureSubmitIpc). If
// upstream Electron ever drops it, the registry-shape assertion
// below fails loudly ('Map' expected, got 'null') — see Open
// questions in the runner-implementation-plan.md follow-up doc.
//
// Applies to all rows. No skipUnlessRow gate.
interface HandlerProbe {
invokeHandlersType: string;
invokeHandlersSize: number | null;
localSessionsChannels: string[];
openInEditorChannel: string | null;
}
test.setTimeout(60_000);
test('T38 — LocalSessions.openInEditor IPC handler is registered', async ({}, testInfo) => {
testInfo.annotations.push({ type: 'severity', description: 'Should' });
testInfo.annotations.push({
type: 'surface',
description: 'Code tab — open in IDE',
});
await testInfo.attach('session-env', {
body: JSON.stringify(captureSessionEnv(), null, 2),
contentType: 'application/json',
});
const app = await launchClaude();
try {
// 'mainVisible' is the cheapest level that gives us an
// inspector + a known-good main process. The IPC handlers
// register during main bootstrap (alongside the rest of the
// LocalSessions surface) and are live before any renderer
// state matters; we don't need 'claudeAi' or 'userLoaded'.
const { inspector } = await app.waitForReady('mainVisible');
const probe = await inspector.evalInMain<HandlerProbe>(`
const { ipcMain } = process.mainModule.require('electron');
// Electron's ipcMain.handle() registry is a Map keyed by
// channel name. The property is undocumented but stable
// (also used by lib/quickentry.ts captureSubmitIpc).
const reg = ipcMain._invokeHandlers;
const invokeHandlersType = reg == null
? 'null'
: (reg instanceof Map ? 'Map' : typeof reg);
let channels = [];
let size = null;
if (reg instanceof Map) {
size = reg.size;
channels = Array.from(reg.keys());
} else if (reg && typeof reg === 'object') {
// Defensive: older/newer Electron builds may use a
// plain object instead of a Map. Surface both.
channels = Object.keys(reg);
size = channels.length;
}
const localSessionsChannels = channels.filter((c) =>
typeof c === 'string' && c.includes('LocalSessions_$_'),
);
const openInEditorChannel = channels.find((c) =>
typeof c === 'string'
&& c.endsWith('LocalSessions_$_openInEditor'),
) ?? null;
return {
invokeHandlersType,
invokeHandlersSize: size,
localSessionsChannels,
openInEditorChannel,
};
`);
await testInfo.attach('ipc-handler-probe', {
body: JSON.stringify(probe, null, 2),
contentType: 'application/json',
});
// Hard-fail if the registry shape itself changed — without
// this, the empty-list match below would be ambiguous between
// "channel missing" and "we couldn't read the registry at all".
expect(
probe.invokeHandlersType,
'ipcMain._invokeHandlers is a Map (Electron private API ' +
'still available in this build)',
).toBe('Map');
expect(
probe.openInEditorChannel,
'LocalSessions.openInEditor IPC handler is registered ' +
'(channel suffix `LocalSessions_$_openInEditor` present ' +
'in ipcMain._invokeHandlers; case-doc T38 anchor ' +
'index.js:68816)',
).not.toBeNull();
} finally {
await app.close();
}
});
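The suffix-anchored match inside the `evalInMain` probe can be expressed as a small pure helper. A sketch; the UUID capture is diagnostic only, and the 36-character hyphenated prefix shape is an assumption about the eIPC framing per the comment above, not a guarantee:

```typescript
// Sketch of the suffix-anchored channel match the probe performs in
// main. Framing per the case-doc anchor:
//   $eipc_message$_<UUID>_$_claude.web_$_LocalSessions_$_openInEditor
// Matching on the suffix means a rotated build UUID can't break the
// lookup; the captured prefix is surfaced purely for drift triage.
function findEipcChannel(
  channels: string[],
  suffix: string,
): { channel: string; uuid: string | null } | null {
  const channel = channels.find((c) => c.endsWith(suffix));
  if (channel === undefined) return null;
  const m = /^\$eipc_message\$_([0-9a-fA-F-]{36})_\$_/.exec(channel);
  return { channel, uuid: m?.[1] ?? null };
}
```

Usage mirrors the probe: call with the channel list from `ipcMain._invokeHandlers` keys and the suffix `LocalSessions_$_openInEditor`, then attach both fields.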