mirror of
https://github.com/aaddrick/claude-desktop-debian.git
synced 2026-05-17 00:26:21 +03:00
docs(testing): session 6 plan/inventory + rotate session 7 prompt
Plan-doc Status (post-execution): session 6 section added at top covering S14 + lib/input-niri.ts ship + the cross-compositor-files-not-dispatcher reasoning + Category B (eipc-registry exposer) carrying over to session 7 unattempted. Untested-on-real-Niri caveats explicitly documented (Ok-wrapper schema version, Claude app_id literal value, foot-on-PATH) so the first Niri-row sweep knows what to confirm without re-deriving the recon.

README inventory updated to 62 specs (24 cross-env T-tests, 33 env-specific S-tests, 5 H-prefix harness self-tests). S14 row added; lib/input-niri.ts entry added to the substrate-primitives layout block and to the lib/ paragraph that lists each primitive's consumer specs.

Followup prompt rewritten for session 7. Main bet now shifts to:

- A: eipc-registry exposer (now the cleanest single-session win available; sessions 3-6 each kept punting because lower-risk work was on the table; with the obvious focus-shifter / mock-then-call substrate work landed, Category A is the only path forward to proper Tier 2 runtime probes for T22/T31/T33/T38 AND unblocks T35 Phase 2 / T37 Phase 2). Three approaches documented for the inspector walk: module-level grep for registry exposers, hook-the-eipc-registration-site, patch-in-a-dev-only-exposer.
- B: T35 Phase 2 / T37 Phase 2 paired with Category A. Skip unless A lands first.
- C: Single-spec deferred items audit (S20 still open on #569; T34 OAuth round-trip; T36 Phase 2 reclassified out; cross-compositor S14 variants speculative without a consumer).

New constraints from session 6 documented in the prompt:

- lib/input-niri.ts stays Niri-only by design: strict XDG_CURRENT_DESKTOP === 'niri' gate. Sway / Hyprland / River consumers must skip or live in their own per-compositor files.
- Don't speculate on a lib/input-wayland.ts dispatcher. Per-compositor files until a second Wayland consumer lands.

Cumulative "stop and report" outcome count bumped to ~13 across sessions 1-6 (added: session-6 lib/input-niri.ts shipped untested-on-niri).

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
@@ -1,19 +1,20 @@
|
||||
# test-harness runner implementation — session 6 prompt
|
||||
# test-harness runner implementation — session 7 prompt
|
||||
|
||||
This file is meant to be **copied verbatim into a fresh Claude Code
|
||||
session** as the initial user message. Don't paraphrase it; the
|
||||
orchestration depends on the exact directives below.
|
||||
|
||||
You're picking up after a runner-implementation session that landed 1
|
||||
new spec (T18) and a load-bearing reclassification finding (T36
|
||||
Phase 2 is no longer a Tier 2 candidate). Coverage 60/76 (79%) →
|
||||
61/76 (80%). One commit on `docs/compat-matrix`:
|
||||
new spec (S14) and 1 new primitive (`lib/input-niri.ts`). Coverage
|
||||
61/76 (80%) → 62/76 (82%). One commit on `docs/compat-matrix`:
|
||||
|
||||
- `XXX` — `test(harness): session 5 runner + SessionStart-fires-on-
|
||||
prompt finding` (T18 Tier 1 fingerprint pinning the drag-drop
|
||||
preload bridge in `mainView.js`; plan-doc updated with the
|
||||
SessionStart-hook trace + Code-tab AX anchor capture + S14 niri
|
||||
msg recon verdict).
|
||||
- `XXX` — `test(harness): session 6 runner + niri-native focus-shifter
|
||||
primitive` (S14 Tier 2 known-failing detector for the Niri portal
|
||||
`BindShortcuts` path, mirrored from S11's shape with imports swapped
|
||||
to the new `lib/input-niri.ts` primitive; primitive uses
|
||||
`niri msg --json windows` / `niri msg action focus-window` /
|
||||
`niri msg --json focused-window` chain plus `foot --title` for the
|
||||
marker window).
|
||||
|
||||
(Substitute the actual SHA after committing — the user reviews and
|
||||
commits at the end of every session.)
|
||||
@@ -22,227 +23,199 @@ The plan doc at
|
||||
[`docs/testing/runner-implementation-plan.md`](runner-implementation-plan.md)
|
||||
captures the tier classification and execution-time reclassifications.
|
||||
Its "Status (post-execution)" section is the source of truth for
|
||||
what's done and what's deferred — read **session 5** first, then
|
||||
**session 4**, then **session 3**, then **session 2**, then **session
|
||||
1** sub-sections.
|
||||
what's done and what's deferred — read **session 6** first, then
|
||||
**session 5**, then **session 4**, then **session 3**, then **session
|
||||
2**, then **session 1** sub-sections.
|
||||
|
||||
This session is a continuation, not a restart. Start by reading the
|
||||
plan doc's status sections.
|
||||
|
||||
### Big new findings from session 5
|
||||
### Big new findings from session 6
|
||||
|
||||
1. **SessionStart hook fires after first prompt submission, not on
|
||||
New-session click.** Trace through bundled `index.js`:
|
||||
`Ys.startSession` (`:454743` general, `:489371` CCD/Code-tab)
|
||||
requires `A.message`; the session record stores it as
|
||||
`initialMessage` (:489270); the agent SDK process is spawned via
|
||||
`DN({ prompt: k, options: v })` (`:489514`) only when there's a
|
||||
prompt stream to bind to. `createOrResumeSession` (`:489208`)
|
||||
creates the session record but doesn't spawn the agent. The
|
||||
SessionStart hook fires inside the agent SDK process once it
|
||||
boots — therefore only after a real prompt submission, which is
|
||||
a real-account write. **T36 Phase 2 reclassified Tier 2 →
|
||||
Tier 3/4**; unmockable without deep agent-SDK reverse-engineering.
|
||||
2. **Code-tab session-opener AX surface verified — anchors saved in
|
||||
plan-doc.** A one-shot AX-tree probe against the user's
|
||||
debugger-enabled running Claude (deleted after capture) confirmed:
|
||||
- **Top-tab Code button**: `button[name="Code"]` under
|
||||
`group[Mode]` under `complementary`. Disambiguator from the
|
||||
prompt-mode `tab[name="Code"]` in
|
||||
`tablist[name="Prompt categories"]` (which is what T16's
|
||||
existing `CodeTab.activate()` clicks).
|
||||
- **Sidebar entries**: `button[name="New session ⌘N"]`,
|
||||
`button[name="Routines"]`, `button[name="Customize"]`,
|
||||
`button[name="More navigation items"]`,
|
||||
`button[name="Pinned"]` / `button[name="Recents"]`.
|
||||
- **Recents items**: `button[name="<status> <title>"]` where
|
||||
status ∈ {Idle, Ready, Needs input, Awaiting input}. Main-pane
|
||||
Welcome surface uses `button[name="Open session <title>"]`.
|
||||
- **URL of Code-tab landing**: `/epitaxy`.
|
||||
No primitive shipped — these anchors live in the plan-doc until a
|
||||
consumer needs them. Premature abstraction is wrong abstraction.
|
||||
3. **niri msg IPC contract: `--json` shape is stable.** Wiki
|
||||
explicitly contracts the JSON output; plain text is unstable.
|
||||
`niri msg --json windows` returns `Vec<Window>` with `{id, title,
|
||||
app_id, pid, workspace_id, is_focused, ...}`; `niri msg action
|
||||
focus-window --id <u64>` injects focus; `niri msg --json
|
||||
focused-window` is the honest readback. `foot --title <T> -e
|
||||
sleep 600` is the Wayland-native marker (takes `--title` cleanly,
|
||||
ships in most niri setups). Niri 25.08+ has opt-in
|
||||
`xwayland-satellite` integration — existing X11 primitive *might*
|
||||
work on niri rows where it's running, but can't assume.
|
||||
4. **T18 Tier 1 fingerprint shipped against `mainView.js`, not
|
||||
`index.js`.** First runner to read a non-`index.js` source from
|
||||
the asar. `lib/asar.ts` already supports this via the existing
|
||||
`readAsarFile(filename, asarPath)` shape — no helper extraction
|
||||
needed. The case-doc anchor strings (`getPathForFile`, `webUtils`,
|
||||
`filePickers`, `claudeAppSettings`) are property names that
|
||||
survive minification verbatim — no minified-vs-beautified gotcha
|
||||
(unlike T35's `~/.claude.json` → `.claude.json`).
|
||||
5. **Tier 2/3 OS-level drag-drop is a primitive gap on BOTH
|
||||
backends.** X11 xdotool can simulate mouse motion but cannot put
|
||||
file URIs on the XDND selection (Chromium's drop handler would
|
||||
never see a file payload). Wayland needs per-compositor IPC +
|
||||
libei input injection. A real test needs either a custom XDND
|
||||
source app (X11) or a libei emitter (Wayland). The xdotool form
|
||||
the session-5 prompt suggested for T18 was a stub by this lens —
|
||||
pivot to Tier 1 was the right call.
|
||||
1. **`lib/input-niri.ts` shipped against session 5's recon — untested
|
||||
on real Niri.** The primitive landed against the recon notes
|
||||
without a live Niri row run. The first real Niri sweep will confirm:
|
||||
- The `Ok`-wrapper unwrap covers the niri version on the row. The
|
||||
primitive defensively handles both `{Ok: {FocusedWindow: ...}}`
|
||||
(older niri) and the bare-payload shape (newer niri); a third
|
||||
shape would fall through to `null` rather than crash.
|
||||
- Claude's `app_id` value on niri is literal `'Claude'`. The
|
||||
primitive's `app_id !== 'Claude'` guard becomes a no-op rather
|
||||
than wrong if the actual value differs (match still happens by
|
||||
title); tighten if needed.
|
||||
- `foot` is on the target row's PATH. Skip path is clean if not
|
||||
(`FootUnavailable` typed error → `testInfo.skip()` with install
|
||||
hint).
|
||||
Verified on KDE-W: the runner skips correctly via the row gate.
|
||||
2. **S14 is a known-failing detector by design.** Case-doc S14
|
||||
currently records `Failed to call BindShortcuts (error code 5)` on
|
||||
Niri. Same shape as S12's GNOME-W
|
||||
`--enable-features=GlobalShortcutsPortal` detector — the spec
|
||||
encodes the contract and will start passing on Niri rows once the
|
||||
upstream / Chromium-side portal issue resolves, without any spec
|
||||
edit.
|
||||
3. **Cross-compositor dispatcher deliberately not built.** Sway /
|
||||
Hyprland / River each have completely different IPCs (`swaymsg`,
|
||||
`hyprctl`, `riverctl`). Per-compositor files until a second
|
||||
consumer surfaces — a hypothetical `lib/input-wayland.ts` would
|
||||
just switch on `XDG_CURRENT_DESKTOP` and delegate. With only S14
|
||||
consuming `lib/input-niri.ts`, a dispatcher would be ceremony.
|
||||
Same anti-speculation rule that kept `lib/electron-mocks.ts`
|
||||
(session 3) and `lib/input.ts` (session 4) threshold-driven.
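
The defensive unwrap described in finding 1 can be pictured as a small pure function. This is a minimal sketch, not the shipped `lib/input-niri.ts` code: the name `unwrapFocusedWindow`, the `isWindow` guard, and the three-field `NiriWindow` subset are assumptions made for illustration. The two accepted shapes are the ones named in the finding.

```typescript
// Hypothetical subset of a niri window record; the real schema has more fields.
interface NiriWindow {
	id: number;
	title: string | null;
	app_id: string | null;
}

// Loose guard: only checks for a numeric `id`, which is all the sketch needs.
function isWindow(value: unknown): value is NiriWindow {
	if (typeof value !== 'object' || value === null) return false;
	return typeof (value as { id?: unknown }).id === 'number';
}

// Takes the already-parsed output of `niri msg --json focused-window`.
// Shape 1 (per the recon, older niri): { Ok: { FocusedWindow: {...} } }
// Shape 2 (per the recon, newer niri): the bare window object
// Any other shape returns null instead of throwing.
function unwrapFocusedWindow(parsed: unknown): NiriWindow | null {
	if (isWindow(parsed)) return parsed;
	if (typeof parsed !== 'object' || parsed === null) return null;
	const ok = (parsed as { Ok?: unknown }).Ok;
	if (typeof ok !== 'object' || ok === null) return null;
	const focused = (ok as { FocusedWindow?: unknown }).FocusedWindow;
	return isWindow(focused) ? focused : null;
}
```

Returning `null` for an unknown shape keeps the caller's "no focused window" branch as the single failure path, which is what lets a third schema degrade to a failed assertion rather than a crash.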
|
||||
|
||||
### Authoritative reference
|
||||
|
||||
Read these in order before fanning out:
|
||||
|
||||
- [`docs/testing/runner-implementation-plan.md`](runner-implementation-plan.md)
|
||||
— tier classification + status section. Read **session 5**,
|
||||
**session 4**, **session 3**, **session 2**, then **session 1**
|
||||
"Status (post-execution)" sub-sections. The Tier-3 list (around
|
||||
line 690 — search for "## Tier 3") is the candidate pool for
|
||||
further reframes; T18 has now landed (was Tier 3, shipped Tier 1)
|
||||
and T36 Phase 2 reclassified to Tier 3/4.
|
||||
— tier classification + status section. Read **session 6**,
|
||||
**session 5**, **session 4**, **session 3**, **session 2**, then
|
||||
**session 1** "Status (post-execution)" sub-sections. The Tier-3
|
||||
list (search for "## Tier 3") is the candidate pool for further
|
||||
reframes.
|
||||
- [`tools/test-harness/README.md`](../../tools/test-harness/README.md)
|
||||
— runner conventions, the now-61-spec inventory, primitives in
|
||||
— runner conventions, the now-62-spec inventory, primitives in
|
||||
`lib/`, isolation defaults, the CDP-gate workaround, the eipc
|
||||
note.
|
||||
- [`docs/testing/cases/README.md`](cases/README.md) — case-doc
|
||||
structure and the four anchor scopes.
|
||||
- [`tools/test-harness/src/lib/`](../../tools/test-harness/src/lib/)
|
||||
— the existing primitives. No new primitives in session 5.
|
||||
Notable: `input.ts` remains strict X11-only by design; do NOT bolt
|
||||
Wayland into it. If session 6 builds the niri-native sibling, put
|
||||
it in `lib/input-niri.ts` (per-compositor file, NOT a Wayland
|
||||
catch-all — sway/hyprland/river have totally different IPCs).
|
||||
— the existing primitives. Notable session 6 addition:
|
||||
`input-niri.ts` (Niri-only, `niri msg --json` IPC + `foot` marker;
|
||||
sibling of X11-only `input.ts`). DO NOT bolt other Wayland
|
||||
compositors into `input-niri.ts` — per-compositor files only.
|
||||
- [`tools/test-harness/src/runners/`](../../tools/test-harness/src/runners/)
|
||||
— every existing spec is a template. Notable session 5 templates:
|
||||
- `T18_drag_drop_files_into_prompt.spec.ts` — first runner to
|
||||
read a non-`index.js` source (`mainView.js`). Pattern for any
|
||||
future fingerprint that anchors on the preload bundle (e.g.
|
||||
bridge wiring, contextBridge exposes).
|
||||
— every existing spec is a template. Notable session 6 templates:
|
||||
- `S14_quick_entry_from_other_focus_niri.spec.ts` — first runner
|
||||
consuming `lib/input-niri.js`. Pattern for any future
|
||||
Niri-specific runner that needs Wayland-native focus injection.
|
||||
- [`docs/testing/cases/*.md`](cases/) — the spec each runner
|
||||
asserts. The **Code anchors:** field tells you exactly where
|
||||
upstream implements the feature.
|
||||
|
||||
### Tests in scope this session
|
||||
|
||||
**Realistic ceiling: ~3 new specs OR one new primitive landing.**
|
||||
Session 5 ran light (1 spec + 1 doc finding) because the runtime
|
||||
probe + bundled-source trace consumed half the budget. Session 6's
|
||||
clearest single-session win is **Category A — `lib/input-niri.ts`
|
||||
+ S14 runner** because:
|
||||
**Realistic ceiling: ~3 new specs OR one new primitive landing.** The
|
||||
session 6 work (1 spec + 1 primitive) was at the lower end of the
|
||||
ceiling because the primitive build was substantial. The obvious
|
||||
focus-shifter / mock-then-call substrate work is now done — the next
|
||||
session's main bets are narrower in shape.
|
||||
|
||||
- The recon already sketched the primitive API (mirrors
|
||||
`lib/input.ts`'s shape, swaps xdotool/xprop for `niri msg`).
|
||||
- The niri IPC contract is stable in `--json` mode per the wiki.
|
||||
- S14 is the single consumer waiting on it.
|
||||
- The `lib/input.ts` extraction in session 4 is a direct template.
|
||||
**Category B (eipc-registry exposer) is now the cleanest single-
|
||||
session win available.** Sessions 3-6 each kept punting Category B
|
||||
because (a) other lower-risk work was on the table (focus-shifter,
|
||||
mock-then-call extraction, Tier 1 fingerprints) and (b) session 3's
|
||||
inspector walk came up empty. With the obvious work landed, Category
|
||||
B is the only path forward to proper Tier 2 runtime probes for
|
||||
T22/T31/T33/T38 (currently shipped as Tier 1 fingerprints) AND
|
||||
unblocks T35 Phase 2 / T37 Phase 2.
|
||||
|
||||
Three categories — pick ONE as the main bet, treat the others as
|
||||
fallback if the main bet hits an early blocker:
|
||||
|
||||
| # | Tests | Source | Notes |
|
||||
|---|---|---|---|
|
||||
| **A** `lib/input-niri.ts` + S14 runner | S14 | new `lib/input-niri.ts` + S14 runner | Recon-sketched; niri IPC contract is stable in `--json` mode. Cleanest single-session win. |
|
||||
| **B** eipc-registry exposer | unblocks T22/T31/T33/T38 Tier 2 reframes | `lib/electron.ts` or new `lib/eipc.ts` | High-risk-high-reward closure-local reverse-engineering. Same warning as sessions 4 / 5: session 3's inspector walk came up empty; needs a fresh approach. |
|
||||
| **C** Single-spec deferred items audit | various (T35 Phase 2 / T37 Phase 2 still blocked on closure-local readback; T36 Phase 2 NO LONGER A CANDIDATE) | — | Lower ceiling, higher confidence per spec. |
|
||||
| **A** eipc-registry exposer | unblocks T22/T31/T33/T38 Tier 2 reframes + T35 Phase 2 / T37 Phase 2 | new `lib/eipc.ts` (or extension to `lib/electron.ts`) | High-risk-high-reward closure-local reverse-engineering. Session 3's inspector walk via `globalThis` came up empty; sessions 4/5/6 each skipped for budget. **Now the cleanest single-session win** — needs a fresh approach. |
|
||||
| **B** T35 Phase 2 / T37 Phase 2 (paired with eipc-registry exposer) | T35 Phase 2, T37 Phase 2 | depends on Category A | Only viable if Category A lands first. Don't attempt without it. |
|
||||
| **C** Single-spec deferred items audit | various deferred items | — | Lower ceiling, higher confidence per spec. Best fallback if Category A turns up empty. |
|
||||
|
||||
#### Category A — `lib/input-niri.ts` + S14 runner
|
||||
#### Category A — eipc-registry exposer
|
||||
|
||||
The session 5 recon's TRACTABLE verdict gives the API sketch
|
||||
verbatim:
|
||||
The closure-local IPC registry near `:68820` (`le(i)` origin
|
||||
validation) and `:68816` (channel framing) is what T22/T31/T33/T38
|
||||
should be probing at runtime — instead they all ship as Tier 1
|
||||
asar fingerprints because session 3 confirmed the standard
|
||||
`ipcMain._invokeHandlers` map only carries three chat-tab MCP-bridge
|
||||
handlers, not the `LocalSessions_$_*` / `CustomPlugins_$_*` channels.
|
||||
The custom `$eipc_message$_<UUID>_$_claude.web_$_<name>` protocol
|
||||
uses a closure-local message-port registry that's not introspectable
|
||||
from main without reverse-engineering the eipc bootstrap.
|
||||
|
||||
- `spawnMarkerWindow(title)` → `child_process.spawn('foot',
|
||||
['--title', title, '-e', 'sleep', '600'], {detached:true})`;
|
||||
teardown via PID + SIGTERM. Mirrors the X11 primitive's xterm
|
||||
pattern.
|
||||
- `focusOtherWindow(title)` → `niri msg --json windows`,
|
||||
`JSON.parse`, find row where `title === wantedTitle && app_id !==
|
||||
'Claude'`, then `niri msg action focus-window --id <id>`, then
|
||||
re-read `niri msg --json focused-window` and assert `id` matches.
|
||||
This gives the honest readback that S11's primitive needs.
|
||||
- `getFocusedWindowId()` → `niri msg --json focused-window` →
|
||||
`.Ok.FocusedWindow?.id ?? null`.
|
||||
- `isNiriSession()` → check `XDG_CURRENT_DESKTOP === 'niri'` OR
|
||||
`niri msg version` exits 0 (the latter is more honest because
|
||||
XDG_CURRENT_DESKTOP can be overridden — but adds a process-spawn
|
||||
cost on every call; cache the result).
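
The selection and readback steps of `focusOtherWindow` in the sketch above reduce to two pure checks over parsed `niri msg --json` output. The following is an illustrative sketch only: `WindowRow`, `pickMarkerWindow`, and `focusLanded` are hypothetical names, and spawning `niri msg` itself is deliberately left out.

```typescript
// Hypothetical subset of one row from `niri msg --json windows`.
interface WindowRow {
	id: number;
	title: string | null;
	app_id: string | null;
}

// Returns the id of the first window whose title matches exactly and whose
// app_id is not Claude's, or null when no such row exists.
function pickMarkerWindow(rows: WindowRow[], wantedTitle: string): number | null {
	for (const row of rows) {
		if (row.title === wantedTitle && row.app_id !== 'Claude') {
			return row.id;
		}
	}
	return null;
}

// Honest readback: after `niri msg action focus-window --id <id>`, the id
// reported by `niri msg --json focused-window` must equal the requested id.
function focusLanded(requestedId: number, focusedId: number | null): boolean {
	return focusedId !== null && focusedId === requestedId;
}
```

If Claude's real `app_id` turns out not to be the literal `'Claude'`, the second condition never excludes anything and the match falls back to title alone, which is the no-op behaviour the recon describes.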
|
||||
**Approaches that have NOT been tried (good starting points):**
|
||||
|
||||
S14 runner shape: near-clone of `S11_quick_entry_from_other_focus.spec.ts`
|
||||
with the import swapped from `lib/input.js` to `lib/input-niri.js`
|
||||
and the row gate flipped from `['GNOME-X', 'Ubu-X']` to `['Niri']`.
|
||||
The X11-side "what this catches vs what it doesn't" leading
|
||||
comment from S11 has a Niri-side equivalent: this catches a
|
||||
regression in the Wayland path of the global shortcut on Niri (the
|
||||
load-bearing concern the case-doc carries forward from the S11
|
||||
mutter regression discussion).
|
||||
1. **Module-level grep for symbol references** — search the bundled
|
||||
`index.js` near `:68816` and `:68820` for any
|
||||
`Object.defineProperty` / `globalThis[`...`]` / `module.exports`
|
||||
call that exposes the registry to a reachable surface.
|
||||
2. **Hook the eipc message-port creation site** — instead of looking
|
||||
for a registry to inspect post-hoc, hook the registration site
|
||||
itself. If the channel-name string flows through a single
|
||||
function call, install a prototype-method hook at that site (see
|
||||
the hook pattern in
|
||||
[`docs/learnings/test-harness-electron-hooks.md`](../../learnings/test-harness-electron-hooks.md))
|
||||
and accumulate names into a side-channel map the test can read.
|
||||
3. **Patch in a dev-only registry exposer** — pre-launch, modify
|
||||
`index.js` (via the harness's `lib/asar.ts` write path) to add
|
||||
`globalThis.__eipcChannels = ...` near the registration site.
|
||||
Idempotent + reversible; the patched asar is per-test isolation
|
||||
so it doesn't leak.
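
Approach 2 can be sketched in isolation before touching the bundle. The block below is a hedged illustration of the hook-and-accumulate idea only: `FakeRegistry`, `Registrar`, and `hookRegister` are hypothetical stand-ins, the real registration site and its method name are unknown, and the channel names in the usage note are made up.

```typescript
type Handler = (...args: unknown[]) => unknown;

// Assumed shape of whatever object owns the registration method.
interface Registrar {
	register(channel: string, handler: Handler): void;
}

// Stand-in for the closure-local registry; not the bundle's real class.
class FakeRegistry implements Registrar {
	private channels: string[] = [];
	register(channel: string, _handler: Handler): void {
		this.channels.push(channel);
	}
	count(): number {
		return this.channels.length;
	}
}

// Replaces `proto.register` with a wrapper that records each channel name
// and then delegates to the original. `unhook` restores the original method.
function hookRegister(proto: Registrar): { seen: string[]; unhook: () => void } {
	const original = proto.register;
	const seen: string[] = [];
	proto.register = function (this: Registrar, channel: string, handler: Handler): void {
		seen.push(channel);
		original.call(this, channel, handler);
	};
	return {
		seen,
		unhook: () => {
			proto.register = original;
		},
	};
}
```

Calling `hookRegister(FakeRegistry.prototype)` before any instance registers a channel leaves `seen` holding every name in registration order, while the registry's own behaviour is unchanged. The approach only works if the real registration goes through a method reachable from outside the closure, which is the open question.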
|
||||
|
||||
**Cross-compositor consideration (do NOT bolt in this session):**
|
||||
Sway / Hyprland / River each have totally different IPCs.
|
||||
Per-compositor files (`lib/input-sway.ts`, `lib/input-hypr.ts`,
|
||||
…) are cleaner than a unified abstraction. A `lib/input-wayland.ts`
|
||||
dispatcher would just be a switch on `XDG_CURRENT_DESKTOP` that
|
||||
delegates. Don't speculate on it this session — let the second
|
||||
consumer drive the dispatcher.
|
||||
|
||||
**STOP AND REPORT** if: (a) `niri msg` output shape doesn't match
|
||||
the recon (the wiki contract is `--json` only, but the output
|
||||
schema may shift between niri versions even within the contract);
|
||||
(b) `foot` isn't on the target row's PATH (the primitive should
|
||||
fall back to `alacritty` / `kitty` / fail with a clear typed
|
||||
error matching `lib/input.ts`'s `XdotoolUnavailable` shape).
|
||||
|
||||
#### Category B — eipc-registry exposer
|
||||
|
||||
Same framing as session 4/5: closure-local reverse-engineering of
|
||||
the eipc bootstrap near `:68820` (`le(i)` origin validation) and
|
||||
`:68816` (channel framing). Session 3's inspector walk found
|
||||
nothing reachable via `globalThis`; the walk was the repeated approach
|
||||
in sessions 4/5 implicitly (and skipped for budget reasons).
|
||||
|
||||
If you take this as the main bet, treat as exploratory — Phase 1
|
||||
is the inspector walk only. STOP AND REPORT if 2-3 distinct
|
||||
approaches turn up empty. The cleanest "tried, here's what was
|
||||
unreachable" report converts the primitive-gap annotation in the
|
||||
plan-doc from "TODO" to "tried, unfixable without an upstream
|
||||
change." Don't ship a stub.
|
||||
If Category A turns up empty after 2-3 distinct approaches, STOP
|
||||
AND REPORT. Don't keep digging — document what was tried, ship a
|
||||
"H06 documentation runner" that captures the dead-end as a finding
|
||||
in JUnit, and pivot to Category C. The cleanest "tried, here's
|
||||
what was unreachable" report converts the primitive-gap annotation
|
||||
in the plan-doc from "TODO" to "tried, unfixable without an
|
||||
upstream change."
|
||||
|
||||
If a stable handle is found, expose it via `lib/eipc.ts`
|
||||
(`getEipcChannels`, `invokeEipcChannel`); upgrade T22 / T31 /
|
||||
T33 / T38 from Tier 1 fingerprints to Tier 2 runtime probes.
|
||||
T33 / T38 from Tier 1 fingerprints to Tier 2 runtime probes. Cap
|
||||
at ~3 spec upgrades — don't try to land all four if the first one
|
||||
surfaces an unexpected issue.
|
||||
|
||||
#### Category B — T35 / T37 Phase 2 (paired with Category A)
|
||||
|
||||
Both currently ship as Tier 1 fingerprints because the parsed-state
|
||||
readback target is a closure-local minified symbol — the same
|
||||
gotcha as S28 from session 2 and S19's `cE()`/`Tce()`
|
||||
re-implementation note. Without Category A landing first, the
|
||||
fixture form of these specs would assert "the spec didn't crash"
|
||||
and nothing more.
|
||||
|
||||
Skip this category unless Category A lands a stable handle.
|
||||
|
||||
#### Category C — single-spec deferred items audit
|
||||
|
||||
Walk through session 1/2/3/4/5 deferrals and identify any that are
|
||||
now tractable. Specifically:
|
||||
Walk through session 1-6 deferrals and identify any that are now
|
||||
tractable. Specifically:
|
||||
|
||||
- **S20** — `powerSaveBlocker` Inhibit. Issue #569 still open;
|
||||
not this session.
|
||||
this is a separate workstream, not for this session.
|
||||
- **T18** — drag-drop OS-level form. Tier 1 fingerprint shipped
|
||||
session 5; OS-level (Tier 2/3) requires a custom XDND source
|
||||
(X11) or libei emitter (Wayland) — both are heavy primitive
|
||||
builds that don't fit this session's ceiling.
|
||||
- **T34** — OAuth round-trip. Hard to mock; not this session
|
||||
unless you have a clever idea.
|
||||
- **T35 Phase 2 / T37 Phase 2** — fixture-readback. Same
|
||||
closure-local target as T37b. Need either Category B
|
||||
(eipc-registry exposer) to land first, or a different readback
|
||||
path. Skip unless paired with Category B.
|
||||
- **T36 Phase 2** — NO LONGER A CANDIDATE. Session 5's
|
||||
SessionStart-hook trace showed the hook fires only after first
|
||||
prompt submission, which is a real-account write. Reclassified
|
||||
- **T35 Phase 2 / T37 Phase 2** — see Category B above. Need
|
||||
Category A first.
|
||||
- **T36 Phase 2** — NOT a candidate. Session 5's SessionStart-
|
||||
hook trace showed the hook fires only after first prompt
|
||||
submission, which is a real-account write. Reclassified
|
||||
Tier 2 → Tier 3/4. Don't try to ship it.
|
||||
- **S14 Wayland variant** — see Category A. Session 5 recon says
|
||||
TRACTABLE.
|
||||
- **S14 cross-compositor variants (Sway / Hyprland / River)** —
|
||||
no current case-doc consumer demands them. Don't speculate.
|
||||
|
||||
#### Code-tab session-opener primitive (NOT recommended this session)
|
||||
If Category A turns up empty, Category C's most-reachable target
|
||||
is to **investigate Tier 3 reframes for issues opened against the
|
||||
project since session 6.** Check `gh issue list --state open
|
||||
--label test-coverage-gap` (if the label exists) or just walk
|
||||
recent open issues for ones that suggest a Tier 1 fingerprint is
|
||||
now possible (a regression that produces a stable string in the
|
||||
bundle, etc.).
|
||||
|
||||
Session 5 verified the AX surface (anchors in plan-doc), but the
|
||||
single biggest consumer (T36 Phase 2) was just reclassified out of
|
||||
Tier 2. Without a load-bearing consumer, building
|
||||
`CodeTab.activateTopTab()` / `startNewSession()` would be a
|
||||
speculative primitive. Wait until a real consumer surfaces.
|
||||
#### Cross-compositor focus-shifter expansion (NOT recommended this session)
|
||||
|
||||
Building `lib/input-sway.ts` / `lib/input-hypr.ts` would mirror
|
||||
`lib/input-niri.ts`'s shape but no consumer is asking for them.
|
||||
Sway / Hyprland / River specs aren't on the case-doc radar.
|
||||
Premature abstractions are wrong abstractions. Wait for a real
|
||||
consumer.
|
||||
|
||||
### Constraints to respect (don't violate)
|
||||
|
||||
These are unchanged from sessions 1/2/3/4/5 and still load-bearing:
|
||||
These are unchanged from sessions 1/2/3/4/5/6 and still load-bearing:
|
||||
|
||||
- **Default isolation** unless the spec needs otherwise. Use
|
||||
`seedFromHost: true` for any test that depends on authenticated
|
||||
@@ -252,19 +225,21 @@ These are unchanged from sessions 1/2/3/4/5 and still load-bearing:
|
||||
channels.** Session 3 confirmed those use a custom eipc protocol
|
||||
not in the standard registry. T22/T31/T33/T38 are Tier 1
|
||||
fingerprints. If you build the eipc-registry exposer (Category
|
||||
B), update the plan-doc and this prompt accordingly.
|
||||
A), update the plan-doc and this prompt accordingly.
|
||||
- **`lib/input.ts` is X11-only.** Strict `XDG_SESSION_TYPE ===
|
||||
'x11'` gate. Wayland consumers must skip — don't try to bolt
|
||||
Wayland into the file. Session 6's Category A puts the niri
|
||||
variant in `lib/input-niri.ts` (sibling), NOT `lib/input.ts`.
|
||||
Wayland into the file.
|
||||
- **`lib/input-niri.ts` is Niri-only.** Strict
|
||||
`XDG_CURRENT_DESKTOP === 'niri'` gate. Sway / Hyprland / River
|
||||
consumers must skip or live in their own per-compositor files.
|
||||
- **Don't speculate on `lib/input-wayland.ts` dispatcher.**
|
||||
Per-compositor files until a second consumer (Sway / Hyprland /
|
||||
River row) lands. Premature abstractions are wrong abstractions.
|
||||
Per-compositor files until a second Wayland consumer (Sway /
|
||||
Hyprland / River) lands. With only S14 on Niri, a dispatcher
|
||||
is ceremony.
|
||||
- **Code-tab AX anchors stay in plan-doc until a consumer needs
|
||||
them.** Don't preemptively add `CodeTab.activateTopTab()` to
|
||||
`claudeai.ts` — T36 Phase 2 was the only consumer and it's now
|
||||
Tier 3/4. Session 5's anchors block out the work for whenever
|
||||
a future consumer surfaces.
|
||||
`claudeai.ts` — session 5's anchors block out the work for
|
||||
whenever a future consumer surfaces.
|
||||
- **CDP auth gate is alive** — runtime SIGUSR1 attach via
|
||||
`app.attachInspector()`, never Playwright's `_electron.launch()`
|
||||
or `chromium.connectOverCDP()`.
|
||||
@@ -277,10 +252,10 @@ These are unchanged from sessions 1/2/3/4/5 and still load-bearing:
|
||||
- **No fixed sleeps.** `retryUntil` from `lib/retry.ts`, or
|
||||
Playwright auto-wait. Fixed `sleep(N)` is a smell. (Exception:
|
||||
short sleeps inside hand-rolled retry loops that catch typed
|
||||
errors and short-circuit; see S11 for the pattern.)
|
||||
errors and short-circuit; see S11 / S14 for the pattern.)
|
||||
- **Diagnostics on every run.** `testInfo.attach()` the artefacts.
|
||||
Single-shot JSON dumps for multi-state tests (S11, S31 pattern)
|
||||
are cleaner than 5+ separate attachments.
|
||||
Single-shot JSON dumps for multi-state tests (S11, S14, S31
|
||||
pattern) are cleaner than 5+ separate attachments.
|
||||
- **Tag with annotations.** `severity:` and `surface:` on every
|
||||
test so JUnit carries them through to matrix-regen.
|
||||
- **Tabs in TS, ~80-char wrap as the existing files do.** Match
|
||||
@@ -298,26 +273,22 @@ These are unchanged from sessions 1/2/3/4/5 and still load-bearing:
|
||||
`lib/electron-mocks.ts`,** not `lib/claudeai.ts`. Documented in
|
||||
T24/T25's leading comments.
|
||||
- **Marker windows / sacrificial host processes always die in
|
||||
`finally`.** S11 is the template — `marker.kill()` runs before
|
||||
`app.close()` so the kill happens even if the spec throws. The
|
||||
niri sibling's `foot` marker should follow the same pattern.
|
||||
`finally`.** S11 / S14 are the templates — `marker.kill()` runs
|
||||
before `app.close()` so the kill happens even if the spec throws.
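
The kill-in-`finally` ordering can be shown with stand-in objects. This is a sketch under assumptions: `Marker`, `App`, and `withMarker` are hypothetical names, and the harness's real types and teardown helpers may look different.

```typescript
// Hypothetical minimal shapes; the harness's real types are richer.
interface Marker {
	kill(): void;
}
interface App {
	close(): Promise<void>;
}

// Runs `body`, then always kills the marker before closing the app, even
// when `body` throws. The original error, if any, still propagates.
async function withMarker(
	marker: Marker,
	app: App,
	body: () => Promise<void>,
): Promise<void> {
	try {
		await body();
	} finally {
		marker.kill();
		await app.close();
	}
}
```

Putting `marker.kill()` first means a hang or throw inside `app.close()` cannot leave the marker window behind on the row.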
|
||||
|
||||
### Phases
|
||||
|
||||
#### Phase 0 — calibration
|
||||
|
||||
1. `cd tools/test-harness && npm run typecheck` — should pass.
|
||||
2. Read the plan doc's "Status (post-execution)" session 5 section,
|
||||
then read S11's leading comment + `lib/input.ts`'s leading
|
||||
comment (the X11-only-row-gate reasoning still applies; the
|
||||
niri sibling will mirror its shape but with niri-specific
|
||||
honest-readback discussion). Confirm you understand both.
|
||||
3. Pick ONE Category as the main bet. Don't write it yet — confirm
|
||||
you can plan from the spec. For Category A, verify niri's IPC
|
||||
doc is still consistent with the session 5 recon (the wiki
|
||||
page may have changed; re-fetch). For Category B, confirm the
|
||||
closure-local landscape hasn't shifted (re-run the session 3
|
||||
inspector walk's premise).
|
||||
2. Read the plan doc's "Status (post-execution)" session 6 section,
|
||||
then read S14's leading comment + `lib/input-niri.ts`'s leading
|
||||
comment. Confirm you understand the niri-only gate reasoning.
|
||||
3. Pick ONE Category as the main bet. For Category A, plan the
|
||||
approach: (a) module-level grep for registry exposers, (b) hook
|
||||
the eipc registration site, (c) patch in a dev-only exposer.
|
||||
List which approaches you'll try in what order, with the cap at
|
||||
2-3 distinct approaches before STOP AND REPORT.
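
When planning approach (c), it may help to treat the patch as a pure string transform that can be tested without an asar. The sketch below assumes a hypothetical marker comment and hypothetical helper names (`applyExposerPatch`, `revertExposerPatch`); the anchor string and the injected snippet are placeholders, not text known to exist in the bundle.

```typescript
// Hypothetical tag used to detect and remove a previously applied patch.
const MARKER = '/* harness:eipc-exposer */';

// Inserts `snippet` right after the first occurrence of `anchor`, wrapped in
// MARKER tags. A second call on already-patched source is a no-op, and the
// source is returned unchanged when the anchor is absent.
function applyExposerPatch(source: string, anchor: string, snippet: string): string {
	if (source.indexOf(MARKER) !== -1) return source;
	const at = source.indexOf(anchor);
	if (at === -1) return source;
	const cut = at + anchor.length;
	return source.slice(0, cut) + MARKER + snippet + MARKER + source.slice(cut);
}

// Removes the tagged region, restoring the original source text.
function revertExposerPatch(source: string): string {
	const first = source.indexOf(MARKER);
	if (first === -1) return source;
	const second = source.indexOf(MARKER, first + MARKER.length);
	if (second === -1) return source;
	return source.slice(0, first) + source.slice(second + MARKER.length);
}
```

An unchanged return value doubles as the "anchor not found" signal, so a caller can compare input and output to decide whether this approach counts as one of the 2-3 attempts that came up empty.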
|
||||
|
||||
If Phase 0 surfaces a problem (typecheck failing, primitives
|
||||
unclear, the chosen Category's prerequisites don't hold), stop and
|
||||
@@ -325,41 +296,30 @@ report. Don't fan out.

#### Phase 1 — fan-out batch

For Category A (`lib/input-niri.ts` + S14):
- Spawn ONE subagent for `lib/input-niri.ts` against the
  recon-sketched API (mirror `lib/input.ts` style — leading
  comment with the `--json`-stability rationale and the
  honest-readback reasoning, sibling typed errors
  `NiriIpcUnavailable` / `FootUnavailable`, exports matching
  `focusOtherWindow` / `spawnMarkerWindow` / `getFocusedWindowId`
  / `isNiriSession` / `MarkerWindow` interface).
- Spawn ONE subagent in parallel for the S14 runner (near-clone
  of S11 with imports swapped + row gate `['Niri']`).
- After both return: typecheck, ensure the two files agree on the
  primitive's exported shape.

For Category B (eipc-registry exposer):
- Spawn ONE subagent doing the inspector walk — looking for
  module-level Maps / dispatch functions / `globalThis` writes
  near `:68816`-`:68820`. Treat as exploratory; report findings
  before committing to a primitive shape.
For Category A (eipc-registry exposer):
- Spawn ONE subagent per approach — module-level grep, hook-at-
  registration-site, dev-only patch-in. Treat as exploratory;
  report findings before committing to a primitive shape.
- Cap re-spawns at 2-3 distinct approaches; if all empty, STOP
  AND REPORT.
  AND REPORT. Ship an `H06_eipc_registry_finding.spec.ts`
  documentation runner if useful state surfaces during the
  investigation.
- If a stable handle is found, second batch: build `lib/eipc.ts`
  + ship `H06_eipc_registry_finding.spec.ts`. Third batch:
  upgrade T22 / T31 / T33 / T38.
  + ship the H06 finding runner. Third batch: upgrade T22 / T31 /
  T33 / T38.
- Cap at ~3 specs total upgrade — don't try to land all four if
  the first one surfaces an unexpected issue.

For Category C (single-spec audit):
- Pick 1-2 deferred items per the table above. Standard fan-out
  per `runners/<closest-template>.spec.ts`.
- Walk recent open issues + the deferred-items list. Pick 1-2
  that are now tractable. Standard fan-out per
  `runners/<closest-template>.spec.ts`.

#### Per-subagent prompt shape

```
You're implementing ONE [test-harness runner | primitive] for
<TARGET>.
You're implementing ONE [test-harness runner | primitive |
investigation] for <TARGET>.

Read in order:
- docs/testing/cases/<FILE>.md (focus on <TARGET>'s Code anchors)
@@ -373,8 +333,9 @@ Write tools/test-harness/src/runners/<TARGET>_short_name.spec.ts
[ AND/OR tools/test-harness/src/lib/<NEW-PRIMITIVE>.ts ].

[per-task specifics: pattern (seedFromHost / mock-then-call /
asar fingerprint / shared isolation / new-primitive-build),
assertion shape, skip rules, key constraint warnings]
asar fingerprint / shared isolation / new-primitive-build /
investigation), assertion shape, skip rules, key constraint
warnings]

Constraints:
- Tabs, ~80-char wrap.
@@ -390,20 +351,21 @@ If the target isn't reasonable to implement (anchors don't resolve
to anything assertable, the test depends on state you can't
construct, the existing primitives don't cover the surface), DO
NOT write a stub. Report under Open questions and stop. Sessions
1-5 had cumulative ~12 "stop and report" outcomes that were the
1-6 had cumulative ~13 "stop and report" outcomes that were the
right call (S20 deferral, T05 reshape, T07 needs seedFromHost,
T08 needs setState('close'), S28 reclassification, T38 framing,
session-3 eipc-registry finding, T37 fixture-readback deferral,
S14 primitive-gap, T35/T36 Phase 2 deferrals, T18 Tier 1 reframe,
T36 Phase 2 reclassification to Tier 3/4).
S14 primitive-gap then primitive-build, T35/T36 Phase 2 deferrals,
T18 Tier 1 reframe, T36 Phase 2 reclassification to Tier 3/4,
session-6 lib/input-niri.ts shipped untested-on-niri).

Report shape (~150 words):
## <TARGET> [runner | primitive]
## <TARGET> [runner | primitive | investigation]

- File written: tools/test-harness/src/runners/<filename>.spec.ts
  [or lib/<newfile>.ts]
- Layer: file probe | argv probe | L1 | L2 (xprop) | L2 (DBus) |
  pgrep | new-primitive
  pgrep | new-primitive | investigation
- Assertion shape: <one sentence>
- Skip rules: <which rows + why>
- Verification path: <typecheck + run result>
@@ -417,10 +379,7 @@ After fan-out returns:
1. `cd tools/test-harness && npm run typecheck` — must stay clean.
2. Run the new runners against KDE-W (the dev box) — but flag the
   user first if any are destructive (seedFromHost kills running
   Claude). For Category A, S14 will skip on KDE-W (row gate is
   Niri-only); the typecheck pass is the verification on KDE-W,
   and a real Niri-row run is for the next sweep. Capture
   pass/skip/fail per spec for the matrix.
   Claude). Capture pass/skip/fail per spec for the matrix.
3. Update [`docs/testing/runner-implementation-plan.md`](runner-implementation-plan.md)
   "Status (post-execution)" section to reflect newly-shipped
   specs and any reclassifications discovered mid-flight.
@@ -431,7 +390,7 @@ After fan-out returns:
   - Primitives landed (with API shape)
   - Specs deferred (with the per-test rationale)
   - Specs reclassified (Tier 3 → Tier 2, Tier 2 → Tier 1, etc.)
   - Updated coverage stat (was 61/76 = 80%, now N/76 = M%)
   - Updated coverage stat (was 62/76 = 82%, now N/76 = M%)
6. Don't commit. The user reviews and commits.
7. Rotate this prompt: rewrite
   `docs/testing/runner-implementation-followup-prompt.md` for
@@ -439,7 +398,7 @@ After fan-out returns:

### Self-correction loop

Same as sessions 1-5:
Same as sessions 1-6:

1. Subagent typecheck failure → re-spawn with explicit fix
   instruction.
@@ -455,16 +414,18 @@ Same as sessions 1-5:
   finding came from finding only 3 handlers in the registry —
   the lesson is to verify the assertion is meaningful, not just
   that it passes.
5. **Carry-over from session 5:** If pursuing Category B and the
   inspector walk turns up empty after 2-3 approaches, STOP.
   Don't keep digging — document what was tried, ship the H06
   "documentation runner" if it surfaces useful state, move to
   Category A or C.
6. **NEW for session 6:** If pursuing Category A and the niri
   IPC `--json` output has shifted from the session 5 recon
   (e.g. the Window struct shape changed; an action got renamed),
   STOP and re-fetch the wiki / probe a live niri instance if
   available. Don't ship against a stale schema.
5. **Carry-over from session 5/6:** If pursuing Category A and the
   inspector / hook / patch approaches turn up empty after 2-3
   approaches, STOP. Don't keep digging — document what was
   tried, ship the H06 documentation runner if it surfaces
   useful state, move to Category C.
6. **NEW for session 7:** If Category A's hook approach lands a
   handle but T22 / T31 / T33 / T38 upgrades reveal the channels
   route through different code paths than the bundle strings
   suggest (i.e. the runtime registry's contents don't match the
   case-doc Code anchors), re-examine the case-doc anchors before
   shipping the upgrade — the assertion shape might need
   adjustment, not the test target.

Cap re-spawns at 2 per file. Past that, mark as needing human
review and move on.
@@ -483,39 +444,41 @@ Stop and write the final report when one of:
4. **Session budget hits ~3 new specs OR one new primitive
   landing.** Stop, synthesize, leave the rest for the next
   session.
5. **Category B inspector walk turns up empty after 2-3 distinct
   approaches.** Document the dead-end as a finding, ship H06
   if useful, pivot to Category A or C if budget remains.
5. **Category A approaches all turn up empty after 2-3 distinct
   attempts.** Document the dead-end as a finding, ship H06 if
   useful, pivot to Category C if budget remains.

### What you should NOT do

- **Don't try to land Category A + B + C in one batch.** Pick ONE
  as the main bet.
- **Don't try to land Category A + Category C in one batch.** Pick
  ONE as the main bet.
- **Don't ship stubs.** If a runner can't actually assert what the
  spec says, mark it as Tier 3 / blocked / primitive-gap and
  don't write a placeholder. The cumulative twelve "stop and
  report" outcomes from sessions 1-5 were the right call — every
  don't write a placeholder. The cumulative thirteen "stop and
  report" outcomes from sessions 1-6 were the right call — every
  one revealed a real constraint.
- **Don't break existing runners.** H01-H05 are the canaries.
- **Don't restructure `lib/`** beyond targeted additions.
  Premature abstractions are wrong abstractions.
  `electron-mocks.ts` (session 3) and `input.ts` (session 4) were
  threshold-driven extractions, not speculative. `input-niri.ts`
  for Category A is the same shape — a single-consumer extraction
  with the API mirroring its X11 sibling.
  `electron-mocks.ts` (session 3), `input.ts` (session 4), and
  `input-niri.ts` (session 6) were threshold-driven extractions,
  not speculative.
- **Don't run destructive Tier 3 tests** that write to the user's
  real claude.ai account (T22 PR write, T27 scheduling, T29
  worktree creation, T34 OAuth, T36 hooks-fire-on-prompt-submit).
  Only the *read-only reframes* of those are in scope.
- **Don't introspect `ipcMain._invokeHandlers` for `claude.web`
  eipc channels.** Confirmed broken in session 3. Category B is
  eipc channels.** Confirmed broken in session 3. Category A is
  the ONLY appropriate path to runtime IPC verification for those
  channels.
- **Don't bolt Wayland into `lib/input.ts`.** Sibling file or new
  primitive only; the X11-strict gate is load-bearing. Session 6
  Category A puts niri in `lib/input-niri.ts`.
- **Don't bolt other compositors into `lib/input-niri.ts`.**
  Sway / Hyprland / River each get their own per-compositor file
  if a consumer surfaces. With S14 the only consumer, no
  expansion is justified yet.
- **Don't bolt Wayland into `lib/input.ts`.** X11-strict gate is
  load-bearing.
- **Don't speculate on a `lib/input-wayland.ts` dispatcher.**
  Per-compositor files until a second consumer lands.
  Per-compositor files until a second Wayland consumer lands.
- **Don't preemptively build `CodeTab.activateTopTab()` /
  `startNewSession()`.** Session 5 captured the AX anchors but
  T36 Phase 2 (the only known consumer) was reclassified out.
@@ -527,13 +490,13 @@ Stop and write the final report when one of:
### Final report format

```markdown
## Runner implementation summary (session 6)
## Runner implementation summary (session 7)

- Main-bet category: A | B | C
- Specs landed: N
- Primitives landed: N
- Reclassified mid-flight: N (with reasons)
- Coverage: was 61/76 (80%), now <NEW>/76 (<PCT>%)
- Coverage: was 62/76 (82%), now <NEW>/76 (<PCT>%)
- Typecheck: clean | <errors>
- KDE-W test run: <pass/skip/fail counts>

@@ -541,7 +504,7 @@ Stop and write the final report when one of:

| Cat | Test ID | File | Assertion shape | Status |
|---|---|---|---|---|
| A | S14 | S14_quick_entry_from_other_focus_niri.spec.ts | … | ✓ pass / skip (Niri-only) |
| A | T22 | T22_pr_monitoring_handler.spec.ts | … | ✓ pass / skip / fail |
| ... |

## Notable findings
@@ -585,14 +548,13 @@ git diff --stat
  `Promise<boolean>` variant + T25's for the void variant.
- For focus-shifting (X11 only): `lib/input.ts` exports
  `focusOtherWindow` + `spawnMarkerWindow`. See S11 for the
  end-to-end consumer pattern (single-shot diagnostic record,
  marker-window cleanup in `finally`, defensive
  `WaylandFocusUnavailable` / `XdotoolUnavailable` skip catches).
- **For Wayland-native focus-shifting (Niri only, if Category A
  ships):** the recon's API sketch is in plan-doc session 5.
  Mirror `lib/input.ts`'s shape. Use `niri msg --json` (the
  contracted-stable surface; plain text is unstable per the
  wiki). `foot --title <T> -e sleep 600` is the marker process.
  end-to-end consumer pattern.
- For Wayland-native focus-shifting (Niri only): `lib/input-niri.ts`
  exports the same shape with `niri msg --json` IPC + `foot`
  marker. See S14 for the end-to-end consumer pattern. The
  primitive is untested-on-real-Niri as of session 6 — the
  first real Niri sweep run will confirm the schema assumptions
  documented in its leading comment.
- **For asar fingerprints: ALWAYS grep the installed asar
  first.** Build-reference is beautified; the bundle is
  minified. Case-doc text may be the user-facing form, not the

@@ -18,6 +18,82 @@ work begins.

## Status (post-execution)

**Shipped session 6 (1 new spec + 1 new primitive):** S14 (Tier 2 — Niri-
only, currently known-failing detector). New primitive `lib/input-niri.ts`
(Wayland-native focus-shifter sibling of `lib/input.ts`:
`focusOtherWindow` / `spawnMarkerWindow` / `getFocusedWindowId` /
`isNiriSession` plus `NiriIpcUnavailable` / `FootUnavailable` typed
errors). Coverage moved from 61/76 (80%) to 62/76 (82%).

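The coverage figures quoted in these status notes are consistent with a plain ratio of wired specs to the 76-test total, rounded to the nearest percent. A minimal sketch of that arithmetic follows; the `coveragePct` helper is hypothetical and is not part of the harness, and round-to-nearest is an assumption inferred from the quoted numbers.

```typescript
// Hypothetical helper (not in the harness): reproduces the coverage
// percentages quoted in the status notes, assuming round-to-nearest.
function coveragePct(wired: number, total: number): number {
	if (total <= 0) {
		throw new RangeError('total must be positive');
	}
	return Math.round((wired / total) * 100);
}

// Figures taken from the status notes; 76 is the stated test total.
const TOTAL_TESTS = 76;
const beforeSession6 = coveragePct(61, TOTAL_TESTS);
const afterSession6 = coveragePct(62, TOTAL_TESTS);
console.log(`was ${beforeSession6}%, now ${afterSession6}%`);
```

The same helper can fill the `N/76 = M%` placeholder in the final report once the new spec count is known.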
Session 6 findings + reclassifications:

- **S14 shipped as Tier 2 known-failing detector.** Near-clone of S11's
  shape with imports swapped from `lib/input.js` to `lib/input-niri.js`
  and the row gate flipped from `['GNOME-X', 'Ubu-X']` to `['Niri']`.
  Same five-phase shape: setup → ready → marker spawn → focus loop with
  sticky-error short-circuits → press shortcut + assert popup visible.
  Diagnostic record fields parallel S11's `s11-diagnostics`
  (`activeWidBeforeFocus` / `activeWidAfterFocus` typed `number | null`
  for niri u64 IDs vs the X11 hex strings). Currently a known-failing
  detector per case-doc S14 (`Failed to call BindShortcuts (error code
  5)`); same shape as S12's GNOME-W `--enable-features=GlobalShortcutsPortal`
  detector — the spec encodes the contract and will start passing on
  Niri rows once the upstream / Chromium-side portal issue is resolved
  without any spec edit.
- **`lib/input-niri.ts` extracted as the niri-side focus-shifter
  substrate.** Niri-only by design — strict
  `XDG_CURRENT_DESKTOP === 'niri'` gate via `isNiriSession()`. Exports:
  `focusOtherWindow(title)` (`niri msg --json windows` →
  `app_id !== 'Claude'` filter + title match → `niri msg action
  focus-window --id <u64>` → honest readback via `getFocusedWindowId()`
  using `retryUntil`), `spawnMarkerWindow(title)` (backgrounded
  `foot --title <T> -e sleep 600` with kill-with-grace, mirroring the
  X11 primitive's xterm pattern), `getFocusedWindowId()` (parses
  `niri msg --json focused-window` to `number | null`), `isNiriSession()`,
  `MarkerWindow` interface, `NiriIpcUnavailable` / `FootUnavailable`
  typed errors. The primitive verifies the focus shift took (niri's
  `focus-window` action exits 0 even when the compositor refuses
  activation — only `focused-window` readback is the honest answer).
  Defensive `unwrapOk` helper handles both the older
  `{Ok: {FocusedWindow: ...}}` Result-style JSON envelope and newer
  bare-payload responses; if niri ships a third shape, the parser
  falls through to `null` rather than crashing.
- **Cross-compositor dispatcher NOT speculated.** Sway / Hyprland /
  River each have totally different IPCs (`swaymsg`, `hyprctl`,
  `riverctl`); the long-term cross-compositor answer is libei but
  isn't widely deployed. Per-compositor files until a second consumer
  surfaces — a hypothetical `lib/input-wayland.ts` dispatcher would
  just switch on `XDG_CURRENT_DESKTOP` and delegate. With only S14
  consuming `lib/input-niri.ts`, a dispatcher would be ceremony.
- **Category B (eipc-registry exposer) NOT attempted.** Same reasoning
  as sessions 4/5: session 3 already established the registry is
  closure-local, the inspector walk came up empty, and the early-exit
  cap on retries makes Category B a poor main bet without a new
  approach. Stays available for a future session that takes the
  closure-local reverse-engineering as its main work.

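The defensive unwrap described for `lib/input-niri.ts` can be sketched as below. This is an illustration under stated assumptions, not the shipped helper: the `Ok` and `FocusedWindow` keys come from the notes above, while the `id` field name, the `isRecord` guard, and the `parseFocusedWindowId` wrapper are assumptions introduced here.

```typescript
// Sketch only; the real helper in lib/input-niri.ts may differ.
function isRecord(value: unknown): value is Record<string, unknown> {
	return typeof value === 'object' && value !== null && !Array.isArray(value);
}

// Strip a Result-style `{ Ok: ... }` envelope when present; otherwise
// hand back the value untouched (the "bare payload" case).
function unwrapOk(value: unknown): unknown {
	return isRecord(value) && 'Ok' in value ? value['Ok'] : value;
}

// Parse the stdout of `niri msg --json focused-window` into a window id.
// Any shape this sketch does not recognise falls through to null.
function parseFocusedWindowId(stdout: string): number | null {
	let parsed: unknown;
	try {
		parsed = JSON.parse(stdout);
	} catch {
		return null;
	}
	let payload = unwrapOk(parsed);
	if (isRecord(payload) && 'FocusedWindow' in payload) {
		payload = payload['FocusedWindow'];
	}
	if (!isRecord(payload)) {
		return null;
	}
	// ASSUMPTION: the window object carries a numeric `id` field.
	const id = payload['id'];
	return typeof id === 'number' && Number.isSafeInteger(id) ? id : null;
}
```

Returning `null` for every unrecognised shape keeps the caller's readback loop simple: a `null` never equals the target id, so an unknown schema shows up as a failed focus verification instead of a thrown parse error.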
Tier 2 → Tier 2 candidates remaining for next session: **T35 Phase 2**
and **T37 Phase 2** (still need closure-local readback or the
eipc-registry exposer; unchanged from sessions 4/5). **eipc-registry
exposer** (closure-local in main; reverse-engineering remains
unattempted — now the cleanest single-session win available, with all
the obvious focus-shifter / mock-then-call work already landed). The
primitive surface itself isn't growing quickly — `lib/electron-mocks.ts`
(session 3), `lib/input.ts` (session 4), and `lib/input-niri.ts`
(session 6) are all threshold-driven extractions, not speculative.

Session 6 untested-on-Niri caveats: the `lib/input-niri.ts` primitive
landed against session 5's recon notes, not a live niri session. First
real Niri sweep run will confirm: (a) the `Ok`-wrapper unwrap covers
the niri version on the row; (b) Claude's `app_id` value on niri is
literal `'Claude'` (the primitive's `app_id !== 'Claude'` guard
becomes a no-op rather than wrong if the actual value differs — match
still happens by title; tighten if needed); (c) `foot` is on the
target row's PATH (skip path is clean if not). Verified on KDE-W: the
runner skips correctly via the row gate.

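Caveat (b) is easier to see in code. The sketch below shows a window-selection step of the kind described for `focusOtherWindow`: exclude Claude by `app_id`, then match the marker by title. The `NiriWindowInfo` shape, the `pickFocusTarget` name, and exact-equality title matching are assumptions for illustration; the shipped primitive may select differently.

```typescript
// ASSUMPTION: a trimmed-down view of one entry from
// `niri msg --json windows`; only the fields used here are modelled.
interface NiriWindowInfo {
	id: number;
	title: string | null;
	app_id: string | null;
}

// Pick the marker window: skip anything whose app_id is Claude's, then
// match on title. Returns null when nothing qualifies.
function pickFocusTarget(
	windows: readonly NiriWindowInfo[],
	markerTitle: string,
	claudeAppId: string = 'Claude',
): NiriWindowInfo | null {
	const match = windows.find(
		(w) => w.app_id !== claudeAppId && w.title === markerTitle,
	);
	return match ?? null;
}
```

If Claude's real `app_id` on niri turns out to be something other than `'Claude'`, the first condition is simply true for every window and the title match alone decides, which is the "no-op rather than wrong" behaviour the caveat describes.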
---

**Shipped session 5 (1 new spec):** T18 (Tier 1 fingerprint). No new
primitives. Coverage moved from 60/76 (79%) to 61/76 (80%).

@@ -7,7 +7,7 @@ architecture, decisions, and rationale.

## Status

Sixty-one specs wired (24 cross-env T-tests, 32 env-specific S-tests,
Sixty-two specs wired (24 cross-env T-tests, 33 env-specific S-tests,
5 H-prefix harness self-tests). See
[`docs/testing/runner-implementation-plan.md`](../../docs/testing/runner-implementation-plan.md)
for the tiered triage of remaining tests and the per-spec rationale
@@ -62,6 +62,7 @@ behind tier classification.
| [S10](../../docs/testing/cases/shortcuts-and-input.md#s10--quick-entry-popup-is-transparent-no-opaque-square-frame) | KDE-W only — popup runtime `getBackgroundColor() === '#00000000'` after Quick Entry opens (regression-detector against electron#50213 if bundled Electron in 41.0.4-bisect-window) | L1 + ydotool |
| [S11](../../docs/testing/cases/shortcuts-and-input.md#s11--quick-entry-shortcut-fires-from-any-focus-on-wayland-mutter-xwayland-key-grab) | GNOME-X / Ubu-X only (X11-side regression detector) — spawn xterm marker, `xdotool windowfocus` to it, verify `_NET_ACTIVE_WINDOW` shifted, fire `Ctrl+Alt+Space` via ydotool, assert popup visible. Wayland-side mutter regression (#404) is a primitive gap — needs Wayland-native focus injection (libei) | L1 + xdotool focus + ydotool shortcut |
| S12 | `--enable-features=GlobalShortcutsPortal` in Electron argv (GNOME-W only — currently a known-failing regression detector) | argv probe |
| [S14](../../docs/testing/cases/shortcuts-and-input.md#s14--global-shortcuts-via-xdg-portal-work-on-niri) | Niri only — spawn `foot` marker, `niri msg action focus-window` to it, verify `niri msg --json focused-window` shifted, fire `Ctrl+Alt+Space` via ydotool, assert popup visible. Currently known-failing detector for the Niri portal `BindShortcuts` path (parallels S12's GNOME-W detector) | L1 + niri msg focus + ydotool shortcut |
| [S15](../../docs/testing/cases/distribution.md#s15--appimage-extraction---appimage-extract-works-as-documented-fallback) | `--appimage-extract` exits 0; `squashfs-root/AppRun --version` runs without FUSE error | spawn + filesystem |
| [S16](../../docs/testing/cases/distribution.md#s16--appimage-mount-cleans-up-on-app-exit) | `mount(8)` shows new `.mount_claude` while app is up; gone within 10s of close | mount delta |
| [S17](../../docs/testing/cases/platform-integration.md#s17--app-launched-from-desktop-inherits-shell-path) | Shell-path-worker overlays user's login-shell PATH onto a deliberately-scrubbed env | L1 + utilityProcess |
@@ -96,8 +97,11 @@ isolation env (S19), the `lib/electron-mocks.ts` mock-then-call
helpers — `installOpenDialogMock` (T17), `installShowItemInFolderMock`
(T25), `installOpenExternalMock` (T24) — the `lib/input.ts`
focus-shifter (`focusOtherWindow` + `spawnMarkerWindow` for S11; X11
only — `WaylandFocusUnavailable` thrown on native Wayland) — and the
`createIsolation({ seedFromHost: true })` primitive that lets
only — `WaylandFocusUnavailable` thrown on native Wayland) and its
Niri-native sibling `lib/input-niri.ts` (`niri msg --json` for the
focus-injection + readback chain, `foot --title` for the marker
window; `NiriIpcUnavailable` thrown off-Niri; consumed by S14) — and
the `createIsolation({ seedFromHost: true })` primitive that lets
login-required tests run hermetically against a copy of the host's
signed-in auth state (T07, T16, T26).

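The "thrown off-Niri" behaviour rests on the strict `XDG_CURRENT_DESKTOP === 'niri'` gate. A minimal sketch of such a gate is shown below; the `Env` alias, the `requireNiriSession` wrapper, and the error-message wording are hypothetical, and the real `isNiriSession()` may read the environment directly instead of taking it as a parameter.

```typescript
// ASSUMPTION: the environment is passed in explicitly so the gate can be
// exercised without a live session; a caller would pass process.env.
type Env = Readonly<Record<string, string | undefined>>;

// Typed error mirroring the name used in the notes; the constructor
// signature and message format are illustrative only.
class NiriIpcUnavailable extends Error {
	constructor(reason: string) {
		super(`niri IPC unavailable: ${reason}`);
		this.name = 'NiriIpcUnavailable';
	}
}

// Strict gate: exact, case-sensitive match on the desktop name.
function isNiriSession(env: Env): boolean {
	return env['XDG_CURRENT_DESKTOP'] === 'niri';
}

// Hypothetical guard a primitive could call before touching `niri msg`.
function requireNiriSession(env: Env): void {
	if (!isNiriSession(env)) {
		const seen = env['XDG_CURRENT_DESKTOP'] ?? '<unset>';
		throw new NiriIpcUnavailable(
			`XDG_CURRENT_DESKTOP is ${seen}, expected niri`,
		);
	}
}
```

A consumer spec can catch the error by `name` and convert it into a skip, in the same way S11 is described as catching `WaylandFocusUnavailable`.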
@@ -250,6 +254,7 @@ tools/test-harness/
│   │   ├── claudeai.ts # claude.ai renderer UI domain (CodeTab, dialog mock, atoms)
│   │   ├── electron-mocks.ts # mock-then-call helpers (dialog/showItemInFolder/openExternal)
│   │   ├── input.ts # focus-shifter primitive (X11 only — xdotool + xprop verify; spawnMarkerWindow xterm)
│   │   ├── input-niri.ts # focus-shifter primitive (Niri only — niri msg --json verify; spawnMarkerWindow foot)
│   │   ├── retry.ts # poll-until-true with timeout
│   │   └── diagnostics.ts # launcher log, --doctor, session env
│   └── runners/ # one .spec.ts per test ID