mirror of
https://github.com/aaddrick/claude-desktop-debian.git
synced 2026-05-17 00:26:21 +03:00
docs(testing): session 11 plan/inventory + rotate session 12 prompt
Plan-doc gets a new "Shipped session 11" status section above
session 10's. Captures the T21 spec landed (commit 3ea677f), the
cwd-validator-is-typeof-string finding, the 30-callable-Launch-
members observation (5 wrapper-only `on*` event subscribers + 2
proxies don't show in `_invokeHandlers`), and the dual case-doc-
anchored read-side invocation pattern (distinct from T19/T20's
foundational-surrogate shape).
README inventory adds T21 row, bumps spec count from 72 to 73 (35
T-tests now).
Followup prompt rotates for session 12 — T11 plugin install
runtime upgrade becomes the main bet (currently a Tier 1
fingerprint; LocalPlugins registers 15 handlers per session 7's
probe). Operon-mode navigation probe stays as the smaller-scope
fallback. Constraints / phases / self-correction loop sections
unchanged from sessions 10-11; the per-session section just
swaps in the new findings.
Co-Authored-By: Claude <claude@anthropic.com>
@@ -1,116 +1,104 @@
# test-harness runner implementation — session 11 prompt
# test-harness runner implementation — session 12 prompt

This file is meant to be **copied verbatim into a fresh Claude Code
session** as the initial user message. Don't paraphrase it; the
orchestration depends on the exact directives below.

You're picking up after a runner-implementation session that landed 2
new specs (T19 + T20) by way of registering the case-doc-anchored
write-side eipc surfaces plus invoking the foundational read-side
`LocalSessions/getAll` as the read-side surrogate. No primitive
change. Coverage 70/76 (92%) → 72/76 (95%). Two commits on
`docs/compat-matrix` expected (SHAs inserted after the test-harness
commit lands — the user reviews and commits at the end of every
session):
You're picking up after a runner-implementation session that landed 1
new spec (T21) by way of registering the case-doc-anchored Launch
preview-pane suffixes plus invoking BOTH case-doc-anchored read-side
getters (`getConfiguredServices` returns array, `getAutoVerify` returns
boolean). No primitive change. Coverage 72/76 (95%) → 73/76 (96%). Two
commits on `docs/compat-matrix` expected (SHAs inserted after the
test-harness commit lands — the user reviews and commits at the end of
every session):

- TBD — `test(harness): session 10 T19/T20 runtime probes`
  (Tier 2 reframes; multi-suffix `waitForEipcChannels` over the
  case-doc-anchored write-side suffixes — `startShellPty` / `writeShellPty`
  / `stopShellPty` / `resizeShellPty` / `getShellPtyBuffer` for T19,
  `readSessionFile` / `writeSessionFile` / `pickSessionFile` for T20
  — plus single `invokeEipcChannel('LocalSessions_$_getAll', [])`
  array-shape assertion as the foundational read-side surrogate;
  passes on KDE-W in 23.4s + 27.7s sequential).
- TBD — `test(harness): session 11 T21 dev server preview runtime`
  (Tier 2 reframe; multi-suffix `waitForEipcChannels` over the
  case-doc-anchored Launch suffixes — `getConfiguredServices` /
  `startFromConfig` / `stopServer` / `getAutoVerify` /
  `capturePreviewScreenshot` — plus dual `invokeEipcChannel` on
  `Launch_$_getConfiguredServices` and `Launch_$_getAutoVerify` with
  `cwd = process.cwd()`; passes on KDE-W in 16.7s cold).

The plan doc at
[`docs/testing/runner-implementation-plan.md`](runner-implementation-plan.md)
captures the tier classification and execution-time reclassifications.
Its "Status (post-execution)" section is the source of truth for
what's done and what's deferred — read **session 10** first, then
**session 9**, then **session 8**, then **session 7**, then **session
6**, then **session 5**, then **session 4**, then **session 3**, then
**session 2**, then **session 1** sub-sections.
what's done and what's deferred — read **session 11** first, then
**session 10**, then **session 9**, then **session 8**, then **session
7**, then **session 6**, then **session 5**, then **session 4**, then
**session 3**, then **session 2**, then **session 1** sub-sections.

This session is a continuation, not a restart. Start by reading the
plan doc's status sections.

### Big new findings from session 10
### Big new findings from session 11

1. **`claude.web/Launch` IS registered on claude.ai with 25 handlers.**
   Overturns session 7's per-interface map (which captured /epitaxy
   with cowork loaded but didn't list Launch). Session 10's registry
   probe re-run on /epitaxy with an active session saw all 25:
   `getLogs`, `stopServer`, `showPreview`, `hidePreview`,
   `startFromConfig`, `getConfiguredServices`, `getAutoVerify`,
   `setAutoVerify`, `deployPreview`, `destroyPreview`, `pickHtmlFile`,
   `loadHtmlPreview`, `goBack`, `goForward`, `refreshPreview`,
   `navigatePreview`, `getPreviewUrl`, `setPreviewColorScheme`,
   `setPreviewViewport`, `clearPreviewViewport`,
   `capturePreviewScreenshot`, `suggestDeployName`, `unpublishDeploy`,
   `toggleSelectionMode`, `activeServers_$store$_getState`. T21 is now
   tractable as a Tier 2 reframe.
2. **Launch invocation is `cwd`-gated.** Smoke-test of
   `Launch/getConfiguredServices` and `Launch/getAutoVerify` rejected
   with `Argument "cwd" at position 0 to method "<name>" in interface
   "Launch" failed to pass validation`. Schema-rev via the rejection-
   message grep pattern (session 9 finding) — the validator block sits
   ~50-200 chars before the throw site in the bundled `index.js`. T21
   ships once the cwd format is recovered.
3. **`claude.operon/OperonBootstrap.ensure` registers eagerly on
   claude.ai** (1 handler). Partial answer to session 8's open
   question. The other 21 wrapper-exposed operon interfaces remain
   registry-unconfirmed; they likely lazy-register on operon-mode
   entry. Worth a follow-up navigation probe — operon-mode URL form
   TBD (search `claude.ai/...` paths in the bundle for `operon`-keyed
   routes).
4. **`LocalSessions/getAll` is the foundational read-side surrogate
   for any session-scoped Tier 2 reframe.** Pattern: `args = []`,
   returns `Array<Session>`, the case-doc connection is "this surface
   binds to a LocalSession; getAll proves the LocalSessions impl
   object is reachable through the renderer wrapper". T19 (terminal
   binds to session) and T20 (file pane edits session-bound files)
   both ship with this. Reuse for any future LocalSessions-scoped
   case-doc test where the case-doc anchor is write-side.
5. **Smoke-test enumeration of LocalSessions read-sides.** The
   following all invoke cleanly with `args = []`:
   - `getAll`, `getInstalledEditors`, `getDetectedProjects`,
     `isVSCodeInstalled`, `getSSHConfigs`, `getTrustedSSHHosts`,
     `getDefaultEffort`, `getSupportedCommands`
   These DO require args (rejected on smoke-test):
   - `getDefaultPermissionMode` rejects on `cwd` arg
   - `getSSHSupportedCommands` rejects on `config` arg (SSH config
     object)
   The full LocalSessions method list (117 methods) is in the registry
   probe dump — if a future session needs to identify a specific
   read-side, dump and grep there rather than re-enumerating.
6. **Filename convention for first-runtime-probe siblings.** When a
   case-doc test has no Tier 1 fingerprint sibling and the Tier 2
   reframe is the FIRST runner against that case-doc, name it
   `T<NN>_runtime.spec.ts` (no `b` / `c` letter suffix). T19 / T20
   followed this — same as T26 / T27 from earlier sessions. Use `b` /
   `c` only when there's an earlier sibling to disambiguate against
   (T22b after T22; T33c after T33b).
1. **`claude.web/Launch` cwd validator is `typeof cwd === 'string'`
   only.** No path-existence check, no absolute-path requirement,
   empty / relative / non-existent paths all pass. Only `null`,
   `undefined`, and object wraps reject. The handler tolerates
   missing `<cwd>/.claude/launch.json` — returns `[]` for
   `getConfiguredServices` and `false` for `getAutoVerify`. Smoke-
   test resolved the schema in one round-trip; bundle-grep on the
   rejection literal was not needed. Suggests a class of `claude.web`
   handlers may have similarly-trivial validators — when the
   rejection-message grep pattern from session 9 is the right tool,
   it's typically because the validator IS more elaborate (closed-
   over Zod schemas, optional-field unions). For simple `cwd`-only
   handlers, smoke-test first.
2. **`window['claude.web'].Launch` exposes 30 callable members,
   not 25.** The registry probe sees 25 `_invokeHandlers`; the
   wrapper additionally surfaces 5 `on*` event subscribers
   (`onDeployEvent` / `onElementSelected` /
   `onPreviewSelectionShortcut` / `onPreviewUrlChanged`) plus
   `isAvailable` and `activeServersStore`. Wrapper-only entries
   don't show up in `webContents.ipc._invokeHandlers` because
   they're event emitters and store proxies, bound via different
   bridge primitives. Worth noting: a future case-doc test that
   wants to probe event subscription paths (e.g. "preview pane
   reacts to deploy progress events") would need a different
   primitive than `invokeEipcChannel`. No consumer asks for it
   yet — anti-speculation rule applies.
3. **Dual case-doc-anchored read-side invocation pattern is
   distinct from foundational-surrogate.** T21 follows T33c's
   shape (invoke each case-doc-anchored read-side suffix, assert
   the documented shape per handler) rather than T19 / T20's
   foundational-surrogate shape (invoke `LocalSessions/getAll` as
   a stand-in because case-doc anchors were write-side). When the
   case-doc has read-side anchors with resolvable arg shapes,
   prefer invoking the case-doc-anchored handlers directly — it
   removes the surrogate hop, and the assertion is "the documented
   handler returns the documented shape" rather than "a
   foundational sibling is reachable through the wrapper".
4. **`getConfiguredServices` returns Array<…>, `getAutoVerify`
   returns boolean.** Mixed-shape dual invocation is fine — the
   spec does `Array.isArray(...)` and `typeof === 'boolean'`
   assertions independently. Diagnostic JSON captures both
   `responseShape` per invocation.

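
As a rough illustration of session 11 findings 1 and 4, the sketch below models the `cwd` check as a bare `typeof` test and classifies the two documented return shapes. `isAcceptedCwd`, `classifyShape`, and the candidate list are hypothetical names made up for this example; nothing here is copied from the bundle or from `T21_runtime.spec.ts`. The only assumption carried over is that the validator accepts any string.

```typescript
// Hypothetical model of the Launch `cwd` validator (assumption: a bare
// typeof check, per the session 11 smoke-test; not copied from the bundle).
function isAcceptedCwd(cwd: unknown): cwd is string {
	return typeof cwd === 'string';
}

type ResponseShape = 'array' | 'boolean' | 'other';

// Mirrors the two independent shape assertions described above:
// Array.isArray(...) for one getter, typeof === 'boolean' for the other.
function classifyShape(value: unknown): ResponseShape {
	if (Array.isArray(value)) return 'array';
	if (typeof value === 'boolean') return 'boolean';
	return 'other';
}

// Candidate cwd shapes of the kind a smoke-test would try.
const cwdCandidates: unknown[] = [
	'',
	'/tmp',
	'/nonexistent',
	'.',
	null,
	undefined,
	{ path: '/tmp' },
];

const acceptedCwds = cwdCandidates.filter(isAcceptedCwd);
const rejectedCount = cwdCandidates.length - acceptedCwds.length;

// Documented return values when <cwd>/.claude/launch.json is missing.
const responseShapes = {
	getConfiguredServices: classifyShape([]),
	getAutoVerify: classifyShape(false),
};

console.log(acceptedCwds, rejectedCount, responseShapes);
```

Keeping the shape classifier separate from the validator model matches the finding that each getter's shape is asserted independently.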
### Authoritative reference

Read these in order before fanning out:

- [`docs/testing/runner-implementation-plan.md`](runner-implementation-plan.md)
  — tier classification + status section. Read **session 10**, then
  **session 9**, **session 8**, **session 7**, **session 6**, **session
  5**, **session 4**, **session 3**, **session 2**, then **session 1**
  "Status (post-execution)" sub-sections. The Tier-3 list (search for
  "## Tier 3") is the candidate pool for further reframes.
  — tier classification + status section. Read **session 11**, then
  **session 10**, **session 9**, **session 8**, **session 7**, **session
  6**, **session 5**, **session 4**, **session 3**, **session 2**, then
  **session 1** "Status (post-execution)" sub-sections. The Tier-3 list
  (search for "## Tier 3") is the candidate pool for further reframes.
- [`tools/test-harness/README.md`](../../tools/test-harness/README.md)
  — runner conventions, the now-72-spec inventory, primitives in
  — runner conventions, the now-73-spec inventory, primitives in
  `lib/`, isolation defaults, the CDP-gate workaround, the eipc
  note (covers registry walk, renderer-wrapper invocation, the
  schema-rev pattern from session 9, and the foundational-getAll
  pattern from session 10).
  schema-rev pattern from session 9, the foundational-getAll
  pattern from session 10, and the dual-case-doc-anchored-read-side
  pattern from session 11).
- [`docs/testing/cases/README.md`](cases/README.md) — case-doc
  structure and the four anchor scopes.
- [`tools/test-harness/src/lib/`](../../tools/test-harness/src/lib/)
  — the existing primitives. No session 10 additions; surface remains
  — the existing primitives. No session 11 additions; surface remains
  the session 8 shape (`getEipcChannels` / `findEipcChannel` /
  `findEipcChannels` / `waitForEipcChannel` / `waitForEipcChannels` /
  `invokeEipcChannel` on `lib/eipc.ts`).
@@ -118,17 +106,20 @@ Read these in order before fanning out:
  — the session 7 read-only registry probe. Re-run against a
  debugger-attached Claude (`Developer → Enable Main Process
  Debugger` from the menu) to capture the current registry shape.
  Session 10 used the existing probe verbatim plus a small
  per-interface method-list dump (deleted after; lives in /tmp at
  capture time).
  Session 11 used a small one-off smoke-test in the test-harness
  dir (clones the InspectorClient connection pattern from
  eipc-registry-probe.ts, runs N cwd shapes via `evalInRenderer`,
  reports `[OK]` / `[REJ]` per shape; deleted after).
- [`tools/test-harness/src/runners/`](../../tools/test-harness/src/runners/)
  — every existing spec is a template. Notable session 10 templates:
  - `T19_runtime.spec.ts` / `T20_runtime.spec.ts` — multi-suffix
    `waitForEipcChannels` over case-doc-anchored write-side suffixes
    + single `invokeEipcChannel('LocalSessions_$_getAll', [])` for
    foundational read-side reachability. Pattern for any case-doc
    test whose anchors are write-side and no read-side equivalent
    invokes cleanly with `args = []`.
  — every existing spec is a template. Notable session 11 templates:
  - `T21_runtime.spec.ts` — multi-suffix `waitForEipcChannels` over
    case-doc-anchored Launch suffixes + dual `invokeEipcChannel` on
    case-doc-anchored read-side getters (`getConfiguredServices`
    returns array, `getAutoVerify` returns boolean). Pattern for any
    case-doc test whose anchors include read-side handlers with
    resolvable arg shapes — invoke the case-doc-anchored read-sides
    directly (no foundational surrogate needed). Mixed-shape dual
    invocation is fine.
- [`docs/testing/cases/*.md`](cases/) — the spec each runner
  asserts. The **Code anchors:** field tells you exactly where
  upstream implements the feature.
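
The multi-suffix registration step in the T21 template can be pictured as a plain suffix match over a registry snapshot. This is only a sketch: `missingSuffixes` is a hypothetical helper, not the `waitForEipcChannels` implementation, and the `<prefix>` part of the channel names is a placeholder because the full channel-name format is not shown in this prompt.

```typescript
// Returns the expected suffixes that no registered channel ends with.
// Hypothetical helper for illustration; not part of lib/eipc.ts.
function missingSuffixes(
	registered: readonly string[],
	expected: readonly string[],
): string[] {
	return expected.filter(
		(suffix) => !registered.some((channel) => channel.endsWith(suffix)),
	);
}

// The five case-doc-anchored Launch suffixes named for T21.
const EXPECTED_LAUNCH_SUFFIXES = [
	'Launch_$_getConfiguredServices',
	'Launch_$_startFromConfig',
	'Launch_$_stopServer',
	'Launch_$_getAutoVerify',
	'Launch_$_capturePreviewScreenshot',
];

// Made-up registry snapshot; `<prefix>` stands in for whatever the real
// channel names start with. One expected handler is deliberately absent.
const registrySnapshot = [
	'<prefix>_$_Launch_$_getConfiguredServices',
	'<prefix>_$_Launch_$_startFromConfig',
	'<prefix>_$_Launch_$_stopServer',
	'<prefix>_$_Launch_$_getAutoVerify',
	'<prefix>_$_Launch_$_setAutoVerify',
];

const stillMissing = missingSuffixes(registrySnapshot, EXPECTED_LAUNCH_SUFFIXES);
console.log(stillMissing);
```

Matching on the `Interface_$_method` suffix rather than the bare method name avoids accidental hits on same-named methods in other interfaces.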
@@ -136,97 +127,93 @@ Read these in order before fanning out:
### Tests in scope this session

**Realistic ceiling: ~1-2 new specs OR one investigation + one new
spec landing.** Session 10 landed 2 specs without primitive change;
Session 9 landed 1 spec. Session 11's main bet should aim for 1-2.
spec landing.** Session 11 landed 1 spec; session 10 landed 2; session
9 landed 1. Session 12's main bet should aim for 1.

**Category A (T21 dev server preview) is now the natural next step.**
Launch IS registered (session 10 finding); only the cwd-arg schema-
rev separates it from invocation. Category B (T11 plugin install
runtime upgrade) is a parallel option using the same pattern.
Category C (operon-mode navigation probe) is investigation-shaped.
**Category A (T11 plugin install runtime upgrade) is the natural next
step.** It's currently a Tier 1 fingerprint only; promoting to a Tier
2 reframe follows the T21 shape — investigate `LocalPlugins`
read-side candidates, smoke-test, ship a single spec. Category B
(operon-mode navigation probe) is the smaller-scope investigation
fallback. There's no Category C this session — the deferral list
narrowed enough that two categories cover the budget.

Three categories — pick ONE as the main bet, treat the others as
Two categories — pick ONE as the main bet, treat the other as
fallback if the main bet hits an early blocker:

| # | Tests | Source | Notes |
|---|---|---|---|
| **A** T21 dev server preview | T21 | T19/T20 template + `lib/eipc.ts` invokeEipcChannel + bundle grep for `cwd` validator | `claude.web/Launch` registers 25 handlers (session 10 finding). T21's case-doc claim is "dev server preview pane"; the read-side reframe targets `Launch/getAutoVerify` or `Launch/getConfiguredServices` — both reject with `Argument "cwd" at position 0` on `args = []`. Schema-rev cwd via rejection-grep (session 9 pattern); cwd is likely just a string filesystem path. Then ship a Tier 2 invocation runner asserting the array-or-object shape. Risk: cwd validation may be more elaborate than a string (might need an existing-directory check); have a fallback path that uses the harness's isolation tmpdir as cwd. Smaller than A from session 10 — single suffix invocation, no multi-suffix registration probe needed unless you want to belt-and-suspender it. |
| **B** T11 plugin install runtime upgrade | T11 | T19/T20 template + read-side `LocalPlugins` enumeration | Session 7's registry probe surfaced 15 `LocalPlugins_*` handlers. T11 currently is a Tier 1 fingerprint only. Read-side candidate: `LocalPlugins/getPlugins` (likely returns array of installed plugins; needs schema-rev or smoke-test first). Same pattern as T19/T20 — registration probe + foundational read-side invocation. Risk: getPlugins may need a cwd or plugin-context arg; smoke-test first. |
| **C** operon-mode navigation probe | n/a (investigation) + maybe small Tier 2 reframe | new probe + bundle grep for operon URL routes | Session 10 confirmed `OperonBootstrap.ensure` registers eagerly but the other 21 operon interfaces remain registry-unconfirmed. Outputs: either an operon-mode URL form recovered from the bundle (search for `operon`-keyed routes in `claude.ai/...` paths) plus a registry re-probe after navigation, OR a deferral note explaining why operon scope can't be reached without an operon-mode entry. Smaller scope than A or B. |
| **A** T11 plugin install runtime upgrade | T11 | T21 template + `lib/eipc.ts` invokeEipcChannel + bundle grep / smoke-test for any LocalPlugins arg validators | T11 currently is a Tier 1 fingerprint only. Session 7's registry probe surfaced 15 `LocalPlugins_*` handlers (sample names: `getPlugins`, `getDownloadedRemotePlugins`, `syncRemotePlugins`, `listSkillFiles`). Read-side candidates for invocation — likely return array of installed plugins / downloaded remotes / detected skills. Pattern: registration probe over the case-doc-anchored install-flow suffixes (which the T11 case-doc says — read it for the exact list) PLUS invocation of one or two read-side getters with whatever args the validators allow. Risk: getPlugins / similar may need a `cwd` arg or `pluginContext` object (mirrors T33c's `egressAllowedDomains` + `pluginContext` shape); smoke-test first. |
| **B** operon-mode navigation probe | n/a (investigation) + maybe small Tier 2 reframe | new probe + bundle grep for operon URL routes | Session 10 confirmed `OperonBootstrap.ensure` registers eagerly but the other 21 wrapper-exposed operon interfaces remain registry-unconfirmed. Outputs: either an operon-mode URL form recovered from the bundle (search for `operon`-keyed routes in `claude.ai/...` paths) plus a registry re-probe after navigation, OR a deferral note explaining why operon scope can't be reached without an operon-mode entry. Smaller scope than A. |

#### Category A — T21 dev server preview

The plan: schema-rev the `cwd` validator on `Launch/getAutoVerify`
or `Launch/getConfiguredServices`, then ship a Tier 2 invocation
runner.

**Investigation phase first** — cwd format isn't yet known:

1. **Re-run smoke-test** against the user's debugger-attached Claude
   with various cwd shapes (mirror session 10's `/tmp/eipc-smoke-
   test.ts`):
   - `args = ['']` (empty string)
   - `args = ['/tmp']` (existing directory)
   - `args = ['/nonexistent']` (non-existent path — does the validator
     gate on existence?)
   - `args = ['/home/$USER']` (home dir)
   - `args = ['.']` (relative path)
   - `args = [process.cwd()]` (test CWD itself)
   - `args = [{ path: '/tmp' }]` (object form — some validators wrap)
2. **Capture rejection messages**. If `Argument "cwd" ... must be a
   string` → it's a flat string. If `must be an absolute path` → it
   needs absolute. If a successful invocation returns an array/object,
   you have the shape.
3. **Schema-rev the validator** via bundle grep on the rejection
   message literal (session 9 finding). The validator block sits
   ~50-200 chars before the throw site.
4. **Motivate the reframe** in the leading comment. T21's case-doc
   claim is "dev server preview pane starts on Preview → Start"; the
   read-side reframe is e.g. "the configured-services / auto-verify
   getters are wired and return their documented shape — the Preview
   dropdown populates from this surface". The connection to the case-
   doc surface needs to be plausible, not just "this handler returns
   an array".

**Approaches to investigate (in order):**

1. **Smoke-test cwd shapes** against the user's debugger-attached
   Claude. Cheapest signal — directly probes what the validator
   accepts.
2. **Bundle grep on the rejection message literal** for any rejection
   not resolved by smoke-test alone. The validator block is byte-
   adjacent to the throw site.
3. **Draft Tier 2 spec** using T19_runtime / T20_runtime shape (multi-
   suffix `waitForEipcChannels` over the read-side getters, plus
   `invokeEipcChannel` on the resolved cwd shape).

If Category A's cwd schema doesn't resolve cleanly after 2-3 attempts
(rejections include shape constraints not derivable from the bundle,
all attempts fail validation, the validator demands an existing
directory and the test isolation tmpdir doesn't qualify), STOP AND
REPORT. Pivot to Category B or C.

#### Category B — T11 plugin install runtime upgrade
#### Category A — T11 plugin install runtime upgrade

The plan: confirm `LocalPlugins/*` is a tractable invocation surface,
then ship a Tier 2 reframe.

**Investigation phase first** — `LocalPlugins` arg shapes aren't yet
known:

1. **Re-run `eipc-registry-probe.ts`** filtering for
   `LocalPlugins_$_*`. Session 7 surfaced 15 handlers but only listed
   4 sample method names per interface. Dump the full method list.
2. **Smoke-test candidate read-sides** with `args = []`:
   4 sample method names per interface. Dump the full method list to
   `/tmp/eipc-localplugins-methods.json` for grep. This is also a
   cheap re-probe of "is the registry shape unchanged from session
   7" — drift detection.
2. **Smoke-test candidate read-sides** with `args = []` first (the
   cheapest signal). Mirror session 11's `launch-cwd-smoke.ts`
   pattern (clone the InspectorClient pattern from
   eipc-registry-probe.ts, iterate over candidate methods + arg
   shapes, report `[OK]` / `[REJ]` per probe):
   - `getPlugins`, `getDownloadedRemotePlugins`, `syncRemotePlugins`,
     `listSkillFiles` (sample names from session 7)
   - Capture rejections. Schema-rev via bundle grep if needed.
3. **Draft Tier 2 spec** as `T11_runtime.spec.ts` — registration
   probe + foundational read-side invocation. The case-doc connection:
   "T11 verifies the plugin install code path; the LocalPlugins
   listing handler is wired and returns the documented array shape".
     `listSkillFiles` — and any other read-shaped names from the
     full method list dump.
   - Capture rejection messages. If `Argument "<name>" at position
     <N> ... failed to pass validation`, schema-rev via bundle grep
     on the literal (session 9 finding) — validator block sits
     ~50-200 chars before throw site.
   - Try the same shapes session 11 used for Launch: `[]` (empty),
     `['/tmp']` (string cwd), `[process.cwd()]` (real cwd),
     `[{}]` (empty object), `[[]]` (empty array — what T33c uses
     for `egressAllowedDomains`).
3. **Schema-rev any unresolved validator** via bundle grep on the
   rejection message literal. Session 9's pattern: the validator
   block sits ~50-200 chars before the throw site in the bundled
   `index.js`. Session 11's smoke-test path resolved Launch's
   trivial `cwd: string` validator without bundle-grep — try
   smoke-test first, fall back to grep only if rejection messages
   need more decoding.
4. **Read T11's case-doc anchors** — they'll name the install-flow
   suffixes (likely `installPlugin` / `enablePlugin` / similar
   write-sides). Build the `EXPECTED_SUFFIXES` registration list
   from those.
5. **Motivate the reframe** in the leading comment. T11's case-doc
   claim is "plugin install code path is wired"; the read-side
   reframe is e.g. "the plugin enumeration handler is wired and
   returns the documented array shape — the install button only
   activates when the listing populates". The connection to the
   case-doc surface needs to be plausible, not just "this handler
   returns an array".

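
The smoke-test step above (iterate candidate methods and arg shapes, report `[OK]` / `[REJ]` per probe) could look roughly like the sketch below. It is not the deleted `launch-cwd-smoke.ts`: the `Invoke` type, `smokeTest`, `formatResult`, and `fakeInvoke` are hypothetical, and the fake handler only stands in for a real renderer-side invocation so the loop can run on its own.

```typescript
// Hypothetical invoke signature for illustration; the real helper lives in
// lib/eipc.ts and its exact signature is not reproduced here.
type Invoke = (channelSuffix: string, args: unknown[]) => Promise<unknown>;

interface ProbeResult {
	method: string;
	args: unknown[];
	ok: boolean;
	detail: string;
}

// Runs every method against every arg shape and records accept/reject.
async function smokeTest(
	iface: string,
	methods: readonly string[],
	argShapes: readonly unknown[][],
	invoke: Invoke,
): Promise<ProbeResult[]> {
	const results: ProbeResult[] = [];
	for (const method of methods) {
		for (const args of argShapes) {
			try {
				const value = await invoke(`${iface}_$_${method}`, args);
				const detail = Array.isArray(value) ? 'array' : typeof value;
				results.push({ method, args, ok: true, detail });
			} catch (err) {
				const detail = err instanceof Error ? err.message : String(err);
				results.push({ method, args, ok: false, detail });
			}
		}
	}
	return results;
}

function formatResult(r: ProbeResult): string {
	const tag = r.ok ? '[OK]' : '[REJ]';
	return `${tag} ${r.method}(${JSON.stringify(r.args)}) -> ${r.detail}`;
}

// Fake handler (assumption, illustration only): accepts one string argument
// and returns an empty array, rejecting every other shape.
const fakeInvoke: Invoke = async (_suffix, args) => {
	if (args.length !== 1 || typeof args[0] !== 'string') {
		throw new Error('fake rejection: expected a single string argument');
	}
	return [];
};

// Arg shapes taken from the list above (process.cwd() omitted for brevity).
const ARG_SHAPES: unknown[][] = [[], ['/tmp'], [{}], [[]]];

const smokeRun = smokeTest('LocalPlugins', ['getPlugins'], ARG_SHAPES, fakeInvoke);
smokeRun.then((results) => results.forEach((r) => console.log(formatResult(r))));
```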
Skip this category unless Category A is blocked AND Category C is
unappealing.
**Approaches to investigate (in order):**

#### Category C — operon-mode navigation probe
1. **Smoke-test LocalPlugins read-sides** with `args = []` and a few
   common shapes against the user's debugger-attached Claude.
   Cheapest signal — directly probes what each validator accepts.
2. **Bundle grep on any rejection-message literal** for any rejection
   not resolved by smoke-test alone.
3. **Draft Tier 2 spec** using T21 shape (multi-suffix
   `waitForEipcChannels` over the case-doc-anchored install-flow
   suffixes, plus `invokeEipcChannel` on the resolved read-side).

If Category A's read-side invocation doesn't resolve cleanly after 2-3
attempts (every candidate rejects with shape constraints not derivable
from the bundle, all candidate read-sides require user-account-scoped
args like a real `cwd` with installed plugins, the validator demands
a session/plugin-context that the harness can't construct), STOP AND
REPORT. Pivot to Category B.

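
For the bundle-grep approach, one way to pull the text just ahead of each throw site is a fixed-width window around every occurrence of the rejection literal. The sketch below assumes the bundle has already been read into a string; `windowsAround` is a hypothetical helper, and `fakeBundle` is a made-up minified fragment, not real bundle content.

```typescript
// Returns one context window per occurrence of `literal` in `source`,
// reaching `before` chars back and `after` chars past the literal.
function windowsAround(
	source: string,
	literal: string,
	before = 200,
	after = 0,
): string[] {
	if (literal.length === 0) return [];
	const windows: string[] = [];
	let from = 0;
	for (;;) {
		const at = source.indexOf(literal, from);
		if (at === -1) break;
		const start = Math.max(0, at - before);
		const end = Math.min(source.length, at + literal.length + after);
		windows.push(source.slice(start, end));
		from = at + literal.length;
	}
	return windows;
}

// Made-up minified fragment for illustration; not taken from the real bundle.
const fakeBundle =
	'function a(){}' +
	'if(typeof e!="string")throw new Error("Argument \\"cwd\\" failed to pass validation");' +
	'function b(){}';

const bundleHits = windowsAround(fakeBundle, 'failed to pass validation', 60);
console.log(bundleHits);
```

The default of 200 chars back follows the "~50-200 chars before the throw site" observation; widening `before` and `after` toward the ~2KB mentioned in the constraints is a one-argument change.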
#### Category B — operon-mode navigation probe

The plan: find an operon-mode URL form and verify whether the other
21 operon interfaces register lazily.
@@ -246,7 +233,7 @@ The plan: find an operon-mode URL form and verify whether the other
document as "operon scope handlers register lazily on a navigation
we can't easily construct from the harness" and defer.

This is the smallest-scope category — investigation + maybe one
This is the smaller-scope category — investigation + maybe one
spec landing. Best fallback if Category A is blocked.

#### Cross-compositor focus-shifter expansion (NOT recommended this session)
@@ -266,12 +253,12 @@ section. Don't add it speculatively — wait for a real consumer.
### Constraints to respect (don't violate)

These are unchanged from sessions 1-10 and still load-bearing:
These are unchanged from sessions 1-11 and still load-bearing:

- **Default isolation** unless the spec needs otherwise. Use
  `seedFromHost: true` for any test that depends on authenticated
  renderer state — never assume default isolation gets past
  `/login`. T16/T19/T20/T26/T22b/T27/T31b/T33b/T33c/T35b/T37b/T38b
  `/login`. T16/T19/T20/T21/T26/T22b/T27/T31b/T33b/T33c/T35b/T37b/T38b
  are the templates.
- **eipc handlers register on `webContents.ipc._invokeHandlers`,
  NOT global `ipcMain._invokeHandlers`.** Session 7 finding. Use
@@ -286,23 +273,32 @@ These are unchanged from sessions 1-10 and still load-bearing:
  gate honestly. Main-side direct calls work but require spoofing
  `senderFrame.url`; reserved as a fallback for non-claude.ai
  webContents (no current consumer).
- **For arg validator schema-rev: grep the rejection message
  literal first.** Session 9 finding. When `invokeEipcChannel`
  rejects with `Argument "<name>" at position N ... failed to pass
  validation`, that exact string lives inline in the validator
  block. One grep on the literal resolves the location; reading
  ~2KB around it surfaces the full schema. Cheaper than runtime
  closure inspection in most cases (closure inspection is a good
  cross-check).
- **For arg validator schema-rev: try smoke-test first, then grep
  the rejection message literal.** Session 9 finding. When
  `invokeEipcChannel` rejects with `Argument "<name>" at position N
  ... failed to pass validation`, that exact string lives inline in
  the validator block. One grep on the literal resolves the
  location; reading ~2KB around it surfaces the full schema. Cheaper
  than runtime closure inspection in most cases. Session 11 finding:
  for trivial `typeof === 'string'` validators, the smoke-test
  resolves the shape in one round-trip — bundle-grep is unnecessary
  overhead for simple validators.
- **For session-scoped Tier 2 reframes: `LocalSessions/getAll` is
  the foundational read-side surrogate.** Session 10 finding. When
  a case-doc test's anchors are write-side LocalSessions handlers,
  ship a registration probe over the case-doc-anchored suffixes
  PLUS a single `invokeEipcChannel('LocalSessions_$_getAll', [])`
  array-shape assertion as the read-side surrogate. The case-doc
  connection: "this surface binds to a LocalSession; getAll proves
  the LocalSessions impl object is reachable through the renderer
  wrapper".
  a case-doc test's anchors are write-side LocalSessions handlers
  with no read-side equivalent, ship a registration probe over the
  case-doc-anchored suffixes PLUS a single
  `invokeEipcChannel('LocalSessions_$_getAll', [])` array-shape
  assertion as the read-side surrogate.
- **For Tier 2 reframes with case-doc-anchored read-side handlers:
  invoke the case-doc-anchored handlers directly.** Session 11
  finding. When the case-doc has read-side anchors with resolvable
  arg shapes (like T21's `getConfiguredServices(cwd)` /
  `getAutoVerify(cwd)`), prefer invoking those over a foundational
  surrogate. Mixed-shape dual invocation (one returns array, another
  returns boolean) is fine — assert each shape independently. This
  pattern is strictly stronger than the foundational-surrogate
  pattern when applicable.
- **`lib/input.ts` is X11-only.** Strict `XDG_SESSION_TYPE ===
  'x11'` gate. Wayland consumers must skip — don't try to bolt
  Wayland into the file.
@@ -332,8 +328,8 @@ These are unchanged from sessions 1-10 and still load-bearing:
|
||||
errors and short-circuit; see S11 / S14 for the pattern.)
|
||||
- **Diagnostics on every run.** `testInfo.attach()` the artefacts.
|
||||
Single-shot JSON dumps for multi-state tests (S11, S14, S31,
|
||||
T19, T20, T22b, T27, T31b, T33b, T33c, T35b, T37b, T38b pattern)
|
||||
are cleaner than 5+ separate attachments.
|
||||
T19, T20, T21, T22b, T27, T31b, T33b, T33c, T35b, T37b, T38b
|
||||
pattern) are cleaner than 5+ separate attachments.
|
||||
- **Tag with annotations.** `severity:` and `surface:` on every
|
||||
test so JUnit carries them through to matrix-regen.
|
||||
- **Tabs in TS, ~80-char wrap as the existing files do.** Match
|
||||
@@ -357,26 +353,27 @@ These are unchanged from sessions 1-10 and still load-bearing:
|
||||
contain credentials; scheduled-task instructions may reference
|
||||
internal projects; marketplace `pluginContext`-filtered listings
|
||||
may surface internal-org marketplace pointers (T33c's defensive
|
||||
default). T19/T20's `getAll` defensive default extends the
|
||||
pattern: session metadata may include user-account-scoped paths
|
||||
and titles.
|
||||
default). T19/T20's `getAll` and T21's `getConfiguredServices`
|
||||
defensive defaults extend the pattern: session metadata may
|
||||
include user-account-scoped paths and titles; configured dev
|
||||
service entries may include workspace paths from auto-detect.
|
||||
|
||||
### Phases
|
||||
|
||||
#### Phase 0 — calibration
|
||||
|
||||
1. `cd tools/test-harness && npm run typecheck` — should pass.
|
||||
2. Read the plan doc's "Status (post-execution)" session 10 section,
|
||||
2. Read the plan doc's "Status (post-execution)" session 11 section,
|
||||
then read `lib/eipc.ts`'s `invokeEipcChannel` API +
|
||||
`T19_runtime.spec.ts` / `T20_runtime.spec.ts` leading comments.
|
||||
Confirm you understand the multi-suffix registration + foundational
|
||||
read-side invocation pattern.
|
||||
`T21_runtime.spec.ts` leading comments. Confirm you understand the
|
||||
multi-suffix registration + dual case-doc-anchored read-side
|
||||
invocation pattern.
|
||||
3. Pick ONE Category as the main bet. For Category A, plan the
|
||||
approach: (a) smoke-test cwd shapes against `Launch/getAutoVerify`,
|
||||
(b) bundle-grep any rejection literal for shape constraints, (c)
|
||||
draft the Tier 2 invocation spec. List which approaches you'll try
|
||||
in what order, with the cap at 2-3 distinct approaches before STOP
|
||||
AND REPORT.
|
||||
approach: (a) re-run registry probe filtering for LocalPlugins,
|
||||
(b) smoke-test candidate read-sides with various arg shapes, (c)
|
||||
bundle-grep any unresolved validator, (d) draft the Tier 2 spec.
|
||||
List which approaches you'll try in what order, with the cap at
|
||||
2-3 distinct approaches before STOP AND REPORT.
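
The arg-shape smoke-test step in the list above can be sketched roughly as follows. This is not the `launch-cwd-smoke.ts` script itself: `InvokeOne`, `smokeTestShapes`, `fakeHandler`, and the candidate list are hypothetical, and the fake handler simply assumes a `typeof === 'string'` validator like the one session 11 found for `Launch`. A real probe would route `invoke` through the debugger-attached app instead.

```typescript
// Hypothetical single-argument invoker. In a real probe this would call
// a read-side handler through the debugger-attached app.
type InvokeOne = (arg: unknown) => Promise<unknown>;

interface ShapeRow {
	label: string;
	status: '[OK]' | '[REJ]';
	detail: string;
}

// Try each candidate arg shape once and record accept / reject.
async function smokeTestShapes(
	invoke: InvokeOne,
	candidates: Array<{ label: string; arg: unknown }>,
): Promise<ShapeRow[]> {
	const rows: ShapeRow[] = [];
	for (const { label, arg } of candidates) {
		try {
			const value = await invoke(arg);
			rows.push({ label, status: '[OK]', detail: typeof value });
		} catch (err) {
			const detail = err instanceof Error ? err.message : String(err);
			rows.push({ label, status: '[REJ]', detail });
		}
	}
	return rows;
}

// Fake handler, assuming a string-only validator (illustration only).
const fakeHandler: InvokeOne = async (arg) => {
	if (typeof arg !== 'string') {
		throw new Error('Argument failed to pass validation');
	}
	return [];
};

// A few shapes in the spirit of the session 11 smoke-test.
const candidates = [
	{ label: 'empty string', arg: '' },
	{ label: 'absolute path', arg: '/tmp' },
	{ label: 'null', arg: null },
	{ label: 'object wrap', arg: { path: '/tmp' } },
];
```

Collecting rows instead of printing as it goes keeps the probe easy to dump as a single JSON artefact.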

If Phase 0 surfaces a problem (typecheck failing, primitives unclear,
the chosen Category's prerequisites don't hold), stop and report.
@@ -384,24 +381,19 @@ Don't fan out.

#### Phase 1 — fan-out batch

For Category A (T21 dev server preview):
- Spawn ONE subagent for the cwd schema-rev investigation
(smoke-test + bundle-grep). Treat as exploratory; report findings
before committing to a spec shape. The user's debugger-attached
running Claude is a great target for verification probes.
- Cap re-spawns at 2-3 distinct approaches; if cwd schema doesn't
resolve, STOP AND REPORT. Pivot to Category B or C if budget
remains.
- If schema is recoverable AND invocation lands cleanly with valid
args, second batch: ship `T21_runtime.spec.ts`.
- Cap at ~1 spec total — T21 is single-suffix invocation, smaller
scope than T19/T20.
For Category A (T11 plugin install runtime upgrade):
- Spawn ONE subagent for the LocalPlugins read-side schema
investigation (registry re-probe + smoke-test + any needed
bundle-grep). Treat as exploratory; report findings before
committing to a spec shape. The user's debugger-attached running
Claude is a great target for verification probes.
- Cap re-spawns at 2-3 distinct approaches; if no read-side resolves
cleanly, STOP AND REPORT. Pivot to Category B if budget remains.
- If a read-side lands cleanly with valid args, second batch: ship
`T11_runtime.spec.ts`.
- Cap at ~1 spec total — same scope as session 11's T21.

For Category B (T11 plugin install runtime upgrade):
- Same shape as Category A — investigate read-side `LocalPlugins`
candidates, smoke-test, schema-rev, ship `T11_runtime.spec.ts`.

For Category C (operon-mode navigation probe):
For Category B (operon-mode navigation probe):
- Single subagent does bundle-grep for operon URL routes + per-URL
registry re-probe. Report findings; if a Tier 2 reframe is
tractable, ship one spec.
@@ -442,7 +434,7 @@ If the target isn't reasonable to implement (anchors don't resolve
to anything assertable, the test depends on state you can't
construct, the existing primitives don't cover the surface), DO
NOT write a stub. Report under Open questions and stop. Sessions
1-10 had cumulative ~17 "stop and report" outcomes that were the
1-11 had cumulative ~17 "stop and report" outcomes that were the
right call.

Report shape (~150 words):
@@ -476,7 +468,7 @@ After fan-out returns:
- Primitives landed (with API shape)
- Specs deferred (with the per-test rationale)
- Specs reclassified (Tier 3 → Tier 2, Tier 2 → Tier 1, etc.)
- Updated coverage stat (was 72/76 = 95%, now N/76 = M%)
- Updated coverage stat (was 73/76 = 96%, now N/76 = M%)
6. Don't commit. The user reviews and commits.
7. Rotate this prompt: rewrite
`docs/testing/runner-implementation-followup-prompt.md` for
@@ -484,7 +476,7 @@ After fan-out returns:

### Self-correction loop

Same as sessions 1-10:
Same as sessions 1-11:

1. Subagent typecheck failure → re-spawn with explicit fix
instruction.
@@ -497,10 +489,10 @@ Same as sessions 1-10:
an unauthenticated launch where the handler check vacuously
passes because no handlers are registered) → re-examine the
assertion shape.
5. **Carry-over from session 5/6/7/8/9/10:** If pursuing Category A
and the cwd schema doesn't resolve / requires schema-rev that
exceeds budget after 2-3 approaches, STOP. Don't keep digging —
pivot to Category B or C. Document what was tried.
5. **Carry-over from session 5/6/7/8/9/10/11:** If pursuing Category A
and the LocalPlugins read-side schema doesn't resolve / requires
schema-rev that exceeds budget after 2-3 approaches, STOP. Don't
keep digging — pivot to Category B. Document what was tried.
6. **Carry-over from session 10:** If a registration probe surfaces
"registered but uninvocable" (handler is on the registry but the
renderer-side wrapper isn't exposed for the relevant scope or the
@@ -521,22 +513,21 @@ Stop and write the final report when one of:
3. **Discovered a primitive gap that breaks 5+ Tier 2/Tier 3
tests.** Stop, propose where the new primitive should live in
`lib/`. Future session adds the primitive first, then resumes.
4. **Session budget hits ~1-2 new specs OR one new primitive
4. **Session budget hits ~1 new spec OR one new primitive
landing.** Stop, synthesize, leave the rest for the next
session.
5. **Category A cwd schema doesn't resolve after 2-3 distinct
5. **Category A read-side schema doesn't resolve after 2-3 distinct
attempts.** Document the dead-end as a finding, pivot to
Category B or C if budget remains.
Category B if budget remains.

### What you should NOT do

- **Don't try to land Category A + Category B in one batch.** Pick
ONE as the main bet. Category C is small enough to pair as a
fallback.
ONE as the main bet.
- **Don't ship stubs.** If a runner can't actually assert what the
spec says, mark it as Tier 3 / blocked / primitive-gap and
don't write a placeholder. The cumulative seventeen "stop and
report" outcomes from sessions 1-10 were the right call — every
report" outcomes from sessions 1-11 were the right call — every
one revealed a real constraint.
- **Don't break existing runners.** H01-H05 are the canaries.
- **Don't restructure `lib/`** beyond targeted additions.
@@ -559,7 +550,7 @@ Stop and write the final report when one of:
primitive doesn't enforce a read-only allowlist; the safety
property is that case-doc-anchored suffixes are read-side OR
case-doc-anchored write-side suffixes are tested via REGISTRATION
ONLY (`waitForEipcChannels`), never invoked. T19/T20 ship
ONLY (`waitForEipcChannels`), never invoked. T19/T20/T21 ship
registration probes over write-side suffixes — that's the safe
pattern.
- **Don't bolt other compositors into `lib/input-niri.ts`.**
@@ -576,6 +567,11 @@ Stop and write the final report when one of:
speculatively.** Build it only if a concrete consumer needs to
invoke through a non-claude.ai webContents. Premature primitives
leak design debt.
- **Don't speculate on a Launch event-subscription primitive.**
Session 11 noted that `window['claude.web'].Launch` exposes 5
`on*` event subscribers + `activeServersStore` not visible in
`_invokeHandlers`. No consumer asks for an event-probe primitive
yet. Wait for one.
- **Don't implement the #569 power-inhibit patch in this
session.** That's a separate workstream.
- **Don't commit.** The user reviews and commits.
@@ -583,13 +579,13 @@ Stop and write the final report when one of:
### Final report format

```markdown
## Runner implementation summary (session 11)
## Runner implementation summary (session 12)

- Main-bet category: A | B | C
- Main-bet category: A | B
- Specs landed: N
- Primitives landed: N
- Reclassified mid-flight: N (with reasons)
- Coverage: was 72/76 (95%), now <NEW>/76 (<PCT>%)
- Coverage: was 73/76 (96%), now <NEW>/76 (<PCT>%)
- Typecheck: clean | <errors>
- KDE-W test run: <pass/skip/fail counts>

@@ -597,7 +593,7 @@ Stop and write the final report when one of:

| Cat | Test ID | File | Assertion shape | Status |
|---|---|---|---|---|
| A | T21_runtime | T21_runtime.spec.ts | … | ✓ pass / skip / fail |
| A | T11_runtime | T11_runtime.spec.ts | … | ✓ pass / skip / fail |
| ... |

## Notable findings
@@ -652,28 +648,37 @@ git diff --stat
- For eipc registry walking: `lib/eipc.ts` exports
`getEipcChannels` / `findEipcChannel` / `findEipcChannels` /
`waitForEipcChannel` / `waitForEipcChannels` against
`webContents.ipc._invokeHandlers`. See T19 / T20 / T22b / T31b /
T33b / T38b for end-to-end consumer patterns.
`webContents.ipc._invokeHandlers`. See T19 / T20 / T21 / T22b /
T31b / T33b / T38b for end-to-end consumer patterns.
- For eipc invocation: `lib/eipc.ts` exports `invokeEipcChannel`
(renderer-side wrapper at
`window['claude.<scope>'].<Iface>.<method>`). See T19 / T20 /
T27 / T33c / T35b / T37b for end-to-end consumer patterns. Only
call read-side suffixes; the primitive doesn't enforce a
read-only allowlist.
- **For arg validator schema-rev (session 9 finding):** when
T21 / T27 / T33c / T35b / T37b for end-to-end consumer patterns.
Only call read-side suffixes; the primitive doesn't enforce a
read-only allowlist.
- **For arg validator schema-rev (sessions 9 / 11 findings):** when
invocation rejects with `Argument "<name>" at position N ...
failed to pass validation`, grep the bundled `index.js` for the
literal rejection string. The validator block sits ~50-200 chars
before that throw. Read ~2KB around it to capture the full
schema. See plan-doc session 9 status section for the byte
offsets of the two CustomPlugins validators (5013601 / 5018821)
as worked examples.
failed to pass validation`, FIRST try smoke-testing common arg
shapes against the user's debugger-attached Claude (session 11's
`launch-cwd-smoke.ts` pattern — clone the InspectorClient
connection, iterate over arg shape candidates, report `[OK]` /
`[REJ]` per shape). For trivial validators (`typeof === 'string'`
/ similar), this resolves the schema in one round-trip and avoids
needing bundle-grep. For more elaborate validators, fall back to
grep on the bundled `index.js` for the literal rejection string;
the validator block sits ~50-200 chars before the throw site. See
plan-doc session 9 status section for the byte offsets of the
two CustomPlugins validators (5013601 / 5018821) as worked
examples.
- **For session-scoped Tier 2 reframes (session 10 finding):**
`LocalSessions/getAll` is the foundational read-side surrogate.
Pattern: `args = []`, returns `Array<Session>`, the case-doc
connection is "this surface binds to a LocalSession; getAll
proves the LocalSessions impl object is reachable through the
renderer wrapper". T19 and T20 are the templates.
`LocalSessions/getAll` is the foundational read-side surrogate
when case-doc anchors are write-side. Pattern: `args = []`,
returns `Array<Session>`. T19 and T20 are the templates.
- **For Tier 2 reframes with case-doc-anchored read-side handlers
(session 11 finding):** invoke the case-doc-anchored handlers
directly rather than using a foundational surrogate. Mixed-shape
dual invocation is fine. T21 is the template (one returns array,
another returns boolean — assert each shape independently).
- **For asar fingerprints: ALWAYS grep the installed asar
first.** Build-reference is beautified; the bundle is
minified. Case-doc text may be the user-facing form, not the

@@ -18,6 +18,94 @@ work begins.

## Status (post-execution)

**Shipped session 11 (1 new spec, no primitive change):** T21 (Tier 2
reframe — `seedFromHost` + multi-suffix registration probe over five
case-doc-anchored Launch handlers + dual-handler invocation of the
case-doc-anchored read-side getters `getConfiguredServices` /
`getAutoVerify`). First runtime probe for T21 — no fingerprint sibling
shipped because the case-doc anchors point at impl-side function
names (`setAutoVerify`, `parseLaunchJson`, `capturePage` /
`captureViaCDP`) plus an MCP tool table (`preview_*`), not the
user-facing channel names. Coverage moved from 72/76 (95%) to 73/76 (96%).
Passes on KDE-W in 16.7s (cold) / 5.2s (warm follow-up).
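
The coverage figures quoted above are consistent with rounding to the nearest whole percent. A small sketch under that assumption (the `coverageStat` helper is hypothetical and not part of the harness):

```typescript
// Format a coverage stat in the "covered/total (pct%)" form used in the
// status lines. Rounding to the nearest integer is an assumption that
// happens to match the figures quoted in this document.
function coverageStat(covered: number, total: number): string {
	if (total <= 0) {
		throw new RangeError('total must be positive');
	}
	const pct = Math.round((covered / total) * 100);
	return `${covered}/${total} (${pct}%)`;
}

const before = coverageStat(72, 76); // 72 / 76 is about 94.7
const after = coverageStat(73, 76); // 73 / 76 is about 96.1
```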

Two commits on `docs/compat-matrix` expected (SHAs inserted after
the test-harness commit lands — the user reviews and commits at the
end of every session):

- TBD — `test(harness): session 11 T21 dev server preview runtime`
(Tier 2 reframe; multi-suffix `waitForEipcChannels` over the
case-doc-anchored Launch suffixes — `getConfiguredServices` /
`startFromConfig` / `stopServer` / `getAutoVerify` /
`capturePreviewScreenshot` — plus dual `invokeEipcChannel` on
`Launch_$_getConfiguredServices` (returns array) and
`Launch_$_getAutoVerify` (returns boolean), both with
`cwd = process.cwd()` as the validator-passing string).

Session 11 findings + reclassifications:

- **`Launch` cwd validator is `typeof cwd === 'string'` only.**
Smoke-test against the user's debugger-attached running Claude on
`getAutoVerify` and `getConfiguredServices` showed: `''` (empty
string), `'.'` (relative), `'/tmp'` (existing absolute), `'/'`
(root), `'/home/aaddrick'` (home), `'/nonexistent-path-xyzzy'`
(non-existent), `'/home/aaddrick/source/claude-desktop-debian'`
(existing) ALL pass. Only `null`, `undefined`, and object wraps
(`{path:'/tmp'}`, `{cwd:'/tmp'}`) reject. No path-existence check,
no absolute-path requirement. The handler tolerates missing
`<cwd>/.claude/launch.json` — returns `[]` for `getConfiguredServices`
and `false` for `getAutoVerify` when the config file is absent.
Bundle-grep on the rejection literal NOT needed — direct
smoke-test resolved the schema in one round-trip.
- **`window['claude.web'].Launch` exposes 30 callable members.** The
registry probe sees 25 `_invokeHandlers`; the wrapper additionally
surfaces 5 `on*` event subscribers (`onDeployEvent`,
`onElementSelected`, `onPreviewSelectionShortcut`,
`onPreviewUrlChanged`) plus `isAvailable` and `activeServersStore`.
Wrapper-only entries don't show up in `webContents.ipc._invokeHandlers`
because they're not invoke-style channels — they bind to event
emitters and store proxies via different bridge primitives. Worth
noting for any future T21-area test that wants to probe event
subscription paths (those would need a different primitive than
`invokeEipcChannel`).
- **UUID still `c0eed8c9-c94a-4931-8cc3-3a08694e9863`.** Build-stable
since session 2; smoke-test confirmed.
- **Dual case-doc-anchored read-side invocation pattern.** T21
follows T33c's shape (invoke each case-doc-anchored read-side
suffix, assert the documented shape per handler) rather than T19
/ T20's foundational-surrogate shape (invoke `LocalSessions/getAll`
as a stand-in because case-doc anchors were write-side). When a
case-doc has read-side anchors with resolvable arg shapes, prefer
invoking the case-doc-anchored handlers directly — it removes the
surrogate hop and the assertion is "the documented handler returns
the documented shape" rather than "a foundational sibling is
reachable".
- **No primitive change.** `lib/eipc.ts`'s `waitForEipcChannels` +
`invokeEipcChannel` cover T21 unchanged. Investigation budget was
~3 minutes for the smoke-test (eight cwd shapes against
`getAutoVerify` plus three against `getConfiguredServices`); no
bundle-grep needed.
- **Filename convention.** T21 had no fingerprint sibling, so
follows T19 / T20 / T26 / T27's `_runtime` (no letter suffix)
shape — `T21_runtime.spec.ts`. Same pattern as session 10's
first-runtime-probe-against-case-doc filename rule.
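
The wrapper-versus-registry gap noted in the findings above (wrapper members that never appear in `_invokeHandlers`) amounts to a set difference. A rough sketch follows; `wrapperOnlyMembers` is a hypothetical helper, and the `example_$_` channel prefix in the sample data is made up, since the full channel-name format is not shown in this document. Only the `<Iface>_$_<method>` suffix convention is taken from the text.

```typescript
// Return wrapper members that have no matching invoke-handler channel
// (for example event subscribers and store proxies).
// Assumption: registered channel names end in `<iface>_$_<method>`.
function wrapperOnlyMembers(
	wrapperKeys: string[],
	registeredChannels: string[],
	iface: string,
): string[] {
	const marker = `${iface}_$_`;
	const registered = new Set<string>();
	for (const channel of registeredChannels) {
		const at = channel.lastIndexOf(marker);
		if (at !== -1) {
			registered.add(channel.slice(at + marker.length));
		}
	}
	return wrapperKeys.filter((key) => !registered.has(key)).sort();
}

// Sample data only; the channel prefix is invented for illustration.
const sampleWrapperKeys = [
	'getAutoVerify',
	'getConfiguredServices',
	'onDeployEvent',
	'isAvailable',
	'activeServersStore',
];
const sampleChannels = [
	'example_$_Launch_$_getAutoVerify',
	'example_$_Launch_$_getConfiguredServices',
	'example_$_LocalSessions_$_getAll',
];
const wrapperOnly = wrapperOnlyMembers(
	sampleWrapperKeys,
	sampleChannels,
	'Launch',
);
```

Anything this difference returns would need a primitive other than `invokeEipcChannel` to probe, which matches the caution in the findings.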

Tier 2 → Tier 2 candidates remaining for next session: **T11 plugin
install runtime upgrade** (currently a Tier 1 fingerprint;
`LocalPlugins` registers 15 handlers per session 7's probe, includes
`getPlugins` / `getDownloadedRemotePlugins` / `syncRemotePlugins` /
`listSkillFiles` candidates — needs schema-rev or smoke-test). **operon
scope navigation probe** still on the table (session 10 confirmed
`OperonBootstrap.ensure` registers eagerly but the other 21
wrapper-exposed operon interfaces remain registry-unconfirmed; would
need an operon-mode URL form recovered from the bundle). **T11 is
the natural main bet for session 12** — same pattern as session 11's
T21 (single Launch interface investigated, single new spec landed).
The primitive surface remains broad enough that consumer-driven
extensions are the right next move.

---

**Shipped session 10 (2 new specs, no primitive change):** T19 + T20
(Tier 2 reframes — `seedFromHost` + multi-suffix registration probe
over the case-doc-anchored write-side handlers + invocation of the

@@ -7,7 +7,7 @@ architecture, decisions, and rationale.

## Status

Seventy-two specs wired (34 cross-env T-tests, 33 env-specific S-tests,
Seventy-three specs wired (35 cross-env T-tests, 33 env-specific S-tests,
5 H-prefix harness self-tests). See
[`docs/testing/runner-implementation-plan.md`](../../docs/testing/runner-implementation-plan.md)
for the tiered triage of remaining tests and the per-spec rationale
@@ -35,6 +35,7 @@ behind tier classification.
| [T18](../../docs/testing/cases/code-tab-foundations.md#t18--drag-and-drop-files-into-prompt) | Bundled `mainView.js` preload contains the path-resolution bridge fingerprints: `getPathForFile` (2× — property key + the `webUtils.getPathForFile(` call, both at case-doc :9267), `webUtils`, `filePickers`, and the `claudeAppSettings` `contextBridge.exposeInMainWorld` namespace (case-doc :9552) — pins the load-bearing wiring without faking OS-level XDND drag (xdotool can't put file URIs on the X11 selection; Wayland needs per-compositor IPC + libei) | file probe |
| [T19](../../docs/testing/cases/code-tab-foundations.md#t19--integrated-terminal) | After `seedFromHost` + `userLoaded`, the integrated-terminal eipc surface (`startShellPty`, `writeShellPty`, `stopShellPty`, `resizeShellPty`, `getShellPtyBuffer` — five-suffix presence probe) is registered on the claude.ai webContents AND the foundational `LocalSessions/getAll` returns array shape (Tier 2 reframe of the case-doc T19 case; case-doc anchors are write-side `startShellPty` etc. so reframe asserts the FULL terminal IPC surface registers + a stateless read-side surrogate is invocable) | L1 (eipc registry + invoke) |
| [T20](../../docs/testing/cases/code-tab-foundations.md#t20--file-pane-opens-and-saves) | After `seedFromHost` + `userLoaded`, the file-pane eipc surface (`readSessionFile`, `writeSessionFile`, `pickSessionFile` — three-suffix presence probe) is registered on the claude.ai webContents AND the foundational `LocalSessions/getAll` returns array shape (Tier 2 reframe of the case-doc T20 case; the case-doc's `readSessionFile` anchor is read-side but needs (sessionId, path) args not constructible from a fresh isolation, so the registration probe + foundational `getAll` invocation is the strongest non-destructive Tier 2 layer) | L1 (eipc registry + invoke) |
| [T21](../../docs/testing/cases/code-tab-workflow.md#t21--dev-server-preview-pane) | After `seedFromHost` + `userLoaded`, the preview-pane eipc surface (`getConfiguredServices`, `startFromConfig`, `stopServer`, `getAutoVerify`, `capturePreviewScreenshot` — five-suffix presence probe) is registered on the claude.ai webContents AND BOTH case-doc-anchored read-side handlers are callable through the renderer-side wrapper: `getConfiguredServices(cwd)` returns array shape, `getAutoVerify(cwd)` returns boolean shape (Tier 2 reframe of the case-doc T21 case; cwd validator is `typeof cwd === 'string'` only, smoke-tested session 11) | L1 (eipc registry + invoke) |
| [T22](../../docs/testing/cases/code-tab-workflow.md#t22--pr-monitoring-via-gh) | Bundled `index.js` contains `LocalSessions_$_getPrChecks` eipc channel name *and* `gh CLI not found in PATH` Linux-fallthrough throw site (Tier 1 fingerprint) | file probe |
| [T22b](../../docs/testing/cases/code-tab-workflow.md#t22--pr-monitoring-via-gh) | After `seedFromHost` + `userLoaded`, the `LocalSessions_$_getPrChecks` eipc handler is registered on the claude.ai webContents (`webContents.ipc._invokeHandlers` — Tier 2 runtime probe sibling of T22, strictly stronger than the bundle-string fingerprint) | L1 (eipc registry) |
| [T23](../../docs/testing/cases/code-tab-handoff.md#t23--desktop-notifications-fire) | Firing `new Notification({title})` from main reaches the session bus's `org.freedesktop.Notifications.Notify` (observed via `dbus-monitor`) | L1 + DBus subprocess |