Build out a Playwright-based regression-detection harness covering the compat-matrix surfaces (KDE-W, KDE-X, GNOME, Sway, i3, Niri, packaging formats). Adds: - Planning + decision docs under docs/testing/ — README, matrix, runbook, automation, cases/ (11 case files), quick-entry-closeout - Playwright scaffolding (config, tsconfig) - 78 spec runners under tools/test-harness/src/runners/ — T## case- doc runners and S## distribution/smoke runners - Substrate primitives in tools/test-harness/src/lib/: AX-tree loader (snapshotAx + waitForAxNode + axTreeToSnapshot), focus- shifter, eipc-registry, niri-native bridge, drag-drop bridge, electron-mocks, claudeai page-objects, inspector client S03 (DEB Depends declared) and S04 (RPM Requires declared) ship marked test.fail() — they're regression detectors for the case-doc gap (deb.sh emits no Depends:, rpm.sh sets AutoReqProv: no), and the expected-failure shape lets them report green on every host until upstream packaging starts declaring runtime deps. 127 files, no runtime changes; harness is opt-in via 'cd tools/test-harness && npx playwright test'. Co-authored-by: Claude <claude@anthropic.com>
Linux Compatibility Testing
Last updated: 2026-05-03
This directory holds the manual test plan for the Linux fork of Claude Desktop. The structure is designed for human readers today and scripted runners tomorrow.
Layout
| Folder / file | Purpose |
|---|---|
matrix.md |
The dashboard. Cross-environment results table + per-section env-specific status snapshots. Single source of truth for test status. |
runbook.md |
How to run a sweep: VM setup, diagnostic capture, status update workflow, severity guidance. |
cases/ |
Functional test specs grouped by feature surface. Stable IDs: T### cross-env, S### env-specific. |
Environment key
| Abbrev | Distro | DE | Display server |
|---|---|---|---|
| KDE-W | Fedora 43 | KDE Plasma | Wayland |
| KDE-X | Fedora 43 | KDE Plasma | X11 |
| GNOME | Fedora 43 | GNOME | Wayland |
| Ubu | Ubuntu 24.04 | GNOME | Wayland |
| Sway | Fedora 43 | Sway | Wayland (wlroots) |
| i3 | Fedora 43 | i3 | X11 |
| Niri | Fedora 43 | Niri | Wayland (wlroots) |
| Hypr-O | OmarchyOS | Hyprland | Wayland (wlroots) |
| Hypr-N | NixOS | Hyprland | Wayland (wlroots) |
Status legend: ✓ pass · ✗ fail · 🔧 mitigated · ? untested · - N/A
Cells include linked issue/PR numbers when relevant — e.g. ✗ #404 or 🔧 #406. A bare ✗ means the failure is verified but no tracking issue is filed yet.
Severity tiers
Each test is tagged with one of:
| Tier | Meaning | Sweep cadence |
|---|---|---|
| Smoke | Release-gate. Must pass before any tag is cut. | Every release tag, on KDE-W + one wlroots row |
| Critical | Regression-blocker. Failure on any supported environment blocks the release. | Every release tag, on every active row |
| Should | Important but not blocking. Track as bugs, fix before next stable. | Quarterly + on demand |
| Could | Edge cases, nice-to-have. | On demand only |
Smoke set
The minimum set that gates a release. Run on KDE-W (daily-driver) plus Hypr-N (clean wlroots). Sweep target: ~20 minutes.
| ID | Surface | One-line check |
|---|---|---|
| T01 | Launch | App opens; main window renders within ~10s |
| T03 | Tray | Tray icon appears; click toggles window |
| T04 | Window | OS-native frame draws and responds |
| T05 | Input | xdg-open https://claude.ai/... opens in-app |
| T07 | Window | Hybrid topbar renders, every button clicks |
| T08 | Window | Close button hides to tray, doesn't quit |
| T11 | Extensibility | Anthropic & Partners plugin install completes |
| T15 | Auth | Sign-in completes via xdg-open browser handoff |
| T16 | Code tab | Code tab loads (no 403, no blank screen) |
| T17 | Code tab | Folder picker opens via portal/native chooser |
Test corpus snapshot
| Bucket | Count |
|---|---|
Cross-environment functional (T###) |
39 |
Environment-specific functional (S###) |
37 |
| UI surfaces inventoried | 10 |
| Total functional tests | 76 |
For detailed status by ID, see matrix.md.
Automation status
Automation is partially landed. The harness lives at
tools/test-harness/ — twenty Playwright
specs wired (T01, T03, T04, T17, S09, S12, S29-S37, plus four H-prefix
self-tests), thirteen passing on KDE-W and six skipping cleanly per
spec intent. See tools/test-harness/README.md
for the live status table, automation.md for
architectural decisions, and the SIGUSR1 / runtime-attach pattern that
bypasses the app's CDP auth gate.
Grounding sweep + probe
Separate from the test sweep:
runbook.md "Grounding sweep" covers
the workflow for verifying case docs themselves against the live
build on every upstream version bump — static anchor pass plus a
runtime probe (tools/test-harness/grounding-probe.ts)
that captures IPC handler registry, accelerator state, autoUpdater
gate, AX-tree fingerprint, and other claims static analysis can't
disambiguate. Anchor and drift conventions live in
cases/README.md.
The structure remains automation-friendly for new tests:
- Stable test IDs.
T01-T39andS01-S28won't move. New tests append. Sequential, not semantic. - Standardized test bodies. Every functional test has
Severity,Steps,Expected,Diagnostics on failure, andReferencessections. The Steps and Diagnostics fields are scripted-runner-shaped. - Per-element UI checklists. Each UI surface file lists interactive elements in a table — every row is a candidate
webContents.executeJavaScript/xprop/ DBus assertion. - Severity-driven sweeps. Tests with a
runner:field execute viatools/test-harness/orchestrator/sweep.sh; JUnit XML lands inresults/results-${ROW}-${DATE}/junit.xml. Tests without arunner:continue to run manually.
For tests that don't have a runner yet, status updates land in matrix.md by hand after each manual sweep. For tests that do, the automation invocation is the source of truth — see runbook.md.
Conventions
- One PR per sweep result, not per cell change. Bundle a full row update into a single commit titled
test: KDE-W sweep $(date +%F). Reduces matrix-merge noise. - Tested-version pin. Every status update should mention the
claude-desktopupstream version + the project version (v1.3.x+claude...) in the commit. Otherwise a✓from six months ago looks current. - Diagnostics on failure are mandatory. Don't file
✗without the captures listed in the test'sDiagnostics on failureblock. The runbook covers how to capture each. - Issue links go inline. Status cells link directly to the relevant issue/PR.
See runbook.md for the full mechanics.