docs(learnings): add patching-minified-js + CONTRIBUTING (#559) (#574)

PR 1 of 3 for issue #559 — docs and conventions, no behaviour change. - New `docs/learnings/patching-minified-js.md` covering anchor selection, identifier capture (`\w` vs `$`), beautified false-negative trap, whitespace tolerance, replacement-string escaping, idempotency, multi-site coordination, lastIndexOf disambiguation, and the SHA-256-pinned hypothesis-verification recipe. - New `CONTRIBUTING.md` as a navigation hub: scope policy (no net-new features outside Linux-environment parity), upstream routing, subsystem owners, PR checklist, AI-assisted contribution disclosure format, and the patch-script regex intent comment + markdown wrapping conventions. - Fix CLAUDE.md:126 example regex `\w+` → `[$\w]+` (same class of bug the new learnings doc documents). - CLAUDE.md learnings index entry for the new doc. PRs 2 (`verify-cowork-patches.sh` + BATS) and 3 (silent-no-op WARNING retrofits) follow. Refs #559 Co-authored-by: Claude <claude@anthropic.com>
2026-05-17 00:26:21 +03:00 · 2026-05-05 07:15:42 -04:00
parent 0efa67d417
commit ccce3eab37
3 changed files with 418 additions and 1 deletions
--- a/CLAUDE.md
+++ b/CLAUDE.md
@@ -17,6 +17,7 @@ The [`docs/learnings/`](docs/learnings/) directory contains hard-won technical k
 - [`linux-topbar-shim.md`](docs/learnings/linux-topbar-shim.md) — why claude.ai's in-app topbar is missing on Linux, the four gates that hide it, why the upstream `frame:false` + WCO config has unclickable buttons on X11 (Chromium-level implicit drag region), and the resolution: hybrid mode (system frame + UA-spoof shim → stacked layout, full button functionality)
 - [`test-harness-electron-hooks.md`](docs/learnings/test-harness-electron-hooks.md) — why constructor-level `BrowserWindow` wraps are silently bypassed by `frame-fix-wrapper`'s Proxy, and the prototype-method hook pattern that works (used by the Quick Entry test runners)
 - [`test-harness-ax-tree-walker.md`](docs/learnings/test-harness-ax-tree-walker.md) — five non-obvious traps in the v7 fingerprint walker after the AX-tree migration: AX-enable async lag, navigateTo-to-same-URL no-op, claude.ai's flat `dialog>button[]` lists, the `more options for X` per-row shape, and sidebar virtualization vs the lookup-failure threshold
+- [`patching-minified-js.md`](docs/learnings/patching-minified-js.md) — general lessons from maintaining a long-lived patch suite against an actively re-minified upstream: anchor selection (literals over identifiers), the `\w` vs `$` identifier-capture trap, beautified false-negatives, idempotency guards, multi-site coordination, non-unique anchor disambiguation, and the SHA-256-pinned hypothesis-verification recipe

 ## Code Style

@@ -125,7 +126,7 @@ Contributors are listed in chronological order: inspirational projects first (k3
 4. **Extract variable names dynamically** rather than hardcoding them. Shared extraction helpers live in `scripts/patches/_common.sh`. Example:
   ```bash
   # Extract function name from a known pattern
-   TRAY_FUNC=$(grep -oP 'on\("menuBarEnabled",\(\)=>\{\K\w+(?=\(\)\})' app.asar.contents/.vite/build/index.js)
+   TRAY_FUNC=$(grep -oP 'on\("menuBarEnabled",\(\)=>\{\K[$\w]+(?=\(\)\})' app.asar.contents/.vite/build/index.js)
   ```

 5. **Handle optional whitespace** in regex patterns:
--- a/CONTRIBUTING.md
+++ b/CONTRIBUTING.md
@@ -0,0 +1,121 @@
+# Contributing
+
+## Where to find what
+
+- [CLAUDE.md](CLAUDE.md): conventions, build, patches, attribution.
+- [STYLEGUIDE.md](STYLEGUIDE.md): bash style ([style.ysap.sh](https://style.ysap.sh)).
+  Tabs, 80 cols, `[[ ]]`, no `set -e`.
+- [docs/learnings/](docs/learnings/): subsystem deep-dives. Read the
+  relevant entry first.
+- [docs/BUILDING.md](docs/BUILDING.md): local build setup.
+- [docs/DECISIONS.md](docs/DECISIONS.md): architectural choices.
+- [.github/CODEOWNERS](.github/CODEOWNERS): auto-review routing.
+
+## What we accept
+
+We're a repackager, not a fork. Net-new feature PRs default to no: we'd
+own that behaviour across every re-minified upstream release.
+Exception: parity patches for Windows features broken on Linux
+(input methods, tray on Wayland/X11, frame defaults). Always welcome:
+
+- Bug fixes against existing behaviour.
+- Parity patches bringing Linux closer to the Windows build.
+- Packaging, distribution, launcher fixes.
+- Docs, tests, CI improvements.
+
+## What goes upstream, not here
+
+We patch the binary blob; we don't fix application logic inside it.
+If the bug reproduces on Windows, file at
+[anthropics/claude-code](https://github.com/anthropics/claude-code).
+In-app `/bug` and `/feedback` are inert.
+
+| File here                              | File upstream                       |
+|----------------------------------------|-------------------------------------|
+| `apt update` errors, install failures  | Plugin install fails on all OSes    |
+| Tray icon missing on KDE Wayland       | Conversation rendering glitch       |
+| AppImage won't launch on distro X      | MCP server connection drops         |
+| `--doctor` reports wrong diagnosis     | Account / login flow broken         |
+
+## Filing an issue
+
+1. Use the issue template, not freeform.
+2. Paste full `./build.sh --doctor` (or `claude-desktop --doctor`)
+   output. Most-skipped step.
+3. Include distro, DE, session type (Wayland/X11). Most Linux-only
+   bugs trace to one of these.
+4. Reproduce on a clean config: move `~/.config/Claude` aside, relaunch.
+   Stale config causes false positives.
+
+## Patches against upstream
+
+Patches live in `scripts/patches/*.sh`, one per subsystem; `build.sh`
+sources them. Before writing or editing one, read [the
+patching-minified-js learnings doc][pmj]: anchor selection, capture,
+idempotency, beautified-vs-minified gap. Short form: CLAUDE.md §
+Working with Minified JavaScript.
+
+Priority rule: a broken-patch upstream release beats feature work.
+
+## Subsystem owners
+
+CODEOWNERS auto-requests reviews; this list is for human discoverability.
+
+- **@aaddrick**: default. Build, non-Cowork patches, desktop, packaging, docs.
+- **@sabiut**: `tests/`, `scripts/doctor.sh`, test workflows.
+- **@RayCharlizard**: Cowork (`scripts/patches/cowork.sh`,
+  `scripts/cowork-vm-service.js`, `tests/cowork-*.bats`).
+- **@typedrat**: Nix (`flake.nix`, `flake.lock`, `/nix/`).
+
+## Before submitting a PR
+
+- Run `/lint` (or `shellcheck` + `actionlint`). See CLAUDE.md § Linting.
+- Local build: `./build.sh --build appimage --clean no`. Catches
+  patch failures unit tests miss.
+- Branch: `fix/123-description` or `feature/123-description`.
+- PR body links the issue: `Fixes #123` or `Refs #123`.
+- AI-assisted? Add the attribution block (next section).
+
+## AI-assisted contributions
+
+AI-assisted PRs accepted with disclosure. PR descriptions:
+
+```
+---
+Generated with [Claude Code](https://claude.ai/code)
+Co-Authored-By: Claude <model-name> <noreply@anthropic.com>
+XX% AI / YY% Human
+Claude: <what AI did>
+Human: <what human did>
+```
+
+Real model name (e.g., "Claude Opus 4.7"). Honest split. 
+Breakdown lines make the ratio auditable against the diff.
+
+Commits: `Co-Authored-By: Claude <claude@anthropic.com>`. 
+
+Issues/comments:
+`Written by Claude <model-name> via [Claude Code](https://claude.ai/code)`.
+
+## Conventions in this file
+
+### Patch-script regexes
+
+When a patch regex uses whitespace-tolerant constructs (`\s*`,
+`[ \t]*`) between tokens, add an intent comment with whitespace stripped:
+
+```js
+// Intent: VAR.code==="ENOENT"
+const enoentRe = /(\w+)\.code\s*===\s*"ENOENT"/g;
+```
+
+Apply to new patches and to existing regexes when editing for other
+reasons. No churn PRs. Background: [the learnings doc][pmj].
+
+[pmj]: docs/learnings/patching-minified-js.md
+
+### Markdown prose wrapping
+
+Wrap prose at ~80 chars, matching the bash column rule in
+STYLEGUIDE.md. Tables, code blocks, URLs, alt text may exceed when
+breaking hurts readability.
--- a/docs/learnings/patching-minified-js.md
+++ b/docs/learnings/patching-minified-js.md
@@ -0,0 +1,295 @@
+# Patching minified JavaScript
+
+Hard-won lessons from maintaining a long-lived patch suite against an
+actively re-minified upstream. Each section names a failure mode and
+the fix.
+
+The verification recipes below use claude-desktop-debian-specific
+incantations (Claude-Setup.exe, nupkg extraction, `build.sh
+--build appimage`); substitute your own project's fetch/extract/build
+commands as needed.
+
+## Capturing identifiers: `\w` doesn't match `$`
+
+JS identifiers allow `$` and `_`; minifiers freely emit names like
+`$e`, `C$i`, `g$x`. The character class `\w` is `[A-Za-z0-9_]` — it
+does not match `$`. A `(\w+)` against `$e` captures the suffix `e`
+and returns a name that doesn't exist in the file. The failure is
+silent: regex matches, downstream sed runs against a truncated name,
+asar ships broken JS. Three recurrences (PRs #253, #421, #555) before
+the convention stuck.
+
+Use `[$\w]+` (repo convention; `[\w$]+` is equivalent). Strict
+superset of `\w+`, so pre-`$` versions still match. Live at
+`cowork.sh:484-502`:
+
+```bash
+const fsMatch = region.match(/([$\w]+)\.existsSync\(/);
+```
+
+## The beautified false-negative trap
+
+Testing a regex against `build-reference/` is not verification. The
+beautified copy has whitespace the regex doesn't account for.
+
+During PR #555, both `\w+` and `[\w$]+` tested false against the
+beautified file. Shipped minified bytes:
+
+```js
+await new Promise(n=>setTimeout(n,g$x))
+```
+
+Beautified copy:
+
+```js
+await new Promise((n) => setTimeout(n, g$x))
+```
+
+`await new Promise\(([\w$]+)=>\s*setTimeout\(\1,\s*([\w$]+)\)\)` fails
+the beautified version on the parens and spaces around `=>`. Always
+close the loop against shipped bytes.
+
+## Whitespace tolerance: `\s*` vs `[ \t]*`
+
+`\s` matches newlines. A `\s*`-padded pattern is a license to span
+across structural boundaries the original line layout meant to
+keep apart — usually fine on minified bytes (no newlines to span),
+much looser on beautified.
+
+Use `[ \t]*` when the intent is "spaces but stay on this line."
+Reserve `\s*` for crossing structural boundaries on purpose. The
+existing `cowork.sh` patches mix both — `\s*` where the surrounding
+context is bounded enough that newline-spanning is harmless, and
+literal token sequences (`",b:` etc.) when stricter adjacency is
+required.
+
+## Replacement-string escaping: `\1`, `&`, `$1`
+
+A regex can match correctly and still produce corrupted output
+because the *replacement string* has its own metacharacters. Match
+debugging shows green; the asar still ships broken bytes. Three
+flavors:
+
+**sed `&`** — the entire match. `sed 's/foo/&_suffix/'` is fine
+(`foo_suffix`). `sed 's/foo/literal_&_dollar/'` accidentally
+interpolates the match (`literal_foo_dollar`). Escape with `\&` if
+you want a literal ampersand:
+
+```bash
+sed 's/foo/literal_\&_dollar/'   # → literal_&_dollar
+```
+
+**sed `\1`** — backreferences in the replacement. These work as
+expected in BRE/ERE. The footgun is the *pattern* side: in BRE, `$`
+is the end-of-line anchor, so a literal `$` in the search pattern
+needs `\$`. `_common.sh:25` does exactly this for `electron_var`,
+which can be `$e` on newer upstream:
+
+```bash
+electron_var_re="${electron_var//\$/\\$}"
+```
+
+That escaping is for the sed *pattern*, not its replacement.
+
+**JS `String.prototype.replace`: `$1`, `$&`, `$$`** — the JS
+replacement DSL is its own thing. `$&` is the whole match; `$1..$9`
+are capture groups; `$$` is a literal `$`. Plain `$` followed by an
+unrelated char is left alone, but `$&` and `$N` get interpolated:
+
+```js
+code.replace(/foo/g, '$cost')   // → '$cost' (safe, no special)
+code.replace(/foo/g, '$&_x')    // → 'foo_x' ($& = match)
+code.replace(/foo/g, '$$cost')  // → '$cost' (escaped)
+```
+
+If the replacement is an injected JS snippet that happens to
+contain `$1` or `$&` (template literals, jQuery, regex source), JS
+will eat them. Use `$$` to escape, or build the string with
+concatenation so `$` never sits next to a digit or `&`.
+
+## Idempotency: a re-run must be byte-identical
+
+Without it, CI re-runs and partial builds layer mutations until
+something breaks visibly. Three patterns:
+
+**Re-key the guard to post-rename names.** `tray.sh:174-180` keys its
+fast-path guard on the post-rename
+`${tray_var}.setImage(${electron_var}.nativeImage.createFromPath(${path_var}))`
+sequence, so the second run recognizes its own first-run output.
+
+**Negative lookbehind, inline.** `cowork.sh:102-106` — the
+`(?<!...)` prevents a second match against text the first run
+already wrapped:
+
+```js
+const logRe = new RegExp(
+    '(?<!\\|\\|process\\.platform==="linux"\\))' +
+    win32Var.replace(/[.*+?^${}()|[\]\\]/g, '\\$&') +
+    '(\\s*\\?\\s*"vmClient \\(TypeScript\\)")'
+);
+```
+
+**Explicit `code.includes(...)` check.** `cowork.sh:227-230`
+separates "anchor missing" from "already applied" in the build log:
+
+```js
+} else if (code.includes(
+    'getDownloadStatus(){return process.platform==="linux"?'
+)) {
+    console.log('  Cowork auto-nav suppression already applied');
+}
+```
+
+PR #436 verified by running the patch twice and diffing the output.
+
+## Anchor selection: prefer literals over identifiers
+
+The above sections cover making a patch work on first run. This one
+covers keeping it working release after release. A patch can apply
+cleanly today and silently no-op next month.
+
+Minified identifiers churn every release. Developer strings —
+property names, log messages, IPC channel names — survive
+minification untouched (true for the upstream bundler used here; a
+`--mangle-props` build would invalidate property-name anchors).
+Anchor on those. A hardcoded minified name silently no-ops the next
+release; the build log still says "patched."
+
+Three patterns from the suite:
+
+- **Quick-window (PR #390, fixing #144).** Original patch:
+  `s/e.hide()/e.blur(),e.hide()/`. When `e` became `Sa`, it no-oped.
+  The rewrite anchors on `"pop-up-menu"` (`quick-window.sh:17`), the
+  `isWindowFocused` property name (`quick-window.sh:60`), and the
+  `[QuickEntry]` log strings (`quick-window.sh:88-91`).
+- **Cowork spawn (PR #436).** Anchored on `,VAR.mountConda)`
+  (`cowork.sh:741`) — unique to the 12-arg call path, absent from the
+  10-arg one-shot. Asserts match count is exactly 1 and bails
+  otherwise (`cowork.sh:744`), so a future second caller surfaces
+  immediately.
+- **Tray (PR #515).** `tray.sh:16` uses the literal `"menuBarEnabled"`
+  as a *position anchor*, then captures the surrounding minified
+  identifier (`\K\w+(?=\(\)\})`) as the actual patch target. Two
+  stages: stable literal → derived identifier. Every other tray name
+  chains off that single dynamic extraction.
+
+The lesson is about finding stable points to anchor on, not about
+what gets patched. The patch target is usually a minified identifier;
+the *anchor* should be a developer string nearby.
+
+## Multi-site coordinated patches: surface partial application
+
+Site 1 patches, site 2 misses, the asar ships half-wired. The
+pattern: each sub-patch sets a per-site boolean flag on success,
+then a single named WARNING fires if any flag is false:
+
+```js
+if (!siteADone || !siteBDone) {
+    console.log('  WARNING: <ticket> partial — siteA=' + siteADone +
+        ' siteB=' + siteBDone + '; <fallback consequence>');
+}
+```
+
+CI greps the build log for `WARNING:` and fails the build. That
+catches the half-patched state even when individual sub-patches each
+log "applied." See `cowork.sh:759-763` for a real instance —
+three-site `sharedCwdPath` forwarding, daemon fallback if any site
+misses.
+
+## Disambiguating non-unique anchors: lastIndexOf over indexOf
+
+A string anchor can appear in source maps, dead exports, or
+chunk-merged duplicates alongside the live code. `indexOf` returns
+the first; that may be wrong.
+
+`cowork.sh:264` uses `lastIndexOf(serviceErrorStr)` to bias toward
+appended code. On 1.5354.0 the string occurs once, so the change is
+a no-op there — the defense is for a future upstream that
+reintroduces the string in onboarding text or sample data far from
+the live retry-loop site.
+
+When neither side is reliable, narrow the search region first.
+`cowork.sh:269-276` does this for the ENOENT check, scanning only a
+300-character window before the error string.
+
+## Verifying a hypothesis before shipping a fix
+
+Pull the pinned URL and SHA from `scripts/setup/detect-host.sh`,
+download, verify hash, extract without beautifying, and test the
+regex against the minified bytes:
+
+```bash
+url=$(grep -oP "claude_download_url='\K[^']+" \
+    scripts/setup/detect-host.sh | head -1)
+expected=$(grep -oP "claude_exe_sha256='\K[^']+" \
+    scripts/setup/detect-host.sh | head -1)
+mkdir -p /tmp/verify && cd /tmp/verify
+wget -q -O Claude-Setup.exe "$url"
+echo "$expected  Claude-Setup.exe" | sha256sum -c -
+
+7z x -y Claude-Setup.exe -o exe
+nupkg=$(find exe -name 'AnthropicClaude-*.nupkg' | head -1)
+7z x -y "$nupkg" -o nupkg
+npx asar extract nupkg/lib/net45/resources/app.asar app
+
+node -e '
+  const fs = require("fs");
+  const code = fs.readFileSync(
+    "app/.vite/build/index.js", "utf8");
+  const re = /await new Promise\(([\w$]+)=>\s*setTimeout\(\1,\s*([\w$]+)\)\)/;
+  const m = code.match(re);
+  console.log(m ? `MATCH: ${m[0]}` : "NO MATCH");
+'
+```
+
+`NO MATCH` means the regex is wrong. Verifying the SHA defends against
+stale URL pinning or server-side binary swap.
+
+## End-to-end verification (post-build)
+
+Four layers: build log, syntactic validity, asar markers, runtime.
+
+1. Check the patch-count line:
+
+   ```bash
+   ./build.sh --build appimage --clean no 2>&1 | tee build.log
+   grep -E 'Applied [0-9]+ cowork patches' build.log
+   ```
+
+   Healthy 1.5354.0 build: `Applied 12 cowork patches`. A lower
+   number, or any `WARNING:` in the cowork section, is a half-patched
+   asar.
+
+2. `node --check` on the patched `index.js` — catches malformed
+   replacements that serialize but don't parse (PR #436 used this in
+   dry-run validation):
+
+   ```bash
+   node --check test-build/.../app.asar.contents/.vite/build/index.js
+   ```
+
+3. Static-grep the shipped asar for the 9 cowork markers from PR
+   #555. A `verify-cowork-patches.sh` helper that automates this is
+   planned (issue #559 D6); prefer it once it lands.
+
+4. Launch the AppImage and check runtime state:
+
+   ```bash
+   tail -20 ~/.config/Claude/logs/cowork_vm_daemon.log
+   ls -la "${XDG_RUNTIME_DIR}/cowork-vm-service.sock"
+   ss -lpx | grep cowork-vm-service.sock
+   ```
+
+   Daemon log should have `lifecycle startup` and `lifecycle
+   listening`; socket should exist and be owned by the
+   `cowork-vm-service.js` process listed by `ss`.
+
+## Cross-references
+
+- `tray-rebuild-race.md` "Resilience to minifier churn" — prior art
+  for dynamic extraction across a six-variable patch site and the
+  post-rename idempotency-guard pattern.
+- `plugin-install.md` "Getting the Minified Source for Any Shipped
+  Version" — the `reference-source.tar.gz` release asset gives
+  beautified asar contents of any prior version for diffing. Useful
+  for spotting when an identifier renamed and which version did it.