Updates agent definitions, learnings, CLAUDE.md, and BUILDING.md so
path references point at the new module files instead of the old
monolithic build.sh.
Agent definitions:
.claude/agents/issue-triage.md — table of per-category
investigation paths now points at scripts/patches/*.sh and
scripts/packaging/*.sh instead of "build.sh (search patch_X)".
.claude/agents/electron-linux-specialist.md — patching-functions
table now includes each function's file location; directory tree
illustration reflects the new scripts/ layout.
Documentation:
CLAUDE.md — "Working with Minified
JavaScript" section points at scripts/patches/*.sh; frame-fix
injection attributed to scripts/patches/app-asar.sh; the
version-bump checks now grep scripts/setup/detect-host.sh.
docs/BUILDING.md — automated version
detection paragraph now mentions scripts/setup/detect-host.sh as
the file that holds the URLs.
docs/learnings/cowork-vm-daemon.md — Patch 6 pointer now
says scripts/patches/cowork.sh; line-number references dropped in
favour of anchor-based search (line numbers drift between releases).
docs/learnings/plugin-install.md — Key Files section
points at scripts/patches/cowork.sh for patch_cowork_linux.
Historical changelog-style references (e.g. docs/cowork-linux-handover.md
describing what was "added to build.sh" during initial cowork work)
are intentionally left unchanged — they describe a point-in-time state
of the codebase.
Co-Authored-By: Claude <claude@anthropic.com>
7.4 KiB
Cowork VM Daemon — Learnings
Architecture Overview
Cowork mode on Linux uses a custom Node.js daemon
(scripts/cowork-vm-service.js)
that replaces the Windows cowork-vm-service. The Electron app talks to
it over a Unix domain socket at
$XDG_RUNTIME_DIR/cowork-vm-service.sock using length-prefixed JSON —
the same wire format as the Windows named pipe.
The daemon is forked by Patch 6 in the
patch_cowork_linux() function (scripts/patches/cowork.sh), which
injects auto-launch code into the Electron app's retry loop for the
VM-service connection.
Daemon Lifecycle
- First connect attempt: the app tries
$XDG_RUNTIME_DIR/cowork-vm-service.sock. ENOENT/ECONNREFUSED: retry loop catches the error (theECONNREFUSEDbranch is Linux-only, added by Patch 6 step 1 so stale sockets don't bypass retry).- Auto-launch (Patch 6 step 2): the injected code forks the daemon
via
child_process.fork()withdetached:true, stdio redirected to~/.config/Claude/logs/cowork_vm_daemon.log. - Spawn cooldown:
FUNC._lastSpawn = Date.now()— subsequent iterations only re-fork after 10 s have elapsed. This replaces the old one-shot_svcLaunchedboolean so the retry loop can recover after mid-session daemon death (issue #408). - Retry: the loop waits and reconnects, which now succeeds.
Issue #408 — Daemon Recovery
Root cause (one-shot guard)
Before the fix, Patch 6 injected:
process.platform==="linux" && !FUNC._svcLaunched && (
FUNC._svcLaunched = true,
/* fork daemon */
)
FUNC._svcLaunched was set on the first successful spawn and never
cleared, so when the daemon died mid-session the retry loop saw the
guard already set and skipped the re-fork. The client looped forever
on connect ENOENT.
Fix (rate-limited respawn)
Timestamp-based cooldown replaces the boolean:
process.platform==="linux" &&
(!FUNC._lastSpawn || Date.now() - FUNC._lastSpawn > 1e4) &&
(FUNC._lastSpawn = Date.now(), /* fork daemon */)
10 s is short enough that the retry loop (which sleeps on the order of seconds between iterations) recovers promptly after a crash, and long enough that a crash-looping daemon can't turn into a fork bomb.
Secondary cause (preserved images block recovery)
The app's _ue() / deleteVMBundle() function deletes a whitelist of
reinstall files on auto-reinstall. Upstream deliberately preserves
sessiondata.img and rootfs.img.zst to avoid re-download.
On 1.2773.0 those preserved files put the daemon into an unstartable
state that persists across app restart and OS reboot. The client's
symptom is connect ENOENT (daemon never got far enough to create the
socket) rather than ECONNREFUSED (daemon started, crashed, socket
stayed). RayCharlizard (2026-04-16) confirmed that manually wiping
~/.config/Claude/vm_bundles/claudevm.bundle/ is required to recover,
even after rolling back the AppImage to a known-good version.
Fix (extend delete list — Patch 6b)
scripts/patches/cowork.sh now matches the const NAME=["rootfs.img",...] array at
module level and appends "sessiondata.img" and "rootfs.img.zst" if
they're not already present. The auto-reinstall path now wipes these
too. Trade-off: the next successful startup re-downloads/re-extracts
these files. Acceptable because auto-reinstall only runs after startup
has already failed — biasing toward recovery over re-download
avoidance is correct.
Not included in the delete list: ~/.config/Claude/claude-code-vm/.
That's CLI-binary storage (2.1.x/claude), unrelated to the VM
daemon, and has its own version-check logic at this.vmStorageDir
inside the app. Wiping it would just force a slow re-download of the
CLI on every auto-reinstall.
Silent Death — Now Logged
Before the fix the daemon was forked with stdio:"ignore", and its
internal log() function was gated by COWORK_VM_DEBUG=1, so a crash
left no trace anywhere.
Two changes together make crashes visible:
- Patch 6 (client side) redirects the forked daemon's stdout +
stderr to
~/.config/Claude/logs/cowork_vm_daemon.log. Any Node-level crash dump (uncaught exception pre-handler, native assertion, etc.) now lands in that file. cowork-vm-service.js(daemon side) addslogLifecycle()— an always-on writer that bypassesDEBUGfor startup, SIGTERM, SIGINT,uncaughtException,unhandledRejection, andexitevents. It also proactivelymkdirSync's the log directory so the first write doesn't get swallowed if the daemon is the first thing writing under~/.config/Claude/logs/.
Interpreting the log after a failure:
| Last line | Diagnosis |
|---|---|
lifecycle startup ... + gap + no further entries |
SIGKILL'd (OOM killer, kill -9, etc.) — no handler fires |
lifecycle startup + lifecycle listening + nothing else |
Daemon running fine but died by signal with no handler (rare; check dmesg) |
lifecycle uncaughtException ... |
JS-level crash, stack is in the log entry |
lifecycle SIGTERM received + lifecycle exit code=0 |
Clean app-initiated shutdown |
No startup entry at all |
fork() didn't complete; check launcher.log for [cowork-autolaunch] errors |
Key Files
scripts/patches/cowork.shinsidepatch_cowork_linux()— Patch 6 (auto-launch + stdio pipe + rate limiter) and Patch 6b (reinstall array extension). Search for# Patch 6anchors; line numbers drift between upstream releases.scripts/cowork-vm-service.jslines ~49-86 — log infrastructure, includinglogLifecycle().scripts/cowork-vm-service.jslines ~2399-2440 — signal handlers and entry point.scripts/launcher-common.sh—--doctorchecks.docs/cowork-linux-handover.md— architecture reference.
Diagnostic Commands
# Is the daemon running?
pgrep -af cowork-vm-service
# Socket present?
ls -la "${XDG_RUNTIME_DIR:-/tmp}/cowork-vm-service.sock"
# Watch lifecycle events as they happen
tail -f ~/.config/Claude/logs/cowork_vm_daemon.log
# Look for the last startup / exit pair
grep -E 'lifecycle (startup|exit|SIGTERM|SIGINT|uncaughtException|unhandledRejection)' \
~/.config/Claude/logs/cowork_vm_daemon.log | tail -20
# Find any orphan sockets
lsof -U 2>/dev/null | grep -iE 'cowork|claude'
# Force a respawn test: kill daemon, watch client log for reconnect
pkill -9 -f cowork-vm-service.js
tail -f ~/.cache/claude-desktop-debian/launcher.log
# Find the daemon script inside a mounted AppImage
find /tmp -path '*claude*cowork-vm-service*' 2>/dev/null
Testing Notes
- Host-direct (
COWORK_VM_BACKEND=host): no isolation, direct execution. Matches the--doctor"host-direct (no isolation, via override)" line. This is what issue #408 was reported against. - Bwrap (
COWORK_VM_BACKEND=bwrap): Bubblewrap sandbox; requiresbwrapinstalled. - KVM (
COWORK_VM_BACKEND=kvm): full VM; requires QEMU, KVM, rootfs image. - Debug (
COWORK_VM_DEBUG=1orCLAUDE_LINUX_DEBUG=1): verbose logging via the existinglog()path.logLifecycle()is always on regardless of this flag. - Force-cooldown test: kill the daemon, relaunch a Cowork session within 10 s — the guard should block that single retry. Wait 10 s and retry: should succeed. Confirms the cooldown boundary.