Plan: APT/DNF binary distribution via Cloudflare Worker → GitHub Releases
Status: Draft (post-contrarian-review revision 3, domain locked in)
Issue: #493
Trigger: run 24811974733 — update-apt-repo push rejected because .deb exceeds GitHub's 100 MB per-file cap
Relationship to #449: This plan addresses the forward component of #449's gh-pages clone-bloat (no new .deb accumulation after Phase 4b). Backfill — shrinking the existing history — is a mandatory follow-up via one-time orphan-reset of gh-pages, not optional. The previously-drafted gh-pages-split-plan.md is deleted in this branch; the split-into-separate-repo machinery is no longer required.
Problem
apt update users are pinned to v2.0.1+claude1.3561.0 because the v2.0.2+claude1.3883.0 .deb is 129.81 MB and git push to gh-pages is rejected by GitHub's 100 MB hard cap. Shrinking experiments on a throwaway branch got the .deb to ~113 MB compressed; the floor for a working build is ~110 MB given Electron + libs + ion-dist + smol-bin VHDX + app.asar are all individually irreducible. Shrinking is not a viable path under the cap.
Approach
Front the existing GitHub Pages apt/dnf repo with a Cloudflare Worker on a custom domain. The Worker passes metadata through to gh-pages and 302-redirects pool requests to GitHub Release assets (which already exist — Create Release succeeds every tag). Existing user sources.list URLs keep working transparently via GitHub Pages' auto-301 from *.github.io to the configured custom domain.
Architecturally important: the Worker only emits redirect responses (a few hundred bytes). The .deb bytes themselves flow directly from objects.githubusercontent.com to the user, never crossing Cloudflare. This matters for both TOS posture (see Phase 0) and bandwidth economics.
Reference architecture: Cloudflare's own apt/yum repo at pkg.cloudflare.com uses a related pattern (R2 + Worker) to ship cloudflared to Debian/Ubuntu/RHEL/CentOS users.
Decisions
| Decision | Value |
|---|---|
| Custom domain | claude-desktop-debian.dev (registered at Cloudflare Registrar) |
| Cloudflare account | Free tier; account email aliased to cf-pkg@claude-desktop-debian.dev |
| Worker route (production) | pkg.claude-desktop-debian.dev/* |
| Worker route (staging, initial) | pkg-staging.claude-desktop-debian.dev/* |
| Worker source | worker/ directory in this repo, version-controlled, deployed via CI |
| Worker deploy creds | CLOUDFLARE_API_TOKEN + CLOUDFLARE_ACCOUNT_ID as repo secrets |
| RPM filename regex | Verify against actual CI-produced filename in Phase 1 |
| #449 follow-up | One-time orphan-reset of gh-pages after Phase 4b — separate, smaller PR, mandatory |
| Combine with gh-pages-split work | No — that complexity is no longer needed once .deb files stop accumulating |
Architecture
existing user with old sources.list
│
▼
github.io/.../foo.deb
↓ 301 (Pages auto-redirect from CNAME file)
pkg.claude-desktop-debian.dev/.../foo.deb
↓ Worker route handler
├─ /dists/*, /KEY.gpg, /index.html, /repodata/* → fetch() from gh-pages origin (200)
└─ /pool/.../*.deb, /rpm/*/*.rpm → 302 to github.com/.../releases/download/<tag>/<asset>
↓ 302 to objects.githubusercontent.com
↓ 200 (the binary, direct from GitHub CDN)
apt's default redirect cap is 5; max chain length here is 3.
Worker code (initial)
const ORIGIN = 'https://aaddrick.github.io/claude-desktop-debian';
const RELEASES = 'https://github.com/aaddrick/claude-desktop-debian/releases/download';
const DEB_RE = /^\/pool\/main\/c\/claude-desktop\/(?<asset>claude-desktop_(?<claudeVer>[^-]+)-(?<repoVer>[^_]+)_(?:amd64|arm64)\.deb)$/;
const RPM_RE = /^\/rpm\/(?:x86_64|aarch64)\/(?<asset>claude-desktop-(?<claudeVer>[\d.]+)-(?<repoVer>[\d.]+)-\d+\.[^.]+\.rpm)$/;
export default {
  async fetch(request) {
    const url = new URL(request.url);
    const m = DEB_RE.exec(url.pathname) || RPM_RE.exec(url.pathname);
    if (m) {
      // Named groups avoid the positional-index trap: in RPM_RE the arch
      // capture would otherwise shift the filename to m[2], not m[1].
      const { asset, claudeVer, repoVer } = m.groups;
      // Tag scheme: v<repoVer>+claude<claudeVer>
      return Response.redirect(`${RELEASES}/v${repoVer}+claude${claudeVer}/${asset}`, 302);
    }
    // Metadata (dists/, KEY.gpg, repodata/, ...) passes through to the gh-pages origin
    return fetch(ORIGIN + url.pathname + url.search, request);
  }
};
RPM filename format confirmed against existing Release assets: claude-desktop-1.3883.0-2.0.2-1.x86_64.rpm (note the -1 release number after the version).
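The pool-path-to-Releases-URL mapping can be mirrored offline for quick verification. A bash sketch using the same capture structure as DEB_RE/RPM_RE; the `redirect_for` helper is illustrative, not the deployed code (which lives in worker/src/worker.js):

```shell
#!/usr/bin/env bash
# Mirror of the Worker's pool-path -> Releases-URL mapping, runnable offline.
# redirect_for is a hypothetical helper for illustration only.
set -euo pipefail
RELEASES='https://github.com/aaddrick/claude-desktop-debian/releases/download'

redirect_for() {
  local path=$1 asset claudeVer repoVer
  if [[ $path =~ ^/pool/main/c/claude-desktop/(claude-desktop_([^-]+)-([^_]+)_(amd64|arm64)\.deb)$ ]]; then
    asset=${BASH_REMATCH[1]}; claudeVer=${BASH_REMATCH[2]}; repoVer=${BASH_REMATCH[3]}
  elif [[ $path =~ ^/rpm/(x86_64|aarch64)/(claude-desktop-([0-9.]+)-([0-9.]+)-[0-9]+\.[^.]+\.rpm)$ ]]; then
    # Note the arch capture comes first here, so the asset is match group 2
    asset=${BASH_REMATCH[2]}; claudeVer=${BASH_REMATCH[3]}; repoVer=${BASH_REMATCH[4]}
  else
    return 1  # not a pool path; the Worker would pass this through to gh-pages
  fi
  echo "${RELEASES}/v${repoVer}+claude${claudeVer}/${asset}"
}

redirect_for /pool/main/c/claude-desktop/claude-desktop_1.3883.0-2.0.2_amd64.deb
redirect_for /rpm/x86_64/claude-desktop-1.3883.0-2.0.2-1.x86_64.rpm
```

Both calls print the v2.0.2+claude1.3883.0 asset URL, matching the confirmed filename formats above.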
CI changes
Two surgical edits to .github/workflows/ci.yml. The first adds a step to both update-apt-repo and update-dnf-repo jobs to delete binary files from the working tree after metadata generation, before commit. The destructive action is gated on a positive liveness probe — it only fires if the production Worker is actually responding. This makes the gating self-protecting: a misconfigured env var, accidentally-true condition, or premature merge cannot strip binaries before the Worker is genuinely live.
- name: Add packages to repository
  working-directory: apt-repo
  run: |
    # ... existing reprepro includedeb loop, unchanged ...
+ - name: Strip binaries from pool (gated on Worker liveness)
+   working-directory: apt-repo
+   env:
+     WORKER_DOMAIN: pkg.claude-desktop-debian.dev
+   run: |
+     probe_url="https://${WORKER_DOMAIN}/dists/stable/InRelease"
+     if curl -fsI --max-time 10 "$probe_url" >/dev/null; then
+       echo "Worker live at ${WORKER_DOMAIN}; stripping binaries from pool"
+       find pool -type f -name '*.deb' -delete
+     else
+       echo "Worker not responding at ${WORKER_DOMAIN}; preserving .debs in pool"
+       echo "(this is expected before Phase 4a; an error after Phase 4a)"
+     fi
- name: Commit and push changes
dists/.../Packages retains Filename: pool/main/c/claude-desktop/foo.deb — the Worker intercepts that path. Signed InRelease is unaffected because signatures are over content, not URL.
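Because Filename: stays pool-relative, checking that the recorded path still matches the Worker's intercept pattern is cheap. A sketch; the stanza below is an illustrative sample, not pulled from the live repo:

```shell
#!/usr/bin/env bash
# Sanity-check that a Packages stanza's Filename: path is one the Worker's
# DEB_RE will intercept. The stanza is an illustrative sample.
set -euo pipefail

stanza='Package: claude-desktop
Version: 1.3883.0-2.0.2
Architecture: amd64
Filename: pool/main/c/claude-desktop/claude-desktop_1.3883.0-2.0.2_amd64.deb'

fn=$(sed -n 's/^Filename: //p' <<<"$stanza")
if [[ "/$fn" =~ ^/pool/main/c/claude-desktop/claude-desktop_[^-]+-[^_]+_(amd64|arm64)\.deb$ ]]; then
  echo "intercepted: /$fn"
else
  echo "passed through: /$fn"
fi
```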
The second adds a smoke-test step at the end of each repo-update job that walks the redirect chain hop-by-hop in expected order and asserts size match against the GitHub Releases asset. Substring-grep on collected Location: headers is order-blind and would pass on a misconfigured Worker that 302'd straight to the wrong tag's asset; we walk the chain explicitly:
- name: Smoke test published deb (ordered chain + size)
  env:
    WORKER_DOMAIN: pkg.claude-desktop-debian.dev # the registered custom domain
    GH_TOKEN: ${{ github.token }}
  run: |
    deb_name="claude-desktop_${CLAUDE_VERSION}-${REPO_VERSION}_amd64.deb"
    deb_url="https://aaddrick.github.io/claude-desktop-debian/pool/main/c/claude-desktop/${deb_name}"
    # Wait for propagation; fail after 5 min instead of cargo-cult sleep
    deadline=$((SECONDS + 300))
    until curl -fsI --max-time 10 "$deb_url" -o /dev/null; do
      [[ $SECONDS -gt $deadline ]] \
        && { echo "::error::Reachability timeout"; exit 1; }
      sleep 10
    done
    # Walk the redirect chain hop-by-hop, asserting each hop's
    # Location matches the expected pattern, in order.
    # Patterns are bash extended regexes anchored at ^; dots in the
    # GitHub hostnames are escaped so they match literally.
    expected_hops=(
      "https://${WORKER_DOMAIN}/"
      "https://github\.com/aaddrick/claude-desktop-debian/releases/download/v${REPO_VERSION}\+claude${CLAUDE_VERSION}/"
      "https://objects\.githubusercontent\.com/"
    )
    url="$deb_url"
    for i in "${!expected_hops[@]}"; do
      # Single request per hop: status code and Location come from the same response
      read -r hop_status redirect_url < <(curl -s -o /dev/null -w '%{http_code} %{redirect_url}\n' "$url")
      echo "Hop $i: ${hop_status} ${url} -> ${redirect_url}"
      [[ "$hop_status" =~ ^30[12]$ ]] \
        || { echo "::error::Hop $i expected 301/302, got ${hop_status}"; exit 1; }
      [[ "$redirect_url" =~ ^${expected_hops[$i]} ]] \
        || { echo "::error::Hop $i mismatch: expected ${expected_hops[$i]}, got ${redirect_url}"; exit 1; }
      url="$redirect_url"
    done
    # Fetch the file (now that we trust the chain)
    curl -fsSL -o /tmp/smoke.deb "$deb_url"
    file /tmp/smoke.deb | grep -q 'Debian binary package' \
      || { echo "::error::Not a valid Debian package"; exit 1; }
    # Size match against the Releases asset (catches truncation,
    # wrong-asset redirects, middleware that rewrites Content-Length)
    asset_size=$(gh release view "v${REPO_VERSION}+claude${CLAUDE_VERSION}" \
      --repo aaddrick/claude-desktop-debian \
      --json assets --jq ".assets[] | select(.name == \"${deb_name}\") | .size")
    local_size=$(stat -c %s /tmp/smoke.deb)
    [[ "$asset_size" == "$local_size" ]] \
      || { echo "::error::Size mismatch: ${local_size} vs ${asset_size}"; exit 1; }
    echo "Smoke test passed: ordered chain validated, file matches Releases asset"
The DNF smoke test is the same shape: same expected_hops ordering (Pages 301 → Worker 302 → objects.githubusercontent.com), but the URL uses the RPM pool path (/rpm/x86_64/claude-desktop-${CLAUDE_VERSION}-${REPO_VERSION}-1.x86_64.rpm), filename validation uses rpm -qpi /tmp/smoke.rpm instead of file ... | grep Debian, and the asset name in gh release view uses the RPM pattern.
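The ordered-hop assertion itself is just an indexed regex walk. A self-contained sketch; the observed URLs below are illustrative samples standing in for the %{redirect_url} values curl would report against live hosts:

```shell
#!/usr/bin/env bash
# Offline sketch of the ordered-chain assertion: each observed redirect URL
# must match the expected pattern at the same index. Sample URLs stand in
# for live curl responses; no network access is performed.
set -euo pipefail

WORKER_DOMAIN=pkg.claude-desktop-debian.dev
REPO_VERSION=2.0.2
CLAUDE_VERSION=1.3883.0

expected_hops=(
  "https://${WORKER_DOMAIN}/"
  "https://github\.com/aaddrick/claude-desktop-debian/releases/download/v${REPO_VERSION}\+claude${CLAUDE_VERSION}/"
  "https://objects\.githubusercontent\.com/"
)
# Illustrative sample chain (Pages 301 target, Worker 302 target, GitHub CDN)
observed_hops=(
  "https://${WORKER_DOMAIN}/pool/main/c/claude-desktop/claude-desktop_${CLAUDE_VERSION}-${REPO_VERSION}_amd64.deb"
  "https://github.com/aaddrick/claude-desktop-debian/releases/download/v${REPO_VERSION}+claude${CLAUDE_VERSION}/claude-desktop_${CLAUDE_VERSION}-${REPO_VERSION}_amd64.deb"
  "https://objects.githubusercontent.com/github-production-release-asset/sample"
)

for i in "${!expected_hops[@]}"; do
  [[ ${observed_hops[$i]} =~ ^${expected_hops[$i]} ]] \
    || { echo "Hop $i mismatch: ${observed_hops[$i]}"; exit 1; }
done
echo "ordered chain ok"
```

A swapped hop order, or a 302 straight to the wrong tag's asset, fails at the first mismatched index, which is exactly what the substring-grep approach could not detect.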
Phases
Each phase has a hard exit criterion. Don't progress until met.
Phase 0 — Pre-work (manual, one-time)
Infrastructure:
- Register domain (~$10–15/yr) at a registrar that supports auto-renewal
- Configure auto-renewal with a payment method that won't expire in the next 5 years
- Create Cloudflare account (or audit existing one); add domain with proxied DNS
Bus factor — accepted risk, with mitigations (replacing the earlier "≥2 maintainers reachable" requirement, which was unrealistic for a solo-maintained project):
The honest reality: @aaddrick is the sole maintainer for everything outside cowork (@RayCharlizard) and nix (@typedrat). Neither collaborator is a candidate for shared Cloudflare or registrar credentials, and pretending otherwise is checklist theatre. So the bus factor is 1, and the mitigation strategy is to make recovery from a future maintainer's loss tractable, not to fictionally distribute credentials today:
- Email forwarding: Cloudflare account and registrar email both forward to a personal backup mailbox (e.g., a Gmail filter rule into a separate folder), so account-recovery emails don't land in a dead inbox if the primary mail provider becomes unreachable
- Auto-renewal: registrar configured with auto-renew on a credit card that doesn't expire in the next 5 years
- CI-only deploys: wrangler credentials live as repo secrets (CLOUDFLARE_API_TOKEN, CLOUDFLARE_ACCOUNT_ID), never on a single workstation. Deploys happen via CI from any pushed commit, not from aaddrick's laptop. This eliminates the "lost workstation" failure mode without requiring a second human
- Recovery runbook: docs/learnings/apt-worker-architecture.md (created in Phase 5) documents which Cloudflare account and which registrar own what, plus exact steps for a future maintainer to take over (rotate API token, point registrar contact at new email, update DNS if migrating accounts)
Cloudflare API token scopes (for the CLOUDFLARE_API_TOKEN repo secret).
Recommended path: use the "Edit Cloudflare Workers" template in Cloudflare's Custom Token creation UI. It bundles all required permissions plus Workers KV / R2 Storage edit (broader than strictly needed today, harmless if you don't use those resources, and avoids needing to rotate the token if you add KV/R2 later). Cloudflare maintains the template and updates it as API changes happen.
Minimum-viable explicit alternative (current as of 2026-04, having survived a recent dropdown rename — the previously-documented Zone:Zone:Read is no longer the right label):
- Account → Workers Scripts → Edit (deploy Worker code)
- Account → Account Settings → Read (wrangler validates account context during deploy)
- Zone → Workers Routes → Edit (bind Worker to pkg.claude-desktop-debian.dev/*)
A token missing Workers Routes:Edit will deploy the Worker successfully but fail silently to bind the route — the Worker will exist but receive no traffic. Phase 3's post-deploy probe catches this.
wrangler.toml shape (committed in worker/):
name = "claude-desktop-apt-redirect"
main = "src/worker.js"
compatibility_date = "2026-04-22"
account_id = "<from CLOUDFLARE_ACCOUNT_ID secret at deploy time>"
routes = [
{ pattern = "pkg.claude-desktop-debian.dev/*", zone_name = "claude-desktop-debian.dev" }
]
TOS review (completed during planning, no action needed):
- Cloudflare's old "Section 2.8" (no non-HTML content on free plans) was removed in October 2025
- Our pattern only routes redirect responses through Cloudflare. Binary bytes flow directly from objects.githubusercontent.com to the user; Cloudflare never sees the .deb bytes
- Reselling / proxying-as-service restrictions don't apply (we're not providing service to third parties; we're routing our own users to our own binaries)
- Conclusion: no known TOS conflict for this use case. pkg.cloudflare.com is a Cloudflare-owned precedent, not a guarantee that third-party use is blessed; if Cloudflare ever suspends the account, the documented fallback (split-package or commercial CDN) is the recovery path
GitHub Releases dependency review (completed during planning):
- Release asset URL format /releases/download/<tag>/<asset> is documented as stable
- Content-Disposition headers are NOT guaranteed stable — irrelevant to us (we use the URL path)
- Auto-generated source code zip URLs are unstable — irrelevant (we don't use those)
- Unauthenticated per-IP rate limits on *.githubusercontent.com rolled out in 2025; users don't share a quota
- Per-account egress throttling can return 503 under unusual load — heartbeat (Phase 5) catches this
Exit: domain resolves through Cloudflare; auto-renewal configured; account email forwards to backup mailbox; CLOUDFLARE_API_TOKEN (with all three required scopes) + CLOUDFLARE_ACCOUNT_ID stored as repo secrets; worker/wrangler.toml drafted.
Phase 1 — Worker dev (locally, no production traffic)
- Worker code in a new top-level worker/ directory (will be CI-deployed in later phases)
- wrangler dev runs locally
- curl localhost:8787/dists/stable/InRelease returns gh-pages content unchanged
- curl -L localhost:8787/pool/main/c/claude-desktop/claude-desktop_1.3561.0-2.0.1_amd64.deb lands on the actual published .deb via the 302 chain
- curl -L localhost:8787/rpm/x86_64/claude-desktop-1.3883.0-2.0.2-1.x86_64.rpm lands on the actual published .rpm (the v2.0.2 release already has RPM assets, so this is verifiable today)
Exit: both curl checks succeed against the previously-published version; RPM regex confirmed against the real -1 release-numbered filename format.
Phase 2 — Test domain validation (broad container matrix)
- Deploy Worker to pkg-staging.claude-desktop-debian.dev/*, no production traffic
- Container matrix expanded beyond happy-path distros to catch real-world configurations:
| Container | Why |
|---|---|
| debian:stable | Baseline |
| ubuntu:lts | Baseline |
| debian:testing | Catches early apt regressions |
| fedora:latest | DNF baseline |
| rockylinux:9 | RHEL-family compat |
| debian:stable + apt-cacher-ng | Caching proxy in front of apt — RFC says don't cache 302s, in practice some configs do |
| debian:stable --network with IPv6-only | Confirm pkg.claude-desktop-debian.dev and objects.githubusercontent.com resolve AAAA |
For each container, drop a temporary sources.list pointing at pkg-staging.claude-desktop-debian.dev, run apt update && apt install claude-desktop (or DNF equivalent). Specifically validate a .deb > 100 MB install (use 1.3883.0).
apt-secure origin-change check — requires a two-step run because apt only emits the "changed its 'Origin'" warning when comparing against a previously-cached state. A fresh container has no prior origin recorded, so the warning never fires regardless of behavior. The check has to establish baseline first, then change URL, then re-update:
# Step 1: install with the original github.io URL (current state),
# capture the cached origin
echo "deb [signed-by=/usr/share/keyrings/claude-desktop.gpg] https://aaddrick.github.io/claude-desktop-debian stable main" \
> /etc/apt/sources.list.d/claude-desktop.list
apt-get update
# Step 2: switch sources.list to the test custom domain directly
echo "deb [signed-by=/usr/share/keyrings/claude-desktop.gpg] https://pkg-staging.claude-desktop-debian.dev stable main" \
> /etc/apt/sources.list.d/claude-desktop.list
# Step 3: re-update with debug; this is when the warning would surface
apt-get update -o Debug::Acquire::http=true 2>&1 \
| tee /tmp/apt-debug.log
grep -iE "changed its '(Origin|Label|Suite|Codename)'|expected entry.*not found|not signed" /tmp/apt-debug.log \
  && { echo "FAIL: apt-secure surfacing warnings"; exit 1; } || true
# (|| true keeps a clean log from failing the step under bash -e:
# grep exits 1 when no warning matches, which is the pass case here)
In our specific case the Origin: field comes from reprepro's conf/distributions and is unchanged across the redirect (Worker passes metadata through). The warning is unlikely to fire — but worth verifying because any signed-metadata mismatch surfaces the same way and the cost of testing is low.
If origin-change warnings appear, the README must document the fix (typically: re-add the source with the new URL or refresh signed-by=). Do not proceed to Phase 3 with this unresolved.
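The warning grep can be exercised offline against sample log lines. The samples below are illustrative, modeled on apt-secure's wording; exact phrasing may differ by apt version:

```shell
#!/usr/bin/env bash
# Offline exercise of the Phase 2 warning grep against sample apt output.
# Sample lines are illustrative, modeled on apt-secure's wording.
set -euo pipefail

check_log() {
  grep -iqE "changed its '(Origin|Label|Suite|Codename)'|expected entry.*not found|not signed" "$1"
}

printf '%s\n' "Hit:1 https://pkg-staging.claude-desktop-debian.dev stable InRelease" > /tmp/clean.log
printf '%s\n' "E: Repository 'stable InRelease' changed its 'Origin' value from 'a' to 'b'" > /tmp/warn.log

check_log /tmp/warn.log && echo "warn.log: would FAIL the phase"
check_log /tmp/clean.log || echo "clean.log: passes"
```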
Exit: all containers install successfully with the >100 MB .deb; no apt-secure warnings on stable / LTS distros after the two-step URL change.
Phase 3 — CI plumbing PR (NOT YET ENABLING THE PRODUCTION DOMAIN)
- PR adds the Worker source under worker/ with wrangler.toml (route bound to staging pkg-staging.claude-desktop-debian.dev/* initially), and a CI workflow .github/workflows/deploy-worker.yml that runs wrangler deploy on push to main when worker/** changes. Workflow needs CLOUDFLARE_API_TOKEN and CLOUDFLARE_ACCOUNT_ID repo secrets
- PR adds the liveness-probed strip step (curl -fsI against https://${WORKER_DOMAIN}/dists/stable/InRelease) to both update-apt-repo and update-dnf-repo. Gating mechanism: the destructive find ... -delete runs only if the probe succeeds. Before Phase 4a, the production Worker doesn't exist, so the probe fails harmlessly and binaries stay in pool. After Phase 4a, the probe succeeds and binaries get stripped. No env-var gating — the gate is the actual reachability of the production endpoint
- PR adds smoke-test step (deb + rpm versions) to each repo-update job, also implicitly gated by Worker existence
- PR adds a post-Worker-deploy probe to deploy-worker.yml that confirms the Worker received the update and the route resolves: curl -fsI https://pkg-staging.claude-desktop-debian.dev/dists/stable/InRelease (against the staging route during this phase)
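A minimal shape for deploy-worker.yml consistent with the bullets above; the step layout and the wrangler-action version are assumptions, not the committed workflow:

```yaml
name: Deploy Worker
on:
  push:
    branches: [main]
    paths: ['worker/**']
jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Deploy via wrangler
        uses: cloudflare/wrangler-action@v3
        with:
          apiToken: ${{ secrets.CLOUDFLARE_API_TOKEN }}
          accountId: ${{ secrets.CLOUDFLARE_ACCOUNT_ID }}
          workingDirectory: worker
      - name: Post-deploy probe (staging route during Phase 3)
        run: |
          curl -fsI --max-time 10 \
            "https://pkg-staging.claude-desktop-debian.dev/dists/stable/InRelease" >/dev/null
```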
Manually trigger CI on a test tag (e.g., v0.0.0-test+claude0.0.0) to confirm:
- Worker deploys to staging route successfully
- Strip step does not fire (because production Worker isn't live yet)
- Push to gh-pages succeeds with .debs still in pool (current behavior, just adds the new probes/steps idempotently)
Exit: CI green on test tag; staging Worker deployed and reachable; strip step correctly skips because production probe fails; smoke test correctly skips or runs against staging successfully.
Phase 4a — Production Worker provisioning (gh-pages binaries as cold standby)
The critical insight from contrarian review: don't strip .debs from gh-pages until the Worker path is proven live in production. Otherwise there's a guaranteed user-visible outage between strip and Worker enable.
- Add CNAME file to gh-pages root containing pkg.claude-desktop-debian.dev (Pages settings UI)
- Wait for Let's Encrypt cert provisioning. Typical: ~1h. Edge cases: 24h+ for DNS CAA records, registrar propagation delays, Let's Encrypt rate limits. Monitor in Pages settings UI
- Update wrangler.toml route from staging (pkg-staging.claude-desktop-debian.dev/*) to production (pkg.claude-desktop-debian.dev/*) and merge — CI deploys the Worker to the production route
- Important correction from earlier draft: once the CNAME is live, GitHub Pages auto-301s all aaddrick.github.io/claude-desktop-debian/... traffic to pkg.claude-desktop-debian.dev/.... So the "direct path" via github.io is no longer directly serving — the auto-301 makes the Worker the active path for all traffic. The .debs remaining in gh-pages are not actively serving most users; they exist as a cold standby for rollback only (if we unbind the Worker route, gh-pages still has the binaries to serve directly via the github.io URL, since CNAME removal stops the auto-301)
- Validation, on each container in Phase 2's matrix: clean install with original sources.list succeeds via the Worker chain
- Validation: curl -IL https://aaddrick.github.io/claude-desktop-debian/dists/stable/InRelease shows the 301 chain landing on the custom domain
Exit: clean container installs succeed via the new Worker path with original sources.list URLs; cert is valid and stable for ≥24h; rollback path (unbind Worker → traffic flows direct to gh-pages binaries) verified by briefly toggling the Worker route off in a maintenance window and confirming a clean container still installs.
Phase 4b — Atomic cutover
No PR-merge required for the gating flip — the strip step's liveness probe automatically activates once Phase 4a's production Worker is live. The cutover step is just triggering a release that exercises the new path end-to-end:
- Re-run the v2.0.2+claude1.3883.0 update-apt-repo and update-dnf-repo jobs (or tag a follow-on release)
- The strip step's liveness probe now succeeds (production Worker is live), so binaries get stripped from pool before commit
- The push to gh-pages succeeds with a metadata-only tree
- Smoke test passes (ordered chain validation, size match against Releases asset)
- Container test on clean debian:stable with original sources.list runs apt update && apt install and gets v2.0.2
Exit: failed run from issue #493 succeeds; v2.0.2+claude1.3883.0 reaches apt users.
Phase 5 — Documentation, monitoring, follow-up
- README install snippet still works as-is (no URL change required); mention the new domain as canonical going forward in a "preferred URL" note
- New docs/learnings/apt-worker-architecture.md describing:
  - The redirect chain
  - Worker config and deploy mechanism
  - Credential ownership map (which email owns Cloudflare, which owns the registrar, where the wrangler token lives)
  - What to do when the heartbeat workflow fails
- CLAUDE.md mention under "CI/CD" or new "Distribution" section
- Cloudflare Workers Analytics alert (free, configured via dashboard): error rate >1% sustained for 15 min, request rate >80% of free tier
- Heartbeat workflow (.github/workflows/apt-repo-heartbeat.yml): daily cron walks both the .deb and .rpm chains (matrix strategy, parallel, independent failure tracking per format), opens a tracking issue on failure with a format-specific label (and auto-closes on next success). Pure cron-failure surfacing isn't enough — GitHub doesn't email-notify on scheduled workflow failures by default, and most maintainers have those notifications filtered. An open issue is visible from the repo's home page
Heartbeat sketch:
name: APT/DNF Repo Heartbeat
on:
  schedule:
    - cron: '0 12 * * *' # daily noon UTC
  workflow_dispatch:
permissions:
  contents: read
  issues: write # required for issue creation/comment on failure
jobs:
  ping:
    strategy:
      fail-fast: false # if deb fails, still test rpm
      matrix:
        format: [deb, rpm]
    runs-on: ubuntu-latest
    env:
      WORKER_DOMAIN: pkg.claude-desktop-debian.dev
      GH_TOKEN: ${{ github.token }}
    steps:
      - name: Resolve latest release for ${{ matrix.format }}
        id: latest
        run: |
          tag=$(gh release list --limit 1 --json tagName --jq '.[0].tagName' \
            --repo aaddrick/claude-desktop-debian)
          repoVer=${tag#v}; repoVer=${repoVer%+claude*}
          claudeVer=${tag#*+claude}
          if [[ "${{ matrix.format }}" == "deb" ]]; then
            asset="claude-desktop_${claudeVer}-${repoVer}_amd64.deb"
            url="https://aaddrick.github.io/claude-desktop-debian/pool/main/c/claude-desktop/${asset}"
          else
            asset="claude-desktop-${claudeVer}-${repoVer}-1.x86_64.rpm"
            url="https://aaddrick.github.io/claude-desktop-debian/rpm/x86_64/${asset}"
          fi
          {
            echo "tag=$tag"
            echo "repoVer=$repoVer"
            echo "claudeVer=$claudeVer"
            echo "asset=$asset"
            echo "url=$url"
          } >> "$GITHUB_OUTPUT"
      - name: Validate chain + fetch
        run: |
          # Same hop-by-hop walk as the smoke test in update-{apt,dnf}-repo,
          # asserts ordered chain (Pages 301 → Worker 302 → objects.githubusercontent.com)
          # + size match against the Releases asset for ${{ steps.latest.outputs.asset }}.
          # Validator differs by format:
          #   deb: file /tmp/x | grep -q 'Debian binary package'
          #   rpm: rpm -qpi /tmp/x
      - name: Open or update failure issue
        if: failure()
        uses: actions/github-script@v7
        env:
          FORMAT: ${{ matrix.format }}
        with:
          script: |
            const fmt = process.env.FORMAT;
            const label = `heartbeat-failure-${fmt}`;
            const title = `APT/DNF repo heartbeat failing (${fmt})`;
            const body_url = `${process.env.GITHUB_SERVER_URL}/${process.env.GITHUB_REPOSITORY}/actions/runs/${process.env.GITHUB_RUN_ID}`;
            const open = await github.rest.issues.listForRepo({
              owner: context.repo.owner, repo: context.repo.repo,
              labels: label, state: 'open',
            });
            const body = `Heartbeat failed for \`${fmt}\` at ${new Date().toISOString()}.\nRun: ${body_url}`;
            if (open.data.length === 0) {
              await github.rest.issues.create({
                owner: context.repo.owner, repo: context.repo.repo,
                title, body, labels: [label],
              });
            } else {
              await github.rest.issues.createComment({
                owner: context.repo.owner, repo: context.repo.repo,
                issue_number: open.data[0].number, body,
              });
            }
      - name: Auto-close failure issue on recovery
        if: success()
        uses: actions/github-script@v7
        env:
          FORMAT: ${{ matrix.format }}
        with:
          script: |
            const fmt = process.env.FORMAT;
            const label = `heartbeat-failure-${fmt}`;
            const open = await github.rest.issues.listForRepo({
              owner: context.repo.owner, repo: context.repo.repo,
              labels: label, state: 'open',
            });
            for (const issue of open.data) {
              await github.rest.issues.createComment({
                owner: context.repo.owner, repo: context.repo.repo,
                issue_number: issue.number,
                body: `Heartbeat for \`${fmt}\` recovered at ${new Date().toISOString()}; auto-closing.`,
              });
              await github.rest.issues.update({
                owner: context.repo.owner, repo: context.repo.repo,
                issue_number: issue.number, state: 'closed',
              });
            }
Format-specific labels (heartbeat-failure-deb, heartbeat-failure-rpm) prevent a recovering format from auto-closing the other format's still-open failure issue.
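The tag-splitting parameter expansions in the resolve step can be sanity-checked offline against the sample tag from this plan:

```shell
#!/usr/bin/env bash
# Offline check of the heartbeat resolve step's tag parsing.
set -euo pipefail

tag=v2.0.2+claude1.3883.0
repoVer=${tag#v}; repoVer=${repoVer%+claude*}   # strip leading v, then the +claude suffix
claudeVer=${tag#*+claude}                       # strip everything through "+claude"
echo "repoVer=$repoVer claudeVer=$claudeVer"
```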
- Mandatory follow-up PR: one-time orphan-reset of gh-pages to address #449's clone-bloat backfill. Now safe because nothing important lives in gh-pages history (metadata is regenerated by reprepro/createrepo on every release; no binaries to lose)
Test plan
Repeat at every phase boundary, on each container in the matrix.
Debian/Ubuntu side:
docker run --rm -it debian:stable bash -c '
apt-get update && apt-get install -y curl gnupg file
curl -fsSL https://aaddrick.github.io/claude-desktop-debian/KEY.gpg \
| gpg --dearmor > /usr/share/keyrings/claude-desktop.gpg
echo "deb [signed-by=/usr/share/keyrings/claude-desktop.gpg] https://aaddrick.github.io/claude-desktop-debian stable main" \
> /etc/apt/sources.list.d/claude-desktop.list
apt-get update -o Debug::Acquire::http=true 2>&1 | tee /tmp/apt-debug.log
# Note: this single-step check captures only Origin/Suite/Codename/Label
# warnings on first update; for the full apt-secure check after URL change,
# see Phase 2 two-step procedure
grep -iE "changed its .(Origin|Suite|Codename|Label)." /tmp/apt-debug.log && exit 1 || true
apt-get install -y claude-desktop
dpkg -l claude-desktop
dpkg -L claude-desktop | head
'
Plus apt-cache policy claude-desktop to confirm the version resolved, and an apt-cacher-ng-fronted variant of the same.
Fedora/RHEL side:
docker run --rm -it fedora:latest bash -c '
dnf install -y curl rpm-build
curl -fsSL -o /etc/yum.repos.d/claude-desktop.repo \
https://aaddrick.github.io/claude-desktop-debian/claude-desktop.repo
rpm --import https://aaddrick.github.io/claude-desktop-debian/KEY.gpg
dnf --setopt=debuglevel=10 makecache 2>&1 | tee /tmp/dnf-debug.log
# Surface any signature or repo-metadata mismatches that would surface
# after a URL change
grep -iE "(GPG check FAILED|repomd\.xml signature|metadata is outdated)" /tmp/dnf-debug.log \
&& exit 1 || true
dnf install -y claude-desktop
rpm -qi claude-desktop
rpm -ql claude-desktop | head
'
rockylinux:9 runs the same flow; dnf semantics are equivalent across the RHEL family.
Risks and mitigations
| Risk | Mitigation |
|---|---|
| Worker regex breaks on filename/tag scheme change | CI smoke test catches first regression; chain assertion is explicit, not silent |
| Cloudflare outage → repo unreachable | Heartbeat workflow surfaces; fast rollback in Phase 4b reverts to direct gh-pages serving (binaries 404 but metadata works); accept this as cost of single-vendor dependency |
| Cloudflare account suspension | TOS reviewed; redirect-only architecture means no large bandwidth attributed to the account; documented fallback (split-package or commercial CDN) if account is suspended |
| Hardened apt clients with Acquire::http::AllowRedirect=false | Document in README that this must remain true (the default); link from heartbeat-failure runbook |
| Custom domain cert provisioning slow/fails | Phase 4a explicitly waits for stable cert before Phase 4b; if it fails, Phase 4a doesn't exit and nothing breaks — .debs still on gh-pages |
| Filename regex divergence between deb and rpm | Phase 1 dev with both filename samples in hand; both smoke tests in CI |
| Apt-secure origin-change warnings | Phase 2 explicit check with Debug::Acquire::http=true; do not exit Phase 2 with this unresolved |
| apt-cacher-ng caches 302 incorrectly | Phase 2 matrix entry; if it regresses, document a workaround or flag as known issue |
| IPv6-only network breaks chain | Phase 2 matrix entry; both pkg.claude-desktop-debian.dev and objects.githubusercontent.com must have AAAA records |
| Domain registrar lapse | Auto-renewal + secondary contact email + heartbeat catches |
| GitHub Releases per-account egress throttle (503) | Heartbeat catches; if persistent, consider authenticated CDN (rare in practice for desktop-app traffic) |
| GitHub changes Releases asset URL format | Smoke test catches first failed release; documented mitigation: update Worker RELEASES constant |
| Bus factor (single maintainer with all credentials) | Accepted risk for a solo-maintained project; mitigated via email-forwarding to backup mailbox, registrar auto-renewal, CI-only Worker deploys (no workstation dependency), and recovery runbook in docs/learnings/apt-worker-architecture.md for a future maintainer to take over |
| Worker free tier exhausted | Cloudflare Analytics alert at 80% threshold; daily cron is ~30 reqs/day, real apt traffic dominated by metadata polls (most return 304); upgrading is $5/mo for 10M+ reqs/day |
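Several rows above hinge on the 302 hop being visible to clients. A small probe makes that check concrete (a sketch; `is_redirect` is a hypothetical helper and the asset path is illustrative):

```shell
# is_redirect: true when the first HTTP status line on stdin carries a 3xx code.
is_redirect() {
  head -n1 | grep -qE 'HTTP/[0-9.]+ 3[0-9][0-9]'
}

# Usage (network required, so shown as comments):
#   curl -sI "https://pkg.claude-desktop-debian.dev/pool/main/c/claude-desktop/<asset>.deb" \
#     | is_redirect && echo "redirect chain in place"
```

A true result confirms the chain exists and therefore that a client forcing `Acquire::http::AllowRedirect=false` would refuse the download.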
## Rollback strategy
If Phase 4b cutover causes user-visible breakage:
- Cold-standby restore via CNAME removal (Pages settings, ~5 min): remove the CNAME file from gh-pages. The github.io URL stops 301-ing, and apt fetches directly from gh-pages. Because the strip step's liveness probe targets the production Worker URL (which no longer 301s into existence), future CI runs see the probe fail and stop stripping binaries. The pre-Phase-4a .debs still in gh-pages history serve direct-from-Pages until the next release re-pushes binaries.
- Fast Worker disable (Cloudflare dashboard, <1 min): unbind the Worker from pkg.claude-desktop-debian.dev/*. The custom domain still resolves, but Cloudflare returns Pages content directly. Useful for isolating "is this a Worker bug?", but if the most recent release already stripped .debs from gh-pages (Phase 4b succeeded), binary fetches still 404. Combine with #1 if user impact is ongoing.
- Recovery if the architecture is fundamentally broken: roll back via #1, then accept that the next upstream growth re-triggers the original cap problem, and pursue one of the documented fallbacks (split-package, R2, commercial CDN).
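To tell which of these states the production hostname is actually in, inspecting the first hop's Location header helps (a sketch; `location_host` is a hypothetical helper, and the hostnames are the ones used throughout this plan):

```shell
# location_host: print the host portion of the first Location: header on stdin.
location_host() {
  sed -nE 's#^[Ll]ocation: *[a-z]+://([^/[:space:]]+).*#\1#p' | head -n1
}

# Usage (network required, shown as comments):
#   curl -sI "https://pkg.claude-desktop-debian.dev/pool/<asset>.deb" | location_host
# Prints objects.githubusercontent.com while the Worker is live and bound;
# empty output means the route is unbound and Pages content is served directly.
```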
The critical invariant: Phase 4a completing successfully (cert + Worker live + container tests pass with the original sources.list) means Phase 4b is a low-risk release trigger (no PR merge required; the strip step's liveness probe activates automatically once the production Worker is up). Phases 2 and 4a must catch issues before Phase 4b. Once 4b ships and the smoke test passes, the path from a regression is forward (fix the Worker bug, push a new release) or backward via rollback #1.
## Documented fallbacks (not the chosen path, kept in case this fails)
- Splitting the `.deb` into multiple smaller packages with `Depends:` chains: a pure in-tree change with no external dependencies and no recurring costs, but a more invasive packaging refactor. Buys 6–12 months until one half crosses 100 MB. The real fallback if the Cloudflare/GitHub-Releases dependency proves untenable.
- Migrating storage to Cloudflare R2 (the variant `pkg.cloudflare.com` uses): full hosting in R2 instead of Releases. A larger CI change for marginal benefit, given GitHub Releases already works as a backend at our scale. Reasonable if we ever hit GitHub egress throttling regularly.
- Commercial package CDN (Cloudsmith, Packagecloud, JFrog Artifactory): outsources the same architecture for monthly fees ($20–100+/mo for proprietary tiers). Use if we want a fully managed answer.
## Out of scope
- AUR / Nix / AppImage / Snap / Flathub. Unaffected by this plan.
## Sources
- Cloudflare Workers — limits and free tier
- Cloudflare Workers — pricing
- Goodbye, section 2.8 and hello to Cloudflare's new terms of service (Oct 2025)
- pkg.cloudflare.com — Cloudflare's own apt/yum repo, reference architecture
- Using Cloudflare R2 as an apt/yum repository (Cloudflare blog)
- kdrag0n/github-releases-proxy — Worker proxying GitHub Release assets
- GitHub Pages: managing a custom domain (auto-301 from `*.github.io`)
- GitHub Docs: linking to releases (asset URL format stability)
- GitHub Changelog: updated rate limits for unauthenticated requests
- apt-transport-http(1) — `Acquire::http::AllowRedirect`