mirror of
https://github.com/aaddrick/claude-desktop-debian.git
synced 2026-05-17 00:26:21 +03:00
Revert "docs: add issue triage pipeline design document"
This reverts commit 1d020aa. Moving the change to a branch for review
instead of shipping directly to main.
Co-Authored-By: Claude <claude@anthropic.com>

# Issue Triage Pipeline

Automated first-pass triage for GitHub issues on this repository. Runs on
`issues: [opened, reopened]` and on manual `workflow_dispatch`. Classifies the
issue, investigates likely root cause against the repo and the upstream
beautified source, validates every factual claim mechanically and with a
fresh-context LLM reviewer, and — when the validated findings clear hard
gates — posts an **explicitly non-authoritative draft comment** plus triage
labels.

The design is constrained by three simultaneous goals:

- **Useful**: give the maintainer a head start on orientation, candidate sites,
  and related issues.
- **Safe**: never mislead a reporter or reviewer with fabricated identifiers,
  non-matching patch code, or authoritative voice on unverified claims.
- **Cheap**: run in under three minutes per issue for a two-digit dollar
  monthly spend.

Everything in this document follows from those three goals being in tension.

---

## Design principles

> [!IMPORTANT]
> These seven principles are load-bearing. Every stage in the pipeline exists
> to serve one of them. If a future change breaks a principle, the stage it
> lives in should be removed, not weakened.

### 1. Mechanical checks before LLM checks

Grep, `gh api`, file stat, regex matching — deterministic, cheap, and
complementary to LLM reasoning. The class of error an LLM reviewer misses most
often is the one an LLM drafter made: fabricated identifiers, non-matching
anchors, misremembered issue numbers. A second LLM pass that only sees the
first pass's output can rubber-stamp the fabrication. A `grep -P` against the
actual source cannot.

LLM review is reserved for questions a grep cannot answer — semantic
entailment, intent, whether two issues describe the same failure mode.

### 2. Structured output, not prose

Every claim the pipeline produces has a typed slot: `file`, `line_start`,
`line_end`, `evidence_quote`, `claim_type`, `confidence`. Prose is generated
last, from already-validated structure. Free-form investigation output is
banned because it allows unverifiable assertions to hide inside narrative.

Anthropic's own Claude Code Security Review action uses structured tool output
for exactly this reason: it lets the pipeline drop individual findings without
rewriting prose around them.[^anthropic-security-review]

### 3. Writer/Reviewer with fresh context on source

The reviewer reads the **source** and the **claim**. It does *not* read the
drafter's reasoning or the draft comment. This is Anthropic's recommended
pattern for bias prevention in Claude Code[^anthropic-best-practices] — a
fresh context avoids the reviewer rationalizing the drafter's framing.

The reviewer is also **adversarial by construction**: it must produce the
strongest counter-reading of each evidence quote *before* emitting a verdict.
Rubber-stamping is the base rate for reviewer agents otherwise.

### 4. Category exclusion over classification

Whole classes of issue skip investigation entirely. Hardware-specific GPU
driver crashes, kernel-level behavior, non-reproducible reports, and
upstream-only bugs do not benefit from automated investigation against our
patch surface — the bot ends up inventing "launcher flag workarounds" for
problems outside our control.

This is the pattern Anthropic's security-review action uses to keep
signal-to-noise high: whole finding categories (DoS, open-redirects,
rate-limiting) are excluded upfront rather than triaged with
disclaimers.[^anthropic-security-review]

### 5. Confidence gates suppress, not hedge

Low-confidence findings do not get posted with a cautionary footer. They do
not get posted. Labels apply; the comment is skipped. Research consistently
shows that posting hedged but confident-looking output is worse than posting
nothing — specificity reads as authority regardless of preamble
phrasing.[^diffray-hallucinations][^lakera-hallucinations]

### 6. Non-authoritative framing is structural, not textual

The comment template signals tentativeness through structure:

- Upfront "won't-do" boundary statement (what the bot *will not* claim),
  modeled on Anthropic's "won't approve PRs — that's still a human
  call"[^anthropic-code-review]
- Required file:line citations on every claim (enforced by post-processor —
  claims without citations are dropped)
- Hypothesis phrasing ("Looks like X", "Likely path is Y") — prompt-enforced
  and post-processor-checked
- Patch code in a collapsed `<details>` block, labeled as unverified draft
- No voice replication of the maintainer

A disclaimer sentence alone cannot neutralize the amplifying effect of
authoritative voice; the shape of the artifact has to do the work.

### 7. Feedback is an explicit signal, not a pattern-matched one

Keyword-triggered "bot got this wrong" capture is theater — corrections arrive
as natural language, curation never happens, the file grows stale. Real
feedback uses explicit signals the pipeline can act on deterministically:

- 👎 reaction on a bot comment auto-opens a correction issue
- `/triage-wrong` slash command in any reply opens the same
- A human-curated `.claude/triage-corrections.md` file is loaded as context
  into future runs

No free-form NLP classification of collaborator replies.

---

## Pipeline overview

```mermaid
flowchart TD
    A[Issue opened/reopened] --> B[1. Gate]
    B -->|needs-human or<br/>already triaged| Z[exit]
    B -->|proceed| C[2. Classify + double-check]
    C -->|category excluded| L[9. Label + archive]
    C -->|investigable| D[3. Fetch reference]
    D --> E[4. Investigate<br/>structured output]
    E --> F[5. Mechanical validation<br/>no LLM]
    F --> G[6. Adversarial review<br/>fresh context]
    G --> H[7. Decision gate]
    H -->|version drift| L
    H -->|no surviving findings<br/>or low confidence| L
    H -->|pass| I[8. Comment generation]
    I --> L
    L --> M[Post + archive]

    N[Sideband:<br/>triage-feedback.yml] -.->|corrections.md| B

    style C fill:#e1f5ff
    style E fill:#e1f5ff
    style G fill:#e1f5ff
    style I fill:#e1f5ff
    style F fill:#fff4e1
    style H fill:#fff4e1
    style L fill:#fff4e1
```

Stages shaded blue are LLM calls; amber are deterministic bash steps.

| Stage | Runtime | Tool | Purpose |
|-------|---------|------|---------|
| 1. Gate | ~5s | bash | Skip already-triaged, load corrections file |
| 2. Classify | ~30s | Sonnet (×2) | Categorize + double-check exclusions |
| 3. Fetch reference | ~10s | bash | Download `reference-source.tar.gz` |
| 4. Investigate | ~90s | Sonnet | Structured findings + sweeps + anchors |
| 5. Mechanical validation | ~15s | bash | Grep, `gh`, closed-world extraction |
| 6. Adversarial review | ~40s | Sonnet | Counter-reading + verdict, fresh context |
| 7. Decision gate | ~1s | bash | Enforce hard gates |
| 8. Comment generation | ~40s | Sonnet | Template-enforced draft |
| 9. Label + archive | ~5s | bash | Apply labels, post, upload artifacts |

Total: ~3 minutes per investigable issue. Excluded-category issues exit at
~35s (stages 1–2 plus archive).

---

## Stage-by-stage detail

### 1. Gate

Deterministic filter. Runs before any paid API call.

**Skip conditions:**

- Issue labeled `triage: needs-human` (unless manually dispatched)
- Issue already has a terminal triage label (`investigated`, `duplicate`,
  `not-actionable`)
- Title similarity >0.9 with an open issue (likely duplicate, defers to
  human)

**Side effects:**

- Loads `.claude/triage-corrections.md` as context for downstream stages.
  This file contains human-curated past errors, classified by failure class
  (identifier-hallucination, false-duplicate, missed-site, version-drift,
  etc.). Future runs see them and are instructed to avoid the same shapes.
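
The skip conditions above are deterministic enough to sketch directly. A
minimal sketch, assuming labels arrive as a newline-separated list (in CI
they would come from something like `gh issue view --json labels`; that
invocation and the function name are illustrative, not the workflow's actual
code):

```shell
# Gate sketch: decide skip/proceed from the issue's current labels.
should_skip() {
  local labels="$1"      # newline-separated label names
  local dispatched="$2"  # "true" when run via workflow_dispatch
  if printf '%s\n' "$labels" | grep -qx 'triage: needs-human' &&
     [ "$dispatched" != "true" ]; then
    echo "skip: needs-human"
  elif printf '%s\n' "$labels" |
       grep -qxE 'triage: (investigated|duplicate|not-actionable)'; then
    echo "skip: already triaged"
  else
    echo "proceed"
  fi
}

should_skip $'bug\ntriage: investigated' false   # -> skip: already triaged
should_skip $'bug\ntriage: needs-human' true     # -> proceed (manual dispatch)
```

Note the `workflow_dispatch` override only bypasses the `needs-human` skip;
terminal triage labels still short-circuit the run.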

### 2. Classify

First Sonnet call. Structured JSON output only.

<details>
<summary><b>Classify output schema</b></summary>

```json
{
  "classification": "bug|feature|question|duplicate|needs-info|not-actionable|needs-human",
  "confidence": "high|medium|low",
  "claimed_version": "1.3109.0 | null",
  "category": "bug|feature|question|driver|hardware|kernel|upstream-only|container|...",
  "exclusion_reason": "null | string — filled iff category excluded",
  "category_evidence": "verbatim excerpt from the issue body supporting the category",
  "suggested_labels": ["priority: high", "format: rpm", ...],
  "duplicate_of": "null | integer"
}
```

</details>

The `claimed_version` field is parsed from the issue body — `--doctor`
output, `claude-desktop (X.Y.Z)` references, AppImage filenames. Consumed
by stage 7 to gate against version drift.
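
A minimal sketch of that parsing step; the version-shaped regex and the
function name are illustrative, not the pipeline's actual patterns:

```shell
# Pull the first version-shaped token (e.g. "1.3109.0") out of an issue body.
extract_claimed_version() {
  grep -oE '[0-9]+\.[0-9]+\.[0-9]+' | head -n1
}

printf '%s\n' 'claude-desktop (1.3109.0) AppImage, Debian 13' \
  | extract_claimed_version    # -> 1.3109.0
```

When nothing matches, the function emits nothing, which maps to the schema's
`"claimed_version": null`.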

> [!WARNING]
> **Category exclusion is verified by a second Sonnet pass.** If the first
> pass flags the issue as excluded, a second call sees only the issue body,
> the `category_evidence` excerpt, and the exclusion list — no other context.
> Only if both agree does the exclusion apply. Classify is otherwise the
> least-validated stage in the pipeline; this double-check is how we budget
> for its miscalibration.

**Excluded categories** route to stage 9 with a `triage: needs-human` label
and no comment. Current exclusions:

- `driver` (NVIDIA, AMD, Intel GPU driver issues)
- `hardware` (chipset-specific crashes, sensor/thermal issues)
- `kernel` (namespace restrictions, cgroup behavior, distro kernel
  differences)
- `upstream-only` (reproduces on unmodified Claude Desktop on Windows/macOS)
- `container` (Distrobox/Flatpak/Docker without display passthrough)

### 3. Fetch reference

Downloads `reference-source.tar.gz` from the GitHub release matching
`CLAUDE_DESKTOP_VERSION`. This tarball is produced by `ci.yml` on every
release: `app.asar` extracted, `.vite/build/*.js` beautified with Prettier,
tarred. No re-extraction happens in the triage pipeline — that work was
already done once at release time.

If `claimed_version` from stage 2 differs from `CLAUDE_DESKTOP_VERSION`,
`VERSION_DRIFT=true` is exported and consumed by the stage 7 gate. The
reference still downloads, investigation still runs (the artifacts are useful
internally), but stage 7 will prevent a public comment.
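
The drift flag itself is a plain string compare. A sketch (the
`gh release download` line is an assumed invocation, shown only as a
comment; the function name is illustrative):

```shell
# Stage 3 sketch. The real stage fetches the beautified reference first,
# with something like:
#   gh release download "v${CLAUDE_DESKTOP_VERSION}" -p reference-source.tar.gz
# Drift detection is then a plain compare against the stage 2 parse.
version_drift() {
  local claimed="$1" pinned="$2"
  if [ -n "$claimed" ] && [ "$claimed" != "$pinned" ]; then
    echo "VERSION_DRIFT=true"
  else
    echo "VERSION_DRIFT=false"
  fi
}

version_drift "1.3108.0" "1.3109.0"   # -> VERSION_DRIFT=true
version_drift "" "1.3109.0"           # -> VERSION_DRIFT=false (nothing claimed)
```

An empty claim deliberately does not count as drift: a reporter who never
stated a version should not be blocked by this gate alone.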

### 4. Investigate

Sonnet call with repo + reference source + corrections file + issue
context. **Output is schema-enforced — no free prose permitted.**

<details>
<summary><b>Investigation output schema</b></summary>

```json
{
  "findings": [
    {
      "claim_type": "identifier|behavior|flow|absence",
      "claim": "string — the factual assertion being made",
      "file": "path/to/file.js",
      "line_start": 1234,
      "line_end": 1240,
      "evidence_quote": "verbatim source excerpt supporting the claim",
      "confidence": "high|medium|low",
      "enclosing_construct": "for identifier claims only — the enum/switch/literal containing the identifier"
    }
  ],
  "pattern_sweep": [
    {
      "pattern": "regex pattern used to sweep the repo",
      "match_count": 17,
      "matches": [
        { "file": "...", "line": 42, "snippet": "..." }
      ]
    }
  ],
  "proposed_anchors": [
    {
      "description": "what this regex targets",
      "regex": "pattern",
      "expected_match_count": 1,
      "target_file": "path/to/file",
      "word_boundary_required": true
    }
  ],
  "related_issues": [
    {
      "number": 288,
      "why_related": "one-sentence rationale",
      "quoted_excerpt": "relevant snippet from the cited issue"
    }
  ]
}
```

</details>

**Hard schema bans** (validator rejects output if any present):

| Banned | Why |
|--------|-----|
| Negative per-site assertions ("X should stay as-is") | Bad historical track record; these block fixes instead of enabling them |
| "Already fixed in #N" without a diff/PR link | Same failure class — an unverified negative claim that blocks scope |
| Substring regex on identifier claims | Substring matches pass `grep` but don't prove identifier identity |
| `expected_match_count: ">=1"` | Must be exact — ≥1 is what lets fabricated anchors slip through |
| Prescriptive patch text without a backing finding | Detached prescriptions are how unverified `sed` patterns get posted |

**Pattern-sweep caps:** each sweep is limited to 20 match rows. Additional
matches are summarized as `match_count: N (showing first 20)`. This prevents
investigation token bloat on common identifiers.
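
The cap reduces to a head-limited grep. A sketch with illustrative names
(it counts matching lines, which approximates the row cap described above):

```shell
# Record the full match count, but keep at most 20 match rows.
sweep() {
  local file="$1" pattern="$2" total
  total="$(grep -cE "$pattern" "$file" || true)"
  if [ "$total" -gt 20 ]; then
    echo "match_count: $total (showing first 20)"
  else
    echo "match_count: $total"
  fi
  grep -nE "$pattern" "$file" | head -n 20
}
```

The full `match_count` is always recorded even when rows are truncated, so
the reviewer can still see how common the swept identifier is.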

### 5. Mechanical validation

Pure bash. No LLM call. Produces `validation.json` with pass/fail per item.

**Per finding:**

- [x] `file` exists and `line_end` is within the file's length
- [x] `evidence_quote` grep-matches at the cited `file:line_start`
- [x] If `claim_type == "identifier"`, extract `closed_world_options` — the
  full enclosing enum/switch/case-block/object-literal — verbatim, using
  `ast-grep` (fast, language-aware). The options list is attached to the
  finding for the reviewer to use in stage 6.
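
The quote check reduces to "does the cited line contain the quoted string,
literally?". A sketch (function names illustrative; literal matching so
regex metacharacters in the source can't cause accidental passes):

```shell
# Does evidence_quote appear, as a fixed string, on file:line_start?
quote_matches() {
  local file="$1" line="$2" quote="$3" actual
  actual="$(sed -n "${line}p" "$file")"
  case "$actual" in
    *"$quote"*) echo pass ;;
    *)          echo fail ;;
  esac
}

src="$(mktemp)"
printf 'const backends = ["kvm", "bwrap", "host"];\n' > "$src"
quote_matches "$src" 1 '"kvm", "bwrap", "host"'   # -> pass
quote_matches "$src" 1 '"qemu", "virt"'           # -> fail
rm -f "$src"
```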

**Per proposed anchor:**

- [x] `grep -P` against the reference source with `\b` word boundaries
  enforced for identifier anchors
- [x] Match count **exactly equal** to `expected_match_count` (not ≥)
- [x] No substring hits on identifier-type anchors
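
The exact-count rule can be sketched in a few lines (GNU `grep -P`, as the
pipeline specifies; names illustrative, and this sketch counts matching
lines rather than individual matches):

```shell
# A proposed anchor passes only if its regex hits the reference source
# exactly expected_match_count times.
anchor_ok() {
  local file="$1" regex="$2" expected="$3" count
  count="$(grep -cP "$regex" "$file" || true)"
  [ "$count" -eq "$expected" ] && echo pass || echo fail
}

src="$(mktemp)"
printf 'hostname = get();\nbackend = "host";\n' > "$src"
anchor_ok "$src" '\bhost\b' 1   # -> pass ("hostname" is not a \b match)
anchor_ok "$src" 'host' 1       # -> fail (substring also hits "hostname")
rm -f "$src"
```

The second call is exactly the substring failure mode the ban table in
stage 4 exists to prevent.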

**Per related_issue:**

- [x] `gh issue view NNN` — actual title, state, first 500 chars of body
  captured and attached to the finding. The bot's `why_related` claim is
  not trusted — the reviewer in stage 6 reads the real body.

**Per pattern_sweep match:**

- [x] Re-grep to confirm the match still exists (catches investigation
  hallucinating file paths or line numbers)

> [!NOTE]
> **Why the closed-world extraction matters.** A bot that fabricates an
> identifier (e.g., claiming the VM backend values are `qemu`/`virt` when
> they're actually `kvm`/`bwrap`/`host`) can pick a nearby real line
> containing the substring "virt" as its `evidence_quote`. Grep validation
> alone would pass this — the quote exists, the file exists, the line
> matches. Closed-world extraction pulls the full enum the claim is *about*
> and hands it to the reviewer as a bounded option list. "Is the claimed
> identifier in this list?" is a closed question the reviewer cannot
> rationalize around.
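
That bounded question is trivially mechanical once the option list exists.
A sketch, reusing the document's own example values (function name
illustrative; the extraction itself happens in stage 5 with `ast-grep`):

```shell
# Is the claimed identifier verbatim in the closed-world option list?
in_closed_world() {
  local claimed="$1" opt
  shift
  for opt in "$@"; do
    [ "$opt" = "$claimed" ] && { echo yes; return; }
  done
  echo no
}

in_closed_world "qemu" kvm bwrap host   # -> no (fabricated identifier)
in_closed_world "kvm"  kvm bwrap host   # -> yes
```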

### 6. Adversarial review

Sonnet call with **fresh context**. Does NOT see the draft comment, the
investigation's free-form reasoning, or any voice instructions. Sees only:

- The original issue
- `validation.json` with findings that passed mechanical validation
- `closed_world_options` for each identifier-type finding
- The actual fetched body of each cited related issue
- Source excerpts at claim sites

**Prompt is adversarial per finding, in this order:**

1. State the strongest counter-reading of this evidence quote. What would
   make this claim wrong, given the actual code shown?
2. For identifier claims: list every option in `closed_world_options`. Is
   the claimed identifier verbatim in that list? (yes/no — exact match only)
3. For related_issues: does the fetched body describe the same failure mode?
   (exact / related / unrelated)
4. **Only after steps 1–3**, give a verdict on each finding. No new claims.

Output is per-item verdicts. The reviewer cannot propose new findings,
cannot rewrite the drafter's claims, and cannot insert prose. Its only
powers are: approve, reject, or downgrade confidence.

> [!WARNING]
> **Reviewer approval rate is a health metric.** Rolling approval rate is
> logged to telemetry. If it exceeds **70%**, a GitHub issue is auto-opened
> warning that the reviewer is rubber-stamping — the prompt needs re-tuning,
> or the drafter is producing findings too weak for the reviewer to
> meaningfully discriminate. See [Health monitoring](#health-monitoring).
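
As a sketch of the health computation, assuming verdicts are serialized one
per line (the real `review.json` shape may differ; names illustrative):

```shell
# Rolling approval rate over a verdict log, as an integer percentage.
approval_rate_pct() {
  local file="$1" total approved
  total="$(wc -l < "$file")"
  approved="$(grep -cx 'approve' "$file" || true)"
  if [ "$total" -gt 0 ]; then
    echo $(( approved * 100 / total ))
  else
    echo 0
  fi
}

log="$(mktemp)"
printf 'approve\napprove\nreject\ndowngrade\n' > "$log"
approval_rate_pct "$log"   # -> 50
rm -f "$log"
```

In the workflow, a result above 70 would trigger the auto-opened
`triage-health` issue described above.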

### 7. Decision gate

Deterministic. Evaluates all hard gates; any failure routes to
"labels-only, no post."

| Gate | Trigger | Action |
|------|---------|--------|
| Version drift | `claimed_version != CLAUDE_DESKTOP_VERSION` | Labels only |
| No surviving findings | Zero items passed mechanical + review | `triage: needs-human`, labels only |
| Low average confidence | Avg confidence of survivors < medium | `triage: needs-human`, labels only |
| Excluded category | Already caught at stage 2, defensive check | Labels only |

All gates are fail-closed. If any condition is ambiguous, the gate denies
the comment.
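
The fail-closed property can be made literal in code: anything that is not
a positive pass falls through to labels-only. A sketch with illustrative
names:

```shell
# Stage 7 sketch: every gate must positively pass; malformed input
# (e.g. a non-numeric survivor count) also denies the comment.
may_post() {
  local drift="$1" survivors="$2" avg_conf="$3"
  [ "$drift" = "false" ] || { echo "labels-only"; return; }
  [ "$survivors" -gt 0 ] 2>/dev/null || { echo "labels-only"; return; }
  case "$avg_conf" in
    high|medium) echo "post" ;;
    *)           echo "labels-only" ;;
  esac
}

may_post false 2 medium   # -> post
may_post true  2 high     # -> labels-only (version drift)
may_post false 0 high     # -> labels-only (no surviving findings)
may_post false 2 low      # -> labels-only (low average confidence)
```

Even a garbled `survivors` value denies the post, because the numeric test
errors out and the `||` branch takes over; that is the ambiguity rule made
concrete.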

### 8. Comment generation

Sonnet call with only the findings that survived every prior stage. Prompt
mandates hypothesis framing ("Looks like", "Likely", "Worth checking first")
and requires a file:line citation on every factual claim.

**Template structure, enforced by post-processor:**

```markdown
**Automated draft — AI analysis, not maintainer judgment.** This bot won't
close issues, apply labels beyond triage routing, or claim fixes are
shipped. Findings below are starting points; the code citations are what
to verify first.

[one hypothesis line]

- [finding with required `file:line` citation]
- [finding with required `file:line` citation]

<details>
<summary>Unverified patch sketch (draft, not applied)</summary>

[patch code here, if any — runs only if confidence high AND proposed_anchor
verified match-count exactly 1]

</details>

Related: #NNN — exact-match | related | unrelated (per reviewer rating)

---
React 👎 on this comment if a finding is wrong, or run `/triage-wrong` in
a reply to open a correction issue.
```

**Post-processor enforcement:**

- [x] Drop any claim sentence without a `file:line` or `#NNN` citation
- [x] Strip LLM preambles — regex removal of lines like
  `^(Here'?s|I'?ve got|I have everything|Sure|Certainly|Got it|OK)`
- [x] Enforce a 400-word cap (truncate the `<details>` patch section if
  needed, never the findings)
- [x] Refuse to post if the enforced template leaves fewer than two
  findings standing
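
A sketch of the first two checks; the preamble regex is the one listed
above, while the citation pattern is an illustrative approximation (the
real rule may be stricter):

```shell
# Drop LLM preamble lines, then keep only lines that carry a file:line
# or #NNN citation (blank and pure-markup lines pass through).
strip_preamble() {
  grep -vE "^(Here'?s|I'?ve got|I have everything|Sure|Certainly|Got it|OK)"
}
keep_cited() {
  grep -E '([A-Za-z0-9_./-]+:[0-9]+|#[0-9]+)|^$|^[^A-Za-z]'
}

printf '%s\n' \
  'Sure! Analysis below.' \
  'Looks like the flag is dropped at scripts/build.sh:142.' \
  'This might be related.' | strip_preamble | keep_cited
# -> Looks like the flag is dropped at scripts/build.sh:142.
```

Only the cited claim survives: the preamble is stripped by the first
filter and the citation-free sentence by the second.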

### 9. Label + post + archive

Deterministic. Always applies triage labels (classification maps to
`triage: investigated` | `duplicate` | `needs-info` | `not-actionable` |
`needs-human`). Always applies `suggested_labels`.

Posts the comment only if stage 7 passed and the template enforcer produced
non-empty output.

Uploads three artifacts (14-day retention):

- `investigation.json` — raw investigation output
- `validation.json` — per-item mechanical + review verdicts
- `review.json` — counter-readings and closed-world answers

Writes a structured summary to `$GITHUB_STEP_SUMMARY`:

| Metric | Value |
|--------|-------|
| Classification | bug |
| Confidence | medium |
| Category | bug (investigable) |
| Findings proposed | 4 |
| Findings passed mechanical | 3 |
| Findings passed review | 2 |
| Comment posted | yes |

---

## Feedback loop

Separate workflow: `.github/workflows/triage-feedback.yml`.

### Trigger A — slash command

`issue_comment` event. If the comment body contains `/triage-wrong`, parse
the surrounding context (the bot comment being replied to, the correction
text) and open a correction issue labeled `triage-correction`. Assign to
`@aaddrick`. Link back to both the bot comment and the source issue.

### Trigger B — reaction

`reaction` event for 👎 on a comment by `github-actions[bot]` in the triage
flow. Fetch the bot comment plus the issue body, open a correction issue
with both, assign to `@aaddrick`.

### Trigger C — human curation

Maintainer consolidates correction issues into
`.claude/triage-corrections.md`. Entries are classified by failure class:

```markdown
## Identifier hallucination — issue #442 (2026-04-19)
Bot invented `qemu` / `virt` as VM backend values. Actual values are
`kvm` / `bwrap` / `host` — see `scripts/cowork-vm-service.js:2226-2232`.
Closed-world extraction should have caught this; reviewer approved
because `evidence_quote` was from a nearby real line.

## False duplicate — issue #382 (2026-04-12)
Bot asserted fix landed in #373; it had not. Negative assertions about
issue state need diff/PR link citations.
```

This file is loaded as context by every subsequent triage run (stage 1).
The drafter and reviewer prompts instruct: "Before making a claim, check
whether a similar claim appears in the corrections file. If so, be
especially strict about evidence."

---

## Health monitoring

Telemetry emitted per run, aggregated on a rolling window:

| Metric | Alert threshold |
|--------|----------------|
| Classify confidence distribution | — (logged, not alerted) |
| Mechanical validation pass rate | > 95% rolling → validation too lax |
| Reviewer approval rate | > 70% rolling → reviewer is rubber-stamping |
| Version drift rate | > 30% → fetch-reference needs stronger pinning |
| Exclusion rate | (monitored for classify-miscalibration) |
| Comment-post rate vs. labels-only rate | (expected ~60/40; skew signals gate miscalibration) |
| Time-to-correction (bot comment → 👎) | (low = high error rate) |

Alerts open GitHub issues in the repo labeled `triage-health`.

---

## What is explicitly out of scope

- **Voice replication.** The bot speaks as a bot. No prior-art fetching of
  writing-style profiles. The disclaimer banner does not try to mimic the
  maintainer.
- **Closing issues, merging patches, assigning priority beyond label
  routing.** The bot's label scope is `triage: *` and `suggested_labels`
  from the classification schema. Priority, assignee, and milestone changes
  are manual.
- **Speculative fixes for out-of-scope categories.** Driver/hardware/kernel
  issues route to `needs-human` without investigation; no launcher-flag
  workarounds get prescribed.
- **Posting hedged low-confidence output.** If the confidence or validation
  gates fail, the comment is not posted. Labels are still applied.

---

## References

### Anthropic — agent design and review patterns

[^anthropic-framework]: [Our framework for developing safe and trustworthy
    agents](https://www.anthropic.com/news/our-framework-for-developing-safe-and-trustworthy-agents).
    Five principles for agent design; emphasizes process transparency and
    human-in-the-loop over output-level disclaimers.

[^anthropic-code-review]: [Code Review for Claude
    Code](https://claude.com/blog/code-review). Source of the "won't approve
    PRs — that's still a human call" framing pattern. Documents the
    Writer/Reviewer pattern and confidence-based surface gating.

[^anthropic-security-review]: [claude-code-security-review (GitHub
    Action)](https://github.com/anthropics/claude-code-security-review).
    Anthropic's own automated review bot for PRs. Source of
    category-exclusion and upfront limitation-disclosure patterns.

[^anthropic-best-practices]: [Best Practices for Claude
    Code](https://code.claude.com/docs/en/best-practices). Documents the
    fresh-context Writer/Reviewer pattern and confidence-scoring multi-agent
    review.

[^anthropic-autonomy]: [Measuring AI agent autonomy in
    practice](https://www.anthropic.com/research/measuring-agent-autonomy).
    User trust is earned and measurable (~20% auto-approve for novices
    rising to ~40% with experience). Motivates the conservative-framing
    choice: trust-erosion from wrong output is slow to recover.

### LLM hallucination research

[^diffray-hallucinations]: [LLM Hallucinations in AI Code Review
    (diffray)](https://diffray.ai/blog/llm-hallucinations-code-review/).
    29–45% of AI-generated code contains security vulnerabilities; ~20% of
    package recommendations reference non-existent libraries. Motivates the
    "validate proposed patches against actual source" gate.

[^lakera-hallucinations]: [LLM Hallucinations in 2026: How to Understand
    and Tackle AI's Most Persistent
    Quirk](https://www.lakera.ai/blog/guide-to-hallucinations-in-large-language-models).
    Hallucinations originate from training incentives where confident
    guessing outperforms acknowledging uncertainty. Motivates
    structural-tentativeness framing over prose hedges.

### GitHub — automated triage in production

[^github-taskflow]: [AI-supported vulnerability triage with the GitHub
    Security Lab Taskflow
    Agent](https://github.blog/security/ai-supported-vulnerability-triage-with-the-github-security-lab-taskflow-agent/).
    Production LLM-triage system. Source of the "require precise file and
    line references" and "staged verification with intermediate artifacts"
    patterns.

[^github-copilot-review]: [Responsible use of GitHub Copilot code
    review](https://docs.github.com/en/copilot/responsible-use/code-review).
    Documents the structural-tentativeness approach (suggestions require
    manual approval rather than explicit uncertainty signals) and the
    "missed issues / false positives / unreliable suggestions" disclosure
    triad.

### Other

[^triage-project]: [trIAge — LLM-powered triage bot for open
    source](https://github.com/trIAgelab/trIAge). Reference implementation of
    an open-source triage bot; useful as a comparative architecture.