Compare commits

...

4 Commits

Author SHA1 Message Date
Xavier Roche
f987083f14 Add an opt-in pre-commit hook that auto-formats changed C lines
Enable with: git config core.hooksPath .githooks

The hook runs git-clang-format (clang-format 19, repo .clang-format) on the
staged C lines only and re-stages the result, so commits stay
clang-format-clean and the CI format check passes without a round-trip. It never
reformats the whole tree, only the lines a commit changes.

Safe by construction: if clang-format 19 is absent it skips (CI still enforces);
and if a file has both staged and unstaged changes it does not auto-mutate
(which would commit the unstaged part), it reports and asks the author to
stage/stash. HTTRACK_NO_AUTOFORMAT=1 skips it for one commit. README covers the
noexec-working-tree case (point core.hooksPath at an exec-fs copy).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-14 12:55:17 +02:00
Xavier Roche
eb565f0bd8 Merge pull request #333 from xroche/cleanup/clang-format-setup
Add a .clang-format and a changed-lines CI format check
2026-06-14 12:38:20 +02:00
Xavier Roche
71398d510e Add a .clang-format and a changed-lines CI format check
The engine predates clang-format (it was shaped by an old Visual Studio
formatter) and does not round-trip through it: a whole-tree reformat is ~25k
lines of churn, so we never do one. Instead we format only the lines a change
touches, via git-clang-format, and enforce that in CI diff-scoped.

.clang-format is reverse-engineered from src/*.c (2-space, no tabs, 80 cols,
char *x pointers, attached braces, un-indented case labels, space after C-style
casts). That is mostly LLVM defaults; the deliberate deviations are
SpaceAfterCStyleCast (the dominant "(int) x" form) and SortIncludes: false
(C include order can be significant, so never reorder).

The CI "format" job pins clang-format-19 from apt.llvm.org's noble channel
(ubuntu-24.04's native is 18) to match local dev, and fails only if a PR's
changed C lines are not clang-format-clean. Existing untouched code is left
alone.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-14 12:26:49 +02:00
Xavier Roche
75fc040f06 Merge pull request #331 from xroche/cleanup/htsbuff-builder
Add htsbuff: a bounded string builder over a fixed buffer
2026-06-14 10:40:23 +02:00
4 changed files with 193 additions and 2 deletions

27
.clang-format Normal file
View File

@@ -0,0 +1,27 @@
# clang-format 19 config for the HTTrack C engine.
#
# IMPORTANT: this is applied to TOUCHED LINES ONLY (via git-clang-format / the
# CI format check). The engine was originally formatted by GNU indent / by hand
# and does NOT round-trip through clang-format, so a whole-tree reformat is
# intentionally never done. Format the lines you change; leave the rest.
#
# Reverse-engineered from src/*.c: 2-space indent, no tabs, 80 columns, pointers
# bound to the name (char *x), attached braces, un-indented case labels, and a
# space after C-style casts ((int) x). Most of that is LLVM's defaults; the
# lines below are the deliberate deviations.
BasedOnStyle: LLVM
# Engine specifics / deviations from LLVM:
SpaceAfterCStyleCast: true # "(int) x", overwhelmingly dominant (542 vs 7)
SortIncludes: false # C include order can be significant; never reorder
IncludeBlocks: Preserve # do not merge/reflow include groups
# Stated explicitly for robustness against base-style drift (these match LLVM):
IndentWidth: 2
UseTab: Never
ColumnLimit: 80
PointerAlignment: Right
IndentCaseLabels: false
SpaceBeforeParens: ControlStatements
AllowShortIfStatementsOnASingleLine: Never

35
.githooks/README.md Normal file
View File

@@ -0,0 +1,35 @@
# Git hooks
Versioned hooks for this repo. Enable them once per clone:
```sh
git config core.hooksPath .githooks
```
## pre-commit: auto-format changed C lines
Runs `git-clang-format` (clang-format 19, using the repo `.clang-format`) on the
**staged lines only** and re-stages the result, so every commit is
clang-format-clean and the CI `format` check passes. It never reformats the
whole tree, only the lines you changed.
- Disable for a single commit: `HTTRACK_NO_AUTOFORMAT=1 git commit ...`
- If clang-format 19 isn't installed, the hook skips silently (CI still
enforces). Install it with your distro's `clang-format-19`, or from
apt.llvm.org.
- If a file has *both* staged and unstaged changes, the hook does not
auto-mutate it (that would commit the unstaged part); it instead reports
whether its staged lines need formatting and asks you to stage/stash the rest.
### noexec working trees
Git executes the hook directly, so if your working tree is on a `noexec` mount
git cannot run `.githooks/pre-commit`. Point `core.hooksPath` at a copy on an
exec filesystem instead:
```sh
mkdir -p ~/.httrack-hooks && cp .githooks/pre-commit ~/.httrack-hooks/
chmod +x ~/.httrack-hooks/pre-commit
git config core.hooksPath ~/.httrack-hooks
```
</content>

71
.githooks/pre-commit Executable file
View File

@@ -0,0 +1,71 @@
#!/usr/bin/env bash
#
# Auto-format the staged C lines with clang-format (touched lines only), then
# re-stage them, so commits stay clang-format-clean and CI's format check passes.
#
# Enable once per clone: git config core.hooksPath .githooks
# Skip for one commit: HTTRACK_NO_AUTOFORMAT=1 git commit ...
#
# Matches the CI gate (.clang-format, clang-format 19). It only ever touches the
# lines a commit changes; it never reformats the whole tree.
set -euo pipefail
[ "${HTTRACK_NO_AUTOFORMAT:-}" = "1" ] && exit 0
# Staged C/H files (added/copied/modified/renamed).
mapfile -t files < <(git diff --cached --name-only --diff-filter=ACMR -- '*.c' '*.h')
[ "${#files[@]}" -eq 0 ] && exit 0
# Locate clang-format 19 and the git driver; if absent, skip (CI is the backstop).
cf=""
for c in clang-format-19 clang-format; do
if command -v "$c" >/dev/null 2>&1; then
case "$("$c" --version)" in *"version 19."*)
cf="$c"
break
;;
esac
fi
done
gcf=""
for g in git-clang-format-19 git-clang-format; do
command -v "$g" >/dev/null 2>&1 && {
gcf="$g"
break
}
done
if [ -z "$cf" ] || [ -z "$gcf" ]; then
echo "pre-commit: clang-format 19 not found; skipping auto-format (CI still checks)." >&2
exit 0
fi
# Files that are staged AND also have unstaged changes: re-staging them would
# pull in the unstaged work, so don't auto-mutate. Check instead and let the
# author resolve it.
partial=()
for f in "${files[@]}"; do
if ! git diff --quiet -- "$f"; then partial+=("$f"); fi
done
if [ "${#partial[@]}" -ne 0 ]; then
d="$("$gcf" --binary "$cf" --style=file --staged --diff --extensions c,h || true)"
case "$d" in
"" | "no modified files to format" | *"did not modify any files"*)
exit 0
;; # staged lines already clean
*)
echo "pre-commit: these files have both staged and unstaged changes, so" >&2
echo "auto-format was skipped to avoid committing unstaged work:" >&2
printf ' %s\n' "${partial[@]}" >&2
echo "Their staged lines need formatting. Stage the rest (or stash it)," >&2
echo "or run: $gcf --binary $cf --staged" >&2
exit 1
;;
esac
fi
# Clean-staged files: format the staged lines in the working tree, then re-stage.
"$gcf" --binary "$cf" --style=file --staged --extensions c,h >/dev/null || true
git add -- "${files[@]}"
exit 0

View File

@@ -81,7 +81,65 @@ jobs:
# Lint the scripts we maintain; the legacy scripts are a separate cleanup.
- name: shellcheck
run: shellcheck man/makeman.sh tools/mkdeb.sh tests/*.test tests/check-network.sh
run: shellcheck man/makeman.sh tools/mkdeb.sh .githooks/pre-commit tests/*.test tests/check-network.sh
- name: shfmt
run: shfmt -d -i 4 man/makeman.sh tools/mkdeb.sh
run: shfmt -d -i 4 man/makeman.sh tools/mkdeb.sh .githooks/pre-commit
# Check clang-format on CHANGED LINES ONLY. The engine predates clang-format
# (it was shaped by an old Visual Studio formatter) and does not round-trip,
# so we never reformat the whole tree -- only the lines a PR touches.
format:
name: format (clang-format-19, changed lines)
if: github.event_name == 'pull_request'
runs-on: ubuntu-24.04
steps:
- uses: actions/checkout@v4
with:
fetch-depth: 0
- name: Install clang-format 19 (pinned, from apt.llvm.org)
run: |
set -euo pipefail
# ubuntu-24.04's native clang-format is 18; pin 19 to match local dev.
wget -qO- https://apt.llvm.org/llvm-snapshot.gpg.key \
| sudo tee /etc/apt/trusted.gpg.d/apt.llvm.org.asc >/dev/null
echo "deb http://apt.llvm.org/noble/ llvm-toolchain-noble-19 main" \
| sudo tee /etc/apt/sources.list.d/llvm-19.list >/dev/null
sudo apt-get update
sudo apt-get install -y --no-install-recommends clang-format-19
# git-clang-format driver, pinned to an immutable release tag (not a
# moving branch) since we curl and then execute it.
sudo curl -fsSL -o /usr/local/bin/git-clang-format \
https://raw.githubusercontent.com/llvm/llvm-project/llvmorg-19.1.7/clang/tools/clang-format/git-clang-format
sudo chmod 0755 /usr/local/bin/git-clang-format
clang-format-19 --version
- name: Check formatting of changed lines
run: |
set -euo pipefail
git fetch --no-tags origin \
"+refs/heads/${{ github.base_ref }}:refs/remotes/origin/${{ github.base_ref }}"
base="origin/${{ github.base_ref }}"
set +e
diff="$(git clang-format --binary clang-format-19 --style=file \
--diff --extensions c,h "$base")"
rc=$?
set -e
# Classify by output first: a non-empty diff means "not clean",
# regardless of the driver's exit convention (the release-tag driver
# exits 0 and signals via stdout; some packaged drivers exit 1 on a
# diff). A nonzero exit with clean output is a real checker error.
case "$diff" in
"" | "no modified files to format" | *"did not modify any files"*)
if [ "$rc" -ne 0 ]; then
echo "::error::git clang-format failed (exit $rc): checker error."
exit 1
fi
echo "Formatting OK: changed C lines are clang-format-clean." ;;
*)
echo "$diff"
echo "::error::Changed C lines are not clang-format-clean."
echo "Fix locally with: git clang-format --binary clang-format-19 $base"
exit 1 ;;
esac