Compare commits

..

2 Commits

Author SHA1 Message Date
Xavier Roche
594cf0da39 debian: override embedded-library for bundled minizip, lint under debian:sid (#419)
httrack statically links its own patched minizip (src/minizip): it carries a
zipFlush() API the system libminizip lacks, which htscache.c uses to flush the
cache .zip so an interrupted crawl leaves a valid archive, plus Android and
old-zlib portability fixes. The system library can't be substituted until that
lands upstream, so add justified lintian overrides for the resulting
embedded-library tag on libhttrack3 and proxytrack.

The tag never showed in CI because the deb job built and linted on the Ubuntu
runner, whose lintian predates the minizip fingerprint; it only fires on the
newer lintian the Debian buildds and UDD run. Build and lint the package inside
a debian:sid container instead, matching the upload target. That also keeps the
archive legal: a Debian dpkg-deb writes xz members where an Ubuntu host defaults
to zstd, which Debian's lintian rejects as a malformed deb. And being unable to
unpack a zstd member, lintian never scans the binaries the embedded-library
check reads, so the override would otherwise have looked unused.

Signed-off-by: Xavier Roche <roche@httrack.com>
Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-22 22:27:18 +02:00
Xavier Roche
3845cd1fb3 Store the DNS cache in a coucal hashtable (#420)
The resolver cache was a hand-rolled singly-linked list with a dummy head
node: O(n) lookup, O(n^2) build, and each record carried its own next
pointer plus an inline copy of the hostname key. Swap it for coucal, the
hashtable already used for the backing cache and the ready slots, keyed by
hostname with the address record as the value.

coucal owns the records (freed through a value handler on coucal_delete)
and dups the key itself, so t_dnscache sheds both its next link and its
inline iadr string and becomes a pure address record. The state field
keeps the same pointer width (t_dnscache* -> coucal), so the installed
htsopt.h layout and the ABI are unchanged.

Behaviour is identical: same -1/0/>0 lookup contract, same negative
caching, same resolve-once semantics, all under the existing
opt->state.lock (coucal is not internally serialized against the FTP/web
threads). The DNS self-test exercises the full contract black-box and
passes unchanged.

Signed-off-by: Xavier Roche <roche@httrack.com>
Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-22 21:18:53 +02:00
3 changed files with 30 additions and 10 deletions

View File

@@ -232,30 +232,42 @@ jobs:
deb:
name: deb package (lintian)
runs-on: ubuntu-24.04
# Build and gate inside Debian sid, the upload target. A Debian dpkg-deb
# produces archive-legal xz members (an Ubuntu host defaults to zstd, which
# the archive's lintian rejects), and sid's lintian carries the same
# data-driven checks (embedded-lib fingerprints and the like) the buildds and
# UDD apply -- so issues surface here instead of after upload.
container: debian:sid
steps:
- uses: actions/checkout@v6
with:
submodules: recursive
- name: Install packaging toolchain
run: |
set -euo pipefail
sudo apt-get update
sudo apt-get install -y --no-install-recommends \
apt-get update
apt-get install -y --no-install-recommends \
ca-certificates git \
build-essential autoconf automake libtool autoconf-archive \
zlib1g-dev libssl-dev \
debhelper devscripts lintian fakeroot
- uses: actions/checkout@v6
with:
submodules: recursive
# --unsigned: CI has no GPG key (also skips the release sig/checksums).
# debuild builds every package, then lintian gates on errors.
# mkdeb builds every package then runs the lintian gate (--fail-on=error,
# warning); debuild runs the packaged test pass.
#
# DEB_BUILD_OPTIONS trims work CI does not need (release builds via
# mkdeb.sh are untouched): noautodbgsym drops the -dbgsym packages whose
# LTO payloads are slow to compress and that CI never ships; parallel uses
# every core. We let debuild run its test pass -- the only one now that
# mkdeb no longer runs its own -- so CI exercises the packaged tests.
- name: Build Debian packages
# every core.
- name: Build and lint Debian packages
run: |
set -euo pipefail
# The workspace volume is owned by the host runner uid, but the
# container runs as root, so mkdeb's git calls (superproject and the
# coucal submodule) trip "dubious ownership"; mark them all safe.
git config --global --add safe.directory "*"
export DEB_BUILD_OPTIONS="noautodbgsym parallel=$(nproc)"
bash tools/mkdeb.sh --unsigned --no-release-artifacts

View File

@@ -1,3 +1,8 @@
# The shared libraries ship without a versioned symbols control file (ABI is
# tracked via the SONAME plus a >= upstream-version dependency, see debian/rules).
libhttrack3: no-symbols-control-file usr/lib/*
# Bundled, locally patched minizip (src/minizip): it adds a zipFlush() API the
# system libminizip lacks (htscache.c flushes the cache .zip so an interrupted
# crawl leaves a valid archive), plus Android/old-zlib portability fixes.
libhttrack3: embedded-library *libminizip*

3
debian/proxytrack.lintian-overrides vendored Normal file
View File

@@ -0,0 +1,3 @@
# Statically linked against httrack's bundled, patched minizip (see src/minizip
# and libhttrack3's override): the zipFlush() API is absent from the system one.
proxytrack: embedded-library *libminizip*