mirror of
https://github.com/xroche/httrack.git
synced 2026-06-21 09:38:24 +03:00
Replace the network dependency for crawl tests with a self-contained Python stdlib server (http.server + ssl) that httrack crawls over loopback. The server binds an ephemeral port and prints it on stdout; local-crawl.sh discovers the port, substitutes the BASEURL token into the httrack arguments, runs the crawl, and audits the mirror under the discovered host-root directory.

This prototype migrates two cases off ut.httrack.com:

- 13_local-cookies.test drives the cookie chain (entrance/second/third) reimplemented as Python handlers from the old ut/cookies/*.php fixtures. A missing or wrong cookie answers 500, so a clean 3-files/0-errors run proves the cookie jar is replayed across links.
- 14_local-https.test crawls over HTTPS using a shipped long-dated self-signed cert. httrack does not verify certs, so the cert is accepted as-is and the real TLS path runs offline.

The group skips (exit 77) when python3 is missing, mirroring check-network.sh. Fixtures and the cert are listed explicitly in EXTRA_DIST (automake does not expand globs); make distcheck passes.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Signed-off-by: Xavier Roche <roche@httrack.com>
19 lines
649 B
Bash
Executable File
#!/bin/bash
#
# HTTPS crawl against the local test server, using the shipped self-signed
# cert. httrack does not verify certs (htslib.c: SSL_CTX_new with no
# SSL_CTX_set_verify), so the self-signed cert is accepted as-is and this
# exercises the real TLS path offline. basic.html links to link.html with four
# distinct query strings, each saved under a hashed name -> 5 files.

: "${top_srcdir:=..}"

if test "$HTTPS_SUPPORT" == "no"; then
    echo "no https support compiled, skipping"
    exit 77
fi

bash "$top_srcdir/tests/local-crawl.sh" --tls --errors 0 --files 5 \
    --found 'simple/basic.html' \
    httrack 'BASEURL/simple/basic.html'
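
The commit message says local-crawl.sh substitutes the BASEURL token into the httrack arguments once the server's port is known. The sketch below is one way that substitution could look in Bash; it is not taken from local-crawl.sh. The function name `substitute_baseurl`, the port value 8443, and the `127.0.0.1` host are all assumptions made for illustration.

```shell
#!/bin/bash
# Hypothetical helper (not the actual local-crawl.sh code): print each
# argument on its own line with every BASEURL token replaced by the base URL.
substitute_baseurl() {
    local base="$1"
    shift
    local arg
    for arg in "$@"; do
        # ${var//pattern/replacement} replaces all occurrences of the pattern.
        printf '%s\n' "${arg//BASEURL/$base}"
    done
}

# Assumed values: a real run would read the port from the server's stdout.
port=8443
base="https://127.0.0.1:${port}"

substitute_baseurl "$base" httrack 'BASEURL/simple/basic.html'
```

Arguments that do not contain the token, such as `httrack` itself, pass through unchanged, so the same helper could be applied to the whole command line.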