Stop the 412/416 partial-reget loop on continue/update (#206)

On resume, the Range request is rebuilt by back_add from a temp-ref keyed on (adr,fil) that records the partial download's real save name. A 412/416 ("Range Not Satisfiable") means that partial is stale and the whole file must be re-fetched. The handler only removed heap->sav, so when the resume pass recomputed a save name different from the temp-ref's (the default delayed-type machinery renames freely), the partial was never cleared: back_add re-sent the same Range, earned the same 416, and the link was re-recorded forever, growing the scan counter without bound.

Clear the whole partial wherever it lives -- the temp-ref and the file it points at, plus heap->sav -- so the re-record falls through to a plain full GET. Re-get only when there was a partial to discard and both Range triggers (the ref and the on-disk file) are actually gone; once they are, a fresh 416 with nothing left to drop means the whole-file GET itself failed, so the link gives up cleanly instead of re-queueing. A failed removal (read-only or full cache) also gives up rather than looping, since back_add would otherwise re-Range the surviving ref; url_savename_refname_remove now reports the removal result so the handler can tell. (The request's range_used flag would be the natural one-shot signal, but it does not survive the delayed-type two-pass, so we key off the partial instead.)

tests/20_local-resume-loop.test drives it offline: pass 1 is interrupted (SIGTERM, so the exit handler finalizes the cache and the temp-ref) to leave a partial, then pass 2 --continue gets 416 on every resume request. A portable watchdog kills pass 2 if it loops; the test asserts it terminates and attempts exactly one whole-file re-get (2 <= requests <= 8). It fails on the pre-fix handler (loops) and on a re-get that silently drops the link.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Signed-off-by: Xavier Roche <roche@httrack.com>
2026-06-29 05:26:32 +03:00 · 2026-06-24 21:06:05 +02:00
76 changed files with 2774 additions and 15624 deletions
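
The re-get rule from the commit message can be sketched as a small pure function. This is a minimal illustration and not the actual handler: the type `partial_state`, the enum `range_err_action` and the function `on_range_not_satisfiable` are hypothetical names, and the four flags stand in for whatever the real code learns from the temp-ref lookup and from the removal results.

```c
#include <stdbool.h>

/* Hypothetical outcome of handling a 412/416 on a resume request. */
typedef enum {
  RANGE_ERR_REGET,  /* partial cleared: re-record the link for a full GET */
  RANGE_ERR_GIVE_UP /* nothing to drop, or removal failed: fail the link */
} range_err_action;

/* What the handler is assumed to know; field names are illustrative only. */
typedef struct {
  bool had_ref;      /* a temp-ref existed for (adr,fil) */
  bool had_file;     /* a partial file existed on disk */
  bool ref_removed;  /* removing the temp-ref succeeded */
  bool file_removed; /* removing the partial file succeeded */
} partial_state;

/* Re-get only if there was a partial to discard and both Range triggers
   (the ref and the on-disk file) are really gone afterwards. */
static range_err_action on_range_not_satisfiable(const partial_state *p) {
  const bool had_partial = p->had_ref || p->had_file;
  const bool ref_gone = !p->had_ref || p->ref_removed;
  const bool file_gone = !p->had_file || p->file_removed;

  if (had_partial && ref_gone && file_gone)
    return RANGE_ERR_REGET;
  return RANGE_ERR_GIVE_UP;
}
```

Under this rule a second 416 on the whole-file GET finds no partial left and gives up, and a removal that fails (read-only or full cache) also gives up, so neither path can re-queue the same Range request.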
--- a/.github/workflows/ci.yml
+++ b/.github/workflows/ci.yml
@@ -61,50 +61,6 @@ jobs:
        if: failure()
        run: cat tests/test-suite.log 2>/dev/null || true

-  # Reproduce the Debian buildds: they build in a minimal chroot with no
-  # python3, so the local-server tests must SKIP (exit 77), not fail. GitHub
-  # runners ship python3, so every other job hides this path; here we remove it
-  # before `make check`. This is the guard that would have caught the 3.49.10-1
-  # FTBFS (28_local-pause failed instead of skipping when python3 was absent).
-  buildd-no-python3:
-    name: build (no python3, Debian buildd)
-    runs-on: ubuntu-24.04
-    steps:
-      - uses: actions/checkout@v6
-        with:
-          submodules: recursive
-
-      - name: Install build dependencies
-        run: |
-          set -euo pipefail
-          sudo apt-get update
-          sudo apt-get install -y --no-install-recommends \
-            build-essential autoconf automake libtool autoconf-archive \
-            zlib1g-dev libssl-dev
-
-      - name: Configure
-        run: |
-          set -euo pipefail
-          autoreconf -fi
-          ./configure
-
-      - name: Build
-        run: make -j"$(nproc)"
-
-      - name: Test without python3
-        run: |
-          set -euo pipefail
-          # Hide every python3* so `command -v python3` fails like it does in the
-          # buildd chroot; masking with /bin/false would still resolve.
-          sudo find /usr/bin /usr/local/bin -maxdepth 1 -name 'python3*' \
-            -exec mv {} {}.hidden \;
-          ! command -v python3
-          make check
-
-      - name: Print the test log on failure
-        if: failure()
-        run: cat tests/test-suite.log 2>/dev/null || true
-
  # Portability: build and test on macOS (Darwin/clang) on a native runner --
  # no VM. The tree has no __APPLE__ branches, so Darwin exercises the
  # generic-Unix path on a second libc and kernel. brew's openssl@3 is keg-only,
@@ -232,51 +188,6 @@ jobs:
        if: failure()
        run: cat tests/test-suite.log 2>/dev/null || true

-  # MemorySanitizer catches reads of uninitialized memory (#143's stack-garbage
-  # size filter) that ASan/UBSan miss. It flags any byte an uninstrumented lib
-  # wrote, so the job stays in our own code: offline self-tests only, no openssl
-  # (--disable-https), no zlib cache tests, static (the runtime is not in .so's).
-  msan:
-    name: msan (MemorySanitizer, clang)
-    runs-on: ubuntu-24.04
-    steps:
-      - uses: actions/checkout@v6
-        with:
-          submodules: recursive
-
-      - name: Install build dependencies
-        run: |
-          set -euo pipefail
-          sudo apt-get update
-          sudo apt-get install -y --no-install-recommends \
-            build-essential clang autoconf automake libtool autoconf-archive \
-            zlib1g-dev
-
-      - name: Configure (MSan, static, no https)
-        run: |
-          set -euo pipefail
-          autoreconf -fi
-          ./configure CC=clang \
-            CFLAGS="-fsanitize=memory -fsanitize-memory-track-origins=2 -fno-sanitize-recover=all -g -O1 -fno-omit-frame-pointer" \
-            LDFLAGS="-fsanitize=memory" \
-            --disable-https --disable-shared --enable-static
-
-      - name: Build
-        run: make -j"$(nproc)"
-
-      - name: Test (offline self-tests under MSan)
-        env:
-          MSAN_OPTIONS: abort_on_error=1:halt_on_error=1
-        run: |
-          set -euo pipefail
-          # Engine self-tests only; the cache trio pulls in uninstrumented zlib.
-          tests="$(cd tests && ls 01_engine-*.test | grep -v -- '-cache' | tr '\n' ' ')"
-          make check TESTS="$tests"
-
-      - name: Print the test log on failure
-        if: failure()
-        run: cat tests/test-suite.log 2>/dev/null || true
-
  # Optional-dependency build: compile and test with HTTPS/OpenSSL disabled --
  # the configuration users on minimal systems build, and one libssl is not even
  # installed here so configure cannot silently re-enable it. The matrix above
--- a/AGENTS.md
+++ b/AGENTS.md
@@ -33,9 +33,8 @@ the operational checklist: toolchain, invariants, and how to ship a change.
 - Be terse. Comment the why, in English; translate French comments you touch.
 - Strip AI tells from prose (em-dash overuse, rule-of-three, filler, vague
  attributions). Ref: Wikipedia "Signs of AI writing". Claude Code: `/humanizer`.
- Behavior change → add a test. Fast path: a hidden `httrack -#test=NAME` engine
-  self-test (registry in `htsselftest.c`; `-#test` lists them) driven by a
-  `tests/NN_*.test`, over a slow crawl.
+- Behavior change → add a test. Fast path: a hidden `httrack -#N` debug
+  subcommand (`htscoremain.c`) driven by a `tests/NN_*.test`, over a slow crawl.

 ## Review your change adversarially (strongly suggested)
 Before pushing, and when reviewing others, don't skim for bugs:
--- a/CONTRIBUTING.md
+++ b/CONTRIBUTING.md
@@ -39,10 +39,6 @@ Welcome, and nothing to disclose. Two rules:

 The sign-off covers AI-assisted code too.

-## Translations
-
-Interface strings live in [`lang/`](lang/). See [lang/README.md](lang/README.md) for the file format and how to add or update a language.
-
 ## Bugs

 Open an issue with the version, OS, command used, and expected vs actual result.
--- a/configure.ac
+++ b/configure.ac
@@ -1,6 +1,6 @@
 AC_PREREQ([2.71])

-AC_INIT([httrack], [3.49.10], [roche+packaging@httrack.com], [httrack], [http://www.httrack.com/])
+AC_INIT([httrack], [3.49.9], [roche+packaging@httrack.com], [httrack], [http://www.httrack.com/])
 AC_COPYRIGHT([
 HTTrack Website Copier, Offline Browser for Windows and Unix
 Copyright (C) 1998-2015 Xavier Roche and other contributors
@@ -29,10 +29,10 @@ AC_CONFIG_SRCDIR(src/httrack.c)
 AC_CONFIG_MACRO_DIR([m4])
 AC_CONFIG_HEADERS(config.h)
 AM_INIT_AUTOMAKE([subdir-objects])
-# 3:2:0: 3.49.10 only appends tail fields to the options struct (no existing
-# symbol or offset changed vs 3.49.9), so it stays soname .so.3; bump revision.
-# (3:0:0 was the htsblk mime-buffer widening, the ABI break that moved .so.2 -> .so.3.)
-VERSION_INFO="3:2:0"
+# 3:1:0: 3.49.9 changed code but not the exported interface vs 3.49.8 (same 164
+# symbols, no struct-layout change), so bump revision only. (3:0:0 was the htsblk
+# mime-buffer widening, an ABI break that moved the soname .so.2 -> .so.3.)
+VERSION_INFO="3:1:0"
 AM_MAINTAINER_MODE
 AC_USE_SYSTEM_EXTENSIONS

--- a/debian/changelog
+++ b/debian/changelog
@@ -1,16 +1,3 @@
-httrack (3.49.10-1) unstable; urgency=medium
-
-  * New upstream release: new download-pacing and URL-handling options plus a
-    batch of crawl and robustness fixes (full list in history.txt).
-  * Rewrite debian/copyright in machine-readable DEP-5 format, crediting the
-    bundled minizip, md5 and coucal sources (#415).
-  * Lead the webhttrack browser dependency with chromium so httrack is not
-    dragged into the firefox-esr autoremoval cascade (#436).
-  * Override the embedded-library lint for the bundled minizip (#419).
-  * Bump Standards-Version to 4.7.4 (no changes required).
-
- -- Xavier Roche <xavier@debian.org>  Sun, 28 Jun 2026 14:01:53 +0200
-
 httrack (3.49.9-1) unstable; urgency=medium

  * New upstream release: Content-Type and file-type detection fixes (trust a
--- a/debian/control
+++ b/debian/control
@@ -2,7 +2,7 @@ Source: httrack
 Section: web
 Priority: optional
 Maintainer: Xavier Roche <roche@httrack.com>
-Standards-Version: 4.7.4
+Standards-Version: 4.7.0
 Build-Depends: debhelper-compat (= 13), autoconf, autoconf-archive, automake, libtool, zlib1g-dev, libssl-dev
 Rules-Requires-Root: no
 Homepage: http://www.httrack.com
@@ -30,7 +30,7 @@ Description: Copy websites to your computer (Offline browser)
 Package: webhttrack
 Architecture: any
 Multi-Arch: foreign
-Depends: ${misc:Depends}, ${shlibs:Depends}, webhttrack-common, sensible-utils, chromium | firefox-esr | www-browser
+Depends: ${misc:Depends}, ${shlibs:Depends}, webhttrack-common, sensible-utils, firefox-esr | chromium | www-browser
 Replaces: webhttrack-common (<< 3.43.9-2)
 Breaks: webhttrack-common (<< 3.43.9-2)
 Suggests: httrack, httrack-doc
--- a/history.txt
+++ b/history.txt
@@ -4,25 +4,7 @@ HTTrack Website Copier release history:

 This file lists all changes and fixes that have been made for HTTrack

-3.49-10
-+ New: --cookies-file to preload a Netscape cookies.txt before crawling (#215)
-+ New: --pause to space out file downloads by a random delay (#185)
-+ New: --strip-query to drop selected query keys from the dedup naming (#112)
-+ Changed: split the -%u URL hacks into independent --keep-www-prefix, --keep-double-slashes and --keep-query-order toggles (#271)
-+ Fixed: follow a redirect Location after dropping its #fragment, instead of requesting the fragment and polluting the saved name (#204)
-+ Fixed: escaped brackets inside a *[...] filter character class (#148)
-+ Fixed: honor the server's Content-Range when resuming a partial download, instead of appending overlapping bytes (#198)
-+ Fixed: abort the download as soon as the response type is excluded by -mime:, instead of fetching then discarding the body (#58)
-+ Fixed: keep size-based filter rules neutral until the file size is known (#143)
-+ Fixed: stop the mirror with a clean fatal error on a cache write failure, instead of crashing (#174, #219)
-+ Fixed: stop the 412/416 partial re-get loop on --continue and --update (#206)
-+ Fixed: keep an unrecognized URL tail instead of mangling it to .html (#115)
-+ Fixed: honor --tolerant (-%B) on a broken Content-Length, and fix an out-of-bounds read it exposed (#32, #41)
-+ Fixed: fall back to the next resolved address when a connection fails or stalls, instead of hanging on a dead IPv6 address
-+ Fixed: report why a -%L URL list could not be loaded (#49)
-+ Changed: multiple internal hardening, build and CI improvements
-
-.49-9
+3.49-9
 + Fixed: file-type detection from the Content-Type header: trust a declared type over a binary URL extension, honor --assume under the delayed type check, and keep a known extension against a bogus or empty Content-Type (#267, #29, #56)
 + Fixed: an uninitialized-buffer read when the Content-Type is empty (#411)
 + Fixed: restored C++ source-compatibility of the installed headers so reverse dependencies (httraqt) build again (#413)
--- a/html/filters.html
+++ b/html/filters.html
@@ -247,7 +247,7 @@ See also: The <a href="faq.html#VF1">FAQ</a><br>
        <td>the \ character</td>
      </tr>
      <tr>
-        <td nowrap><tt>*[\[,\]]</tt></td>
+        <td nowrap><tt>*[\[\]]</tt></td>
        <td>the [ or ] character</td>
      </tr>
      <tr>
--- a/lang/English.txt
+++ b/lang/English.txt
@@ -295,7 +295,7 @@ Max Depth
 Maximum external depth:
 Maximum external depth:
 Filters (refuse/accept links) :
-Filters (refuse/accept links):
+Filters (refuse/accept links) :
 Paths
 Paths
 Save prefs
--- a/lang/README.md
+++ b/lang/README.md
@@ -1,37 +0,0 @@
-# Translating HTTrack
-
-Interface strings live here, one `.txt` file per language. `English.txt` is the reference: every other file maps each English string to its translation.
-
-## File format
-
-Plain text, entries in consecutive pairs of lines:
-
-```
-<English string>
-<translation>
-```
-
-The first line of a pair is the lookup key and must stay identical to the one in `English.txt`; translate only the second line. Missing entries fall back to the English text at runtime, so a partial translation works.
-
-Preserve any `\r\n`, `\t` and `printf` placeholders (`%s`, `%d`, ...) in the translation.
-
-A few `LANGUAGE_*` entries at the top describe the file itself:
-
-| Key | Meaning |
-| --- | --- |
-| `LANGUAGE_NAME` | Name shown in the language picker, in its own language (`Deutsch`, not `German`) |
-| `LANGUAGE_ISO` | ISO 639 code, with region if needed (`de`, `pt_BR`) |
-| `LANGUAGE_CHARSET` | Encoding the file is saved in (`ISO-8859-1`, `windows-1251`, `UTF-8`, ...) |
-| `LANGUAGE_AUTHOR` | Your name and contact |
-| `LANGUAGE_WINDOWSID` | Windows locale name used by WinHTTrack (`German (Standard)`) |
-
-Save the file in exactly its declared `LANGUAGE_CHARSET`; an editor that rewrites it as UTF-8 will corrupt the non-ASCII bytes.
-
-## Adding or updating a language
-
-1. Copy `English.txt` to `<Language>.txt`, or edit the existing file.
-2. Translate each second line; leave the English keys untouched.
-3. Fill in the `LANGUAGE_*` header for a new file.
-4. Open a pull request, or attach the file to a GitHub issue.
-
-When new strings land in `English.txt` they show up untranslated (as English) until a translator fills them in.
--- a/man/httrack.1
+++ b/man/httrack.1
@@ -3,7 +3,7 @@
 .\"
 .\" This file is generated by man/makeman.sh; do not edit by hand.
 .\" SPDX-License-Identifier: GPL-3.0-or-later
-.TH httrack 1 "27 June 2026" "httrack website copier"
+.TH httrack 1 "13 June 2026" "httrack website copier"
 .SH NAME
 httrack \- offline browser : copy websites to a local directory
 .SH SYNOPSIS
@@ -24,7 +24,6 @@ httrack \- offline browser : copy websites to a local directory
 [ \fB\-EN, \-\-max\-time[=N]\fR ]
 [ \fB\-AN, \-\-max\-rate[=N]\fR ]
 [ \fB\-%cN, \-\-connection\-per\-second[=N]\fR ]
-[ \fB\-%G, \-\-pause\fR ]
 [ \fB\-GN, \-\-max\-pause[=N]\fR ]
 [ \fB\-cN, \-\-sockets[=N]\fR ]
 [ \fB\-TN, \-\-timeout[=N]\fR ]
@@ -44,13 +43,11 @@ httrack \- offline browser : copy websites to a local directory
 [ \fB\-x, \-\-replace\-external\fR ]
 [ \fB\-%x, \-\-disable\-passwords\fR ]
 [ \fB\-%q, \-\-include\-query\-string\fR ]
-[ \fB\-%g, \-\-strip\-query\fR ]
 [ \fB\-o, \-\-generate\-errors\fR ]
 [ \fB\-X, \-\-purge\-old[=N]\fR ]
 [ \fB\-%p, \-\-preserve\fR ]
 [ \fB\-%T, \-\-utf8\-conversion\fR ]
 [ \fB\-bN, \-\-cookies[=N]\fR ]
-[ \fB\-%K, \-\-cookies\-file\fR ]
 [ \fB\-u, \-\-check\-type[=N]\fR ]
 [ \fB\-j, \-\-parse\-java[=N]\fR ]
 [ \fB\-sN, \-\-robots[=N]\fR ]
@@ -156,8 +153,6 @@ maximum mirror time in seconds (60=1 minute, 3600=1 hour) (\-\-max\-time[=N])
 maximum transfer rate in bytes/seconds (1000=1KB/s max) (\-\-max\-rate[=N])
 .IP \-%cN
 maximum number of connections/seconds (*%c10) (\-\-connection\-per\-second[=N])
-.IP \-%G
-random pause of MIN[:MAX] seconds between files (e.g. %G5:10) (\-\-pause <param>)
 .IP \-GN
 pause transfer if N bytes reached, and wait until lock file is deleted (\-\-max\-pause[=N])
 .SS Flow control:
@@ -203,8 +198,6 @@ replace external html links by error pages (\-\-replace\-external)
 do not include any password for external password protected websites (%x0 include) (\-\-disable\-passwords)
 .IP \-%q
 *include query string for local files (useless, for information purpose only) (%q0 don't include) (\-\-include\-query\-string)
-.IP \-%g
-strip query keys for dedup ([host/pattern=]key1,key2,...) (\-\-strip\-query <param>)
 .IP \-o
 *generate output html file in case of error (404..) (o0 don't generate) (\-\-generate\-errors)
 .IP \-X
@@ -216,8 +209,6 @@ links conversion to UTF\-8 (\-\-utf8\-conversion)
 .SS Spider options:
 .IP \-bN
 accept cookies in cookies.txt (0=do not accept,* 1=accept) (\-\-cookies[=N])
-.IP \-%K
-load extra cookies from a Netscape cookies.txt (\-\-cookies\-file <param>)
 .IP \-u
 check document type if unknown (cgi,asp..) (u0 don't check, * u1 check but /, u2 check always) (\-\-check\-type[=N])
 .IP \-j
@@ -234,8 +225,6 @@ tolerant requests (accept bogus responses on some servers, but not standard!) (\
 update hacks: various hacks to limit re\-transfers when updating (identical size, bogus response..) (\-\-updatehack)
 .IP \-%u
 url hacks: various hacks to limit duplicate URLs (strip //, www.foo.com==foo.com..) (\-\-urlhack)
-.br
-opt out of one url\-hack part: \-\-keep\-www\-prefix (www.foo.com<>foo.com), \-\-keep\-double\-slashes (//), \-\-keep\-query\-order (?b&a)
 .IP \-%A
 assume that a type (cgi,asp..) is always linked with a mime type (\-%A php3,cgi=text/html;dat,bin=application/x\-zip) (\-\-assume <param>)
 .br
@@ -324,8 +313,12 @@ debug HTTP headers in logfile (\-\-debug\-headers)
 .SS Guru options: (do NOT use if possible)
 .IP \-#X
 *use optimized engine (limited memory boundary checks) (\-\-fast\-engine)
-.IP \-#test
-list engine self\-tests (run one with \-#test=NAME [args])
+.IP \-#0
+filter test (\-#0 '*.gif' 'www.bar.com/foo.gif') (\-\-debug\-testfilters <param>)
+.IP \-#1
+simplify test (\-#1 ./foo/bar/../foobar)
+.IP \-#2
+type test (\-#2 /foo/bar.php)
 .IP \-#C
 cache list (\-#C '*.com/spider*.gif' (\-\-debug\-cache <param>)
 .IP \-#R
--- a/src/Makefile.am
+++ b/src/Makefile.am
@@ -56,7 +56,7 @@ whttrackrundir = $(bindir)
 whttrackrun_SCRIPTS = webhttrack

 libhttrack_la_SOURCES =  htscore.c htsparse.c htsback.c htscache.c \
-	htscache_selftest.c htsdns_selftest.c htsselftest.c \
+	htscache_selftest.c htsdns_selftest.c \
 	htscatchurl.c htsfilters.c htsftp.c htshash.c coucal/coucal.c \
 	htshelp.c htslib.c htscoremain.c \
 	htsname.c htsrobots.c htstools.c htswizard.c \
@@ -66,7 +66,7 @@ libhttrack_la_SOURCES =  htscore.c htsparse.c htsback.c htscache.c \
 	md5.c \
 	minizip/ioapi.c minizip/mztools.c minizip/unzip.c minizip/zip.c \
 	hts-indextmpl.h htsalias.h htsback.h htsbase.h htssafe.h \
-	htsbasenet.h htsbauth.h htscache.h htscache_selftest.h htsdns_selftest.h htsselftest.h htscatchurl.h  \
+	htsbasenet.h htsbauth.h htscache.h htscache_selftest.h htsdns_selftest.h htscatchurl.h  \
 	htsconfig.h htscore.h htsparse.h htscoremain.h htsdefines.h  \
 	htsfilters.h htsftp.h htsglobal.h htshash.h coucal/coucal.h \
 	htshelp.h htsindex.h htslib.h htsmd5.h \
--- a/src/htsalias.c
+++ b/src/htsalias.c
@@ -60,9 +60,6 @@ Please visit our Website: http://www.httrack.com
  param1 : this option must be alone, and needs one distinct parameter (-P <path>)
  param0 : this option must be alone, but the parameter should be put together (+*.gif)
 */
-/* clang-format off: hand-aligned table; clang-format reflows the whole
-   initializer (2->4 space) on any edit, churning every untouched row. */
-/* clang-format off */
 const char *hts_optalias[][4] = {
  /*   {"","","",""}, */
  {"path", "-O", "param1", "output path"},
@@ -110,12 +107,6 @@ const char *hts_optalias[][4] = {
  {"disable-passwords", "-%x", "single", ""}, {"disable-password", "-%x",
                                               "single", ""},
  {"include-query-string", "-%q", "single", ""},
-  {"strip-query", "-%g", "param1",
-   "strip [host/pattern=]key1,key2,... from URLs"},
-  {"cookies-file", "-%K", "param1",
-   "load extra cookies from a Netscape cookies.txt"},
-  {"pause", "-%G", "param1",
-   "random pause of MIN[:MAX] seconds between files"},
  {"generate-errors", "-o", "single", ""},
  {"do-not-generate-errors", "-o0", "single", ""},
  {"purge-old", "-X", "param", ""},
@@ -132,9 +123,6 @@ const char *hts_optalias[][4] = {
  {"tolerant", "-%B", "single", ""},
  {"updatehack", "-%s", "single", ""}, {"sizehack", "-%s", "single", ""},
  {"urlhack", "-%u", "single", ""},
-  {"keep-www-prefix", "-%j", "single", ""},
-  {"keep-double-slashes", "-%o", "single", ""},
-  {"keep-query-order", "-%y", "single", ""},
  {"user-agent", "-F", "param1", "user-agent identity"},
  {"referer", "-%R", "param1", "default referer URL"},
  {"from", "-%E", "param1", "from email address"},
@@ -253,7 +241,6 @@ const char *hts_optalias[][4] = {

  {"", "", "", ""}
 };
-/* clang-format on */

 /* 
  Check for alias in command-line 
--- a/src/htsback.c
+++ b/src/htsback.c
@@ -57,10 +57,7 @@ Please visit our Website: http://www.httrack.com
 // DOS
 #include <process.h>            /* _beginthread, _endthread */
 #endif
-#include <io.h> /* _chsize_s */
-#define HTS_FTRUNCATE(fp, sz) _chsize_s(_fileno(fp), (sz))
 #else
-#define HTS_FTRUNCATE(fp, sz) ftruncate(fileno(fp), (sz))
 #endif

 #define VT_CLREOL       "\33[K"
@@ -3766,27 +3763,7 @@ void back_wait(struct_back * sback, httrackp * opt, cache_back * cache,
                    }
 #endif
 /********** **************************** ********** */
-                  }
-                  // MIME type excluded by a -mime: filter: abort, don't fetch
-                  // the body (#58)
-                  else if (HTTP_IS_OK(back[i].r.statuscode) &&
-                           !back[i].testmode &&
-                           strnotempty(back[i].r.contenttype) &&
-                           hts_acceptmime(opt, 0, back[i].url_adr,
-                                          back[i].url_fil,
-                                          back[i].r.contenttype) == 1) {
-                    deletehttp(&back[i].r);
-                    back[i].r.soc = INVALID_SOCKET;
-                    back[i].status = STATUS_READY;
-                    back_set_finished(sback, i);
-                    back[i].r.statuscode = STATUSCODE_EXCLUDED;
-                    strcpybuff(back[i].r.msg, "Excluded by MIME type filter");
-                    hts_log_print(
-                        opt, LOG_NOTICE,
-                        "File excluded by MIME type filter (%s): %s%s",
-                        back[i].r.contenttype, back[i].url_adr,
-                        back[i].url_fil);
-                  } else { // il faut aller le chercher
+                  } else {      // il faut aller le chercher

                    // effacer buffer (requète)
                    if (!noFreebuff) {
@@ -3797,70 +3774,35 @@ void back_wait(struct_back * sback, httrackp * opt, cache_back * cache,
                    // xxc SI CHUNK VERIFIER QUE CA MARCHE??
                    if (back[i].r.statuscode == 206) {  // on nous envoie un morceau (la fin) coz une partie sur disque!
                      off_t sz = fsize_utf8(back[i].url_sav);
-                      /* RFC 7233: resume at the server's Content-Range start,
-                         not the offset we requested; a server may resume
-                         earlier and appending the overlap duplicates bytes
-                         (#198). */
-                      const LLint resume = back[i].r.crange_start;
-                      const hts_boolean range_ok =
-                          back[i].r.crange > 0 && resume >= 0 &&
-                          resume <= (LLint) sz &&
-                          back[i].r.crange_end + 1 == back[i].r.crange &&
-                          (back[i].r.totalsize < 0 ||
-                           back[i].r.totalsize ==
-                               back[i].r.crange_end - resume + 1);

 #if HDEBUG
                      printf("partial content: " LLintP " on disk..\n",
                             (LLint) sz);
 #endif
-                      if (sz >= 0 && range_ok) {
+                      if (sz >= 0) {
                        if (!is_hypertext_mime(opt, back[i].r.contenttype, back[i].url_sav)) {  // pas HTML
                          if (opt->getmode & HTS_GETMODE_NONHTML) {
                            filenote(&opt->state.strc, back[i].url_sav, NULL);  // noter fichier comme connu
                            file_notify(opt, back[i].url_adr, back[i].url_fil,
                                        back[i].url_sav, 0, 1,
                                        back[i].r.notmodified);
-                            back[i].r.out =
-                                FOPEN(fconv(catbuff, sizeof(catbuff),
-                                            back[i].url_sav),
-                                      "r+b"); // resume in place
+                            back[i].r.out = FOPEN(fconv(catbuff, sizeof(catbuff), back[i].url_sav), "ab");       // append
                            if (back[i].r.out && opt->cache != 0) {
-                              back[i].r.is_write = 1;
-                              back[i].r.size = resume; // bytes already on disk
-                              back[i].r.statuscode = HTTP_OK; // force 'OK'
+                              back[i].r.is_write = 1;   // écrire
+                              back[i].r.size = sz;      // déja écrit
+                              back[i].r.statuscode = HTTP_OK;   // Forcer 'OK'
                              if (back[i].r.totalsize >= 0)
-                                back[i].r.totalsize += resume; // -> full size
-                              // drop bytes past the resume point; a silent
-                              // failure could leave a stale tail, so on error
-                              // drop the partial and refetch the whole file
-                              if (HTS_FTRUNCATE(back[i].r.out,
-                                                (off_t) resume) != 0) {
-                                fclose(back[i].r.out);
-                                back[i].r.out = NULL;
-                                url_savename_refname_remove(
-                                    opt, back[i].url_adr, back[i].url_fil);
-                                UNLINK(back[i].url_sav);
-                                back[i].status = STATUS_READY;
-                                back_set_finished(sback, i);
-                                strcpybuff(back[i].r.msg,
-                                           "Can not truncate partial file, "
-                                           "restarting");
-                              } else {
-                                fseeko(back[i].r.out, (off_t) resume, SEEK_SET);
-                                /* create a temporary reference file in case of
-                                 * broken mirror */
-                                if (back_serialize_ref(opt, &back[i]) != 0) {
-                                  hts_log_print(opt, LOG_WARNING,
-                                                "Could not create temporary "
-                                                "reference file for %s%s",
-                                                back[i].url_adr,
-                                                back[i].url_fil);
-                                }
-#if HDEBUG
-                                printf("continue interrupted file\n");
-#endif
+                                back[i].r.totalsize += sz;      // plus en fait
+                              fseek(back[i].r.out, 0, SEEK_END);        // à la fin
+                              /* create a temporary reference file in case of broken mirror */
+                              if (back_serialize_ref(opt, &back[i]) != 0) {
+                                hts_log_print(opt, LOG_WARNING,
+                                              "Could not create temporary reference file for %s%s",
+                                              back[i].url_adr, back[i].url_fil);
                              }
+#if HDEBUG
+                              printf("continue interrupted file\n");
+#endif
                            } else {    // On est dans la m**
                              back[i].status = STATUS_READY;    // terminé (voir plus loin)
                              back_set_finished(sback, i);
@@ -3872,18 +3814,17 @@ void back_wait(struct_back * sback, httrackp * opt, cache_back * cache,
                          FILE *fp =
                            FOPEN(fconv(catbuff, sizeof(catbuff), back[i].url_sav), "rb");
                          if (fp) {
-                            LLint alloc_mem = resume + 1;
+                            LLint alloc_mem = sz + 1;

                            if (back[i].r.totalsize >= 0)
                              alloc_mem += back[i].r.totalsize; // AJOUTER RESTANT!
                            if (deleteaddr(&back[i].r)
                                && (back[i].r.adr =
                                    (char *) malloct((size_t) alloc_mem))) {
-                              back[i].r.size = resume;
+                              back[i].r.size = sz;
                              if (back[i].r.totalsize >= 0)
-                                back[i].r.totalsize += resume; // -> full size
-                              if ((fread(back[i].r.adr, 1, (size_t) resume,
-                                         fp)) != (size_t) resume) {
+                                back[i].r.totalsize += sz;      // plus en fait
+                              if ((fread(back[i].r.adr, 1, sz, fp)) != sz) {
                                back[i].status = STATUS_READY;  // terminé (voir plus loin)
                                back_set_finished(sback, i);
                                strcpybuff(back[i].r.msg,
@@ -3901,30 +3842,14 @@ void back_wait(struct_back * sback, httrackp * opt, cache_back * cache,
                                         "No memory for partial file");
                            }
                            fclose(fp);
-                          } else {                              // open failed
+                          } else {      // Argh.. 
                            back[i].status = STATUS_READY;      // terminé (voir plus loin)
                            back_set_finished(sback, i);
                            strcpybuff(back[i].r.msg,
                                       "Can not open partial file");
                          }
                        }
-                      } else if (sz >=
-                                 0) { // unusable range -> restart whole file
-                        hts_log_print(opt, LOG_WARNING,
-                                      "Unusable partial-content range for %s%s "
-                                      "(have " LLintP " bytes, got " LLintP
-                                      "-" LLintP "/" LLintP "), restarting",
-                                      back[i].url_adr, back[i].url_fil,
-                                      (LLint) sz, back[i].r.crange_start,
-                                      back[i].r.crange_end, back[i].r.crange);
-                        url_savename_refname_remove(opt, back[i].url_adr,
-                                                    back[i].url_fil);
-                        UNLINK(back[i].url_sav);
-                        back[i].status = STATUS_READY;
-                        back_set_finished(sback, i);
-                        strcpybuff(back[i].r.msg,
-                                   "Unusable partial content, restarting");
-                      } else {                          // partial not found
+                      } else {  // Not found??
                        back[i].status = STATUS_READY;  // finished (see below)
                        back_set_finished(sback, i);
                        strcpybuff(back[i].r.msg, "Can not find partial file");
@@ -4005,6 +3930,7 @@ void back_wait(struct_back * sback, httrackp * opt, cache_back * cache,

                      }
                    }
+
                  }

                  /*} */
--- a/src/htsbasenet.h
+++ b/src/htsbasenet.h
@@ -146,8 +146,7 @@ typedef enum BackStatusCode {
  STATUSCODE_NON_FATAL = -5,
  STATUSCODE_SSL_HANDSHAKE = -6,
  STATUSCODE_TOO_BIG = -7,
-  STATUSCODE_TEST_OK = -10,
-  STATUSCODE_EXCLUDED = -11 /* aborted: MIME excluded by a -mime: filter */
+  STATUSCODE_TEST_OK = -10
 } BackStatusCode;

 /** HTTrack status ('status' member of 'lien_back') **/
--- a/src/htsbasiccharsets.sh
+++ b/src/htsbasiccharsets.sh
@@ -3,12 +3,12 @@

 # Change this to download files
 if false; then
-    echo "mget https://www.unicode.org/Public/MAPPINGS/ISO8859/8859-*.TXT" | lftp
-    echo "mget https://www.unicode.org/Public/MAPPINGS/VENDORS/MICSFT/PC/CP*.TXT" | lftp
-    echo "mget https://www.unicode.org/Public/MAPPINGS/VENDORS/MICSFT/WINDOWS/CP*.TXT" | lftp
-    echo "mget https://www.unicode.org/Public/MAPPINGS/VENDORS/MICSFT/EBCDIC/CP*.TXT" | lftp
-    echo "mget https://www.unicode.org/Public/MAPPINGS/VENDORS/MISC/CP*.TXT" | lftp
-    echo "mget https://www.unicode.org/Public/MAPPINGS/VENDORS/MISC/KOI8*.TXT" | lftp
+    echo "mget ftp://ftp.unicode.org/Public/MAPPINGS/ISO8859/8859-*.TXT" | lftp
+    echo "mget ftp://ftp.unicode.org/Public/MAPPINGS/VENDORS/MICSFT/PC/CP*.TXT" | lftp
+    echo "mget ftp://ftp.unicode.org/Public/MAPPINGS/VENDORS/MICSFT/WINDOWS/CP*.TXT" | lftp
+    echo "mget ftp://ftp.unicode.org/Public/MAPPINGS/VENDORS/MICSFT/EBCDIC/CP*.TXT" | lftp
+    echo "mget ftp://ftp.unicode.org/Public/MAPPINGS/VENDORS/MISC/CP*.TXT" | lftp
+    echo "mget ftp://ftp.unicode.org/Public/MAPPINGS/VENDORS/MISC/KOI8*.TXT" | lftp
    rm -f CP932.TXT CP936.TXT CP949.TXT CP950.TXT
 fi

--- a/src/htscache.c
+++ b/src/htscache.c
@@ -220,25 +220,6 @@ struct cache_back_zip_entry {
 	} \
 } while(0)

-/* A cache (new.zip) write failed: storage is gone (disk full / dropped share),
-   so the mirror is doomed too. Abort it via exit_xh, don't crash as assertf
-   did. */
-static void cache_zip_write_failed(httrackp *opt, cache_back *cache,
-                                   const char *what, int zErr) {
-  if (!cache->zipWriteFailed) {
-    cache->zipWriteFailed = HTS_TRUE;
-    if (check_fatal_io_errno()) {
-      hts_log_print(opt, LOG_ERROR,
-                    "Mirror aborted: disk full or filesystem problems");
-    } else {
-      hts_log_print(opt, LOG_ERROR,
-                    "Mirror aborted: cache write failed (%s): %s", what,
-                    hts_get_zerror(zErr));
-    }
-  }
-  opt->state.exit_xh = -1; /* fatal: stop the mirror, exit non-zero */
-}
-
 /* Add a file to the cache */
 void cache_add(httrackp * opt, cache_back * cache, const htsblk * r,
               const char *url_adr, const char *url_fil, const char *url_save,
@@ -255,10 +236,6 @@ void cache_add(httrackp * opt, cache_back * cache, const htsblk * r,
  const char *url_save_suffix = url_save;
  int zErr;

-  /* already failed and aborting; don't touch the broken stream again */
-  if (cache->zipWriteFailed)
-    return;
-
  // robots.txt hack
  if (url_save == NULL) {
    dataincache = 0;            // testing links
@@ -369,8 +346,9 @@ void cache_add(httrackp * opt, cache_back * cache, const htsblk * r,
                                   */
                                  headers, (uInt) strlen(headers), NULL, 0, NULL,       /* comment */
                                  Z_DEFLATED, Z_DEFAULT_COMPRESSION)) != Z_OK) {
-    cache_zip_write_failed(opt, cache, "opening a cache entry", zErr);
-    return;
+    int zip_zipOpenNewFileInZip_failed = 0;
+
+    assertf(zip_zipOpenNewFileInZip_failed);
  }

  /* Write data in cache */
@@ -380,8 +358,9 @@ void cache_add(httrackp * opt, cache_back * cache, const htsblk * r,
        if ((zErr =
             zipWriteInFileInZip((zipFile) cache->zipOutput, r->adr,
                                 (int) r->size)) != Z_OK) {
-          cache_zip_write_failed(opt, cache, "writing to the cache", zErr);
-          return;
+          int zip_zipWriteInFileInZip_failed = 0;
+
+          assertf(zip_zipWriteInFileInZip_failed);
        }
      }
    } else {
@@ -402,10 +381,9 @@ void cache_add(httrackp * opt, cache_back * cache, const htsblk * r,
              if ((zErr =
                   zipWriteInFileInZip((zipFile) cache->zipOutput, buff,
                                       (int) nl)) != Z_OK) {
-                cache_zip_write_failed(opt, cache, "writing to the cache",
-                                       zErr);
-                fclose(fp);
-                return;
+                int zip_zipWriteInFileInZip_failed = 0;
+
+                assertf(zip_zipWriteInFileInZip_failed);
              }
            }
          } while(nl > 0);
@@ -419,14 +397,16 @@ void cache_add(httrackp * opt, cache_back * cache, const htsblk * r,

  /* Close */
  if ((zErr = zipCloseFileInZip((zipFile) cache->zipOutput)) != Z_OK) {
-    cache_zip_write_failed(opt, cache, "closing a cache entry", zErr);
-    return;
+    int zip_zipCloseFileInZip_failed = 0;
+
+    assertf(zip_zipCloseFileInZip_failed);
  }

  /* Flush */
  if ((zErr = zipFlush((zipFile) cache->zipOutput)) != 0) {
-    cache_zip_write_failed(opt, cache, "flushing the cache", zErr);
-    return;
+    int zip_zipFlush_failed = 0;
+
+    assertf(zip_zipFlush_failed);
  }
 }

--- a/src/htscache_selftest.c
+++ b/src/htscache_selftest.c
@@ -47,7 +47,6 @@ Please visit our Website: http://www.httrack.com
 #include "htslib.h"
 #include "htszlib.h"

-#include <errno.h>
 #include <stdio.h>
 #include <string.h>

@@ -317,136 +316,6 @@ static int disk_fallback_selftest(httrackp *opt) {
  return fail;
 }

-typedef struct {
-  size_t budget;  /**< bytes allowed through before writes start failing */
-  int fail_errno; /**< errno set on the failing write (ENOSPC, EIO, ...) */
-  int writes;     /**< zwrite call count, to detect re-entry into the stream */
-} writefail_inject;
-
-/* zwrite that copies until the budget runs out, then fails with inj->fail_errno
-   (the #174/#219 condition). Counts calls so the test can prove a flagged cache
-   never re-enters the stream. */
-static uLong selftest_failing_zwrite(voidpf opaque, voidpf stream,
-                                     const void *buf, uLong size) {
-  writefail_inject *inj = (writefail_inject *) opaque;
-
-  inj->writes++;
-  if (inj->budget >= (size_t) size) {
-    inj->budget -= (size_t) size;
-    return (uLong) fwrite(buf, 1, (size_t) size, (FILE *) stream);
-  }
-  errno = inj->fail_errno;
-  return 0; /* short write -> the minizip op returns an error */
-}
-
-/* Open a ZIP whose writes fail past inj->budget, so cache_add() hits an error.
- */
-static zipFile selftest_open_failing_zip(const char *path,
-                                         writefail_inject *inj) {
-  zlib_filefunc_def ff;
-
-  fill_fopen_filefunc(&ff); /* real fopen/read/seek/close; ignores opaque */
-  ff.zwrite_file = selftest_failing_zwrite;
-  ff.opaque = inj;
-  return zipOpen2(path, APPEND_STATUS_CREATE, NULL, &ff);
-}
-
-/* Store one octet-stream body into `cache` (all-in-cache, body in the ZIP). */
-static void writefail_store(httrackp *opt, cache_back *cache, const char *fil,
-                            const char *body, size_t body_len) {
-  htsblk r;
-  char locbuf[4];
-  char *bodycopy = malloct(body_len);
-
-  hts_init_htsblk(&r);
-  r.statuscode = 200;
-  r.size = (LLint) body_len;
-  strcpybuff(r.msg, "OK");
-  strcpybuff(r.contenttype, "application/octet-stream");
-  locbuf[0] = '\0';
-  r.location = locbuf;
-  r.is_write = 0;
-  memcpy(bodycopy, body, body_len);
-  r.adr = bodycopy;
-  cache_add(opt, cache, &r, "example.com", fil, "example.com/blob.bin", 1,
-            NULL);
-  freet(bodycopy);
-}
-
-/* #174/#219: a failing cache write used to crash via assertf(); it must instead
-   stop the mirror (exit_xh = -1) without crashing. Assert that, plus the cache
-   is flagged and a sibling write doesn't re-enter the broken stream. */
-int cache_write_failure_selftest(httrackp *opt, const char *dir) {
-  int fail = 0;
-  char path[HTS_URLMAXSIZE];
-  /* incompressible + big, so deflate flushes (and fails) mid-write, before
-   * close */
-  static const size_t body_len = 256 * 1024;
-  char *body = malloct(body_len);
-  int phase;
-
-  gen_body(body, body_len, 1 /* incompressible */);
-  fconcat(path, sizeof(path), dir, "/wfail.zip");
-
-  /* phase 0: fail on the body write, fatal errno (ENOSPC, the disk-full
-     branch). phase 1: fail on the open, non-fatal errno (EIO, dropped-share
-     branch). Both must abort the mirror. */
-  for (phase = 0; phase < 2; phase++) {
-    cache_back cache;
-    writefail_inject inj;
-    int writes_after_fail;
-
-    inj.budget = (phase == 0) ? 4096 : 0;
-    inj.fail_errno = (phase == 0) ? ENOSPC : EIO;
-    inj.writes = 0;
-    memset(&cache, 0, sizeof(cache));
-    cache.type = 1;
-    cache.log = stderr;
-    cache.errlog = stderr;
-    cache.hashtable = coucal_new(0);
-    cache.zipOutput = selftest_open_failing_zip(path, &inj);
-    if (cache.zipOutput == NULL) {
-      fprintf(stderr, "cache-writefail: could not open injected ZIP\n");
-      fail++;
-      continue;
-    }
-
-    opt->state.exit_xh = 0; /* clear; the failing write must set it to -1 */
-    writefail_store(opt, &cache, "/blob.bin", body, body_len);
-    if (!cache.zipWriteFailed) {
-      fprintf(stderr, "cache-writefail: phase %d: write error not caught\n",
-              phase);
-      fail++;
-    }
-    if (opt->state.exit_xh != -1) {
-      fprintf(stderr,
-              "cache-writefail: phase %d: mirror not aborted (exit_xh=%d)\n",
-              phase, opt->state.exit_xh);
-      fail++;
-    }
-
-    /* a flagged cache must no-op a sibling write: no further backend write */
-    writes_after_fail = inj.writes;
-    writefail_store(opt, &cache, "/blob2.bin", body, 16);
-    if (inj.writes != writes_after_fail) {
-      fprintf(stderr,
-              "cache-writefail: phase %d: sibling write re-entered the broken "
-              "stream (%d extra backend writes)\n",
-              phase, inj.writes - writes_after_fail);
-      fail++;
-    }
-
-    if (cache.zipOutput != NULL) {
-      zipClose(cache.zipOutput,
-               NULL); /* best-effort; may fail on the backend */
-      cache.zipOutput = NULL;
-    }
-  }
-
-  freet(body);
-  return fail;
-}
-
 int cache_selftests(httrackp *opt, const char *dir) {
  int failures = 0;
  cache_back cache;
--- a/src/htscache_selftest.h
+++ b/src/htscache_selftest.h
@@ -52,10 +52,6 @@ int cache_selftests(httrackp *opt, const char *dir);
   committed file, never by the test). Returns the failed-check count. */
 int cache_golden_selftest(httrackp *opt, const char *dir, int regen);

-/* #174/#219: assert a failing cache write aborts the mirror cleanly instead of
-   crashing. Returns the failed-check count. */
-int cache_write_failure_selftest(httrackp *opt, const char *dir);
-
 #endif

 #endif
--- a/src/htscore.c
+++ b/src/htscore.c
@@ -35,7 +35,6 @@ Please visit our Website: http://www.httrack.com

 #include <fcntl.h>
 #include <ctype.h>
-#include <stdint.h> /* uint64_t for the pause mixer (already a hard dep via md5.h) */

 /* File defs */
 #include "htscore.h"
@@ -524,12 +523,9 @@ int httpmirror(char *url1, httrackp * opt) {
    opt->cookie = &cookie;
    cookie.max_len = 30000;     // max len
    strcpybuff(cookie.data, "");
-    // Load the mirror's cookies.txt, then the one in the current directory
+    // Load the default cookies.txt or the mirror's cookies.txt
    cookie_load(opt->cookie, StringBuff(opt->path_log), "cookies.txt");
    cookie_load(opt->cookie, "", "cookies.txt");
-    // A user-supplied cookie file is merged last so it wins on conflicts
-    if (strnotempty(StringBuff(opt->cookies_file)))
-      cookie_load(opt->cookie, "", StringBuff(opt->cookies_file));
  } else
    opt->cookie = NULL;

@@ -740,39 +736,26 @@ int httpmirror(char *url1, httrackp * opt) {
    /* OPTIMIZED for fast load */
    if (StringNotEmpty(opt->filelist)) {
      char *filelist_buff = NULL;
-      size_t filelist_sz = 0;
-      const char *filelist_err = NULL; /* failure reason, NULL on success */
-      const off_t fs = fsize(StringBuff(opt->filelist));
+      const size_t filelist_sz = off_t_to_size_t(fsize(StringBuff(opt->filelist)));

-      if (fs < 0) {
-        /* fsize() hides the cause; redo stat() for a precise errno (#49) */
-        struct stat st;
-        filelist_err = stat(StringBuff(opt->filelist), &st) != 0
-                           ? strerror(errno)
-                           : "not a regular file";
-      } else if ((filelist_sz = off_t_to_size_t(fs)) == (size_t) -1) {
-        filelist_err = "file too large";
-        filelist_sz = 0;
-      } else {
+      if (filelist_sz != (size_t) -1) {
        FILE *fp = fopen(StringBuff(opt->filelist), "rb");

-        if (fp == NULL) {
-          filelist_err = strerror(errno);
-        } else {
+        if (fp) {
          filelist_buff = malloct(filelist_sz + 1);
-          if (filelist_buff == NULL) {
-            filelist_err = "out of memory";
-          } else if (fread(filelist_buff, 1, filelist_sz, fp) != filelist_sz) {
-            freet(filelist_buff);
-            filelist_err = "read error";
-          } else {
-            filelist_buff[filelist_sz] = '\0';
+          if (filelist_buff) {
+            if (fread(filelist_buff, 1, filelist_sz, fp) != filelist_sz) {
+              freet(filelist_buff);
+              filelist_buff = NULL;
+            } else {
+              *(filelist_buff + filelist_sz) = '\0';
+            }
          }
          fclose(fp);
        }
      }

-      if (filelist_buff != NULL) {
+      if (filelist_buff) {
        int filelist_ptr = 0;
        int n = 0;
        char BIGSTK line[HTS_URLMAXSIZE * 2];
@@ -797,8 +780,8 @@ int httpmirror(char *url1, httrackp * opt) {
        // Free buffer
        freet(filelist_buff);
      } else {
-        hts_log_print(opt, LOG_ERROR, "Could not include URL list \"%s\": %s",
-                      StringBuff(opt->filelist), filelist_err);
+        hts_log_print(opt, LOG_ERROR, "Could not include URL list: %s",
+                      StringBuff(opt->filelist));
      }
    }

@@ -3315,21 +3298,6 @@ HTS_INLINE int back_fillmax(struct_back * sback, httrackp * opt,
   return -1;                    /* no room left */
 }

-/* Seed-derived: stable within a gap, rerolls per launch; a per-call rand()
-   would bias the delay toward min_ms (see header). Jitter, not crypto. */
-int hts_pause_target_ms(TStamp seed, int min_ms, int max_ms) {
-  uint64_t z = (uint64_t) seed;
-
-  if (max_ms <= min_ms)
-    return min_ms;
-  /* SplitMix64 finalizer: scrambles the low-entropy ms timestamp. */
-  z += 0x9E3779B97F4A7C15ULL;
-  z = (z ^ (z >> 30)) * 0xBF58476D1CE4E5B9ULL;
-  z = (z ^ (z >> 27)) * 0x94D049BB133111EBULL;
-  z ^= z >> 31;
-  return min_ms + (int) (z % (uint64_t) (max_ms - min_ms + 1));
-}
-
 int back_pluggable_sockets_strict(struct_back * sback, httrackp * opt) {
  int n = opt->maxsoc - back_nsoc(sback);

@@ -3350,18 +3318,6 @@ int back_pluggable_sockets_strict(struct_back * sback, httrackp * opt) {
    }
  }

-  // #185 randomized inter-file pause: non-blocking, one launch per gap
-  if (n > 0 && opt->pause_max_ms > 0 && HTS_STAT.last_connect > 0) {
-    TStamp opTime =
-        HTS_STAT.last_request ? HTS_STAT.last_request : HTS_STAT.last_connect;
-    TStamp lap = mtime_local() - opTime;
-
-    if (lap < hts_pause_target_ms(opTime, opt->pause_min_ms, opt->pause_max_ms))
-      n = 0;
-    else
-      n = 1;
-  }
-
  return n;
 }

@@ -3770,17 +3726,6 @@ HTSEXT_API int copy_htsopt(const httrackp * from, httrackp * to) {
  if (StringNotEmpty(from->user_agent))
    StringCopyS(to->user_agent, from->user_agent);

-  if (StringNotEmpty(from->strip_query))
-    StringCopyS(to->strip_query, from->strip_query);
-
-  if (StringNotEmpty(from->cookies_file))
-    StringCopyS(to->cookies_file, from->cookies_file);
-
-  if (from->pause_max_ms > 0) {
-    to->pause_min_ms = from->pause_min_ms;
-    to->pause_max_ms = from->pause_max_ms;
-  }
-
  if (from->retry > -1)
    to->retry = from->retry;

--- a/src/htscore.h
+++ b/src/htscore.h
@@ -214,8 +214,6 @@ struct cache_back {
  cache_back_zip_entry *zipEntries;
  int zipEntriesOffs;
  int zipEntriesCapa;
-  hts_boolean
-      zipWriteFailed; /**< a cache write failed; stop touching the stream */
 };

 #ifndef HTS_DEF_FWSTRUCT_hash_struct
@@ -234,12 +232,8 @@ struct hash_struct {
  coucal adrfil;
  /* former address+path -> link index (renamed/moved entries) */
  coucal former_adrfil;
-  /* effective urlhack sub-flags: www.==host / // collapse / query-arg sort */
-  hts_boolean norm_host;
-  hts_boolean norm_slash;
-  hts_boolean norm_query;
-  /* query-strip keys (not owned); set from opt->strip_query at hash_init */
-  const char *strip_query;
+  /* scratch buffers reused across lookups (not reentrant) */
+  int normalized;
  char normfil[HTS_URLMAXSIZE * 2];
  char normfil2[HTS_URLMAXSIZE * 2];
  char catbuff[CATBUFF_SIZE];
@@ -368,22 +362,6 @@ int fspc(httrackp * opt, FILE * fp, const char *type);

 char *next_token(char *p, int flag);

-/* Like fil_normalized(), but first drops query keys in STRIP (comma-separated,
-   "*" = all); STRIP NULL/empty behaves exactly like fil_normalized(). */
-char *fil_normalized_filtered(const char *source, char *dest,
-                              const char *strip);
-
-/* As fil_normalized_filtered(), but DO_SLASH/DO_QUERY gate the // collapse and
-   the query-argument sort independently (the urlhack sub-flags). */
-char *fil_normalized_filtered_ex(const char *source, char *dest,
-                                 const char *strip, int do_slash, int do_query);
-
-/* For URL ADR/FIL, return (in DEST) the comma keylist to strip from the
-   '\n'-separated "[pattern=]keys" RULES (patterns matched on host/path via
-   strjoker, last wins); NULL if none match. Feeds fil_normalized_filtered(). */
-const char *hts_query_strip_keys(const char *rules, const char *adr,
-                                 const char *fil, char *dest, size_t destsize);
-
 /* Read a whole file into a freshly malloc'd, NUL-terminated buffer; the caller
   owns it and must release it with freet(). Return NULL on missing/unreadable
   file (readfile_or substitutes defaultdata instead). The byte content is NOT
@@ -418,10 +396,6 @@ int back_pluggable_sockets(struct_back * sback, httrackp * opt);

 int back_pluggable_sockets_strict(struct_back * sback, httrackp * opt);

-/* Randomized inter-file pause target in [min_ms,max_ms] (#185), derived from a
-   timestamp seed so it is stable within one gap and rerolls per launch. */
-int hts_pause_target_ms(TStamp seed, int min_ms, int max_ms);
-
 /* Schedule more links from the heap into free slots. Returns the number queued,
   or <=0 if none could be added (no free slot / paused / stopped). */
 int back_fill(struct_back * sback, httrackp * opt, cache_back * cache,
--- a/src/htscoremain.c
+++ b/src/htscoremain.c
--- a/src/htsencoding.c
+++ b/src/htsencoding.c
@@ -30,14 +30,12 @@ Please visit our Website: http://www.httrack.com
 /* Author: Xavier Roche                                         */
 /* ------------------------------------------------------------ */

-#include <stdint.h>
-
 #include "htscharset.h"
 #include "htsencoding.h"
 #include "htssafe.h"

-/* static int decode_entity(const uint64_t hash, const size_t len);
- */
+/* static int decode_entity(const unsigned int hash, const size_t len);
+*/
 #include "htsentities.h"

 /* hexadecimal conversion */
@@ -52,31 +50,30 @@ static int get_hex_value(char c) {
    return -1;
 }

-/* 64-bit FNV-1a; must match htsentities.sh, which keys the entity table on it.
- */
-#define HASH_INIT 0xcbf29ce484222325ULL
-#define HASH_PRIME 0x100000001b3ULL
-#define HASH_ADD(HASH, C)                                                      \
-  do {                                                                         \
-    (HASH) ^= (unsigned char) (C);                                             \
-    (HASH) *= HASH_PRIME;                                                      \
-  } while (0)
+/* Numerical Recipes,
+   see <http://en.wikipedia.org/wiki/Linear_congruential_generator> */
+#define HASH_PRIME ( 1664525 )
+#define HASH_CONST ( 1013904223 )
+#define HASH_ADD(HASH, C) do {                  \
+    (HASH) *= HASH_PRIME;                       \
+    (HASH) += HASH_CONST;                       \
+    (HASH) += (C);                              \
+  } while(0)

 int hts_unescapeEntitiesWithCharset(const char *src, char *dest, const size_t max, const char *charset) {
  size_t i, j, ampStart, ampStartDest;
  int uc;
  int hex;
-  uint64_t hash;
+  unsigned int hash;

  assertf(max != 0);
-  for (i = 0, j = 0, ampStart = (size_t) -1, ampStartDest = 0, uc = -1, hex = 0,
-      hash = HASH_INIT;
-       src[i] != '\0'; i++) {
+  for(i = 0, j = 0, ampStart = (size_t) -1, ampStartDest = 0,
+        uc = -1, hex = 0, hash = 0 ; src[i] != '\0' ; i++) {
    /* start of entity */
    if (src[i] == '&') {
      ampStart = i;
      ampStartDest = j;
-      hash = HASH_INIT;
+      hash = 0;
      uc = -1;
    }
    /* inside a potential entity */
@@ -177,11 +174,14 @@ int hts_unescapeEntitiesWithCharset(const char *src, char *dest, const size_t ma
      }
      /* alphanumerical entity */
      else {
-        /* alphanum, capped at the longest name
-         * '&CounterClockwiseContourIntegral;' (31) */
-        if (i <= ampStart + 31 && ((src[i] >= '0' && src[i] <= '9') ||
-                                   (src[i] >= 'A' && src[i] <= 'Z') ||
-                                   (src[i] >= 'a' && src[i] <= 'z'))) {
+        /* alphanum and not too far ('&thetasym;' is the longest) */
+        if (i <= ampStart + 10 &&
+            (
+             (src[i] >= '0' && src[i] <= '9')
+             || (src[i] >= 'A' && src[i] <= 'Z')
+             || (src[i] >= 'a' && src[i] <= 'z')
+             )
+            ) {
          /* compute hash */
          HASH_ADD(hash, (unsigned char) src[i]);
        } else {
--- a/src/htsentities.h
+++ b/src/htsentities.h
--- a/src/htsentities.sh
+++ b/src/htsentities.sh
@@ -1,92 +1,75 @@
 #!/bin/bash
 #
-# Regenerate htsentities.h from the WHATWG named character references.

-set -euo pipefail
-
-src=entities.json
-url=https://html.spec.whatwg.org/entities.json
+src=html40.txt
+url=http://www.w3.org/TR/1998/REC-html40-19980424/html40.txt
 dest=htsentities.h

-# 64-bit FNV-1a of $1, printed as a C constant. Must match the hash in
-# htsencoding.c. The offset basis is stored as its wrapped (signed) bit pattern;
-# bash arithmetic is 64-bit two's complement, so the result is bit-exact.
-fnv1a() {
-    local s=$1 i c h=$((0xcbf29ce484222325))
-    for ((i = 0; i < ${#s}; i++)); do
-        printf -v c '%d' "'${s:i:1}"
-        h=$(((h ^ (c & 0xff)) * 0x100000001b3))
-    done
-    printf '0x%016xULL' "$h"
-}
+(
+    cat <<EOF
+/*
+  -- ${dest} --
+  FILE GENERATED BY $0, DO NOT MODIFY

-if [ ! -f "$src" ]; then
-    curl -fsS "$url" -o "$src"
-fi
+  We compute the LCG hash
+  (see <http://en.wikipedia.org/wiki/Linear_congruential_generator>)
+  for each entity. We should in theory check using strncmp() that we
+  actually have the correct entity, but this is actually statistically
+  not needed.

-# Keep ';'-terminated single-codepoint names; the ~93 multi-codepoint refs can't
-# fit decode_entity's single-codepoint return and are skipped (left verbatim).
-pairs=$(jq -r '
-    to_entries
-    | map(select((.key | endswith(";")) and (.value.codepoints | length == 1)))
-    | sort_by(.key)
-    | .[] | "\(.key | ltrimstr("&") | rtrimstr(";"))\t\(.value.codepoints[0])"' "$src")
+  We may want to do better, but we expect the hash function to be uniform, and
+  let the compiler be smart enough to optimize the switch (for example by
+  checking in log2() intervals)
+  
+  This code has been generated using the evil $0 script.
+*/

-# Skipped multi-codepoint names, kept to prove none aliases an emitted hash.
-skipped=$(jq -r '
-    to_entries
-    | map(select((.key | endswith(";")) and (.value.codepoints | length > 1)))
-    | .[] | .key | ltrimstr("&") | rtrimstr(";")' "$src")
-
-cases=""
-emit_hashes=""
-while IFS=$'\t' read -r name cp; do
-    hash=$(fnv1a "$name")
-    cases+="    /* $name */"$'\n'
-    cases+="  case $hash:"$'\n'
-    cases+="    if (len == ${#name}) {"$'\n'
-    cases+="      return $cp;"$'\n'
-    cases+="    }"$'\n'
-    cases+="    break;"$'\n'
-    emit_hashes+="$hash"$'\n'
-done <<<"$pairs"
-
-skip_hashes=""
-while IFS= read -r name; do
-    [ -n "$name" ] && skip_hashes+="$(fnv1a "$name")"$'\n'
-done <<<"$skipped"
-
-# The switch keys on the hash alone, so the dispatch is correct only while every
-# emitted name hashes uniquely; prove it here, no runtime name compare needed.
-dups=$(printf '%s' "$emit_hashes" | sort | uniq -d || true)
-if [ -n "$dups" ]; then
-    echo "FATAL: two entity names share a hash (duplicate switch case); change the hash:" >&2
-    echo "$dups" >&2
-    exit 1
-fi
-# A skipped name colliding with an emitted hash would mis-decode instead of
-# staying verbatim; forbid that too.
-aliased=$(comm -12 <(printf '%s' "$emit_hashes" | sort -u) <(printf '%s' "$skip_hashes" | sort -u) || true)
-if [ -n "$aliased" ]; then
-    echo "FATAL: a skipped multi-codepoint name aliases an emitted hash:" >&2
-    echo "$aliased" >&2
-    exit 1
-fi
-
-cat >"$dest" <<EOF
-/* GENERATED by $0 from the WHATWG named character references
-   (${url}). DO NOT EDIT.
-   Dispatch keys on a 64-bit FNV-1a hash of the entity name; the generator
-   aborts on any hash collision, so no runtime name compare is needed. */
-
-#include <stdint.h>
-
-static int decode_entity(const uint64_t hash, const size_t len) {
+static int decode_entity(const unsigned int hash, const size_t len) {
  switch(hash) {
-${cases}  }
+EOF
+    (
+        if test -f ${src}; then
+            cat ${src}
+        else
+            GET "${url}"
+        fi
+    ) |
+        grep -E '^<!ENTITY [a-zA-Z0-9_]' |
+        sed \
+            -e 's/<!ENTITY //' -e "s/[[:space:]][[:space:]]*/ /g" \
+            -e 's/-->$//' \
+            -e 's/\([^ ]*\) CDATA "&#\([^\"]*\);" -- \(.*\)/\1 \2 \3/' |
+        (
+            read -r A
+            while test -n "$A"; do
+                ent="${A%% *}"
+                code=$(echo "$A" | cut -f2 -d' ')
+                # compute hash
+                hash=0
+                i=0
+                a=1664525
+                c=1013904223
+                m="$((1 << 32))"
+                while test "$i" -lt ${#ent}; do
+                    d="$(echo -n "${ent:${i}:1}" | hexdump -v -e '/1 "%d"')"
+                    hash="$((((hash * a) % (m) + d + c) % (m)))"
+                    i=$((i + 1))
+                done
+                echo -e "    /* $A */"
+                echo -e "  case ${hash}u:"
+                echo -e "    if (len == ${#ent} /* && strncmp(ent, \"${ent}\") == 0 */) {"
+                echo -e "      return ${code};"
+                echo -e "    }"
+                echo -e "    break;"
+
+                # next
+                read -r A
+            done
+        )
+    cat <<EOF
+  }
  /* unknown */
  return -1;
 }
 EOF
-
-echo "wrote $dest ($(grep -c '^  case ' "$dest") entities)" >&2
+) >${dest}
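
The `HASH_ADD` macro in htsencoding.c and the shell loop in htsentities.sh above have to compute the same value, since the generated `switch` keys on it. A minimal C sketch of that per-character step, using the two constants from the diff, might look as follows. `entity_lcg_hash` and the `LCG_*` names are hypothetical, and `uint32_t` is used here as an assumption to make the modulo-2^32 wrap explicit (the diff itself uses `unsigned int`).

```c
#include <stddef.h>
#include <stdint.h>

/* Constants copied from HASH_PRIME / HASH_CONST in the diff above. */
#define LCG_MULT 1664525u
#define LCG_INC 1013904223u

/* Hypothetical helper: fold an entity name (without '&' and ';') into a
   32-bit hash, one linear-congruential step per character. Unsigned
   arithmetic wraps modulo 2^32, which matches the "% m" in the script. */
static uint32_t entity_lcg_hash(const char *name, size_t len) {
  uint32_t hash = 0;
  size_t i;

  for (i = 0; i < len; i++) {
    hash *= LCG_MULT;
    hash += LCG_INC;
    hash += (unsigned char) name[i];
  }
  return hash;
}
```

For a one-character name the multiplication contributes nothing, so `entity_lcg_hash("a", 1)` is simply 1013904223 + 97 = 1013904320.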
--- a/src/htsfilters.c
+++ b/src/htsfilters.c
@@ -76,8 +76,7 @@ int fa_strjoker(int type, char **filters, int nfil, const char *nom, LLint * siz
    }
    if (size)
      sz = *size;
-    /* size unknown (scan time): no size pointer => size tests stay neutral */
-    if (strjoker(nom, filters[i] + filteroffs, size ? &sz : NULL, size_flag)) {
+    if (strjoker(nom, filters[i] + filteroffs, &sz, size_flag)) {       // matched
      if (size)
        if (sz != *size)
          sizelimit = sz;
@@ -193,12 +192,7 @@ HTS_INLINE const char *strjoker(const char *chaine, const char *joker, LLint * s
          int len = (int) strlen(joker);

          while((joker[i] != RIGHT) && (joker[i]) && (i < len)) {
-            // '\' escapes the next char as a literal member, e.g. *[\[\]]
-            if (joker[i] == '\\' && joker[i + 1] != '\0') {
-              i++;
-              pass[(int) (unsigned char) joker[i]] = 1;
-              i++;
-            } else if ((joker[i] == '<') || (joker[i] == '>')) { // *[<10]
+            if ((joker[i] == '<') || (joker[i] == '>')) {       // *[<10]
              int lsize = 0;
              int lverdict;

@@ -226,9 +220,7 @@ HTS_INLINE const char *strjoker(const char *chaine, const char *joker, LLint * s
                while(isdigit((unsigned char) joker[i]))
                  i++;
              }
-            } else if (joker[i + 1] == '-' && joker[i + 2] != '\0') {
-              // range *[A-Z]; the '\0' guard rejects a truncated *[a- (else
-              // i+=3 overshoots the NUL)
+            } else if (joker[i + 1] == '-') {   // 2 chars, e.g. *[A-Z]
              if ((int) (unsigned char) joker[i + 2] >
                  (int) (unsigned char) joker[i]) {
                int j;
@@ -240,7 +232,10 @@ HTS_INLINE const char *strjoker(const char *chaine, const char *joker, LLint * s
              }
              // else err=1;
              i += 3;
-            } else { // 1 char, e.g. *[ ]
+            } else {            // 1 char, e.g. *[ ]
+              if (joker[i + 2] == '\\' && joker[i + 3] != 0) {  // escaped char, such as *[\[] or *[\]]
+                i++;
+              }
              pass[(int) (unsigned char) joker[i]] = 1;
              i++;
            }
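
The `strjoker` hunks above deal with how the body of a `*[...]` class is turned into a 256-entry lookup table: escaped literals, `a-z` ranges, and single characters. The sketch below shows one way to build such a table, under stated assumptions: `class_to_table` is a hypothetical name, the input is only the text between the brackets, and size tests such as `<10` are left out. Unlike the code in the diff, a range whose bounds are equal marks that one character.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: fill pass[256] from the body of a "*[...]" class.
   Handles "\x" escapes, "a-z" ranges and single characters. */
static void class_to_table(const char *body, unsigned char pass[256]) {
  size_t i = 0;

  memset(pass, 0, 256);
  while (body[i] != '\0') {
    const unsigned char c = (unsigned char) body[i];

    if (c == '\\' && body[i + 1] != '\0') {
      /* Escaped literal: the next character is a member as-is. */
      pass[(unsigned char) body[i + 1]] = 1;
      i += 2;
    } else if (body[i + 1] == '-' && body[i + 2] != '\0') {
      /* Range: mark everything from c up to the upper bound (an inverted
         range marks nothing). The '\0' check keeps i += 3 inside the string. */
      const unsigned char hi = (unsigned char) body[i + 2];
      unsigned int j;

      for (j = c; j <= hi; j++)
        pass[j] = 1;
      i += 3;
    } else {
      /* Single character. */
      pass[c] = 1;
      i++;
    }
  }
}
```

Checking the escape first means a pattern such as `\[\]` yields the two bracket characters as members and never marks the backslash itself.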
--- a/src/htsglobal.h
+++ b/src/htsglobal.h
@@ -43,8 +43,8 @@ Please visit our Website: http://www.httrack.com
   configure.ac, decoupled from these). VERSION is the display form, VERSIONID
   the dotted numeric form, AFF_VERSION the short form shown in footers,
   LIB_VERSION the data/cache format generation. */
-#define HTTRACK_VERSION "3.49-10"
-#define HTTRACK_VERSIONID "3.49.10"
+#define HTTRACK_VERSION "3.49-9"
+#define HTTRACK_VERSIONID "3.49.9"
 #define HTTRACK_AFF_VERSION "3.x"
 #define HTTRACK_LIB_VERSION "2.0"

@@ -229,10 +229,6 @@ Please visit our Website: http://www.httrack.com
 #define HTS_DEFAULT_FOOTER                                                     \
  "<!-- Mirrored from %s%s by HTTrack Website Copier/" HTTRACK_AFF_VERSION     \
  " " HTTRACK_AFF_AUTHORS ", %s -->"
-/* Honest crawler User-Agent; no fake OS/browser to go stale. */
-#define HTS_DEFAULT_USER_AGENT                                                 \
-  "Mozilla/5.0 (compatible; HTTrack/" HTTRACK_AFF_VERSION                      \
-  "; +https://www.httrack.com/)"
 #define HTTRACK_WEB "http://www.httrack.com"
 #define HTS_UPDATE_WEBSITE                                                     \
  "http://www.httrack.com/"                                                    \
--- a/src/htshash.c
+++ b/src/htshash.c
@@ -106,10 +106,10 @@ static coucal_hashkeys key_adrfil_hashes_generic(void *arg,
  const lien_url*const lien = (const lien_url*) value;
  const char *const adr = !former ? lien->adr : lien->former_adr;
  const char *const fil = !former ? lien->fil : lien->former_fil;
-  const char *const adr_norm =
-      adr != NULL ? (hash->norm_host ? jump_normalized_const(adr)
-                                     : jump_identification_const(adr))
-                  : NULL;
+  const char *const adr_norm = adr != NULL ? 
+    ( hash->normalized  ? jump_normalized_const(adr)
+                        : jump_identification_const(adr) )
+    : NULL;

  // copy address
  assertf(adr_norm != NULL);
@@ -117,18 +117,10 @@ static coucal_hashkeys key_adrfil_hashes_generic(void *arg,

  // copy link
  assertf(fil != NULL);
-  {
-    /* resolve the per-URL strip keys; strip applies even when urlhack is off */
-    char BIGSTK keybuf[HTS_URLMAXSIZE];
-    const char *const keys = hts_query_strip_keys(hash->strip_query, adr, fil,
-                                                  keybuf, sizeof(keybuf));
-
-    if (hash->norm_slash || hash->norm_query || keys != NULL) {
-      fil_normalized_filtered_ex(fil, &hash->normfil[strlen(hash->normfil)],
-                                 keys, hash->norm_slash, hash->norm_query);
-    } else {
-      strcpy(&hash->normfil[strlen(hash->normfil)], fil);
-    }
+  if (hash->normalized) {
+    fil_normalized(fil, &hash->normfil[strlen(hash->normfil)]);
+  } else {
+    strcpy(&hash->normfil[strlen(hash->normfil)], fil);
  }

  // hash
@@ -140,7 +132,8 @@ static int key_adrfil_equals_generic(void *arg,
                                     coucal_key_const a_,
                                     coucal_key_const b_, 
                                     const int former) {
-  hash_struct *const hash = (hash_struct *) arg;
+  hash_struct *const hash = (hash_struct*) arg;
+  const int normalized = hash->normalized;
  const lien_url*const a = (const lien_url*) a_;
  const lien_url*const b = (const lien_url*) b_;
  const char *const a_adr = !former ? a->adr : a->former_adr;
@@ -157,10 +150,10 @@ static int key_adrfil_equals_generic(void *arg,
  assertf(b_fil != NULL);

  // skip scheme and authentication to the domain (possibly without www.)
-  ja = hash->norm_host ? jump_normalized_const(a_adr)
-                       : jump_identification_const(a_adr);
-  jb = hash->norm_host ? jump_normalized_const(b_adr)
-                       : jump_identification_const(b_adr);
+  ja = normalized
+    ? jump_normalized_const(a_adr) : jump_identification_const(a_adr);
+  jb = normalized
+    ? jump_normalized_const(b_adr) : jump_identification_const(b_adr);
  assertf(ja != NULL);
  assertf(jb != NULL);
  if (strcasecmp(ja, jb) != 0) {
@@ -168,23 +161,12 @@ static int key_adrfil_equals_generic(void *arg,
  }

  // now compare pathes
-  {
-    char BIGSTK ka[HTS_URLMAXSIZE], kb[HTS_URLMAXSIZE];
-    const char *const keysa =
-        hts_query_strip_keys(hash->strip_query, a_adr, a_fil, ka, sizeof(ka));
-    const char *const keysb =
-        hts_query_strip_keys(hash->strip_query, b_adr, b_fil, kb, sizeof(kb));
-
-    if (hash->norm_slash || hash->norm_query || keysa != NULL ||
-        keysb != NULL) {
-      fil_normalized_filtered_ex(a_fil, hash->normfil, keysa, hash->norm_slash,
-                                 hash->norm_query);
-      fil_normalized_filtered_ex(b_fil, hash->normfil2, keysb, hash->norm_slash,
-                                 hash->norm_query);
-      return strcmp(hash->normfil, hash->normfil2) == 0;
-    } else {
-      return strcmp(a_fil, b_fil) == 0;
-    }
+  if (normalized) {
+    fil_normalized(a_fil, hash->normfil);
+    fil_normalized(b_fil, hash->normfil2);
+    return strcmp(hash->normfil, hash->normfil2) == 0;
+  } else {
+    return strcmp(a_fil, b_fil) == 0;
  }
 }

@@ -240,17 +222,11 @@ static int key_former_adrfil_equals(void *arg,
  return key_adrfil_equals_generic(arg, a, b, 1);
 }

-void hash_init(httrackp *opt, hash_struct *hash, hts_boolean normalized) {
+void hash_init(httrackp *opt, hash_struct * hash, int normalized) {
  hash->sav = coucal_new(0);
  hash->adrfil = coucal_new(0);
  hash->former_adrfil = coucal_new(0);
-  /* urlhack is the umbrella; per-feature negatives opt out of each part */
-  hash->norm_host = normalized && !opt->no_www_dedup;
-  hash->norm_slash = normalized && !opt->no_slash_dedup;
-  hash->norm_query = normalized && !opt->no_query_dedup;
-  /* snapshot the query-strip list (not owned; valid for the hash lifetime) */
-  hash->strip_query =
-      StringNotEmpty(opt->strip_query) ? StringBuff(opt->strip_query) : NULL;
+  hash->normalized = normalized;

  hts_set_hash_handler(hash->sav, opt);
  hts_set_hash_handler(hash->adrfil, opt);
@@ -306,26 +282,6 @@ void hash_free(hash_struct *hash) {
  }
 }

-/* Test helper: do the two URLs dedupe to the same key under opt's urlhack
-   flags? Exercises the live hash compare (norm_host/slash/query resolution). */
-hts_boolean hash_url_equals(httrackp *opt, const char *adra, const char *fila,
-                            const char *adrb, const char *filb) {
-  hash_struct hash;
-  lien_url la, lb;
-  hts_boolean eq;
-
-  memset(&la, 0, sizeof(la));
-  memset(&lb, 0, sizeof(lb));
-  la.adr = key_duphandler(NULL, adra);
-  la.fil = key_duphandler(NULL, fila);
-  lb.adr = key_duphandler(NULL, adrb);
-  lb.fil = key_duphandler(NULL, filb);
-  hash_init(opt, &hash, opt->urlhack);
-  eq = key_adrfil_equals(&hash, &la, &lb);
-  hash_free(&hash);
-  return eq;
-}
-
 // retour: position ou -1 si non trouvé
 int hash_read(const hash_struct * hash, const char *nom1, const char *nom2,
              hash_struct_type type) {
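The host half of the compare in key_adrfil_equals_generic above can be sketched as follows. This is a minimal illustration only: `skip_www` and `host_equals` are hypothetical names, and the real `jump_normalized_const` is assumed to do more than strip a literal lowercase `www.` prefix (the code comments mention `www-42.foo.com`, for instance). The `normalized` argument plays the role of `hash->normalized`.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Hypothetical helper: skip a leading "www." so that www.foo.com and
   foo.com map to the same key. Deliberately simpler than the engine. */
static const char *skip_www(const char *host) {
  assert(host != NULL);
  return strncmp(host, "www.", 4) == 0 ? host + 4 : host;
}

/* Case-insensitive host equality; `normalized` selects whether the
   "www." prefix is ignored (the url-hack behaviour) or kept. */
static int host_equals(const char *a, const char *b, int normalized) {
  if (normalized) {
    a = skip_www(a);
    b = skip_www(b);
  }
  while (*a != '\0' && *b != '\0') {
    if (tolower((unsigned char) *a) != tolower((unsigned char) *b))
      return 0;
    a++;
    b++;
  }
  return *a == *b; /* equal only if both strings ended together */
}
```

With `normalized` set, `www.Foo.com` and `foo.com` compare equal; with it cleared they stay distinct, which is the single switch this side of the diff goes back to.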
--- a/src/htshash.h
+++ b/src/htshash.h
@@ -51,12 +51,8 @@ typedef enum hash_struct_type {
 } hash_struct_type;

 // tables de hachage
-void hash_init(httrackp *opt, hash_struct *hash, hts_boolean normalized);
+void hash_init(httrackp *opt, hash_struct *hash, int normalized);
 void hash_free(hash_struct *hash);
-/* Test helper: HTS_TRUE if the two URLs dedupe together under opt's urlhack
-   flags. */
-hts_boolean hash_url_equals(httrackp *opt, const char *adra, const char *fila,
-                            const char *adrb, const char *filb);
 int hash_read(const hash_struct * hash, const char *nom1, const char *nom2,
              hash_struct_type type);
 void hash_write(hash_struct * hash, size_t lpos);
--- a/src/htshelp.c
+++ b/src/htshelp.c
@@ -521,7 +521,6 @@ void help(const char *app, int more) {
  infomsg("  EN maximum mirror time in seconds (60=1 minute, 3600=1 hour)");
  infomsg("  AN maximum transfer rate in bytes/seconds (1000=1KB/s max)");
  infomsg(" %cN maximum number of connections/seconds (*%c10)");
-  infomsg(" %G  random pause of MIN[:MAX] seconds between files (e.g. %G5:10)");
  infomsg
    ("  GN pause transfer if N bytes reached, and wait until lock file is deleted");
  infomsg("");
@@ -564,7 +563,6 @@ void help(const char *app, int more) {
    (" %x  do not include any password for external password protected websites (%x0 include)");
  infomsg
    (" %q *include query string for local files (useless, for information purpose only) (%q0 don't include)");
-  infomsg(" %g  strip query keys for dedup ([host/pattern=]key1,key2,...)");
  infomsg
    ("  o *generate output html file in case of error (404..) (o0 don't generate)");
  infomsg("  X *purge old files after update (X0 keep delete)");
@@ -573,7 +571,6 @@ void help(const char *app, int more) {
  infomsg("");
  infomsg("Spider options:");
  infomsg("  bN accept cookies in cookies.txt (0=do not accept,* 1=accept)");
-  infomsg(" %K  load extra cookies from a Netscape cookies.txt");
  infomsg
    ("  u  check document type if unknown (cgi,asp..) (u0 don't check, * u1 check but /, u2 check always)");
  infomsg
@@ -590,9 +587,6 @@ void help(const char *app, int more) {
    (" %s  update hacks: various hacks to limit re-transfers when updating (identical size, bogus response..)");
  infomsg
    (" %u  url hacks: various hacks to limit duplicate URLs (strip //, www.foo.com==foo.com..)");
-  infomsg("     opt out of one url-hack part: --keep-www-prefix "
-          "(www.foo.com<>foo.com), --keep-double-slashes (//), "
-          "--keep-query-order (?b&a)");
  infomsg
    (" %A  assume that a type (cgi,asp..) is always linked with a mime type (-%A php3,cgi=text/html;dat,bin=application/x-zip)");
  infomsg("     shortcut: '--assume standard' is equivalent to -%A "
@@ -652,7 +646,9 @@ void help(const char *app, int more) {
  infomsg("");
  infomsg("Guru options: (do NOT use if possible)");
  infomsg(" #X *use optimized engine (limited memory boundary checks)");
-  infomsg(" #test  list engine self-tests (run one with -#test=NAME [args])");
+  infomsg(" #0  filter test (-#0 '*.gif' 'www.bar.com/foo.gif')");
+  infomsg(" #1  simplify test (-#1 ./foo/bar/../foobar)");
+  infomsg(" #2  type test (-#2 /foo/bar.php)");
  infomsg(" #C  cache list (-#C '*.com/spider*.gif'");
  infomsg(" #R  cache repair (damaged cache)");
  infomsg(" #d  debug parser");
--- a/src/htslib.c
+++ b/src/htslib.c
@@ -563,39 +563,6 @@ const char *hts_mime[][2] = {
  {"", ""}
 };

-/* Modern web formats (post-2010), kept in their own table: appending to the
-   legacy hts_mime[] above makes clang-format reflow its whole initializer.
-   Scanned after hts_mime[], so it never shadows a legacy mapping. */
-static const char *hts_mime_modern[][2] = {
-    {"image/webp", "webp"},
-    {"image/avif", "avif"},
-    {"image/heic", "heic"},
-    {"font/woff", "woff"},
-    {"font/woff2", "woff2"},
-    {"font/ttf", "ttf"},
-    {"font/otf", "otf"},
-    {"application/json", "json"},
-    {"application/ld+json", "jsonld"},
-    {"application/manifest+json", "webmanifest"},
-    {"application/wasm", "wasm"},
-    {"text/javascript", "js"},
-    {"text/javascript", "mjs"},
-    {"text/markdown", "md"},
-    {"video/mp4", "mp4"},
-    {"video/webm", "webm"},
-    {"video/ogg", "ogv"},
-    {"video/mp2t", "ts"},
-    {"audio/mp4", "m4a"},
-    {"audio/aac", "aac"},
-    {"audio/ogg", "oga"},
-    {"audio/opus", "opus"},
-    {"audio/flac", "flac"},
-    {"audio/webm", "weba"},
-    {"application/x-7z-compressed", "7z"},
-    {"application/x-rar-compressed", "rar"},
-    {"application/zstd", "zst"},
-    {"", ""}};
-
 // Reserved (RFC2396)
 #define CIS(c,ch) ( ((unsigned char)(c)) == (ch) )
 #define CHAR_RESERVED(c)  ( CIS(c,';') \
@@ -3643,10 +3610,7 @@ static int sortNormFnc(const void *a_, const void *b_) {
  return strcmp(*a + 1, *b + 1);
 }

-/* Path normalizer core: optionally collapse redundant '//' (DO_SLASH) and/or
-   sort query arguments (DO_QUERY) so equivalent URLs dedupe. */
-static char *fil_normalized_ex(const char *source, char *dest, int do_slash,
-                               int do_query) {
+HTSEXT_API char *fil_normalized(const char *source, char *dest) {
  char lastc = 0;
  int gotquery = 0;
  int ampargs = 0;
@@ -3656,8 +3620,8 @@ static char *fil_normalized_ex(const char *source, char *dest, int do_slash,
  for(i = j = 0; source[i] != '\0'; i++) {
    if (!gotquery && source[i] == '?')
      gotquery = ampargs = 1;
-    if (do_slash && !gotquery && lastc == '/' && source[i] == '/') {
-      // foo//bar -> foo/bar
+    if ((!gotquery && lastc == '/' && source[i] == '/') // foo//bar -> foo/bar
+      ) {
    } else {
      if (gotquery && source[i] == '&') {
        ampargs++;
@@ -3669,7 +3633,7 @@ static char *fil_normalized_ex(const char *source, char *dest, int do_slash,
  dest[j++] = '\0';

  /* Sort arguments (&foo=1&bar=2 == &bar=2&foo=1) */
-  if (do_query && ampargs > 1) {
+  if (ampargs > 1) {
    char **amps = malloct(ampargs * sizeof(char *));
    char *copyBuff = NULL;
    size_t qLen = 0;
@@ -3717,153 +3681,6 @@ static char *fil_normalized_ex(const char *source, char *dest, int do_slash,
  return dest;
 }

-HTSEXT_API char *fil_normalized(const char *source, char *dest) {
-  return fil_normalized_ex(source, dest, 1, 1);
-}
-
-/* Is query key ARG[0..keylen) in the comma-separated STRIP list? "*" = all;
-   case-sensitive, space-trimmed tokens. */
-static int hts_query_key_stripped(const char *arg, size_t keylen,
-                                  const char *strip) {
-  const char *p = strip;
-
-  while (*p != '\0') {
-    const char *start = p;
-    size_t toklen;
-
-    while (*p != '\0' && *p != ',')
-      p++;
-    toklen = (size_t) (p - start);
-    while (toklen > 0 && *start == ' ') {
-      start++;
-      toklen--;
-    }
-    while (toklen > 0 && start[toklen - 1] == ' ')
-      toklen--;
-    if (toklen == 1 && start[0] == '*')
-      return 1;
-    if (toklen == keylen && strncmp(start, arg, keylen) == 0)
-      return 1;
-    if (*p == ',')
-      p++;
-  }
-  return 0;
-}
-
-/* see htscore.h */
-char *fil_normalized_filtered_ex(const char *source, char *dest,
-                                 const char *strip, int do_slash,
-                                 int do_query) {
-  const char *query;
-  char BIGSTK tmp[HTS_URLMAXSIZE * 2];
-  htsbuff cb;
-  int wrote = 0;
-
-  /* No strip list, or no query: plain normalization. */
-  if (strip == NULL || *strip == '\0' ||
-      (query = strchr(source, '?')) == NULL) {
-    return fil_normalized_ex(source, dest, do_slash, do_query);
-  }
-
-  /* Copy the path, re-emit kept query args, let fil_normalized() sort. Walk
-     every field incl. empty/trailing ("a&","?&&") so the result is a fixpoint
-     (the read re-normalizes it; a dropped empty arg would miss dedup). */
-  cb = htsbuff_ptr(tmp, sizeof(tmp));
-  htsbuff_catn(&cb, source, (size_t) (query - source));
-  for (query++;;) {
-    const char *const arg = query;
-    const char *eq = NULL;
-    size_t keylen, arglen;
-
-    while (*query != '\0' && *query != '&') {
-      if (eq == NULL && *query == '=')
-        eq = query;
-      query++;
-    }
-    arglen = (size_t) (query - arg);
-    keylen = eq != NULL ? (size_t) (eq - arg) : arglen;
-    if (!hts_query_key_stripped(arg, keylen, strip)) {
-      htsbuff_catc(&cb, wrote ? '&' : '?');
-      htsbuff_catn(&cb, arg, arglen);
-      wrote = 1;
-    }
-    if (*query == '\0')
-      break;
-    query++;
-  }
-  return fil_normalized_ex(tmp, dest, do_slash, do_query);
-}
-
-/* see htscore.h */
-char *fil_normalized_filtered(const char *source, char *dest,
-                              const char *strip) {
-  return fil_normalized_filtered_ex(source, dest, strip, 1, 1);
-}
-
-/* see htscore.h */
-const char *hts_query_strip_keys(const char *rules, const char *adr,
-                                 const char *fil, char *dest, size_t destsize) {
-  const char *p, *q;
-  const char *result = NULL;
-  char BIGSTK url[HTS_URLMAXSIZE * 2];
-
-  if (rules == NULL || *rules == '\0' || destsize == 0)
-    return NULL;
-
-  /* Match string = normalized host/path, query removed. jump_normalized_const
-     collapses www+scheme/auth so read and write (double-normalized) agree;
-     query excluded keeps the decision on host/path only. */
-  url[0] = '\0';
-  strcatbuff(url, jump_normalized_const(adr));
-  if (fil[0] != '/')
-    strcatbuff(url, "/");
-  q = strchr(fil, '?');
-  if (q != NULL)
-    strncatbuff(url, fil, (int) (q - fil));
-  else
-    strcatbuff(url, fil);
-
-  /* Walk the '\n' entries; last match wins (like the +/- filter eval). Each is
-     "pattern=keys"; no '=' is the bare form, pattern "*". */
-  for (p = rules; *p != '\0';) {
-    const char *const line = p;
-    const char *eol, *eq, *keys;
-    char BIGSTK pat[HTS_URLMAXSIZE * 2];
-
-    while (*p != '\0' && *p != '\n')
-      p++;
-    eol = p;
-    if (*p == '\n')
-      p++;
-    if (eol == line)
-      continue;
-    eq = memchr(line, '=', (size_t) (eol - line));
-    if (eq != NULL) {
-      size_t patlen = (size_t) (eq - line);
-
-      if (patlen >= sizeof(pat))
-        patlen = sizeof(pat) - 1;
-      memcpy(pat, line, patlen);
-      pat[patlen] = '\0';
-      keys = eq + 1;
-    } else {
-      pat[0] = '*';
-      pat[1] = '\0';
-      keys = line;
-    }
-    if (strjoker(url, pat, NULL, NULL) != NULL) {
-      size_t klen = (size_t) (eol - keys);
-
-      if (klen >= destsize)
-        klen = destsize - 1;
-      memcpy(dest, keys, klen);
-      dest[klen] = '\0';
-      result = dest;
-    }
-  }
-  return result;
-}
-
 #define endwith(a) ( (len >= (sizeof(a)-1)) ? ( strncmp(dest, a+len-(sizeof(a)-1), sizeof(a)-1) == 0 ) : 0 );
 HTSEXT_API char *adr_normalized_sized(const char *source, char *dest,
                                      size_t destsize) {
@@ -4341,20 +4158,6 @@ void guess_httptype(httrackp * opt, char *s, const char *fil) {
  (void) get_httptype_sized(opt, s, HTS_MIMETYPE_SIZE, fil, 1);
 }

-// first match in a NUL-terminated {mime,ext} table. key selects the lookup
-// column (0=mime, 1=ext); returns the other column, or NULL if no row matches
-// (a "*" partner means the row carries no value).
-static const char *hts_mime_lookup(const char *(*table)[2], int key,
-                                   const char *needle) {
-  int j;
-
-  for (j = 0; strnotempty(table[j][1]); j++) {
-    if (strfield2(table[j][key], needle) && table[j][!key][0] != '*')
-      return table[j][!key];
-  }
-  return NULL;
-}
-
 // write the mime type for fil into s (capacity ssize)
 // flag: 1 to always return a type (the "application/..." / octet-stream
 // fallback) returns 1 if a type was written to s, 0 otherwise
@@ -4374,19 +4177,20 @@ HTSEXT_API hts_boolean get_httptype_sized(httrackp *opt, char *s, size_t ssize,
    /* Check html -> text/html */
    const char *a = fil + strlen(fil) - 1;

-    /* a < fil when fil is empty: bound before dereferencing */
-    while ((a > fil) && (*a != '.') && (*a != '/'))
+    while((a > fil) && (*a != '.') && (*a != '/'))
      a--;
-    if (a >= fil && *a == '.' && strlen(a) < 32) {
-      const char *mime;
+    if (a >= fil && *a == '.' && strlen(a) < 32) {
+      int j = 0;

      a++;
-      mime = hts_mime_lookup(hts_mime, 1, a);
-      if (mime == NULL)
-        mime = hts_mime_lookup(hts_mime_modern, 1, a);
-      if (mime != NULL) {
-        strlcpybuff(s, mime, ssize);
-        return 1;
+      while(strnotempty(hts_mime[j][1])) {
+        if (strfield2(hts_mime[j][1], a)) {
+          if (hts_mime[j][0][0] != '*') { // a match exists
+            strlcpybuff(s, hts_mime[j][0], ssize);
+            return 1;
+          }
+        }
+        j++;
      }

      if (flag) {
@@ -4521,16 +4325,18 @@ int get_userhttptype(httrackp * opt, char *s, const char *fil) {
 // returns 1 if an extension was found (and written to s), 0 otherwise
 int give_mimext(char *s, size_t ssize, const char *st) {
  int ok = 0;
-  const char *ext;
+  int j = 0;

  st = hts_effective_mime(st); /* no declared type: derive an html ext */
  s[0] = '\0';
-  ext = hts_mime_lookup(hts_mime, 0, st);
-  if (ext == NULL)
-    ext = hts_mime_lookup(hts_mime_modern, 0, st);
-  if (ext != NULL) {
-    strlcpybuff(s, ext, ssize);
-    ok = 1;
+  while((!ok) && (strnotempty(hts_mime[j][1]))) {
+    if (strfield2(hts_mime[j][0], st)) {
+      if (hts_mime[j][1][0] != '*') { // a match exists
+        strlcpybuff(s, hts_mime[j][1], ssize);
+        ok = 1;
+      }
+    }
+    j++;
  }
  // wrap "x" mimetypes, such as:
  // application/x-mp3
@@ -6048,7 +5854,8 @@ HTSEXT_API httrackp *hts_create_opt(void) {
  opt->shell = HTS_FALSE;
  opt->proxy.active = 0;        // pas de proxy
  opt->user_agent_send = HTS_TRUE;
-  StringCopy(opt->user_agent, HTS_DEFAULT_USER_AGENT);
+  StringCopy(opt->user_agent,
+             "Mozilla/4.5 (compatible; HTTrack 3.0x; Windows 98)");
  StringCopy(opt->referer, "");
  StringCopy(opt->from, "");
  opt->savename_83 = HTS_SAVENAME_83_LONG; // long names by default
@@ -6082,14 +5889,7 @@ HTSEXT_API httrackp *hts_create_opt(void) {
  opt->verbosedisplay = HTS_VERBOSE_NONE; // no text animation
  opt->sizehack = HTS_FALSE;
  opt->urlhack = HTS_TRUE;
-  opt->no_www_dedup = HTS_FALSE;
-  opt->no_slash_dedup = HTS_FALSE;
-  opt->no_query_dedup = HTS_FALSE;
  StringCopy(opt->footer, HTS_DEFAULT_FOOTER);
-  StringCopy(opt->strip_query, "");
-  StringCopy(opt->cookies_file, "");
-  opt->pause_min_ms = 0;
-  opt->pause_max_ms = 0;
  opt->ftp_proxy = HTS_TRUE;
  opt->convert_utf8 = HTS_TRUE;
  StringCopy(opt->filelist, "");
@@ -6234,8 +6034,6 @@ HTSEXT_API void hts_free_opt(httrackp * opt) {
    StringFree(opt->urllist);
    StringFree(opt->footer);
    StringFree(opt->mod_blacklist);
-    StringFree(opt->strip_query);
-    StringFree(opt->cookies_file);

    StringFree(opt->path_html);
    StringFree(opt->path_html_utf8);
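The two steps of fil_normalized in the htslib.c hunks above (collapse `//` before the query, then sort the `&`-separated arguments) can be sketched roughly as below. `normalize_path` and `cmp_args` are hypothetical names, not the engine's; the sketch assumes `dest` holds at least `strlen(source) + 1` bytes and returns NULL on allocation failure, which the real function may handle differently.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* qsort comparator over an array of char* (one entry per query argument). */
static int cmp_args(const void *a, const void *b) {
  return strcmp(*(const char *const *) a, *(const char *const *) b);
}

/* Hypothetical stand-in for fil_normalized(): returns dest, or NULL if
   memory for the sort could not be allocated. */
static char *normalize_path(const char *source, char *dest) {
  size_t i, j = 0, n, nargs = 1, qlen;
  char *query, *copy, *p;
  char **args;

  assert(source != NULL && dest != NULL);
  /* Step 1: foo//bar -> foo/bar, only before the first '?'. */
  for (i = 0; source[i] != '\0' && source[i] != '?'; i++) {
    if (source[i] == '/' && j > 0 && dest[j - 1] == '/')
      continue;
    dest[j++] = source[i];
  }
  strcpy(dest + j, source + i); /* query (or just the NUL) copied verbatim */
  query = dest + j;
  if (*query != '?' || strchr(query, '&') == NULL)
    return dest; /* zero or one argument: nothing to sort */

  /* Step 2: split a copy of the arguments on '&', sort, and re-join. */
  qlen = strlen(query + 1);
  for (p = query + 1; *p != '\0'; p++)
    if (*p == '&')
      nargs++;
  copy = malloc(qlen + 1);
  args = malloc(nargs * sizeof(*args));
  if (copy == NULL || args == NULL) {
    free(copy);
    free(args);
    return NULL;
  }
  memcpy(copy, query + 1, qlen + 1);
  n = 0;
  args[n++] = copy;
  for (p = copy; *p != '\0'; p++) {
    if (*p == '&') {
      *p = '\0';
      args[n++] = p + 1;
    }
  }
  qsort(args, nargs, sizeof(*args), cmp_args);
  p = query + 1;
  for (n = 0; n < nargs; n++) {
    const size_t len = strlen(args[n]);

    if (n > 0)
      *p++ = '&';
    memcpy(p, args[n], len);
    p += len;
  }
  *p = '\0'; /* same total length as before, so dest cannot overflow */
  free(args);
  free(copy);
  return dest;
}
```

Slashes inside the query are left alone, so a value such as `x=//1` survives, while `?foo=1&bar=2` and `?bar=2&foo=1` end up as the same string.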
--- a/src/htsname.c
+++ b/src/htsname.c
@@ -198,13 +198,6 @@ int url_savename(lien_adrfilsave *const afs,
  // copy of fil, used for lookups (see urlhack)
  const char *normadr = adr;
  const char *normfil = fil_complete;
-  /* query keys to strip for this URL (NULL = none); decoupled from urlhack */
-  char BIGSTK stripkeys[HTS_URLMAXSIZE];
-  const char *const strip =
-      StringNotEmpty(opt->strip_query)
-          ? hts_query_strip_keys(StringBuff(opt->strip_query), adr,
-                                 fil_complete, stripkeys, sizeof(stripkeys))
-          : NULL;
  const char *const print_adr = jump_protocol_const(adr);
  const char *start_pos = NULL, *nom_pos = NULL, *dot_pos = NULL;     // Position nom et point

@@ -237,13 +230,9 @@ int url_savename(lien_adrfilsave *const afs,
  // www-42.foo.com -> foo.com
  // foo.com/bar//foobar -> foo.com/bar/foobar
  if (opt->urlhack) {
-    // dedup-lookup key; honor the per-feature negatives like htshash.c so
-    // distinct URLs keep distinct savenames (else keep normadr = adr)
-    if (!opt->no_www_dedup)
-      normadr = adr_normalized_sized(adr, normadr_, sizeof(normadr_));
-    normfil =
-        fil_normalized_filtered_ex(fil_complete, normfil_, strip,
-                                   !opt->no_slash_dedup, !opt->no_query_dedup);
+    // copy of adr (without protocol), used for lookups (see urlhack)
+    normadr = adr_normalized_sized(adr, normadr_, sizeof(normadr_));
+    normfil = fil_normalized(fil_complete, normfil_);
  } else {
    if (link_has_authority(adr_complete)) {     // https or other protocols : in "http/" subfolder
      char *pos = strchr(adr_complete, ':');
@@ -256,11 +245,6 @@ int url_savename(lien_adrfilsave *const afs,
        normadr = normadr_;
      }
    }
-    // strip still applies with urlhack off (host left untouched); no // or
-    // query-sort here, to match the hash key (norm_slash/norm_query are 0 when
-    // urlhack is off) so a URL is looked up under the key it was stored with
-    if (strip != NULL)
-      normfil = fil_normalized_filtered_ex(fil_complete, normfil_, strip, 0, 0);
  }

  // à afficher sans ftp://
--- a/src/htsopt.h
+++ b/src/htsopt.h
@@ -529,16 +529,6 @@ struct httrackp {
  htslibhandles libHandles; /**< loaded external module handles */
  //
  htsoptstate state; /**< embedded live engine state */
-  String strip_query; /**< query keys to drop when deduping URLs (-strip-query);
-                           appended at the tail to keep field offsets stable */
-  hts_boolean
-      no_www_dedup; /**< with urlhack, keep www.host distinct from host */
-  hts_boolean no_slash_dedup; /**< with urlhack, keep redundant // in paths */
-  hts_boolean no_query_dedup; /**< with urlhack, keep query-argument order */
-  String cookies_file;        /**< extra Netscape cookies.txt to preload
-                                 (--cookies-file) */
-  int pause_min_ms; /**< inter-file pause lower bound, ms (0=off, #185) */
-  int pause_max_ms; /**< inter-file pause upper bound, ms */
 };

 /* Running statistics for a mirror. */
--- a/src/htsparse.c
+++ b/src/htsparse.c
@@ -302,14 +302,6 @@ static HTS_INLINE char html_prevc(const char *html, const char *start) {
  return html > start ? html[-1] : ' ';
 }

-/* Drop a redirect Location's #fragment: a UA anchor, never part of the fetched
- * resource (#204). */
-static void url_drop_fragment(char *const url) {
-  char *const frag = strchr(url, '#');
-  if (frag != NULL)
-    *frag = '\0';
-}
-
 /* True if [s, s+len) is exactly an HTTP method token (XHR.open's first
   argument is a method, not a URL: #218). Case-insensitive. */
 static int is_http_method(const char *s, size_t len) {
@@ -3604,35 +3596,22 @@ int hts_mirror_check_moved(htsmoduleStruct * str,
        //

        strcpybuff(mov_url, r->location);
-        url_drop_fragment(mov_url);

        // url qque -> adresse+fichier
        if ((reponse =
             ident_url_relatif(mov_url, urladr(), urlfil(), moved)) >= 0) {
          int set_prio_to = 0;  // pas de priotité fixéd par wizard

-          // check whether URLHack is harmless or not (per the effective
-          // sub-flags)
-          if (opt->urlhack && (!opt->no_www_dedup || !opt->no_slash_dedup ||
-                               !opt->no_query_dedup)) {
-            const int norm_host = !opt->no_www_dedup;
-            const int norm_slash = !opt->no_slash_dedup;
-            const int norm_query = !opt->no_query_dedup;
+          // check whether URLHack is harmless or not
+          if (opt->urlhack) {
            char BIGSTK n_adr[HTS_URLMAXSIZE * 2], n_fil[HTS_URLMAXSIZE * 2];
            char BIGSTK pn_adr[HTS_URLMAXSIZE * 2], pn_fil[HTS_URLMAXSIZE * 2];

-            strlcpybuff(n_adr,
-                        norm_host ? jump_normalized_const(moved->adr)
-                                  : jump_identification_const(moved->adr),
-                        sizeof(n_adr));
-            strlcpybuff(pn_adr,
-                        norm_host ? jump_normalized_const(urladr())
-                                  : jump_identification_const(urladr()),
-                        sizeof(pn_adr));
-            fil_normalized_filtered_ex(moved->fil, n_fil, NULL, norm_slash,
-                                       norm_query);
-            fil_normalized_filtered_ex(urlfil(), pn_fil, NULL, norm_slash,
-                                       norm_query);
+            n_adr[0] = n_fil[0] = '\0';
+            (void) adr_normalized_sized(moved->adr, n_adr, sizeof(n_adr));
+            (void) fil_normalized(moved->fil, n_fil);
+            (void) adr_normalized_sized(urladr(), pn_adr, sizeof(pn_adr));
+            (void) fil_normalized(urlfil(), pn_fil);
            if (strcasecmp(n_adr, pn_adr) == 0
                && strcasecmp(n_fil, pn_fil) == 0) {
              hts_log_print(opt, LOG_WARNING,
@@ -4812,7 +4791,6 @@ int hts_wait_delayed(htsmoduleStruct * str, lien_adrfilsave *afs,

            mov_url[0] = '\0';
            strcpybuff(mov_url, back[b].r.location);    // copier URL
-            url_drop_fragment(mov_url);

            /* Remove (temporarily created) file if it was created */
            UNLINK(fconv(OPT_GET_BUFF(opt), OPT_GET_BUFF_SIZE(opt), back[b].url_sav));
--- a/src/htsselftest.c
+++ b/src/htsselftest.c
--- a/src/htsselftest.h
+++ b/src/htsselftest.h
@@ -1,52 +0,0 @@
-/* ------------------------------------------------------------ */
-/*
-HTTrack Website Copier, Offline Browser for Windows and Unix
-Copyright (C) 2026 Xavier Roche and other contributors
-
-SPDX-License-Identifier: GPL-3.0-or-later
-
-This program is free software: you can redistribute it and/or modify
-it under the terms of the GNU General Public License as published by
-the Free Software Foundation, either version 3 of the License, or
-(at your option) any later version.
-
-This program is distributed in the hope that it will be useful,
-but WITHOUT ANY WARRANTY; without even the implied warranty of
-MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
-GNU General Public License for more details.
-
-You should have received a copy of the GNU General Public License
-along with this program. If not, see <http://www.gnu.org/licenses/>.
-
-Ethical use: we kindly ask that you NOT use this software to harvest email
-addresses or to collect any other private information about people. Doing so
-would dishonor our work and waste the many hours we have spent on it.
-
-Please visit our Website: http://www.httrack.com
-*/
-
-/* ------------------------------------------------------------ */
-/* File: htsselftest.h                                          */
-/*       named dispatch for the hidden engine self-tests        */
-/* Author: Xavier Roche                                         */
-/* ------------------------------------------------------------ */
-
-#ifndef HTSSELFTEST_DEFH
-#define HTSSELFTEST_DEFH
-
-#ifdef HTS_INTERNAL_BYTECODE
-
-#ifndef HTS_DEF_FWSTRUCT_httrackp
-#define HTS_DEF_FWSTRUCT_httrackp
-typedef struct httrackp httrackp;
-#endif
-
-/* Run engine self-test `name` over the positional args argv[0..argc-1], or list
-   the available tests when name is NULL, empty, or "list". Prints the result;
-   returns the process exit code (0 == success). The caller owns option cleanup.
-   Reached through the hidden `httrack -#test[=NAME ...]` subcommand. */
-int hts_selftest(httrackp *opt, const char *name, int argc, char **argv);
-
-#endif
-
-#endif
--- a/src/htsserver.c
+++ b/src/htsserver.c
@@ -358,12 +358,12 @@ int smallserver(T_SOC soc, char *url, char *method, char *data, char *path) {
      {NULL, 0}
    };
    initStrElt initStr[] = {
-        {"user", HTS_DEFAULT_USER_AGENT},
-        {"footer", "<!-- Mirrored from %s%s by HTTrack Website Copier/3.x "
-                   "[XR&CO'2014], %s -->"},
-        {"url2",
-         "+*.png +*.gif +*.jpg +*.jpeg +*.css +*.js -ad.doubleclick.net/*"},
-        {NULL, NULL}};
+      {"user", "Mozilla/4.5 (compatible; HTTrack 3.0x; Windows 98)"},
+      {"footer",
+       "<!-- Mirrored from %s%s by HTTrack Website Copier/3.x [XR&CO'2014], %s -->"},
+      {"url2", "+*.png +*.gif +*.jpg +*.jpeg +*.css +*.js -ad.doubleclick.net/*"},
+      {NULL, NULL}
+    };
    int i = 0;

    for(i = 0; initInt[i].name; i++) {
--- a/src/htswizard.c
+++ b/src/htswizard.c
@@ -80,10 +80,6 @@ htspair_t hts_detect_embed[] = {
  {NULL, NULL}
 };

-/* HTML5 media siblings of <img src>: same near-link treatment (#451) */
-static const htspair_t hts_detect_embed_html5[] = {
-    {"source", "src"}, {"source", "srcset"}, {"track", "src"}, {NULL, NULL}};
-
 /* Internal */
 static int hts_acceptlink_(httrackp * opt, int ptr, const char *adr,
                           const char *fil, const char *tag,
@@ -140,17 +136,6 @@ static int cmp_token(const char *tag, const char *cmp) {
          && !isalnum((unsigned char) tag[p]));
 }

-/* TRUE if (tag, attribute) matches an embedded-asset pair in the table */
-static hts_boolean is_embed_pair(const htspair_t *table, const char *tag,
-                                 const char *attribute) {
-  int i;
-  for (i = 0; table[i].tag != NULL; i++) {
-    if (cmp_token(tag, table[i].tag) && cmp_token(attribute, table[i].attr))
-      return HTS_TRUE;
-  }
-  return HTS_FALSE;
-}
-
 static int hts_acceptlink_(httrackp * opt, int ptr,
                           const char *adr, const char *fil, const char *tag,
                           const char *attribute, int *set_prio_to,
@@ -178,9 +163,15 @@ static int hts_acceptlink_(httrackp * opt, int ptr,

  /* Built-in known tags (<img src=..>, ..) */
  if (forbidden_url != 0 && opt->nearlink && tag != NULL && attribute != NULL) {
-    if (is_embed_pair(hts_detect_embed, tag, attribute) ||
-        is_embed_pair(hts_detect_embed_html5, tag, attribute)) {
-      embedded_triggered = 1;
+    int i;
+
+    for(i = 0; hts_detect_embed[i].tag != NULL; i++) {
+      if (cmp_token(tag, hts_detect_embed[i].tag)
+          && cmp_token(attribute, hts_detect_embed[i].attr)
+        ) {
+        embedded_triggered = 1;
+        break;
+      }
    }
  }

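The table scan restored in hts_acceptlink_ above is a plain sentinel-terminated lookup. A minimal sketch, assuming a hypothetical `pair_t` that mirrors `htspair_t` and an illustrative table (these entries are not the actual contents of `hts_detect_embed`); `token_equals` is a simplified whole-string stand-in for `cmp_token`, whose exact matching rules are not shown in this diff.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical mirror of htspair_t: one (tag, attribute) pair per row. */
typedef struct {
  const char *tag;
  const char *attr;
} pair_t;

/* Illustrative entries only; the table ends with a {NULL, NULL} sentinel. */
static const pair_t embed_pairs[] = {
  {"img", "src"}, {"link", "href"}, {"script", "src"}, {NULL, NULL}
};

/* Simplified stand-in for cmp_token(): case-insensitive full match. */
static int token_equals(const char *a, const char *b) {
  while (*a != '\0' && *b != '\0') {
    if (tolower((unsigned char) *a) != tolower((unsigned char) *b))
      return 0;
    a++;
    b++;
  }
  return *a == *b;
}

/* Returns 1 if (tag, attr) matches a row of the table, 0 otherwise. */
static int is_embedded(const pair_t *table, const char *tag,
                       const char *attr) {
  size_t i;

  assert(table != NULL && tag != NULL && attr != NULL);
  for (i = 0; table[i].tag != NULL; i++) {
    if (token_equals(tag, table[i].tag) && token_equals(attr, table[i].attr))
      return 1;
  }
  return 0;
}
```

Passing the table as a parameter is what let the removed `is_embed_pair` helper scan a second table; the inline loop on this side of the diff hard-wires the single legacy table instead.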
--- a/tests/01_engine-cache-golden.test
+++ b/tests/01_engine-cache-golden.test
@@ -4,7 +4,7 @@
 # POSIX /bin/sh on some platforms (e.g. macOS), so avoid bashisms and GNU-only
 # tool flags despite the #!/bin/bash above.

-# Golden cache-format regression test (driven by 'httrack -#test=cache-golden <dir>').
+# Golden cache-format regression test (driven by 'httrack -#B <dir>').
 #
 # 01_engine-cache.test writes the cache with the same build it reads back (a
 # round-trip), so it cannot catch a read-path or ZIP-format regression where
@@ -13,7 +13,7 @@
 # byte-exact.
 #
 # Regenerate the fixture after a deliberate format change with
-# 'httrack -#test=cache-golden <dir> regen', then copy <dir>/hts-cache/new.zip over the
+# 'httrack -#B <dir> regen', then copy <dir>/hts-cache/new.zip over the
 # committed file.

 set -eu
@@ -37,11 +37,11 @@ trap 'rm -rf "$dir"' EXIT
 mkdir -p "$dir/hts-cache"
 cp "$fixture/hts-cache/new.zip" "$dir/hts-cache/new.zip"

-out=$(httrack -#test=cache-golden "$dir")
+out=$(httrack -#B "$dir")

 # Match the exact success line: the read must have found and verified every
-# entry, not merely failed to enter the mode (a renamed/removed test prints the
-# registry to stderr, which also exits non-zero but never prints this).
+# entry, not merely failed to enter the mode (a bad -#B falls through to the
+# usage screen, which also exits non-zero but never prints this).
 test "$out" = "cache-golden: OK" || {
    echo "expected 'cache-golden: OK', got: $out" >&2
    exit 1
--- a/tests/01_engine-cache-writefail.test
+++ b/tests/01_engine-cache-writefail.test
@@ -1,24 +0,0 @@
-#!/bin/bash
-#
-# Keep this POSIX-portable: the harness runs it via $(BASH), which is a plain
-# POSIX /bin/sh on some platforms (e.g. macOS), so avoid bashisms and GNU-only
-# tool flags despite the #!/bin/bash above.
-
-# Cache write-failure handling (httrack -#test=cache-writefail <dir>). #174/#219.
-# A failing new.zip write (disk full) used to crash the process via assertf; it
-# must instead stop the mirror with a fatal error (exit_xh=-1), no crash. The
-# self-test asserts that; reverting the fix makes -#test=cache-writefail abort (SIGABRT) and fail.
-
-set -eu
-
-dir=$(mktemp -d)
-trap 'rm -rf "$dir"' EXIT
-
-out=$(httrack -#test=cache-writefail "$dir")
-
-# Match the exact success line (error logs also go to stdout); a renamed/removed
-# test prints the registry to stderr, which exits non-zero but never prints this.
-printf '%s\n' "$out" | grep -qx "cache-writefail: OK" || {
-    echo "expected 'cache-writefail: OK', got: $out" >&2
-    exit 1
-}
--- a/tests/01_engine-cache.test
+++ b/tests/01_engine-cache.test
@@ -4,7 +4,7 @@
 # POSIX /bin/sh on some platforms (e.g. macOS), so avoid bashisms and GNU-only
 # tool flags despite the #!/bin/bash above.

-# Cache create/read/update logic (driven by 'httrack -#test=cache <dir>').
+# Cache create/read/update logic (driven by 'httrack -#A <dir>').
 #
 # The in-process self-test stores several hand-crafted edge entries (normal
 # HTML, an empty redirect with a near-limit location, a non-HTML body kept via
@@ -20,13 +20,13 @@ set -eu
 dir=$(mktemp -d)
 trap 'rm -rf "$dir"' EXIT

-# The working directory is a required argument; without it the test prints a
-# usage line to stderr and returns non-zero.
-out=$(httrack -#test=cache "$dir")
+# Like the other -# debug modes, a trailing token (the working directory) is
+# required; a bare '-#A' falls through to the usage screen.
+out=$(httrack -#A "$dir")

 # Match the exact success line, so the test cannot pass for an unrelated reason
-# (e.g. the cache test being gone, which prints the registry to stderr but
-# never prints this line).
+# (e.g. the -#A mode being gone and falling through to the usage screen, which
+# also exits non-zero but never prints this).
 test "$out" = "cache-selftest: OK" || {
    echo "expected 'cache-selftest: OK', got: $out" >&2
    exit 1
--- a/tests/01_engine-charset.test
+++ b/tests/01_engine-charset.test
@@ -4,13 +4,13 @@
 set -euo pipefail

 # charset -> UTF-8 conversion (hts_convertStringToUTF8).
-# -#test=charset <charset> <string> prints the string re-decoded from <charset> as UTF-8.
+# -#3 <charset> <string> prints the string re-decoded from <charset> as UTF-8.
 conv() {
-    test "$(httrack -O /dev/null -#test=charset "$1" "$2")" == "$3" || exit 1
+    test "$(httrack -O /dev/null -#3 "$1" "$2")" == "$3" || exit 1
 }
 # crash probe: malformed input must exit cleanly, not abort.
 runs() {
-    httrack -O /dev/null -#test=charset "$1" "$2" >/dev/null 2>&1 || exit 1
+    httrack -O /dev/null -#3 "$1" "$2" >/dev/null 2>&1 || exit 1
 }

 # the source bytes below are UTF-8 (this file is UTF-8); "café" is 0x63 61 66 C3 A9.
@@ -31,7 +31,7 @@ conv 'us-ascii' 'hello' 'hello'
 # unknown charset: ASCII passes through unchanged, but non-ASCII input cannot be
 # decoded and yields empty output (an error is printed to stderr).
 conv 'no-such-charset-xyz' 'abc' 'abc'
-test "$(httrack -O /dev/null -#test=charset 'no-such-charset-xyz' 'café' 2>/dev/null)" == "" || exit 1
+test "$(httrack -O /dev/null -#3 'no-such-charset-xyz' 'café' 2>/dev/null)" == "" || exit 1

 # malformed UTF-8 (lone continuation byte, truncated lead byte) must not crash
 runs 'utf-8' $'\x80'
--- a/tests/01_engine-cmdline.test
+++ b/tests/01_engine-cmdline.test
@@ -90,16 +90,4 @@ refused "dangling-quote argument not refused cleanly"
 run_only "$tmp/q-lone" '"'
 refused "lone-quote argument not refused cleanly"

-# --pause (#185): valid MIN[:MAX] accepted; malformed, reversed, over-range and
-# non-finite values refused cleanly. NaN defeats naive `<`/`>` checks (it
-# compares false to everything), so it must not slip through to the int cast.
-run "$tmp/pause-ok" --pause 0.2:0.4
-accepted "$tmp/pause-ok" "#185: valid --pause range rejected"
-run "$tmp/pause-fix" --pause 0.2
-accepted "$tmp/pause-fix" "#185: valid fixed --pause rejected"
-for bad in nan nan:5 5:nan inf 10:5 99999; do
-    run "$tmp/pause-bad" --pause "$bad"
-    refused "#185: invalid --pause '$bad' not refused cleanly"
-done
-
 exit 0
--- a/tests/01_engine-cookies.test
+++ b/tests/01_engine-cookies.test
@@ -1,15 +1,14 @@
 #!/bin/bash
 #
 # Issue #151 guard: the request Cookie header must be bare RFC 6265 name=value
-# pairs, no $Version/$Path attributes. Driven by the 'httrack -#test=cookies' selftest.
+# pairs, no $Version/$Path attributes. Driven by the 'httrack -#Q' selftest.

 set -eu

-# 'run' is an ignored placeholder argument.
-out=$(httrack -#test=cookies run)
+# A trailing token is required; a bare '-#Q' falls through to the usage screen.
+out=$(httrack -#Q run)

-# Exact-match the success line so a renamed/removed test (it prints the registry
-# to stderr) can't pass.
+# Exact-match the success line so a fall-through to usage can't pass the test.
 test "$out" = "cookie-header: OK" || {
    echo "expected 'cookie-header: OK', got: $out" >&2
    exit 1
--- a/tests/01_engine-copyopt.test
+++ b/tests/01_engine-copyopt.test
@@ -2,16 +2,15 @@
 #
 # Regression guard for the unsigned-enum sentinel trap: copy_htsopt's
 # `if (from->X > -1)` guard is always false for unsigned hts_boolean fields, so
-# they silently stop being copied. Driven by the in-process 'httrack -#test=copyopt' test.
+# they silently stop being copied. Driven by the in-process 'httrack -#9' test.
 # Keep POSIX-portable (harness runs it via $(BASH), a plain /bin/sh on macOS).

 set -eu

-# 'run' is an ignored placeholder argument.
-out=$(httrack -#test=copyopt run)
+# A trailing token is required; a bare '-#9' falls through to the usage screen.
+out=$(httrack -#9 run)

-# Exact-match the success line so a renamed/removed test (it prints the registry
-# to stderr) can't pass.
+# Exact-match the success line so a fall-through to usage can't pass the test.
 test "$out" = "copy-htsopt: OK" || {
    echo "expected 'copy-htsopt: OK', got: $out" >&2
    exit 1
--- a/tests/01_engine-dns.test
+++ b/tests/01_engine-dns.test
@@ -5,8 +5,9 @@ set -euo pipefail

 # DNS resolver/cache self-test: a mock getaddrinfo (no network) checks address
 # family, single-address selection, the -@i4/-@i6 family filter, and cache reuse.
-# 'run' is an ignored placeholder argument.
-out=$(httrack -#test=dns run)
+# The trailing token is required, like the other -# selftests, so a bare command
+# line isn't treated as "no arguments" and routed to the usage screen.
+out=$(httrack -#D run)

 test "$out" = "dns-selftest: OK" || {
    echo "expected 'dns-selftest: OK', got: $out" >&2
--- a/tests/01_engine-entities.test
+++ b/tests/01_engine-entities.test
@@ -4,13 +4,13 @@
 set -euo pipefail

 # HTML entity unescaping (hts_unescapeEntitiesWithCharset).
-# -#test=entities <string> prints the string with entities decoded (UTF-8 output).
+# -#6 <string> prints the string with entities decoded (UTF-8 output).
 ent() {
-    test "$(httrack -O /dev/null -#test=entities "$1")" == "$2" || exit 1
+    test "$(httrack -O /dev/null -#6 "$1")" == "$2" || exit 1
 }
 # crash probe: malformed input must exit cleanly, not abort.
 runs() {
-    httrack -O /dev/null -#test=entities "$1" >/dev/null 2>&1 || exit 1
+    httrack -O /dev/null -#6 "$1" >/dev/null 2>&1 || exit 1
 }

 # named entities
@@ -18,21 +18,6 @@ ent '&amp;' '&'
 ent '&lt;&gt;' '<>'
 ent '&eacute;' 'é'

-# HTML5 names from the WHATWG set
-ent '&hellip;' '…'
-ent '&bigcup;' '⋃'
-# longest name (31 chars) exercises the name-length cap
-ent '&CounterClockwiseContourIntegral;' '∳'
-# astral codepoint -> 4-byte UTF-8
-ent '&Aopf;' '𝔸'
-# multi-codepoint refs are skipped at generation, so left verbatim
-ent '&fjlig;' '&fjlig;'
-
-# common HTML4 names still decode (regression guard against accidental drops)
-ent '&copy;&reg;&trade;' '©®™'
-ent '&mdash;&ndash;' '—–'
-ent '&alpha;&beta;' 'αβ'
-
 # numeric: decimal and hex
 ent '&#65;&#66;' 'AB'
 ent '&#x41;' 'A'
--- a/tests/01_engine-filelist.test
+++ b/tests/01_engine-filelist.test
@@ -1,65 +0,0 @@
-#!/bin/bash
-#
-# -%L URL-list loading (#49): a readable list is honored; an unusable one fails
-# with the reason (errno / not-a-regular-file), not a bare "Could not include
-# URL list". Offline: file:// fixture, no server. Asserts on httrack's own
-# strings and the message shape, so it is locale-independent.
-
-set -euo pipefail
-
-tmp=$(mktemp -d "${TMPDIR:-/tmp}/httrack_filelist.XXXXXX") || exit 1
-trap 'rm -rf "$tmp"' EXIT HUP INT QUIT PIPE TERM
-
-echo '<html><body>hi</body></html>' >"$tmp/index.html"
-
-# run httrack with the given -%L target; structured log lands in $out/hts-log.txt
-run() {
-    local out="$1" list="$2"
-    rm -rf "$out"
-    mkdir -p "$out"
-    httrack -O "$out" --quiet -n "-%L" "$list" >"$out/.stdout" 2>&1 || true
-    LOG="$out/hts-log.txt"
-}
-
-fail() {
-    echo "FAIL: $1"
-    cat "$LOG"
-    exit 1
-}
-loghas() {
-    grep -Eq "$1" "$LOG" || fail "expected /$1/ in $LOG"
-}
-lognot() {
-    if grep -Eq "$1" "$LOG"; then fail "unexpected /$1/ in $LOG"; fi
-}
-
-# readable list: its one URL is loaded and counted (count must be non-zero)
-printf 'file://%s/index.html\n' "$tmp" >"$tmp/urls.txt"
-run "$tmp/ok" "$tmp/urls.txt"
-loghas '[1-9][0-9]* links added from'
-
-# missing file: quoted name + a non-empty reason, never the old reasonless
-# "Could not include URL list: <name>". The reason is the stat() errno, not the
-# directory fallback literal (guards against dropping the errno lookup).
-run "$tmp/miss" "$tmp/nope.txt"
-loghas 'Could not include URL list "[^"]+": .+'
-lognot 'Could not include URL list: '
-lognot 'not a regular file'
-
-# a directory is rejected with our own reason (locale-independent)
-mkdir -p "$tmp/adir"
-run "$tmp/dir" "$tmp/adir"
-loghas 'Could not include URL list "[^"]+": not a regular file'
-
-# unreadable regular file: the fopen() errno arm fires, distinct from the
-# directory branch. Root bypasses mode 000, so skip it there.
-if test "$(id -u)" -ne 0; then
-    : >"$tmp/noperm.txt"
-    chmod 000 "$tmp/noperm.txt"
-    run "$tmp/perm" "$tmp/noperm.txt"
-    chmod 644 "$tmp/noperm.txt"
-    loghas 'Could not include URL list "[^"]+": .+'
-    lognot 'not a regular file'
-fi
-
-exit 0
--- a/tests/01_engine-filter.test
+++ b/tests/01_engine-filter.test
@@ -4,13 +4,13 @@
 set -euo pipefail

 # wildcard filter engine (strjoker), the core of +/- include/exclude rules.
-# -#test=filter <filter> <string> prints "<string> does match <filter>" or "... does NOT match ...".
+# -#0 <filter> <string> prints "<string> does match <filter>" or "... does NOT match ...".

 match() {
-    test "$(httrack -O /dev/null -#test=filter "$1" "$2")" == "$2 does match $1" || exit 1
+    test "$(httrack -O /dev/null -#0 "$1" "$2")" == "$2 does match $1" || exit 1
 }
 nomatch() {
-    test "$(httrack -O /dev/null -#test=filter "$1" "$2")" == "$2 does NOT match $1" || exit 1
+    test "$(httrack -O /dev/null -#0 "$1" "$2")" == "$2 does NOT match $1" || exit 1
 }

 # bare star matches everything
@@ -50,75 +50,24 @@ match '*foo*bar' 'foozbar'
 # '?' is the query-string marker, not a single-char wildcard
 nomatch 'a?c' 'abc'

-# Inside a class, backslash escapes the next char as a literal member (#148):
-# '\X' matches X only (not '\'), and an escaped ']' is a member, not the terminator.
+# backslash escapes a metacharacter inside a class so it is matched literally.
+# Quirk: the decoder also adds the backslash itself to the set, so '\X' matches
+# both X and '\'. These assertions pin that behavior.
 match '*[\*]' '*'
-nomatch '*[\*]' "\\"
+match '*[\*]' "\\"
+nomatch '*[\*]' 'a'
 match '*[\\]' "\\"
-nomatch '*[\\]' '*'
+nomatch '*[\\]' 'a'
 match '*[\[]' '['
-nomatch '*[\[]' "\\"
-match '*[\]]' ']'
-nomatch '*[\]]' "\\"
+match '*[\[]' "\\"
+nomatch '*[\[]' 'a'

-# '*[\[\]]' is "the [ or ] character", as the filter guide documents.
-match '*[\[\]]' '['
-match '*[\[\]]' ']'
-nomatch '*[\[\]]' 'a'
-match '*[\[,\]]' '[' # comma between members is optional
-match '*[\[,\]]' ']'
-match '*[a,\[]' 'a' # an escaped member no longer eats the preceding one
-match '*[a,\[]' '['
-
-# Escape is decoded before the range/separator/size checks, so '\-' '\,' '\<'
-# are literal members, not operators.
-match '*[a\-z]' 'a'
-match '*[a\-z]' 'z'
-nomatch '*[a\-z]' 'b' # not the a..z range
-match '*[\,]' ','
-nomatch '*[\,]' "\\" # the escape must not leak '\' into the class
-match '*[\<]' '<'
-nomatch '*[\<]' "\\"
-match '*[\[,\],a]' '['
-match '*[\[,\],a]' ']'
-match '*[\[,\],a]' 'a'
-
-# A truncated range '*[a-' is the literal members {a,-}; the parser must not
-# read past the end decoding it (was a 1-byte heap over-read in the range arm).
-match '*[a-' 'a'
-nomatch '*[a-' 'b'
-
-# *(...) matches exactly one char from the class; *[...] matches a run.
-match '*(a,b)' 'a'
-nomatch '*(a,b)' 'aa'
-nomatch '*(a,b)' 'c'
-
-# documented composite filters (filters.html)
-match 'www.*[path].com/*[path].zip' 'www.foo.com/a/b.zip'
-nomatch 'www.*[path].com/*[path].zip' 'www.foo.com/a/b.tar'
-match '*.html*[]' 'page.html'
-nomatch '*.html*[]' 'page.html?x=1' # *[] forbids the trailing query
-
-# Size-based rules (-#test=filtersize <size> <string> <filter...>): a negative size
-# means the size is still unknown (scan time). A size exclusion must stay neutral
-# then, so the file is fetched and only cancelled once its size is known (#143).
-fsize() {
-    local want="$1"
-    shift
-    test "$(httrack -O /dev/null -#test=filtersize "$@")" == "$want" || exit 1
-}
-fsize 'verdict=allowed size_flag=0' -1 foo.jpg -* '+*.jpg' '-*.jpg*[<10]'   # scan time: keep
-fsize 'verdict=forbidden size_flag=1' 5 foo.jpg -* '+*.jpg' '-*.jpg*[<10]'  # <10KB: cancel
-fsize 'verdict=allowed size_flag=1' 20 foo.jpg -* '+*.jpg' '-*.jpg*[<10]'   # >=10KB: keep
-fsize 'verdict=forbidden size_flag=0' -1 foo.txt -* '+*.jpg' '-*.jpg*[<10]' # not a jpg
-# the '>' operator is just as neutral at scan time, and fires once size is known
-fsize 'verdict=allowed size_flag=0' -1 foo.jpg -* '+*.jpg' '-*.jpg*[>10]'   # scan time: keep
-fsize 'verdict=forbidden size_flag=1' 20 foo.jpg -* '+*.jpg' '-*.jpg*[>10]' # >10KB: cancel
-
-# [name]/[file]/[path] never span '?' mid-string; a trailing query is still
-# tolerated by the global '?' rule (same as plain *.aspx), not the class (#144).
-nomatch '*[path]/end' 'a?b/end'
-nomatch '*[file]end' 'foo?xend'
-nomatch '*[name]X' 'abc?X'
-match '*[file]' 'foo?x=1' # trailing query: tolerated, as for *.aspx
-match '*.aspx' 'page.aspx?y=2'
+# A literal ']' cannot be a class member: the class parser stops at the first
+# ']', escaped or not. So '*[\[\]]' does NOT mean "the [ or ] character" as the
+# filter guide claims (GitHub #148); it parses as the class {'[','\'} followed
+# by a trailing literal ']'. These assertions document the current (buggy)
+# behavior so any future matcher fix is a deliberate, visible change.
+nomatch '*[\[\]]' '[' # not matched, despite the docs
+match '*[\[\]]' ']'   # only via the empty class-match + trailing ']'
+match '*[\[\]]' '[]'  # one of {'[','\'} then the trailing ']'
+nomatch '*[\[\]]' '[]x'
--- a/tests/01_engine-hashtable.test
+++ b/tests/01_engine-hashtable.test
@@ -3,7 +3,5 @@

 set -euo pipefail

-# httrack internal hashtable autotest on 100K keys. Assert the success line (on
-# stderr) so a misrouted registry entry can't pass on exit code alone.
-out=$(httrack -#test=hashtable 100000 2>&1)
-printf '%s\n' "$out" | grep -q "all hashtable tests were successful!" || exit 1
+# httrack internal hashtable autotest on 100K keys
+httrack -#7 100000
--- a/tests/01_engine-idna.test
+++ b/tests/01_engine-idna.test
@@ -3,13 +3,13 @@

 set -euo pipefail

-# IDNA / punycode encode (-#test=idna-encode) and decode (-#test=idna-decode). This code has a CVE history,
+# IDNA / punycode encode (-#4) and decode (-#5). This code has a CVE history,
 # so the edge cases below cover passthrough, round-trips, and malformed input.

-enc() { test "$(httrack -O /dev/null -#test=idna-encode "$1")" == "$2" || exit 1; }
-dec() { test "$(httrack -O /dev/null -#test=idna-decode "$1")" == "$2" || exit 1; }
+enc() { test "$(httrack -O /dev/null -#4 "$1")" == "$2" || exit 1; }
+dec() { test "$(httrack -O /dev/null -#5 "$1")" == "$2" || exit 1; }
 # crash probe: malformed ACE input must exit cleanly, not abort.
-runs() { httrack -O /dev/null -#test=idna-decode "$1" >/dev/null 2>&1 || exit 1; }
+runs() { httrack -O /dev/null -#5 "$1" >/dev/null 2>&1 || exit 1; }

 # encode
 enc 'www.café.com' 'www.xn--caf-dma.com'
--- a/tests/01_engine-mime.test
+++ b/tests/01_engine-mime.test
@@ -4,13 +4,13 @@
 set -euo pipefail

 # MIME type guessing from extension (get_httptype / give_mimext).
-# -#test=mime <path> prints "<path> is '<mime>'" then "and its local type is '.<ext>'".
+# -#2 <path> prints "<path> is '<mime>'" then "and its local type is '.<ext>'".

 mime() {
-    test "$(httrack -O /dev/null -#test=mime "$1" | head -1)" == "$1 is '$2'" || exit 1
+    test "$(httrack -O /dev/null -#2 "$1" | head -1)" == "$1 is '$2'" || exit 1
 }
 unknown() {
-    test "$(httrack -O /dev/null -#test=mime "$1" | head -1)" == "$1 is of an unknown MIME type" || exit 1
+    test "$(httrack -O /dev/null -#2 "$1" | head -1)" == "$1 is of an unknown MIME type" || exit 1
 }

 mime '/a/b.html' 'text/html'
--- a/tests/01_engine-parse.test
+++ b/tests/01_engine-parse.test
@@ -323,33 +323,4 @@ grep -Fq 'href="ahref%20(4).gif"' "$saved9" ||
 ! grep -Eq '(src|href)="[^"]*%28' "$saved9" ||
    ! echo "FAIL #163: gate over-fired onto a non-url() attribute link" || exit 1

-# HTML5 <source>/<track> follow as embedded near-links past the -r2 depth boundary (#451).
-# img.gif positive control; plain.gif (bare <a href>) negative control proves the gate is selective.
-site10="$tmp/html5media"
-mkdir -p "$site10"
-for f in img ss plain; do gif "$site10/$f.gif"; done
-printf 'x' >"$site10/v.webm"
-printf 'x' >"$site10/subs.vtt"
-cat >"$site10/index.html" <<EOF
-<html><body><a href="leaf.html">leaf</a></body></html>
-EOF
-cat >"$site10/leaf.html" <<EOF
-<html><body>
-<img src="img.gif">
-<picture><source srcset="ss.gif 2x"></picture>
-<video><source src="v.webm"></video>
-<video><track src="subs.vtt"></video>
-<a href="plain.gif">plain link past the boundary</a>
-</body></html>
-EOF
-out10="$tmp/html5media-out"
-rm -rf "$out10"
-mkdir -p "$out10"
-httrack "file://$site10/index.html" -O "$out10" --quiet --near -r2 >"$out10/.log" 2>&1 || true
-found "img.gif" "$out10"
-found "ss.gif" "$out10"
-found "v.webm" "$out10"
-found "subs.vtt" "$out10"
-notfound "plain.gif" "$out10"
-
 exit 0
--- a/tests/01_engine-pause.test
+++ b/tests/01_engine-pause.test
@@ -1,15 +0,0 @@
-#!/bin/bash
-#
-# --pause (#185): the inter-file pause target must stay in [min,max] and spread
-# across it (a per-call rand() would collapse it toward min). Driven by the
-# in-process 'httrack -#test=pause' test. POSIX-portable ($(BASH) is /bin/sh on macOS).
-
-set -eu
-
-# 'run' is an ignored placeholder argument.
-out=$(httrack -#test=pause run)
-
-test "$out" = "pause: OK" || {
-    echo "expected 'pause: OK', got: $out" >&2
-    exit 1
-}
--- a/tests/01_engine-relative.test
+++ b/tests/01_engine-relative.test
@@ -8,7 +8,7 @@ set -euo pipefail
 # relative path from <curr>'s directory to <link>
 rel() {
    local got
-    got=$(httrack -O /dev/null -#test=relative "$1" "$2")
+    got=$(httrack -O /dev/null -#l "$1" "$2")
    test "$got" == "relative=$3" ||
        {
            echo "FAIL rel($1, $2): got '$got' want 'relative=$3'"
@@ -19,7 +19,7 @@ rel() {
 # resolve <link> against origin <adr>/<fil> -> adr=.. fil=..
 ident() {
    local got
-    got=$(httrack -O /dev/null -#test=resolve "$1" "$2" "$3")
+    got=$(httrack -O /dev/null -#i "$1" "$2" "$3")
    test "$got" == "$4" ||
        {
            echo "FAIL ident($1, $2, $3): got '$got' want '$4'"
--- a/tests/01_engine-savename.test
+++ b/tests/01_engine-savename.test
@@ -3,11 +3,11 @@

 set -euo pipefail

-# Local save-name extension resolution (url_savename via -#test=savename <fil> <content-type>).
+# Local save-name extension resolution (url_savename via -#N <fil> <content-type>).
 # Asserts on the basename of "savename: <path>".

 name() {
-    out="$(httrack -O /dev/null -#test=savename "$1" "$2" | sed -n 's/^savename: //p')"
+    out="$(httrack -O /dev/null -#N "$1" "$2" | sed -n 's/^savename: //p')"
    test "${out##*/}" == "$3" || {
        echo "FAIL: '$1' '$2' -> '$out' (want '$3')"
        exit 1
--- a/tests/01_engine-selftest-dispatch.test
+++ b/tests/01_engine-selftest-dispatch.test
@@ -1,17 +0,0 @@
-#!/bin/bash
-#
-# The -#test dispatch itself: a bare -#test lists the registry, and an unknown
-# name errors (non-zero, diagnostic) instead of silently passing.
-
-set -eu
-
-# Bare -#test lists known tests (printed to stderr).
-list=$(httrack -#test 2>&1)
-printf '%s\n' "$list" | grep -q "filter" || exit 1
-printf '%s\n' "$list" | grep -q "cache-writefail" || exit 1
-
-# Unknown name: non-zero exit + diagnostic, and no test result line.
-rc=0
-err=$(httrack -#test=bogus 2>&1) || rc=$?
-test "$rc" -ne 0 || exit 1
-printf '%s\n' "$err" | grep -q "Unknown self-test" || exit 1
--- a/tests/01_engine-simplify.test
+++ b/tests/01_engine-simplify.test
@@ -5,7 +5,7 @@ set -euo pipefail

 # path simplify engine (fil_simplifie): collapses ./ and ../ segments.
 simp() {
-    test "$(httrack -O /dev/null -#test=simplify "$1")" == "simplified=$2" || exit 1
+    test "$(httrack -O /dev/null -#1 "$1")" == "simplified=$2" || exit 1
 }

 simp './foo/bar/' 'foo/bar/'
--- a/tests/01_engine-stripquery.test
+++ b/tests/01_engine-stripquery.test
@@ -1,8 +0,0 @@
-#!/bin/bash
-#
-
-set -euo pipefail
-
-# --strip-query: pattern-scoped query-key stripping for dedup. All assertions
-# live in the engine self-test (hts_query_strip_keys + fil_normalized_filtered).
-httrack -O /dev/null -#test=stripquery | grep -q "strip-query self-test OK"
--- a/tests/01_engine-strsafe.test
+++ b/tests/01_engine-strsafe.test
@@ -3,22 +3,23 @@

 set -euo pipefail

-# htssafe.h bounded string operations (driven by 'httrack -#test=strsafe').
+# htssafe.h bounded string operations (driven by 'httrack -#8').

 # Success path: every bounded op (strcpybuff/strcatbuff/strncatbuff/strlcpybuff)
-# must behave correctly. 'run' selects the success path (vs the overflow modes).
+# must behave correctly. Like the other -# debug modes, a trailing token is
+# required (a bare '-#8' falls through to the usage screen).
 rc=0
-out=$(httrack -#test=strsafe run) || rc=$?
+out=$(httrack -#8 run) || rc=$?
 test "$rc" -eq 0 || exit 1
 test "$out" == "strsafe: OK" || exit 1

 # Overflow path: an over-capacity write into a sized buffer must be caught by
 # the bounded macro and abort the process, not be silently truncated/completed.
 # Assert the htssafe abort signature specifically, so the test cannot pass for
-# an unrelated reason (e.g. the strsafe test being gone, which prints the
-# registry to stderr and also exits non-zero).
+# an unrelated reason (e.g. the -#8 mode being gone and falling through to the
+# usage screen, which also exits non-zero).
 # the bounded macro aborts (non-zero exit), so don't let set -e trip on it
-err=$(httrack -#test=strsafe overflow "this string is far too long for the buffer" 2>&1) || true
+err=$(httrack -#8 overflow "this string is far too long for the buffer" 2>&1) || true
 case "$err" in
 *"strsafe: NOT aborted"*)
    echo "over-capacity write was NOT caught" >&2
@@ -35,7 +36,7 @@ esac
 # capacity (4 bytes into a 4-byte buffer), so this also pins the boundary: a
 # '<=' off-by-one in the capacity check would let it through (and print "NOT
 # aborted"). Match the specific htsbuff abort message, not just any assert.
-err=$(httrack -#test=strsafe overflow-buff "abcd" 2>&1) || true
+err=$(httrack -#8 overflow-buff "abcd" 2>&1) || true
 case "$err" in
 *"strsafe: NOT aborted"*)
    echo "htsbuff over-capacity write was NOT caught" >&2
--- a/tests/01_engine-urlhack.test
+++ b/tests/01_engine-urlhack.test
@@ -1,8 +0,0 @@
-#!/bin/bash
-#
-
-set -euo pipefail
-
-# -%u url-hack split (#271): www / // / query-order dedup toggle independently.
-# All assertions live in the engine self-test (hash compare flag resolution).
-httrack -O /dev/null -#test=urlhack run | grep -q "urlhack self-test OK"
--- a/tests/01_engine-useragent.test
+++ b/tests/01_engine-useragent.test
@@ -1,7 +0,0 @@
-#!/bin/bash
-#
-
-set -euo pipefail
-
-# Default User-Agent (#449): honest HTTrack token, no Windows 98 relic.
-httrack -O /dev/null -#test=useragent run | grep -q "useragent self-test OK"
--- a/tests/21_local-intl-update.test
+++ b/tests/21_local-intl-update.test
@@ -1,11 +0,0 @@
-#!/bin/bash
-#
-# #157: a dotless, accented URL named .html on the first crawl must keep .html
-# across an update -- not revert to the extensionless name.
-
-: "${top_srcdir:=..}"
-
-bash "$top_srcdir/tests/local-crawl.sh" --errors 0 --rerun \
-    --found 'intl/Instalação_CVS_no_Ubuntu.html' \
-    --not-found 'intl/Instalação_CVS_no_Ubuntu' \
-    httrack 'BASEURL/intl/index.html'
--- a/tests/22_local-broken-size.test
+++ b/tests/22_local-broken-size.test
@@ -1,17 +0,0 @@
-#!/bin/bash
-# Issues #32/#41: a Content-Length that disagrees with the body warns "bogus
-# state (broken size)" and skips the cache; -%B (tolerant) accepts it.
-
-: "${top_srcdir:=..}"
-
-# Default: warn, but the file is still written.
-bash "$top_srcdir/tests/local-crawl.sh" --errors 0 \
-    --found 'size/oversize.bin' \
-    --log-found 'bogus state \(broken size' \
-    httrack 'BASEURL/size/index.html'
-
-# -%B (tolerant): no warning, file written.
-bash "$top_srcdir/tests/local-crawl.sh" --errors 0 \
-    --found 'size/oversize.bin' \
-    --log-not-found 'bogus state' \
-    httrack 'BASEURL/size/index.html' '-%B'
--- a/tests/23_local-errpage.test
+++ b/tests/23_local-errpage.test
@@ -1,19 +0,0 @@
-#!/bin/bash
-# Issue #17: with "no error pages" (-o0), 4xx/5xx bodies must not be written;
-# a genuine 0-byte 200 stays. Default (-o1) writes the error page. (#17's purge
-# half also does not reproduce; the purge path is not exercised here.)
-set -e
-
-: "${top_srcdir:=..}"
-
-# -o0: 404 suppressed, good page and the legit 0-byte 200 kept.
-bash "$top_srcdir/tests/local-crawl.sh" --errors 1 \
-    --found 'errpage/good.html' \
-    --found 'errpage/empty.html' \
-    --not-found 'errpage/missing.html' \
-    httrack 'BASEURL/errpage/index.html' '-o0'
-
-# Control -o1 (default): the 404 error page is written.
-bash "$top_srcdir/tests/local-crawl.sh" --errors 1 \
-    --found 'errpage/missing.html' \
-    httrack 'BASEURL/errpage/index.html' '-o1'
--- a/tests/24_local-resume-overlap.test
+++ b/tests/24_local-resume-overlap.test
@@ -1,109 +0,0 @@
-#!/bin/bash
-# Issue #198: on a resumed download the server may answer the Range with a 206
-# that starts *before* the offset we asked for (block-aligned ranges). httrack
-# must honor the returned Content-Range, not blindly append, or the overlap
-# bytes get duplicated and the file grows (corrupt PDFs). Pass 1 interrupts
-# flaky.bin mid-body (partial + temp-ref); pass 2 resumes against a 206 that
-# backs up 8 bytes. The result must equal the same bytes fetched whole (full.bin).
-set -eu
-
-: "${top_srcdir:=..}"
-testdir=$(cd "$(dirname "$0")" && pwd)
-server="${testdir}/local-server.py"
-
-command -v python3 >/dev/null || ! echo "python3 not found; skipping" || exit 77
-
-tmpdir=$(mktemp -d "${TMPDIR:-/tmp}/httrack_198.XXXXXX") || exit 1
-serverpid=
-crawlpid=
-cleanup() {
-    if test -n "$crawlpid"; then kill -9 "$crawlpid" 2>/dev/null || true; fi
-    if test -n "$serverpid"; then
-        kill "$serverpid" 2>/dev/null || true
-        wait "$serverpid" 2>/dev/null || true
-    fi
-    rm -rf "$tmpdir"
-}
-trap cleanup EXIT HUP INT QUIT PIPE TERM
-
-# OVERLAP_COUNTER gets a byte per flaky.bin request so pass 1 knows when to interrupt.
-serverlog="${tmpdir}/server.log"
-counter="${tmpdir}/hits"
-resumed="${tmpdir}/resumed" # gets a byte when the server serves a resume 206
-OVERLAP_COUNTER="$counter" OVERLAP_RESUMED="$resumed" \
-    python3 "$server" --root "${testdir}/server-root" \
-    >"$serverlog" 2>&1 &
-serverpid=$!
-port=
-for _ in $(seq 1 50); do
-    line=$(head -n1 "$serverlog" 2>/dev/null)
-    if test "${line%% *}" == "PORT"; then
-        port="${line#PORT }"
-        break
-    fi
-    kill -0 "$serverpid" 2>/dev/null || {
-        echo "server exited early: $(cat "$serverlog")"
-        exit 1
-    }
-    sleep 0.1
-done
-test -n "$port" || {
-    echo "could not discover server port"
-    exit 1
-}
-base="http://127.0.0.1:${port}"
-
-which httrack >/dev/null || {
-    echo "could not find httrack"
-    exit 1
-}
-out="${tmpdir}/crawl"
-common=(-O "$out" --quiet --disable-security-limits --robots=0 --timeout=30 --retries=0 -c1)
-refdir="${out}/hts-cache/ref"
-
-# pass 1: interrupt once flaky.bin's prefix is streaming (partial + temp-ref).
-printf '[pass 1: interrupt flaky.bin] ..\t'
-httrack "${common[@]}" "${base}/overlap/index.html" >"${tmpdir}/log1" 2>&1 &
-crawlpid=$!
-for _ in $(seq 1 300); do
-    test -s "$counter" && break
-    kill -0 "$crawlpid" 2>/dev/null || break
-    sleep 0.1
-done
-sleep 0.5
-kill -TERM "$crawlpid" 2>/dev/null || true
-wait "$crawlpid" 2>/dev/null || true
-crawlpid=
-test -n "$(find "$refdir" -name '*.ref' 2>/dev/null)" || {
-    echo "FAIL: no temp-ref survived pass 1; cannot drive the resume"
-    exit 1
-}
-echo "OK (temp-ref present)"
-
-# pass 2: --continue -> resume Range -> 206 that starts 8 bytes early.
-printf '[pass 2: resume flaky.bin] ..\t'
-httrack "${common[@]}" --continue "${base}/overlap/index.html" >"${tmpdir}/log2" 2>&1 || true
-echo "OK"
-
-# Guard against a silent full re-download: the byte-compare below only tests the
-# fix if pass 2 actually went through the resume Range -> 206 path.
-printf '[resume path was exercised] ..\t'
-if ! test -s "$resumed"; then
-    echo "FAIL: pass 2 never triggered a resume 206; the overlap fix was not exercised"
-    exit 1
-fi
-echo "OK"
-
-printf '[resumed file is not corrupted] ..\t'
-dir=$(find "$out" -maxdepth 1 -type d -name '127.0.0.1*' | head -1)
-flaky="${dir}/overlap/flaky.bin"
-full="${dir}/overlap/full.bin"
-if ! test -f "$flaky" || ! test -f "$full"; then
-    echo "FAIL: flaky.bin or full.bin missing after pass 2"
-    exit 1
-fi
-if ! cmp -s "$flaky" "$full"; then
-    echo "FAIL: resumed flaky.bin ($(wc -c <"$flaky")) != full.bin ($(wc -c <"$full")); overlap duplicated"
-    exit 1
-fi
-echo "OK ($(wc -c <"$flaky") bytes, byte-identical)"
--- a/tests/25_local-mime-exclude.test
+++ b/tests/25_local-mime-exclude.test
@@ -1,16 +0,0 @@
-#!/bin/bash
-#
-# A -mime: exclusion must abort the transfer on the response Content-Type, not
-# fetch the whole 1 MB body then discard it (#58). The bytes-received guard is
-# the real one: the file is absent either way, but only the fix keeps the count
-# tiny (header only) instead of pulling the body. Match it positively (a small,
-# <=4-digit count) so a vanished/reworded summary line fails rather than passes.
-
-: "${top_srcdir:=..}"
-
-bash "$top_srcdir/tests/local-crawl.sh" --errors 0 \
-    --found 'mimex/real.html' \
-    --not-found 'mimex/blob.pdf' \
-    --log-found 'excluded by MIME type filter' \
-    --log-found '\[[0-9]{1,4} bytes received' \
-    httrack 'BASEURL/mimex/index.html' '-mime:application/pdf'
--- a/tests/26_local-strip-query.test
+++ b/tests/26_local-strip-query.test
@@ -1,23 +0,0 @@
-#!/bin/bash
-#
-# End-to-end --strip-query (#112): two links to one resource differing only by
-# ?utm_source dedup to a single saved file (2 files written: index + resource);
-# the control crawl without the option keeps both variants (3 files). Locks the
-# CLI->opt->hash plumbing the engine self-test can't reach.
-
-set -e
-
-: "${top_srcdir:=..}"
-
-# stripped: the two ?utm_source variants collapse to one resource
-bash "$top_srcdir/tests/local-crawl.sh" --errors 0 --files 2 \
-    httrack 'BASEURL/stripquery/index.html' --strip-query 'utm_source'
-
-# control: no stripping -> both query-named variants are saved
-bash "$top_srcdir/tests/local-crawl.sh" --errors 0 --files 3 \
-    httrack 'BASEURL/stripquery/index.html'
-
-# strip still applies with url-hack off (-%u0): exercises the urlhack-off
-# savename branch, which must normalize the dedup key the same way the hash does
-bash "$top_srcdir/tests/local-crawl.sh" --errors 0 --files 2 \
-    httrack 'BASEURL/stripquery/index.html' -%u0 --strip-query 'utm_source'
--- a/tests/27_local-cookies-file.test
+++ b/tests/27_local-cookies-file.test
@@ -1,22 +0,0 @@
-#!/bin/bash
-#
-# End-to-end --cookies-file (#215): /gated/secret.php needs a cookie no page
-# ever Set-Cookies, so it is reachable only when the option preloads it from a
-# Netscape cookies.txt. Locks the CLI->opt->cookie_load->wire plumbing.
-
-set -e
-
-: "${top_srcdir:=..}"
-
-# preloaded cookie -> secret page is served. -o0 means a 500 leaves no file, so
-# --found/--files only hold when the secret is genuinely fetched (200).
-bash "$top_srcdir/tests/local-crawl.sh" --cookie 'session=opensesame' \
-    --errors 0 --files 2 \
-    --found 'gated/index.html' --found 'gated/secret.html' \
-    httrack 'BASEURL/gated/index.php' -o0
-
-# control: without the cookie the secret 500s; -o0 suppresses the error page so
-# its absence is real (error + missing file)
-bash "$top_srcdir/tests/local-crawl.sh" --errors 1 \
-    --found 'gated/index.html' --not-found 'gated/secret.html' \
-    httrack 'BASEURL/gated/index.php' -o0
--- a/tests/28_local-pause.test
+++ b/tests/28_local-pause.test
@@ -1,36 +0,0 @@
-#!/bin/bash
-#
-# --pause (#185): a fixed inter-file delay must slow a multi-file crawl. Measure
-# the same crawl with and without --pause and compare: the harness overhead
-# cancels, leaving only the pause. Integer seconds keep it portable (BSD date
-# has no %N); a lower bound is not timing-flaky since a pause only adds time.
-
-set -e
-
-: "${top_srcdir:=..}"
-
-# python3 runs the local server (mirror local-crawl.sh); skip when absent, else
-# run() swallows its exit-77 and the serverless 0s/0s crawl looks like a fail.
-command -v python3 >/dev/null || {
-    echo "python3 not found; skipping local crawl tests"
-    exit 77
-}
-
-run() { # echoes the wall-clock seconds of one crawl
-    local t0 t1
-    t0=$(date +%s)
-    bash "$top_srcdir/tests/local-crawl.sh" --errors 0 \
-        httrack 'BASEURL/types/index.html' -c1 "$@" >/dev/null 2>&1
-    t1=$(date +%s)
-    echo $((t1 - t0))
-}
-
-base=$(run)
-paused=$(run --pause 0.5)
-delta=$((paused - base))
-
-echo "crawl: ${base}s, with --pause 0.5: ${paused}s (delta ${delta}s)"
-if [ "$delta" -lt 2 ]; then
-    echo "FAIL: --pause did not delay the crawl (delta ${delta}s)" >&2
-    exit 1
-fi
--- a/tests/29_local-redirect-fragment.test
+++ b/tests/29_local-redirect-fragment.test
@@ -1,11 +0,0 @@
-#!/bin/bash
-# Issue #204: a 302 Location with a #fragment must drop the fragment before the
-# target is fetched. The server is strict (400 on a '#' in the request-target),
-# so a leaked fragment logs an error and the target is never saved.
-set -e
-
-: "${top_srcdir:=..}"
-
-bash "$top_srcdir/tests/local-crawl.sh" --errors 0 \
-    --found 'redir/target.html' \
-    httrack 'BASEURL/redir/index.html'
--- a/tests/Makefile.am
+++ b/tests/Makefile.am
@@ -5,7 +5,6 @@ EXTRA_DIST = $(TESTS) crawl-test.sh run-all-tests.sh check-network.sh \
 	proxy-https-server.py \
 	local-crawl.sh local-server.py server.crt server.key \
 	server-root/simple/basic.html server-root/simple/link.html \
-	server-root/stripquery/index.html server-root/stripquery/a.html \
 	fixtures/cache-golden/hts-cache/new.zip

 TESTS_ENVIRONMENT =
@@ -27,7 +26,6 @@ TESTS = \
 	00_runnable.test \
 	01_engine-cache.test \
 	01_engine-cache-golden.test \
-	01_engine-cache-writefail.test \
 	01_engine-charset.test \
 	01_engine-cmdline.test \
 	01_engine-cookies.test \
@@ -35,22 +33,16 @@ TESTS = \
 	01_engine-dns.test \
 	01_engine-doitlog.test \
 	01_engine-entities.test \
-	01_engine-filelist.test \
 	01_engine-filter.test \
 	01_engine-hashtable.test \
 	01_engine-idna.test \
 	01_engine-mime.test \
 	01_engine-parse.test \
-	01_engine-pause.test \
 	01_engine-rcfile.test \
 	01_engine-relative.test \
 	01_engine-savename.test \
-	01_engine-selftest-dispatch.test \
 	01_engine-simplify.test \
-	01_engine-stripquery.test \
 	01_engine-strsafe.test \
-	01_engine-urlhack.test \
-	01_engine-useragent.test \
 	02_manpage-regen.test \
 	02_update-cache.test \
 	10_crawl-simple.test \
@@ -68,15 +60,6 @@ TESTS = \
 	17_local-empty-ct.test \
 	18_local-update.test \
 	19_local-connect-fallback.test \
-	20_local-resume-loop.test \
-	21_local-intl-update.test \
-	22_local-broken-size.test \
-	23_local-errpage.test \
-	24_local-resume-overlap.test \
-	25_local-mime-exclude.test \
-	26_local-strip-query.test \
-	27_local-cookies-file.test \
-	28_local-pause.test \
-	29_local-redirect-fragment.test
+	20_local-resume-loop.test

 CLEANFILES = check-network_sh.cache
--- a/tests/local-crawl.sh
+++ b/tests/local-crawl.sh
@@ -12,14 +12,9 @@
 # the mirror directory name.
 #
 # Usage:
-#   bash local-crawl.sh [--tls] [--root DIR] [--cookie NAME=VALUE ...] \
+#   bash local-crawl.sh [--tls] [--root DIR] \
 #       --errors N --files N --found PATH ... --directory PATH ... \
-#       --log-found REGEX ... --log-not-found REGEX ... \
 #       httrack BASEURL/some/path [httrack-args...]
-# --log-found/--log-not-found grep (ERE) the crawl's hts-log.txt.
-# --cookie writes a Netscape cookies.txt (scoped to the discovered host:port,
-# which the ephemeral port forces into the cookie domain) and passes it to
-# httrack via --cookies-file, to exercise preloaded cookies.

 set -u

@@ -88,7 +83,6 @@ tmpdir=$(mktemp -d "${tmptopdir}/httrack_local.XXXXXX") || die "could not create

 # --- parse leading control flags --------------------------------------------
 declare -a audit=()
-declare -a cookies=()
 scheme=http
 pos=0
 args=("$@")
@@ -109,15 +103,11 @@ while test "$pos" -lt "$nargs"; do
        pos=$((pos + 1))
        root="${args[$pos]}"
        ;;
-    --cookie)
-        pos=$((pos + 1))
-        cookies+=("${args[$pos]}")
-        ;;
    --errors | --files)
        audit+=("${args[$pos]}" "${args[$((pos + 1))]}")
        pos=$((pos + 1))
        ;;
-    --found | --not-found | --directory | --log-found | --log-not-found)
+    --found | --not-found | --directory)
        audit+=("${args[$pos]}" "${args[$((pos + 1))]}")
        pos=$((pos + 1))
        ;;
@@ -166,17 +156,6 @@ while test "$pos" -lt "$nargs"; do
    pos=$((pos + 1))
 done

-# --- materialize any --cookie entries into a cookies.txt ---------------------
-if test "${#cookies[@]}" -gt 0; then
-    jar="${tmpdir}/cookies.txt"
-    : >"$jar"
-    for spec in "${cookies[@]}"; do
-        printf '127.0.0.1:%s\tTRUE\t/\tFALSE\t1999999999\t%s\t%s\n' \
-            "$port" "${spec%%=*}" "${spec#*=}" >>"$jar"
-    done
-    hts+=(--cookies-file "$jar")
-fi
-
 # --- run httrack -------------------------------------------------------------
 which httrack >/dev/null || die "could not find httrack"
 ver=$(httrack -O /dev/null --version | sed -e 's/HTTrack version //')
@@ -217,15 +196,6 @@ if test -n "$rerun"; then
        exit 1
    }
    result "OK (update)"
-    # The update summary reports "files updated"; a fresh crawl never does. Assert
-    # it so a regression that bypasses the cache (re-crawls fresh) can't pass.
-    info "checking update used the cache"
-    if grep -aqE "mirror complete in .*files updated" "${out}/hts-log.txt"; then
-        result "OK"
-    else
-        result "update pass did not report cache activity"
-        exit 1
-    fi
 fi

 # --- discover the single host root (127.0.0.1_<port> or 127.0.0.1) -----------
@@ -278,22 +248,6 @@ while test "$i" -lt "${#audit[@]}"; do
            exit 1
        fi
        ;;
-    --log-found)
-        i=$((i + 1))
-        info "checking log matches ${audit[$i]}"
-        if grep -aqE "${audit[$i]}" "${out}/hts-log.txt"; then result "OK"; else
-            result "not in log"
-            exit 1
-        fi
-        ;;
-    --log-not-found)
-        i=$((i + 1))
-        info "checking log lacks ${audit[$i]}"
-        if grep -aqE "${audit[$i]}" "${out}/hts-log.txt"; then
-            result "present in log"
-            exit 1
-        else result "OK"; fi
-        ;;
    esac
    i=$((i + 1))
 done
--- a/tests/local-server.py
+++ b/tests/local-server.py
@@ -110,19 +110,6 @@ class Handler(SimpleHTTPRequestHandler):
            return self.fail_cookie("badger")
        self.send_html("\tThis is a test.")

-    # --cookies-file (#215): the secret page needs a cookie no page ever sets,
-    # so it is reachable only when --cookies-file preloads it.
-    GATE_COOKIE = ("session", "opensesame")
-
-    def route_gated_index(self):
-        self.send_html('\tThis is a <a href="secret.php">link</a>')
-
-    def route_gated_secret(self):
-        name, value = self.GATE_COOKIE
-        if self.request_cookies().get(name) != value:
-            return self.fail_cookie(name)
-        self.send_html("\tThis is the secret.")
-
    def route_robots(self):
        body = b"User-agent: *\nDisallow:\n"
        self.send_response(200)
@@ -190,35 +177,6 @@ class Handler(SimpleHTTPRequestHandler):
        body, ctype = self.TYPE_MATRIX[path]
        self.send_raw(body, ctype)

-    # --- MIME-type exclusion abort (issue #58) -----------------------------
-    # A -mime:application/pdf filter must abort the transfer once the header
-    # arrives, not download the whole body and discard it.
-    def route_mimex_index(self):
-        self.send_html(
-            '\t<a href="blob.pdf">pdf</a>\n' '\t<a href="real.html">real</a>\n'
-        )
-
-    # 1 MB body: the fix aborts after the header, so httrack's "bytes received"
-    # stays tiny; without it the engine reads the body and the count jumps.
-    MIMEX_BLOB = b"%PDF-1.4\n" + b"\x00" * (1024 * 1024)
-
-    def route_mimex_blob(self):
-        self.send_raw(self.MIMEX_BLOB, "application/pdf")
-
-    def route_mimex_real(self):
-        self.send_raw(b"<html><body>real</body></html>", "text/html")
-
-    # --- special chars in URLs across an update (issue #157) ---------------
-    # A dotless, accented basename served as text/html (MediaWiki style). The
-    # name the first crawl picks (.html) must survive the update pass.
-    INTL_NAME = "Instalação_CVS_no_Ubuntu"
-
-    def route_intl_index(self):
-        self.send_html('\t<a href="%s">accented</a>\n' % self.INTL_NAME)
-
-    def route_intl_page(self):
-        self.send_raw(b"<html><body>accented page</body></html>\n", "text/html")
-
    # resume / 416 loop (#206): the first GET stalls after a prefix so the crawl
    # can be interrupted (partial + temp-ref); every later request is 416.
    RESUME_PREFIX = b"PARTIAL-" + b"x" * 4096  # flushed before the stall
@@ -256,125 +214,10 @@ class Handler(SimpleHTTPRequestHandler):
        self.send_header("Content-Length", "0")
        self.end_headers()

-    # 206 resume must honor the server's Content-Range, not the offset we asked
-    # for (#198): a server resuming a few bytes *before* the request must not
-    # leave httrack duplicating the overlap onto the partial. flaky.bin
-    # interrupts once then resumes OVERLAP_EARLY bytes early; full.bin serves
-    # the identical bytes in one shot, so the test can compare the two.
-    OVERLAP_BLOB = b"%PDF-1.4\n" + bytes((i * 37 + 11) % 256 for i in range(8000))
-    OVERLAP_EARLY = 8
-    OVERLAP_PREFIX_LEN = 4000  # flushed before the stall
-    _overlap_started = False
-
-    def route_overlap_index(self):
-        self.send_html('\t<a href="flaky.bin">flaky</a>\n\t<a href="full.bin">full</a>')
-
-    def route_overlap_full(self):
-        self.send_raw(self.OVERLAP_BLOB, "application/octet-stream")
-
-    def route_overlap(self):
-        counter = os.environ.get("OVERLAP_COUNTER")
-        if counter:
-            with open(counter, "a") as fp:
-                fp.write("x")
-        blob = self.OVERLAP_BLOB
-        rng = self.headers.get("Range")
-        # First GET: stream a prefix then stall, so the crawl can be interrupted
-        # mid-body (partial + temp-ref on disk).
-        if rng is None and not Handler._overlap_started:
-            Handler._overlap_started = True
-            self.send_response(200)
-            self.send_header("Content-Type", "application/octet-stream")
-            self.send_header("Content-Length", str(len(blob)))
-            self.send_header("Accept-Ranges", "bytes")
-            self.end_headers()
-            if self.command != "HEAD":
-                self.wfile.write(blob[: self.OVERLAP_PREFIX_LEN])
-                self.wfile.flush()
-                try:
-                    while True:
-                        time.sleep(3600)
-                except OSError:
-                    pass
-            return
-        if rng is None:  # no resume request: serve the whole file
-            return self.route_overlap_full()
-        # Resume: honor the Range, but back up OVERLAP_EARLY bytes.
-        start = (
-            int(rng[len("bytes=") :].split("-")[0]) if rng.startswith("bytes=") else 0
-        )
-        start = max(0, start - self.OVERLAP_EARLY)
-        # Signal that the resume Range -> 206 path actually fired, so the test
-        # can prove it was exercised (not a silent full re-download).
-        resumed = os.environ.get("OVERLAP_RESUMED")
-        if resumed:
-            with open(resumed, "a") as fp:
-                fp.write("x")
-        part = blob[start:]
-        self.send_response(206, "Partial Content")
-        self.send_header("Content-Type", "application/octet-stream")
-        self.send_header("Content-Length", str(len(part)))
-        self.send_header(
-            "Content-Range", "bytes %d-%d/%d" % (start, len(blob) - 1, len(blob))
-        )
-        self.end_headers()
-        if self.command != "HEAD":
-            self.wfile.write(part)
-
-    # error pages / 0-byte files (#17): -o0 ("no error pages") must keep 4xx/5xx
-    # bodies off disk; a genuine 0-byte 200 is a valid file and stays.
-    def route_errpage_index(self):
-        self.send_html(
-            '\t<a href="good.html">good</a>\n'
-            '\t<a href="missing.html">missing</a>\n'
-            '\t<a href="empty.html">empty</a>\n'
-        )
-
-    def route_errpage_good(self):
-        self.send_raw(b"<html><body>good page</body></html>\n", "text/html")
-
-    def route_errpage_missing(self):
-        self.send_html("\t404 error body", status=404, extra_status="Not Found")
-
-    def route_errpage_empty(self):
-        self.send_raw(b"", "text/html")
-
-    # broken Content-Length (#32/#41): declared size != bytes sent. httrack
-    # warns "bogus state (broken size)" and skips the cache unless -%B.
-    def route_size_index(self):
-        self.send_html('\t<a href="oversize.bin">over</a>\n')
-
-    def route_size_oversize(self):
-        body = b"A" * 100
-        self.send_response(200)
-        self.send_header("Content-Type", "application/octet-stream")
-        self.send_header("Content-Length", str(len(body) - 2))  # lie: too short
-        self.send_header("Connection", "close")
-        self.end_headers()
-        if self.command != "HEAD":
-            self.wfile.write(body)
-
-    # 302 whose Location carries a #fragment (#204): the fragment is a UA anchor
-    # that must be dropped before the target is fetched. A leaked '#' reaches the
-    # strict-server guard below and 400s.
-    def route_redir_index(self):
-        self.send_html('\t<a href="go.php">go</a>')
-
-    def route_redir_go(self):
-        self.send_response(302, "Found")
-        self.send_header("Location", "target.html#section")
-        self.send_header("Content-Length", "0")
-        self.end_headers()
-
-    def route_redir_target(self):
-        self.send_raw(b"<html><body>redirect target</body></html>\n", "text/html")
-
    ROUTES = {
        "/cookies/entrance.php": route_entrance,
        "/cookies/second.php": route_second,
        "/cookies/third.php": route_third,
-        "/gated/index.php": route_gated_index,
-        "/gated/secret.php": route_gated_secret,
        "/robots.txt": route_robots,
        "/types/index.html": route_types_index,
        "/types/control.php": route_types,
@@ -390,58 +233,26 @@ class Handler(SimpleHTTPRequestHandler):
        "/types/style.css": route_types,
        "/types/data.json": route_types,
        "/types/gen.php": route_types,
-        "/intl/index.html": route_intl_index,
-        "/intl/" + INTL_NAME: route_intl_page,
        "/resume/index.html": route_resume_index,
        "/resume/blob.txt": route_resume,
-        "/overlap/index.html": route_overlap_index,
-        "/overlap/flaky.bin": route_overlap,
-        "/overlap/full.bin": route_overlap_full,
-        "/size/index.html": route_size_index,
-        "/size/oversize.bin": route_size_oversize,
-        "/errpage/index.html": route_errpage_index,
-        "/errpage/good.html": route_errpage_good,
-        "/errpage/missing.html": route_errpage_missing,
-        "/errpage/empty.html": route_errpage_empty,
-        "/mimex/index.html": route_mimex_index,
-        "/mimex/blob.pdf": route_mimex_blob,
-        "/mimex/real.html": route_mimex_real,
-        "/redir/index.html": route_redir_index,
-        "/redir/go.php": route_redir_go,
-        "/redir/target.html": route_redir_target,
    }

    # --- dispatch ----------------------------------------------------------

-    def reject_fragment(self):
-        # Strict server: a '#' in the request-target is the client failing to
-        # drop a fragment (#204). RFC 3986 forbids it on the wire; answer 400.
-        if "#" in self.path:
-            self.send_response(400, "Bad Request")
-            self.send_header("Content-Length", "0")
-            self.end_headers()
-            return True
-        return False
-
    def dispatch(self):
        self._set_cookies = []
        path = urlsplit(self.path).path
-        # Match percent-encoded paths (accented #157 route) by their decoded form.
-        handler = self.ROUTES.get(path) or self.ROUTES.get(unquote(path))
+        handler = self.ROUTES.get(path)
        if handler is not None:
            handler(self)
            return True
        return False

    def do_GET(self):
-        if self.reject_fragment():
-            return
        if not self.dispatch():
            super().do_GET()

    def do_HEAD(self):
-        if self.reject_fragment():
-            return
        if not self.dispatch():
            super().do_HEAD()

--- a/tests/server-root/stripquery/a.html
+++ b/tests/server-root/stripquery/a.html
@@ -1 +0,0 @@
-<html><body>resource A</body></html>
--- a/tests/server-root/stripquery/index.html
+++ b/tests/server-root/stripquery/index.html
@@ -1,5 +0,0 @@
-<html><body>
-Two links to one resource, differing only by a tracking parameter.
-<a href="a.html?utm_source=x">x</a>
-<a href="a.html?utm_source=y">y</a>
-</body></html>