Bound the legacy .dat cache readers (cache_rstr / cache_brstr)

cache_rstr() read an attacker-controlled length (clamped only to 32768) from a CACHE-1.x .dat and fread() it straight into fixed htsblk fields (r.msg[80], r.contenttype[64], ...) with no destination bound -- a heap/stack overflow from a crafted/old cache (the audit's S1). cache_brstr() (the in-memory variant) had the same shape and, worse, no length cap at all.

Thread a destination size into both:

- cache_rstr stores at most s_size-1 bytes and fseek()s past the remainder so the next field stays aligned (the field may be longer than the destination in a tampered cache).
- cache_brstr caps the length and bounds the copy.

Update every caller (htscache.c and htscoremain.c) to pass sizeof(field) / HTS_URLMAXSIZE*2. cache_rstr_addr already malloc()s to the read size, so it is left as is. Remove the dead cache_quickbrstr (no callers).

A dedicated cache self-test (create/read/update) follows separately.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Signed-off-by: Xavier Roche <roche@httrack.com>
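
The "store at most s_size-1 bytes, then skip the remainder" idea described in the message can be sketched in isolation. This is a minimal illustration, not the code in htscache.c: read_field_bounded is a hypothetical helper, it takes the field length as a parameter instead of parsing it from the stream, and it returns an error code where the real function asserts.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not HTTrack's cache_rstr): read a field of field_len
   bytes from fp into dst, whose capacity is dst_size. At most dst_size-1
   bytes are stored; any excess is skipped so the stream stays aligned on the
   next field. Returns 0 on success, -1 on a short read or a failed seek. */
static int read_field_bounded(FILE *fp, size_t field_len, char *dst,
                              size_t dst_size) {
  size_t store;

  assert(fp != NULL && dst != NULL && dst_size > 0);

  /* Never store more than the destination can hold, NUL included. */
  store = field_len < dst_size ? field_len : dst_size - 1;

  if (fread(dst, 1, store, fp) != store) {
    dst[0] = '\0';
    return -1;
  }
  dst[store] = '\0';

  /* Consume the part of the field that did not fit. */
  if (field_len > store &&
      fseek(fp, (long) (field_len - store), SEEK_CUR) != 0)
    return -1;
  return 0;
}
```

With a 4-byte destination and a 10-byte field, this sketch keeps the first 3 bytes, terminates the string, and leaves the stream positioned at the start of the following field, which is the alignment property the commit relies on.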
Merge pull request #341 from xroche/test/cache-update
2026-06-14 22:33:54 +03:00 · 2026-06-14 16:41:17 +02:00 · 2026-06-14 16:31:42 +02:00 · 2026-06-14 16:29:45 +02:00 · 2026-06-14 15:58:04 +02:00 · 2026-06-14 15:43:04 +02:00
18 changed files with 641 additions and 377 deletions
--- a/.github/workflows/ci.yml
+++ b/.github/workflows/ci.yml
@@ -61,6 +61,37 @@ jobs:
        if: failure()
        run: cat tests/test-suite.log 2>/dev/null || true

+  dco:
+    name: DCO sign-off
+    # Only checkable on a PR, where we have the base..head commit range.
+    if: github.event_name == 'pull_request'
+    runs-on: ubuntu-24.04
+    steps:
+      - uses: actions/checkout@v4
+        with:
+          fetch-depth: 0
+
+      - name: Every commit must be signed off
+        env:
+          BASE: ${{ github.event.pull_request.base.sha }}
+          HEAD: ${{ github.event.pull_request.head.sha }}
+        run: |
+          set -euo pipefail
+          fail=0
+          # --no-merges: merge commits are GitHub-generated and carry no sign-off.
+          for sha in $(git rev-list --no-merges "$BASE..$HEAD"); do
+            if [ -z "$(git log -1 --format='%(trailers:key=Signed-off-by)' "$sha")" ]; then
+              echo "Missing Signed-off-by: $(git log -1 --format='%h %s' "$sha")"
+              fail=1
+            fi
+          done
+          if [ "$fail" -ne 0 ]; then
+            echo
+            echo "Sign commits with 'git commit -s'; fix a branch with 'git rebase --signoff $BASE'."
+            echo "See CONTRIBUTING.md (Developer Certificate of Origin)."
+            exit 1
+          fi
+
  lint:
    name: lint (shellcheck, shfmt)
    runs-on: ubuntu-24.04
--- a/CODE_OF_CONDUCT.md
+++ b/CODE_OF_CONDUCT.md
@@ -0,0 +1,83 @@
+# Contributor Covenant Code of Conduct
+
+## Our Pledge
+
+We as members, contributors, and leaders pledge to make participation in our community a harassment-free experience for everyone, regardless of age, body size, visible or invisible disability, ethnicity, sex characteristics, gender identity and expression, level of experience, education, socio-economic status, nationality, personal appearance, race, caste, color, religion, or sexual identity and orientation.
+
+We pledge to act and interact in ways that contribute to an open, welcoming, diverse, inclusive, and healthy community.
+
+## Our Standards
+
+Examples of behavior that contributes to a positive environment for our community include:
+
+* Demonstrating empathy and kindness toward other people
+* Being respectful of differing opinions, viewpoints, and experiences
+* Giving and gracefully accepting constructive feedback
+* Accepting responsibility and apologizing to those affected by our mistakes, and learning from the experience
+* Focusing on what is best not just for us as individuals, but for the overall community
+
+Examples of unacceptable behavior include:
+
+* The use of sexualized language or imagery, and sexual attention or advances of any kind
+* Trolling, insulting or derogatory comments, and personal or political attacks
+* Public or private harassment
+* Publishing others' private information, such as a physical or email address, without their explicit permission
+* Other conduct which could reasonably be considered inappropriate in a professional setting
+
+## Enforcement Responsibilities
+
+Community leaders are responsible for clarifying and enforcing our standards of acceptable behavior and will take appropriate and fair corrective action in response to any behavior that they deem inappropriate, threatening, offensive, or harmful.
+
+Community leaders have the right and responsibility to remove, edit, or reject comments, commits, code, wiki edits, issues, and other contributions that are not aligned to this Code of Conduct, and will communicate reasons for moderation decisions when appropriate.
+
+## Scope
+
+This Code of Conduct applies within all community spaces, and also applies when an individual is officially representing the community in public spaces. Examples of representing our community include using an official e-mail address, posting via an official social media account, or acting as an appointed representative at an online or offline event.
+
+## Enforcement
+
+Instances of abusive, harassing, or otherwise unacceptable behavior may be reported to the community leaders responsible for enforcement at <roche@httrack.com>. All complaints will be reviewed and investigated promptly and fairly.
+
+All community leaders are obligated to respect the privacy and security of the reporter of any incident.
+
+## Enforcement Guidelines
+
+Community leaders will follow these Community Impact Guidelines in determining the consequences for any action they deem in violation of this Code of Conduct:
+
+### 1. Correction
+
+**Community Impact**: Use of inappropriate language or other behavior deemed unprofessional or unwelcome in the community.
+
+**Consequence**: A private, written warning from community leaders, providing clarity around the nature of the violation and an explanation of why the behavior was inappropriate. A public apology may be requested.
+
+### 2. Warning
+
+**Community Impact**: A violation through a single incident or series of actions.
+
+**Consequence**: A warning with consequences for continued behavior. No interaction with the people involved, including unsolicited interaction with those enforcing the Code of Conduct, for a specified period of time. This includes avoiding interactions in community spaces as well as external channels like social media. Violating these terms may lead to a temporary or permanent ban.
+
+### 3. Temporary Ban
+
+**Community Impact**: A serious violation of community standards, including sustained inappropriate behavior.
+
+**Consequence**: A temporary ban from any sort of interaction or public communication with the community for a specified period of time. No public or private interaction with the people involved, including unsolicited interaction with those enforcing the Code of Conduct, is allowed during this period. Violating these terms may lead to a permanent ban.
+
+### 4. Permanent Ban
+
+**Community Impact**: Demonstrating a pattern of violation of community standards, including sustained inappropriate behavior, harassment of an individual, or aggression toward or disparagement of classes of individuals.
+
+**Consequence**: A permanent ban from any sort of public interaction within the community.
+
+## Attribution
+
+This Code of Conduct is adapted from the [Contributor Covenant][homepage], version 2.1, available at [https://www.contributor-covenant.org/version/2/1/code_of_conduct.html][v2.1].
+
+Community Impact Guidelines were inspired by [Mozilla's code of conduct enforcement ladder][Mozilla CoC].
+
+For answers to common questions about this code of conduct, see the FAQ at [https://www.contributor-covenant.org/faq][FAQ]. Translations are available at [https://www.contributor-covenant.org/translations][translations].
+
+[homepage]: https://www.contributor-covenant.org
+[v2.1]: https://www.contributor-covenant.org/version/2/1/code_of_conduct.html
+[Mozilla CoC]: https://github.com/mozilla/diversity
+[FAQ]: https://www.contributor-covenant.org/faq
+[translations]: https://www.contributor-covenant.org/translations
--- a/CONTRIBUTING.md
+++ b/CONTRIBUTING.md
@@ -0,0 +1,39 @@
+# Contributing to HTTrack
+
+HTTrack is small and old. Keep changes easy to review and safe to merge.
+
+## Pull requests
+
+- One change per PR. Small diffs merge fast.
+- PRs are squash-merged: the title and description become the commit message, so
+  explain *why*.
+- Add or update tests for engine changes (`tests/`), and keep CI green.
+
+## Style
+
+- C, matching nearby code. **Format only the lines you change** (`git
+  clang-format` against the repo `.clang-format`). Never reformat untouched code.
+- Comment the *why*, in English.
+- HTTrack parses hostile input off the network. Check bounds, avoid unchecked
+  copies, and never let an attacker-controlled length drive arithmetic unchecked.
+
+## Sign your work
+
+Every commit needs a `Signed-off-by` line, the
+[DCO](https://developercertificate.org/): `git commit -s`. CI rejects unsigned
+commits; fix a branch with `git rebase --signoff master`.
+
+## AI assistants
+
+Welcome, and nothing to disclose. Two rules:
+
+- **Own every line** as if you wrote it. Can't explain it in review? Not ready.
+- **Don't push your work onto reviewers.** A raw generated patch a maintainer has
+  to vet from scratch will be closed.
+
+The sign-off covers AI-assisted code too.
+
+## Bugs
+
+Open an issue with the version, OS, command used, and expected vs actual result.
+For security issues see [SECURITY.md](SECURITY.md), not a public issue.
--- a/SECURITY.md
+++ b/SECURITY.md
@@ -0,0 +1,23 @@
+# Security Policy
+
+## Reporting
+
+Report privately, not in a public issue or PR: use GitHub
+[private advisories](https://github.com/xroche/httrack/security/advisories/new)
+or email <roche@httrack.com> (alternate: `xroche at gmail dot com`).
+
+Include the HTTrack version and platform, a concrete reproduction (command line,
+a sample page or server response, or a small proof of concept), and what an
+attacker gains. We'll acknowledge it and keep you posted. Please allow time for a
+release before disclosing publicly.
+
+## Supported versions
+
+Fixes land on `master` and ship in the next release; older releases aren't
+maintained. Confirm against current `master` when you can.
+
+## AI-assisted findings
+
+Scanners and LLMs are fine, but only send reports you have verified yourself. A
+confirmed, reproducible issue is worth our time; a plausible one that doesn't
+reproduce is not, and will be closed. If a report is AI-assisted, say so.
--- a/src/htsalias.c
+++ b/src/htsalias.c
@@ -266,7 +266,9 @@ const char *hts_optalias[][4] = {
  return value: number of arguments treated (0 if error)
 */
 int optalias_check(int argc, const char *const *argv, int n_arg,
-                   int *return_argc, char **return_argv, char *return_error) {
+                   int *return_argc, char **return_argv,
+                   size_t return_argv_size, char *return_error,
+                   size_t return_error_size) {
  return_error[0] = '\0';
  *return_argc = 1;
  if (argv[n_arg][0] == '-')
@@ -323,9 +325,10 @@ int optalias_check(int argc, const char *const *argv, int n_arg,
          /* Copy parameters? */
          if (need_param == 2) {
            if ((n_arg + 1 >= argc) || (argv[n_arg + 1][0] == '-')) {   /* no supplemental parameter */
-              sprintf(return_error,
-                      "Syntax error:\n\tOption %s needs to be followed by a parameter: %s <param>\n\t%s\n",
-                      command, command, _NOT_NULL(optalias_help(command)));
+              snprintf(return_error, return_error_size,
+                       "Syntax error:\n\tOption %s needs to be followed by a "
+                       "parameter: %s <param>\n\t%s\n",
+                       command, command, _NOT_NULL(optalias_help(command)));
              return 0;
            }
            strcpybuff(param, argv[n_arg + 1]);
@@ -338,35 +341,36 @@ int optalias_check(int argc, const char *const *argv, int n_arg,

        /* Must be alone (-P /tmp) */
        if (strcmp(hts_optalias[pos][2], "param1") == 0) {
-          strcpybuff(return_argv[0], command);
-          strcpybuff(return_argv[1], param);
+          strlcpybuff(return_argv[0], command, return_argv_size);
+          strlcpybuff(return_argv[1], param, return_argv_size);
          *return_argc = 2;     /* 2 parameters returned */
        }
        /* Alone with parameter (+*.gif) */
        else if (strcmp(hts_optalias[pos][2], "param0") == 0) {
          /* Command */
-          strcpybuff(return_argv[0], command);
-          strcatbuff(return_argv[0], param);
+          strlcpybuff(return_argv[0], command, return_argv_size);
+          strlcatbuff(return_argv[0], param, return_argv_size);
        }
        /* Together (-c8) */
        else {
          /* Command */
-          strcpybuff(return_argv[0], command);
+          strlcpybuff(return_argv[0], command, return_argv_size);
          /* Parameters accepted */
          if (strncmp(hts_optalias[pos][2], "param", 5) == 0) {
            /* --cache=off or --index=on */
            if (strcmp(param, "off") == 0)
-              strcatbuff(return_argv[0], "0");
+              strlcatbuff(return_argv[0], "0", return_argv_size);
            else if (strcmp(param, "on") == 0) {
              // on is the default
              // strcatbuff(return_argv[0],"1");
            } else
-              strcatbuff(return_argv[0], param);
+              strlcatbuff(return_argv[0], param, return_argv_size);
          }
          *return_argc = 1;     /* 1 parameter returned */
        }
      } else {
-        sprintf(return_error, "Unknown option: %s\n", command);
+        snprintf(return_error, return_error_size, "Unknown option: %s\n",
+                 command);
        return 0;
      }
      return need_param;
@@ -380,15 +384,16 @@ int optalias_check(int argc, const char *const *argv, int n_arg,
      if ((strcmp(hts_optalias[pos][2], "param1") == 0)
          || (strcmp(hts_optalias[pos][2], "param0") == 0)) {
        if ((n_arg + 1 >= argc) || (argv[n_arg + 1][0] == '-')) {       /* no supplemental parameter */
-          sprintf(return_error,
-                  "Syntax error:\n\tOption %s needs to be followed by a parameter: %s <param>\n\t%s\n",
-                  argv[n_arg], argv[n_arg],
-                  _NOT_NULL(optalias_help(argv[n_arg])));
+          snprintf(return_error, return_error_size,
+                   "Syntax error:\n\tOption %s needs to be followed by a "
+                   "parameter: %s <param>\n\t%s\n",
+                   argv[n_arg], argv[n_arg],
+                   _NOT_NULL(optalias_help(argv[n_arg])));
          return 0;
        }
        /* Copy parameters */
-        strcpybuff(return_argv[0], argv[n_arg]);
-        strcpybuff(return_argv[1], argv[n_arg + 1]);
+        strlcpybuff(return_argv[0], argv[n_arg], return_argv_size);
+        strlcpybuff(return_argv[1], argv[n_arg + 1], return_argv_size);
        /* And return */
        *return_argc = 2;       /* 2 parameters returned */
        return 2;               /* 2 parameters used */
@@ -397,7 +402,7 @@ int optalias_check(int argc, const char *const *argv, int n_arg,
  }

  /* Copy and return other unknown option */
-  strcpybuff(return_argv[0], argv[n_arg]);
+  strlcpybuff(return_argv[0], argv[n_arg], return_argv_size);
  return 1;
 }

@@ -524,9 +529,10 @@ int optinclude_file(const char *name, int *argc, char **argv, char *x_argvblk,
            strcatbuff(_tmp_argv[0], a);
            strcpybuff(_tmp_argv[1], b);

-            result =
-              optalias_check(2, (const char *const *) tmp_argv, 0, &return_argc,
-                             (tmp_argv + 2), return_error);
+            result = optalias_check(2, (const char *const *) tmp_argv, 0,
+                                    &return_argc, (tmp_argv + 2),
+                                    sizeof(_tmp_argv[0]), return_error,
+                                    sizeof(return_error));
            if (!result) {
              printf("%s\n", return_error);
            } else {
--- a/src/htsalias.h
+++ b/src/htsalias.h
@@ -38,7 +38,9 @@ Please visit our Website: http://www.httrack.com
 #ifdef HTS_INTERNAL_BYTECODE
 extern const char *hts_optalias[][4];
 int optalias_check(int argc, const char *const *argv, int n_arg,
-                   int *return_argc, char **return_argv, char *return_error);
+                   int *return_argc, char **return_argv,
+                   size_t return_argv_size, char *return_error,
+                   size_t return_error_size);
 int optalias_find(const char *token);
 const char *optalias_help(const char *token);
 int optreal_find(const char *token);
--- a/src/htsbauth.c
+++ b/src/htsbauth.c
@@ -102,7 +102,8 @@ int cookie_add(t_cookie * cookie, const char *cook_name, const char *cook_value,
  strcatbuff(cook, "\n");
  if (!((strlen(cookie->data) + strlen(cook)) < cookie->max_len))
    return -1;                  // impossible d'ajouter
-  cookie_insert(insert, cook);
+  cookie_insert(insert, cookie->max_len - (size_t) (insert - cookie->data),
+                cook);
 #if DEBUG_COOK
  printf("add_new cookie: name=\"%s\" value=\"%s\" domain=\"%s\" path=\"%s\"\n",
         cook_name, cook_value, domain, path);
@@ -118,7 +119,7 @@ int cookie_del(t_cookie * cookie, const char *cook_name, const char *domain, con
  b = cookie_find(cookie->data, cook_name, domain, path);
  if (b) {
    a = cookie_nextfield(b);
-    cookie_delete(b, a - b);
+    cookie_delete(b, cookie->max_len - (size_t) (b - cookie->data), a - b);
 #if DEBUG_COOK
    printf("deleted old cookie: %s %s %s\n", cook_name, domain, path);
 #endif
@@ -336,41 +337,44 @@ int cookie_save(t_cookie * cookie, const char *name) {
  return -1;
 }

-// insertion chaine ins avant s
-void cookie_insert(char *s, const char *ins) {
+// Insert string ins before s. s_size is the capacity of the buffer at s.
+void cookie_insert(char *s, size_t s_size, const char *ins) {
  char *buff;

-  if (strnotempty(s) == 0) {    // rien à faire, juste concat
-    strcatbuff(s, ins);
+  if (strnotempty(s) == 0) { // nothing there yet: just concatenate
+    strlcatbuff(s, ins, s_size);
  } else {
    buff = (char *) malloct(strlen(s) + 1);
    if (buff) {
-      strcpybuff(buff, s);      // copie temporaire
-      strcpybuff(s, ins);       // insérer
-      strcatbuff(s, buff);      // copier
+      strlcpybuff(buff, s, strlen(s) + 1); // temporary copy of s
+      strlcpybuff(s, ins, s_size);         // write ins
+      strlcatbuff(s, buff, s_size);        // then the saved content
      freet(buff);
    }
  }
 }

-// destruction chaine dans s position pos
-void cookie_delete(char *s, size_t pos) {
+// Delete the substring of s at position pos. s_size is the capacity at s.
+void cookie_delete(char *s, size_t s_size, size_t pos) {
  char *buff;

-  if (strnotempty(s + pos) == 0) {      // rien à faire, effacer
+  if (strnotempty(s + pos) == 0) { // nothing after pos: truncate
    s[0] = '\0';
  } else {
    buff = (char *) malloct(strlen(s + pos) + 1);
    if (buff) {
-      strcpybuff(buff, s + pos);        // copie temporaire
-      strcpybuff(s, buff);      // copier
+      strlcpybuff(buff, s + pos, strlen(s + pos) + 1); // temporary copy
+      strlcpybuff(s, buff, s_size);                    // overwrite from start
      freet(buff);
    }
  }
 }

-// renvoie champ param de la chaine cookie_base
-// ex: cookie_get("ceci est<tab>un<tab>exemple",1) renvoi "un"
+// Return field <param> (0-based, tab-separated) of the cookie line cookie_base,
+// into buffer. ex: cookie_get("ceci est<tab>un<tab>exemple", 1) returns "un".
+// buffer must hold at least COOKIE_FIELD_BUFFER_SIZE bytes (all callers use
+// char[8192]).
+#define COOKIE_FIELD_BUFFER_SIZE 8192
 const char *cookie_get(char *buffer, const char *cookie_base, int param) {
  const char *limit;

@@ -394,11 +398,11 @@ const char *cookie_get(char *buffer, const char *cookie_base, int param) {
    if (cookie_base) {
      if (cookie_base < limit) {
        const char *a = cookie_base;
+        htsbuff b = htsbuff_ptr(buffer, COOKIE_FIELD_BUFFER_SIZE);

        while((*a) && (*a != '\t') && (*a != '\n'))
          a++;
-        buffer[0] = '\0';
-        strncatbuff(buffer, cookie_base, (int) (a - cookie_base));
+        htsbuff_catn(&b, cookie_base, (size_t) (a - cookie_base));
        return buffer;
      } else
        return "";
@@ -458,11 +462,13 @@ char *bauth_check(t_cookie * cookie, const char *adr, const char *fil) {
  return NULL;
 }

+/* Build the auth prefix (host + path, query stripped) into prefix.
+   Callers pass a buffer of HTS_URLMAXSIZE * 2 bytes. */
 char *bauth_prefix(char *prefix, const char *adr, const char *fil) {
  char *a;

-  strcpybuff(prefix, jump_identification_const(adr));
-  strcatbuff(prefix, fil);
+  strlcpybuff(prefix, jump_identification_const(adr), HTS_URLMAXSIZE * 2);
+  strlcatbuff(prefix, fil, HTS_URLMAXSIZE * 2);
  a = strchr(prefix, '?');
  if (a)
    *a = '\0';
--- a/src/htsbauth.h
+++ b/src/htsbauth.h
@@ -67,8 +67,8 @@ int cookie_add(t_cookie * cookie, const  char *cook_name, const  char *cook_valu
 int cookie_del(t_cookie * cookie, const char *cook_name, const char *domain, const char *path);
 int cookie_load(t_cookie * cookie, const char *path, const char *name);
 int cookie_save(t_cookie * cookie, const char *name);
-void cookie_insert(char *s, const char *ins);
-void cookie_delete(char *s, size_t pos);
+void cookie_insert(char *s, size_t s_size, const char *ins);
+void cookie_delete(char *s, size_t s_size, size_t pos);
 const char *cookie_get(char *buffer, const char *cookie_base, int param);
 char *cookie_find(char *s, const char *cook_name, const char *domain, const char *path);
 char *cookie_nextfield(char *a);
--- a/src/htscache.c
+++ b/src/htscache.c
@@ -196,12 +196,13 @@ struct cache_back_zip_entry {
  int compressionMethod;
 };

-#define ZIP_READFIELD_STRING(line, value, refline, refvalue) do { \
-  if (line[0] != '\0' && strfield2(line, refline)) { \
-    strcpybuff(refvalue, value); \
-    line[0] = '\0'; \
-	} \
-} while(0)
+#define ZIP_READFIELD_STRING(line, value, refline, refvalue, refvalue_size)    \
+  do {                                                                         \
+    if (line[0] != '\0' && strfield2(line, refline)) {                         \
+      strlcpybuff(refvalue, value, refvalue_size);                             \
+      line[0] = '\0';                                                          \
+    }                                                                          \
+  } while (0)
 #define ZIP_READFIELD_INT(line, value, refline, refvalue) do { \
  if (line[0] != '\0' && strfield2(line, refline)) { \
    int intval = 0; \
@@ -643,7 +644,7 @@ static htsblk cache_readex_new(httrackp * opt, cache_back * cache,
  } else {
    r.location = location_default;
  }
-  strcpybuff(r.location, "");
+  r.location[0] = '\0';
  strcpybuff(buff, adr);
  strcatbuff(buff, fil);
  hash_pos_return = coucal_read(cache->hashtable, buff, &hash_pos);
@@ -706,17 +707,25 @@ static htsblk cache_readex_new(httrackp * opt, cache_back * cache,
                value++;
              ZIP_READFIELD_INT(line, value, "X-In-Cache", dataincache);
              ZIP_READFIELD_INT(line, value, "X-Statuscode", r.statuscode);
-              ZIP_READFIELD_STRING(line, value, "X-StatusMessage", r.msg);      // msg
+              ZIP_READFIELD_STRING(line, value, "X-StatusMessage", r.msg,
+                                   sizeof(r.msg));
              ZIP_READFIELD_LLINT(line, value, "X-Size", r.size);       // size
-              ZIP_READFIELD_STRING(line, value, "Content-Type", r.contenttype); // contenttype
-              ZIP_READFIELD_STRING(line, value, "X-Charset", r.charset);        // contenttype
-              ZIP_READFIELD_STRING(line, value, "Last-Modified", r.lastmodified);       // last-modified
-              ZIP_READFIELD_STRING(line, value, "Etag", r.etag);        // Etag
-              ZIP_READFIELD_STRING(line, value, "Location", r.location);        // 'location' pour moved
-              ZIP_READFIELD_STRING(line, value, "Content-Disposition", r.cdispo);       // Content-disposition
+              ZIP_READFIELD_STRING(line, value, "Content-Type", r.contenttype,
+                                   sizeof(r.contenttype));
+              ZIP_READFIELD_STRING(line, value, "X-Charset", r.charset,
+                                   sizeof(r.charset));
+              ZIP_READFIELD_STRING(line, value, "Last-Modified", r.lastmodified,
+                                   sizeof(r.lastmodified));
+              ZIP_READFIELD_STRING(line, value, "Etag", r.etag, sizeof(r.etag));
+              // r.location is a char* pointing into a HTS_URLMAXSIZE*2 buffer
+              ZIP_READFIELD_STRING(line, value, "Location", r.location,
+                                   HTS_URLMAXSIZE * 2);
+              ZIP_READFIELD_STRING(line, value, "Content-Disposition", r.cdispo,
+                                   sizeof(r.cdispo));
              //ZIP_READFIELD_STRING(line, value, "X-Addr", ..);            // Original address
              //ZIP_READFIELD_STRING(line, value, "X-Fil", ..);            // Original URI filename
-              ZIP_READFIELD_STRING(line, value, "X-Save", previous_save_);      // Original save filename
+              ZIP_READFIELD_STRING(line, value, "X-Save", previous_save_,
+                                   sizeof(previous_save_));
            }
          } while(offset < readSizeHeader && !lineEof);
          //totalHeader = offset;
@@ -733,7 +742,7 @@ static htsblk cache_readex_new(httrackp * opt, cache_back * cache,
            }
          }
          if (return_save != NULL) {
-            strcpybuff(return_save, previous_save);
+            strlcpybuff(return_save, previous_save, HTS_URLMAXSIZE * 2);
          }

          /* Complete fields */
@@ -1025,7 +1034,7 @@ static htsblk cache_readex_old(httrackp * opt, cache_back * cache,
  } else {
    r.location = location_default;
  }
-  strcpybuff(r.location, "");
+  r.location[0] = '\0';
 #if HTS_FAST_CACHE
  strcpybuff(buff, adr);
  strcatbuff(buff, fil);
@@ -1096,30 +1105,34 @@ static htsblk cache_readex_old(httrackp * opt, cache_back * cache,
        //
        cache_rint(cache->olddat, &r.statuscode);
        cache_rLLint(cache->olddat, &r.size);
-        cache_rstr(cache->olddat, r.msg);
-        cache_rstr(cache->olddat, r.contenttype);
+        cache_rstr(cache->olddat, r.msg, sizeof(r.msg));
+        cache_rstr(cache->olddat, r.contenttype, sizeof(r.contenttype));
        if (cache->version >= 3)
-          cache_rstr(cache->olddat, r.charset);
-        cache_rstr(cache->olddat, r.lastmodified);
-        cache_rstr(cache->olddat, r.etag);
-        cache_rstr(cache->olddat, r.location);
+          cache_rstr(cache->olddat, r.charset, sizeof(r.charset));
+        cache_rstr(cache->olddat, r.lastmodified, sizeof(r.lastmodified));
+        cache_rstr(cache->olddat, r.etag, sizeof(r.etag));
+        // r.location points into a HTS_URLMAXSIZE*2 buffer
+        cache_rstr(cache->olddat, r.location, HTS_URLMAXSIZE * 2);
        if (cache->version >= 2)
-          cache_rstr(cache->olddat, r.cdispo);
+          cache_rstr(cache->olddat, r.cdispo, sizeof(r.cdispo));
        if (cache->version >= 4) {
-          cache_rstr(cache->olddat, previous_save);     // adr
-          cache_rstr(cache->olddat, previous_save);     // fil
+          cache_rstr(cache->olddat, previous_save,
+                     sizeof(previous_save)); // adr
+          cache_rstr(cache->olddat, previous_save,
+                     sizeof(previous_save)); // fil
          previous_save[0] = '\0';
-          cache_rstr(cache->olddat, previous_save);     // save
+          cache_rstr(cache->olddat, previous_save,
+                     sizeof(previous_save)); // save
          if (return_save != NULL) {
-            strcpybuff(return_save, previous_save);
+            strlcpybuff(return_save, previous_save, HTS_URLMAXSIZE * 2);
          }
        }
        if (cache->version >= 5) {
          r.headers = cache_rstr_addr(cache->olddat);
        }
        //
-        cache_rstr(cache->olddat, check);
-        if (strcmp(check, "HTS") == 0) {        /* intégrité OK */
+        cache_rstr(cache->olddat, check, sizeof(check));
+        if (strcmp(check, "HTS") == 0) { /* integrity OK */
          ok = 1;
        }
        cache_rLLint(cache->olddat, &size_read);        /* lire size pour être sûr de la taille déclarée (réécrire) */
@@ -1758,12 +1771,12 @@ void cache_init(cache_back * cache, httrackp * opt) {
          char firstline[256];
          char *a = cache->use;

-          a += cache_brstr(a, firstline);
-          if (strncmp(firstline, "CACHE-", 6) == 0) {   // Nouvelle version du cache
-            if (strncmp(firstline, "CACHE-1.", 8) == 0) {       // Version 1.1x
+          a += cache_brstr(a, firstline, sizeof(firstline));
+          if (strncmp(firstline, "CACHE-", 6) == 0) {     // new cache format
+            if (strncmp(firstline, "CACHE-1.", 8) == 0) { // version 1.1x
              cache->version = (int) (firstline[8] - '0');      // cache 1.x
              if (cache->version <= 5) {
-                a += cache_brstr(a, firstline);
+                a += cache_brstr(a, firstline, sizeof(firstline));
                strcpybuff(cache->lastmodified, firstline);
              } else {
                hts_log_print(opt, LOG_ERROR,
@@ -1774,7 +1787,7 @@ void cache_init(cache_back * cache, httrackp * opt) {
                freet(cache->use);
                cache->use = NULL;
              }
-            } else {            // non supporté
+            } else { // non supporté
              hts_log_print(opt, LOG_ERROR,
                            "Cache: %s not supported, ignoring current cache",
                            firstline);
@@ -1784,7 +1797,7 @@ void cache_init(cache_back * cache, httrackp * opt) {
              cache->use = NULL;
            }
            /* */
-          } else {              // Vieille version du cache
+          } else { // Vieille version du cache
            /* */
            hts_log_print(opt, LOG_WARNING,
                          "Cache: importing old cache format");
@@ -2088,7 +2101,7 @@ char *readfile_or(const char *fil, const char *defaultdata) {
    char *adr = malloct(strlen(defaultdata) + 1);

    if (adr) {
-      strcpybuff(adr, defaultdata);
+      strlcpybuff(adr, defaultdata, strlen(defaultdata) + 1);
      return adr;
    }
  }
@@ -2109,7 +2122,7 @@ int cache_wstr(FILE * fp, const char *s) {
    return -1;
  return 0;
 }
-void cache_rstr(FILE * fp, char *s) {
+void cache_rstr(FILE *fp, char *s, size_t s_size) {
  INTsys i;
  char buff[256 + 4];

@@ -2118,13 +2131,26 @@ void cache_rstr(FILE * fp, char *s) {
  if (i < 0 || i > 32768)       /* error, something nasty happened */
    i = 0;
  if (i > 0) {
-    if ((int) fread(s, 1, i, fp) != i) {
+    /* Store at most s_size-1 bytes into s, but consume all i bytes from the
+       stream so the next field stays aligned (the field may be longer than the
+       destination in a tampered/old cache). */
+    const size_t want = (size_t) i;
+    const size_t store = want < s_size ? want : s_size - 1;
+
+    if (fread(s, 1, store, fp) != store) {
      int fread_cache_failed = 0;

      assertf(fread_cache_failed);
    }
+    if (want > store && fseek(fp, (long) (want - store), SEEK_CUR) != 0) {
+      int fseek_cache_failed = 0;
+
+      assertf(fseek_cache_failed);
+    }
+    s[store] = '\0';
+  } else {
+    s[0] = '\0';
  }
-  *(s + i) = '\0';
 }
 char *cache_rstr_addr(FILE * fp) {
  INTsys i;
@@ -2148,7 +2174,7 @@ char *cache_rstr_addr(FILE * fp) {
  }
  return addr;
 }
-int cache_brstr(char *adr, char *s) {
+int cache_brstr(char *adr, char *s, size_t s_size) {
  int i;
  int off;
  char buff[256 + 4];
@@ -2156,23 +2182,17 @@ int cache_brstr(char *adr, char *s) {
  off = binput(adr, buff, 256);
  adr += off;
  sscanf(buff, "%d", &i);
-  if (i > 0)
-    strncpy(s, adr, i);
-  *(s + i) = '\0';
-  off += i;
-  return off;
-}
-int cache_quickbrstr(char *adr, char *s) {
-  int i;
-  int off;
-  char buff[256 + 4];
+  if (i < 0 || i > 32768) /* guard a corrupt length */
+    i = 0;
+  if (i > 0) {
+    /* copy at most s_size-1 bytes; advance past the full field regardless */
+    const size_t store = (size_t) i < s_size ? (size_t) i : s_size - 1;

-  off = binput(adr, buff, 256);
-  adr += off;
-  sscanf(buff, "%d", &i);
-  if (i > 0)
-    strncpy(s, adr, i);
-  *(s + i) = '\0';
+    strncpy(s, adr, store);
+    s[store] = '\0';
+  } else {
+    s[0] = '\0';
+  }
  off += i;
  return off;
 }
@@ -2180,7 +2200,7 @@ int cache_quickbrstr(char *adr, char *s) {
 /* same, but as an int */
 int cache_brint(char *adr, int *i) {
  char s[256];
-  int r = cache_brstr(adr, s);
+  int r = cache_brstr(adr, s, sizeof(s));

  if (r != -1)
    sscanf(s, "%d", i);
@@ -2189,7 +2209,7 @@ int cache_brint(char *adr, int *i) {
 void cache_rint(FILE * fp, int *i) {
  char s[256];

-  cache_rstr(fp, s);
+  cache_rstr(fp, s, sizeof(s));
  sscanf(s, "%d", i);
 }
 int cache_wint(FILE * fp, int i) {
@@ -2201,7 +2221,7 @@ int cache_wint(FILE * fp, int i) {
 void cache_rLLint(FILE * fp, LLint * i) {
  char s[256];

-  cache_rstr(fp, s);
+  cache_rstr(fp, s, sizeof(s));
  sscanf(s, LLintP, i);
 }
 int cache_wLLint(FILE * fp, LLint i) {
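The bounded read introduced in cache_rstr() above can be sketched in isolation. This is a minimal illustration, not HTTrack's code: read_field() is a hypothetical name, and the "<len>\n<bytes>" field layout is an assumption made for the example. It stores at most dst_size-1 bytes, always NUL-terminates, and seeks past whatever did not fit so that the next field stays aligned.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not HTTrack's cache_rstr): reads one "<len>\n<bytes>"
   field. Stores at most dst_size-1 bytes, NUL-terminates, and consumes the
   whole field. Returns 0 on success, -1 on a malformed or short field. */
static int read_field(FILE *fp, char *dst, size_t dst_size) {
  char line[32];
  long len;
  size_t want, store;

  assert(fp != NULL && dst != NULL && dst_size > 0);
  dst[0] = '\0';
  if (fgets(line, sizeof(line), fp) == NULL || sscanf(line, "%ld", &len) != 1)
    return -1;
  if (len < 0 || len > 32768) /* same sanity cap as in the diff */
    return -1;
  want = (size_t) len;
  store = want < dst_size ? want : dst_size - 1;
  if (fread(dst, 1, store, fp) != store) {
    dst[0] = '\0'; /* never leave an unterminated partial read behind */
    return -1;
  }
  dst[store] = '\0';
  /* Skip the bytes that did not fit, so the next field stays aligned. */
  if (want > store && fseek(fp, (long) (want - store), SEEK_CUR) != 0)
    return -1;
  return 0;
}
```

Unlike the diff, which asserts on I/O failure, this sketch reports errors through its return value and requires dst_size > 0, which also rules out the s_size - 1 underflow a zero-sized destination would cause.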
--- a/src/htscache.h
+++ b/src/htscache.h
@@ -80,10 +80,9 @@ int cache_writedata(FILE * cache_ndx, FILE * cache_dat, const char *str1,
 int cache_readdata(cache_back * cache, const char *str1, const char *str2,
                   char **inbuff, int *len);

-void cache_rstr(FILE * fp, char *s);
+void cache_rstr(FILE *fp, char *s, size_t s_size);
 char *cache_rstr_addr(FILE * fp);
-int cache_brstr(char *adr, char *s);
-int cache_quickbrstr(char *adr, char *s);
+int cache_brstr(char *adr, char *s, size_t s_size);
 int cache_brint(char *adr, int *i);
 void cache_rint(FILE * fp, int *i);
 void cache_rLLint(FILE * fp, LLint * i);
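The in-memory reader declared above (cache_brstr) follows the same rule: copy no more than the destination holds, but advance past the full field. A minimal sketch under stated assumptions follows; parse_field() is a hypothetical name, the "<len>\n<bytes>" layout is assumed, and, unlike cache_brstr in this diff, the sketch also takes the source length so it can reject a field that claims more bytes than the buffer contains.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not HTTrack's cache_brstr): parses one "<len>\n<bytes>"
   field from src[0..src_len). Copies at most dst_size-1 bytes into dst and
   returns the number of source bytes consumed, or 0 if the field is malformed. */
static size_t parse_field(const char *src, size_t src_len, char *dst,
                          size_t dst_size) {
  size_t pos = 0, len = 0, store;

  assert(src != NULL && dst != NULL && dst_size > 0);
  dst[0] = '\0';
  /* Decimal length, terminated by '\n'. */
  while (pos < src_len && src[pos] >= '0' && src[pos] <= '9') {
    len = len * 10 + (size_t) (src[pos] - '0');
    if (len > 32768) /* guard a corrupt length */
      return 0;
    pos++;
  }
  if (pos == 0 || pos >= src_len || src[pos] != '\n')
    return 0;
  pos++;
  if (len > src_len - pos) /* field claims more bytes than the buffer holds */
    return 0;
  store = len < dst_size ? len : dst_size - 1;
  memcpy(dst, src + pos, store);
  dst[store] = '\0';
  return pos + len; /* advance past the full field, even when truncated */
}
```

Callers would add the return value to their cursor, as the a += cache_brstr(...) call sites in htscoremain.c do, after checking it for 0.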
--- a/src/htscoremain.c
+++ b/src/htscoremain.c
@@ -40,6 +40,7 @@ Please visit our Website: http://www.httrack.com
 #include "htscore.h"
 #include "htsdefines.h"
 #include "htsalias.h"
+#include "htsbauth.h"
 #include "htswrap.h"
 #include "htsmodules.h"
 #include "htszlib.h"
@@ -138,6 +139,19 @@ static void basic_selftests(void) {
  fil_normalized(source, buffer);
  // MD5 selftests
  md5selftest();
+  // cookie_get field extraction (tab-separated, 0-based)
+  {
+    char cbuf[8192];
+
+    assertf(strcmp(cookie_get(cbuf, "a\tb\tc", 0), "a") == 0);
+    assertf(strcmp(cookie_get(cbuf, "a\tb\tc", 1), "b") == 0);
+    assertf(strcmp(cookie_get(cbuf, "a\tb\tc", 2), "c") == 0);
+    // multi-char fields catch length/boundary bugs that 1-char fields hide
+    assertf(strcmp(cookie_get(cbuf, "host\tx\t/path/to", 0), "host") == 0);
+    assertf(strcmp(cookie_get(cbuf, "host\tx\t/path/to", 2), "/path/to") == 0);
+    assertf(strcmp(cookie_get(cbuf, "a\t\tc", 1), "") == 0);  // empty field
+    assertf(strcmp(cookie_get(cbuf, "a\tb\tc", 9), "") == 0); // beyond last
+  }
 }

 /* Self-tests for the htssafe.h bounded string ops (driven by httrack -#8).
@@ -211,6 +225,10 @@ static int string_safety_selftests(void) {
    htsbuff_cpy(&b, "xyz");             /* reset */
    if (strcmp(htsbuff_str(&b), "xyz") != 0 || b.len != 3)
      return 1;
+
+    htsbuff_catc(&b, '!'); /* single character */
+    if (strcmp(htsbuff_str(&b), "xyz!") != 0 || b.len != 4)
+      return 1;
  }

  /* boundary: filling to exactly cap-1 must succeed (one more aborts, which the
@@ -381,10 +399,10 @@ static int hts_main_internal(int argc, char **argv, httrackp * opt) {
      /* Check that argv[] is not empty */
      if (strnotempty(argv[na])) {

-        /* Vérifier Commande (alias) */
-        result =
-          optalias_check(argc, (const char *const *) argv, na, &tmp_argc,
-                         (char **) tmp_argv, tmp_error);
+        /* Resolve an option alias, if any */
+        result = optalias_check(argc, (const char *const *) argv, na, &tmp_argc,
+                                (char **) tmp_argv, sizeof(_tmp_argv[0]),
+                                tmp_error, sizeof(tmp_error));
        if (!result) {
          HTS_PANIC_PRINTF(tmp_error);
          htsmain_free();
@@ -2137,8 +2155,8 @@ static int hts_main_internal(int argc, char **argv, httrackp * opt) {
                      char firstline[256];
                      char *a = cacheNdx;

-                      a += cache_brstr(a, firstline);
-                      a += cache_brstr(a, firstline);
+                      a += cache_brstr(a, firstline, sizeof(firstline));
+                      a += cache_brstr(a, firstline, sizeof(firstline));
                      while(a != NULL) {
                        a = strchr(a + 1, '\n');        /* start of line */
                        if (a) {
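The cookie_get self-tests added to basic_selftests() above pin down a simple contract: fields are tab-separated, indexed from 0, an empty field yields "", and an index past the last field also yields "". A minimal sketch of a function with that contract is shown here; get_field() and its explicit out_size parameter are hypothetical and are not HTTrack's cookie_get.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not HTTrack's cookie_get): copies the index-th
   tab-separated field of line into out (0-based), truncating to out_size-1
   bytes. A missing field yields the empty string. Returns out. */
static const char *get_field(char *out, size_t out_size, const char *line,
                             int index) {
  const char *start = line;
  const char *end;
  size_t n;

  assert(out != NULL && out_size > 0 && line != NULL && index >= 0);
  while (index > 0) {
    const char *tab = strchr(start, '\t');

    if (tab == NULL) { /* fewer fields than requested */
      out[0] = '\0';
      return out;
    }
    start = tab + 1;
    index--;
  }
  end = strchr(start, '\t');
  n = end != NULL ? (size_t) (end - start) : strlen(start);
  if (n >= out_size)
    n = out_size - 1;
  memcpy(out, start, n);
  out[n] = '\0';
  return out;
}
```

The multi-character cases in the self-test matter for exactly the reason its comment gives: with one-character fields, an off-by-one in n would go unnoticed.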
--- a/src/htslib.c
+++ b/src/htslib.c
@@ -1660,138 +1660,107 @@ void treathead(t_cookie * cookie, const char *adr, const char *fil, htsblk * ret
  }
 }

-// transforme le message statuscode en chaîne
-HTSEXT_API void infostatuscode(char *msg, int statuscode) {
+// HTTP status code -> reason phrase (per RFC), or NULL if unknown.
+HTSEXT_API const char *infostatuscode_const(int statuscode) {
+  // Cheap dispatch (typically a jump table or search tree); phrases are static.
  switch (statuscode) {
-    // Erreurs HTTP, selon RFC
  case 100:
-    strcpybuff(msg, "Continue");
-    break;
+    return "Continue";
  case 101:
-    strcpybuff(msg, "Switching Protocols");
-    break;
+    return "Switching Protocols";
  case 200:
-    strcpybuff(msg, "OK");
-    break;
+    return "OK";
  case 201:
-    strcpybuff(msg, "Created");
-    break;
+    return "Created";
  case 202:
-    strcpybuff(msg, "Accepted");
-    break;
+    return "Accepted";
  case 203:
-    strcpybuff(msg, "Non-Authoritative Information");
-    break;
+    return "Non-Authoritative Information";
  case 204:
-    strcpybuff(msg, "No Content");
-    break;
+    return "No Content";
  case 205:
-    strcpybuff(msg, "Reset Content");
-    break;
+    return "Reset Content";
  case 206:
-    strcpybuff(msg, "Partial Content");
-    break;
+    return "Partial Content";
  case 300:
-    strcpybuff(msg, "Multiple Choices");
-    break;
+    return "Multiple Choices";
  case 301:
-    strcpybuff(msg, "Moved Permanently");
-    break;
+    return "Moved Permanently";
  case 302:
-    strcpybuff(msg, "Moved Temporarily");
-    break;
+    return "Moved Temporarily";
  case 303:
-    strcpybuff(msg, "See Other");
-    break;
+    return "See Other";
  case 304:
-    strcpybuff(msg, "Not Modified");
-    break;
+    return "Not Modified";
  case 305:
-    strcpybuff(msg, "Use Proxy");
-    break;
+    return "Use Proxy";
  case 306:
-    strcpybuff(msg, "Undefined 306 error");
-    break;
+    return "Undefined 306 error";
  case 307:
-    strcpybuff(msg, "Temporary Redirect");
-    break;
+    return "Temporary Redirect";
  case 400:
-    strcpybuff(msg, "Bad Request");
-    break;
+    return "Bad Request";
  case 401:
-    strcpybuff(msg, "Unauthorized");
-    break;
+    return "Unauthorized";
  case 402:
-    strcpybuff(msg, "Payment Required");
-    break;
+    return "Payment Required";
  case 403:
-    strcpybuff(msg, "Forbidden");
-    break;
+    return "Forbidden";
  case 404:
-    strcpybuff(msg, "Not Found");
-    break;
+    return "Not Found";
  case 405:
-    strcpybuff(msg, "Method Not Allowed");
-    break;
+    return "Method Not Allowed";
  case 406:
-    strcpybuff(msg, "Not Acceptable");
-    break;
+    return "Not Acceptable";
  case 407:
-    strcpybuff(msg, "Proxy Authentication Required");
-    break;
+    return "Proxy Authentication Required";
  case 408:
-    strcpybuff(msg, "Request Time-out");
-    break;
+    return "Request Time-out";
  case 409:
-    strcpybuff(msg, "Conflict");
-    break;
+    return "Conflict";
  case 410:
-    strcpybuff(msg, "Gone");
-    break;
+    return "Gone";
  case 411:
-    strcpybuff(msg, "Length Required");
-    break;
+    return "Length Required";
  case 412:
-    strcpybuff(msg, "Precondition Failed");
-    break;
+    return "Precondition Failed";
  case 413:
-    strcpybuff(msg, "Request Entity Too Large");
-    break;
+    return "Request Entity Too Large";
  case 414:
-    strcpybuff(msg, "Request-URI Too Large");
-    break;
+    return "Request-URI Too Large";
  case 415:
-    strcpybuff(msg, "Unsupported Media Type");
-    break;
+    return "Unsupported Media Type";
  case 416:
-    strcpybuff(msg, "Requested Range Not Satisfiable");
-    break;
+    return "Requested Range Not Satisfiable";
  case 417:
-    strcpybuff(msg, "Expectation Failed");
-    break;
+    return "Expectation Failed";
  case 500:
-    strcpybuff(msg, "Internal Server Error");
-    break;
+    return "Internal Server Error";
  case 501:
-    strcpybuff(msg, "Not Implemented");
-    break;
+    return "Not Implemented";
  case 502:
-    strcpybuff(msg, "Bad Gateway");
-    break;
+    return "Bad Gateway";
  case 503:
-    strcpybuff(msg, "Service Unavailable");
-    break;
+    return "Service Unavailable";
  case 504:
-    strcpybuff(msg, "Gateway Time-out");
-    break;
+    return "Gateway Time-out";
  case 505:
-    strcpybuff(msg, "HTTP Version Not Supported");
-    break;
-    //
+    return "HTTP Version Not Supported";
  default:
-    if (strnotempty(msg) == 0)
-      strcpybuff(msg, "Unknown error");
-    break;
+    return NULL;
+  }
+}
+
+// Write the status code's reason phrase into msg. For an unknown code, keep any
+// caller-provided message, otherwise fall back to a default. Callers provide a
+// buffer of at least 64 bytes (the longest reason phrase is 31).
+HTSEXT_API void infostatuscode(char *msg, int statuscode) {
+  const char *const text = infostatuscode_const(statuscode);
+
+  if (text != NULL) {
+    strlcpybuff(msg, text, 64);
+  } else if (strnotempty(msg) == 0) {
+    strlcpybuff(msg, "Unknown error", 64);
  }
 }
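The split above, a pure lookup that returns a static string plus a thin wrapper that writes into the caller's buffer, can be sketched on a reduced table. The names reason_phrase() and reason_phrase_into() are hypothetical, only three codes are listed, and the wrapper takes an explicit size where the diff hard-codes 64.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical reduced lookup: status code -> static phrase, NULL if unknown. */
static const char *reason_phrase(int code) {
  switch (code) {
  case 200:
    return "OK";
  case 404:
    return "Not Found";
  case 500:
    return "Internal Server Error";
  default:
    return NULL;
  }
}

/* Write the phrase into msg. For an unknown code, keep a message the caller
   already provided; otherwise fall back to a default. Output is truncated to
   msg_size-1 bytes and always NUL-terminated. */
static void reason_phrase_into(char *msg, size_t msg_size, int code) {
  const char *const text = reason_phrase(code);

  assert(msg != NULL && msg_size > 0);
  if (text != NULL)
    snprintf(msg, msg_size, "%s", text);
  else if (msg[0] == '\0')
    snprintf(msg, msg_size, "%s", "Unknown error");
}
```

Returning a const pointer lets callers that only need to read the phrase skip the copy, while the wrapper keeps the older "fill my buffer" behavior for existing call sites.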

--- a/src/htsname.c
+++ b/src/htsname.c
@@ -767,7 +767,7 @@ int url_savename(lien_adrfilsave *const afs,
  // optionally add the site name first
  if (opt->savename_type == -1) {       // use savename_userdef! (%h%p/%n%q.%t)
    const char *a = StringBuff(opt->savename_userdef);
-    char *b = afs->save;
+    htsbuff sb = htsbuff_array(afs->save);

    /*char *nom_pos=NULL,*dot_pos=NULL;  // Position nom et point */
    char tok;
@@ -787,17 +787,16 @@ int url_savename(lien_adrfilsave *const afs,
       }
     */

-    // Construire nom
-    while((*a) && (((int) (b - afs->save)) < HTS_URLMAXSIZE)) {      // parser, et pas trop long..
+    // build the name
+    while ((*a) && (sb.len < HTS_URLMAXSIZE)) { // parse, but not too long
      if (*a == '%') {
        int short_ver = 0;

        a++;
-        if (*a == 's') {
+        if (*a == 's') { // '%s...' selects the short (8.3) form
          short_ver = 1;
          a++;
        }
-        *b = '\0';
        switch (tok = *a++) {
        case '[':              // %[param:prefix_if_not_empty:suffix_if_not_empty:empty_replacement:notfound_replacement]
          if (strchr(a, ']')) {
@@ -834,8 +833,7 @@ int url_savename(lien_adrfilsave *const afs,
              }
              if (cp) {
                c = cp + strlen(name[0]);       /* jumps "param=" */
-                strcpybuff(b, name[1]); /* prefix */
-                b += strlen(b);
+                htsbuff_cat(&sb, name[1]);      /* prefix */
                if (*c != '\0' && *c != '&') {
                  char *d = name[0];

@@ -846,110 +844,90 @@ int url_savename(lien_adrfilsave *const afs,
                  *d = '\0';
                  d = unescape_http(catbuff, sizeof(catbuff), name[0]);
                  if (d && *d) {
-                    strcpybuff(b, d);   /* value */
-                    b += strlen(b);
+                    htsbuff_cat(&sb, d); /* value */
                  } else {
-                    strcpybuff(b, name[3]);     /* empty replacement if any */
-                    b += strlen(b);
+                    htsbuff_cat(&sb, name[3]); /* empty replacement if any */
                  }
                } else {
-                  strcpybuff(b, name[3]);       /* empty replacement if any */
-                  b += strlen(b);
+                  htsbuff_cat(&sb, name[3]); /* empty replacement if any */
                }
-                strcpybuff(b, name[2]); /* suffix */
-                b += strlen(b);
+                htsbuff_cat(&sb, name[2]); /* suffix */
              } else {
-                strcpybuff(b, name[4]); /* not found replacement if any */
-                b += strlen(b);
+                htsbuff_cat(&sb, name[4]); /* not found replacement if any */
              }
            } else {
-              strcpybuff(b, name[4]);   /* not found replacement if any */
-              b += strlen(b);
+              htsbuff_cat(&sb, name[4]); /* not found replacement if any */
            }
          }
          break;
        case '%':
-          *b++ = '%';
+          htsbuff_catc(&sb, '%');
          break;
-        case 'n':              // nom sans ext
-          *b = '\0';
+        case 'n': // name without extension
          if (dot_pos) {
-            if (!short_ver)     // Noms longs
-              strncatbuff(b, nom_pos, (int) (dot_pos - nom_pos));
+            if (!short_ver)
+              htsbuff_catn(&sb, nom_pos, (int) (dot_pos - nom_pos));
            else
-              strncatbuff(b, nom_pos, min((int) (dot_pos - nom_pos), 8));
+              htsbuff_catn(&sb, nom_pos, min((int) (dot_pos - nom_pos), 8));
          } else {
-            if (!short_ver)     // Noms longs
-              strcpybuff(b, nom_pos);
+            if (!short_ver)
+              htsbuff_cat(&sb, nom_pos);
            else
-              strncatbuff(b, nom_pos, 8);
+              htsbuff_catn(&sb, nom_pos, 8);
          }
-          b += strlen(b);       // pointer à la fin
          break;
-        case 'N':              // nom avec ext
-          // RECOPIE NOM + EXT
-          *b = '\0';
+        case 'N': // name with extension
          if (dot_pos) {
-            if (!short_ver)     // Noms longs
-              strncatbuff(b, nom_pos, (int) (dot_pos - nom_pos));
+            if (!short_ver)
+              htsbuff_catn(&sb, nom_pos, (int) (dot_pos - nom_pos));
            else
-              strncatbuff(b, nom_pos, min((int) (dot_pos - nom_pos), 8));
+              htsbuff_catn(&sb, nom_pos, min((int) (dot_pos - nom_pos), 8));
          } else {
-            if (!short_ver)     // Noms longs
-              strcpybuff(b, nom_pos);
+            if (!short_ver)
+              htsbuff_cat(&sb, nom_pos);
            else
-              strncatbuff(b, nom_pos, 8);
+              htsbuff_catn(&sb, nom_pos, 8);
          }
-          b += strlen(b);       // pointer à la fin
-          *b = '.';
-          ++b;
-          // RECOPIE NOM + EXT
-          *b = '\0';
+          htsbuff_catc(&sb, '.');
          if (dot_pos) {
-            if (!short_ver)     // Noms longs
-              strcpybuff(b, dot_pos + 1);
+            if (!short_ver)
+              htsbuff_cat(&sb, dot_pos + 1);
            else
-              strncatbuff(b, dot_pos + 1, 3);
+              htsbuff_catn(&sb, dot_pos + 1, 3);
          } else {
-            if (!short_ver)     // Noms longs
-              strcpybuff(b, DEFAULT_EXT + 1);   // pas de..
+            if (!short_ver)
+              htsbuff_cat(&sb, DEFAULT_EXT + 1); // skip the leading dot
            else
-              strcpybuff(b, DEFAULT_EXT_SHORT + 1);     // pas de..
+              htsbuff_cat(&sb, DEFAULT_EXT_SHORT + 1); // skip the leading dot
          }
-          b += strlen(b);       // pointer à la fin
-          //
          break;
-        case 't':              // ext
-          *b = '\0';
+        case 't': // extension
          if (dot_pos) {
-            if (!short_ver)     // Noms longs
-              strcpybuff(b, dot_pos + 1);
+            if (!short_ver)
+              htsbuff_cat(&sb, dot_pos + 1);
            else
-              strncatbuff(b, dot_pos + 1, 3);
+              htsbuff_catn(&sb, dot_pos + 1, 3);
          } else {
-            if (!short_ver)     // Noms longs
-              strcpybuff(b, DEFAULT_EXT + 1);   // pas de..
+            if (!short_ver)
+              htsbuff_cat(&sb, DEFAULT_EXT + 1); // skip the leading dot
            else
-              strcpybuff(b, DEFAULT_EXT_SHORT + 1);     // pas de..
+              htsbuff_cat(&sb, DEFAULT_EXT_SHORT + 1); // skip the leading dot
          }
-          b += strlen(b);       // pointer à la fin
          break;
-        case 'p':              // path sans dernier /
-          *b = '\0';
-          if (nom_pos != fil + 1) {     // pas: /index.html (chemin nul)
-            if (!short_ver) {   // Noms longs
-              strncatbuff(b, fil, (int) (nom_pos - fil) - 1);
+        case 'p': // path without trailing /
+          if (nom_pos !=
+              fil + 1) { // skip when the path is empty (e.g. /index.html)
+            if (!short_ver) {
+              htsbuff_catn(&sb, fil, (int) (nom_pos - fil) - 1);
            } else {
              char BIGSTK pth[HTS_URLMAXSIZE * 2], n83[HTS_URLMAXSIZE * 2];

              pth[0] = n83[0] = '\0';
-              //
              strncatbuff(pth, fil, (int) (nom_pos - fil) - 1);
              long_to_83(opt->savename_83, n83, pth);
-              strcpybuff(b, n83);
+              htsbuff_cat(&sb, n83);
            }
          }
-          b += strlen(b);       // pointer à la fin
          break;
        case 'h':              // host (IDNA decoded if suitable)
          // IDNA / RFC 3492 (Punycode) handling for HTTP(s)
@@ -957,62 +935,50 @@ int url_savename(lien_adrfilsave *const afs,
            DECLARE_ADR(final_adr);

            /* Copy address */
-            *b = '\0';
            if (!short_ver)
-              strcpybuff(b, final_adr);
+              htsbuff_cat(&sb, final_adr);
            else
-              strcpybuff(b, final_adr);
+              htsbuff_cat(&sb, final_adr);

            /* release */
            RELEASE_ADR();
          }
-          b += strlen(b);       // pointer à la fin
          break;
-        case 'H':              // host, raw (old mode)
-          *b = '\0';
+        case 'H': // host, raw (old mode)
          if (protocol == PROTOCOL_FILE) {
-            if (!short_ver)     // Noms longs
-              strcpybuff(b, "localhost");
+            if (!short_ver)
+              htsbuff_cat(&sb, "localhost");
            else
-              strcpybuff(b, "local");
+              htsbuff_cat(&sb, "local");
          } else {
-            if (!short_ver)     // Noms longs
-              strcpybuff(b, print_adr);
+            if (!short_ver)
+              htsbuff_cat(&sb, print_adr);
            else
-              strncatbuff(b, print_adr, 8);
+              htsbuff_catn(&sb, print_adr, 8);
          }
-          b += strlen(b);       // pointer à la fin
          break;
-        case 'M':              /* host/address?query MD5 (128-bits) */
-          *b = '\0';
-          {
-            char digest[32 + 2];
-            char BIGSTK buff[HTS_URLMAXSIZE * 2];
+        case 'M': /* host/address?query MD5 (128-bits) */
+        {
+          char digest[32 + 2];
+          char BIGSTK buff[HTS_URLMAXSIZE * 2];

-            digest[0] = buff[0] = '\0';
-            strcpybuff(buff, adr);
-            strcatbuff(buff, fil_complete);
-            domd5mem(buff, strlen(buff), digest, 1);
-            strcpybuff(b, digest);
-          }
-          b += strlen(b);       // pointer à la fin
-          break;
+          digest[0] = buff[0] = '\0';
+          strcpybuff(buff, adr);
+          strcatbuff(buff, fil_complete);
+          domd5mem(buff, strlen(buff), digest, 1);
+          htsbuff_cat(&sb, digest);
+        } break;
        case 'Q':
-        case 'q':              /* query MD5 (128-bits/16-bits) 
-                                   GENERATED ONLY IF query string exists! */
-          {
-            char md5[32 + 2];
+        case 'q': /* query MD5 (128-bits/16-bits)
+                      GENERATED ONLY IF query string exists! */
+        {
+          char md5[32 + 2];

-            *b = '\0';
-            strncatbuff(b, url_md5(md5, fil_complete), (tok == 'Q') ? 32 : 4);
-            b += strlen(b);     // pointer à la fin
-          }
-          break;
+          htsbuff_catn(&sb, url_md5(md5, fil_complete), (tok == 'Q') ? 32 : 4);
+        } break;
        case 'r':
        case 'R':              // protocol
-          *b = '\0';
-          strcatbuff(b, protocol_str[protocol]);
-          b += strlen(b);       // pointer à la fin
+          htsbuff_cat(&sb, protocol_str[protocol]);
          break;

          /* Patch by Juan Fco Rodriguez to get the full query string */
@@ -1021,19 +987,17 @@ int url_savename(lien_adrfilsave *const afs,
            char *d = strchr(fil_complete, '?');

            if (d != NULL) {
-              strcatbuff(b, d);
-              b += strlen(b);
+              htsbuff_cat(&sb, d);
            }
          }
          break;

        }
      } else
-        *b++ = *a++;
+        htsbuff_catc(&sb, *a++);
    }
-    *b++ = '\0';
    //
-    // Types prédéfinis
+    // predefined types
    //

  }
--- a/src/htssafe.h
+++ b/src/htssafe.h
@@ -351,6 +351,13 @@ static HTS_INLINE HTS_UNUSED void htsbuff_cat(htsbuff *b, const char *s) {
  htsbuff_catn(b, s, (size_t) -1);
 }

+/** Append a single character (including '\0' as data). Aborts on overflow. */
+static HTS_INLINE HTS_UNUSED void htsbuff_catc(htsbuff *b, char c) {
+  assertf__(1 < b->cap - b->len, "htsbuff append overflow", __FILE__, __LINE__);
+  b->buf[b->len++] = c;
+  b->buf[b->len] = '\0';
+}
+
 /** Reset content to s. Aborts on overflow. */
 static HTS_INLINE HTS_UNUSED void htsbuff_cpy(htsbuff *b, const char *s) {
  b->len = 0;
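The single-character append above, and the way url_savename() now builds its output through a bounded buffer instead of a raw write pointer, can be sketched together. This is a minimal illustration under stated assumptions: sbuf, sbuf_catc(), sbuf_catn() and expand_name() are hypothetical stand-ins, the sketch only handles %n, %t, %% and the %s short-form prefix, and it returns -1 on overflow where htsbuff aborts.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical bounded buffer; a stand-in for htsbuff, not its definition. */
typedef struct {
  char *buf;  /* destination storage */
  size_t cap; /* total capacity, including the trailing NUL */
  size_t len; /* current string length */
} sbuf;

/* Append one character; needs room for it plus the NUL. 0 on success. */
static int sbuf_catc(sbuf *b, char c) {
  if (b->cap - b->len < 2)
    return -1;
  b->buf[b->len++] = c;
  b->buf[b->len] = '\0';
  return 0;
}

/* Append at most n characters of s. 0 on success, -1 on overflow. */
static int sbuf_catn(sbuf *b, const char *s, size_t n) {
  while (n-- > 0 && *s != '\0') {
    if (sbuf_catc(b, *s++) != 0)
      return -1;
  }
  return 0;
}

/* Expand %n (name), %t (extension) and %% in tmpl; a leading 's' after '%'
   selects the short 8.3 form. Unknown tokens are dropped in this sketch. */
static int expand_name(char *out, size_t out_size, const char *tmpl,
                       const char *name, const char *ext) {
  sbuf b;

  assert(out != NULL && out_size > 0);
  b.buf = out;
  b.cap = out_size;
  b.len = 0;
  out[0] = '\0';
  while (*tmpl != '\0') {
    int short_ver = 0;
    int rc = 0;

    if (*tmpl != '%') { /* literal character */
      if (sbuf_catc(&b, *tmpl++) != 0)
        return -1;
      continue;
    }
    tmpl++; /* skip '%' */
    if (*tmpl == 's') {
      short_ver = 1;
      tmpl++;
    }
    switch (*tmpl) {
    case '\0': /* dangling '%' at the end of the template */
      return 0;
    case '%':
      rc = sbuf_catc(&b, '%');
      break;
    case 'n':
      rc = sbuf_catn(&b, name, short_ver ? 8 : (size_t) -1);
      break;
    case 't':
      rc = sbuf_catn(&b, ext, short_ver ? 3 : (size_t) -1);
      break;
    default:
      break;
    }
    if (rc != 0)
      return -1;
    tmpl++;
  }
  return 0;
}
```

Because every append goes through the length check, the manual *b = '\0' and b += strlen(b) bookkeeping that the old url_savename() code repeated after each token is no longer needed.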
--- a/src/htswizard.c
+++ b/src/htswizard.c
@@ -43,17 +43,23 @@ Please visit our Website: http://www.httrack.com
 /* END specific definitions */

 // free up filters[0] so that a new element can be inserted at filters[0]
-#define HT_INSERT_FILTERS0 do {\
-  int i;\
-  if (*opt->filters.filptr > 0) {\
-    for(i = (*opt->filters.filptr)-1 ; i>=0 ; i--) {\
-      strcpybuff((*opt->filters.filters)[i+1],(*opt->filters.filters)[i]);\
-    }\
-  }\
-  (*opt->filters.filters)[0][0]='\0';\
-  (*opt->filters.filptr)++;\
-  assertf((*opt->filters.filptr) < opt->maxfilter); \
-} while(0)
+/* Per-slot capacity of the filters array, matching the slot stride allocated by
+   filters_init() in htscore.c (HTS_URLMAXSIZE * 2). */
+#define HTS_FILTER_SLOT_SIZE (HTS_URLMAXSIZE * 2)
+
+#define HT_INSERT_FILTERS0                                                     \
+  do {                                                                         \
+    int i;                                                                     \
+    if (*opt->filters.filptr > 0) {                                            \
+      for (i = (*opt->filters.filptr) - 1; i >= 0; i--) {                      \
+        strlcpybuff((*opt->filters.filters)[i + 1],                            \
+                    (*opt->filters.filters)[i], HTS_FILTER_SLOT_SIZE);         \
+      }                                                                        \
+    }                                                                          \
+    (*opt->filters.filters)[0][0] = '\0';                                      \
+    (*opt->filters.filptr)++;                                                  \
+    assertf((*opt->filters.filptr) < opt->maxfilter);                          \
+  } while (0)

 typedef struct htspair_t {
  const char *tag;
@@ -707,17 +713,21 @@ static int hts_acceptlink_(httrackp * opt, int ptr,
        forbidden_url = 1;
        opt->wizard = 2;        // skip everything else
        break;
-      case 0:                  // interdire les mêmes liens: adr/fil
+      case 0: // forbid the same link: adr/fil
        forbidden_url = 1;
-        HT_INSERT_FILTERS0;     // insérer en 0
-        strcpybuff(_FILTERS[0], "-");
-        strcatbuff(_FILTERS[0], jump_identification_const(adr));
-        if (*fil != '/')
-          strcatbuff(_FILTERS[0], "/");
-        strcatbuff(_FILTERS[0], fil);
+        HT_INSERT_FILTERS0; // insert at slot 0
+        {
+          htsbuff f = htsbuff_ptr(_FILTERS[0], HTS_FILTER_SLOT_SIZE);
+
+          htsbuff_cpy(&f, "-");
+          htsbuff_cat(&f, jump_identification_const(adr));
+          if (*fil != '/')
+            htsbuff_cat(&f, "/");
+          htsbuff_cat(&f, fil);
+        }
        break;

-      case 1:                  // éliminer répertoire entier et sous rép: adr/path/ *
+      case 1: // forbid the whole directory and subdirs: adr/path/*
        forbidden_url = 1;
        {
          size_t i = strlen(fil) - 1;
@@ -725,27 +735,34 @@ static int hts_acceptlink_(httrackp * opt, int ptr,
          while((fil[i] != '/') && (i > 0))
            i--;
          if (fil[i] == '/') {
-            HT_INSERT_FILTERS0; // insérer en 0
-            strcpybuff(_FILTERS[0], "-");
-            strcatbuff(_FILTERS[0], jump_identification_const(adr));
+            htsbuff f;
+
+            HT_INSERT_FILTERS0; // insert at slot 0
+            f = htsbuff_ptr(_FILTERS[0], HTS_FILTER_SLOT_SIZE);
+            htsbuff_cpy(&f, "-");
+            htsbuff_cat(&f, jump_identification_const(adr));
            if (*fil != '/')
-              strcatbuff(_FILTERS[0], "/");
-            strncatbuff(_FILTERS[0], fil, i);
-            if (_FILTERS[0][strlen(_FILTERS[0]) - 1] != '/')
-              strcatbuff(_FILTERS[0], "/");
-            strcatbuff(_FILTERS[0], "*");
+              htsbuff_cat(&f, "/");
+            htsbuff_catn(&f, fil, i);
+            if (f.len > 0 && f.buf[f.len - 1] != '/')
+              htsbuff_cat(&f, "/");
+            htsbuff_cat(&f, "*");
          }
        }

        // ** ...
        break;

-      case 2:                  // adresse adr*
+      case 2: // the whole address: adr*
        forbidden_url = 1;
-        HT_INSERT_FILTERS0;     // insérer en 0                                
-        strcpybuff(_FILTERS[0], "-");
-        strcatbuff(_FILTERS[0], jump_identification_const(adr));
-        strcatbuff(_FILTERS[0], "*");
+        HT_INSERT_FILTERS0; // insert at slot 0
+        {
+          htsbuff f = htsbuff_ptr(_FILTERS[0], HTS_FILTER_SLOT_SIZE);
+
+          htsbuff_cpy(&f, "-");
+          htsbuff_cat(&f, jump_identification_const(adr));
+          htsbuff_cat(&f, "*");
+        }
        break;

      case 3:                  // ** TODO
@@ -777,54 +794,70 @@ static int hts_acceptlink_(httrackp * opt, int ptr,

        break;

-      case 5:                  // autoriser répertoire entier et fils
-        if ((opt->seeker & 2) == 0) {   // interdiction de monter
+      case 5: // allow the whole directory and its children
+        if ((opt->seeker & 2) == 0) { // not allowed to go up
          size_t i = strlen(fil) - 1;

          while((fil[i] != '/') && (i > 0))
            i--;
          if (fil[i] == '/') {
-            HT_INSERT_FILTERS0; // insérer en 0                                
-            strcpybuff(_FILTERS[0], "+");
-            strcatbuff(_FILTERS[0], jump_identification_const(adr));
-            if (*fil != '/')
-              strcatbuff(_FILTERS[0], "/");
-            strncatbuff(_FILTERS[0], fil, i + 1);
-            strcatbuff(_FILTERS[0], "*");
+            HT_INSERT_FILTERS0; // insert at slot 0
+            {
+              htsbuff f = htsbuff_ptr(_FILTERS[0], HTS_FILTER_SLOT_SIZE);
+
+              htsbuff_cpy(&f, "+");
+              htsbuff_cat(&f, jump_identification_const(adr));
+              if (*fil != '/')
+                htsbuff_cat(&f, "/");
+              htsbuff_catn(&f, fil, i + 1);
+              htsbuff_cat(&f, "*");
+            }
+          }
+        } else {              // then allow the domain
+          HT_INSERT_FILTERS0; // insert at slot 0
+          {
+            htsbuff f = htsbuff_ptr(_FILTERS[0], HTS_FILTER_SLOT_SIZE);
+
+            htsbuff_cpy(&f, "+");
+            htsbuff_cat(&f, jump_identification_const(adr));
+            htsbuff_cat(&f, "*");
          }
-        } else {                // autoriser domaine alors!!
-          HT_INSERT_FILTERS0;   // insérer en 0                                strcpybuff(filters[filptr],"+");
-          strcpybuff(_FILTERS[0], "+");
-          strcatbuff(_FILTERS[0], jump_identification_const(adr));
-          strcatbuff(_FILTERS[0], "*");
        }
        break;

      case 6:                  // same domain
-        HT_INSERT_FILTERS0;     // insérer en 0                                strcpybuff(filters[filptr],"+");
-        strcpybuff(_FILTERS[0], "+");
-        strcatbuff(_FILTERS[0], jump_identification_const(adr));
-        strcatbuff(_FILTERS[0], "*");
+        HT_INSERT_FILTERS0;    // insert at slot 0
+        {
+          htsbuff f = htsbuff_ptr(_FILTERS[0], HTS_FILTER_SLOT_SIZE);
+
+          htsbuff_cpy(&f, "+");
+          htsbuff_cat(&f, jump_identification_const(adr));
+          htsbuff_cat(&f, "*");
+        }
        break;
        //
-      case 7:                  // autoriser ce répertoire
-        {
-          size_t i = strlen(fil) - 1;
+      case 7: // allow this directory
+      {
+        size_t i = strlen(fil) - 1;

-          while((fil[i] != '/') && (i > 0))
-            i--;
-          if (fil[i] == '/') {
-            HT_INSERT_FILTERS0; // insérer en 0                                
-            strcpybuff(_FILTERS[0], "+");
-            strcatbuff(_FILTERS[0], jump_identification_const(adr));
+        while ((fil[i] != '/') && (i > 0))
+          i--;
+        if (fil[i] == '/') {
+          HT_INSERT_FILTERS0; // insert at slot 0
+          {
+            htsbuff f = htsbuff_ptr(_FILTERS[0], HTS_FILTER_SLOT_SIZE);
+
+            htsbuff_cpy(&f, "+");
+            htsbuff_cat(&f, jump_identification_const(adr));
            if (*fil != '/')
-              strcatbuff(_FILTERS[0], "/");
-            strncatbuff(_FILTERS[0], fil, i + 1);
-            strcatbuff(_FILTERS[0], "*[file]");
+              htsbuff_cat(&f, "/");
+            htsbuff_catn(&f, fil, i + 1);
+            htsbuff_cat(&f, "*[file]");
          }
        }
+      }

-        break;
+      break;

      case 50:                 // do nothing
        break;
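The pattern used throughout the htswizard.c hunks above (shift every filter slot down by one with a bounded copy, then build the new rule in slot 0 through a bounded buffer) can be sketched on a small fixed-size array. All names here are hypothetical: slot_list, insert_front(), SLOT_SIZE and MAX_SLOTS are illustration-only stand-ins for the filters array and HTS_FILTER_SLOT_SIZE, and the sketch returns -1 where the original asserts.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define SLOT_SIZE 16 /* hypothetical per-slot capacity */
#define MAX_SLOTS 4  /* hypothetical number of slots */

typedef struct {
  char slots[MAX_SLOTS][SLOT_SIZE];
  int count;
} slot_list;

/* Insert prefix+value at slot 0, shifting existing entries down by one.
   Returns 0 on success, -1 if the list is full or the entry does not fit. */
static int insert_front(slot_list *l, const char *prefix, const char *value) {
  char tmp[SLOT_SIZE];
  int i, n;

  assert(l != NULL && prefix != NULL && value != NULL);
  if (l->count >= MAX_SLOTS)
    return -1;
  /* Build the entry first, so a too-long rule leaves the list untouched. */
  n = snprintf(tmp, sizeof(tmp), "%s%s", prefix, value);
  if (n < 0 || (size_t) n >= sizeof(tmp))
    return -1;
  /* Shift from the back so no slot is overwritten before it is copied. */
  for (i = l->count - 1; i >= 0; i--)
    memcpy(l->slots[i + 1], l->slots[i], SLOT_SIZE);
  memcpy(l->slots[0], tmp, (size_t) n + 1);
  l->count++;
  return 0;
}
```

Validating the new entry before shifting is a design choice of this sketch; the macro in the diff shifts first and then relies on the bounded htsbuff appends to abort on overflow.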
--- a/src/httrack-library.h
+++ b/src/httrack-library.h
@@ -193,6 +193,7 @@ HTSEXT_API int structcheck(const char *path);
 HTSEXT_API int structcheck_utf8(const char *path);
 HTSEXT_API int dir_exists(const char *path);
 HTSEXT_API void infostatuscode(char *msg, int statuscode);
+HTSEXT_API const char *infostatuscode_const(int statuscode);
 HTSEXT_API TStamp mtime_local(void);
 HTSEXT_API void qsec2str(char *st, TStamp t);
 HTSEXT_API char *int2char(strc_int2bytes2 * strc, int n);
--- a/tests/02_update-cache.test
+++ b/tests/02_update-cache.test
@@ -0,0 +1,62 @@
+#!/bin/bash
+#
+
+# Update path: re-mirroring a site reads the cache (cache_readex) to decide what
+# is up to date -- a path the one-shot crawl tests never exercise. Offline
+# (file://), so it always runs.
+#
+#   1. mirror, then re-mirror unchanged -> the cache-read pass must complete clean
+#      (guards against a crash/abort/error in cache_readex).
+#   2. change a source file, re-mirror -> the update must pick up the new content
+#      (guards the update decision that reads the cached metadata).
+
+set -eu
+
+site=$(mktemp -d)
+out=$(mktemp -d)
+trap 'rm -rf "$site" "$out"' EXIT
+
+cat >"$site/index.html" <<EOF
+<a href="a.html">a</a> <a href="sub/b.html">b</a>
+EOF
+echo 'OLDCONTENT' >"$site/a.html"
+mkdir -p "$site/sub"
+echo '<p>bbb</p>' >"$site/sub/b.html"
+
+url="file://$site/index.html"
+
+# count Error: lines in the log (grep -c exits 1 on zero matches: guard it)
+errors() { grep -ciE '^[0-9:]*[[:space:]]Error:' "$out/hts-log.txt" || true; }
+
+# 1. fresh mirror writes the cache
+httrack "$url" -O "$out" -q -%v0 -r3 >/dev/null 2>&1
+test -e "$out/hts-cache/new.zip" || {
+    echo "no cache was written" >&2
+    exit 1
+}
+
+# 2. re-mirror unchanged: the update reads the cache and must complete cleanly
+httrack "$url" -O "$out" -q -%v0 -r3 >/dev/null 2>&1
+test "$(errors)" = 0 || {
+    echo "update (unchanged) reported errors" >&2
+    exit 1
+}
+for suffix in a.html sub/b.html; do
+    find "$out" -path "*/$suffix" | grep -q . || {
+        echo "missing $suffix after update" >&2
+        exit 1
+    }
+done
+
+# 3. change a source file: the update must pick up the new content
+sleep 1
+echo 'NEWCONTENT' >"$site/a.html"
+httrack "$url" -O "$out" -q -%v0 -r3 >/dev/null 2>&1
+test "$(errors)" = 0 || {
+    echo "update (changed) reported errors" >&2
+    exit 1
+}
+grep -q NEWCONTENT "$(find "$out" -path '*/a.html')" || {
+    echo "update did not pick up the changed source" >&2
+    exit 1
+}
--- a/tests/Makefile.am
+++ b/tests/Makefile.am
@@ -22,6 +22,7 @@ TESTS = \
 	01_engine-simplify.test \
 	01_engine-strsafe.test \
 	02_manpage-regen.test \
+	02_update-cache.test \
 	10_crawl-simple.test \
 	11_crawl-cookies.test \
 	11_crawl-idna.test \
Author	SHA1	Message	Date
Xavier Roche	b80ee793ac	Bound the legacy .dat cache readers (cache_rstr / cache_brstr) cache_rstr() read an attacker-controlled length (clamped only to 32768) from a CACHE-1.x .dat and fread() it straight into fixed htsblk fields (r.msg[80], r.contenttype[64], ...) with no destination bound -- a heap/stack overflow from a crafted/old cache (the audit's S1). cache_brstr() (the in-memory variant) had the same shape and, worse, no length cap at all. Thread a destination size into both: - cache_rstr stores at most s_size-1 bytes and fseek()s past the remainder so the next field stays aligned (the field may be longer than the destination in a tampered cache). - cache_brstr caps the length and bounds the copy. Update every caller (htscache.c and htscoremain.c) to pass sizeof(field) / HTS_URLMAXSIZE*2. cache_rstr_addr already malloc()s to the read size, so it is left as is. Remove the dead cache_quickbrstr (no callers). A dedicated cache self-test (create/read/update) follows separately. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Signed-off-by: Xavier Roche <roche@httrack.com>	2026-06-14 16:41:17 +02:00
Xavier Roche	d12456c1e8	Merge pull request #341 from xroche/test/cache-update Add an offline update/cache regression test	2026-06-14 16:31:42 +02:00
Xavier Roche	a52a2b146c	Add an offline update/cache regression test Every crawl test runs httrack exactly once (crawl-test.sh), so the cache read / update path (cache_readex) -- recently touched by the buffer-bounding work -- had zero regression coverage: the cache was written but never read back. Add tests/02_update-cache.test, a self-contained file:// two-pass test (no network, always runs): mirror a local site, re-mirror it unchanged (the cache- read pass must complete with no errors -- guards a crash/abort in cache_readex), then change a source file and re-mirror (the update must pick up the new content -- guards the update decision that reads the cached metadata). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Signed-off-by: Xavier Roche <roche@httrack.com>	2026-06-14 16:29:45 +02:00
Xavier Roche	226a38d3d0	Merge pull request #340 from xroche/cleanup/htscache-bounds Bound htscache.c cache-field and save-name copies	2026-06-14 15:58:04 +02:00
Xavier Roche	1e463f65a5	Bound htscache.c cache-field and save-name copies ZIP_READFIELD_STRING (the cached ZIP-header field reader) copied attacker-influenced cache-file values into fixed htsblk fields with an unchecked strcpybuff -- benign for the char[] fields, but r.location is a char* (degrades to raw strcpy). Thread the destination size into the macro: sizeof(field) for the array fields, HTS_URLMAXSIZE2 for r.location (it points into a buffer of that size, in both the caller-supplied and the location_default case). Also bound cache_readex's return_save copy (its one non-NULL caller passes a HTS_URLMAXSIZE2 buffer), the exact-sized malloc copy in cache_rstr's default path (strlen(defaultdata)+1), and replace the two strcpybuff(r.location, "") clears with a direct r.location[0] = '\0'. htscache.c pointer-destination warnings 6 -> 0. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Signed-off-by: Xavier Roche <roche@httrack.com>	2026-06-14 15:43:04 +02:00
Xavier Roche	09ed9968cd	Merge pull request #339 from xroche/cleanup/htsbauth-bounds Bound htsbauth cookie/auth buffer writes	2026-06-14 15:32:37 +02:00
Xavier Roche	ad6915e3cc	Bound htsbauth cookie/auth buffer writes cookie_get(), bauth_prefix(), cookie_insert() and cookie_delete() all wrote into caller-provided char* buffers via unchecked strcpybuff/strcatbuff/strncatbuff (the pointer-destination case). Bound them: - cookie_get: write the extracted field with htsbuff over the buffer's 8192-byte contract (all callers use char[8192]). - bauth_prefix: copy host+path with strlcpybuff/strlcatbuff bounded to the caller's HTS_URLMAXSIZE*2 buffer. - cookie_insert/cookie_delete: thread the destination capacity (the cookie store's max_len minus the cursor offset) and use strlcpybuff/strlcatbuff; update cookie_add/cookie_del to pass it. Add cookie_get field-extraction asserts to basic_selftests (run via -#7) rather than a new -# digit. Translated the touched French comments. htsbauth.c pointer-destination warnings 9 -> 0. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Signed-off-by: Xavier Roche <roche@httrack.com>	2026-06-14 15:29:33 +02:00
Xavier Roche	4a5580dec0	Merge pull request #338 from xroche/cleanup/htswizard-bounds Build wizard auto-filter rules with htsbuff (bounded)	2026-06-14 14:37:56 +02:00
Xavier Roche	f1d35e7691	Build wizard auto-filter rules with htsbuff (bounded) hts_acceptlink_()'s auto-generated allow/deny rules built _FILTERS[0] -- a filter slot of HTS_URLMAXSIZE2 bytes -- via unchecked strcpybuff/strcatbuff/ strncatbuff on the char slot, and HT_INSERT_FILTERS0 shifted slots with an unchecked strcpybuff. Convert each rule builder to an htsbuff over the slot (new local HTS_FILTER_SLOT_SIZE, matching the stride allocated by filters_init()), and bound the slot-shift copy with strlcpybuff. Behavior preserved: old vs new produce byte-identical mirrors across four crawl configurations on a local multi-directory site (the auto-rules fire for primary links on normal crawls). Touched French comments translated. htswizard.c pointer-destination warnings 30 -> 0. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Signed-off-by: Xavier Roche <roche@httrack.com>	2026-06-14 14:36:21 +02:00
Xavier Roche	6d7db83726	Merge pull request #336 from xroche/cleanup/htsalias-bounds Bound optalias_check's output buffers (fix S1 overflow)	2026-06-14 13:50:38 +02:00
Xavier Roche	335c2c4b2a	Merge pull request #337 from xroche/docs/governance Add contributor governance: CONTRIBUTING, COC, SECURITY, DCO	2026-06-14 13:47:44 +02:00
Xavier Roche	62be177e35	Add obfuscated personal email as alternate security contact Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Signed-off-by: Xavier Roche <roche@httrack.com>	2026-06-14 13:47:15 +02:00
Xavier Roche	edd52bf3be	Bound optalias_check's output buffers (thread their sizes) optalias_check() wrote into caller-provided char* buffers with unchecked ops: the param0 case did strcpybuff/strcatbuff of command+param into return_argv[0], which can exceed the buffer, and the syntax-error paths sprintf()'d an option name into return_error -- which is only 256 bytes in the config-file caller, so a long option overflows it. Both are the overflow the audit flagged. Thread return_argv_size and return_error_size through the (internal, non-exported) signature; copy with strlcpybuff/strlcatbuff and format with snprintf, so an over-long value aborts/truncates instead of overrunning. Update both callers to pass their real sizes. Leaves the shared cmdl_ins macro (the cmdl_* family wants its block size threaded too -- a separate cleanup). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-14 13:47:12 +02:00
Xavier Roche	452a9f6c67	Add contributor governance: CONTRIBUTING, COC, SECURITY, DCO httrack had no community-health files. Add a short CONTRIBUTING (PR/style basics, security-sensitivity, an outcome-only AI-assistance policy), the Contributor Covenant 2.1 as CODE_OF_CONDUCT, and a SECURITY policy with a verified-reproduction bar for AI-assisted reports. Require a Signed-off-by (DCO) on every commit and enforce it in CI via a new pull_request-only job. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Signed-off-by: Xavier Roche <roche@httrack.com>	2026-06-14 13:41:19 +02:00
Xavier Roche	9eb2a344a9	Merge pull request #335 from xroche/cleanup/infostatuscode-const Return HTTP status reason phrases via a const-returning switch	2026-06-14 13:18:16 +02:00
Xavier Roche	348a7d8cb2	Return HTTP status reason phrases via a const-returning switch infostatuscode() was a ~60-case switch, each arm strcpybuff()-ing a literal into the caller's char* msg: 42 unchecked pointer-destination copies of static data. Keep the same O(1) switch dispatch but have it return the phrase instead of copying -- new public infostatuscode_const(int) -> const char* (or NULL) -- and do the copy in a thin wrapper. infostatuscode() preserves exact behavior: a known code overwrites msg; an unknown code keeps any caller-provided message, else writes "Unknown error". The single remaining copy uses strlcpybuff with the documented 64-byte minimum (longest phrase is 31; all callers pass >= 80). Drops 42 pointer-destination warnings (htslib.c 56 -> 14; tree 179 -> 137). No dispatch regression: it stays a switch (jump table), no allocation, no per-call scan. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-14 13:14:23 +02:00
Xavier Roche	5f81741ac5	Merge pull request #332 from xroche/cleanup/url_savename-htsbuff Convert the url_savename template renderer to htsbuff	2026-06-14 13:01:32 +02:00
Xavier Roche	0cf14c4e88	Convert the url_savename template renderer to htsbuff The savename_type == -1 userdef renderer walked afs->save with a raw char* cursor, doing "b += strlen(b)" after each write, and strcpybuff(b, ...) on that char* was unchecked (the pointer-destination case). That manual pointer math is where the function's off-by-one / strlen-based hazards lived. Convert the cursor to an htsbuff over afs->save (capacity sizeof = the full HTS_URLMAXSIZE*2 buffer): every append is now bounds-checked and the pointer math is gone. The loop's truncation guard becomes "sb.len < HTS_URLMAXSIZE", preserving the existing cap-at-1024 behavior; the 2x buffer means a write only aborts where it would previously have overrun. Add htsbuff_catc for the single-character appends ('%', '.', literal copy). Removes 35 pointer-destination warnings (htsname.c 51 -> 9; the renderer is now warning-free). Behavior verified identical: the pre-change and new binaries produce byte-identical output across 14 -N templates (%n %N %t %p %h %H %M %q %r %% %[param], the short %s variants, and literals) crawling a local site. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-14 12:59:29 +02:00
Xavier Roche	29a07ff487	Merge pull request #334 from xroche/cleanup/git-format-hook Add an opt-in pre-commit hook that auto-formats changed C lines	2026-06-14 12:58:42 +02:00