httrack/html
Xavier Roche 799ec88dc7 filters: fix escaped brackets inside *[...] character classes (#440)
* filters: decode escaped chars correctly inside *[...] classes

The escape branch in strjoker probed joker[i+2] instead of the current
char, so a backslash escape only worked as the first class member:
'*[\[\]]' (documented as "the [ or ] character") matched only ']', and
'*[a,\[]' dropped the 'a'. The loop also treated any ']' as the class
terminator, so an escaped ']' could never be a member.

Decode the escape first in the loop body: a backslash takes the next char
as the literal member (only that char, not also the backslash the old code
added), and an escaped ']' is consumed before the terminator check. So
'*[\[\]]' now matches both brackets, and escape precedes the range/size
checks ('\-' '\,' '\<' become literal members). The self-test previously
pinned the buggy output as expected; it now asserts the documented
behavior and fails against the old matcher.
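
A minimal sketch of the "decode the escape first" ordering described above. It is not HTTrack's strjoker: the helper name `class_has_member` and its calling convention (the pointer starts just past `*[`) are assumptions made for illustration, and ranges and size checks are left out.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper, not HTTrack's strjoker. `cls` points just past "*[".
   Members are single characters separated by ','; the class ends at the
   first unescaped ']' (or at the end of the string). */
static bool class_has_member(const char *cls, char c) {
  bool found = false;
  size_t i = 0;

  while (cls[i] != '\0') {
    if (cls[i] == '\\' && cls[i + 1] != '\0') {
      /* Escape is decoded before anything else: only the next char becomes
         a member (the backslash itself does not), and an escaped ']' is
         consumed here, so it never reaches the terminator check. */
      if (cls[i + 1] == c)
        found = true;
      i += 2;
    } else if (cls[i] == ']') {
      break; /* unescaped terminator */
    } else {
      /* Plain member; an unescaped ',' is only a separator. */
      if (cls[i] != ',' && cls[i] == c)
        found = true;
      i++;
    }
  }
  return found;
}
```

With this ordering, `class_has_member("\\[\\]]", '[')` and `class_has_member("\\[\\]]", ']')` are both true, and `class_has_member("a,\\[]", 'a')` keeps the 'a'.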

Closes #148

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Signed-off-by: Xavier Roche <roche@httrack.com>

* filters: fix a 1-byte over-read on a truncated range *[a-

The *[...] class parser's range arm does i += 3 unconditionally, so a
pattern ending in a dangling '-' (e.g. *[a-) read one byte past the NUL:
joker[i+2] is the NUL, i jumps to len+1, and the separator skip and loop
guard then read joker[len+1]. Guard the range arm on joker[i+2] != '\0'
so a truncated range falls through to the literal-member path instead of
overshooting.
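
The guard can be sketched as follows. This is an illustrative parser under stated assumptions, not the actual strjoker code: the name `class_has_member_ranges` is hypothetical, the pointer starts just past `*[`, and escapes are omitted to keep the range arm in focus.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical sketch, not HTTrack's strjoker. `cls` points just past "*[".
   Members are single chars or lo-hi ranges, optionally separated by ','. */
static bool class_has_member_ranges(const char *cls, char c) {
  bool found = false;
  size_t i = 0;

  while (cls[i] != '\0' && cls[i] != ']') {
    /* Range arm is taken only when the upper bound exists. Without the
       cls[i + 2] != '\0' test, "a-" would advance i by 3 and the reads
       below would land one byte past the NUL. */
    if (cls[i + 1] == '-' && cls[i + 2] != '\0') {
      unsigned char lo = (unsigned char)cls[i];
      unsigned char hi = (unsigned char)cls[i + 2];
      unsigned char uc = (unsigned char)c;
      if (uc >= lo && uc <= hi)
        found = true;
      i += 3;
    } else {
      /* Literal-member path; a truncated range falls through to here. */
      if (cls[i] == c)
        found = true;
      i++;
    }
    if (cls[i] == ',') /* skip one separator; cls[i] is at worst the NUL */
      i++;
  }
  return found;
}
```

In this sketch a truncated `a-` simply yields the two literal members 'a' and '-', and every index read stays at or before the terminating NUL.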

The filter self-test now copies the pattern and string into exact-size
heap buffers so a sanitizer traps such over-reads; the pattern previously
came straight from argv (no redzone), which is why this stayed invisible.
A *[a- test case exercises it.
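
A sketch of the exact-size copy idea, assuming a build with AddressSanitizer (for example `-fsanitize=address` with GCC or Clang). The helper name `exact_copy` is hypothetical and is not taken from the HTTrack self-test.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: copy `s` into a heap block of exactly
   strlen(s) + 1 bytes. Under AddressSanitizer the redzone then starts
   right after the NUL, so a read at index len + 1 is reported.
   Returns NULL on allocation failure; the caller frees the result. */
static char *exact_copy(const char *s) {
  size_t n = strlen(s) + 1; /* include the terminating NUL */
  char *p = malloc(n);
  if (p != NULL)
    memcpy(p, s, n);
  return p;
}
```

Passing `exact_copy(argv[1])` to the matcher instead of `argv[1]` itself is what lets the sanitizer see the over-read, since the argv strings are not heap allocations with a redzone.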

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Signed-off-by: Xavier Roche <roche@httrack.com>

---------

Signed-off-by: Xavier Roche <roche@httrack.com>
Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-28 12:56:11 +02:00