Compare commits

1 Commit

Author SHA1 Message Date
Xavier Roche
794404bba2 test: characterize wildcard class escape behavior
Add -#0 self-test cases for backslash escapes inside a '*[...]' class.
They pin two quirks of the current decoder: '\X' matches both X and the
backslash itself, and a literal ']' cannot be a class member because the
parser stops at the first ']' (escaped or not). The latter is why the
filter guide's '*[\[\]]' = "the [ or ] character" claim is wrong (#148):
it parses as the class {[,\} plus a trailing literal ']'. These tests
lock the behavior down so a later matcher fix is a deliberate change.

refs #148
2026-06-13 10:15:45 +02:00

@@ -47,3 +47,25 @@ match '*foo*bar' 'foozbar'
# '?' is the query-string marker, not a single-char wildcard
nomatch 'a?c' 'abc'
# backslash escapes a metacharacter inside a class so it is matched literally.
# Quirk: the decoder also adds the backslash itself to the set, so '\X' matches
# both X and '\'. These assertions pin that behavior.
match '*[\*]' '*'
match '*[\*]' "\\"
nomatch '*[\*]' 'a'
match '*[\\]' "\\"
nomatch '*[\\]' 'a'
match '*[\[]' '['
match '*[\[]' "\\"
nomatch '*[\[]' 'a'
# A literal ']' cannot be a class member: the class parser stops at the first
# ']', escaped or not. So '*[\[\]]' does NOT mean "the [ or ] character" as the
# filter guide claims (GitHub #148); it parses as the class {'[','\'} followed
# by a trailing literal ']'. These assertions document the current (buggy)
# behavior so any future matcher fix is a deliberate, visible change.
nomatch '*[\[\]]' '[' # not matched, despite the docs
match '*[\[\]]' ']' # only via the empty class-match + trailing ']'
match '*[\[\]]' '[]' # one of {'[','\'} then the trailing ']'
nomatch '*[\[\]]' '[]x'
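
The two quirks pinned by these assertions can be reproduced with a small toy model. The Python sketch below is not the project's decoder: `decode_class` and `matches` are hypothetical names, ranges and comma separators inside a class are left out, and it assumes that `*[...]` consumes zero or more class characters (inferred from the "empty class-match" comment above). It only illustrates why a class that ends at the first `]` and keeps the backslash as a member produces the results asserted in the diff.

```python
def decode_class(body):
    """Toy decoder: every raw character joins the set, backslash included."""
    members = set()
    i = 0
    while i < len(body):
        ch = body[i]
        members.add(ch)  # quirk 1: the backslash itself becomes a member
        if ch == "\\" and i + 1 < len(body):
            members.add(body[i + 1])  # the escaped character, taken literally
            i += 2
        else:
            i += 1
    return members


def matches(pattern, text):
    """Toy matcher: '*' is any run, '*[...]' is a run of class members."""
    if not pattern:
        return not text
    if pattern.startswith("*["):
        # quirk 2: the class ends at the FIRST ']', escaped or not
        end = pattern.find("]", 2)
        if end != -1:
            members = decode_class(pattern[2:end])
            rest = pattern[end + 1:]
            n = 0
            while True:
                if matches(rest, text[n:]):
                    return True
                if n < len(text) and text[n] in members:
                    n += 1  # extend the class run by one character
                else:
                    return False
    if pattern[0] == "*":
        rest = pattern[1:]
        return any(matches(rest, text[n:]) for n in range(len(text) + 1))
    # any other character, '?' included, only matches itself
    return bool(text) and pattern[0] == text[0] and matches(pattern[1:], text[1:])


# '*[\[\]]' splits into the class body '\[\' plus a trailing literal ']'
print(sorted(decode_class("\\[\\")))  # members are '[' and '\'
print(matches(r"*[\[\]]", "["))       # False: the trailing ']' is missing
print(matches(r"*[\[\]]", "[]"))      # True: '[' from the class, then ']'
```

Under this model, making `]` a class member would mean changing where the class body ends, which is the kind of deliberate matcher change the commit message anticipates.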