GNU bug report logs - #64128 regexp parser zero-width assertion bugs
On 19 June 2023 at 05:04, Stefan Monnier <monnier <at> iro.umontreal.ca> wrote:
> `^` is only special if it's at the beginning of a group, so `^*` will
> always treat this * as a literal, right?
> "Similarly" `$` is only special if it's at the end of a group, so `$*` will
> always be a repetition of the $ character no?
Yes, ^ and $ have additional rules that decide when they are plain literals, and in those cases they are not subject to these bugs at all.
The literal-splitting powers of ^ have now (075e77ac44) been removed.
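
To make the positional rule for ^ and $ concrete, here is a rough Python sketch of it, not the actual Emacs parser. It assumes the rule as usually documented: ^ is an anchor only at the start of the regexp or right after \(, \(?: or \|, and $ is an anchor only at the end of the regexp or right before \) or \|. The function name `classify_anchors` is made up for illustration, and bracket expressions and numbered groups are deliberately ignored.

```python
from typing import List, Tuple


def classify_anchors(regexp: str) -> List[Tuple[int, str, str]]:
    """Classify each unescaped ^ or $ as "anchor" or "literal".

    Simplified model (assumption): bracket expressions such as [^a]
    and explicitly numbered groups are not handled.
    """
    results = []
    i = 0
    n = len(regexp)
    at_group_start = True  # the start of the regexp counts as a group start
    while i < n:
        c = regexp[i]
        if c == "\\" and i + 1 < n:
            nxt = regexp[i + 1]
            i += 2
            if nxt == "(":
                # Skip the shy-group marker so that \(?:^ also counts.
                if regexp.startswith("?:", i):
                    i += 2
                at_group_start = True
            elif nxt == "|":
                at_group_start = True
            else:
                # Any other escape, including \^ and \$, is ordinary here.
                at_group_start = False
            continue
        if c == "^":
            results.append((i, c, "anchor" if at_group_start else "literal"))
        elif c == "$":
            rest = regexp[i + 1:]
            at_group_end = (
                rest == "" or rest.startswith("\\)") or rest.startswith("\\|")
            )
            results.append((i, c, "anchor" if at_group_end else "literal"))
        at_group_start = False
        i += 1
    return results


print(classify_anchors("$*"))
print(classify_anchors(r"\(^a\|b$\)"))
```

Under this model the $ in `$*` is followed by `*`, so it is not at the end of a group and is classified as a literal, which matches the reading quoted above that `$*` repeats the $ character.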
> So the remaining problematic elements are \` \' \b and \B
\`* has been observed, so we probably need to keep that working as well.
> I suspect if we don't want to signal errors, the next best thing is to
> treat them like group B.
Yes, maybe; they are less likely to be followed by an operator-literal, but it would also be good to have all zero-width assertions work the same way.
On the other hand, it can't be worse than what we have now, as long as we get rid of the "quack,\\b*" semantics.
This bug report was last modified 2 years and 2 days ago.