GNU bug report logs - #64128
regexp parser zero-width assertion bugs

Package: emacs

Reported by: Mattias Engdegård <mattias.engdegard <at> gmail.com>

Date: Sat, 17 Jun 2023 12:21:02 UTC

Severity: normal

From: Paul Eggert <eggert <at> cs.ucla.edu>
To: Mattias Engdegård <mattias.engdegard <at> gmail.com>
Cc: Eli Zaretskii <eliz <at> gnu.org>, Stefan Monnier <monnier <at> iro.umontreal.ca>, 64128 <at> debbugs.gnu.org
Subject: bug#64128: regexp parser zero-width assertion bugs
Date: Mon, 19 Jun 2023 13:40:06 -0700
On 2023-06-19 12:52, Mattias Engdegård wrote:

> Sure, we can turn \b and \B into group B assertions, but the patch was more conservative in nature.

OK, but we still need to fix this, as \b and \B should not be a special 
case for following "*".

> I think we have to preserve \`* meaning \`\* for compatibility, historical or not, because it's something we keep sighting in the wild.

That makes some sense, in that \` is like ^, and ^ is already a special 
case (this is true even in POSIX BREs).

In other words, how about if we change the groups from your list:

Group A: ^ $ \` \' \b \B
Group B: \< \> \_< \_> \=

to this:

Group A: ^ \`
Group B: $ \' \b \B \< \> \_< \_> \=

where "*" is ordinary after Group A and special after Group B, and there 
is no other squirrelly behavior. The same would apply to the other 
repetition operators.
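
To make the proposed split concrete, here is a rough Python sketch of the rule. It is only an illustration of the grouping, not Emacs's actual regexp parser: the function name star_is_literal_after is hypothetical, and the sketch assumes the token passed in is already acting as a zero-width assertion (it does not model the contexts in which ^ or $ are themselves ordinary characters).

```python
# Hypothetical sketch of the proposed rule; not code from Emacs.
# Group membership is copied from the proposal in this message.
GROUP_A = {"^", "\\`"}
GROUP_B = {"$", "\\'", "\\b", "\\B", "\\<", "\\>", "\\_<", "\\_>", "\\="}


def star_is_literal_after(assertion: str) -> bool:
    """Return True if "*" directly after `assertion` is an ordinary character.

    Assumes `assertion` is already acting as a zero-width assertion.
    """
    if assertion in GROUP_A:
        return True   # e.g. \`* would mean \`\*
    if assertion in GROUP_B:
        return False  # e.g. \b* would treat "*" as a repetition operator
    raise ValueError(f"not a zero-width assertion: {assertion!r}")
```

Under this sketch, the only assertions after which "*" stays ordinary are ^ and \`; every other assertion is handled uniformly.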

Attached is a proposed doc change for this, which I have not installed. 
Of course the code and etc/NEWS would need changing too.
[0001-Document-proposed-regex-fix-bug-64128.patch (text/x-patch, attachment)]
