GNU bug report logs - #26409
plus (`+`) not a metacharacter with --posix option, escaped or not

Package: sed

Reported by: Jordan Torbiak <torbiak <at> gmail.com>

Date: Sun, 9 Apr 2017 00:56:01 UTC

Severity: normal

Tags: fixed

Done: Assaf Gordon <assafgordon <at> gmail.com>

Bug is archived. No further changes may be made.


Report forwarded to bug-sed <at> gnu.org:
bug#26409; Package sed. (Sun, 09 Apr 2017 00:56:02 GMT)

Acknowledgement sent to Jordan Torbiak <torbiak <at> gmail.com>:
New bug report received and forwarded. Copy sent to bug-sed <at> gnu.org. (Sun, 09 Apr 2017 00:56:02 GMT)

Message #5 received at submit <at> debbugs.gnu.org:

From: Jordan Torbiak <torbiak <at> gmail.com>
To: bug-sed <at> gnu.org
Subject: plus (`+`) not a metacharacter with --posix option, escaped or not
Date: Sat, 8 Apr 2017 18:37:15 -0600
The plus character can't seem to be used as a metacharacter when both the
`-E` and `--posix` options are given.

This works as expected:

$ echo 'hi+' | sed -E 's/(.+)/{\1}/'
{hi+}

This does not:

$ echo 'hi+' | sed --posix -E 's/(.+)/{\1}/'
h{i+}

And escaping the plus sign doesn't give it special meaning, either:

$ echo 'hi+' | sed --posix -E 's/(.\+)/{\1}/'
h{i+}

I don't believe this falls under the
[Regex syntax clash](https://www.gnu.org/software/sed/manual/sed.html#index-Non_002dbugs_002c-regex-syntax-clashes)
non-bug category, as all the
[POSIX specs since 2008](http://pubs.opengroup.org/onlinepubs/9699919799.2008edition/basedefs/V1_chap09.html#tag_09_04_03)
say "The <asterisk>, <plus-sign>, <question-mark>, and <left-brace> shall
be special except when used in a bracket expression."
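
The three commands above can be collected into a small regression script. This is only a sketch: `check` is a hypothetical helper, not part of sed, and the expected values are what POSIX ERE semantics imply (an unescaped `+` repeats the preceding expression, while `\+` matches a literal plus sign). On that reading, the third case is expected to keep printing `h{i+}` even on a sed where the second case behaves correctly.

```shell
#!/bin/sh
# check LABEL EXPECTED INPUT SED_ARGS...   (hypothetical helper)
# Feeds one line of INPUT to sed and compares the result with EXPECTED.
check() {
    label=$1 expected=$2 input=$3
    shift 3
    # A sed that rejects an option (e.g. a non-GNU sed given --posix)
    # is reported as a distinct result instead of aborting the script.
    actual=$(printf '%s\n' "$input" | sed "$@" 2>/dev/null) || actual='<sed failed>'
    if [ "$actual" = "$expected" ]; then
        printf 'ok    %s\n' "$label"
    else
        printf 'FAIL  %s: got %s, expected %s\n' "$label" "$actual" "$expected"
    fi
}

# Unescaped '+' is a repetition operator in an ERE.
check 'ERE plus'                  '{hi+}' 'hi+' -E 's/(.+)/{\1}/'
check 'ERE plus, --posix'         '{hi+}' 'hi+' --posix -E 's/(.+)/{\1}/'
# Escaped '\+' is a literal plus sign in an ERE.
check 'ERE escaped plus, --posix' 'h{i+}' 'hi+' --posix -E 's/(.\+)/{\1}/'
```

On a sed that shows the behaviour reported here, the second line would be the one flagged as `FAIL`.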

Information forwarded to bug-sed <at> gnu.org:
bug#26409; Package sed. (Sun, 09 Apr 2017 17:37:01 GMT)

Message #8 received at 26409 <at> debbugs.gnu.org:

From: Assaf Gordon <assafgordon <at> gmail.com>
To: Jordan Torbiak <torbiak <at> gmail.com>
Cc: 26409 <at> debbugs.gnu.org
Subject: Re: bug#26409: plus (`+`) not a metacharacter with --posix option,
 escaped or not
Date: Sun, 9 Apr 2017 13:36:37 -0400
Hello,

> On Apr 8, 2017, at 20:37, Jordan Torbiak <torbiak <at> gmail.com> wrote:
> 
> The plus character can't seem to be used as a metacharacter when both the
> `-E` and `--posix` options are given.
> [...]
> $ echo 'hi+' | sed --posix -E 's/(.+)/{\1}/'
> h{i+}

Thank you for the report.

I can confirm this is reproducible.

I think the cause is that '--posix' sets the sed variable
'posixicity=POSIXLY_BASIC', and then all regexes are compiled
with the RE_LIMITED_OPS option, which disables +, ? and |.

[1] regex options in sed:
    https://git.savannah.gnu.org/cgit/sed.git/tree/sed/regexp.c#n90
[2] gnulib RE_LIMITED_OPS:
    https://git.savannah.gnu.org/cgit/gnulib.git/tree/lib/regex.h#n134

Regarding POSIX compliance: this sounds like it needs to be fixed,
but I'm not an expert. Perhaps others can chime in?

I think a simple 'if extended_regexp_flags & REG_EXTENDED' inside
the 'switch (posixicity)' can fix this. I can send a patch a bit later.
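
The distinction this suggestion relies on can be seen from the command line. In basic mode GNU sed spells the operator `\+`, which is a GNU extension that '--posix' is meant to switch off; in extended mode (`-E`) a bare `+` is required by POSIX. A rough sketch, assuming GNU sed; `show` is a hypothetical helper, and no particular output is claimed for the '--posix' lines since that depends on the sed version:

```shell
#!/bin/sh
# show LABEL SED_ARGS...   (hypothetical helper)
# Runs sed on the single input line 'hi+' and prints the labelled result.
show() {
    label=$1
    shift
    result=$(printf 'hi+\n' | sed "$@" 2>/dev/null) || result='<sed failed>'
    printf '%s: %s\n' "$label" "$result"
}

# Basic mode: '\+' is a GNU extension, so disabling it under --posix is intended.
show 'BRE'         's/\(.\+\)/{\1}/'
show 'BRE --posix' --posix 's/\(.\+\)/{\1}/'
# Extended mode: '+' is a standard ERE operator, so --posix should not disable it.
show 'ERE'         -E 's/(.+)/{\1}/'
show 'ERE --posix' --posix -E 's/(.+)/{\1}/'
```

Comparing the two '--posix' lines shows whether the limited-operator setting is being applied only to basic regular expressions or to extended ones as well.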

regards,
 - assaf

Information forwarded to bug-sed <at> gnu.org:
bug#26409; Package sed. (Sun, 09 Apr 2017 21:49:01 GMT)

Message #11 received at 26409 <at> debbugs.gnu.org:

From: Assaf Gordon <assafgordon <at> gmail.com>
To: Jordan Torbiak <torbiak <at> gmail.com>
Cc: 26409 <at> debbugs.gnu.org
Subject: Re: bug#26409: plus (`+`) not a metacharacter with --posix option,
 escaped or not
Date: Sun, 9 Apr 2017 17:47:53 -0400
Attached is a suggested patch to fix it.

comments welcomed,
 - assaf


[0001-sed-enable-special-meaning-of-with-E-posix.patch (application/octet-stream, attachment)]

Information forwarded to bug-sed <at> gnu.org:
bug#26409; Package sed. (Wed, 19 Apr 2017 01:02:01 GMT)

Message #14 received at 26409 <at> debbugs.gnu.org:

From: Assaf Gordon <assafgordon <at> gmail.com>
To: Jordan Torbiak <torbiak <at> gmail.com>
Cc: 26409 <at> debbugs.gnu.org
Subject: Re: bug#26409: plus (`+`) not a metacharacter with --posix option,
 escaped or not
Date: Tue, 18 Apr 2017 21:00:58 -0400
tag 26409 fixed
close 26409
stop

Hello,

> On Apr 9, 2017, at 17:47, Assaf Gordon <assafgordon <at> gmail.com> wrote:
> 
> Attached is a suggested patch to fix it.

With no further comments, I've pushed the fix here:
 https://git.savannah.gnu.org/cgit/sed.git/commit/?id=11a2a701e

regards,
 - assaf

Added tag(s) fixed. Request was from Assaf Gordon <assafgordon <at> gmail.com> to control <at> debbugs.gnu.org. (Wed, 19 Apr 2017 01:02:02 GMT)

Bug closed; send any further explanations to 26409 <at> debbugs.gnu.org and Jordan Torbiak <torbiak <at> gmail.com>. Request was from Assaf Gordon <assafgordon <at> gmail.com> to control <at> debbugs.gnu.org. (Wed, 19 Apr 2017 01:02:02 GMT)

Bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Wed, 17 May 2017 11:24:03 GMT)

This bug report was last modified 8 years and 92 days ago.
