GNU bug report logs - #33793
sed bug with regular expressions


Package: sed;

Reported by: Uladzimir Panasiuk <v.s.panasyuk <at> gmail.com>

Date: Tue, 18 Dec 2018 17:19:02 UTC

Severity: normal

Tags: notabug

Done: Eric Blake <eblake <at> redhat.com>

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 33793 in the body.
You can then email your comments to 33793 AT debbugs.gnu.org in the normal way.


Report forwarded to bug-sed <at> gnu.org:
bug#33793; Package sed. (Tue, 18 Dec 2018 17:19:02 GMT)

Acknowledgement sent to Uladzimir Panasiuk <v.s.panasyuk <at> gmail.com>:
New bug report received and forwarded. Copy sent to bug-sed <at> gnu.org. (Tue, 18 Dec 2018 17:19:02 GMT)

Message #5 received at submit <at> debbugs.gnu.org:

From: Uladzimir Panasiuk <v.s.panasyuk <at> gmail.com>
To: bug-sed <at> gnu.org
Subject: sed bug with regular expressions
Date: Tue, 18 Dec 2018 15:50:49 +0300
Hi. I've found a bug in sed. Here is how to reproduce it:
1) Run bash
2) Execute the command:
echo weather -5.0 | sed 's/[^0-9\-\.]//g'
3) You will get "5.0". Expected output is "-5.0"

BUT
If you exec
echo weather -5.0 | sed 's/[^0-9\.\-]//g'
you'll get the correct output "-5.0".

I am using GNU sed version 4.5 on Manjaro Linux.

Added tag(s) notabug. Request was from Eric Blake <eblake <at> redhat.com> to control <at> debbugs.gnu.org. (Tue, 18 Dec 2018 18:24:02 GMT)

Reply sent to Eric Blake <eblake <at> redhat.com>:
You have taken responsibility. (Tue, 18 Dec 2018 18:24:03 GMT)

Notification sent to Uladzimir Panasiuk <v.s.panasyuk <at> gmail.com>:
bug acknowledged by developer. (Tue, 18 Dec 2018 18:24:03 GMT)

Message #12 received at 33793-done <at> debbugs.gnu.org:

From: Eric Blake <eblake <at> redhat.com>
To: Uladzimir Panasiuk <v.s.panasyuk <at> gmail.com>, 33793-done <at> debbugs.gnu.org,
 GNU bug control <control <at> debbugs.gnu.org>
Subject: Re: bug#33793: sed bug with regular expressions
Date: Tue, 18 Dec 2018 12:23:16 -0600
tag 33793 notabug
thanks

On 12/18/18 6:50 AM, Uladzimir Panasiuk wrote:
> Hi. I've found a bug in sed. Here is how to reproduce it:
> 1) Run bash
> 2) Execute the command:
> echo weather -5.0 | sed 's/[^0-9\-\.]//g'

You used two range expressions in this regex (0-9 and \-\), but the
result is the same as if you had used this regex with only one range
expression:

's/[^0-9\.]//g'

Either way, you requested all characters except for the 10 digits, a
literal backslash, or a literal dot.  Remember, the range expression \-\
runs from backslash to backslash, so it selects the single character
backslash.  Since '-' is not listed in the negated [] expression, it
counts as a match, and sed correctly strips it.
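
To check the equivalence, here is a minimal sketch, assuming a POSIX
shell and a POSIX-conforming sed.  The sample input with an embedded
backslash is made up for illustration and is not from the original
report; the variable names are likewise hypothetical.

```shell
# Hypothetical sample input containing a backslash (not from the report).
input='weather\ -5.0'

# The reported expression: ranges 0-9 and \-\, plus a literal dot.
out_two_ranges=$(printf '%s\n' "$input" | sed 's/[^0-9\-\.]//g')

# The shorter equivalent: range 0-9, a literal backslash, a literal dot.
out_one_range=$(printf '%s\n' "$input" | sed 's/[^0-9\.]//g')

# Print both results, one per line.
printf '%s\n' "$out_two_ranges" "$out_one_range"
```

Both lines should read \5.0: the backslash survives because it is in
the bracket expression, and the '-' is stripped because it is not.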

> 3) You will get "5.0". Expected output is "-5.0"

You might be remembering the behavior of perl regex, where \ inside [] 
is an escape character.  But that's not how POSIX regex behaves - inside 
[], \ is literal, and there are no escape characters.
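
The difference can be seen with a small sketch, again assuming a POSIX
shell and sed; the input string and variable names are hypothetical.

```shell
# Hypothetical sample input with one backslash and one dot.
input='a\b.c'

# Inside a bracket expression, \ is literal: [\.] matches a backslash or a dot.
in_brackets=$(printf '%s\n' "$input" | sed 's/[\.]/X/g')

# Outside brackets, \. is an escaped dot and matches only the dot.
outside_brackets=$(printf '%s\n' "$input" | sed 's/\./X/g')

# Print both results, one per line.
printf '%s\n' "$in_brackets" "$outside_brackets"
```

The first line should read aXbXc (backslash and dot both replaced), the
second a\bXc (only the dot replaced).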

> 
> BUT
> If you exec
> echo weather -5.0 | sed 's/[^0-9\.\-]//g'

Here, your regex only has one range expression, but lists \ twice.  The
repetition is harmless, but means that your expression is the same as
this shorter one:

's/[^0-9\.-]//g'

It is not obvious from your input whether you intended to be filtering 
out literal backslash or not, but if not, you probably meant to write:

's/[^0-9.-]//g'

with no backslash, and with the - last (as that is one of the few places 
that you can write - to be matched as itself rather than treated as a 
range operator between neighboring characters).
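
A minimal sketch comparing the placements, assuming a POSIX shell and
sed.  The input with a backslash is made up for illustration, and the
variant with - first (directly after the ^) relies on the other position
where POSIX treats - as literal.

```shell
# Hypothetical sample input containing a backslash (not from the report).
input='weather\ -5.0'

# Backslash still listed: it is kept along with digits, dot, and '-'.
keeps_backslash=$(printf '%s\n' "$input" | sed 's/[^0-9\.-]//g')

# No backslash listed, '-' placed last: the backslash is stripped too.
dash_last=$(printf '%s\n' "$input" | sed 's/[^0-9.-]//g')

# Same set with '-' placed first, directly after the ^.
dash_first=$(printf '%s\n' "$input" | sed 's/[^-0-9.]//g')

# Print all three results, one per line.
printf '%s\n' "$keeps_backslash" "$dash_last" "$dash_first"
```

The first line should read \-5.0 and the other two -5.0.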

I'm closing this as not a bug, but feel free to reply with further 
questions or comments.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org




bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Wed, 16 Jan 2019 12:24:06 GMT)

This bug report was last modified 6 years and 213 days ago.



GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.