GNU bug report logs - #33793
sed bug with regular expressions
[Message part 1 (text/plain, inline)]
Your message dated Tue, 18 Dec 2018 12:23:16 -0600
with message-id <1e16a005-8beb-8b86-01b8-5fabb4da6d33 <at> redhat.com>
and subject line Re: bug#33793: sed bug with regular expressions
has caused the debbugs.gnu.org bug report #33793,
regarding sed bug with regular expressions
to be marked as done.
(If you believe you have received this mail in error, please contact
help-debbugs <at> gnu.org.)
--
33793: http://debbugs.gnu.org/cgi/bugreport.cgi?bug=33793
GNU Bug Tracking System
Contact help-debbugs <at> gnu.org with problems
[Message part 2 (message/rfc822, inline)]
[Message part 3 (text/plain, inline)]
Hi. I've found a bug in sed. Here is how to reproduce it:
1) Run bash
2) Exec command
echo weather -5.0 | sed 's/[^0-9\-\.]//g'
3) You will get "5.0". Expected output is "-5.0"
BUT
If you exec
echo weather -5.0 | sed 's/[^0-9\.\-]//g'
you'll get the correct output "-5.0".
I am using GNU sed version 4.5 on Manjaro Linux.
[Message part 5 (message/rfc822, inline)]
tag 33793 notabug
thanks
On 12/18/18 6:50 AM, Uladzimir Panasiuk wrote:
> Hi. I've found a bug in sed. Here is how to reproduce it:
> 1) Run bash
> 2) Exec command
> echo weather -5.0 | sed 's/[^0-9\-\.]//g'
You used two range expressions in this regex (0-9 and \-\), but the result
is the same as if you had used this regex with only one range expression:
's/[^0-9\.]//g'
Either way, you requested all characters except for the 10 digits, a
literal backslash, or a literal dot. Remember, the range expression \-\
runs from backslash to backslash, so it selects only the single character
backslash. Since '-' is not listed inside the [^...] expression, it is
among the characters to delete, and sed correctly strips it.
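The equivalence can be checked directly. The following is a minimal sketch, assuming GNU sed and a POSIX shell; the sample input adds a literal backslash (not part of the original report) so that its survival is visible, and the variable names are only illustrative:

```shell
# Sample input: the reported text plus a literal backslash (an assumption
# added here so the effect on '\' can be observed).
input='weather \ -5.0'

# Bracket expression from the report: ranges 0-9 and \-\, plus a dot.
with_range=$(printf '%s\n' "$input" | sed 's/[^0-9\-\.]//g')

# Shorter equivalent: range 0-9, a literal backslash, and a dot.
without_range=$(printf '%s\n' "$input" | sed 's/[^0-9\.]//g')

# Both lines print \5.0: the backslash is kept, the minus sign is deleted.
printf '%s\n' "$with_range" "$without_range"
```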
> 3) You will get "5.0". Expected output is "-5.0"
You might be remembering the behavior of perl regex, where \ inside []
is an escape character. But that's not how POSIX regex behaves - inside
[], \ is literal, and there are no escape characters.
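As a small illustration of the literal backslash, the sketch below (assuming GNU sed and a POSIX shell; the input string and variable names are made up for the example) compares [.] with [\.]. The dot needs no escaping inside [], and adding a backslash only adds backslash to the set of matched characters:

```shell
# Input with one dot and one literal backslash (illustrative only).
sample='a.b\c'

# [.] matches only a literal dot.
dot_only=$(printf '%s\n' "$sample" | sed 's/[.]/X/g')

# [\.] matches a literal backslash or a literal dot.
dot_or_backslash=$(printf '%s\n' "$sample" | sed 's/[\.]/X/g')

printf '%s\n' "$dot_only"          # prints a literal: aXb\c
printf '%s\n' "$dot_or_backslash"  # prints: aXbXc
```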
>
> BUT
> If you exec
> echo weather -5.0 | sed 's/[^0-9\.\-]//g'
Here, your regex has only one range expression, but lists \ twice. The
repetition is harmless, but it means that your expression is the same as
this shorter one:
's/[^0-9\.-]//g'
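A quick check of that equivalence, as a sketch assuming GNU sed and a POSIX shell; the extra backslash in the input is an assumption added for illustration and the variable names are hypothetical:

```shell
# The reported text plus a literal backslash, to show which characters survive.
text='weather \ -5.0'

# Second regex from the report: backslash listed twice, '-' last.
twice=$(printf '%s\n' "$text" | sed 's/[^0-9\.\-]//g')

# Shorter form: backslash listed once, '-' still last.
once=$(printf '%s\n' "$text" | sed 's/[^0-9\.-]//g')

# Both lines print \-5.0: digits, backslash, dot, and minus are all kept.
printf '%s\n' "$twice" "$once"
```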
It is not obvious from your input whether you intended to keep literal
backslashes or not, but if not, you probably meant to write:
's/[^0-9.-]//g'
with no backslash, and with the - last (as that is one of the few places
that you can write - to be matched as itself rather than treated as a
range operator between neighboring characters).
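For reference, a short sketch of the suggested form, assuming GNU sed and a POSIX shell (variable names are illustrative, and the second input with a backslash is not from the report). It also tries '-' placed first after the ^, which is another position where POSIX treats it as a literal:

```shell
# Suggested expression on the input from the report.
fixed=$(printf '%s\n' 'weather -5.0' | sed 's/[^0-9.-]//g')

# Same expression on input containing a backslash: the backslash is now removed.
fixed_backslash=$(printf '%s\n' 'weather \ -5.0' | sed 's/[^0-9.-]//g')

# Alternative placement: '-' immediately after the negating '^'.
dash_first=$(printf '%s\n' 'weather -5.0' | sed 's/[^-0-9.]//g')

# All three lines print -5.0
printf '%s\n' "$fixed" "$fixed_backslash" "$dash_first"
```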
I'm closing this as not a bug, but feel free to reply with further
questions or comments.
--
Eric Blake, Principal Software Engineer
Red Hat, Inc. +1-919-301-3266
Virtualization: qemu.org | libvirt.org
This bug report was last modified 6 years and 214 days ago.