GNU bug report logs - #26322
grep '*' VS grep -E '*'


Package: grep

Reported by: Julien Denis <ju.denis <at> gmail.com>

Date: Fri, 31 Mar 2017 14:48:03 UTC

Severity: normal

Tags: notabug

Done: Eric Blake <eblake <at> redhat.com>

Bug is archived. No further changes may be made.



Report forwarded to bug-grep <at> gnu.org:
bug#26322; Package grep. (Fri, 31 Mar 2017 14:48:03 GMT)

Acknowledgement sent to Julien Denis <ju.denis <at> gmail.com>:
New bug report received and forwarded. Copy sent to bug-grep <at> gnu.org. (Fri, 31 Mar 2017 14:48:03 GMT)

Message #5 received at submit <at> debbugs.gnu.org:

From: Julien Denis <ju.denis <at> gmail.com>
To: bug-grep <at> gnu.org
Subject: grep '*' VS grep -E '*'
Date: Fri, 31 Mar 2017 15:37:38 +0200
Hello,

Assuming that "textfile" is a regular, non-empty text file, is it
normal that grep '*' textfile returns nothing but grep -E '*' textfile
returns all the lines?
I got this using Debian 7.1 stable, where grep is version 2.20.
Would a newer grep version resolve this, or is it not a bug (but a
valid behavior of the star character in ERE)?

Thanks!
Julien




Added tag(s) notabug. Request was from Eric Blake <eblake <at> redhat.com> to control <at> debbugs.gnu.org. (Fri, 31 Mar 2017 15:43:01 GMT)

Reply sent to Eric Blake <eblake <at> redhat.com>:
You have taken responsibility. (Fri, 31 Mar 2017 15:43:02 GMT)

Notification sent to Julien Denis <ju.denis <at> gmail.com>:
bug acknowledged by developer. (Fri, 31 Mar 2017 15:43:02 GMT)

Message #12 received at 26322-done <at> debbugs.gnu.org:

From: Eric Blake <eblake <at> redhat.com>
To: Julien Denis <ju.denis <at> gmail.com>, 26322-done <at> debbugs.gnu.org
Subject: Re: bug#26322: grep '*' VS grep -E '*'
Date: Fri, 31 Mar 2017 10:42:40 -0500
tag 26322 notabug
thanks

On 03/31/2017 08:37 AM, Julien Denis wrote:
> Hello,
> 
> Assuming that "textfile" is a regular, non-empty text file, is it
> normal that grep '*' textfile returns nothing but grep -E '*' textfile
> returns all the lines?
> I got this using Debian 7.1 stable, where grep is version 2.20.
> Would a newer grep version resolve this, or is it not a bug (but a
> valid behavior of the star character in ERE)?

According to POSIX, the regular expression '*' has a different
interpretation under BRE than under ERE:
http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap09.html

In BRE (plain 'grep' style), 9.3.3 states that
"    The <asterisk> shall be special except when used:

        In a bracket expression

        As the first character of an entire BRE (after an initial '^', if any)"

so your pattern searches for the literal character '*'.  Since you got
no output, your textfile contains no literal '*' on any line.
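
The BRE case can be checked with a minimal shell sketch. The two-line sample and the variable names are hypothetical, chosen only for illustration; the escaped, bracketed, and -F spellings are ways to ask for a literal asterisk that do not depend on where it sits in the pattern.

```shell
# Hypothetical sample: one line contains a literal asterisk, one does not.
sample=$(printf 'a*b\nabc\n')

# BRE: a leading '*' is an ordinary character, so only 'a*b' is selected.
bre_hits=$(printf '%s\n' "$sample" | grep '*')

# Spellings that mean "literal asterisk" regardless of position:
escaped_hits=$(printf '%s\n' "$sample" | grep -E '\*')   # escaped in ERE
bracket_hits=$(printf '%s\n' "$sample" | grep -E '[*]')  # bracket expression
fixed_hits=$(printf '%s\n' "$sample" | grep -F '*')      # fixed string, no regex

printf '%s\n' "$bre_hits" "$escaped_hits" "$bracket_hits" "$fixed_hits"
```

If the file has no line containing '*', each of these commands prints nothing, which matches the empty output in the report.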

In ERE ('grep -E' style), 9.4.3 states that

"*+?{
    The <asterisk>, <plus-sign>, <question-mark>, and <left-brace> shall
    be special except when used in a bracket expression (see RE Bracket
    Expression). Any of the following uses produce undefined results:

        If these characters appear first in an ERE, or immediately
        following an unescaped <vertical-line>, <circumflex>,
        <dollar-sign>, or <left-parenthesis>"

So the behavior of your regular expression is undefined, and we can make
it mean whatever we want: we may error out, or treat it as equivalent to
some other regular expression.  Either way, you are outside the bounds
of POSIX, so you can't rely on our behavior being consistent.

My guess is that your combination of libc and grep version (yes, it
might be different across versions or on different platforms) treats
'*' as zero-or-more instances of the empty regular expression ''.
Since the empty regular expression matches everywhere, zero-or-more
instances of it also match everywhere, and you thus get every line of
textfile as output.  But that doesn't mean you should expect that
behavior to stay the same.
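
As a rough illustration of "matches everywhere", the sketch below counts the lines selected by patterns whose meaning is well defined: the empty pattern and '.*'. The three-line sample and variable names are hypothetical. The undefined pattern '*' under -E is deliberately not run, since its result may vary.

```shell
# Hypothetical sample: three lines, the middle one empty.
sample=$(printf 'first line\n\nthird line\n')

# The empty pattern matches every line, in BRE and in ERE alike.
empty_bre=$(printf '%s\n' "$sample" | grep -c '')
empty_ere=$(printf '%s\n' "$sample" | grep -Ec '')

# '.*' (zero or more of any character) also matches every line.
dotstar_ere=$(printf '%s\n' "$sample" | grep -Ec '.*')

# Not run here: grep -E '*' is undefined by POSIX, so its output may differ
# between grep versions and platforms.
printf '%s %s %s\n' "$empty_bre" "$empty_ere" "$dotstar_ere"
```

To select every line on purpose, one of these defined spellings is the safer choice.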

Maybe you are mixing regular expressions with globs.  If you want to
search for zero-or-more characters with a glob, you use '*'; but that
translates to '.*' in both BRE and ERE syntax.
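
The glob-versus-regex point can be sketched as follows, assuming a hypothetical string 'abc'. A shell glob in a case statement must match the whole string, so the closest regular expression is the anchored form '^a.*c$'.

```shell
# Hypothetical string to test.
name='abc'

# Glob: '*' alone stands for zero or more characters.
case $name in
  a*c) glob_match=yes ;;
  *)   glob_match=no ;;
esac

# Regex: the same idea is spelled '.*'; the anchors mimic whole-string matching.
if printf '%s\n' "$name" | grep -q '^a.*c$'; then bre_match=yes; else bre_match=no; fi
if printf '%s\n' "$name" | grep -Eq '^a.*c$'; then ere_match=yes; else ere_match=no; fi

printf 'glob=%s bre=%s ere=%s\n' "$glob_match" "$bre_match" "$ere_match"
```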

At any rate, I don't see this as a bug, so I'm closing the instance in
the bug-tracker, but feel free to reply with further comments or questions.

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org


bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Sat, 29 Apr 2017 11:24:03 GMT)

This bug report was last modified 8 years and 109 days ago.



GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.