GNU bug report logs - #21159: Fails To Match Empty String
Reported by: Squirrely <squirrely <at> gmx.com>
Date: Wed, 29 Jul 2015 22:51:02 UTC
Severity: normal
Done: Paul Eggert <eggert <at> cs.ucla.edu>
Bug is archived. No further changes may be made.
Report forwarded to bug-grep <at> gnu.org: bug#21159; Package grep. (Wed, 29 Jul 2015 22:51:02 GMT)
Acknowledgement sent to Squirrely <squirrely <at> gmx.com>: New bug report received and forwarded. Copy sent to bug-grep <at> gnu.org. (Wed, 29 Jul 2015 22:51:02 GMT)
Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):
Hi
I'm a bit of a regular expression noob, so I'm not sure if this is a bug
or if I'm just missing something about how grep works.
Here's a demo of the issue I have encountered:
> bash$ rm empty
> bash$ touch empty
> bash$ # I am expecting a match, so grep should return 0.
> bash$ grep '^$' empty
> bash$ echo $?
> 1
> bash$ # Hmmm... weird.
> bash$ # Same example but using STDIN instead...
> bash$ echo -n ""| grep '^$'
> bash$ echo $?
> 1
> bash$ # Same result. How does the python re module treat this?
> bash$ python3
> >>> import re
> >>> m = re.search("^$", "")
> >>> type(m)
> <class '_sre.SRE_Match'>
> >>> # A match was found. Python returns 'None' if it's not a match,
> >>> # like this...
> >>> m = re.search("fo?", "bar")
> >>> type(m)
> <class 'NoneType'>
I know that the Python re module and grep use a different regex syntax,
but I'm pretty sure "^$" has the same meaning for both.
I discounted the idea that grep only checks lines that end with a
newline character because of this:
> bash$ echo -en "foo\nfoo\nfoo"|grep foo
> foo
> foo
> foo
As you can see, the third foo is checked and matched despite not being
terminated with a newline character (observe the echo "-n" switch).
So... why does ^$ match the empty string with python but not with grep?
-Squirrely
Reply sent to Paul Eggert <eggert <at> cs.ucla.edu>: You have taken responsibility. (Thu, 30 Jul 2015 00:54:01 GMT)
Notification sent to Squirrely <squirrely <at> gmx.com>: bug acknowledged by developer. (Thu, 30 Jul 2015 00:54:02 GMT)
Message #10 received at 21159-done <at> debbugs.gnu.org (full text, mbox):
Squirrely wrote:
> bash$ rm empty
> bash$ touch empty
> bash$ # I am expecting a match, so grep should return 0.
> bash$ grep '^$' empty
> bash$ echo $?
Grep looks for lines that contain matches. An empty file has no lines, so it
cannot possibly contain any matches for any regular expression.
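
The line-based behaviour described above can be illustrated with a rough Python model. This is only a sketch under stated assumptions, not grep's actual implementation: the helper names grep_like and exit_status are hypothetical, and Python's re syntax stands in for grep's. The input is split into lines first, and the pattern is then tried against each line separately, so an input with zero lines can never produce a match.

```python
import re

def grep_like(pattern, text):
    """Return the lines of `text` that contain a match (hypothetical model of grep)."""
    lines = text.split("\n")
    # A trailing newline terminates the last line rather than starting a new one,
    # so drop the empty piece that split() leaves behind. For empty input this
    # leaves zero lines; a final line without a newline is still kept.
    if lines[-1] == "":
        lines.pop()
    return [line for line in lines if re.search(pattern, line)]

def exit_status(pattern, text):
    # Like grep: 0 if at least one line matched, 1 otherwise.
    return 0 if grep_like(pattern, text) else 1

print(exit_status(r"^$", ""))               # 1: empty input has no lines to test
print(exit_status(r"^$", "\n"))             # 0: one empty line, which '^$' matches
print(grep_like(r"foo", "foo\nfoo\nfoo"))   # ['foo', 'foo', 'foo']
```

Under this model, re.search("^$", "") succeeds in Python because the empty string itself is the subject being searched, whereas grep is given no line at all to search when the file is empty.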
bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Thu, 27 Aug 2015 11:24:04 GMT)
This bug report was last modified 9 years and 360 days ago.