GNU bug report logs - #23185
GNU grep matching discrepancy between -a/--text and not.

Package: grep;

Reported by: Shlomi Fish <shlomif <at> shlomifish.org>

Date: Sat, 2 Apr 2016 12:06:01 UTC

Severity: normal

Done: Paul Eggert <eggert <at> cs.ucla.edu>

Bug is archived. No further changes may be made.


Report forwarded to bug-grep <at> gnu.org:
bug#23185; Package grep. (Sat, 02 Apr 2016 12:06:01 GMT)

Acknowledgement sent to Shlomi Fish <shlomif <at> shlomifish.org>:
New bug report received and forwarded. Copy sent to bug-grep <at> gnu.org. (Sat, 02 Apr 2016 12:06:02 GMT)

Message #5 received at submit <at> debbugs.gnu.org:

From: Shlomi Fish <shlomif <at> shlomifish.org>
To: bug-grep <at> gnu.org
Subject: GNU grep matching discrepancy between -a/--text and not.
Date: Sat, 2 Apr 2016 15:00:12 +0300
Hi all,

as can be seen in this repository:

https://github.com/shlomif/gnu-grep-trailing-space-and-CR-on-riddles.he-false-match

GNU grep reports that a document it suspects to be binary matches when run
without -a/--text, but finds no match and returns no results when that flag is
applied. Perl sides with the latter.

I'm on Mageia Linux x86-64 v6 and have built GNU grep from the latest git
commit ( c767ed70eca9a82d76f07dcdbcaafa21ec7f86d6 ) to test.

Regards,

	Shlomi Fish

P.S: it seems the build system uses gperf but configure does not verify that it
exists in the path.

-- 
-----------------------------------------------------------------
Shlomi Fish       http://www.shlomifish.org/
Interview with Ben Collins-Sussman - http://shlom.in/sussman

Can I SCO now? Sue who you wanna sue, it doesn't matter anyhoo, it's time to
litigate.
    — http://www.shlomifish.org/humour/bits/Can-I-SCO-Now/

Please reply to list if it's a mailing list post - http://shlom.in/reply .




Reply sent to Paul Eggert <eggert <at> cs.ucla.edu>:
You have taken responsibility. (Wed, 06 Apr 2016 06:57:02 GMT)

Notification sent to Shlomi Fish <shlomif <at> shlomifish.org>:
bug acknowledged by developer. (Wed, 06 Apr 2016 06:57:02 GMT)

Message #10 received at 23185-done <at> debbugs.gnu.org:

From: Paul Eggert <eggert <at> cs.ucla.edu>
To: Shlomi Fish <shlomif <at> shlomifish.org>, 23185-done <at> debbugs.gnu.org
Subject: Re: bug#23185: GNU grep matching discrepancy between -a/--text and
 not.
Date: Tue, 5 Apr 2016 23:56:22 -0700
Thanks for pointing out the seeming inconsistency. The documentation mentions 
the issue but is perhaps not clear enough, so I installed the attached patch.

The input file contains NUL bytes and so is treated as binary data, and the grep 
documentation (section "File and Directory Selection", option "--binary-files") 
says "When processing binary data, ‘grep’ may treat non-text bytes as line 
terminators". This behavior was added to GNU grep in release 2.21 dated 2014, 
partly for performance reasons.

There are two instances in riddles.he of a space followed by a NUL byte, so

  grep -P '[ \t]\r?$' riddles.he

finds a match when the $ matches just before the NUL byte.

-a is one way to get the behavior you evidently expected. Another (perhaps 
better) way is -z. The command:

  grep -zP '[ \t]\r?\n' riddles.he

outputs nothing and exits with status 1.
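
The three behaviors described above can be illustrated with a small Python
model. This is only a sketch of the record-splitting idea, not grep's actual
implementation: the SAMPLE bytes are a hypothetical stand-in for riddles.he
(a space directly followed by a NUL byte, and no trailing whitespace before
any newline), and the helper name any_record_matches is made up for this
illustration.

```python
import re

# Hypothetical stand-in for riddles.he: a space directly followed by a NUL
# byte, and no whitespace immediately before any newline.
SAMPLE = b"question \x00answer\nplain line\n"


def any_record_matches(data, splitter, pattern):
    """Split data into records on `splitter`, then search each record."""
    return any(re.search(pattern, record) for record in re.split(splitter, data))


# Default binary handling: NUL may act as a line terminator, like newline.
binary_default = any_record_matches(SAMPLE, rb"[\n\x00]", rb"[ \t]\r?$")

# -a/--text: only newline terminates a line; NUL is an ordinary byte.
text_mode = any_record_matches(SAMPLE, rb"\n", rb"[ \t]\r?$")

# -z: NUL terminates a record, so newline must be matched explicitly.
nul_records = any_record_matches(SAMPLE, rb"\x00", rb"[ \t]\r?\n")

print(binary_default, text_mode, nul_records)  # True False False
```

In this model only the first mode reports a match, because the record
"question " ends right where the NUL byte was, which mirrors the discrepancy
reported in this bug.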
[0001-Give-another-example-of-binary-file-processing.patch (text/x-diff, attachment)]

bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Wed, 04 May 2016 11:24:04 GMT)

This bug report was last modified 9 years and 132 days ago.


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.