GNU bug report logs - #22655
grep-2.21 (and git master): --null-data and ranges work in an odd way (-P works fine)

Package: grep

Reported by: Sergei Trofimovich <slyfox <at> gentoo.org>

Date: Sat, 13 Feb 2016 23:24:01 UTC

Severity: normal

Done: Paul Eggert <eggert <at> cs.ucla.edu>

Bug is archived. No further changes may be made.


Message #97 received at 22655 <at> debbugs.gnu.org (full text, mbox):

From: Paul Eggert <eggert <at> cs.ucla.edu>
To: Stephane Chazelas <stephane.chazelas <at> gmail.com>
Cc: 22655 <at> debbugs.gnu.org
Subject: Re: bug#22655: grep -Pz '^' now fails!
Date: Sat, 19 Nov 2016 03:22:23 -0800
Stephane Chazelas wrote:

> I don't know the details of why it's done that way, but I'm not
> sure I can see how calling pcre_exec that way can be quicker
> than calling it on each individual line/record.

It can be hundreds of times faster in common cases. See:

http://git.savannah.gnu.org/cgit/grep.git/commit/?id=f6603c4e1e04dbb87a7232c4b44acc6afdf65fef
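
The two calling strategies contrasted above can be sketched roughly as follows. This is a toy model in Python, with the `re` module standing in for PCRE; it is not grep's code, and the function names `match_per_line` and `match_whole_buffer` are made up for illustration. It only shows that one multiline search over a buffer can select the same lines as one search per line, and it makes no attempt to measure speed.

```python
import re

def match_per_line(pattern, buffer):
    """Call the matcher once for every line in the buffer."""
    rx = re.compile(pattern)
    return [line for line in buffer.split("\n") if rx.search(line)]

def match_whole_buffer(pattern, buffer):
    """Call the matcher on the whole buffer with multiline anchors,
    then map each hit back to the line that contains its start."""
    rx = re.compile(pattern, re.MULTILINE)
    hits = []
    pos = 0
    while pos <= len(buffer):
        m = rx.search(buffer, pos)
        if m is None:
            break
        # Find the boundaries of the line holding the match start.
        start = buffer.rfind("\n", 0, m.start()) + 1
        end = buffer.find("\n", m.start())
        if end == -1:
            end = len(buffer)
        hits.append(buffer[start:end])
        # Resume the search at the beginning of the next line.
        pos = end + 1
    return hits

buf = "foo\nbar\nfoobar\nbaz"
print(match_per_line(r"^foo", buf))
print(match_whole_buffer(r"^foo", buf))
```

When few lines match, the whole-buffer variant makes far fewer matcher calls, which is presumably where the saving comes from; the sketch assumes patterns that cannot themselves match a newline.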

> Note that this is still wrong:
>
> $ printf 'a\nb\0' | ./src/grep -zxP a
> a
> b

Thanks, fixed by installing the attached.
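
For readers following along, here is a rough model of why that output is wrong. It is a Python sketch with the `re` module standing in for PCRE, not the attached patch; the helpers `split_records`, `zx_multiline` and `zx_whole_record` are hypothetical names, and wrapping the pattern in anchors is only an assumed way of emulating -x. With -z the input is a single NUL-terminated record "a\nb", and -x should require the whole record to match, so nothing should be printed.

```python
import re

def split_records(data):
    """Split NUL-terminated input into records, as -z does."""
    records = data.split("\0")
    if records and records[-1] == "":
        records.pop()  # drop the empty piece after the final NUL
    return records

def zx_multiline(pattern, data):
    """Whole-record match emulated with ^...$ while multiline anchors
    are on: the anchors also match at newlines inside a record."""
    rx = re.compile(r"^(?:%s)$" % pattern, re.MULTILINE)
    return [rec for rec in split_records(data) if rx.search(rec)]

def zx_whole_record(pattern, data):
    """Whole-record match anchored to the true start and end of the record."""
    rx = re.compile(r"\A(?:%s)\Z" % pattern)
    return [rec for rec in split_records(data) if rx.search(rec)]

data = "a\nb\0"
print(zx_multiline("a", data))     # selects the record "a\nb", as in the report
print(zx_whole_record("a", data))  # selects nothing
```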

> Removing PCRE_MULTILINE (and going back to calling pcre_exec on
> every record separately) would help except in the cases where the
> user does:
>
> grep -xzP '(?m)a'

I don't think grep can address this problem, as in general that would require 
interpreting the PCRE pattern at run-time and grep should not be delving into 
PCRE internals. Uses of (?m) lead to unspecified behavior in grep, and 
applications should not rely on any particular behavior in this area. This is 
firmly in the Perl tradition, as the Perl documentation for this part of the 
regular expression syntax says "The stability of these extensions varies widely. 
Some ... are experimental and may change without warning or be completely 
removed." Also, the grep manual says that -P "is highly experimental". User 
beware, that's all.
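
A small sketch of why a caller cannot rule this out from the outside, again using Python's `re` as a stand-in for PCRE (its `(?m)` is analogous, not identical). The helper name `record_matches` is hypothetical. The caller passes no flags at all, yet the pattern text itself switches multiline anchoring on; detecting that would mean parsing the pattern.

```python
import re

def record_matches(pattern, record):
    """Search one record with no flags supplied by the caller."""
    return re.search(pattern, record) is not None

record = "a\nb"

# Anchors refer to the whole record, so "a\nb" does not match.
print(record_matches(r"^a$", record))

# The inline (?m) makes ^ and $ match at the embedded newline.
print(record_matches(r"(?m)^a$", record))
```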
[0001-grep-fix-zxP-bug.patch (text/x-diff, attachment)]

This bug report was last modified 8 years and 190 days ago.

