GNU bug report logs - #22655
grep-2.21 (and git master): --null-data and ranges work in an odd way (-P works fine)

Previous Next

Package: grep;

Reported by: Sergei Trofimovich <slyfox <at> gentoo.org>

Date: Sat, 13 Feb 2016 23:24:01 UTC

Severity: normal

Done: Paul Eggert <eggert <at> cs.ucla.edu>

Bug is archived. No further changes may be made.

Full log


View this message in rfc822 format

From: Stephane Chazelas <stephane.chazelas <at> gmail.com>
To: Paul Eggert <eggert <at> cs.ucla.edu>
Cc: 22655 <at> debbugs.gnu.org
Subject: bug#22655: grep -Pz '^' now fails!
Date: Sat, 19 Nov 2016 10:41:27 +0000
2016-11-18 15:37:16 -0800, Paul Eggert:
[...]
> >That might have been the case a long time ago, as I remember
> >some discussion about it as it explained some wrong information
> >in the documentation, but as far as I and gdb can tell, grep
> >2.26 at least call pcre_exec for every line of the input with
> >grep -P.
> >
> 
> Although that was true starting with commit
> a14685c2833f7c28a427fecfaf146e0a861d94ba (2010-03-04), it became
> false starting with commit 9fa500407137f49f6edc3c6b4ee6c7096f0190c5
> (2014-09-16).
[...]

OK, it looks like I don't have the full story, and my multiple
calls to pcre_exec() seems to point to something else:

$ seq 10 | ltrace  -e '*pcre*' ./src/grep -P .
grep->pcre_maketables(0x221e2f0, 0x221e240, 1, 2)                                                                     = 0x221e310
grep->pcre_compile(0x221e2f0, 2050, 0x7ffe943ec6f8, 0x7ffe943ec6f4)                                                   = 0x221e760
grep->pcre_study(0x221e760, 1, 0x7ffe943ec6f8, 0x7ffe943eb490)                                                        = 0x221e7b0
grep->pcre_fullinfo(0x221e760, 0x221e7b0, 16, 0x7ffe943ec6f4)                                                         = 0
grep->pcre_exec(0x221e760, 0x221e7b0, "", 0, 0, 128, 0x7ffe943ec700, 300)                                             = -1
grep->pcre_exec(0x221e760, 0x221e7b0, "", 0, 0, 0, 0x7ffe943ec700, 300)                                               = -1
grep->pcre_exec(0x221e760, 0x221e7b0, "1\n2\n3\n4\n5\n6\n7\n8\n9\n10\n", 20, 0, 8192, 0x7ffe943ec4e0, 300)            = 1
1
grep->pcre_exec(0x221e760, 0x221e7b0, "2\n3\n4\n5\n6\n7\n8\n9\n10\n", 18, 0, 8192, 0x7ffe943ec4e0, 300)               = 1
2
grep->pcre_exec(0x221e760, 0x221e7b0, "3\n4\n5\n6\n7\n8\n9\n10\n", 16, 0, 8192, 0x7ffe943ec4e0, 300)                  = 1
3
grep->pcre_exec(0x221e760, 0x221e7b0, "4\n5\n6\n7\n8\n9\n10\n", 14, 0, 8192, 0x7ffe943ec4e0, 300)                     = 1
4
grep->pcre_exec(0x221e760, 0x221e7b0, "5\n6\n7\n8\n9\n10\n", 12, 0, 8192, 0x7ffe943ec4e0, 300)                        = 1
5
grep->pcre_exec(0x221e760, 0x221e7b0, "6\n7\n8\n9\n10\n", 10, 0, 8192, 0x7ffe943ec4e0, 300)                           = 1
6
grep->pcre_exec(0x221e760, 0x221e7b0, "7\n8\n9\n10\n", 8, 0, 8192, 0x7ffe943ec4e0, 300)                               = 1
7
grep->pcre_exec(0x221e760, 0x221e7b0, "8\n9\n10\n", 6, 0, 8192, 0x7ffe943ec4e0, 300)                                  = 1
8
grep->pcre_exec(0x221e760, 0x221e7b0, "9\n10\n", 4, 0, 8192, 0x7ffe943ec4e0, 300)                                     = 1
9
grep->pcre_exec(0x221e760, 0x221e7b0, "10\n", 2, 0, 8192, 0x7ffe943ec4e0, 300)                                        = 1
10
+++ exited (status 0) +++

I don't know the details of why it's done that way, but I'm not
sure I can see how calling pcre_exec that way can be quicker
than calling it on each individual line/record.


Note that this is still wrong:

$ printf 'a\nb\0' | ./src/grep -zxP a
a
b

Removing PCRE_MULTILINE (and get back to calling pcre_exec on
every record separately) would help except in the cases where the
user does:

grep -xzP '(?m)a'

You'd want to change:

  static char const xprefix[] = "^(?:";
  static char const xsuffix[] = ")$";

To:

  static char const xprefix[] = "\A(?:";
  static char const xsuffix[] = ")\z";


-- 
Stephane




This bug report was last modified 8 years and 190 days ago.

Previous Next


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.