GNU bug report logs -
#22655
grep-2.21 (and git master): --null-data and ranges work in an odd way (-P works fine)
2016-11-18 15:37:16 -0800, Paul Eggert:
[...]
> >That might have been the case a long time ago, as I remember
> >some discussion about it as it explained some wrong information
> >in the documentation, but as far as I and gdb can tell, grep
> >2.26 at least calls pcre_exec for every line of the input with
> >grep -P.
> >
>
> Although that was true starting with commit
> a14685c2833f7c28a427fecfaf146e0a861d94ba (2010-03-04), it became
> false starting with commit 9fa500407137f49f6edc3c6b4ee6c7096f0190c5
> (2014-09-16).
[...]
OK, it looks like I don't have the full story, and my multiple
calls to pcre_exec() seem to point to something else:
$ seq 10 | ltrace -e '*pcre*' ./src/grep -P .
grep->pcre_maketables(0x221e2f0, 0x221e240, 1, 2) = 0x221e310
grep->pcre_compile(0x221e2f0, 2050, 0x7ffe943ec6f8, 0x7ffe943ec6f4) = 0x221e760
grep->pcre_study(0x221e760, 1, 0x7ffe943ec6f8, 0x7ffe943eb490) = 0x221e7b0
grep->pcre_fullinfo(0x221e760, 0x221e7b0, 16, 0x7ffe943ec6f4) = 0
grep->pcre_exec(0x221e760, 0x221e7b0, "", 0, 0, 128, 0x7ffe943ec700, 300) = -1
grep->pcre_exec(0x221e760, 0x221e7b0, "", 0, 0, 0, 0x7ffe943ec700, 300) = -1
grep->pcre_exec(0x221e760, 0x221e7b0, "1\n2\n3\n4\n5\n6\n7\n8\n9\n10\n", 20, 0, 8192, 0x7ffe943ec4e0, 300) = 1
1
grep->pcre_exec(0x221e760, 0x221e7b0, "2\n3\n4\n5\n6\n7\n8\n9\n10\n", 18, 0, 8192, 0x7ffe943ec4e0, 300) = 1
2
grep->pcre_exec(0x221e760, 0x221e7b0, "3\n4\n5\n6\n7\n8\n9\n10\n", 16, 0, 8192, 0x7ffe943ec4e0, 300) = 1
3
grep->pcre_exec(0x221e760, 0x221e7b0, "4\n5\n6\n7\n8\n9\n10\n", 14, 0, 8192, 0x7ffe943ec4e0, 300) = 1
4
grep->pcre_exec(0x221e760, 0x221e7b0, "5\n6\n7\n8\n9\n10\n", 12, 0, 8192, 0x7ffe943ec4e0, 300) = 1
5
grep->pcre_exec(0x221e760, 0x221e7b0, "6\n7\n8\n9\n10\n", 10, 0, 8192, 0x7ffe943ec4e0, 300) = 1
6
grep->pcre_exec(0x221e760, 0x221e7b0, "7\n8\n9\n10\n", 8, 0, 8192, 0x7ffe943ec4e0, 300) = 1
7
grep->pcre_exec(0x221e760, 0x221e7b0, "8\n9\n10\n", 6, 0, 8192, 0x7ffe943ec4e0, 300) = 1
8
grep->pcre_exec(0x221e760, 0x221e7b0, "9\n10\n", 4, 0, 8192, 0x7ffe943ec4e0, 300) = 1
9
grep->pcre_exec(0x221e760, 0x221e7b0, "10\n", 2, 0, 8192, 0x7ffe943ec4e0, 300) = 1
10
+++ exited (status 0) +++
I don't know the details of why it's done that way, but I don't
see how calling pcre_exec that way can be quicker than calling
it on each individual line/record.
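To make the pattern in the trace concrete, here is a minimal sketch of that calling style: search the remaining buffer, report the line containing the match, then resume after that line. It is not grep's code. It assumes POSIX regcomp()/regexec() with REG_NEWLINE as a rough stand-in for pcre_exec() with PCRE_MULTILINE, and count_matching_lines() is a hypothetical helper name.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not from grep): count the lines of a
   NUL-terminated buffer that contain a match, by repeatedly searching
   the remaining buffer instead of matching each line on its own.
   POSIX REG_NEWLINE is only an approximation of PCRE_MULTILINE.
   Returns -1 if the pattern does not compile.  */
int count_matching_lines(const char *buf, const char *pattern)
{
    regex_t re;
    regmatch_t m;
    const char *p = buf;
    int lines = 0;

    assert(buf != NULL && pattern != NULL);
    if (regcomp(&re, pattern, REG_EXTENDED | REG_NEWLINE) != 0)
        return -1;

    /* One regexec() call per matching line, each on a shorter tail.  */
    while (*p != '\0' && regexec(&re, p, 1, &m, 0) == 0) {
        /* Find the end of the line in which the match starts.  */
        const char *eol = strchr(p + m.rm_so, '\n');
        lines++;
        if (eol == NULL)
            break;              /* last line had no trailing newline */
        p = eol + 1;            /* resume right after that line */
    }

    regfree(&re);
    return lines;
}
```

When every line matches, as with seq 10 and the pattern ".", this makes one search call per line, each over the whole remaining tail, which is the shape of the trace above.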
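Note that this is still wrong: the only NUL-delimited record here
is "a\nb", which should not match a as a whole, yet it is printed: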
$ printf 'a\nb\0' | ./src/grep -zxP a
a
b
Removing PCRE_MULTILINE (and going back to calling pcre_exec on
every record separately) would help, except in the cases where the
user does:
grep -xzP '(?m)a'
You'd want to change:
static char const xprefix[] = "^(?:";
static char const xsuffix[] = ")$";
To:
static char const xprefix[] = "\\A(?:";
static char const xsuffix[] = ")\\z";
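The effect of the anchors can be sketched without PCRE. The example below is an illustration under assumptions, not grep's implementation: it uses POSIX regcomp(), where REG_NEWLINE makes ^ and $ match around embedded newlines (roughly like PCRE_MULTILINE), and where leaving REG_NEWLINE off makes ^ and $ behave like whole-subject anchors (roughly like \A and \z). whole_record_matches() is a hypothetical helper name.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper (not from grep): does `pattern` match the whole
   of `record`, in the spirit of grep -x?  With multiline != 0 the
   anchors also match next to embedded newlines, which is the behaviour
   that makes "a\nb" wrongly match "a".
   Returns 1 on match, 0 on no match, -1 on error.  */
int whole_record_matches(const char *record, const char *pattern,
                         int multiline)
{
    char wrapped[256];
    regex_t re;
    int flags = REG_EXTENDED | REG_NOSUB;
    int n, rc;

    assert(record != NULL && pattern != NULL);

    /* Wrap the pattern the way the xprefix/xsuffix pair would.  */
    n = snprintf(wrapped, sizeof wrapped, "^(%s)$", pattern);
    if (n < 0 || (size_t) n >= sizeof wrapped)
        return -1;

    if (multiline)
        flags |= REG_NEWLINE;   /* ^ and $ match at line boundaries */
    if (regcomp(&re, wrapped, flags) != 0)
        return -1;

    rc = regexec(&re, record, 0, NULL, 0);
    regfree(&re);
    return rc == 0;
}
```

With the record "a\nb" and the pattern a, the multiline variant reports a match and the non-multiline variant does not, which mirrors the -zxP case above.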
--
Stephane
This bug report was last modified 8 years and 190 days ago.