GNU bug report logs - #22945
Surprising behaviour (bug?) of zgrep in combination with the -f option and process substitutions
Report forwarded to bug-gzip <at> gnu.org: bug#22945; Package gzip. (Tue, 08 Mar 2016 16:33:03 GMT)
Acknowledgement sent to Fulvio Scapin <trantorvega <at> gmail.com>: New bug report received and forwarded. Copy sent to bug-gzip <at> gnu.org. (Tue, 08 Mar 2016 16:33:03 GMT)
Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):
Hello.
There is a problem with zgrep whenever the -f option reads from
the output of a process substitution in bash.
A deliberately trivial example follows.
$ mkdir /tmp/test
$ cd /tmp/test
$ cat > first
aaa
$ cat > second
bbb
$ cat > third
ccc
$ cat > fourth
ddd
$ tail *
==> first <==
aaa
==> fourth <==
ddd
==> second <==
bbb
==> third <==
ccc
$ gzip -9 *
$ ls
first.gz fourth.gz second.gz third.gz
$ cat > patterns
aaa
bbb
ccc
ddd
$ tail patterns
aaa
bbb
ccc
ddd
$ zfgrep -f <( cat patterns ) first.gz fourth.gz second.gz third.gz
first.gz:aaa
$ zfgrep -f patterns first.gz fourth.gz second.gz third.gz
first.gz:aaa
fourth.gz:ddd
second.gz:bbb
third.gz:ccc
zfgrep -f <( cat patterns ) first.gz fourth.gz second.gz third.gz
translates into
zfgrep -f /dev/fd/XX first.gz fourth.gz second.gz third.gz
where XX is a number, 63 for instance.
The problem, as I understand it, arises because
zgrep -f patternfile a.gz b.gz c.gz
is actually run as a succession of
gzip -dc a.gz | grep -f patternfile
gzip -dc b.gz | grep -f patternfile
gzip -dc c.gz | grep -f patternfile
Since patternfile in this case is /dev/fd/XX, which refers to a pipe, only
the first invocation of grep actually reads a pattern list. The second and
third invocations find the pipe already drained and get nothing, so b.gz
and c.gz yield no matches.
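The effect can be reproduced without zgrep. What follows is only a minimal sketch, assuming a system that provides /dev/fd (Linux or macOS, for instance) and a grep for which an empty pattern list matches nothing, as GNU grep does. The file names and the choice of fd 3 are arbitrary and not taken from zgrep itself.

```shell
#!/bin/sh
# Two tiny compressed inputs, one line each.
dir=$(mktemp -d) || exit 1
printf 'aaa\n' | gzip -c > "$dir/a.gz"
printf 'bbb\n' | gzip -c > "$dir/b.gz"

# The patterns arrive on a pipe. It is duplicated onto fd 3 so that
# stdin stays free for the gzip output, as in the pipelines above.
result=$(
  printf 'aaa\nbbb\n' | {
    exec 3<&0
    for f in "$dir/a.gz" "$dir/b.gz"; do
      # The first grep reads the pipe to EOF; the second finds it empty.
      # grep exits 1 when nothing matches, which is ignored here.
      gzip -dc "$f" | grep -F -f /dev/fd/3 || :
    done
  }
)
rm -rf "$dir"
printf '%s\n' "$result"
```

Under those assumptions only the line from a.gz is reported, even though the pattern list also contains bbb.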
From /bin/zgrep (Version 1.6, Ubuntu 15.10) one can read
    (-f | --file)
      # The pattern is coming from a file rather than the command-line.
      # If the file is actually stdin then we need to do a little
      # magic, since we use stdin to pass the gzip output to grep.
      # Turn the -f option into an -e option by copying the file's
      # contents into OPTARG.
      case $optarg in
      (" '-'" | " '/dev/stdin'" | " '/dev/fd/0'")
        option=-e
        optarg=" '"$(sed "$escape") || exit 2;;
      esac
      have_pat=1;;
Should the stdin workaround perhaps also apply to situations such as
the one in my example?
Thanks in advance.
Fulvio Scapin
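One way to picture the suggestion in the report is a check that decides whether a -f argument can safely be opened once per input file. This is a hypothetical sketch, not code from zgrep: the helper name needs_single_read and the "not a regular file" test are assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical helper, not taken from zgrep: succeed when the -f argument
# should be read once and kept, rather than reopened for every input file.
needs_single_read() {
  case $1 in
    - | /dev/stdin | /dev/fd/0) return 0 ;;  # the cases zgrep 1.6 already handles
  esac
  # Assumption: pipes, FIFOs and devices may yield their data only once.
  [ ! -f "$1" ]
}

tmp=$(mktemp) || exit 1
needs_single_read "$tmp" && regular=copy || regular=reuse
needs_single_read /dev/null && special=copy || special=reuse
needs_single_read - && stdin_arg=copy || stdin_arg=reuse
rm -f "$tmp"
printf '%s %s %s\n' "$regular" "$special" "$stdin_arg"
```

A /dev/fd/XX path produced by process substitution names a pipe, so it would fall into the "copy" branch of such a check.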
Information forwarded to bug-gzip <at> gnu.org: bug#22945; Package gzip. (Wed, 16 Mar 2016 02:36:01 GMT)
Message #8 received at 22945 <at> debbugs.gnu.org (full text, mbox):
On Tue, Mar 8, 2016 at 3:42 AM, Fulvio Scapin <trantorvega <at> gmail.com> wrote:
> Hello.
>
> There is a problem with zgrep whenever the -f option reads from
> the output of a process substitution in bash.
> A deliberately trivial example follows.
>
> $ mkdir /tmp/test
...
> From /bin/zgrep (Version 1.6, Ubuntu 15.10) one can read
Thank you for the report.
To summarize, with zgrep-1.6, this erroneously prints matches only
from the first file:
$ zgrep -f <(echo .) <(echo a) <(echo b)
/dev/fd/12:a
However, with the latest from git (and soon to be gzip-1.7), this now
works as desired:
$ zgrep -f <(echo .) <(echo a) <(echo b)
/dev/fd/12:a
/dev/fd/13:b
I see there is no NEWS entry for this fix and haven't yet identified
the origin of the bug or the commit that fixed it, but will do so.
Information forwarded to bug-gzip <at> gnu.org: bug#22945; Package gzip. (Wed, 16 Mar 2016 21:07:01 GMT)
Message #11 received at 22945 <at> debbugs.gnu.org (full text, mbox):
On 03/15/2016 07:34 PM, Jim Meyering wrote:
> Thank you for the report.
> To summarize, with zgrep-1.6, this erroneously prints matches only
> from the first file:
>
> $ zgrep -f <(echo .) <(echo a) <(echo b)
> /dev/fd/12:a
>
> However, with the latest from git (and soon to be gzip-1.7), this now
> works as desired:
>
> $ zgrep -f <(echo .) <(echo a) <(echo b)
> /dev/fd/12:a
> /dev/fd/13:b
>
> I see there is no NEWS entry for this fix and haven't yet identified
> the origin of the bug or the commit that fixed it, but will do so.
Draft gzip 1.7 doesn't work for me (Fedora 23 x86-64). I have worked on
a patch but don't have a reliable fix yet, or even a portable test case
to illustrate the bug. Perhaps we should just think of it as a known bug
for now.
Information forwarded to bug-gzip <at> gnu.org: bug#22945; Package gzip. (Thu, 17 Mar 2016 16:33:02 GMT)
Message #14 received at 22945 <at> debbugs.gnu.org (full text, mbox):
Paul Eggert wrote:
> Draft gzip 1.7 doesn't work for me (Fedora 23 x86-64). I have worked on
> a patch but don't have a reliable fix yet, or even a portable test case
> to illustrate the bug. Perhaps we should just think of it as a known bug
> for now.
What about using command substitution with '-e' instead of process
substitution with '-f'?
zgrep -e "$(cat FILE)" file1.lz file2.gz
Best regards,
Antonio.
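The -e workaround suggested above can be sketched with plain gzip and grep pipelines. This is an illustration under assumptions, not zgrep's own code: the file names are made up, and -F is used only to keep the patterns literal. Note that command substitution strips trailing newlines and that grep treats newlines inside a single -e argument as pattern separators.

```shell
#!/bin/sh
dir=$(mktemp -d) || exit 1
printf 'aaa\n' | gzip -c > "$dir/a.gz"
printf 'bbb\n' | gzip -c > "$dir/b.gz"

# Read the pattern stream exactly once into a variable.
patterns=$(printf 'aaa\nbbb\n')

# Every grep invocation receives the full pattern list on its command line,
# so nothing has to be reread from a pipe.
matches=$(
  for f in "$dir/a.gz" "$dir/b.gz"; do
    gzip -dc "$f" | grep -F -e "$patterns"
  done
)
rm -rf "$dir"
printf '%s\n' "$matches"
```

Here both inputs produce a match, because the patterns live in the shell variable instead of in a one-shot pipe.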
Information forwarded to bug-gzip <at> gnu.org: bug#22945; Package gzip. (Thu, 17 Mar 2016 20:14:01 GMT)
Message #17 received at 22945 <at> debbugs.gnu.org (full text, mbox):
On 03/16/2016 02:06 PM, Paul Eggert wrote:
> I have worked on a patch but don't have a reliable fix yet, or even a
> portable test case to illustrate the bug.
On further thought I found a test case and a fix, which I've attached.
Normally I would just install this, but we're so close to a release that
I'll wait for a word from Jim.
[0001-zgrep-with-f-SPECIAL-read-SPECIAL-just-once.patch (application/x-patch, attachment)]
Information forwarded to bug-gzip <at> gnu.org: bug#22945; Package gzip. (Fri, 18 Mar 2016 03:59:01 GMT)
Message #20 received at 22945 <at> debbugs.gnu.org (full text, mbox):
On Thu, Mar 17, 2016 at 1:13 PM, Paul Eggert <eggert <at> cs.ucla.edu> wrote:
> On 03/16/2016 02:06 PM, Paul Eggert wrote:
>>
>> I have worked on a patch but don't have a reliable fix yet, or even a
>> portable test case to illustrate the bug.
>
> On further thought I found a test case and a fix, which I've attached.
> Normally I would just install this, but we're so close to a release that
> I'll wait for a word from Jim.
Thank you for working on that.
One nit: perhaps it should continue to work when a search string
contains a NUL byte? E.g., this works before your change on OS X
yet finds no match with the patch applied:
$ zgrep -af <(printf 'b\0\na') <(printf 'b\0') <(echo a)
/dev/fd/12:b
/dev/fd/13:a
Might be tricky to portably transform that NUL byte into something we
can embed in a command-line-specified search string. Is there even a
notation for that? I don't think so.
But NUL problems aside, this also should work, requiring alternation
in the regexp derived from input with two or more lines, but then
we'll have to escape embedded '|' bytes, too:
$ zgrep -f <(printf 'a\nb') <(echo b) <(echo a)
/dev/fd/12:b
/dev/fd/13:a
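The NUL concern can be illustrated in isolation. The sketch below assumes a POSIX sh such as dash or bash, where command substitution cannot carry a NUL byte (bash also prints a warning on stderr); other shells may behave differently. The variable names are arbitrary.

```shell
#!/bin/sh
# Bytes produced by printf: 'b' followed by a NUL.
raw_bytes=$(printf 'b\0' | wc -c | tr -d ' ')

# The same output captured in a variable: the NUL does not survive.
pattern=$(printf 'b\0')
kept_chars=${#pattern}

printf 'raw=%s kept=%s\n' "$raw_bytes" "$kept_chars"
```

A pattern that passes through a shell variable or a command-line argument therefore loses any NUL it contained, which is why an approach based on -e cannot preserve such search strings.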
Information forwarded to bug-gzip <at> gnu.org: bug#22945; Package gzip. (Fri, 18 Mar 2016 07:47:02 GMT)
Message #23 received at 22945 <at> debbugs.gnu.org (full text, mbox):
Jim Meyering wrote:
> Might be tricky to portably transform that NUL byte into something we
> can embed in a command-line-specified search string. Is there even a
> notation for that? I don't think so.
>
> But NUL problems aside, this also should work, requiring alternation
> in the regexp derived from input with two or more lines, but then
> we'll have to escape embedded '|' bytes, too:
How about the attached patch instead? It uses a bigger hammer, which should
address both issues.
[0001-zgrep-with-f-SPECIAL-read-SPECIAL-just-once.patch (text/x-diff, attachment)]
Information forwarded to bug-gzip <at> gnu.org: bug#22945; Package gzip. (Fri, 18 Mar 2016 20:26:01 GMT)
Message #26 received at 22945 <at> debbugs.gnu.org (full text, mbox):
On Fri, Mar 18, 2016 at 12:46 AM, Paul Eggert <eggert <at> cs.ucla.edu> wrote:
> Jim Meyering wrote:
>>
>> Might be tricky to portably transform that NUL byte into something we
>> can embed in a command-line-specified search string. Is there even a
>> notation for that? I don't think so.
>>
>> But NUL problems aside, this also should work, requiring alternation
>> in the regexp derived from input with two or more lines, but then
>> we'll have to escape embedded '|' bytes, too:
>
>
> How about the attached patch instead? It uses a bigger hammer, which should
> address both issues.
Very nice. Thank you very much.
You are welcome to push that with changes like the following:
- retain the 2-empty-line section separator in NEWS (there's a
syntax-check hook to test for that in other packages, but not yet here
in gzip)
- adjust the test to cover the case of more than one line in -f's input:
[k.patch (text/x-patch, attachment)]
Reply sent to Paul Eggert <eggert <at> cs.ucla.edu>: You have taken responsibility. (Fri, 18 Mar 2016 22:31:02 GMT)
Notification sent to Fulvio Scapin <trantorvega <at> gmail.com>: bug acknowledged by developer. (Fri, 18 Mar 2016 22:31:02 GMT)
Message #31 received at 22945-done <at> debbugs.gnu.org (full text, mbox):
On 03/18/2016 01:24 PM, Jim Meyering wrote:
> You are welcome to push that with changes like the following:
OK, thanks, I pushed the attached patch, which contains those changes,
plus one more change: check for errors when writing to the temporary
pattern file. Marking this as done.
[0001-zgrep-with-f-SPECIAL-read-SPECIAL-just-once.patch (application/x-patch, attachment)]
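The attachment is not reproduced in this log. The general idea of reading the special file just once and keeping it in a temporary pattern file can be sketched as follows. This is not the pushed patch: the helper name save_patterns, the file names, and the exit code are assumptions made for illustration.

```shell
#!/bin/sh
dir=$(mktemp -d) || exit 1
printf 'aaa\n' | gzip -c > "$dir/a.gz"
printf 'bbb\n' | gzip -c > "$dir/b.gz"

# Hypothetical helper: copy stdin to the given file and report write errors.
save_patterns() {
  cat > "$1" || { echo "cannot write pattern file $1" >&2; return 2; }
}

# Drain the one-shot pattern stream a single time into a regular file.
printf 'aaa\nbbb\n' | save_patterns "$dir/patterns" || exit 2

# A regular file can be reopened by every grep invocation.
found=$(
  for f in "$dir/a.gz" "$dir/b.gz"; do
    gzip -dc "$f" | grep -F -f "$dir/patterns"
  done
)
rm -rf "$dir"
printf '%s\n' "$found"
```

Unlike the -e approach, a temporary file is passed to grep byte for byte, so multi-line pattern lists need no escaping or alternation.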
bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Sat, 16 Apr 2016 11:24:03 GMT)
This bug report was last modified 9 years and 126 days ago.
GNU bug tracking system
Copyright (C) 1999 Darren O. Benham,
1997,2003 nCipher Corporation Ltd,
1994-97 Ian Jackson.