GNU bug report logs - #54035
Patch for easier use in scripting pipelines

Package: grep;

Reported by: Ulrich Eckhardt <ulrich.eckhardt <at> base-42.de>

Date: Thu, 17 Feb 2022 07:58:01 UTC

Severity: normal

Done: Paul Eggert <eggert <at> cs.ucla.edu>

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 54035 in the body.
You can then email your comments to 54035 AT debbugs.gnu.org in the normal way.


Report forwarded to bug-grep <at> gnu.org:
bug#54035; Package grep. (Thu, 17 Feb 2022 07:58:01 GMT)

Acknowledgement sent to Ulrich Eckhardt <ulrich.eckhardt <at> base-42.de>:
New bug report received and forwarded. Copy sent to bug-grep <at> gnu.org. (Thu, 17 Feb 2022 07:58:02 GMT)

Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):

From: Ulrich Eckhardt <ulrich.eckhardt <at> base-42.de>
To: bug-grep <at> gnu.org
Subject: Patch for easier use in scripting pipelines
Date: Thu, 17 Feb 2022 08:57:10 +0100
Greetings!

The attached patch adds a `--pipe` option to grep. When used, grep
only exits with nonzero status on error. In particular, it
doesn't signal "match" / "no match" through the exit code.

Here's an example using Bash:

  # enable automatic error handling
  set -eo pipefail
  # grep for issues in a logfile to produce a report
  cat logfile | grep issue | sort --unique

If grep doesn't find "issue" in its input (which is not an error,
obviously), it exits with status 1. Bash interprets this nonzero exit
code as an error and terminates with an error itself.

In order to fix that bug in the above script, you currently have to
replace `grep ...` with `grep ... || [ $? = 1 ]`, which is not really
readable. As an alternative, I've implemented a `--pipe` option, which
only returns nonzero on actual errors, but not when there is no match.
This is a bit of a complementary option to `--quiet`.
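
For illustration, the `|| [ $? = 1 ]` workaround mentioned above can be sketched as follows. This is only a minimal sketch, assuming Bash; the sample input from `printf` is hypothetical and stands in for the logfile. The braces keep the status check attached to grep alone, so that only "no match" is turned into success:

```shell
#!/usr/bin/env bash
set -eo pipefail

# Hypothetical sample input that contains no line matching "issue".
# grep's status 1 (no match) is mapped to 0 by the check in braces;
# a status of 2 or more (a real error) still makes the pipeline fail.
report=$(printf 'all good\nstill fine\n' | { grep issue || [ $? = 1 ]; } | sort --unique)

echo "report: '${report}'"
```

Without the braces and the status check, the assignment would fail under pipefail and the script would stop before the `echo`.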

Open tasks here:
 * FSF paperwork is not finished, so obviously the patch can't be
   applied yet.
 * Should I add a `-p` to complement the long `--pipe`?
 * Should I call it `--pipe` at all? The other alternative I came up
   with was `--filter`. I don't really like either of them very much.


Cheers!

Uli
[gnu-grep-pipe-option.patch (text/x-patch, attachment)]

Information forwarded to bug-grep <at> gnu.org:
bug#54035; Package grep. (Fri, 18 Feb 2022 03:06:02 GMT)

Message #8 received at 54035 <at> debbugs.gnu.org (full text, mbox):

From: Paul Eggert <eggert <at> cs.ucla.edu>
To: Ulrich Eckhardt <ulrich.eckhardt <at> base-42.de>
Cc: 54035 <at> debbugs.gnu.org
Subject: Re: bug#54035: Patch for easier use in scripting pipelines
Date: Thu, 17 Feb 2022 19:05:45 -0800
On 2/16/22 23:57, Ulrich Eckhardt wrote:
> In order to fix that bug in the above script, you currently have to
> replace `grep ...` with `grep ... || [ $? = 1 ]`, which is not really
> readable.

Actually, appending something "|| test $? -eq 1" looks readable to me; 
plus, it already works and is portable to non-GNU systems which is a 
plus. Furthermore, it also works with other programs that also return 
0,1,>1 depending on success,failure,error (e.g., 'cmp', 'diff', 'sort'), 
and it doesn't sound like much of a win to add unportable --pipe options 
to every such program.

And it's not just commands like 'cmp' and 'grep'. The following causes 
Bash to exit on GNU/Linux:

set -eo pipefail
cat /usr/share/dict/american-english | grep -l '^'

This is not because of anything 'grep' does, as 'grep' exits with status 
zero. It's because 'cat' exits with nonzero status. Surely we shouldn't 
add a --pipe option to 'cat' too.

Scripts that use "set -eo pipefail" need to be verrrry careful 
regardless of what we do with 'grep'; and if they are careful it won't 
help much to add a --pipe option to 'grep'.




Information forwarded to bug-grep <at> gnu.org:
bug#54035; Package grep. (Fri, 18 Feb 2022 08:28:02 GMT)

Message #11 received at 54035 <at> debbugs.gnu.org (full text, mbox):

From: Ulrich Eckhardt <ulrich.eckhardt <at> base-42.de>
To: Paul Eggert <eggert <at> cs.ucla.edu>
Cc: 54035 <at> debbugs.gnu.org
Subject: Re: bug#54035: Patch for easier use in scripting pipelines
Date: Fri, 18 Feb 2022 09:27:15 +0100
On Thu, 17 Feb 2022 19:05:45 -0800
Paul Eggert <eggert <at> cs.ucla.edu> wrote:
> On 2/16/22 23:57, Ulrich Eckhardt wrote:
> > In order to fix that bug in the above script, you currently have to
> > replace `grep ...` with `grep ... || [ $? = 1 ]`, which is not
> > really readable.
> 
> Actually, appending something "|| test $? -eq 1" looks readable to
> me; plus, it already works and is portable to non-GNU systems which
> is a plus. Furthermore, it also works with other programs that also
> return 0,1,>1 depending on success,failure,error (e.g., 'cmp',
> 'diff', 'sort'), and it doesn't sound like much of a win to add
> unportable --pipe options to every such program.

I wasn't aware of those, but I'd say they are candidates as well! I
receive the result of the grep operation on stdout, so I don't want any
categorization via the exit code.

I'm also still not 100% sure on the idea, which is why I'm putting this
up for discussion. I think my main two arguments are

 - Shell scripting uses nonzero exit codes to signal errors by default.
   grep is an exception here (as are cmp, diff and sort as I understand
   you), and for a reason, but it doesn't provide an option to use
   default behaviour.
 - Also, just arguing in the context of grep, there is an option to
   "just tell me if there was a match, don't give me the results" and I
   want a complementary "just give me the results, don't tell me if
   there was a match".


> And it's not just commands like 'cmp' and 'grep'. The following
> causes Bash to exit on GNU/Linux:
> 
> set -eo pipefail
> cat /usr/share/dict/american-english | grep -l '^'
> 
> This is not because of anything 'grep' does, as 'grep' exits with
> status zero. It's because 'cat' exits with nonzero status. Surely we
> shouldn't add a --pipe option to 'cat' too.

It doesn't do that here, I wonder why you are seeing an error there?
Also, if it did, that would signal an actual error, which is behaviour
I'm completely fine with.


That said, I received feedback from another side (busybox) which is
strongly against a `-p` alias, because BSD grep already uses that one.


Thank you for your input, Paul, I'm enjoying this exchange!


Uli




Information forwarded to bug-grep <at> gnu.org:
bug#54035; Package grep. (Sat, 19 Feb 2022 16:02:02 GMT)

Message #14 received at 54035 <at> debbugs.gnu.org (full text, mbox):

From: Paul Eggert <eggert <at> cs.ucla.edu>
To: Ulrich Eckhardt <ulrich.eckhardt <at> base-42.de>
Cc: 54035 <at> debbugs.gnu.org
Subject: Re: bug#54035: Patch for easier use in scripting pipelines
Date: Sat, 19 Feb 2022 08:01:20 -0800
On 2/18/22 00:27, Ulrich Eckhardt wrote:

>   - Shell scripting uses nonzero exit codes to signal errors by default.
>     grep is an exception here (as are cmp, diff and sort as I understand
>     you),

I'm sure there are other exceptions. And there's a long tradition for 
these exceptions. It doesn't sound realistic to upend this longstanding 
practice, or to add options to every such program.

>   - Also, just arguing in the context of grep, there is an option to
>     "just tell me if there was a match, don't give me the results"

That's for efficiency; grep can be waaay faster with -q. There is no 
efficiency argument for the changes you're proposing.
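
As an illustration of the -q use case, here is a minimal sketch, assuming Bash; the variable `log_text` and its contents are hypothetical. Only the exit status is consumed, and because grep runs as an `if` condition, a "no match" status of 1 does not trigger `set -e`:

```shell
#!/usr/bin/env bash
set -eo pipefail

# Hypothetical log text used only for this illustration.
log_text=$'boot ok\nissue: disk nearly full\nshutdown ok'

# -q suppresses output; grep may stop reading at the first match.
# A command tested by "if" is exempt from "set -e".
if grep -q issue <<< "$log_text"; then
  found=yes
else
  found=no
fi

echo "issue present: $found"
```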

>> The following
>> causes Bash to exit on GNU/Linux:
>>
>> set -eo pipefail
>> cat /usr/share/dict/american-english | grep -l '^'
>>
>> This is not because of anything 'grep' does, as 'grep' exits with
>> status zero. It's because 'cat' exits with nonzero status. Surely we
>> shouldn't add a --pipe option to 'cat' too.
> 
> It doesn't do that here, I wonder why you are seeing an error there?

Possibly you're using a GNU/Linux variant where the dictionary is 
located elsewhere? Or you have something in your .profile? I'm running 
Ubuntu 21.10 x86-64, and if I run this shell command:

bash --norc --noprofile -c 'set -eo pipefail; cat /usr/share/dict/american-english | grep -l "^"; echo done'

the output is:

(standard input)

which means that Bash exited without doing the 'echo done'.

> Also, if it did, that would signal an actual error, which is behaviour
> I'm completely fine with.

There is no actual error, in that the "cat ... | grep ..." command does 
just what I wanted it to: grep reported that standard input contained a 
match (which it did). This may help to explain my previous remark that 
scripts that use "set -eo pipefail" need to be verrrry careful.




Information forwarded to bug-grep <at> gnu.org:
bug#54035; Package grep. (Thu, 24 Feb 2022 07:12:01 GMT)

Message #17 received at 54035 <at> debbugs.gnu.org (full text, mbox):

From: Ulrich Eckhardt <ulrich.eckhardt <at> base-42.de>
To: Paul Eggert <eggert <at> cs.ucla.edu>
Cc: 54035 <at> debbugs.gnu.org
Subject: Re: bug#54035: Patch for easier use in scripting pipelines
Date: Thu, 24 Feb 2022 08:11:37 +0100
On Sat, 19 Feb 2022 08:01:20 -0800
Paul Eggert <eggert <at> cs.ucla.edu> wrote:
> bash --norc --noprofile -c 'set -eo pipefail; cat /usr/share/dict/american-english | grep -l "^"; echo done'
> 
> the output is:
> 
> (standard input)
> 
> which means that Bash exited without doing the 'echo done'.

I'm running fish as shell, and that doesn't do that. Using bash, it
behaves as you describe. Just for my understanding, grep stops reading
when it finds the first match and then the shell closes the output
stream of cat. That in turn causes cat to fail (exit code 141, meaning
SIGPIPE), because it can't write the rest of the data that it wants,
right?


In any case, you have delivered some convincing arguments. I'll turn my
attention to the manpage instead. The benchmark (if that's what you
want to call it) that brought me to this was that grep behaviour
confused me and that I couldn't find anything even from reading the
docs. I think that short reads (which could cause SIGPIPE) and the
non-error exit code 1 deserve mention there. I'll take a look and
perhaps file another patch.


So long...


Uli




Reply sent to Paul Eggert <eggert <at> cs.ucla.edu>:
You have taken responsibility. (Thu, 24 Feb 2022 19:10:02 GMT)

Notification sent to Ulrich Eckhardt <ulrich.eckhardt <at> base-42.de>:
bug acknowledged by developer. (Thu, 24 Feb 2022 19:10:02 GMT)

Message #22 received at 54035-done <at> debbugs.gnu.org (full text, mbox):

From: Paul Eggert <eggert <at> cs.ucla.edu>
To: Ulrich Eckhardt <ulrich.eckhardt <at> base-42.de>
Cc: 54035-done <at> debbugs.gnu.org
Subject: Re: bug#54035: Patch for easier use in scripting pipelines
Date: Thu, 24 Feb 2022 11:09:05 -0800
On 2/23/22 23:11, Ulrich Eckhardt wrote:

> Just for my understanding, grep stops reading
> when it finds the first match and then the shell closes the output
> stream of cat. That in turn causes cat to fail (exit code 141, meaning
> SIGPIPE), because it can't write the rest of the data that it wants,
> right?

Right.
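
For readers who want to observe this themselves, here is a minimal sketch, assuming Bash on GNU/Linux. `seq` is an assumption standing in for `cat` on a large file, since the dictionary path may not exist everywhere. It runs without `set -e` so the per-command statuses can be inspected through `PIPESTATUS`:

```shell
#!/usr/bin/env bash
# No "set -e" here, so the script continues and can inspect the statuses.

# The producer writes far more than a pipe can buffer; grep -l exits
# after the first match, so the producer's next write hits a closed pipe.
seq 1 1000000 | grep -l '^'
statuses=("${PIPESTATUS[@]}")   # copy at once; the next command overwrites it

producer_status=${statuses[0]}
grep_status=${statuses[1]}
echo "producer: $producer_status, grep: $grep_status"
```

On a typical GNU/Linux system the producer status should be 141 (128 plus the SIGPIPE signal number 13) while grep reports 0; with pipefail enabled, that nonzero producer status becomes the status of the whole pipeline.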

> I think that short reads (which could cause SIGPIPE) and the
> non-error exit code 1 deserve mention there. I'll take a look and
> perhaps file another patch.

I installed the attached to try to document this better.
[0001-doc-mention-issues-with-set-e.patch (text/x-patch, attachment)]

bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Fri, 25 Mar 2022 11:24:06 GMT)

This bug report was last modified 3 years and 145 days ago.


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.