GNU bug report logs -
#64540
[PATCH] od: exit out on failure to write to stdout
Previous Next
Reported by: Josef Bacik <josef <at> toxicpanda.com>
Date: Sun, 9 Jul 2023 09:09:02 UTC
Severity: normal
Tags: patch
Done: Pádraig Brady <P <at> draigBrady.com>
Bug is archived. No further changes may be made.
To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 64540 in the body.
You can then email your comments to 64540 AT debbugs.gnu.org in the normal way.
Toggle the display of automated, internal messages from the tracker.
Report forwarded
to
bug-coreutils <at> gnu.org
:
bug#64540
; Package
coreutils
.
(Sun, 09 Jul 2023 09:09:02 GMT)
Full text and
rfc822 format available.
Acknowledgement sent
to
Josef Bacik <josef <at> toxicpanda.com>
:
New bug report received and forwarded. Copy sent to
bug-coreutils <at> gnu.org
.
(Sun, 09 Jul 2023 09:09:02 GMT)
Full text and
rfc822 format available.
Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):
A very weird bug was uncovered when using fstests with github actions.
In fstests we are doing something like this
od /dev/urandom | dd of=somefile bs=1M count=10
The above works fine, except in the case of github actions which runs
this script remotely, capturing the output in a pipe to print on their
console. Somehow od ends up writing forever into the stdout pipe
without exiting in this scenario, normally it doesn't do this.
strace'ing the process we can see it getting SIGPIPE and simply ignoring
the error. This is because when we don't provide a limit we never
actually check if there was an error on the stdout fd. Fix this by
adjusting the logic to check if there was an error on every loop, and
break out if appropriate. With this patch applied this configuration no
longer results in a hang.
---
src/od.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/src/od.c b/src/od.c
index 6b66ceb4f..a07b4de03 100644
--- a/src/od.c
+++ b/src/od.c
@@ -1380,7 +1380,7 @@ dump (void)
if (limit_bytes_to_format)
{
- while (true)
+ while (ok)
{
size_t n_needed;
if (current_offset >= end_offset)
@@ -1396,13 +1396,14 @@ dump (void)
affirm (n_bytes_read == bytes_per_block);
write_block (current_offset, n_bytes_read,
block[!idx], block[idx]);
+ ok &= check_and_close(0);
current_offset += n_bytes_read;
idx = !idx;
}
}
else
{
- while (true)
+ while (ok)
{
ok &= read_block (bytes_per_block, block[idx], &n_bytes_read);
if (n_bytes_read < bytes_per_block)
@@ -1410,6 +1411,7 @@ dump (void)
affirm (n_bytes_read == bytes_per_block);
write_block (current_offset, n_bytes_read,
block[!idx], block[idx]);
+ ok &= check_and_close(0);
current_offset += n_bytes_read;
idx = !idx;
}
--
2.41.0
Information forwarded
to
bug-coreutils <at> gnu.org
:
bug#64540
; Package
coreutils
.
(Sun, 09 Jul 2023 09:31:01 GMT)
Full text and
rfc822 format available.
Message #8 received at 64540 <at> debbugs.gnu.org (full text, mbox):
On 2023-07-08 15:43, Josef Bacik wrote:
> A very weird bug was uncovered when using fstests with github actions.
> In fstests we are doing something like this
>
> od /dev/urandom | dd of=somefile bs=1M count=10
>
> The above works fine, except in the case of github actions which runs
> this script remotely, capturing the output in a pipe to print on their
> console. Somehow od ends up writing forever into the stdout pipe
> without exiting in this scenario, normally it doesn't do this.
Do you see the same problem if you use 'cat' rather than 'od'? If so,
the problem isn't with 'od'; it's with the environment, which is somehow
set up to ignore SIGPIPE. It shouldn't do that, as ignoring SIGPIPE
breaks a lot of programs, and 'od' would be just one of them.
Although the patch you sent is probably harmless, it's not clear that we
should go through all the GNU utilities and make them check for I/O
errors on every output, as that'd be a maintenance hassle and in some
cases would hurt performance significantly.
Information forwarded
to
bug-coreutils <at> gnu.org
:
bug#64540
; Package
coreutils
.
(Sun, 09 Jul 2023 10:15:02 GMT)
Full text and
rfc822 format available.
Message #11 received at 64540 <at> debbugs.gnu.org (full text, mbox):
On Jul 09 2023, Paul Eggert wrote:
> Do you see the same problem if you use 'cat' rather than 'od'? If so, the
> problem isn't with 'od'; it's with the environment, which is somehow set
> up to ignore SIGPIPE. It shouldn't do that, as ignoring SIGPIPE breaks a
> lot of programs, and 'od' would be just one of them.
All other utilities (including cat) handle ignored SIGPIPE correctly.
$ (trap '' PIPE; cat /dev/zero | :)
cat: write error: Broken pipe
--
Andreas Schwab, schwab <at> linux-m68k.org
GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510 2552 DF73 E780 A9DA AEC1
"And now for something completely different."
Information forwarded
to
bug-coreutils <at> gnu.org
:
bug#64540
; Package
coreutils
.
(Sun, 09 Jul 2023 14:12:01 GMT)
Full text and
rfc822 format available.
Message #14 received at 64540 <at> debbugs.gnu.org (full text, mbox):
On 09/07/2023 10:29, Paul Eggert wrote:
> On 2023-07-08 15:43, Josef Bacik wrote:
>> A very weird bug was uncovered when using fstests with github actions.
>> In fstests we are doing something like this
>>
>> od /dev/urandom | dd of=somefile bs=1M count=10
>>
>> The above works fine, except in the case of github actions which runs
>> this script remotely, capturing the output in a pipe to print on their
>> console. Somehow od ends up writing forever into the stdout pipe
>> without exiting in this scenario, normally it doesn't do this.
>
> Do you see the same problem if you use 'cat' rather than 'od'? If so,
> the problem isn't with 'od'; it's with the environment, which is somehow
> set up to ignore SIGPIPE. It shouldn't do that, as ignoring SIGPIPE
> breaks a lot of programs, and 'od' would be just one of them.
>
> Although the patch you sent is probably harmless, it's not clear that we
> should go through all the GNU utilities and make them check for I/O
> errors on every output, as that'd be a maintenance hassle and in some
> cases would hurt performance significantly.
Note the patch looks wrong as it would close the input always.
We can fix that up easily enough anyway.
RE the general point about handling write errors in coreutils,
many other cases are handled implicitly or explicitly,
but some are not. For example these will inf loop:
pr /dev/urandom > /dev/full
od /dev/urandom > /dev/full
It shouldn't be too onerous to handle these cases explicitly.
An example where we explicitly handled it for seq was:
https://github.com/coreutils/coreutils/commit/c92585b10
I'll create patches to handle the above, and any others I notice.
For my reference a short list of utils to check (that might inf loop) are:
cat comm cp cut dd dirname expand factor fmt fold head join ls nl numfmt
od paste pr printf ptx readlink realpath seq shuf sort stat tac tail
tee tr tsort unexpand uniq wc yes
Note we had a similar issue recently on the read side,
where all read errors were not diagnosed, which we checked with:
https://github.com/coreutils/coreutils/commit/7ea7c020e
I'll come up with a similar test to diagnose write errors
for the shortlist of utils above.
cheers,
Pádraig
Information forwarded
to
bug-coreutils <at> gnu.org
:
bug#64540
; Package
coreutils
.
(Sun, 09 Jul 2023 19:12:02 GMT)
Full text and
rfc822 format available.
Message #17 received at 64540 <at> debbugs.gnu.org (full text, mbox):
On 2023-07-09 07:11, Pádraig Brady wrote:
> Note the patch looks wrong as it would close the input always.
> We can fix that up easily enough anyway.
If it's easy and doesn't hurt performance in the usual case, that's of
course fine.
> For my reference a short list of utils to check (that might inf loop) are:
> cat comm cp cut dd dirname expand factor fmt fold head join ls nl numfmt
> od paste pr printf ptx readlink realpath seq shuf sort stat tac tail
> tee tr tsort unexpand uniq wc yes
There's no point to complicating programs that have bounded input for
other reasons, as they'll report errors soon enough for those other
reasons. The main problem comes from programs like 'cat' and 'od' that
can read from "infinite" input like /dev/urandom and can write indefinitely.
So I think we needn't worry about dirname, ptx, readlink, realpath,
sort, stat, tac, tsort, wc. For example, we needn't worry about 'sort'
because it needs to read all its input before outputting anything, and
it cannot read all of /dev/urandom. As long as these programs eventually
check for write errors, which I expect they all do, that should be good
enough.
Even for programs like 'od' there is no need to check every output
function; just check occasionally, when it's convenient.
Information forwarded
to
bug-coreutils <at> gnu.org
:
bug#64540
; Package
coreutils
.
(Sun, 09 Jul 2023 22:56:01 GMT)
Full text and
rfc822 format available.
Message #20 received at 64540 <at> debbugs.gnu.org (full text, mbox):
On 09/07/2023 20:11, Paul Eggert wrote:
> On 2023-07-09 07:11, Pádraig Brady wrote:
>
>> Note the patch looks wrong as it would close the input always.
>> We can fix that up easily enough anyway.
>
> If it's easy and doesn't hurt performance in the usual case, that's of
> course fine.
>
>
>> For my reference a short list of utils to check (that might inf loop) are:
>> cat comm cp cut dd dirname expand factor fmt fold head join ls nl numfmt
>> od paste pr printf ptx readlink realpath seq shuf sort stat tac tail
>> tee tr tsort unexpand uniq wc yes
>
> There's no point to complicating programs that have bounded input for
> other reasons, as they'll report errors soon enough for those other
> reasons. The main problem comes from programs like 'cat' and 'od' that
> can read from "infinite" input like /dev/urandom and can write indefinitely.
>
> So I think we needn't worry about dirname, ptx, readlink, realpath,
> sort, stat, tac, tsort, wc. For example, we needn't worry about 'sort'
> because it needs to read all its input before outputting anything, and
> it cannot read all of /dev/urandom. As long as these programs eventually
> check for write errors, which I expect they all do, that should be good
> enough.
>
> Even for programs like 'od' there is no need to check every output
> function; just check occasionally, when it's convenient.
Yes completely agreed.
The shortlist above is for my reference for commands to review,
not for commands to adjust.
cheers,
Pádraig
Information forwarded
to
bug-coreutils <at> gnu.org
:
bug#64540
; Package
coreutils
.
(Sat, 15 Jul 2023 21:09:02 GMT)
Full text and
rfc822 format available.
Message #23 received at 64540 <at> debbugs.gnu.org (full text, mbox):
[Message part 1 (text/plain, inline)]
On 09/07/2023 23:54, Pádraig Brady wrote:
> On 09/07/2023 20:11, Paul Eggert wrote:
>> On 2023-07-09 07:11, Pádraig Brady wrote:
>>
>>> Note the patch looks wrong as it would close the input always.
>>> We can fix that up easily enough anyway.
>>
>> If it's easy and doesn't hurt performance in the usual case, that's of
>> course fine.
>>
>>
>>> For my reference a short list of utils to check (that might inf loop) are:
>>> cat comm cp cut dd dirname expand factor fmt fold head join ls nl numfmt
>>> od paste pr printf ptx readlink realpath seq shuf sort stat tac tail
>>> tee tr tsort unexpand uniq wc yes
>>
>> There's no point to complicating programs that have bounded input for
>> other reasons, as they'll report errors soon enough for those other
>> reasons. The main problem comes from programs like 'cat' and 'od' that
>> can read from "infinite" input like /dev/urandom and can write indefinitely.
>>
>> So I think we needn't worry about dirname, ptx, readlink, realpath,
>> sort, stat, tac, tsort, wc. For example, we needn't worry about 'sort'
>> because it needs to read all its input before outputting anything, and
>> it cannot read all of /dev/urandom. As long as these programs eventually
>> check for write errors, which I expect they all do, that should be good
>> enough.
>>
>> Even for programs like 'od' there is no need to check every output
>> function; just check occasionally, when it's convenient.
>
> Yes completely agreed.
> The shortlist above is for my reference for commands to review,
> not for commands to adjust.
The attached patch-set addresses two classes of issue:
1. Doubled error messages upon write errors
2. Continued processing upon write errors (the orig problem reported).
For class 1, we include fixes for:
expand, factor, paste, shuf, tr, unexpand
For class 2, we include fixes for:
od, uniq, cut, comm, join
For class 2, we do _not_ include fixes for these utils
which are more awkward / less important in this context:
fmt, fold, nl, numfmt, pr
I'll update NEWS with this info before pushing.
cheers,
Pádraig
[0001-tests-ensure-utilties-exit-promptly-upon-write-error.patch (text/x-patch, attachment)]
[0002-all-avoid-repeated-diagnostic-upon-write-error.patch (text/x-patch, attachment)]
[0003-od-promptly-diagnose-write-errors.patch (text/x-patch, attachment)]
[0004-uniq-promptly-diagnose-write-errors.patch (text/x-patch, attachment)]
[0005-cut-promptly-diagnose-write-errors.patch (text/x-patch, attachment)]
[0006-comm-promptly-diagnose-write-errors.patch (text/x-patch, attachment)]
[0007-join-promptly-diagnose-write-errors.patch (text/x-patch, attachment)]
Information forwarded
to
bug-coreutils <at> gnu.org
:
bug#64540
; Package
coreutils
.
(Mon, 17 Jul 2023 06:37:02 GMT)
Full text and
rfc822 format available.
Message #26 received at 64540 <at> debbugs.gnu.org (full text, mbox):
On 7/15/23 23:08, Pádraig Brady wrote:
> The attached patch-set addresses two classes of issue:
> 1. Doubled error messages upon write errors
> 2. Continued processing upon write errors (the orig problem reported).
Nice work!
Patch 1:
> --- /dev/null
> +++ b/tests/misc/write-errors.sh
> @@ -0,0 +1,70 @@
> +if ! test -w /dev/full || ! test -c /dev/full; then
> + skip_ 'Need /dev/full'
Other tests use a slightly different skip message:
skip_ '/dev/full is required'
I'd favor the latter, so that we don't miss it once we'd like
to factor it out to a require_dev_full in init.cfg.
Re. patch 002:
> all: avoid repeated diagnostic upon write error
>
> * cfg.mk (sc_some_programs_must_avoid_exit_failure): Adjust to
> avoid false positive.
> (sc_prohibit_exit_write_error): A new syntax check to prohibit
> open coding error(..., "write error"); instead directiing to use...
s/ii/i/
> --- a/src/system.h
> +++ b/src/system.h
> @@ -762,6 +762,14 @@ The following directory is part of the cycle:\n %s\n"), \
> } \
> while (0)
>
> +static inline void
> +write_error (void)
> +{
> + fflush (stdout); /* Ensure nothing buffered that might induce an error. */
> + clearerr (stdout); /* To avoid extraneous diagnostic from close_stdout. */
> + error (EXIT_FAILURE, errno, _("write error"));
> +}
Hmm, fflush could theoretically change errno, couldn't it?
In that case, error() would give a wrong diagnostic.
FWIW: `man clearerr` states that this function doesn't touch errno.
Thanks & have a nice day,
Berny
Reply sent
to
Pádraig Brady <P <at> draigBrady.com>
:
You have taken responsibility.
(Mon, 17 Jul 2023 10:32:02 GMT)
Full text and
rfc822 format available.
Notification sent
to
Josef Bacik <josef <at> toxicpanda.com>
:
bug acknowledged by developer.
(Mon, 17 Jul 2023 10:32:02 GMT)
Full text and
rfc822 format available.
Message #31 received at 64540-done <at> debbugs.gnu.org (full text, mbox):
On 17/07/2023 07:35, Bernhard Voelker wrote:
> On 7/15/23 23:08, Pádraig Brady wrote:
>> The attached patch-set addresses two classes of issue:
>> 1. Doubled error messages upon write errors
>> 2. Continued processing upon write errors (the orig problem reported).
>
> Nice work!
>
> Patch 1:
>
> > --- /dev/null
> > +++ b/tests/misc/write-errors.sh
> > @@ -0,0 +1,70 @@
>
> > +if ! test -w /dev/full || ! test -c /dev/full; then
> > + skip_ 'Need /dev/full'
>
> Other tests use a slightly different skip message:
> skip_ '/dev/full is required'
>
> I'd favor the latter, so that we don't miss it once we'd like
> to factor it out to a require_dev_full in init.cfg.
>
> Re. patch 002:
>
> > all: avoid repeated diagnostic upon write error
> >
> > * cfg.mk (sc_some_programs_must_avoid_exit_failure): Adjust to
> > avoid false positive.
> > (sc_prohibit_exit_write_error): A new syntax check to prohibit
> > open coding error(..., "write error"); instead directiing to use...
>
> s/ii/i/
>
> > --- a/src/system.h
> > +++ b/src/system.h
> > @@ -762,6 +762,14 @@ The following directory is part of the cycle:\n %s\n"), \
> > } \
> > while (0)
> >
> > +static inline void
> > +write_error (void)
> > +{
> > + fflush (stdout); /* Ensure nothing buffered that might induce an error. */
> > + clearerr (stdout); /* To avoid extraneous diagnostic from close_stdout. */
> > + error (EXIT_FAILURE, errno, _("write error"));
> > +}
>
> Hmm, fflush could theoretically change errno, couldn't it?
> In that case, error() would give a wrong diagnostic.
> FWIW: `man clearerr` states that this function doesn't touch errno.
Yes good call.
Current version is:
/* exit with a _single_ "write error" diagnostic. */
static inline void
write_error (void)
{
int saved_errno = errno;
fflush (stdout); /* Ensure nothing buffered that might induce an error. */
clearerr (stdout); /* To avoid extraneous diagnostic from close_stdout. */
error (EXIT_FAILURE, saved_errno, _("write error"));
}
I'll push later.
Marking this as done.
Thanks for the review.
Pádraig
Information forwarded
to
bug-coreutils <at> gnu.org
:
bug#64540
; Package
coreutils
.
(Mon, 17 Jul 2023 17:05:02 GMT)
Full text and
rfc822 format available.
Message #34 received at 64540-done <at> debbugs.gnu.org (full text, mbox):
On 2023-07-17 03:31, Pádraig Brady wrote:
> static inline void
As a general rule, there's no need for 'static inline' in C, as nowadays
compilers figure out inlining just fine for static functions and plain
'static' should be good enough. There are exceptions but 'write_error'
doesn't look like it's one of them.
Information forwarded
to
bug-coreutils <at> gnu.org
:
bug#64540
; Package
coreutils
.
(Mon, 17 Jul 2023 17:13:02 GMT)
Full text and
rfc822 format available.
Message #37 received at 64540-done <at> debbugs.gnu.org (full text, mbox):
On 17/07/2023 18:04, Paul Eggert wrote:
> On 2023-07-17 03:31, Pádraig Brady wrote:
>> static inline void
>
> As a general rule, there's no need for 'static inline' in C, as nowadays
> compilers figure out inlining just fine for static functions and plain
> 'static' should be good enough. There are exceptions but 'write_error'
> doesn't look like it's one of them.
Right. In headers though, the traditional "static inline" idiom
indicates to the compiler that this function is a small
utility function that may not be used in all translation units
that the header is included in. I.e. without the inline,
in some translation units you'd get:
error: 'write_error' defined but not used [-Werror=unused-function]
cheers,
Pádraig.
Information forwarded
to
bug-coreutils <at> gnu.org
:
bug#64540
; Package
coreutils
.
(Mon, 17 Jul 2023 17:37:01 GMT)
Full text and
rfc822 format available.
Message #40 received at 64540-done <at> debbugs.gnu.org (full text, mbox):
On 2023-07-17 10:12, Pádraig Brady wrote:
> Right. In headers though, the traditional "static inline" idiom
Ah, I forgot that it was in system.h. At some point we should have
system.h use _GL_INLINE but that can wait.
bug archived.
Request was from
Debbugs Internal Request <help-debbugs <at> gnu.org>
to
internal_control <at> debbugs.gnu.org
.
(Tue, 15 Aug 2023 11:24:07 GMT)
Full text and
rfc822 format available.
This bug report was last modified 1 year and 303 days ago.
Previous Next
GNU bug tracking system
Copyright (C) 1999 Darren O. Benham,
1997,2003 nCipher Corporation Ltd,
1994-97 Ian Jackson.