GNU bug report logs - #77800
grep-3.12: write-error-msg test failure on fedora rawhide (f43)


Package: grep;

Reported by: Jaroslav Škarvada <jskarvad <at> redhat.com>

Date: Mon, 14 Apr 2025 12:56:01 UTC

Severity: normal

Done: Jim Meyering <jim <at> meyering.net>

Bug is archived. No further changes may be made.



Report forwarded to bug-grep <at> gnu.org:
bug#77800; Package grep. (Mon, 14 Apr 2025 12:56:02 GMT) Full text and rfc822 format available.

Acknowledgement sent to Jaroslav Škarvada <jskarvad <at> redhat.com>:
New bug report received and forwarded. Copy sent to bug-grep <at> gnu.org. (Mon, 14 Apr 2025 12:56:02 GMT) Full text and rfc822 format available.

Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):

From: Jaroslav Škarvada <jskarvad <at> redhat.com>
To: bug-grep <bug-grep <at> gnu.org>
Subject: grep-3.12: write-error-msg test failure on fedora rawhide (f43)
Date: Mon, 14 Apr 2025 14:53:47 +0200
Hi,

test log is attached

thanks & regards

Jaroslav
[test-suite.log (text/x-log, attachment)]


Message #8 received at 77800 <at> debbugs.gnu.org (full text, mbox):

From: Jim Meyering <jim <at> meyering.net>
To: Jaroslav Škarvada <jskarvad <at> redhat.com>
Cc: 77800 <at> debbugs.gnu.org
Subject: Re: bug#77800: grep-3.12: write-error-msg test failure on fedora
 rawhide (f43)
Date: Mon, 14 Apr 2025 12:32:36 -0700
On Mon, Apr 14, 2025 at 5:56 AM Jaroslav Škarvada via Bug reports for
GNU grep <bug-grep <at> gnu.org> wrote:
> test log is attached

Hi Jaroslav,
Thanks for testing!

The test that's failing is ensuring that a disk-full write failure
elicits a diagnostic like this:
  grep: write error: No space left on device

On your system, it prints only this:
  grep: write error

Can you run the following and share the strace output file, "out"?
  yes 12345 | head -n 50000 > in
  strace -o out -- grep -q --help >/dev/full

We expect the first write to fail like this:
  write(1, "Usage: grep [OPTION]... PATTERNS"..., 4096) = -1 ENOSPC
(No space left on device)
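
The expected failure can also be checked without strace. The following is only a sketch, not part of grep or its test suite: the function name errno_from_dev_full_write is hypothetical, and it assumes a Linux-style /dev/full device that rejects every write with ENOSPC.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Attempt a single write(2) to /dev/full.
   Returns the errno produced by the failed write, 0 if the write
   unexpectedly succeeded, or -1 if /dev/full could not be opened
   (for example on a system that does not provide it).  */
int
errno_from_dev_full_write (void)
{
  int fd = open ("/dev/full", O_WRONLY);
  if (fd < 0)
    return -1;

  errno = 0;
  ssize_t n = write (fd, "x", 1);
  /* Save errno immediately: close() below may change it.  */
  int saved = (n < 0) ? errno : 0;

  close (fd);
  return saved;
}
```

On a system that has /dev/full, the return value should compare equal to ENOSPC, which is the condition the write-error-msg test expects grep to report.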





Message #11 received at 77800 <at> debbugs.gnu.org (full text, mbox):

From: Jaroslav Škarvada <jskarvad <at> redhat.com>
To: Jim Meyering <jim <at> meyering.net>
Cc: 77800 <at> debbugs.gnu.org
Subject: Re: bug#77800: grep-3.12: write-error-msg test failure on fedora
 rawhide (f43)
Date: Tue, 15 Apr 2025 12:52:09 +0200
On the system where the test failed (fedora 43/rawhide):
# yes 12345 | head -n 50000 > in
# strace -o out -- grep -q --help >/dev/full
grep: write error: No space left on device

write(1, "Usage: grep [OPTION]... PATTERNS"..., 4068) = -1 ENOSPC (No
space left on device)

That is with the system grep, version 3.11.

With the 3.12 build from the sources:
# yes 12345 | head -n 50000 > in
# strace -o out -- ./grep -q --help >/dev/full
./grep: write error

write(1, "Usage: grep [OPTION]... PATTERNS"..., 4096) = -1 ENOSPC (No
space left on device)

It was compiled the same way as 3.11, except that the 3.11 build ran
'autoreconf -fi'. That step was dropped because autoreconf now fails,
probably due to a version incompatibility:
configure:9031: error: possibly undefined macro: gl_ANYTHREADLIB_EARLY
      If this token and others are legitimate, please use m4_pattern_allow.
      See the Autoconf documentation.

autoconf-2.72

Jaroslav







Message #14 received at 77800 <at> debbugs.gnu.org (full text, mbox):

From: Paul Eggert <eggert <at> cs.ucla.edu>
To: Jaroslav Škarvada <jskarvad <at> redhat.com>
Cc: 77800 <at> debbugs.gnu.org
Subject: Re: bug#77800: grep-3.12: write-error-msg test failure on fedora
 rawhide (f43)
Date: Tue, 15 Apr 2025 11:22:07 -0700
Thanks for the bug report. What is the output of the following command 
on Fedora rawhide?

LC_ALL=C strace src/grep -q --help >/dev/full

On my Fedora 41 x86-64 host, the output of the above command ends as 
follows; what's different on your platform?

sigaltstack({ss_sp=0x429f40, ss_flags=0, ss_size=65536}, NULL) = 0
rt_sigaction(SIGSEGV, {sa_handler=0x41a370, sa_mask=[HUP INT QUIT USR1 
USR2 PIPE ALRM TERM CHLD URG XCPU XFSZ VTALRM PROF WINCH IO PWR], 
sa_flags=SA_RESTORER|SA_ONSTACK|SA_SIGINFO, sa_restorer=0x7fd8b3e1e050}, 
NULL, 8) = 0
rt_sigaction(SIGSEGV, {sa_handler=0x41a370, sa_mask=[HUP INT QUIT USR1 
USR2 PIPE ALRM TERM CHLD URG XCPU XFSZ VTALRM PROF WINCH IO PWR], 
sa_flags=SA_RESTORER|SA_ONSTACK|SA_SIGINFO, sa_restorer=0x7fd8b3e1e050}, 
NULL, 8) = 0
fstat(1, {st_mode=S_IFCHR|0666, st_rdev=makedev(0x1, 0x7), ...}) = 0
ioctl(1, TCGETS, 0x7ffc70791e50)        = -1 ENOTTY (Inappropriate ioctl 
for device)
write(1, "Usage: grep [OPTION]... PATTERNS"..., 4096) = -1 ENOSPC (No 
space left on device)
close(1)                                = 0
write(2, "src/grep: ", 10src/grep: )              = 10
write(2, "write error", 11write error)             = 11
write(2, ": No space left on device", 25: No space left on device) = 25
write(2, "\n", 1
)                       = 1
exit_group(2)                           = ?
+++ exited with 2 +++






Message #17 received at 77800 <at> debbugs.gnu.org (full text, mbox):

From: Paul Eggert <eggert <at> cs.ucla.edu>
To: Jaroslav Škarvada <jskarvad <at> redhat.com>
Cc: 77800 <at> debbugs.gnu.org
Subject: Re: bug#77800: grep-3.12: write-error-msg test failure on fedora
 rawhide (f43)
Date: Tue, 15 Apr 2025 11:52:14 -0700
Oh, sorry, I hadn't seen the email between you and Jim.

I suspect that rawhide's gettext doesn't preserve errno. Could you
please run the built grep under gdb as follows and say what happens to 
errno? The following transcript is on Fedora 41 where things work as 
expected (ENOSPC == 28):

$ gdb src/grep
GNU gdb (Fedora Linux) 16.2-2.fc41
Copyright (C) 2024 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later 
<http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from src/grep...
(gdb) b close_stream
Breakpoint 1 at 0x41c300: file close-stream.c, line 57.
(gdb) r -q --help >/dev/full
Starting program: /home/eggert/src/gnu/grep/src/grep -q --help >/dev/full

This GDB supports auto-downloading debuginfo from the following URLs:
  <https://debuginfod.fedoraproject.org/>
Enable debuginfod for this session? (y or [n]) y
Debuginfod has been enabled.
To make this setting permanent, add 'set debuginfod enabled on' to .gdbinit.
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".

Breakpoint 1, close_stream (stream=0x7ffff7ef15c0 <_IO_2_1_stdout_>)
    at close-stream.c:57
57	  const bool some_pending = (__fpending (stream) != 0);
(gdb) n
58	  const bool prev_fail = (ferror (stream) != 0);
(gdb) p some_pending
$1 = <optimized out>
(gdb) n
59	  const bool fclose_fail = (fclose (stream) != 0);
(gdb) n
69	  if (prev_fail || (fclose_fail && (some_pending || errno != EBADF)))
(gdb) p prev_fail
$2 = false
(gdb) p fclose_fail
$3 = true
(gdb) p some_pending
$4 = true
(gdb) p errno
$5 = 28
(gdb) n
close_stdout () at closeout.c:121
121	      char const *write_error = _("write error");
(gdb) n
122	      if (file_name)
(gdb) n
126	        error (0, errno, "%s", write_error);
(gdb) p errno
$6 = 28
(gdb) c
Continuing.
/home/eggert/src/gnu/grep/src/grep: write error: No space left on device
[Inferior 1 (process 96127) exited with code 02]
(gdb)





Message #20 received at 77800 <at> debbugs.gnu.org (full text, mbox):

From: Grisha Levit <grishalevit <at> gmail.com>
To: Jaroslav Škarvada <jskarvad <at> redhat.com>
Cc: 77800 <at> debbugs.gnu.org, Paul Eggert <eggert <at> cs.ucla.edu>,
 Jim Meyering <jim <at> meyering.net>
Subject: Re: bug#77800: grep-3.12: write-error-msg test failure on fedora
 rawhide (f43)
Date: Tue, 15 Apr 2025 18:14:16 -0400
On Tue, Apr 15, 2025 at 6:53 AM Jaroslav Škarvada via Bug reports for
GNU grep <bug-grep <at> gnu.org> wrote:
> With the 3.12 build from the sources:
> # yes 12345 | head -n 50000 > in
> # strace -o out -- ./grep -q --help >/dev/full
> ./grep: write error
>

This seems to be fixed if the patch here [1] is _not_ applied when
building the RPM

[1]: https://src.fedoraproject.org/rpms/grep/blob/rawhide/f/grep-3.5-help-align.patch





Message #23 received at 77800 <at> debbugs.gnu.org (full text, mbox):

From: jackson <at> fastmail.fm
To: "Grisha Levit" <grishalevit <at> gmail.com>,
 Jaroslav Škarvada <jskarvad <at> redhat.com>
Cc: 77800 <at> debbugs.gnu.org, Paul Eggert <eggert <at> cs.ucla.edu>,
 Jim Meyering <jim <at> meyering.net>
Subject: Re: bug#77800: grep-3.12: write-error-msg test failure on fedora
 rawhide (f43)
Date: Tue, 15 Apr 2025 17:37:23 -0500
>> This seems to be fixed if the patch here [1] is _not_ applied when
>> building the RPM

Can you check, perhaps using strace, whether that problematic patch pushes
the total number of bytes grep writes to stdout (here /dev/full) from a little
below 4096 bytes, written in a single write(2) system call, to more than
4096 bytes, which likely forces two write calls: the first 4096 bytes, then
the remainder?

Then I'd speculate further that having to go to a second write(2) system call
triggers additional logic, perhaps keyed off the 'stdout_error' flag being set,
that leads to the additional error output.

-- 
Paul Jackson
jackson <at> fastmail.fm





Message #26 received at 77800 <at> debbugs.gnu.org (full text, mbox):

From: Grisha Levit <grishalevit <at> gmail.com>
To: jackson <at> fastmail.fm
Cc: 77800 <at> debbugs.gnu.org, Paul Eggert <eggert <at> cs.ucla.edu>,
 Jaroslav Škarvada <jskarvad <at> redhat.com>,
 Jim Meyering <jim <at> meyering.net>
Subject: Re: bug#77800: grep-3.12: write-error-msg test failure on fedora
 rawhide (f43)
Date: Tue, 15 Apr 2025 19:12:18 -0400
On Tue, Apr 15, 2025 at 6:38 PM <jackson <at> fastmail.fm> wrote:
>
> >> This seems to be fixed if the patch here [1] is _not_ applied when
> >> building the RPM
>
> Can you check, perhaps using strace, whether that problematic patch pushes
> the total number of bytes grep writes to stdout (here /dev/full) from a little
> below 4096 bytes, written in a single write(2) system call, to more than
> 4096 bytes, which likely forces two write calls: the first 4096 bytes, then
> the remainder?
>

Indeed, the `--help' output is 4096 bytes normally and longer after the patch:

$ ./src/grep --help | wc -c
4096
$ patch ... && make -C src grep
$ ./src/grep --help | wc -c
4122

ltrace shows:

printf("General help using GNU software:"...,
"https://www.gnu.org/gethelp/") = 64
exit(0 <unfinished ...>
__fpending(0xffff9afe1510, 0, 0xffff9afe2280, 1) = 4096
fclose(0xffff9afe1510)                           = -1
dcgettext(0x41d480, 0x41c418, 5, 0x450000)       = 0x41c418
__errno_location()                               = 0xffff9b0da760
error(0, 28, 0x41c318, 0x41c418)                 = 0xffff9b0ebc90

vs

printf("General help using GNU software:"...,
"https://www.gnu.org/gethelp/") = -1
exit(0 <unfinished ...>
__fpending(0xffff813f1510, 0, 0xffff813f2280, 1) = 0
fclose(0xffff813f1510)                           = 0
__errno_location()                               = 0xffff814e9760
dcgettext(0x41d4a0, 0x41c418, 5, 0x450000)       = 0x41c418
__errno_location()                               = 0xffff814e9760
error(0, 0, 0x41c318, 0x41c418)                  = 0xffff814fac90

and gdb (with the patch applied):

(gdb) n
58   const bool prev_fail = (ferror (stream) != 0);
(gdb) n
close_stream (stream=0xfffff7ef1510 <_IO_2_1_stdout_>) at
/usr/include/bits/stdio.h:137
137   return __ferror_unlocked_body (__stream);
(gdb) n
close_stream (stream=0xfffff7ef1510 <_IO_2_1_stdout_>) at close-stream.c:59
59   const bool fclose_fail = (fclose (stream) != 0);
(gdb) n
69   if (prev_fail || (fclose_fail && (some_pending || errno != EBADF)))
(gdb) p prev_fail
$2 = true
(gdb) p fclose_fail
$3 = false
(gdb) p some_pending
$4 = false
(gdb) p errno
$5 = 28
(gdb) n
71       if (! fclose_fail)
(gdb) n
72         errno = 0;
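
The two runs above differ only in which flags are set when close_stream reaches line 69. A minimal model of that decision, assuming the lines visible in the gdb sessions (69, 71 and 72) are the whole of it, could look like the sketch below. The struct and function names are hypothetical and are not gnulib API; the real function also calls __fpending, ferror and fclose itself.

```c
#include <errno.h>
#include <stdbool.h>

/* Outcome of the modeled decision: whether close_stream would report
   failure, and the errno value its caller would then see.  */
struct close_result
{
  bool failed;
  int err;
};

/* Model of the branch at close-stream.c lines 69-72 as shown in the
   gdb transcripts.  ERR_AFTER_FCLOSE stands for the value of errno
   right after the fclose call.  */
struct close_result
model_close_stream (bool some_pending, bool prev_fail,
                    bool fclose_fail, int err_after_fclose)
{
  struct close_result r = { false, err_after_fclose };

  if (prev_fail
      || (fclose_fail && (some_pending || err_after_fclose != EBADF)))
    {
      r.failed = true;
      /* fclose itself succeeded, so errno cannot be trusted to
         describe this stream: clear it.  */
      if (!fclose_fail)
        r.err = 0;
    }
  return r;
}
```

With the flags from the 4096-byte run (some_pending and fclose_fail true, errno 28) the model keeps errno at 28, matching error(0, 28, ...). With the flags from the patched run (only prev_fail true) it clears errno, matching error(0, 0, ...) and the bare "write error" message.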





Message #29 received at 77800 <at> debbugs.gnu.org (full text, mbox):

From: Jim Meyering <jim <at> meyering.net>
To: Grisha Levit <grishalevit <at> gmail.com>
Cc: Jaroslav Škarvada <jskarvad <at> redhat.com>,
 Paul Eggert <eggert <at> cs.ucla.edu>, jackson <at> fastmail.fm, 77800 <at> debbugs.gnu.org
Subject: Re: bug#77800: grep-3.12: write-error-msg test failure on fedora
 rawhide (f43)
Date: Tue, 15 Apr 2025 18:54:23 -0700
Whoa. Thanks. It looks like this is a >19-year-old bug in close-stream.c.
Here's a tentative fix -- the ChangeLog entry still lacks details of
when it was introduced:
[close-stream.diff (application/octet-stream, attachment)]


Message #32 received at 77800 <at> debbugs.gnu.org (full text, mbox):

From: Grisha Levit <grishalevit <at> gmail.com>
To: Jim Meyering <jim <at> meyering.net>
Cc: Jaroslav Škarvada <jskarvad <at> redhat.com>,
 Paul Eggert <eggert <at> cs.ucla.edu>, jackson <at> fastmail.fm, 77800 <at> debbugs.gnu.org
Subject: Re: bug#77800: grep-3.12: write-error-msg test failure on fedora
 rawhide (f43)
Date: Tue, 15 Apr 2025 22:26:05 -0400
On Tue, Apr 15, 2025 at 9:54 PM Jim Meyering <jim <at> meyering.net> wrote:
> Whoa. Thanks. It looks like this is a >19-year-old bug in close-stream.c.
> Here's a tentative fix -- the ChangeLog entry still lacks details of
> when it was introduced:

-      if (! fclose_fail)
+      if (!fclose_fail && !prev_fail)
         errno = 0;

I wonder if this was intentional? Since close_stdout seems to be mostly used
as an atexit function, errno may well have been set for some unrelated
reason by the time this runs -- but errno will be relevant to this stream
if fclose has just failed.

(That's how I made sense of it anyway, on the assumption that it was not a bug)





Message #35 received at 77800 <at> debbugs.gnu.org (full text, mbox):

From: Jim Meyering <jim <at> meyering.net>
To: Grisha Levit <grishalevit <at> gmail.com>
Cc: Jaroslav Škarvada <jskarvad <at> redhat.com>,
 Paul Eggert <eggert <at> cs.ucla.edu>, jackson <at> fastmail.fm, 77800 <at> debbugs.gnu.org
Subject: Re: bug#77800: grep-3.12: write-error-msg test failure on fedora
 rawhide (f43)
Date: Tue, 15 Apr 2025 20:32:21 -0700
On Tue, Apr 15, 2025 at 7:26 PM Grisha Levit <grishalevit <at> gmail.com> wrote:
> On Tue, Apr 15, 2025 at 9:54 PM Jim Meyering <jim <at> meyering.net> wrote:
> > Whoa. Thanks. It looks like this is a >19-year-old bug in close-stream.c.
> > Here's a tentative fix -- the ChangeLog entry still lacks details of
> > when it was introduced:
>
> -      if (! fclose_fail)
> +      if (!fclose_fail && !prev_fail)
>          errno = 0;
>
> I wonder if this was intentional? Since close_stdout seems to be mostly used
> as an atexit function, errno may well have been set for some unrelated
> reason by the time this runs -- but errno will be relevant to this stream
> if fclose has just failed.

I retract that patch. It was obviously wrong: it would render the
errno=0 statement unreachable.
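
The unreachability can be confirmed by brute force. The sketch below assumes the retracted condition sits inside the existing outer test from line 69 of close-stream.c, as in the quoted diff. It enumerates every combination of the four inputs and counts those in which the patched inner test could still fire. The function name is hypothetical.

```c
#include <stdbool.h>

/* Count input combinations for which the outer test
     prev_fail || (fclose_fail && (some_pending || errno != EBADF))
   is true AND the retracted inner test
     !fclose_fail && !prev_fail
   is also true.  A count of zero means "errno = 0" would never run.  */
int
count_reachable_resets (void)
{
  int reachable = 0;

  for (int bits = 0; bits < 16; bits++)
    {
      bool prev_fail = (bits & 1) != 0;
      bool fclose_fail = (bits & 2) != 0;
      bool some_pending = (bits & 4) != 0;
      bool errno_is_ebadf = (bits & 8) != 0;

      bool outer = prev_fail
                   || (fclose_fail && (some_pending || !errno_is_ebadf));
      bool inner = !fclose_fail && !prev_fail;

      if (outer && inner)
        reachable++;
    }
  return reachable;
}
```

The outer test can only be true when prev_fail or fclose_fail is true, and the inner test requires both to be false, so the count is zero.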





Message #38 received at 77800 <at> debbugs.gnu.org (full text, mbox):

From: Jim Meyering <jim <at> meyering.net>
To: Grisha Levit <grishalevit <at> gmail.com>
Cc: Jaroslav Škarvada <jskarvad <at> redhat.com>,
 Paul Eggert <eggert <at> cs.ucla.edu>, jackson <at> fastmail.fm, 77800 <at> debbugs.gnu.org
Subject: Re: bug#77800: grep-3.12: write-error-msg test failure on fedora
 rawhide (f43)
Date: Tue, 15 Apr 2025 21:17:30 -0700
On Tue, Apr 15, 2025 at 8:32 PM Jim Meyering <jim <at> meyering.net> wrote:
> On Tue, Apr 15, 2025 at 7:26 PM Grisha Levit <grishalevit <at> gmail.com> wrote:
> > On Tue, Apr 15, 2025 at 9:54 PM Jim Meyering <jim <at> meyering.net> wrote:
> > > Whoa. Thanks. It looks like this is a >19-year-old bug in close-stream.c.
> > > Here's a tentative fix -- the ChangeLog entry still lacks details of
> > > when it was introduced:
> >
> > -      if (! fclose_fail)
> > +      if (!fclose_fail && !prev_fail)
> >          errno = 0;
> >
> > I wonder if this was intentional? Since close_stdout seems to be mostly used
> > as an atexit function, errno may well have been set for some unrelated
> > reason by the time this runs -- but errno will be relevant to this stream
> > if fclose has just failed.
>
> I retract that patch. It was obviously wrong: it would render the
> errno=0 statement unreachable.

We're going to have to revise that code.
The difference I see is that before rawhide, that fclose would fail.
It's perfectly fine for fclose to succeed in this case, as now happens
on rawhide (because with 4k BUFSIZ, the fclose wrote nothing -- the
preceding 4096-byte write is what failed).

Here's a better patch: (technically, we could factor it somewhat, but
readability would suffer disproportionately)
[close-stream2.diff (application/octet-stream, attachment)]
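
To see which of the three flags a given libc sets for a given output size, a small probe along the following lines may help. It is a sketch under several assumptions: a Linux-style /dev/full, a libc that provides __fpending in <stdio_ext.h> (glibc does), and a 4096-byte fully buffered stream standing in for grep's stdout. The names stream_probe and probe_dev_full are hypothetical. The results depend on the libc and on how the data is written, so no particular outcome is claimed here.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdio_ext.h>

struct stream_probe
{
  bool opened;        /* false if /dev/full is unavailable */
  bool some_pending;  /* __fpending (stream) != 0 before fclose */
  bool prev_fail;     /* ferror (stream) != 0 before fclose */
  bool fclose_fail;   /* fclose (stream) != 0 */
  int err;            /* errno after fclose; meaningful only if it failed */
};

/* Write NBYTES bytes to /dev/full through a 4096-byte stdio buffer,
   ignoring write errors as a program relying on close_stdout would,
   then record the same three flags that close_stream inspects.  */
struct stream_probe
probe_dev_full (size_t nbytes)
{
  static char buffer[4096];
  struct stream_probe p = { false, false, false, false, 0 };

  FILE *stream = fopen ("/dev/full", "w");
  if (stream == NULL)
    return p;
  p.opened = true;
  setvbuf (stream, buffer, _IOFBF, sizeof buffer);

  for (size_t i = 0; i < nbytes; i++)
    putc ('x', stream);          /* return value deliberately unchecked */

  p.some_pending = (__fpending (stream) != 0);
  p.prev_fail = (ferror (stream) != 0);
  errno = 0;                     /* so that err reflects fclose only */
  p.fclose_fail = (fclose (stream) != 0);
  p.err = errno;
  return p;
}
```

Calling probe_dev_full with sizes just below and just above 4096 and printing the fields shows whether the failure surfaces before fclose (prev_fail) or in fclose itself (fclose_fail), which is the distinction discussed above.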


Message #41 received at 77800 <at> debbugs.gnu.org (full text, mbox):

From: Jim Meyering <jim <at> meyering.net>
To: Grisha Levit <grishalevit <at> gmail.com>,
 "bug-gnulib <at> gnu.org List" <bug-gnulib <at> gnu.org>
Cc: Jaroslav Škarvada <jskarvad <at> redhat.com>,
 Paul Eggert <eggert <at> cs.ucla.edu>, jackson <at> fastmail.fm, 77800 <at> debbugs.gnu.org
Subject: Re: bug#77800: grep-3.12: write-error-msg test failure on fedora
 rawhide (f43)
Date: Thu, 17 Apr 2025 22:18:39 -0700
On Tue, Apr 15, 2025 at 9:17 PM Jim Meyering <jim <at> meyering.net> wrote:
...
> We're going to have to revise that code.
> The difference I see is that before rawhide, that fclose would fail.
> It's perfectly fine for fclose to succeed in this case, as now happens
> on rawhide (because with 4k BUFSIZ, the fclose wrote nothing -- the
> preceding 4096-byte write is what failed).
>
> Here's a better patch: (technically, we could factor it somewhat, but
> readability would suffer disproportionately)

I didn't take the time to find a precise commit, but this bug predates
the move from closeout.c to gnulib's close-stdout.c in 2006. As I
write this, I'm installing Fedora 42.
I'll probably push the attached to gnulib tomorrow:
[close-stream-vs-F42.diff (application/octet-stream, attachment)]


Message #44 received at 77800 <at> debbugs.gnu.org (full text, mbox):

From: Grisha Levit <grishalevit <at> gmail.com>
To: Jim Meyering <jim <at> meyering.net>
Cc: jackson <at> fastmail.fm,
 Jaroslav Škarvada <jskarvad <at> redhat.com>,
 "bug-gnulib <at> gnu.org List" <bug-gnulib <at> gnu.org>,
 Paul Eggert <eggert <at> cs.ucla.edu>, 77800 <at> debbugs.gnu.org
Subject: Re: bug#77800: grep-3.12: write-error-msg test failure on fedora
 rawhide (f43)
Date: Fri, 18 Apr 2025 01:35:40 -0400
On Fri, Apr 18, 2025, 01:18 Jim Meyering <jim <at> meyering.net> wrote:
>
> On Tue, Apr 15, 2025 at 9:17 PM Jim Meyering <jim <at> meyering.net> wrote:
> ...
> > We're going to have to revise that code.
> > The difference I see is that before rawhide, that fclose would fail.
> > It's perfectly fine for fclose to succeed in this case, as now happens
> > on rawhide (because with 4k BUFSIZ, the fclose wrote nothing -- the
> > preceding 4096-byte write is what failed).
> >
> > Here's a better patch: (technically, we could factor it somewhat, but
> > readability would suffer disproportionately)
>
> I didn't take the time to find a precise commit, but this bug predates
> the move from closeout.c to gnulib's close-stdout.c in 2006. As I
> write this, I'm installing Fedora 42.
> I'll probably push the attached to gnulib tomorrow:


> Exposed via Fedora 42's new glibc vs grep's --help being precisely
> 4096 bytes.

AFAICT this is not related to F42 or a new glibc; it's just the longer help
text in grep-3.12 plus the Fedora patch making it even longer.  But you should
see the same behavior on existing systems with e.g.:

    $ { env printf %4095s; env printf %4096s; } > /dev/full
    printf: write error: Broken pipe
    printf: write error





Message #47 received at 77800 <at> debbugs.gnu.org (full text, mbox):

From: Jim Meyering <jim <at> meyering.net>
To: Grisha Levit <grishalevit <at> gmail.com>
Cc: jackson <at> fastmail.fm,
 Jaroslav Škarvada <jskarvad <at> redhat.com>,
 "bug-gnulib <at> gnu.org List" <bug-gnulib <at> gnu.org>,
 Paul Eggert <eggert <at> cs.ucla.edu>, 77800 <at> debbugs.gnu.org
Subject: Re: bug#77800: grep-3.12: write-error-msg test failure on fedora
 rawhide (f43)
Date: Thu, 17 Apr 2025 22:51:12 -0700
On Thu, Apr 17, 2025 at 10:35 PM Grisha Levit <grishalevit <at> gmail.com> wrote:
> On Fri, Apr 18, 2025, 01:18 Jim Meyering <jim <at> meyering.net> wrote:
> >
> > On Tue, Apr 15, 2025 at 9:17 PM Jim Meyering <jim <at> meyering.net> wrote:
> > ...
> > > We're going to have to revise that code.
> > > The difference I see is that before rawhide, that fclose would fail.
> > > It's perfectly fine for fclose to succeed in this case, as now happens
> > > on rawhide (because with 4k BUFSIZ, the fclose wrote nothing -- the
> > > preceding 4096-byte write is what failed).
> > >
> > > Here's a better patch: (technically, we could factor it somewhat, but
> > > readability would suffer disproportionately)
> >
> > I didn't take the time to find a precise commit, but this bug predates
> > the move from closeout.c to gnulib's close-stdout.c in 2006. As I
> > write this, I'm installing Fedora 42.
> > I'll probably push the attached to gnulib tomorrow:
>
> > Exposed via Fedora 42's new glibc vs grep's --help being precisely
> > 4096 bytes.
>
> AFAICT this is not related to F42 or new glibc, it's just longer help
> text in grep-3.12 + Fedora patch making it even longer.  But you should
> see the same behavior on existing systems with e.g.:
>
>     $ { env printf %4095s; env printf %4096s; } > /dev/full
>     printf: write error: Broken pipe
>     printf: write error

Nice. Thanks. Will adjust the commit log and ChangeLog.
I confirmed that coreutils-9.7's printf with this fix is no longer
susceptible to that failure.

Surprised to find that coreutils-9.5 (fedora 41 stock) works fine:

  $ { /bin/printf %4095s; /bin/printf %4096s; } > /dev/full
  /bin/printf: write error: No space left on device
  /bin/printf: write error: No space left on device





Message #50 received at 77800 <at> debbugs.gnu.org (full text, mbox):

From: Grisha Levit <grishalevit <at> gmail.com>
To: Jim Meyering <jim <at> meyering.net>
Cc: jackson <at> fastmail.fm,
 Jaroslav Škarvada <jskarvad <at> redhat.com>,
 "bug-gnulib <at> gnu.org List" <bug-gnulib <at> gnu.org>,
 Paul Eggert <eggert <at> cs.ucla.edu>, 77800 <at> debbugs.gnu.org
Subject: Re: bug#77800: grep-3.12: write-error-msg test failure on fedora
 rawhide (f43)
Date: Fri, 18 Apr 2025 02:15:37 -0400
On Fri, Apr 18, 2025 at 1:51 AM Jim Meyering <jim <at> meyering.net> wrote:
>
> Surprised to find that coreutils-9.5 (fedora 41 stock) works fine:
>
>   $ { /bin/printf %4095s; /bin/printf %4096s; } > /dev/full
>   /bin/printf: write error: No space left on device
>   /bin/printf: write error: No space left on device

Though OTOH (not sure why):

    $ { /bin/printf %4096s; /bin/printf %4097s; } > /dev/full
    /bin/printf: write error: No space left on device
    /bin/printf: write error

----

But I'm concerned about this change, because programs where:

    1. errno is set by a failed (unchecked) write
    2. errno is then set by some unrelated function
    3. close_stdout() is arranged to run atexit
    4. when close_stream() calls fclose() it does not fail

will now erroneously report the error from the unrelated function
when running close_stdout.

Previously:

    $ stdbuf -oL realpath -e . xxx >/dev/full
    realpath: xxx: No such file or directory
    realpath: write error

After the patch:

    $ stdbuf -oL src/realpath -e . xxx >/dev/full
    realpath: xxx: No such file or directory
    realpath: write error: No such file or directory
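
Steps 1 and 2 of the scenario above can be reproduced in isolation. The sketch below is only an illustration with hypothetical names. It uses a write to the invalid descriptor -1 (which fails with EBADF) as a stand-in for the failed write to /dev/full, and an open of a path assumed not to exist as the unrelated failing call.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

struct errno_trace
{
  int after_write;      /* errno right after the failed write */
  int after_unrelated;  /* errno after a later, unrelated failure */
};

struct errno_trace
stale_errno_demo (void)
{
  struct errno_trace t = { 0, 0 };

  /* Step 1: a write fails and nobody checks it.  Descriptor -1 is
     never valid, so write(2) fails with EBADF.  */
  ssize_t n = write (-1, "x", 1);
  (void) n;
  t.after_write = errno;

  /* Step 2: an unrelated call fails later and overwrites errno.
     The path is hypothetical and assumed not to exist.  */
  int fd = open ("/nonexistent-hypothetical-dir/xxx", O_RDONLY);
  if (fd >= 0)
    close (fd);
  t.after_unrelated = errno;

  return t;
}
```

By the time an atexit handler such as close_stdout runs, only the second value is left in errno, so a diagnostic built from it would describe the unrelated failure, as in the realpath output above.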





Message #53 received at 77800 <at> debbugs.gnu.org (full text, mbox):

From: Jim Meyering <jim <at> meyering.net>
To: Grisha Levit <grishalevit <at> gmail.com>
Cc: jackson <at> fastmail.fm,
 Jaroslav Škarvada <jskarvad <at> redhat.com>,
 "bug-gnulib <at> gnu.org List" <bug-gnulib <at> gnu.org>,
 Paul Eggert <eggert <at> cs.ucla.edu>, 77800 <at> debbugs.gnu.org
Subject: Re: bug#77800: grep-3.12: write-error-msg test failure on fedora
 rawhide (f43)
Date: Thu, 17 Apr 2025 23:58:40 -0700
On Thu, Apr 17, 2025 at 11:15 PM Grisha Levit <grishalevit <at> gmail.com> wrote:
>
> On Fri, Apr 18, 2025 at 1:51 AM Jim Meyering <jim <at> meyering.net> wrote:
> >
> > Surprised to find that coreutils-9.5 (fedora 41 stock) works fine:
> >
> >   $ { /bin/printf %4095s; /bin/printf %4096s; } > /dev/full
> >   /bin/printf: write error: No space left on device
> >   /bin/printf: write error: No space left on device
>
> Though OTOH (not sure why):
>
>     $ { /bin/printf %4096s; /bin/printf %4097s; } > /dev/full
>     /bin/printf: write error: No space left on device
>     /bin/printf: write error
>
> ----
>
> But I'm concerned about this change, because programs where:
>
>     1. errno is set by a failed (unchecked) write
>     2. errno is then set by some unrelated function
>     3. close_stdout() is arranged to run atexit
>     4. when close_stream() calls fclose() it does not fail
>
> will now erroneously report the error from the unrelated function
> when running close_stdout.
>
> Previously:
>
>     $ stdbuf -oL realpath -e . xxx >/dev/full
>     realpath: xxx: No such file or directory
>     realpath: write error
>
> After the patch:
>
>     $ stdbuf -oL src/realpath -e . xxx >/dev/full
>     realpath: xxx: No such file or directory
>     realpath: write error: No such file or directory

I think I've finally paged back in all of that from 20 years ago.
And I agree: there is no need for my most recent proposed change.
When there was a "prev_failure" but fclose itself succeeds, we cannot
guarantee errno is relevant, so clearing it **is** appropriate.
Now, as for what changed in F42 to make us go from printing the ENOSPC
expansion to not printing it, so far I haven't reproduced the failure.
Just built there and this works just as it does on F41:

$ src/grep --help > /dev/full

                                      :
src/grep: write error: No space left on device
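
For reference, the errno rule discussed above can be sketched in C. This
is only an illustration of the decision, not gnulib's close_stream:
errno_to_report is a hypothetical helper, and the -1 return for "nothing
failed" is an assumption made for the example.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical helper (not part of gnulib): decide which errno value an
   exit-time "write error" diagnostic should carry.
   Returns -1 if nothing failed, 0 for a bare "write error" (errno is not
   trustworthy), or the errno value left by the failing fclose.  */
int
errno_to_report (bool prev_fail, bool fclose_fail, int errno_after_fclose)
{
  if (fclose_fail)
    return errno_after_fclose;  /* errno was just set by fclose itself */
  if (prev_fail)
    return 0;                   /* earlier write failed; errno may be stale */
  return -1;                    /* no failure to report */
}
```

Under this rule, the realpath case quoted above (an earlier write failed,
errno was later overwritten with ENOENT, and fclose succeeded) yields 0,
i.e. a plain "write error" with no misleading suffix.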




Information forwarded to bug-grep <at> gnu.org:
bug#77800; Package grep. (Fri, 18 Apr 2025 07:03:05 GMT) Full text and rfc822 format available.

Message #56 received at 77800 <at> debbugs.gnu.org (full text, mbox):

From: Grisha Levit <grishalevit <at> gmail.com>
To: Jim Meyering <jim <at> meyering.net>
Cc: jackson <at> fastmail.fm,
 Jaroslav Škarvada <jskarvad <at> redhat.com>,
 "bug-gnulib <at> gnu.org List" <bug-gnulib <at> gnu.org>,
 Paul Eggert <eggert <at> cs.ucla.edu>, 77800 <at> debbugs.gnu.org
Subject: Re: bug#77800: grep-3.12: write-error-msg test failure on fedora
 rawhide (f43)
Date: Fri, 18 Apr 2025 03:02:21 -0400
On Fri, Apr 18, 2025 at 2:58 AM Jim Meyering <jim <at> meyering.net> wrote:
>
> Now, as for what changed in F42 to make us go from printing the ENOSPC
> expansion to not printing it, so far I haven't reproduced the failure.
> Just built there and this works just as it does on F41:
>
> $ src/grep --help > /dev/full
>
>                                       :
> src/grep: write error: No space left on device

To reproduce the test failure originally reported, apply the patch I
mentioned in https://lists.gnu.org/r/bug-grep/2025-04/msg00015.html,
i.e.: https://src.fedoraproject.org/rpms/grep/blob/rawhide/f/grep-3.5-help-align.patch




Information forwarded to bug-grep <at> gnu.org:
bug#77800; Package grep. (Fri, 18 Apr 2025 10:41:04 GMT) Full text and rfc822 format available.

Message #59 received at 77800 <at> debbugs.gnu.org (full text, mbox):

From: Pádraig Brady <P <at> draigBrady.com>
To: Jim Meyering <jim <at> meyering.net>, Grisha Levit <grishalevit <at> gmail.com>,
 "bug-gnulib <at> gnu.org List" <bug-gnulib <at> gnu.org>
Cc: Jaroslav Škarvada <jskarvad <at> redhat.com>,
 Paul Eggert <eggert <at> cs.ucla.edu>, jackson <at> fastmail.fm, 77800 <at> debbugs.gnu.org
Subject: Re: bug#77800: grep-3.12: write-error-msg test failure on fedora
 rawhide (f43)
Date: Fri, 18 Apr 2025 11:40:17 +0100
On 18/04/2025 06:18, Jim Meyering wrote:
> On Tue, Apr 15, 2025 at 9:17 PM Jim Meyering <jim <at> meyering.net> wrote:
> ...
>> We're going to have to revise that code.
>> The difference I see is that before rawhide, that fclose would fail.
>> It's perfectly fine for fclose to succeed in this case, as now happens
>> on rawhide (because with 4k BUFSIZ, the fclose wrote nothing -- the
>> preceding 4096-byte write is what failed).
>>
>> Here's a better patch: (technically, we could factor it somewhat, but
>> readability would suffer disproportionately)
> 
> I didn't take the time to find a precise commit, but this bug predates
> the move from closeout.c to gnulib's close-stdout.c in 2006. As I
> write this, I'm installing Fedora 42.
> I'll probably push the attached to gnulib tomorrow:

The variance here seems to be due to stdio buffer size:

 $ for b in 0 4096 8192; do
     echo bs=$b
     for ws in 4095 4096 4097; do
       printf "ws=$ws: "
       stdbuf -o${b} printf %${ws}s >/dev/full
     done
   done

 bs=0
 ws=4095: printf: write error
 ws=4096: printf: write error
 ws=4097: printf: write error
 bs=4096
 ws=4095: printf: write error: No space left on device
 ws=4096: printf: write error
 ws=4097: printf: write error
 bs=8192
 ws=4095: printf: write error: No space left on device
 ws=4096: printf: write error: No space left on device
 ws=4097: printf: write error: No space left on device

I.e. write() always gives ENOSPC, it's just whether it's called or not.
I.e. we get the more exact error if it's latent until we fflush() at exit.

$ for i in 4095 4096 4097; do ltrace -e fwrite -e fclose -e fflush -e ferror printf %${i}s >/dev/full; done
printf->fwrite("                                "..., 1, 4095, 0x7f15ca8225c0)                = 4095
printf->fflush(0x7f15ca8225c0)                                                                = -1
printf->fclose(0x7f15ca8225c0)                                                                = 0
printf: write error: No space left on device
+++ exited (status 1) +++
printf->fwrite("                                "..., 1, 4096, 0x7fb38377e5c0)                = 0
printf->ferror(0x7fb38377e5c0)                                                                = 1
printf->fflush(0x7fb38377e5c0)                                                                = 0
printf->fclose(0x7fb38377e5c0)                                                                = 0
printf: write error
+++ exited (status 1) +++
printf->fwrite("                                "..., 1, 4097, 0x7fd0372695c0)                = 0
printf->ferror(0x7fd0372695c0)                                                                = 1
printf->fflush(0x7fd0372695c0)                                                                = 0
printf->fclose(0x7fd0372695c0)                                                                = 0
printf: write error
+++ exited (status 1) +++
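
The same effect can be probed from C. This is a minimal sketch, assuming
a Linux-style /dev/full; probe_dev_full and the enum names are
hypothetical, made up for this example. It reports whether the failure
surfaced at fwrite time or stayed latent until fflush, together with the
errno seen at that moment.

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

enum probe_result
{
  PROBE_UNAVAILABLE,  /* /dev/full missing, or setup failed */
  FAILED_AT_FWRITE,   /* the error surfaced inside fwrite */
  FAILED_AT_FFLUSH,   /* fwrite only buffered; fflush hit the error */
  NO_FAILURE
};

/* Write NBYTES spaces to /dev/full through a fully buffered stream whose
   buffer is BUFSIZE bytes, and store the errno observed in *ERR.  */
enum probe_result
probe_dev_full (size_t bufsize, size_t nbytes, int *err)
{
  enum probe_result r = PROBE_UNAVAILABLE;
  FILE *f = fopen ("/dev/full", "w");
  char *buf = malloc (bufsize + 1);
  char *data = malloc (nbytes + 1);

  if (f && buf && data && bufsize > 0
      && setvbuf (f, buf, _IOFBF, bufsize) == 0)
    {
      memset (data, ' ', nbytes);
      errno = 0;
      if (fwrite (data, 1, nbytes, f) != nbytes)
        r = FAILED_AT_FWRITE;
      else if (fflush (f) != 0)
        r = FAILED_AT_FFLUSH;
      else
        r = NO_FAILURE;
      *err = errno;
    }

  if (f)
    fclose (f);  /* close before freeing the buffer the stream still uses */
  free (buf);
  free (data);
  return r;
}
```

With a buffer larger than the data (say 8192 bytes of buffer for 100
bytes of data) the failure should surface at fflush with ENOSPC. When the
data fills or exceeds the buffer, the outputs above suggest it surfaces
at fwrite instead, and a program that does not check right there may no
longer have a meaningful errno by exit time.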

If a utility wants to give a more exact error, it could operate like certain coreutils commands
and issue a specific error at fwrite (or other stdio function) call time.
Note that one has to be careful not to output multiple errors in that case,
so it's not always appropriate to follow that course, but if it is appropriate
(to exit immediately), then see the coreutils write_error() function:
https://github.com/coreutils/coreutils/blob/0d04b985/src/system.h#L750-L757
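
A minimal sketch of diagnosing at call time, loosely following the idea
above rather than copying the coreutils code. fwrite_or_errno is a
hypothetical name, and the EIO fallback is an assumption for the case
where the C library leaves errno unset.

```c
#include <errno.h>
#include <stdio.h>

/* Hypothetical helper (not the coreutils function): write N bytes and
   flush right away, so a failure is seen while errno is still fresh.
   Returns 0 on success, otherwise the errno captured at the failure.  */
int
fwrite_or_errno (FILE *stream, const void *buf, size_t n)
{
  errno = 0;
  if (fwrite (buf, 1, n, stream) == n && fflush (stream) == 0)
    return 0;

  /* ISO C does not promise that errno is set here, hence the fallback.  */
  int saved_errno = errno ? errno : EIO;

  /* Clear the error flag so that an exit-time check on the same stream
     does not report this failure a second time.  */
  clearerr (stream);
  return saved_errno;
}
```

A caller would print a single diagnostic with the returned value and exit
immediately. Flushing after every write gives up the benefit of
buffering, so this only suits programs for which an exact message matters
more than throughput.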

Now printf (vasnprintf) doesn't diagnose such write errors immediately,
so --help output may not be the route to ensure an ENOSPC error.
A hacky solution might be to change the test to use --version rather than --help,
to output a smaller amount of data, but that assumes the output is buffered.
It might be appropriate instead to just not look for the specific ENOSPC error.

cheers,
Pádraig




Information forwarded to bug-grep <at> gnu.org:
bug#77800; Package grep. (Fri, 18 Apr 2025 11:07:02 GMT) Full text and rfc822 format available.

Message #62 received at 77800 <at> debbugs.gnu.org (full text, mbox):

From: Bruno Haible <bruno <at> clisp.org>
To: Jim Meyering <jim <at> meyering.net>, Grisha Levit <grishalevit <at> gmail.com>,
 bug-gnulib <at> gnu.org
Cc: Jaroslav Škarvada <jskarvad <at> redhat.com>,
 Paul Eggert <eggert <at> cs.ucla.edu>, jackson <at> fastmail.fm,
 Pádraig Brady <P <at> draigbrady.com>, 77800 <at> debbugs.gnu.org
Subject: Re: bug#77800: grep-3.12: write-error-msg test failure on fedora
 rawhide (f43)
Date: Fri, 18 Apr 2025 13:05:53 +0200
Pádraig Brady wrote:
>   bs=0
>   ws=4095: printf: write error
>   ws=4096: printf: write error
>   ws=4097: printf: write error
>   bs=4096
>   ws=4095: printf: write error: No space left on device
>   ws=4096: printf: write error
>   ws=4097: printf: write error
>   bs=8192
>   ws=4095: printf: write error: No space left on device
>   ws=4096: printf: write error: No space left on device
>   ws=4097: printf: write error: No space left on device

Yep. There are cases where the errno value gets lost (due to the ISO C
standard, and even in glibc the errno value gets lost in such cases).

For the user, both messages are nearly equivalent, since a full disk
is the most likely reason for a write error.

> It might be appropriate instead to just not look for the specific ENOSPC error.

+1

Bruno







Information forwarded to bug-grep <at> gnu.org:
bug#77800; Package grep. (Mon, 21 Apr 2025 07:21:05 GMT) Full text and rfc822 format available.

Message #65 received at 77800 <at> debbugs.gnu.org (full text, mbox):

From: Jaroslav Škarvada <jskarvad <at> redhat.com>
To: Grisha Levit <grishalevit <at> gmail.com>
Cc: jackson <at> fastmail.fm, Paul Eggert <eggert <at> cs.ucla.edu>,
 "bug-gnulib <at> gnu.org List" <bug-gnulib <at> gnu.org>,
 Jim Meyering <jim <at> meyering.net>, 77800 <at> debbugs.gnu.org
Subject: Re: bug#77800: grep-3.12: write-error-msg test failure on fedora
 rawhide (f43)
Date: Mon, 21 Apr 2025 09:19:50 +0200
On Fri, Apr 18, 2025 at 7:35 AM Grisha Levit <grishalevit <at> gmail.com> wrote:
>
> On Fri, Apr 18, 2025, 01:18 Jim Meyering <jim <at> meyering.net> wrote:
> >
> > On Tue, Apr 15, 2025 at 9:17 PM Jim Meyering <jim <at> meyering.net> wrote:
> > ...
> > > We're going to have to revise that code.
> > > The difference I see is that before rawhide, that fclose would fail.
> > > It's perfectly fine for fclose to succeed in this case, as now happens
> > > on rawhide (because with 4k BUFSIZ, the fclose wrote nothing -- the
> > > preceding 4096-byte write is what failed).
> > >
> > > Here's a better patch: (technically, we could factor it somewhat, but
> > > readability would suffer disproportionately)
> >
> > I didn't take the time to find a precise commit, but this bug predates
> > the move from closeout.c to gnulib's close-stdout.c in 2006. As I
> > write this, I'm installing Fedora 42.
> > I'll probably push the attached to gnulib tomorrow:
>
>
> > Exposed via Fedora 42's new glibc vs grep's --help being precisely
> > 4096 bytes.
>
> AFAICT this is not related to F42 or new glibc, it's just longer help
> text in grep-3.12 + Fedora patch making it even longer.  But you should
> see the same behavior on existing systems with e.g.:
>
>     $ { env printf %4095s; env printf %4096s; } > /dev/full
>     printf: write error: No space left on device
>     printf: write error
>
Guys, thanks for the analysis and quick actions. BTW is the Fedora
help patch which uncovered this problem acceptable for upstream?

thanks & regards

Jaroslav





Reply sent to Jim Meyering <jim <at> meyering.net>:
You have taken responsibility. (Wed, 14 May 2025 18:21:02 GMT) Full text and rfc822 format available.

Notification sent to Jaroslav Škarvada <jskarvad <at> redhat.com>:
bug acknowledged by developer. (Wed, 14 May 2025 18:21:02 GMT) Full text and rfc822 format available.

Message #70 received at 77800-done <at> debbugs.gnu.org (full text, mbox):

From: Jim Meyering <jim <at> meyering.net>
To: Jaroslav Škarvada <jskarvad <at> redhat.com>
Cc: 77800-done <at> debbugs.gnu.org, jackson <at> fastmail.fm,
 Paul Eggert <eggert <at> cs.ucla.edu>,
 "bug-gnulib <at> gnu.org List" <bug-gnulib <at> gnu.org>,
 Grisha Levit <grishalevit <at> gmail.com>
Subject: Re: bug#77800: grep-3.12: write-error-msg test failure on fedora
 rawhide (f43)
Date: Wed, 14 May 2025 11:19:59 -0700
[Message part 1 (text/plain, inline)]
On Mon, Apr 21, 2025 at 12:20 AM Jaroslav Škarvada <jskarvad <at> redhat.com> wrote:
> >     $ { env printf %4095s; env printf %4096s; } > /dev/full
> >     printf: write error: No space left on device
> >     printf: write error
> >
> Guys, thanks for the analysis and quick actions. BTW is the Fedora
> help patch which uncovered this problem acceptable for upstream?

The two spaces between each long option name and its description are
required by help2man. I.e., this change would mess up the generated
documentation:

-  -L, --files-without-match  print only names of FILEs with no selected lines\n\
+  -L, --files-without-match print only names of FILEs with no selected lines\n\

I've just pushed the attached to avoid the false test failure.
[grep-tests-write-error-msg.diff (application/octet-stream, attachment)]

bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Thu, 12 Jun 2025 11:24:05 GMT) Full text and rfc822 format available.

This bug report was last modified 7 days ago.


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.