GNU bug report logs - #77613: grep-3.11.69-a4628 on GNU/Hurd
To reply to this bug, email your comments to 77613 AT debbugs.gnu.org.
Report forwarded to bug-grep <at> gnu.org: bug#77613; Package grep. (Mon, 07 Apr 2025 17:34:02 GMT)
Acknowledgement sent to Bruno Haible <bruno <at> clisp.org>: New bug report received and forwarded. Copy sent to bug-grep <at> gnu.org. (Mon, 07 Apr 2025 17:34:02 GMT)
Message #5 received at submit <at> debbugs.gnu.org:
On
- GNU/Hurd x86_64 from 2024,
- GNU/Hurd i386 from 2023,
I see a test hang: hash-collision-perf.
On GNU/Hurd x86_64:
When I interrupted the build, the file 'in' had 5120000 lines; the log
file of this test is attached. As you can see, the value of
small_ms stays 0 even for larger files.
By running
$ date; LC_ALL=C ../../src/grep --file=in empty; date
I can see that the execution times grow like this:
640000 0.3 sec
1280000 0.9 sec
2560000 1.5 sec
5120000 > 60 sec
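A minimal sketch of how such wall-clock figures could be collected in one loop. The helper name `wall_seconds` is hypothetical, and it assumes a `date` that supports `+%s` (GNU date does). It has whole-second resolution only, so it is coarser than the figures above.

```shell
#!/bin/sh
# wall_seconds CMD...: print the whole seconds of wall-clock time CMD took.
# It brackets the command with `date +%s`, so it does not depend on
# per-process CPU-time accounting.
wall_seconds() {
  start=$(date +%s)
  "$@" >/dev/null 2>&1
  end=$(date +%s)
  echo $((end - start))
}

# Hypothetical usage, mirroring the command in this report:
#   : > empty
#   for n in 640000 1280000 2560000; do
#     seq "$n" > in
#     echo "$n $(wall_seconds env LC_ALL=C ../../src/grep --file=in empty) sec"
#   done
```

The larger sizes are left out of the usage loop on purpose, since they are the ones reported to hang the machine.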
On GNU/Hurd i386, it's similar. Here the grep execution hangs when the
file 'in' has 40960000 lines. Find attached the
last stack trace I was able to obtain before it hung.
Regardless of how much RAM I give to the machine, there will always
be a point where "grep --file=in empty" will take more RAM than
available, and (since Hurd does not have an OOM killer) the machine
then hangs.
IMO, the correct behaviour would be that 'grep' exits via xalloc_die(),
not that it hangs.
Whereas on GNU/Linux (in a machine that has the same amount of RAM as
the GNU/Hurd machine):
$ : > empty
$ seq 640000 > in; LC_ALL=C time ./src/grep --file=in empty
real 0.44s
$ seq 1280000 > in; LC_ALL=C time ./src/grep --file=in empty
real 0.99s
$ seq 2560000 > in; LC_ALL=C time ./src/grep --file=in empty
real 2.22s
$ seq 5120000 > in; LC_ALL=C time ./src/grep --file=in empty
real 4.84s
$ seq 10240000 > in; LC_ALL=C time ./src/grep --file=in empty
real 24.19s
$ seq 20480000 > in; LC_ALL=C time ./src/grep --file=in empty
Killed
real 24.40s
Here it was the OOM killer that saved the machine from hanging.
So, IMO, there are two bugs:
1) When the allocation of the kwset takes more memory than available,
'grep' should exit via xalloc_die(), instead of waiting to be killed
by the OOM killer.
2) In the 'hash-collision-perf' unit test: The use of a perl primitive
for measuring the execution time of a child process, that is not
properly ported to GNU/Hurd.
Bruno
[hash-collision-perf.log (text/x-log, attachment)]
[last-stacktrace.png (image/png, attachment)]
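One way to reproduce the runaway case without risking the whole machine is to cap both memory and time for the child process. The sketch below is not from the report: `run_capped` is a hypothetical helper, `ulimit -v` is a common shell extension rather than a POSIX feature, and whether GNU/Hurd enforces that limit is an assumption that would need checking. `timeout` is the coreutils command, which exits with status 124 when the time limit is hit.

```shell
#!/bin/sh
# run_capped LIMIT_KB SECONDS CMD...
# Runs CMD in a subshell with a virtual-memory cap (ulimit -v, in KiB)
# and a wall-clock limit (coreutils timeout). The subshell keeps the
# ulimit from affecting the calling shell.
run_capped() {
  limit_kb=$1
  secs=$2
  shift 2
  (
    ulimit -v "$limit_kb" || exit 125   # could not set the cap
    exec timeout "$secs" "$@"
  )
}

# Hypothetical usage with the command from this report (2 GiB, 60 s):
#   run_capped 2097152 60 env LC_ALL=C ../../src/grep --file=in empty
#   echo "exit status: $?"
```

Where the cap is enforced, an allocation beyond it fails inside the process, so grep can take its own out-of-memory exit path instead of relying on an OOM killer.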
Information forwarded to bug-grep <at> gnu.org: bug#77613; Package grep. (Tue, 08 Apr 2025 05:49:02 GMT)
Message #8 received at 77613 <at> debbugs.gnu.org:
On Mon, Apr 7, 2025 at 10:34 AM Bruno Haible via Bug reports for GNU
grep <bug-grep <at> gnu.org> wrote:
> On
> - GNU/Hurd x86_64 from 2024,
> - GNU/Hurd i386 from 2023,
> I see a test hang: hash-collision-perf.
>
> On GNU/Hurd x86_64:
>
> When I interrupted the build, the file 'in' had 5120000 lines; the log
> file of this test is attached. As you can see, the value of
> small_ms stays 0 even for larger files.
>
> By running
> $ date; LC_ALL=C ../../src/grep --file=in empty; date
> I can see that the execution times grow like this:
> 640000 0.3 sec
> 1280000 0.9 sec
> 2560000 1.5 sec
> 5120000 > 60 sec
>
> On GNU/Hurd i386, it's similar. Here the grep execution hangs when the
> file 'in' has 40960000 lines. Find attached the
> last stack trace I was able to obtain before it hung.
>
> Regardless of how much RAM I give to the machine, there will always
> be a point where "grep --file=in empty" will take more RAM than
> available, and (since Hurd does not have an OOM killer) the machine
> then hangs.
>
> IMO, the correct behaviour would be that 'grep' exits via xalloc_die(),
> not that it hangs.
>
> Whereas on GNU/Linux (in a machine that has the same amount of RAM as
> the GNU/Hurd machine):
>
> $ : > empty
> $ seq 640000 > in; LC_ALL=C time ./src/grep --file=in empty
> real 0.44s
> $ seq 1280000 > in; LC_ALL=C time ./src/grep --file=in empty
> real 0.99s
> $ seq 2560000 > in; LC_ALL=C time ./src/grep --file=in empty
> real 2.22s
> $ seq 5120000 > in; LC_ALL=C time ./src/grep --file=in empty
> real 4.84s
> $ seq 10240000 > in; LC_ALL=C time ./src/grep --file=in empty
> real 24.19s
> $ seq 20480000 > in; LC_ALL=C time ./src/grep --file=in empty
> Killed
> real 24.40s
>
> Here it was the OOM killer that saved the machine from hanging.
>
> So, IMO, there are two bugs:
>
> 1) When the allocation of the kwset takes more memory than available,
> 'grep' should exit via xalloc_die(), instead of waiting to be killed
> by the OOM killer.
>
> 2) In the 'hash-collision-perf' unit test: The use of a perl primitive
> for measuring the execution time of a child process, that is not
> properly ported to GNU/Hurd.
Thanks for reporting that!
Adding a timeout should resolve this. Expect to push tomorrow:
[gr-Hurd-hang.diff (application/octet-stream, attachment)]
Information forwarded to bug-grep <at> gnu.org: bug#77613; Package grep. (Tue, 08 Apr 2025 07:55:01 GMT)
Message #11 received at 77613 <at> debbugs.gnu.org:
Hi Jim,
> > So, IMO, there are two bugs:
> >
> > 1) When the allocation of the kwset takes more memory than available,
> > 'grep' should exit via xalloc_die(), instead of waiting to be killed
> > by the OOM killer.
> >
> > 2) In the 'hash-collision-perf' unit test: The use of a perl primitive
> > for measuring the execution time of a child process, that is not
> > properly ported to GNU/Hurd.
>
> Thanks for reporting that!
> Adding a timeout should resolve this. Expect to push tomorrow:
No, it does not resolve the problem.
In both of my Hurd machines, with the patch, the 'hash-collision-perf'
unit test is still running after 20 minutes.
In the Hurd (32-bit) machine, a 'grep --file=in empty' command crashed from
signal 6 (SIGABRT); see attached screenshot.
Both machines are unresponsive and need to be rebooted.
[hurd-hang.png (image/png, inline)]
Information forwarded to bug-grep <at> gnu.org: bug#77613; Package grep. (Tue, 08 Apr 2025 14:28:02 GMT)
Message #14 received at 77613 <at> debbugs.gnu.org:
On Tue, Apr 8, 2025 at 12:54 AM Bruno Haible <bruno <at> clisp.org> wrote:
> Hi Jim,
>
> > > So, IMO, there are two bugs:
> > >
> > > 1) When the allocation of the kwset takes more memory than available,
> > > 'grep' should exit via xalloc_die(), instead of waiting to be killed
> > > by the OOM killer.
> > >
> > > 2) In the 'hash-collision-perf' unit test: The use of a perl primitive
> > > for measuring the execution time of a child process, that is not
> > > properly ported to GNU/Hurd.
> >
> > Thanks for reporting that!
> > Adding a timeout should resolve this. Expect to push tomorrow:
>
> No, it does not resolve the problem.
>
> In both of my Hurd machines, with the patch, the 'hash-collision-perf'
> unit test is still running after 20 minutes.
> In the Hurd (32-bit) machine, a 'grep --file=in empty' command crashed from
> signal 6 (SIGABRT); see attached screenshot.
> Both machines are unresponsive and need to be rebooted.
Oh! Sorry. I made only the final invocation use the timeout. Must use
it in the loop, too.
Here's a better patch:
[gr-Hurd-hang.diff (application/octet-stream, attachment)]
Information forwarded to bug-grep <at> gnu.org: bug#77613; Package grep. (Tue, 08 Apr 2025 15:03:02 GMT)
Message #17 received at 77613 <at> debbugs.gnu.org:
Jim Meyering wrote:
> Oh! Sorry. I made only the final invocation use the timeout. Must use
> it in the loop, too.
> Here's a better patch:
The 'hash-collision-perf' test still hangs both of my GNU/Hurd machines.
One of the machines now says:
vm_page warning: unable to recycle any page
The reason is that you terminate the loop if small_ms >= 200, but
small_ms is always 0, each time.
Bruno
Information forwarded to bug-grep <at> gnu.org: bug#77613; Package grep. (Tue, 08 Apr 2025 16:20:01 GMT)
Message #20 received at 77613 <at> debbugs.gnu.org:
On Tue, Apr 8, 2025 at 8:02 AM Bruno Haible <bruno <at> clisp.org> wrote:
> Jim Meyering wrote:
> > Oh! Sorry. I made only the final invocation use the timeout. Must use
> > it in the loop, too.
> > Here's a better patch:
>
> The 'hash-collision-perf' test still hangs both of my GNU/Hurd machines.
> One of the machines now says:
> vm_page warning: unable to recycle any page
>
> The reason is that you terminate the loop if small_ms >= 200, but
> small_ms is always 0, each time.
Thanks again. This one should do it: skipping the test in that case.
[gr-Hurd-hang-skip.diff (application/octet-stream, attachment)]
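The attached patch is not reproduced in this log. As a rough illustration of the idea discussed above (stop doubling the input and skip the test when the timer never reports a usable value), here is a minimal sketch. It is not the contents of gr-Hurd-hang-skip.diff: `measure_ms`, `max_lines` and the starting size are hypothetical, and the stub timer always returns 0 to mimic the behaviour reported on GNU/Hurd.

```shell
#!/bin/sh
# Hypothetical stand-in for the test's timer. It always reports 0 ms,
# which is the failure mode described in this thread.
measure_ms() { echo 0; }

max_lines=5120000   # assumed cap on how far the input may grow
n=10000             # assumed starting number of lines
status=ok
while :; do
  small_ms=$(measure_ms "$n")
  # Normal exit: the input is large enough to give a measurable run time.
  if [ "$small_ms" -ge 200 ]; then
    break
  fi
  n=$((n * 2))
  # Guard: still below 200 ms past the cap, so the timer is unusable here.
  # Skip the test rather than keep doubling until memory runs out.
  if [ "$n" -gt "$max_lines" ]; then
    status=skip
    break
  fi
done
echo "status=$status n=$n"
```

Without the guard, a timer stuck at 0 keeps the loop doubling the input indefinitely, which matches the hang described in message #17.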
Information forwarded to bug-grep <at> gnu.org: bug#77613; Package grep. (Tue, 08 Apr 2025 16:39:01 GMT)
Message #23 received at 77613 <at> debbugs.gnu.org:
Jim Meyering wrote:
> Thanks again. This one should do it: skipping the test in that case.
Yes, this one does it. Now "make check" proceeds through all tests.
No failure.
Bruno
This bug report was last modified 69 days ago.