GNU bug report logs - #77634
[shepherd] Test failures on GNU/Hurd

Previous Next

Package: guix;

Reported by: yelninei <at> tutamail.com

Date: Tue, 8 Apr 2025 08:44:02 UTC

Severity: normal

Done: Ludovic Courtès <ludo <at> gnu.org>

Bug is archived. No further changes may be made.

Full log


View this message in rfc822 format

From: yelninei <at> tutamail.com
To: Ludovic Courtès <ludo <at> gnu.org>
Cc: 77634 <77634 <at> debbugs.gnu.org>
Subject: bug#77634: [shepherd] Test failures on GNU/Hurd
Date: Mon, 14 Apr 2025 16:53:35 +0200 (CEST)
Hello Ludo,
Apr 14, 2025, 13:28 by ludo <at> gnu.org:

> Hi yelninei,
>
> yelninei <at> tutamail.com writes:
>
>>>> FAIL tests/logging-failure.sh (exit status: 124)
>>>>
>
> [...]
>
>> I was able to reproduce this by extracting the shepherd conf from the
>> test and and attempting to run the test manually on the childhurd with
>> the cross compiled shepherd (so you should not need to have a native
>> toolchain available)
>>
>>
>> Might be the attempt to write to /proc.
>>
>> Starting the log-directory-not-writable service causes the entire shepherd to become completely unresponsive causing the timeout failure.
>>
>> A similar behavior occurs with
>>
>> (call-with-output-file "/proc/1/something"
>>   (lambda (port)
>>     (display "Hello" port)))
>>
>> which hangs my guile completely.
>>
>> When I change the log file to a more normal unwritable path the test passes.
>>
>
> Ah, interesting.  So /something instead of /proc/1/something would work,
> right?  (I picked /proc/1/something because ‘guix shell -C’ currently
> gives a writable root file system… Going to be fixed with
> <https://issues.guix.gnu.org/77638>.)
>
> You could also report the procfs issue to bug-hurd I guess.  :-)
>

Yes.
I think reproducing this on the latest mach/hurd first would be better but I have been a bit busy trying to understand what went wrong with glibc2.41


>>>> FAIL tests/services/system-log.sh (exit status: 124)
>>>>
>>>
>>> This one also hanged more than 3m it seems.
>>>
>> The shepherd syslog seems to be extremely slow:
>>
>> I extracted the logger.scm and conf.scm from the test, removed the $
>> from the shell variables variables, started shepherd, echoed the test
>> msg into  the kmsg file and then tried to start the shepherd syslog.
>>
>> It took multiple minutes to see the "Service system-log running with
>> value" in the log and then another couple of minutes for "herd start
>> syslogd" to return. Afterwards trying to query the syslog status (or
>> trying to interact with shepherd in any other way) takes forever to
>> complete while syslogd is running.
>>
>> I did this with the 1.0.4rc1 shepherd to work around the blocking
>> socket problem with guix-daemon (#77610, it still fails on the first
>> connection now but idk if this is a problem with shepherd or the
>> guix-daemon)
>>
>> With the 1.0.3 shepherd syslogd seems to be a lot quicker to start
>> initally ( see the log from the first message) but still extremely
>> slow to interact with afterwards.
>>
>
> Weird.  Could you bisect between 1.0.3 and HEAD to try and find the
> source of slowness?
>
> I wonder if f1a82845eaf8851af9a811e5a1d185b68b1c363f might explain it,
> but that’s pre-1.0.3.
>
> Alternatively, you can try running ‘shepherd’ under ‘rpctrace’ for the
> syslog slowness issue, so we get an idea of what it’s doing.
>
> Thanks a lot for helping out!
>
> Ludo’.
>
It is still slow on 1.0.3 to do anything once its started but with 1.0.4rc1 even starting is extremely painful.
I tried to manually add features to the system-log-service and the thing that causes everything to slow down drastically is the addition of the #:kernel-log-file (even if it is empty).

Setting that to #f instead makes the test now fail considerable faster (1.8s rather than 900).

Running the logger manually only the syslog_remote_file seems to be created. This is consistent with the log output from the test.
I also added a pk for the message-sender to investigate.
Difference I can see is that on Linux most of them are #f but here are #(1 \x00") instead
FAIL: tests/services/system-log
===============================

+ shepherd --version
System lacks support for 'signalfd'; using fallback mechanism.
shepherd (GNU Shepherd) 1.0.4rc1
Copyright (C) 2025 the Shepherd authors
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
+ herd --version
herd (GNU Shepherd) 1.0.4rc1
Copyright (C) 2025 the Shepherd authors
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
+ socket=t-socket-25332
+ conf=t-conf-25332
+ log=t-log-25332
+ pid=t-pid-25332
+ logger=/tmp/guix-build-shepherd-libevent-1.0.99-git.drv-0/source/t-syslog-logger-25332.scm
+ kmsg=/tmp/guix-build-shepherd-libevent-1.0.99-git.drv-0/source/t-syslog-kmsg-25332
+ syslog_file=/tmp/guix-build-shepherd-libevent-1.0.99-git.drv-0/source/t-syslog-25332
+ syslog_auth_file=/tmp/guix-build-shepherd-libevent-1.0.99-git.drv-0/source/t-syslog-auth-25332
+ syslog_debug_file=/tmp/guix-build-shepherd-libevent-1.0.99-git.drv-0/source/t-syslog-debug-25332
+ syslog_remote_file=/tmp/guix-build-shepherd-libevent-1.0.99-git.drv-0/source/t-syslog-remote-25332
+ syslog_socket=/tmp/guix-build-shepherd-libevent-1.0.99-git.drv-0/source/t-syslog-socket-25332
+ herd='herd -s t-socket-25332'
+ trap 'zcat t-log-25332.* || true; cat t-log-25332 || true;
      rm -f t-socket-25332 t-conf-25332 t-log-25332* /tmp/guix-build-shepherd-libevent-1.0.99-git.drv-0/source/t-syslog-logger-25332.scm /tmp/guix-build-shepherd-libevent-1.0.99-git.drv-0/source/t-syslog-kmsg-25332 /tmp/guix-build-shepherd-libevent-1.0.99-git.drv-0/source/t-syslog-socket-25332;
      rm -f /tmp/guix-build-shepherd-libevent-1.0.99-git.drv-0/source/t-syslog-25332* /tmp/guix-build-shepherd-libevent-1.0.99-git.drv-0/source/t-syslog-auth-25332* /tmp/guix-build-shepherd-libevent-1.0.99-git.drv-0/source/t-syslog-debug-25332* /tmp/guix-build-shepherd-libevent-1.0.99-git.drv-0/source/t-syslog-remote-25332*;
      test -f t-pid-25332 && kill `cat t-pid-25332` || true; rm -f t-pid-25332' EXIT
+ cat
+ cat
+ chmod +x /tmp/guix-build-shepherd-libevent-1.0.99-git.drv-0/source/t-syslog-logger-25332.scm
+ cat
+ rm -f t-pid-25332
+ test -f t-pid-25332
+ shepherd -I -s t-socket-25332 -c t-conf-25332 -l t-log-25332 --pid=t-pid-25332
+ sleep 0.3
System lacks support for 'signalfd'; using fallback mechanism.
GNU Shepherd 1.0.4rc1 (Guile 3.0.9, i586-unknown-gnu)
Starting service root...
Service root started.
Service root running with value #<<process> id: 25340 command: #f>.
Service root has been started.
Configuration successfully loaded from 't-conf-25332'.
+ test -f t-pid-25332
+ herd -s t-socket-25332 eval root '(gc)'
Evaluating user expression (gc).
"#<unspecified>"
++ file_descriptor_count
++ wc -l
+++ cat t-pid-25332
++ ls -l '/proc/25340/fd/[0-9]*'
ls: cannot access '/proc/25340/fd/[0-9]*': No such file or directory
+ initial_fd_count=0
+ herd -s t-socket-25332 start logger
Starting service system-log...
Service system-log has been started.
Service system-log started.
Service system-log running with value #<system-log 39c5290>.
Starting service logger...
Service logger has been started.
+ herd -s t-socket-25332 status logger
+ grep stopped
  It is stopped since 14:39:57 (0 seconds ago).
+ grep 'starting logger' t-log-25332
2025-04-14 14:39:57 t-syslog-logger-25332.scm[25348] starting logger
+ grep 'sudo:.* session opened' /tmp/guix-build-shepherd-libevent-1.0.99-git.drv-0/source/t-syslog-auth-25332
grep: /tmp/guix-build-shepherd-libevent-1.0.99-git.drv-0/source/t-syslog-auth-25332: No such file or directory
+ zcat 't-log-25332.*'
gzip: t-log-25332.*.gz: No such file or directory
+ true
+ cat t-log-25332
2025-04-14 14:39:57 GNU Shepherd 1.0.4rc1 (Guile 3.0.9, i586-unknown-gnu)
2025-04-14 14:39:57 Starting service root...
2025-04-14 14:39:57 Service root started.
2025-04-14 14:39:57 Service root running with value #<<process> id: 25340 command: #f>.
2025-04-14 14:39:57 Service root has been started.
2025-04-14 14:39:57 Configuration successfully loaded from 't-conf-25332'.
2025-04-14 14:39:57 Evaluating user expression (gc).
2025-04-14 14:39:57 Starting service system-log...
2025-04-14 14:39:57 Service system-log has been started.
2025-04-14 14:39:57 Service system-log started.
2025-04-14 14:39:57 Service system-log running with value #<system-log 39c5290>.
2025-04-14 14:39:57 Starting service logger...
2025-04-14 14:39:57 Service logger has been started.
2025-04-14 14:39:57 Service logger started.
2025-04-14 14:39:57 Service logger running with value #<<process> id: 25348 command: ("/tmp/guix-build-shepherd-libevent-1.0.99-git.drv-0/source/t-syslog-logger-25332.scm")>.
2025-04-14 14:39:57 
2025-04-14 14:39:57 ;;; (system-log-message-sender-> #(1 "\x00"))
2025-04-14 14:39:57 
2025-04-14 14:39:57 ;;; (message-destination-> ("/tmp/guix-build-shepherd-libevent-1.0.99-git.drv-0/source/t-syslog-remote-25332"))
2025-04-14 14:39:57 
2025-04-14 14:39:57 ;;; (system-log-message-sender-> #(2 2130706433 52204))
2025-04-14 14:39:57 
2025-04-14 14:39:57 ;;; (message-destination-> ("/tmp/guix-build-shepherd-libevent-1.0.99-git.drv-0/source/t-syslog-remote-25332"))
2025-04-14 14:39:57 t-syslog-logger-25332.scm[25348] starting logger
2025-04-14 14:39:57 
2025-04-14 14:39:57 ;;; (system-log-message-sender-> #(1 "\x00"))
2025-04-14 14:39:57 
2025-04-14 14:39:57 ;;; (message-destination-> ("/tmp/guix-build-shepherd-libevent-1.0.99-git.drv-0/source/t-syslog-remote-25332"))
2025-04-14 14:39:57 
2025-04-14 14:39:57 ;;; (system-log-message-sender-> #(1 "\x00"))
2025-04-14 14:39:57 
2025-04-14 14:39:57 ;;; (message-destination-> ("/tmp/guix-build-shepherd-libevent-1.0.99-git.drv-0/source/t-syslog-remote-25332"))
2025-04-14 14:39:57 
2025-04-14 14:39:57 ;;; (system-log-message-sender-> #(1 "\x00"))
2025-04-14 14:39:57 
2025-04-14 14:39:57 ;;; (message-destination-> ("/tmp/guix-build-shepherd-libevent-1.0.99-git.drv-0/source/t-syslog-remote-25332"))
2025-04-14 14:39:57 
2025-04-14 14:39:57 ;;; (system-log-message-sender-> #(1 "\x00"))
2025-04-14 14:39:57 
2025-04-14 14:39:57 ;;; (message-destination-> ("/tmp/guix-build-shepherd-libevent-1.0.99-git.drv-0/source/t-syslog-remote-25332"))
2025-04-14 14:39:57 
2025-04-14 14:39:57 ;;; (system-log-message-sender-> #(1 "\x00"))
2025-04-14 14:39:57 
2025-04-14 14:39:57 ;;; (message-destination-> ("/tmp/guix-build-shepherd-libevent-1.0.99-git.drv-0/source/t-syslog-remote-25332"))
2025-04-14 14:39:57 
2025-04-14 14:39:57 ;;; (system-log-message-sender-> #(1 "\x00"))
2025-04-14 14:39:57 
2025-04-14 14:39:57 ;;; (message-destination-> ("/tmp/guix-build-shepherd-libevent-1.0.99-git.drv-0/source/t-syslog-remote-25332"))
2025-04-14 14:39:57 
2025-04-14 14:39:57 ;;; (system-log-message-sender-> #(1 "\x00"))
2025-04-14 14:39:57 
2025-04-14 14:39:57 ;;; (message-destination-> ("/tmp/guix-build-shepherd-libevent-1.0.99-git.drv-0/source/t-syslog-remote-25332"))
2025-04-14 14:39:57 
2025-04-14 14:39:57 ;;; (system-log-message-sender-> #(1 "\x00"))
2025-04-14 14:39:57 
2025-04-14 14:39:57 ;;; (message-destination-> ("/tmp/guix-build-shepherd-libevent-1.0.99-git.drv-0/source/t-syslog-remote-25332"))
2025-04-14 14:39:57 Service logger has been disabled.
+ rm -f t-socket-25332 t-conf-25332 t-log-25332 /tmp/guix-build-shepherd-libevent-1.0.99-git.drv-0/source/t-syslog-logger-25332.scm /tmp/guix-build-shepherd-libevent-1.0.99-git.drv-0/source/t-syslog-kmsg-25332 /tmp/guix-build-shepherd-libevent-1.0.99-git.drv-0/source/t-syslog-socket-25332
+ rm -f '/tmp/guix-build-shepherd-libevent-1.0.99-git.drv-0/source/t-syslog-25332*' '/tmp/guix-build-shepherd-libevent-1.0.99-git.drv-0/source/t-syslog-auth-25332*' '/tmp/guix-build-shepherd-libevent-1.0.99-git.drv-0/source/t-syslog-debug-25332*' /tmp/guix-build-shepherd-libevent-1.0.99-git.drv-0/source/t-syslog-remote-25332
+ test -f t-pid-25332
++ cat t-pid-25332
+ kill 25340
+ rm -f t-pid-25332
FAIL tests/services/system-log.sh (exit status: 2)
Stopping service root...
Exiting shepherd...
Service log-rotation is not running.
Service logger is not running.
Exiting.






This bug report was last modified 23 days ago.

Previous Next


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.