GNU bug report logs - #30299
shepherd fails tests on all systems except x86_64

Package: guix

Reported by: Mark H Weaver <mhw <at> netris.org>

Date: Wed, 31 Jan 2018 03:09:02 UTC

Severity: serious

Done: Marius Bakke <mbakke <at> fastmail.com>

Bug is archived. No further changes may be made.

From: ludo <at> gnu.org (Ludovic Courtès)
To: Mark H Weaver <mhw <at> netris.org>
Cc: 30299 <at> debbugs.gnu.org
Subject: bug#30299: [core-updates] shepherd fails tests on all systems except x86_64
Date: Sat, 17 Feb 2018 01:04:00 +0100
Hello,

Mark H Weaver <mhw <at> netris.org> skribis:

> However, on armhf-linux, three tests failed: respawn.sh,
> respawn-throttling.sh, and pid-file.sh.
>
>   https://hydra.gnu.org/build/2499835

(Similar issue on aarch64:
<https://berlin.guixsd.org/build/2419870/log/raw>.  Though of course it
passed on the 2nd and 3rd attempts…)

I was able to reproduce a tests/respawn.sh failure on hardware (ARMv7).
The issue is that a service is not respawned, and the log shows:

--8<---------------cut here---------------start------------->8---
+ assert_killed_service_is_respawned t-service2-pid-695
++ cat t-service2-pid-695
+ old_pid=789
+ rm t-service2-pid-695
+ kill 789
+ wait_for_file t-service2-pid-695
+ i=0
+ test -f t-service2-pid-695
+ test 0 -lt 20
+ sleep 0.3
++ expr 0 + 1

[...]

2018-02-16 11:13:31 Service root has been started.
2018-02-16 11:13:32 Service test1 has been started.
2018-02-16 11:13:34 Service test2 has been started.
2018-02-16 11:13:35 Respawning test1.
2018-02-16 11:13:35 Service test1 has been started.
2018-02-16 11:13:36 Respawning test2.
2018-02-16 11:13:37 Service test2 has been started.
2018-02-16 11:13:37 Respawning test1.
2018-02-16 11:13:37 Service test1 has been started.
2018-02-16 11:13:38 Respawning test2.
2018-02-16 11:13:43 Service test2 could not be started.
--8<---------------cut here---------------end--------------->8---

So SIGCHLD was correctly delivered, but somehow restarting that service
didn’t work (its PID file didn’t show up again; the 5 seconds between
“Respawning” and “could not be started” correspond to the delay in
‘read-pid-file’ in (shepherd service)).  
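
For reference, the polling loop visible in the trace above can be sketched roughly as follows. This is a reconstruction inferred only from the traced commands (`test -f`, `test 0 -lt 20`, `sleep 0.3`, `expr 0 + 1`); the actual helper in the test suite may be written differently. If this reading is right, the test waits at most about 20 × 0.3 = 6 seconds for the PID file, only slightly more than the 5-second delay mentioned above, which leaves little slack on a slow or loaded machine.

```shell
# Hypothetical reconstruction of the polling helper, inferred from the
# trace (up to 20 attempts, 0.3 s apart); not copied from the test suite.
wait_for_file ()
{
    i=0
    # Keep polling while the file is missing and attempts remain.
    while ! test -f "$1" && test "$i" -lt 20
    do
        sleep 0.3
        i=$(expr "$i" + 1)
    done
    # Succeed only if the file finally exists.
    test -f "$1"
}
```

Called as `wait_for_file t-service2-pid-695`, it returns zero as soon as the file appears and non-zero once the attempts are exhausted.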

These test failures seem to be more frequent when the machine is loaded.

Ludo’.



