GNU bug report logs - #40981
shepherd 0.8.0 race condition can lead to stopping itself


Package: guix

Reported by: Mathieu Othacehe <m.othacehe <at> gmail.com>

Date: Thu, 30 Apr 2020 11:52:02 UTC

Severity: important

Merged with 41429

Done: Mathieu Othacehe <mathieu <at> meru.i-did-not-set--mail-host-address--so-tickle-me>

Bug is archived. No further changes may be made.



Message #14 received at 40981 <at> debbugs.gnu.org (full text, mbox):

From: Ludovic Courtès <ludo <at> gnu.org>
To: Mathieu Othacehe <m.othacehe <at> gmail.com>
Cc: 40981 <at> debbugs.gnu.org
Subject: Re: bug#40981: Graphical installer tests sometimes hang.
Date: Tue, 05 May 2020 12:00:25 +0200

Hi!

Mathieu Othacehe <m.othacehe <at> gmail.com> skribis:

>> I'll keep looking!
>
> Ok, getting closer. Here's a suspect part of Shepherd strace log:
>
> [pid     1] stat("/etc/localtime", {st_mode=S_IFREG|0444, st_size=2298, ...}) = 0
> [pid     1] write(9, "shepherd[1]: changing HTTP/HTTPS"..., 86) = 86
> [pid     1] getpgid(194)                = 194
> [pid     1] kill(-194, SIGTERM)         = 0
>
>
> I think the problem is introduced by commit
> 1e7a91d21f1cc5d02697680e19e3878ff8565710 in Shepherd.

OK, but the trace above is “as expected”, isn’t it?

> "(getpgid <guix-daemon-pid>)" returns 0, and calling "(kill 0 SIGTERM)"
> kills all processes.

What made you think of this scenario?

I don’t think getpgid(2) can return 0.  Or am I missing something?
Since guix-daemon doesn’t actually daemonize, getpgid(pid) = pid.
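
For anyone who wants to check the getpgid(pid) = pid claim outside of shepherd, here is a minimal Python sketch. It is not shepherd's code: the "sleep 30" child is only a placeholder for a service process, the function name is made up for illustration, and it assumes the child is started in its own session so that it leads its own process group. Under that assumption getpgid returns the child's PID, which is always positive, so the negated value handed to kill(2) can never be 0. For reference, kill(0, sig) would signal every process in the caller's own process group.

```python
import os
import signal
import subprocess


def terminate_group_of_new_leader():
    """Spawn a placeholder child that leads its own process group,
    then signal that group the way kill(-pgid, SIGTERM) would."""
    # start_new_session=True makes the child call setsid() before exec,
    # so it becomes a session leader and a process-group leader.
    child = subprocess.Popen(["sleep", "30"], start_new_session=True)

    # For a process-group leader, the PGID equals the PID.
    pgid = os.getpgid(child.pid)

    # os.killpg(pgid, sig) is the equivalent of kill(-pgid, sig):
    # only the child's group receives the signal, not the caller's.
    os.killpg(pgid, signal.SIGTERM)
    child.wait()

    return child.pid, pgid, child.returncode


if __name__ == "__main__":
    pid, pgid, returncode = terminate_group_of_new_leader()
    print("pid:", pid, "pgid:", pgid, "returncode:", returncode)
```

The return code reported for the child is the negated signal number, which confirms that it was the child that received SIGTERM while the calling process kept running.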

Running this (in a VM) works fine:

  while herd set-http-proxy guix-daemon foo ; do : ; done

Thanks for debugging!

Ludo’.




This bug report was last modified 4 years and 341 days ago.



GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.