GNU bug report logs - #40981
shepherd 0.8.0 race condition can lead to stopping itself
Reported by: Mathieu Othacehe <m.othacehe <at> gmail.com>
Date: Thu, 30 Apr 2020 11:52:02 UTC
Severity: important
Merged with 41429
Done: Mathieu Othacehe <mathieu <at> meru.i-did-not-set--mail-host-address--so-tickle-me>
Bug is archived. No further changes may be made.
Message #14 received at 40981 <at> debbugs.gnu.org (full text, mbox):
Hi!
Mathieu Othacehe <m.othacehe <at> gmail.com> skribis:
>> I'll keep looking!
>
> Ok, getting closer. Here's a suspect part of Shepherd strace log:
>
> [pid 1] stat("/etc/localtime", {st_mode=S_IFREG|0444, st_size=2298, ...}) = 0
> [pid 1] write(9, "shepherd[1]: changing HTTP/HTTPS"..., 86) = 86
> [pid 1] getpgid(194) = 194
> [pid 1] kill(-194, SIGTERM) = 0
>
>
> I think the problem is introduced by commit
> 1e7a91d21f1cc5d02697680e19e3878ff8565710 in Shepherd.
OK, but the trace above is “as expected”, isn’t it?
> "(getpgid <guix-daemon-pid>)" returns 0, and calling "(kill 0 SIGTERM)"
> kills all processes.
What made you think of this scenario?
I don’t think getpgid(2) can return 0. Or am I missing something?
Since guix-daemon doesn’t actually daemonize, getpgid(pid) = pid.
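The getpgid/kill sequence from the strace log can be reproduced outside Shepherd. What follows is only a minimal sketch in Python, not Shepherd's code: it assumes a child started in its own session stands in for a non-daemonizing service such as guix-daemon, and the helper names `spawn_group_leader` and `terminate_group` are hypothetical. The `pgid <= 0` guard is added for illustration only, since kill(0, sig) would signal the caller's own process group.

```python
import os
import signal
import subprocess
import sys


def spawn_group_leader():
    """Start a sleeping child that leads its own process group.

    Assumption: this stands in for a service that does not daemonize,
    so the PID the supervisor knows is also the process group ID.
    """
    return subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(60)"],
        start_new_session=True,  # child calls setsid() before exec
    )


def terminate_group(pid):
    """Send SIGTERM to the whole process group that `pid` belongs to."""
    pgid = os.getpgid(pid)
    if pgid <= 0:
        # Illustrative guard only: a negated 0 would be kill(0, sig),
        # which targets the caller's own process group.
        raise RuntimeError(f"refusing to signal process group {pgid}")
    os.kill(-pgid, signal.SIGTERM)  # negative PID means "process group"
    return pgid


child = spawn_group_leader()
pgid = terminate_group(child.pid)
status = child.wait()  # negative value: terminated by that signal number

print("pgid equals pid:", pgid == child.pid)
print("child exit status:", status)
```

Here the group ID returned by `os.getpgid` matches the child's PID, mirroring the `getpgid(194) = 194` followed by `kill(-194, SIGTERM)` lines in the trace.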
Running this (in a VM) works fine:
  while herd set-http-proxy guix-daemon foo ; do : ; done
Thanks for debugging!
Ludo’.
This bug report was last modified 4 years and 341 days ago.