GNU bug report logs - #57922
Shepherd doesn't seem to correctly handle waitpid itself

Previous Next

Package: guix;

Reported by: Maxim Cournoyer <maxim.cournoyer <at> gmail.com>

Date: Mon, 19 Sep 2022 04:30:03 UTC

Severity: normal

Done: Maxim Cournoyer <maxim.cournoyer <at> gmail.com>

Bug is archived. No further changes may be made.

Full log


View this message in rfc822 format

From: Ludovic Courtès <ludo <at> gnu.org>
To: Josselin Poiret <dev <at> jpoiret.xyz>
Cc: 57922 <at> debbugs.gnu.org, Maxim Cournoyer <maxim.cournoyer <at> gmail.com>
Subject: bug#57922: Shepherd doesn't seem to correctly handle waitpid itself
Date: Fri, 23 Sep 2022 08:33:28 +0200
Hi,

Josselin Poiret <dev <at> jpoiret.xyz> skribis:

> Maxim Cournoyer <maxim.cournoyer <at> gmail.com> writes:

[...]

>> 1. It requires to be installed in the signal handlers for each
>> processes, with something like:
>>
>> --8<---------------cut here---------------start------------->8---
>>   (unless %sigchld-handler-installed?
>>     (sigaction SIGCHLD handle-SIGCHLD SA_NOCLDSTOP)
>>     (set! %sigchld-handler-installed? #t))
>> --8<---------------cut here---------------end--------------->8---
>>
>> Done for fork+exec-command and make-inetd-forkexec-constructor, but not
>> for make-forkexec-constructor/container, AFAICT;
>
> The signal handler is only installed once in PID 1 (in fact, you haven't
> forked yet here), since it's the one that receives the SIGCHLD.

Right.

> What I don't understand that well is that this signal handler could be
> installed only once when shepherd starts, right?  That way, it wouldn't
> need to depend on specific start actions being chosen.

The SIGCHLD handler is installed lazily since
f776de04e6702e18d95152072e78c43441d3ccc3.  The rationale was discussed
here:

  https://issues.guix.gnu.org/27553

That said, on GNU/Linux, SIGCHLD is actually blocked and instead we rely
on signalfd(2).  It’s from the main even loop in shepherd.scm that the
signal handler is called.

>> Here's a small reproducer to apply on our code base:
>>
>> --8<---------------cut here---------------start------------->8---
>> modified   gnu/services/telephony.scm
>> @@ -685,13 +685,7 @@ (define (archive-name->username archive)
>>  
>>                      ;; Finally, return the PID of the daemon process.
>>                      daemon-pid))
>> -               (stop
>> -                #~(lambda (pid . args)
>> -                    (kill pid SIGKILL)
>> -                    ;; Wait for the process to exit; this prevents overlapping
>> -                    ;; processes when issuing 'herd restart'.
>> -                    (waitpid pid)
>> -                    #f))))))))
>> +               (stop #~(make-kill-destructor))))))))

I think the main difference between these two is that the first one uses
SIGKILL while the second one uses SIGTERM.

You could try #~(make-kill-destructor SIGKILL) to get the same effect.

(Another difference is that ‘make-kill-destructor’ kills the process
group, not just the process itself.)

Anyway, the key point is that shepherd takes care of calling ‘waitpid’
for its child processes (services).  If you call it yourself as in the
snippet above, you’re racing with shepherd; in the case above it
probably doesn’t make any difference though because it will consider
that the service is stopped in any case.

HTH!

Ludo’.




This bug report was last modified 2 years and 237 days ago.

Previous Next


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.