GNU bug report logs - #63736
[Shepherd] Loss of SIGCHLD notifications

Previous Next

Package: guix;

Reported by: Ludovic Courtès <ludo <at> gnu.org>

Date: Fri, 26 May 2023 12:15:02 UTC

Severity: important

Done: Ludovic Courtès <ludo <at> gnu.org>

Bug is archived. No further changes may be made.

Full log


View this message in rfc822 format

From: Ludovic Courtès <ludo <at> gnu.org>
To: 63736 <at> debbugs.gnu.org
Subject: bug#63736: [Shepherd] Loss of SIGCHLD notifications
Date: Sat, 27 May 2023 19:00:55 +0200
Ludovic Courtès <ludo <at> gnu.org> skribis:

> Long story short: there seems to be a problem with signal delivery.
> Most likely, the initial grace period expiration above when stopping
> nscd is a symptom of shepherd no longer receiving/processing SIGCHLD
> rather than the cause.

Another possibility is lockup: one of the relevant fibers is either gone
or stuck in ‘put-message’ or ‘get-message’.

I did two things:

  b9a37f3 shepherd: Make signal handling fiber an essential task.
  8ae2780 service: Do not attempt to restart transient services.

Commit 8ae2780 fixes a bug whereby ‘herd restart’ could end up
attempting to restart a transient service, which would lock up the
calling fiber because the service’s controlling fiber would first
receive the 'terminate message, so it would return and nobody would be
reading further messages send on its channel.

Commit b9a37f3 will allows us to ensure that the signal-handling fiber
never exits (and we’ll get a trace in the log if it tries to).

Ludo’.




This bug report was last modified 1 year and 239 days ago.

Previous Next


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.