GNU bug report logs -
#40981
shepherd 0.8.0 race condition can lead to stopping itself
Reported by: Mathieu Othacehe <m.othacehe <at> gmail.com>
Date: Thu, 30 Apr 2020 11:52:02 UTC
Severity: important
Merged with 41429
Done: Mathieu Othacehe <mathieu <at> meru.i-did-not-set--mail-host-address--so-tickle-me>
Bug is archived. No further changes may be made.
Message #20 received at 40981 <at> debbugs.gnu.org (full text, mbox):
[Message part 1 (text/plain, inline)]
Hello,
I found what happened here. It turns out that after a process is forked,
and until it calls "setsid" or "setpgid", it shares its parent's PGID. Here
the parent is Shepherd, whose PGID is 0.
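A minimal sketch of that window, assuming plain Guile POSIX bindings rather
than Shepherd's own code ("fork-window-check" is a hypothetical name used only
for illustration). The child records its process group right after the fork,
then calls "setsid" and checks that it now leads its own group:

```scheme
;; Hypothetical illustration, not Shepherd code.  Uses only Guile's core
;; POSIX bindings: primitive-fork, getpgrp, setsid, getpid, waitpid.
(define (fork-window-check)
  "Fork a child and return its exit value: 0 if the child was in the parent's
process group right after the fork and in its own group after setsid."
  (let* ((parent-pgid (getpgrp))
         (pid (primitive-fork)))
    (if (zero? pid)
        ;; Child: between fork and setsid we still share the parent's PGID.
        (let ((before (getpgrp)))
          (setsid)                      ;become the leader of a new group
          (primitive-exit
           (if (and (= before parent-pgid)
                    (= (getpgrp) (getpid)))
               0
               1)))
        ;; Parent: wait for the child and report its exit value.
        (status:exit-val (cdr (waitpid pid))))))
```

Any signal aimed at the child's process group during that window would
therefore be aimed at the parent's group instead.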
When doing the following actions:
* stop guix-daemon
* start guix-daemon
* stop guix-daemon
* start guix-daemon
If the second stop happens after "fork" but before "setsid", then
"(getpgid pid)" returns 0, so "make-kill-destructor" ends up calling
"(kill 0 signal)", which signals Shepherd's own process group. The naive
patch attached could fix the situation.
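The choice to make here can be sketched as a pure function. "kill-target" is
a hypothetical helper for illustration only, not part of Shepherd. Note that
0 counts as true in Scheme, so the zero case needs an explicit "zero?" test:

```scheme
;; Hypothetical helper, not part of Shepherd: given a child's PID and the
;; PGID reported for it, return the first argument to pass to `kill'.
(define (kill-target pid pgid)
  (if (zero? pgid)
      pid            ;still in Shepherd's group (PGID 0): signal the child only
      (- pgid)))     ;own group: a negative argument targets the whole group

(kill-target 1234 1234)   ; => -1234, signal the whole process group
(kill-target 1234 0)      ; => 1234, signal only the freshly forked child
```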
WDYT?
Mathieu
[0001-service-Fix-make-kill-destructor-when-PGID-is-zero.patch (text/x-diff, inline)]
From 0e4167251a56d6baa4f51fe72250a6e3bffae8c3 Mon Sep 17 00:00:00 2001
From: Mathieu Othacehe <m.othacehe <at> gmail.com>
Date: Wed, 6 May 2020 11:48:26 +0200
Subject: [PATCH] service: Fix 'make-kill-destructor' when PGID is zero.
When a process is forked, and before its PGID is changed in "exec-command",
it shares its parent's PGID, which is 0 for Shepherd. In that case, use
the PID instead of the PGID.
* modules/shepherd/service.scm (make-kill-destructor): Handle the case when
PGID is zero, between the process fork and exec.
---
modules/shepherd/service.scm | 15 ++++++++++-----
1 file changed, 10 insertions(+), 5 deletions(-)
diff --git a/modules/shepherd/service.scm b/modules/shepherd/service.scm
index 8604d2f..258992c 100644
--- a/modules/shepherd/service.scm
+++ b/modules/shepherd/service.scm
@@ -5,6 +5,7 @@
;; Copyright (C) 2016 Alex Kost <alezost <at> gmail.com>
;; Copyright (C) 2018 Carlo Zancanaro <carlo <at> zancanaro.id.au>
;; Copyright (C) 2019 Ricardo Wurmus <rekado <at> elephly.net>
+;; Copyright (C) 2020 Mathieu Othacehe <m.othacehe <at> gmail.com>
;;
;; This file is part of the GNU Shepherd.
;;
@@ -957,11 +958,15 @@ start."
"Return a procedure that sends SIGNAL to the process group of the PID given
as argument, where SIGNAL defaults to `SIGTERM'."
   (lambda (pid . args)
-    ;; Kill the whole process group PID belongs to. Don't assume that PID
-    ;; is a process group ID: that's not the case when using #:pid-file,
-    ;; where the process group ID is the PID of the process that
-    ;; "daemonized".
-    (kill (- (getpgid pid)) signal)
+    ;; Kill the whole process group PID belongs to. Don't assume that PID is
+    ;; a process group ID: that's not the case when using #:pid-file, where
+    ;; the process group ID is the PID of the process that "daemonized". If
+    ;; this procedure is called between the process fork and exec, the PGID
+    ;; will still be zero (the Shepherd PGID). In that case, use the PID.
+    (let ((pgid (getpgid pid)))
+      (if (zero? pgid)
+          (kill pid signal)
+          (kill (- pgid) signal)))
     #f))
;; Produce a constructor that executes a command.
--
2.26.0
This bug report was last modified 4 years and 341 days ago.