GNU bug report logs - #77274
[Shepherd] Competing one-shot service starter gets erroneous failure

Package: guix

Reported by: Ludovic Courtès <ludo <at> gnu.org>

Date: Wed, 26 Mar 2025 10:15:02 UTC

Severity: normal

Done: Ludovic Courtès <ludo <at> gnu.org>

Bug is archived. No further changes may be made.

From: help-debbugs <at> gnu.org (GNU bug Tracking System)
To: Ludovic Courtès <ludo <at> gnu.org>
Cc: tracker <at> debbugs.gnu.org
Subject: bug#77274: closed ([Shepherd] Competing one-shot service starter
 gets erroneous failure)
Date: Wed, 26 Mar 2025 11:37:01 +0000
[Message part 1 (text/plain, inline)]
Your message dated Wed, 26 Mar 2025 12:36:17 +0100
with message-id <874izgkor2.fsf <at> gnu.org>
and subject line Re: bug#77274: [Shepherd] Competing one-shot service starter gets erroneous failure
has caused the debbugs.gnu.org bug report #77274,
regarding [Shepherd] Competing one-shot service starter gets erroneous failure
to be marked as done.

(If you believe you have received this mail in error, please contact
help-debbugs <at> gnu.org.)


-- 
77274: https://debbugs.gnu.org/cgi/bugreport.cgi?bug=77274
GNU Bug Tracking System
Contact help-debbugs <at> gnu.org with problems
[Message part 2 (message/rfc822, inline)]
From: Ludovic Courtès <ludo <at> gnu.org>
To: bug-guix <at> gnu.org
Subject: [Shepherd] Competing one-shot service starter gets erroneous failure
Date: Wed, 26 Mar 2025 11:14:24 +0100
[Message part 3 (text/plain, inline)]
As of Shepherd 1.0.3, when two clients start the same one-shot service
concurrently, the one that loses the race never sees the value that was
produced by the ‘start’ method.

  herd start one-shot & herd start one-shot

Here one of the ‘herd start’ processes will wrongfully fail with “failed
to start service one-shot”.

Instead of receiving that value, the losing client calls
‘service-running-value’, which always returns #f because the one-shot
service was stopped in the meantime.  I’m referring to this bit of
‘start-service’:

      (match (get-message reply)
        (#f
         ;; We lost the race: SERVICE is already running.
         (service-running-value service))   ;<- here
        …)

Attached is a reproducer.
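
To see which of the two clients fails without running the whole test
suite, one can record each background client's exit status separately.
The snippet below is only a sketch under stated assumptions:
‘start_client’ is a hypothetical stand-in for ‘herd start one-shot’ (it
is not part of the Shepherd) so that the snippet runs on its own.  With
the real command and an affected Shepherd, one of the two statuses would
presumably be non-zero.

```shell
#!/bin/sh
# Sketch: run two competing clients and record each one's exit status.
# 'start_client' is a hypothetical stand-in for 'herd start one-shot';
# replace its body with the real command on a system running shepherd.
start_client ()
{
    sleep 1          # simulate waiting for the service to start
    return 0
}

# Launch both clients in the background and remember their PIDs.
start_client & pid1=$!
start_client & pid2=$!

# 'wait PID' returns the exit status of that particular process.
status1=0; wait "$pid1" || status1=$?
status2=0; wait "$pid2" || status2=$?

echo "client 1 exit status: $status1"
echo "client 2 exit status: $status2"
```

Waiting on each PID individually, rather than calling a bare ‘wait’,
is what makes it possible to tell the two clients apart.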

Ludo’.

[Message part 4 (text/x-patch, inline)]
diff --git a/tests/one-shot.sh b/tests/one-shot.sh
index eeecea7..491eeae 100644
--- a/tests/one-shot.sh
+++ b/tests/one-shot.sh
@@ -1,5 +1,5 @@
 # GNU Shepherd --- Test one-shot services.
-# Copyright © 2019, 2023-2024 Ludovic Courtès <ludo <at> gnu.org>
+# Copyright © 2019, 2023-2025 Ludovic Courtès <ludo <at> gnu.org>
 #
 # This file is part of the GNU Shepherd.
 #
@@ -197,4 +197,35 @@ test "$(cat "$stamp")" = "third"
 $herd start fourth && false
 $herd start fourth && false
 
+# Check the behavior of two clients competing to start the same one-shot
+# service.  Both should succeed.
+
+cat > "$conf" <<EOF
+(register-services
+  (list (service
+          '(fifth)
+          #:one-shot? #t
+          #:start (lambda ()
+                    (let loop ()
+                      (unless (file-exists? "$stamp")
+                        (sleep 0.5)
+                        (loop)))
+                    #t))))
+EOF
+
+$herd load root "$conf"
+
+rm -f "$stamp"
+
+$herd start fifth &
+herd_start_pid1=$!
+$herd start fifth &
+herd_start_pid2=$!
+until $herd status fifth | grep starting; do sleep 0.5; done
+touch "$stamp"			# trigger starting->running transition
+
+# Both 'herd start' processes should have succeeded.
+wait $herd_start_pid1
+wait $herd_start_pid2
+
 $herd stop root
[Message part 5 (message/rfc822, inline)]
From: Ludovic Courtès <ludo <at> gnu.org>
To: 77274-done <at> debbugs.gnu.org
Subject: Re: bug#77274: [Shepherd] Competing one-shot service starter gets
 erroneous failure
Date: Wed, 26 Mar 2025 12:36:17 +0100
Ludovic Courtès <ludo <at> gnu.org> skribis:

> As of 1.0.3, when two clients start the same one-shot service, the one
> that loses the race never sees the value that was produced by the
> ‘start’ method.
>
>   herd start one-shot & herd start one-shot
>
> Here one of the ‘herd start’ processes will wrongfully fail with “failed
> to start service one-shot”.
>
> Instead, it calls ‘service-running-value’ but that always returns #f
> because the one-shot service was stopped in the meantime.  I’m referring
> to this bit of ‘start-service’:
>
>       (match (get-message reply)
>         (#f
>          ;; We lost the race: SERVICE is already running.
>          (service-running-value service))   ;<- here
>         …)

Fixed in commit f730106fe1cf9a3efc2f327cc5716335585ac92b.

Ludo'.


This bug report was last modified 54 days ago.
