GNU bug report logs - #74284: Shepherd does not respect ordering for one-shot? services
Reported by: Tomas Volf <~@wolfsden.cz>
Date: Sat, 9 Nov 2024 16:54:02 UTC
Severity: normal
Done: Ludovic Courtès <ludo <at> gnu.org>
Bug is archived. No further changes may be made.
Report forwarded to bug-guix <at> gnu.org: bug#74284; Package guix. (Sat, 09 Nov 2024 16:54:02 GMT)
Acknowledgement sent to Tomas Volf <~@wolfsden.cz>: New bug report received and forwarded. Copy sent to bug-guix <at> gnu.org. (Sat, 09 Nov 2024 16:54:03 GMT)
Message #5 received at submit <at> debbugs.gnu.org:
Hello,
I think I found a bug in the GNU Shepherd. Dependencies between
one-shot? #t services do not seem to be respected.
Documentation for #:requirement says the following (emphasis mine):
--8<---------------cut here---------------start------------->8---
#:requirement is, like provision, a list of symbols that specify
services. In this case, they name what this service depends on: before
the service can be started, services that provide those symbols *must be
started*.
Note that every name listed in #:requirement must be registered so it
can be resolved (see Service Registry).
--8<---------------cut here---------------end--------------->8---
Documentation for #:one-shot? says the following:
--8<---------------cut here---------------start------------->8---
Whether the service is a one-shot service. A one-shot service is a
service that, as soon as it has been successfully started, is marked as
“stopped.” Other services can nonetheless require one-shot
services. One-shot services are useful to trigger an action before other
services are started, such as a cleanup or an initialization action.
As for other services, the start method of a one-shot service must
return a truth value to indicate success, and false to indicate failure.
--8<---------------cut here---------------end--------------->8---
Nothing in there seems to mention that one-shot? services do not
actually wait on each other. To reproduce I wrote a simple
configuration file:
--8<---------------cut here---------------start------------->8---
(define %one-shot #f)

(use-modules (srfi srfi-1))

(define (make-waiting-service name wait requirement)
  (service (list name)
           #:requirement requirement
           #:start (λ _
                     (sleep wait)
                     (format #t "~a\n" name)
                     #t)
           #:one-shot? %one-shot))

(let ((svcs (pair-fold (λ (names waits svcs)
                         (cons (make-waiting-service (car names)
                                                     (car waits)
                                                     (cdr names))
                               svcs))
                       '()
                       '(a b c d)
                       '(1 2 3 4))))
  (register-services svcs)
  (start-in-the-background (map service-canonical-name svcs)))
--8<---------------cut here---------------end--------------->8---
Each service sleeps for `wait' seconds to simulate some slow work being
done. In effect that means that each of the services takes different
time to start up.
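For reference, the dependency chain that the `pair-fold` call builds can be inspected without the Shepherd. The following is a minimal sketch in plain Guile: it runs the same traversal as the configuration file but collects `(name wait requirement)` triples instead of service objects. The name `specs` is introduced here purely for illustration and is not part of the original configuration.

```scheme
(use-modules (srfi srfi-1))

;; Same traversal as the reproducer, but collecting plain
;; (name wait requirement) triples instead of service objects.
;; `specs' is a hypothetical name used only in this sketch.
(define specs
  (pair-fold (λ (names waits acc)
               (cons (list (car names) (car waits) (cdr names))
                     acc))
             '()
             '(a b c d)
             '(1 2 3 4)))

(for-each (λ (spec) (format #t "~s\n" spec)) specs)
;; prints:
;; (d 4 ())
;; (c 3 (d))
;; (b 2 (c d))
;; (a 1 (b c d))
```

So `a` requires `b`, `c` and `d`, while `d` requires nothing, which is why `d` is expected to start first.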
Now, when we run it as it is, we get the following (correct) output:
--8<---------------cut here---------------start------------->8---
$ shepherd -c conf.scm
Starting service root...
Service root started.
Service root running with value #t.
Service root has been started.
Configuration successfully loaded from 'conf.scm'.
Starting service d...
d
Service d has been started.
Service d started.
Service d running with value #t.
Starting service c...
c
Service c has been started.
Service c started.
Service c running with value #t.
Starting service b...
b
Service b has been started.
Service b started.
Service b running with value #t.
Starting service a...
a
Service a has been started.
Service a started.
Successfully started 4 services in the background.
Service a running with value #t.
--8<---------------cut here---------------end--------------->8---
Notice the start-up order (d c b a). If you run it, you will also
notice that `d' takes 4 seconds to start up, `c' 3 seconds etc.
However if we change the define at the top of the configuration file to
#t, hence:
--8<---------------cut here---------------start------------->8---
(define %one-shot #t)
--8<---------------cut here---------------end--------------->8---
The behavior changes:
--8<---------------cut here---------------start------------->8---
$ shepherd -c conf.scm
Starting service root...
Service root started.
Service root running with value #t.
Service root has been started.
Configuration successfully loaded from 'conf.scm'.
Starting service d...
Starting service c...
Starting service b...
Starting service a...
a
Service a has been started.
Service a started.
Service a running with value #t.
b
Service b has been started.
Service b started.
Service b running with value #t.
c
Service c has been started.
Service c started.
Service c running with value #t.
d
Service d has been started.
Service d started.
Successfully started 4 services in the background.
Service d running with value #t.
--8<---------------cut here---------------end--------------->8---
Notice that the order changed to (a b c d, this matches the increasing
wait time), the initial messages are all together:
--8<---------------cut here---------------start------------->8---
Starting service d...
Starting service c...
Starting service b...
Starting service a...
--8<---------------cut here---------------end--------------->8---
and the whole start-up takes 4 seconds (the wait time of `d'). That
seems to indicate that all 4 services are actually starting at the same
time without waiting as they should per the #:requirement argument.
Have a nice day,
Tomas
--
There are only two hard things in Computer Science:
cache invalidation, naming things and off-by-one errors.
Information forwarded to bug-guix <at> gnu.org: bug#74284; Package guix. (Fri, 22 Nov 2024 14:39:01 GMT)
Message #8 received at 74284 <at> debbugs.gnu.org:
Hi Tomas,
(+ Dariqq since we briefly discussed it on IRC yesterday.)
Tomas Volf <~@wolfsden.cz> writes:
> Notice that the order changed to (a b c d, this matches the increasing
> wait time), the initial messages are all together:
>
> Starting service d...
> Starting service c...
> Starting service b...
> Starting service a...
>
> and the whole start-up takes 4 seconds (the wait time of `d'). That
> seems to indicate that all 4 services are actually starting at the same
> time without waiting as they should per the #:requirement argument.
Indeed. As Dariqq found out, the problem was that we’d mark one-shot
services in ‘%one-shot-services-started’ as soon as we’ve started them,
effectively acting as if “started” were synonymous with “running”.
This is fixed with 550c0370985022c5c90a7b477a5e0b84f6faf5d7.
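The difference between “started” and “running” can be illustrated with a toy timing model in plain Guile. This is only a sketch and not the Shepherd's code: it assumes logical time instead of real sleeps, and `services`, `start-time`, `finish-time` and the `mark-early?` flag are hypothetical names made up for this example. With `mark-early?` set to `#t`, a requirement counts as satisfied the moment its start begins (the behavior described above); with `#f`, it counts only once its start has completed.

```scheme
(use-modules (srfi srfi-1))

;; Toy model only: (name wait requirements), as in the reproducer.
(define services
  '((a 1 (b c d)) (b 2 (c d)) (c 3 (d)) (d 4 ())))

;; Logical time at which NAME begins starting: the latest moment at
;; which one of its requirements is considered satisfied.
(define (start-time name mark-early?)
  (let ((reqs (third (assq name services))))
    (fold max 0
          (map (λ (req)
                 (if mark-early?
                     (start-time req mark-early?)    ;satisfied when begun
                     (finish-time req mark-early?))) ;satisfied when done
               reqs))))

;; Logical time at which NAME has finished starting.
(define (finish-time name mark-early?)
  (+ (start-time name mark-early?)
     (second (assq name services))))

(for-each
 (λ (mark-early?)
   (format #t "mark-early? ~a: ~s\n" mark-early?
           (map (λ (name) (cons name (finish-time name mark-early?)))
                '(a b c d))))
 '(#t #f))
;; prints:
;; mark-early? #t: ((a . 1) (b . 2) (c . 3) (d . 4))
;; mark-early? #f: ((a . 10) (b . 9) (c . 7) (d . 4))
```

In this model the early-marking case finishes in the order a, b, c, d after 4 time units in total, which matches the reported output, whereas the ordered case finishes d, c, b, a after 10.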
Let me know if you find anything fishy!
Thanks,
Ludo’.
bug closed, send any further explanations to 74284 <at> debbugs.gnu.org and Tomas Volf <~@wolfsden.cz>. Request was from Ludovic Courtès <ludo <at> gnu.org> to control <at> debbugs.gnu.org. (Fri, 22 Nov 2024 14:39:02 GMT)
Information forwarded to bug-guix <at> gnu.org: bug#74284; Package guix. (Fri, 22 Nov 2024 19:42:01 GMT)
Message #13 received at 74284 <at> debbugs.gnu.org:
Hi Ludo',
Ludovic Courtès <ludo <at> gnu.org> writes:
> Indeed. As Dariqq found out, the problem was that we’d mark one-shot
> services in ‘%one-shot-services-started’ as soon as we’ve started them,
> effectively acting as if “started” were synonymous with “running”.
>
> This is fixed with 550c0370985022c5c90a7b477a5e0b84f6faf5d7.
I have checked out the commit and verified it with my original
reproducer. Everything seems to work as it should, thank you for fixing
it :)
> Let me know if you find anything fishy!
Did not notice anything, so once 1.0.0 lands in Guix we can just close
this bug.
Have a nice day,
Tomas
--
There are only two hard things in Computer Science:
cache invalidation, naming things and off-by-one errors.
Information forwarded to bug-guix <at> gnu.org: bug#74284; Package guix. (Tue, 26 Nov 2024 15:58:02 GMT)
Message #16 received at 74284 <at> debbugs.gnu.org:
Tomas Volf <~@wolfsden.cz> writes:
> Ludovic Courtès <ludo <at> gnu.org> writes:
>
>> Indeed. As Dariqq found out, the problem was that we’d mark one-shot
>> services in ‘%one-shot-services-started’ as soon as we’ve started them,
>> effectively acting as if “started” were synonymous with “running”.
>>
>> This is fixed with 550c0370985022c5c90a7b477a5e0b84f6faf5d7.
>
> I have checked out the commit and verified it with my original
> reproducer. Everything seems to work as it should, thank you for fixing
> it :)
>
>> Let me know if you find anything fishy!
>
> Did not notice anything, so once 1.0.0 lands in Guix we can just close
> this bug.
Awesome, thanks for checking!
bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Wed, 25 Dec 2024 12:24:10 GMT)