GNU bug report logs -
#23170
Shepherd doesn't restart previously running dependent services
Previous Next
To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 23170 in the body.
You can then email your comments to 23170 AT debbugs.gnu.org in the normal way.
Toggle the display of automated, internal messages from the tracker.
Report forwarded
to
bug-guix <at> gnu.org
:
bug#23170
; Package
guix
.
(Thu, 31 Mar 2016 13:24:01 GMT)
Full text and
rfc822 format available.
Acknowledgement sent
to
"Thompson, David" <dthompson2 <at> worcester.edu>
:
New bug report received and forwarded. Copy sent to
bug-guix <at> gnu.org
.
(Thu, 31 Mar 2016 13:24:01 GMT)
Full text and
rfc822 format available.
Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):
If service 'foo' is depended on by service 'bar', and both are
running, and one runs 'herd restart foo', both 'foo' and 'bar' are
stopped, but then only 'foo' is restarted. I think that the dependent
services that were stopped should also be restarted, otherwise the
user has to manually start them back up one at a time.
- Dave
Information forwarded
to
bug-guix <at> gnu.org
:
bug#23170
; Package
guix
.
(Sat, 25 Aug 2018 11:35:02 GMT)
Full text and
rfc822 format available.
Message #8 received at 23170 <at> debbugs.gnu.org (full text, mbox):
[Message part 1 (text/plain, inline)]
I've written a patch to fix this. It's not super smart, but it
should do the job.
It currently targets the branch after my patch in #32408[1], but
it's technically an independent change.
[1]: https://debbugs.gnu.org/cgi/bugreport.cgi?bug=32408
[0001-service-Restart-dependent-services-on-service-restar.patch (text/x-patch, inline)]
From 50dd3ef4888b04ea3b869da893b23ad69fad8971 Mon Sep 17 00:00:00 2001
From: Carlo Zancanaro <carlo <at> zancanaro.id.au>
Date: Sat, 25 Aug 2018 20:32:11 +1000
Subject: [PATCH] service: Restart dependent services on service restart
* modules/shepherd/service.scm (required-by?): New procedure.
(stop): Return a list of canonical-names for stopped dependent services,
including transitive dependencies.
(action)[restart]: Start services based on the return value of stop.
(fold-services): New procedure.
* tests/restart.sh: New file.
* Makefile.am (TESTS): Add tests/restart.sh.
---
Makefile.am | 1 +
modules/shepherd/service.scm | 90 ++++++++++++++++++++++--------------
tests/restart.sh | 77 ++++++++++++++++++++++++++++++
3 files changed, 133 insertions(+), 35 deletions(-)
create mode 100644 tests/restart.sh
diff --git a/Makefile.am b/Makefile.am
index 4322d7f..d9e21e9 100644
--- a/Makefile.am
+++ b/Makefile.am
@@ -187,6 +187,7 @@ TESTS = \
tests/replacement.sh \
tests/respawn.sh \
tests/respawn-throttling.sh \
+ tests/restart.sh \
tests/misbehaved-client.sh \
tests/no-home.sh \
tests/pid-file.sh \
diff --git a/modules/shepherd/service.scm b/modules/shepherd/service.scm
index 006309c..510a5ea 100644
--- a/modules/shepherd/service.scm
+++ b/modules/shepherd/service.scm
@@ -358,61 +358,72 @@ NEW-SERVICE."
(for-each remove-service (provided-by old-service))
(register-services new-service)))
+(define (required-by? service dependent)
+ "Returns #t if DEPENDENT directly requires SERVICE in order to run. Returns
+#f otherwise."
+ (and (find (lambda (dependency)
+ (memq dependency (provided-by service)))
+ (required-by dependent))
+ #t))
+
;; Stop the service, including services that depend on it. If the
;; latter fails, continue anyway. Return `#f' if it could be stopped.
-(define-method (stop (obj <service>) . args)
+(define-method (stop (service <service>) . args)
+ "Stop SERVICE, and any services which depend on it. Returns a list of
+canonical names for all of the services which have been stopped (including
+transitive dependent services). This method will print a warning if SERVICE
+is not already running, and will return SERVICE's canonical name in a list."
;; Block asyncs so the SIGCHLD handler doesn't execute concurrently.
- ;; Notably, that makes sure the handler processes the SIGCHLD for OBJ's
- ;; process once we're done; otherwise, it could end up respawning OBJ.
+ ;; Notably, that makes sure the handler processes the SIGCHLD for SERVICE's
+ ;; process once we're done; otherwise, it could end up respawning SERVICE.
(call-with-blocked-asyncs
(lambda ()
- (if (not (running? obj))
- (local-output "Service ~a is not running." (canonical-name obj))
- (if (slot-ref obj 'stop-delay?)
+ (if (not (running? service))
+ (begin
+ (local-output "Service ~a is not running." (canonical-name service))
+ (list (canonical-name service)))
+ (if (slot-ref service 'stop-delay?)
(begin
- (slot-set! obj 'waiting-for-termination? #t)
+ (slot-set! service 'waiting-for-termination? #t)
(local-output "Service ~a pending to be stopped."
- (canonical-name obj)))
- (begin
- ;; Stop services that depend on it.
- (for-each-service
- (lambda (serv)
- (and (running? serv)
- (for-each (lambda (sym)
- (and (memq sym (provided-by obj))
- (stop serv)))
- (required-by serv)))))
-
+ (canonical-name service))
+ (list (canonical-name service)))
+ (let ((name (canonical-name service))
+ (stopped-dependents (fold-services (lambda (other acc)
+ (if (and (running? other)
+ (required-by? service other))
+ (append (stop other) acc)
+ acc))
+ '())))
;; Stop the service itself.
(catch #t
(lambda ()
- (apply (slot-ref obj 'stop)
- (slot-ref obj 'running)
+ (apply (slot-ref service 'stop)
+ (slot-ref service 'running)
args))
(lambda (key . args)
;; Special case: 'root' may quit.
- (and (eq? root-service obj)
+ (and (eq? root-service service)
(eq? key 'quit)
(apply quit args))
(caught-error key args)))
- ;; OBJ is no longer running.
- (slot-set! obj 'running #f)
+ ;; SERVICE is no longer running.
+ (slot-set! service 'running #f)
;; Reset the list of respawns.
- (slot-set! obj 'last-respawns '())
+ (slot-set! service 'last-respawns '())
;; Replace the service with its replacement, if it has one
- (let ((replacement (slot-ref obj 'replacement)))
+ (let ((replacement (slot-ref service 'replacement)))
(when replacement
- (replace-service obj replacement)))
+ (replace-service service replacement)))
;; Status message.
- (let ((name (canonical-name obj)))
- (if (running? obj)
- (local-output "Service ~a could not be stopped." name)
- (local-output "Service ~a has been stopped." name))))))
- (slot-ref obj 'running))))
+ (if (running? service)
+ (local-output "Service ~a could not be stopped." name)
+ (local-output "Service ~a has been stopped." name))
+ (cons name stopped-dependents)))))))
;; Call action THE-ACTION with ARGS.
(define-method (action (obj <service>) the-action . args)
@@ -423,10 +434,9 @@ NEW-SERVICE."
;; Restarting is done in the obvious way.
((restart)
(lambda (running . args)
- (if running
- (stop obj)
- (local-output "~a was not running." (canonical-name obj)))
- (apply start obj args)))
+ (let ((stopped-services (stop obj)))
+ (for-each start stopped-services)
+ #t)))
((status)
;; Return the service itself. It is automatically converted to an sexp
;; via 'result->sexp' and sent to the client.
@@ -953,6 +963,16 @@ Return #f if service is not found."
(eq? name (canonical-name service)))
services))
+(define (fold-services proc init)
+ "Apply PROC to the registered services to build a result, and return that
+result. Works in a manner akin to `fold' from SRFI-1."
+ (hash-fold (lambda (name services acc)
+ (let ((service (lookup-canonical-service name services)))
+ (if service
+ (proc service acc)
+ acc)))
+ init %services))
+
(define (for-each-service proc)
"Call PROC for each registered service."
(hash-for-each (lambda (name services)
diff --git a/tests/restart.sh b/tests/restart.sh
new file mode 100644
index 0000000..92a1f79
--- /dev/null
+++ b/tests/restart.sh
@@ -0,0 +1,77 @@
+# GNU Shepherd --- Test restarting services.
+# Copyright © 2013, 2014, 2016 Ludovic Courtès <ludo <at> gnu.org>
+# Copyright © 2018 Carlo Zancanaro <carlo <at> zancanaro.id.au>
+#
+# This file is part of the GNU Shepherd.
+#
+# The GNU Shepherd is free software; you can redistribute it and/or modify it
+# under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3 of the License, or (at
+# your option) any later version.
+#
+# The GNU Shepherd is distributed in the hope that it will be useful, but
+# WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with the GNU Shepherd. If not, see <http://www.gnu.org/licenses/>.
+
+shepherd --version
+herd --version
+
+socket="t-socket-$$"
+conf="t-conf-$$"
+log="t-log-$$"
+pid="t-pid-$$"
+
+herd="herd -s $socket"
+
+trap "cat $log || true ;
+ rm -f $socket $conf $log;
+ test -f $pid && kill \`cat $pid\` || true ; rm -f $pid" EXIT
+
+cat > "$conf"<<EOF
+(register-services
+ (make <service>
+ #:provides '(test1)
+ #:start (const #t)
+ #:stop (const #t))
+ (make <service>
+ #:provides '(test2)
+ #:requires '(test1)
+ #:start (const #t)
+ #:stop (const #t))
+ (make <service>
+ #:provides '(test3)
+ #:requires '(test2)
+ #:start (const #t)
+ #:stop (const #t)))
+EOF
+
+rm -f "$pid"
+shepherd -I -s "$socket" -c "$conf" -l "$log" --pid="$pid" &
+
+while ! test -f "$pid" ; do sleep 0.3 ; done
+
+# Start some test services, and make sure they behave how we expect
+$herd start test1
+$herd start test2
+$herd status test1 | grep started
+$herd status test2 | grep started
+
+# Restart test1 and make sure that both services are still running (ie. that
+# test2 hasn't been stopped)
+$herd restart test1
+$herd status test1 | grep started
+$herd status test2 | grep started
+
+# Now let's test with a transitive dependency
+$herd start test3
+$herd status test3 | grep started
+
+# After restarting test1 we want test3 to still be running
+$herd restart test1
+$herd status test1 | grep started
+$herd status test2 | grep started
+$herd status test3 | grep started
--
2.18.0
[signature.asc (application/pgp-signature, inline)]
Information forwarded
to
bug-guix <at> gnu.org
:
bug#23170
; Package
guix
.
(Sat, 25 Aug 2018 14:42:02 GMT)
Full text and
rfc822 format available.
Message #11 received at 23170 <at> debbugs.gnu.org (full text, mbox):
Hi again! :-)
Carlo Zancanaro <carlo <at> zancanaro.id.au> skribis:
> I've written a patch to fix this. It's not super smart, but it should
> do the job.
I wonder if there are cases where one might want to restart a service
without restarting its dependent services. We can probably ignore it
for now, but perhaps we’ll need to add a flag or a separate action later.
Thoughts?
> From 50dd3ef4888b04ea3b869da893b23ad69fad8971 Mon Sep 17 00:00:00 2001
> From: Carlo Zancanaro <carlo <at> zancanaro.id.au>
> Date: Sat, 25 Aug 2018 20:32:11 +1000
> Subject: [PATCH] service: Restart dependent services on service restart
>
> * modules/shepherd/service.scm (required-by?): New procedure.
> (stop): Return a list of canonical-names for stopped dependent services,
> including transitive dependencies.
> (action)[restart]: Start services based on the return value of stop.
> (fold-services): New procedure.
> * tests/restart.sh: New file.
> * Makefile.am (TESTS): Add tests/restart.sh.
[...]
> +# Restart test1 and make sure that both services are still running (ie. that
> +# test2 hasn't been stopped)
> +$herd restart test1
> +$herd status test1 | grep started
> +$herd status test2 | grep started
> +
> +# Now let's test with a transitive dependency
> +$herd start test3
> +$herd status test3 | grep started
> +
> +# After restarting test1 we want test3 to still be running
> +$herd restart test1
> +$herd status test1 | grep started
> +$herd status test2 | grep started
> +$herd status test3 | grep started
For clarity, should we do an explicit “herd stop test1” followed by
“herd start test1”? I know it’s currently equivalent under the hood,
but it might be slightly clearer. WDYT?
Otherwise it LGTM. Thanks for addressing these longstanding issues!
Ludo’.
Information forwarded
to
bug-guix <at> gnu.org
:
bug#23170
; Package
guix
.
(Sat, 25 Aug 2018 22:50:02 GMT)
Full text and
rfc822 format available.
Message #14 received at 23170 <at> debbugs.gnu.org (full text, mbox):
[Message part 1 (text/plain, inline)]
On Sun, Aug 26 2018, Ludovic Courtès wrote:
> I wonder if there are cases where one might want to restart a
> service without restarting its dependent services. We can
> probably ignore it for now, but perhaps we’ll need to add a flag
> or a separate action later.
>
> Thoughts?
I think this is best served by 'herd stop', followed by 'herd
start'. This patch just special-cases the 'restart' action, so
manually stopping then starting a service will behave as the old
restart used to.
> For clarity, should we do an explicit “herd stop test1” followed
> by “herd start test1”? I know it’s currently equivalent under
> the hood, but it might be slightly clearer. WDYT?
Hopefully the above also answers this, too. I did consider whether
it was worth adding a test for 'herd stop' to make sure it still
stops dependent services, and 'herd start' to make sure it doesn't
start dependent services, but in the end I decided not to. I'm
happy to send through another patch to test these cases, though,
if you think it would be worthwhile.
Carlo
[signature.asc (application/pgp-signature, inline)]
Information forwarded
to
bug-guix <at> gnu.org
:
bug#23170
; Package
guix
.
(Sun, 26 Aug 2018 21:09:01 GMT)
Full text and
rfc822 format available.
Message #17 received at 23170 <at> debbugs.gnu.org (full text, mbox):
Carlo Zancanaro <carlo <at> zancanaro.id.au> skribis:
> On Sun, Aug 26 2018, Ludovic Courtès wrote:
>> I wonder if there are cases where one might want to restart a
>> service without restarting its dependent services. We can probably
>> ignore it for now, but perhaps we’ll need to add a flag or a
>> separate action later.
>>
>> Thoughts?
>
> I think this is best served by 'herd stop', followed by 'herd
> start'. This patch just special-cases the 'restart' action, so
> manually stopping then starting a service will behave as the old
> restart used to.
Great, I had overlooked this.
>> For clarity, should we do an explicit “herd stop test1” followed by
>> “herd start test1”? I know it’s currently equivalent under the
>> hood, but it might be slightly clearer. WDYT?
>
> Hopefully the above also answers this, too.
It does, thanks!
> I did consider whether it was worth adding a test for 'herd stop' to
> make sure it still stops dependent services, and 'herd start' to make
> sure it doesn't start dependent services, but in the end I decided not
> to. I'm happy to send through another patch to test these cases,
> though, if you think it would be worthwhile.
No, that’s fine.
I forgot if this was already done, but perhaps you can add a bit in the
manual to insist that ‘restart’ is not quite the same as ‘stop’ +
‘start’.
Anyway, it all LGTM, thanks!
Ludo’.
Information forwarded
to
bug-guix <at> gnu.org
:
bug#23170
; Package
guix
.
(Sun, 26 Aug 2018 22:06:02 GMT)
Full text and
rfc822 format available.
Message #20 received at 23170 <at> debbugs.gnu.org (full text, mbox):
[Message part 1 (text/plain, inline)]
Hey Ludo’!
On 27 August 2018 7:08:34 am AEST, ludo <at> gnu.org wrote:
>I forgot if this was already done, but perhaps you can add a bit in the manual to insist that ‘restart’ is not quite the same as ‘stop’ + ‘start’.
I hadn't done that, but I have now. There aren't many mentions of restart in the manual, but I changed the one that seemed most relevant.
>Anyway, it all LGTM, thanks!
Pushed! Thanks for the review.
Carlo
[Message part 2 (text/html, inline)]
Information forwarded
to
bug-guix <at> gnu.org
:
bug#23170
; Package
guix
.
(Sun, 26 Aug 2018 22:06:02 GMT)
Full text and
rfc822 format available.
Message #23 received at 23170 <at> debbugs.gnu.org (full text, mbox):
Hey Ludo’!
On 27 August 2018 7:08:34 am AEST, ludo <at> gnu.org wrote:
>I forgot if this was already done, but perhaps you can add a bit in the manual to insist that ‘restart’ is not quite the same as ‘stop’ + ‘start’.
I hadn't done that, but I have now. There aren't many mentions of restart in the manual, but I changed the one that seemed most relevant.
>Anyway, it all LGTM, thanks!
Pushed! Thanks for the review.
Carlo
Information forwarded
to
bug-guix <at> gnu.org
:
bug#23170
; Package
guix
.
(Mon, 27 Aug 2018 11:06:01 GMT)
Full text and
rfc822 format available.
Message #26 received at 23170 <at> debbugs.gnu.org (full text, mbox):
Hi,
Carlo Zancanaro <carlo <at> zancanaro.id.au> skribis:
> Pushed! Thanks for the review.
Great!
I see that you also reverted the patch that removed the ‘EINTR-safe’
workaround. Could you explain why that was necessary? (It should not
be necessary with current Guile versions.)
Thank you,
Ludo’.
Information forwarded
to
bug-guix <at> gnu.org
:
bug#23170
; Package
guix
.
(Mon, 27 Aug 2018 12:43:02 GMT)
Full text and
rfc822 format available.
Message #29 received at 23170 <at> debbugs.gnu.org (full text, mbox):
[Message part 1 (text/plain, inline)]
On Mon, Aug 27 2018, Ludovic Courtès wrote:
> I see that you also reverted the patch that removed the
> ‘EINTR-safe’ workaround. Could you explain why that was
> necessary? (It should not be necessary with current Guile
> versions.)
I'm not really sure of the details, but as I mentioned on IRC,
that commit stopped me from being able to boot. I grafted a new
version of the Shepherd and reconfigured my system, and when it
came up my screen was filled with messages complaining that it
couldn't create things in /dev because they already existed. I
tested the commits since the previous release and that was the one
that caused my problem.
I'd like to investigate further, but I have no idea what could be
causing it. All I have to go on so far is that I think the udev
service is getting respawned repeatedly, but I don't know why that
commit would cause that problem. After reverting that commit I
could successfully boot back into my system and everything is
working properly again.
[signature.asc (application/pgp-signature, inline)]
Information forwarded
to
bug-guix <at> gnu.org
:
bug#23170
; Package
guix
.
(Mon, 27 Aug 2018 17:54:01 GMT)
Full text and
rfc822 format available.
Message #32 received at 23170 <at> debbugs.gnu.org (full text, mbox):
Carlo Zancanaro <carlo <at> zancanaro.id.au> writes:
>>Anyway, it all LGTM, thanks!
>
> Pushed! Thanks for the review.
>
> Carlo
Awesome! Thanks a lot Carlo for working on this :-)
Added tag(s) fixed.
Request was from
Ludovic Courtès <ludo <at> gnu.org>
to
control <at> debbugs.gnu.org
.
(Sat, 05 Jan 2019 13:54:01 GMT)
Full text and
rfc822 format available.
bug closed, send any further explanations to
23170 <at> debbugs.gnu.org and "Thompson, David" <dthompson2 <at> worcester.edu>
Request was from
Ludovic Courtès <ludo <at> gnu.org>
to
control <at> debbugs.gnu.org
.
(Sat, 05 Jan 2019 13:54:02 GMT)
Full text and
rfc822 format available.
bug archived.
Request was from
Debbugs Internal Request <help-debbugs <at> gnu.org>
to
internal_control <at> debbugs.gnu.org
.
(Sun, 03 Feb 2019 12:24:05 GMT)
Full text and
rfc822 format available.
This bug report was last modified 6 years and 138 days ago.
Previous Next
GNU bug tracking system
Copyright (C) 1999 Darren O. Benham,
1997,2003 nCipher Corporation Ltd,
1994-97 Ian Jackson.