GNU bug report logs -
#74279
Shepherd service is not getting respawned.
Reported by: Tomas Volf <~@wolfsden.cz>
Date: Sat, 9 Nov 2024 15:01:01 UTC
Severity: normal
Tags: notabug
Done: Ludovic Courtès <ludo <at> gnu.org>
Bug is archived. No further changes may be made.
To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 74279 in the body.
You can then email your comments to 74279 AT debbugs.gnu.org in the normal way.
Report forwarded to bug-guix <at> gnu.org: bug#74279; Package guix. (Sat, 09 Nov 2024 15:01:01 GMT)
Acknowledgement sent to Tomas Volf <~@wolfsden.cz>: New bug report received and forwarded. Copy sent to bug-guix <at> gnu.org. (Sat, 09 Nov 2024 15:01:01 GMT)
Message #5 received at submit <at> debbugs.gnu.org:
Hi,
I wrote a shepherd service to function as a check for networking being
actually up, but it does not get respawned when it fails and I do not
understand why.
This is the service in my operating-system:
--8<---------------cut here---------------start------------->8---
(simple-service
 'network-online
 shepherd-root-service-type
 (list (shepherd-service
        (requirement '(networking))
        (provision '(network-online))
        (documentation "Wait for the network to come up.")
        (start #~(lambda _
                   (let* ((cmd "/run/privileged/bin/ping -qc1 -W1 1.1.1.1")
                          (status (system cmd)))
                     (= 0 (status:exit-val status)))))
        (one-shot? #t)
        ;; Try every second.
        (respawn-delay 1)
        ;; Retry forever. Double-quoting is intentional.
        (respawn-limit ''(5 . 5)))))
--8<---------------cut here---------------end--------------->8---
Now, when I reboot the machine, I see in the log that the service did
start:
--8<---------------cut here---------------start------------->8---
Nov 7 00:18:20 localhost shepherd[1]: Starting service network-online...
[..]
Nov 7 00:18:20 localhost shepherd[1]: [sh] PING 192.168.0.110 (192.168.0.110): 56 data bytes
Nov 7 00:18:20 localhost shepherd[1]: [sh] /run/privileged/bin/ping: sending packet: Network is unreachable
Nov 7 00:18:20 localhost shepherd[1]: Service network-online could not be started.
Nov 7 00:18:20 localhost shepherd[1]: Service network-online failed to start.
--8<---------------cut here---------------end--------------->8---
The failure on the first run is expected; the problem is that it starts
exactly once. I do not see any attempts to respawn it in
/var/log/messages, but based on the documentation the service *should*
get respawned, since it failed. What am I doing wrong? Would anyone
have any suggestions, either what is wrong with the code above or how to
approach it in another way?
Have a nice day,
Tomas
--
There are only two hard things in Computer Science:
cache invalidation, naming things and off-by-one errors.
Information forwarded to bug-guix <at> gnu.org: bug#74279; Package guix. (Sun, 10 Nov 2024 11:33:01 GMT)
Message #8 received at 74279 <at> debbugs.gnu.org:
Hi Tomas,
Tomas Volf <~@wolfsden.cz> skribis:
>         (start #~(lambda _
>                    (let* ((cmd "/run/privileged/bin/ping -qc1 -W1 1.1.1.1")
>                           (status (system cmd)))
>                      (= 0 (status:exit-val status)))))
>         (one-shot? #t)
>         ;; Try every second.
>         (respawn-delay 1)
>         ;; Retry forever. Double-quoting is intentional.
>         (respawn-limit ''(5 . 5)))))
[...]
> Nov 7 00:18:20 localhost shepherd[1]: Starting service network-online...
> [..]
> Nov 7 00:18:20 localhost shepherd[1]: [sh] PING 192.168.0.110 (192.168.0.110): 56 data bytes
> Nov 7 00:18:20 localhost shepherd[1]: [sh] /run/privileged/bin/ping: sending packet: Network is unreachable
> Nov 7 00:18:20 localhost shepherd[1]: Service network-online could not be started.
> Nov 7 00:18:20 localhost shepherd[1]: Service network-online failed to start.
I think there’s a misunderstanding here: ‘respawn?’ is about respawning
a service that, once it is running, terminates prematurely.
In your case, the service does not start (its ‘start’ method returns
#f).
Now, it would probably make sense to have a mechanism to retry starting
services.
In the specific case of ‘network-online’ though, you could use a
different approach: the ‘start’ method could itself retry pinging
the network several times and fail only if it failed to reach the
network after, say, 10s. (Remember that ‘start’ and ‘stop’ must
complete in a timely fashion.)
HTH,
Ludo’.
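A minimal sketch of the approach suggested above, in which ‘start’ itself
retries the ping and returns #f only once every attempt has failed. The
limit of 5 attempts and the one-second pause are illustrative assumptions,
chosen so that the worst case stays at roughly 10 seconds; they are not
values from the thread. The ‘respawn-delay’ and ‘respawn-limit’ fields are
dropped because, as explained above, respawning does not apply to a service
whose ‘start’ fails. The sketch also assumes that (gnu services),
(gnu services shepherd) and (guix gexp) are imported in the operating-system
file, and that calling ‘sleep’ from ‘start’ is acceptable for the Shepherd
version in use.

```scheme
;; Assumes: (use-modules (gnu services) (gnu services shepherd) (guix gexp))
(simple-service
 'network-online
 shepherd-root-service-type
 (list (shepherd-service
        (requirement '(networking))
        (provision '(network-online))
        (documentation "Wait for the network to come up.")
        (start #~(lambda _
                   (let ((cmd "/run/privileged/bin/ping -qc1 -W1 1.1.1.1"))
                     ;; Illustrative limit (an assumption): 5 attempts with
                     ;; a one-second pause between them.
                     (let loop ((attempts-left 5))
                       (cond
                        ;; 'eqv?' rather than '=' because 'status:exit-val'
                        ;; returns #f when ping was killed by a signal.
                        ((eqv? 0 (status:exit-val (system cmd))) #t)
                        ((> attempts-left 1)
                         (sleep 1)
                         (loop (- attempts-left 1)))
                        ;; Every attempt failed: report failure to start.
                        (else #f))))))
        (one-shot? #t))))
```

With this shape, services that list ‘network-online’ as a requirement would
be started only after one ping has succeeded, and ‘start’ still returns in
bounded time when the network never comes up.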
Added tag(s) notabug. Request was from Ludovic Courtès <ludo <at> gnu.org> to control <at> debbugs.gnu.org. (Wed, 20 Nov 2024 21:49:02 GMT)
bug closed, send any further explanations to 74279 <at> debbugs.gnu.org and Tomas Volf <~@wolfsden.cz>
Request was from Ludovic Courtès <ludo <at> gnu.org> to control <at> debbugs.gnu.org. (Wed, 20 Nov 2024 21:49:02 GMT)
bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Thu, 19 Dec 2024 12:24:08 GMT)
This bug report was last modified 184 days ago.
GNU bug tracking system
Copyright (C) 1999 Darren O. Benham,
1997,2003 nCipher Corporation Ltd,
1994-97 Ian Jackson.