GNU bug report logs - #33968
errors in shepherd service constructors are not logged and lead to misleading status

Package: guix

Reported by: Florian Dold <florian.dold <at> gmail.com>

Date: Thu, 3 Jan 2019 21:37:02 UTC

Severity: important

Done: Ludovic Courtès <ludo <at> gnu.org>

Bug is archived. No further changes may be made.

From: help-debbugs <at> gnu.org (GNU bug Tracking System)
To: Florian Dold <florian.dold <at> gmail.com>
Subject: bug#33968: closed (Re: bug#33968: errors in shepherd service
 constructors are not logged and lead to misleading status)
Date: Thu, 15 Jun 2023 21:16:01 +0000
[Message part 1 (text/plain, inline)]
Your bug report

#33968: errors in shepherd service constructors are not logged and lead to misleading status

which was filed against the guix package, has been closed.

The explanation is attached below, along with your original report.
If you require more details, please reply to 33968 <at> debbugs.gnu.org.

-- 
33968: https://debbugs.gnu.org/cgi/bugreport.cgi?bug=33968
GNU Bug Tracking System
Contact help-debbugs <at> gnu.org with problems
[Message part 2 (message/rfc822, inline)]
From: Ludovic Courtès <ludo <at> gnu.org>
To: 33968-done <at> debbugs.gnu.org
Subject: Re: bug#33968: errors in shepherd service constructors are not
 logged and lead to misleading status
Date: Thu, 15 Jun 2023 23:15:29 +0200

Florian Dold <florian.dold <at> gmail.com> wrote:

> when defining a service type that extends shepherd-root-service-type and
> the 'start' function of the shepherd-service definition contains an
> error, the error is silently ignored.  No log output is generated at all.

[...]

> I generally feel like the state machine for services needs some work.
> In particular, it would be useful to distinguish between "failed" and
> "completed" services instead of conflating both states into "stopped".
> Or maybe have some more general mechanism for storing state about the
> service, instead of just the slot that usually contains the PID?

It’s been 4 years (!) but the good news is that all this is fixed as of
Shepherd 0.10.  Closing!

Ludo’.

[Message part 3 (message/rfc822, inline)]
From: Florian Dold <florian.dold <at> gmail.com>
To: bug-guix <at> gnu.org
Subject: errors in shepherd service constructors are not logged and lead to
 misleading status
Date: Thu, 3 Jan 2019 22:36:20 +0100
[Message part 4 (text/plain, inline)]
Hi Guix,

when defining a service type that extends shepherd-root-service-type and
the 'start' function of the shepherd-service definition contains an
error, the error is silently ignored.  No log output is generated at all.

For example (full system definition is attached):

(define (errtest-shepherd-service c)
  (list
    (shepherd-service
      (provision '(errtest))
      (documentation "Errtest")
      (requirement '(file-systems))
      (modules `((shepherd support) (ice-9 match) ,@%default-modules))
      (start #~(lambda args
                 (local-output "errtest start")
                 this-is-an-unbound-variable
                 (local-output "errtest end")
                 #t)))))


The log message "errtest start" appears in /var/log/messages, as
expected.  The next expression references an unbound variable, which
raises an error and aborts execution of the start function.

The error only becomes apparent when manually running "herd restart
errtest", which shows an error message (but without any error location
or stack trace).  The unbound-variable error itself is never logged,
and no log gives any indication that the service could not be started.
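
A minimal sketch of a possible workaround for Shepherd versions affected
by this report, assuming it is acceptable to wrap the body of 'start' in
Guile's 'catch': the handler reports the exception key and arguments
through local-output and returns #f.  It assumes local-output accepts a
format string followed by arguments, and the message text is purely
illustrative.

```scheme
(define (errtest-shepherd-service c)
  (list
    (shepherd-service
      (provision '(errtest))
      (documentation "Errtest, reporting errors raised by 'start'")
      (requirement '(file-systems))
      (modules `((shepherd support) (ice-9 match) ,@%default-modules))
      (start #~(lambda args
                 ;; Catch any exception raised while starting the service.
                 (catch #t
                   (lambda ()
                     (local-output "errtest start")
                     this-is-an-unbound-variable
                     (local-output "errtest end")
                     #t)
                   (lambda (key . rest)
                     ;; Assumption: local-output takes a format string and
                     ;; arguments.  Return #f to signal that start failed.
                     (local-output "errtest failed to start: ~s ~s" key rest)
                     #f)))))))
```

With this wrapper the handler would be called with the key
unbound-variable, so the failure should show up wherever "errtest start"
is logged.  It does not change what "herd status" reports.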

Furthermore, the "herd status" output for a service that encountered an
error in its start function is very misleading:

root <at> errtest ~# herd status errtest
Status of errtest:
  It is stopped.
  It is enabled.
  Provides (errtest).
  Requires (file-systems).
  Conflicts with ().
  Will be respawned.


It shows "Will be respawned", which is wrong.

I'd be happy to work on a patch, but it seems like there is some design
discussion necessary, in particular how the "Will be respawned" should
be handled.  Services have a "respawn?" flag, but of course respawning
can only work if the start function executed successfully (and only the
service process itself failed) in the first place.

I generally feel like the state machine for services needs some work.
In particular, it would be useful to distinguish between "failed" and
"completed" services instead of conflating both states into "stopped".
Or maybe have some more general mechanism for storing state about the
service, instead of just the slot that usually contains the PID?

- Florian
[config-error-reporting.scm (text/x-scheme, attachment)]

This bug report was last modified 1 year and 336 days ago.
