GNU bug report logs - #52654
shepherd hangs whole system upon encountering error in start slot
To reply to this bug, email your comments to 52654 AT debbugs.gnu.org.
Report forwarded to bug-guix <at> gnu.org: bug#52654; Package guix. (Sun, 19 Dec 2021 05:14:02 GMT)
Acknowledgement sent to raingloom <raingloom <at> riseup.net>: New bug report received and forwarded. Copy sent to bug-guix <at> gnu.org. (Sun, 19 Dec 2021 05:14:02 GMT)
Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):
I'm writing a single-shot shepherd-service that expands the (ext4) root
file system on first boot, using the hostname service as a template,
just passing the script as a G-expression, instead of using the
forkexec constructor.
Of course there is a bug in it. Trouble is, I have no idea what it is,
because Shepherd won't tell me. :)
The VM boots and completes the ssh initialization phase and then
apparently just gets stuck. Doesn't even show a login prompt.
It's... not a great debugging experience.
I'm going to attempt to at the very least add some error reporting.
It would also be really nice if the failure modes for Shepherd services
were better documented, like what happens when the procedure passed in
the `start` field fails, or is not even a procedure, etc.
Since I never touched Shepherd internals, help would be greatly
appreciated.
ps.: I'm attaching the system definition for completeness's sake and so
that someone might point out where the error is, but honestly the exact
bug in my code does not matter for the feature. All that matters is
there is an error and it should be logged but isn't.
[cloud-deploy-bootstrap.scm (text/x-scheme, attachment)]
Information forwarded to bug-guix <at> gnu.org: bug#52654; Package guix. (Sun, 19 Dec 2021 06:03:02 GMT)
Message #8 received at 52654 <at> debbugs.gnu.org (full text, mbox):
On Sun, 19 Dec 2021 06:13:20 +0100
raingloom <raingloom <at> riseup.net> wrote:
> I'm writing a single-shot shepherd-service that expands the (ext4)
> root file system on first boot, using the hostname service as a
> template, just passing the script as a G-expression, instead of using
> the forkexec constructor.
> Of course there is a bug in it. Trouble is, I have no idea what it is,
> because Shepherd won't tell me. :)
> The VM boots and completes the ssh initialization phase and then
> apparently just gets stuck. Doesn't even show a login prompt.
> It's... not a great debugging experience.
> I'm going to attempt to at the very least add some error reporting.
> It would also be really nice if the failure modes for Shepherd
> services were better documented, like what happens when the procedure
> passed in the `start` field fails, or is not even a procedure, etc.
> Since I never touched Shepherd internals, help would be greatly
> appreciated.
>
> ps.: I'm attaching the system definition for completeness's sake and
> so that someone might point out where the error is, but honestly the
> exact bug in my code does not matter for the feature. All that
> matters is there is an error and it should be logged but isn't.
So the error in my config turned out to be the G-expression not
evaluating to a lambda, but the issue with Shepherd still stands.
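
To illustrate the class of mistake described above, here is a minimal sketch in plain Guile, without G-expressions or Shepherd. The attached system definition is not reproduced in this log, so every name below (`resize-root-file-system`, `bad-start`, `good-start`, `valid-start?`) is a hypothetical stand-in rather than the reporter's code. The point is only that a `start` value must evaluate to a procedure, while a bare `begin` form evaluates to the result of its last expression.

```scheme
;; Hypothetical placeholder for the real work (growing the root
;; file system); it is not a Guix or Shepherd procedure.
(define (resize-root-file-system)
  'resized)

;; Mistake: the body runs right away and the value left over is its
;; result (a symbol here), not something that can be called later.
(define bad-start
  (begin (resize-root-file-system)))

;; Intended shape: a procedure that does the work when it is called.
;; `lambda _' accepts any number of arguments.
(define good-start
  (lambda _
    (resize-root-file-system)
    #t))

;; Hypothetical sanity check one could run before handing the value on.
(define (valid-start? value)
  (procedure? value))
```

With these definitions, `(valid-start? good-start)` is true and `(valid-start? bad-start)` is false, which is the kind of check that could be reported before the value is ever called.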
Changed bug title to 'shepherd hangs whole system upon encountering error in start slot' from 'shepherd lacks error reporting'. Request was from Maxim Cournoyer <maxim.cournoyer <at> gmail.com> to control <at> debbugs.gnu.org. (Sat, 29 Apr 2023 15:20:02 GMT)
Severity set to 'important' from 'normal'. Request was from Maxim Cournoyer <maxim.cournoyer <at> gmail.com> to control <at> debbugs.gnu.org. (Sat, 29 Apr 2023 15:20:03 GMT)
Information forwarded to bug-guix <at> gnu.org: bug#52654; Package guix. (Sat, 29 Apr 2023 15:23:01 GMT)
Message #15 received at 52654 <at> debbugs.gnu.org (full text, mbox):
Hi,
I also encountered that issue; it's really puzzling.
Here's the problematic start slot that got my mpd test to hang the boot,
with the last message being "Please wait while gathering entropy to
generate the key pair;":
--8<---------------cut here---------------start------------->8---
(start
 (with-imported-modules (source-module-closure
                         '((gnu build activation)))
   #~(begin
       (use-modules (gnu build activation))
       (let ((user (getpw #$username)))
         (define (init-directory directory)
           (unless (file-exists? directory)
             (mkdir-p/perms directory user #o755)))
         (for-each
          init-directory
          (cons '#$(map dirname
                        ;; XXX: Delete the potential "syslog"
                        ;; log-file value, which is not a directory.
                        (delete "syslog"
                                (filter-map maybe-value
                                            (list db-file
                                                  log-file
                                                  state-file
                                                  sticker-file)))))))
       (make-forkexec-constructor
        (list #$(file-append package "/bin/mpd") "--no-daemon"
              #$config-file)
        #:environment-variables '#$environment-variables))))
--8<---------------cut here---------------end--------------->8---
The error was the lonely cons, which is called with a single argument. With it removed, the test passed:
--8<---------------cut here---------------start------------->8---
(start
 (with-imported-modules (source-module-closure
                         '((gnu build activation)))
   #~(begin
       (use-modules (gnu build activation))
       (let ((user (getpw #$username)))
         (define (init-directory directory)
           (unless (file-exists? directory)
             (mkdir-p/perms directory user #o755)))
         (for-each
          init-directory
          '#$(map dirname
                  ;; XXX: Delete the potential "syslog"
                  ;; log-file value, which is not a directory.
                  (delete "syslog"
                          (filter-map maybe-value
                                      (list db-file
                                            log-file
                                            state-file
                                            sticker-file))))))
       (make-forkexec-constructor
        (list #$(file-append package "/bin/mpd") "--no-daemon"
              #$config-file)
        #:environment-variables '#$environment-variables))))
--8<---------------cut here---------------end--------------->8---
Shepherd should report the error, fail that one service and attempt to
keep booting (if the service is not required by other critical ones).
--
Thanks,
Maxim
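
The behaviour requested above (report the error, fail that one service, keep going) can be sketched in plain Guile. This is not Shepherd's implementation and does not use Shepherd's API; `call-with-error-reporting` and `faulty-start` are hypothetical names introduced only for illustration. The sketch wraps a start thunk in `catch`, prints the exception key and arguments to the error port, and returns `#f`, which is the conventional "failed to start" value for a start procedure.

```scheme
;; Hypothetical helper: run THUNK, and on any exception log it under
;; NAME and return #f instead of letting the error propagate.
(define (call-with-error-reporting name thunk)
  (catch #t
    thunk
    (lambda (key . args)
      (format (current-error-port)
              "service ~a: start failed: ~s ~s~%" name key args)
      #f)))

;; Hypothetical start thunk reproducing the mistake from the mpd
;; snippet: `cons' receives one argument instead of two.  `apply' is
;; used so that the arity error is raised at run time.
(define (faulty-start)
  (for-each display (apply cons (list '("a" "b"))))
  #t)
```

Under these assumptions, `(call-with-error-reporting 'mpd faulty-start)` logs a wrong-number-of-arguments error and evaluates to `#f`, while a thunk that completes normally has its value passed through unchanged. Where such a wrapper would belong inside Shepherd is left open here.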
This bug report was last modified 2 years and 46 days ago.