GNU bug report logs - #76315
System does not boot after switching to system-log service


Package: guix

Reported by: Tomas Volf <~@wolfsden.cz>

Date: Sun, 16 Feb 2025 00:43:01 UTC

Severity: important

Done: Ludovic Courtès <ludo <at> gnu.org>

Bug is archived. No further changes may be made.


From: Ludovic Courtès <ludo <at> gnu.org>
To: Tomas Volf <~@wolfsden.cz>
Cc: 76315 <at> debbugs.gnu.org
Subject: bug#76315: System does not boot after switching to system-log service
Date: Fri, 21 Feb 2025 12:17:16 +0100
Hi,

Tomas Volf <~@wolfsden.cz> wrote:

>> After spending hours on this and fixing improbable issues in the
>> Shepherd (will push shortly), I found that the root of the problem is
>> exactly what I feared and which led to the patches at
>> <https://issues.guix.gnu.org/76262>.
>>
>> Namely, ‘dhcp-client-service-type’ calls ‘waitpid’; that call competes
>> with the one done by shepherd’s SIGCHLD handler and, if you’re unlucky,
>> it loses the race and waits forever.
>
> Observation here.  While yes, based on the description I agree that it
> is (bad) luck based, in practice it seems to be extremely reliable to
> reproduce.

Yes, I could reproduce it 100% with just ‘bare-bones.tmpl’.  Thing is,
as soon as you would change something non-trivial, for instance the
‘message-destination’ procedure of shepherd so that it writes everything
to /dev/console, the problem would go away.  Even just commenting out
some of the parameters passed to ‘system-log’ could make the problem
disappear (!), which is why it took me a lot of time to figure it out.
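
To illustrate the race: a child's exit status can be collected only
once, so two ‘waitpid’ callers targeting the same PID cannot both
succeed.  Here is a minimal Python sketch of that property.  It is not
Shepherd or Guix code, and the name ‘reap_twice’ is made up for the
example.  The two calls run one after the other so the outcome is
deterministic; here the loser gets ECHILD, whereas in the case reported
above the losing call never returned.

```python
import os


def reap_twice():
    """Fork a child, then try to collect its exit status twice."""
    pid = os.fork()
    if pid == 0:
        # Child: exit immediately with a recognizable status.
        os._exit(7)

    # First waiter: stands in for a SIGCHLD handler that reaps the child.
    _, status = os.waitpid(pid, 0)
    first = os.WEXITSTATUS(status)

    # Second waiter: stands in for service code that also calls waitpid
    # on the same PID.  The status is already gone, so this call fails.
    try:
        os.waitpid(pid, 0)
        second = "got status"
    except ChildProcessError:
        second = "ECHILD"

    return first, second


if __name__ == "__main__":
    print(reap_twice())
```

Whichever caller runs second loses; with a real signal handler the
order depends on timing, which would fit with small unrelated changes
making the problem appear or disappear.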

>> Could you try your config with the patch at
>> <https://issues.guix.gnu.org/76262#2>, at least in a VM and ideally on
>> the metal?

[...]

> I can confirm the patch 2 fixes the issue for me, both in the VM and on
> physical machine.

Yay!

> The only thing I have noticed is that even when deploying the "good"
> commit, I see the following error in the log:
>
> guix deploy: warning: an error occurred while upgrading services on '127.0.0.1':
> %exception #<inferior-object #<&service-not-found-error service: system-log>>

I think I understood this one now.

The old service has only one name: syslogd.  The new one, which upgrades
it, has two names: system-log and syslogd (system-log is its “canonical
name”).

The service upgrade machinery gets confused because it uses the
canonical name in one place.
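
As a rough illustration of that kind of mismatch, here is a Python
sketch.  It is an assumption-laden toy, not the actual upgrade code: the
names ‘lookup’, ‘find_by_canonical’, and ‘find_by_any_name’ are
hypothetical, and services are modeled as plain tuples of names with the
canonical name first.

```python
class ServiceNotFound(LookupError):
    """Raised when no running service provides the requested name."""


def lookup(running, name):
    """Return the running service (a tuple of names) that provides NAME."""
    for names in running:
        if name in names:
            return names
    raise ServiceNotFound(name)


def find_by_canonical(running, new_service):
    """Look up the service to replace using only the canonical name."""
    return lookup(running, new_service[0])


def find_by_any_name(running, new_service):
    """Look up the service to replace using any name the new one provides."""
    for name in new_service:
        try:
            return lookup(running, name)
        except ServiceNotFound:
            continue
    raise ServiceNotFound(new_service[0])


# Old service: one name.  New service: canonical name first, then alias.
running = [("syslogd",)]
new_service = ("system-log", "syslogd")

if __name__ == "__main__":
    print(find_by_any_name(running, new_service))
    try:
        find_by_canonical(running, new_service)
    except ServiceNotFound as err:
        print("service not found:", err)
```

In this toy model, the lookup by canonical name fails because the
running service is only known as ‘syslogd’, while the lookup over all
names finds it through the shared alias.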

I’ll investigate.

Ludo’.



