GNU bug report logs - #37309
'ssh-daemon' fails to start

Previous Next

Package: guix;

Reported by: Giovanni Biscuolo <g <at> xelera.eu>

Date: Thu, 5 Sep 2019 13:19:03 UTC

Severity: important

Tags: fixed, unreproducible

Merged with 30993, 33299, 34580

Done: maxim.cournoyer <at> gmail.com

Bug is archived. No further changes may be made.

Full log


Message #8 received at 37309 <at> debbugs.gnu.org (full text, mbox):

From: iyzsong <at> member.fsf.org (宋文武)
To: Giovanni Biscuolo <g <at> xelera.eu>
Cc: Ludovic Courtès <ludo <at> gnu.org>, 37309 <at> debbugs.gnu.org
Subject: Re: bug#37309: ‘ssh-daemon’ service fails
 to start at boot
Date: Sun, 08 Sep 2019 12:19:14 +0800
Giovanni Biscuolo <g <at> xelera.eu> writes:

> Hi,
>
> following a recent discussion on guix-sysadmin I have to confirm the
> ssh-daemon issue since it is still happening on some of the machines I
> administer
>
> Previous possibly related bug reports are
> https://issues.guix.gnu.org/issue/30993 and
> https://issues.guix.gnu.org/issue/32197
>
> Unfortunately this issue is *not* well reproducible, it depends on some
> mysterious (to me) timing factor; AFAIU it does *not* depend on the
> shepherd version, probably it depends on "something" related to IPv6
> (read below the details)

Hello, thank you for this report, it's reproducible with my box that has
an old hard disk, and disable IPv6 for sshd does fix the issue for me...

>
> Andreas Enge <andreas <at> enge.fr> writes:
>
> [...]
>
>> My impression is that the problem is still there. I am quite certain it
>> happened when I rebooted dover, since I had to connect on the serial console
>> to manually restart the ssh service.
>
> I'm sure it happened when milano-guix-1 was rebooted due to data centre
> maintenance and happened yesterday to one of my personal Guix machines at
> office
>
> [...]
>
> My situation is similar to the one observed by Andreas
>
>> Well, it is in /var/log/messages:
>> Aug  3 21:11:38 localhost sshd[360]: Server listening on 0.0.0.0 port 22.
>> Aug  3 21:11:55 localhost shepherd[1]: Service ssh-daemon could not be started.
>
> [...]
> Sep  4 21:46:02 localhost shepherd[1]: Service syslogd has been started.
> [...]
> Sep  4 21:46:03 localhost shepherd[1]: Service loopback has been started.
> [...]
> Sep  4 21:46:22 localhost vmunix: [    0.226337] PCI: Using configuration type 1 for base access
> Sep  4 21:46:09 localhost dhclient: DHCPREQUEST for 10.38.2.16 on eno1 to 255.255.255.255 port 67
> [...]
> Sep  4 21:46:24 localhost shepherd[1]: Service networking has been started.
> [...]
> Sep  4 21:46:12 localhost sshd[577]: Server listening on 0.0.0.0 port 22.
> [...]
> Sep  4 21:46:30 localhost vmunix: [    0.250107] ACPI: PCI Interrupt Link [LNKA] (IRQs 3 4 5 6 10 *11 12 14 15)
> Sep  4 21:46:13 localhost dhclient: DHCPREQUEST for 10.38.2.16 on eno1 to 255.255.255.255 port 67
> [...]
> Sep  4 21:46:16 localhost dhclient: DHCPACK of 10.38.2.16 from 10.38.2.1
> [...]
> Sep  4 21:46:33 localhost shepherd[1]: Service ssh-daemon could not be started.
> [...]
> Sep  4 21:46:47 localhost vmunix: [    0.731142] Segment Routing with IPv6
>
>
> Please note the timing of the dhclient and the sshd processes: I
> inserted them as printed in /var/log/messages but they are not
> time-sequential: does it means something or is irrelevant?
>
> So the sshd process started (as far as I cen see there is no trace it
> was stopped) and pretty soon shepherd noticed ssh-daemon was not
> started.
>
> Logging in from the console I see the ssh-daemon is stopped but enabled:
>
> Status of ssh-daemon:
>   It is stopped.
>   It is enabled.
>   Provides (ssh-daemon).
>   Requires (syslogd loopback).
>   Conflicts with ().
>   Will be respawned.
>
>
> [...]

Yes, I think when 'ssh-daemon' failed to start, shepherd should respawn
it until success or disable it, but by look at the code of
'make-forkexec-constructor', when using 'pid-file' (as 'ssh-ademon'
does), and a timeout (default to 5s %pid-file-timeout) is reached, the
processes got a 'SIGTERM' and return '#f' as its running state, which
won't be respawn (it's not a pid number) I guess...

To ludo: Is my analysis correct?  It's not clear to me how to fix it so
'ssh-daemon' can be respawn though...

>
> If I start it via `sudo herd start ssh-daemon` it immediatly starts,
> like in Andreas experience:
>
>> Aug  3 21:13:10 localhost sshd[385]: Server listening on 0.0.0.0 port 22.
>> Aug  3 21:13:10 localhost sshd[385]: Server listening on :: port 22.
>> Aug  3 21:13:11 localhost shepherd[1]: Service ssh-daemon has been started.
>
> Sep  5 13:38:55 localhost sshd[745]: Server listening on 0.0.0.0 port 22.
> Sep  5 13:38:55 localhost sshd[745]: Server listening on :: port 22.
> Sep  5 13:38:55 localhost shepherd[1]: Service ssh-daemon has been started.
>
>
> Please notice the difference from above: this time the sshd server is
> also listening on the IPv6 address :: while in the above log it was only
> listening on the 0.0.0.0 IPv4 address
>
> Does the failure have something to do with IPv6 not available when sshd
> starts for the first time after a reboot?

I agree, as adding '(extra-content "ListenAddress 0.0.0.0")' to my
'openssh-configuration' to skip the ipv6 listen fix this issue for me.

A proper fix should be respawn 'ssh-daemon' and start it after 'ipv6
available' (i don't know what this mean yet..).




This bug report was last modified 4 years and 171 days ago.

Previous Next


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.