GNU bug report logs - #58485
[shepherd] EADDRINUSE while restarting ssh-daemon
Reported by: Lars-Dominik Braun <lars <at> 6xq.net>
Date: Thu, 13 Oct 2022 07:53:01 UTC
Severity: important
Done: Ludovic Courtès <ludo <at> gnu.org>
Bug is archived. No further changes may be made.
Hi,
Sorry for the late reply. I’m going through Shepherd bug reports and I
remembered this discussion…
Lars-Dominik Braun <ldb <at> leibniz-psychology.org> skribis:
>> Can you confirm shepherd (PID 1) is 0.9.3?
> it is:
>
> root 1 0.2 0.2 308148 76816 ? Sl Feb07 52:08 /gnu/store/kphp5d85rrb3q1rdc2lfqc1mdklwh3qp-guile-3.0.9/bin/guile --no-auto-compile /gnu/store/4nw0zb4swga0cb8i35nvng3rg6z5qm8p-shepherd-0.9.3/bin/shepherd --config /gnu/store/cvrai6z8777jf7860rnvppfznl1lcxi1-shepherd.conf
>
>> ‘sudo herd restart ssh-daemon’ works fine on my laptop FWIW.
> This works fine too. Only unattended-upgrades seems to have this issue :/
>
> The strace looks unsuspicious right now:
>
> ---snip---
> 1 14:12:15.117035 read(21, "(shepherd-command (version 0) (action restart) (service ssh-daemon) (arguments ()) (directory \"/root\"))", 1024) = 103
> 1 14:12:15.117254 close(27) = 0
> 1 14:12:15.117283 close(30) = 0
> 1 14:12:15.117416 newfstatat(AT_FDCWD, "/etc/localtime", {st_dev=makedev(0x8, 0x2), st_ino=110100491, st_mode=S_IFREG|0444, st_nlink=1, st_uid=0, st_gid=0, st_blksize=4096, st_blocks=8, st_size=2298, st_atime=1676898665 /* 2023-02-20T14:11:05.338746772+0100 */, st_atime_nsec=338746772, st_mtime=1676898664 /* 2023-02-20T14:11:04.874743456+0100 */, st_mtime_nsec=874743456, st_ctime=1676898664 /* 2023-02-20T14:11:04.874743456+0100 */, st_ctime_nsec=874743456}, 0) = 0
> 1 14:12:15.117475 write(17, "shepherd[1]: Service ssh-daemon has been stopped.\n", 50) = 50
> 1 14:12:15.117524 socket(AF_INET, SOCK_STREAM|SOCK_CLOEXEC|SOCK_NONBLOCK, IPPROTO_IP) = 26
> 1 14:12:15.117561 setsockopt(26, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
> 1 14:12:15.117598 bind(26, {sa_family=AF_INET, sin_port=htons(2222), sin_addr=inet_addr("0.0.0.0")}, 16) = -1 EADDRINUSE (Address already in use)
> 1 14:12:15.117724 write(21, "(reply (version 0) (result #f) (error (error (version 0) action-exception start ssh-daemon system-error (\"bind\" \"~A\" (\"Address already in use\") (98)))) (messages (\"Service ssh-daemon has been stopped.\")))", 204) = 204
> 1 14:12:15.117754 close(21) = 0
This suggests ‘bind’ can return EADDRINUSE even though the sockets have
been closed before (presumably file descriptors 27 and 30 above).
Can you confirm nothing else is competing to bind port 2222 on that
machine?
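To illustrate the "something else is competing" case, here is a minimal Python sketch (not Shepherd code). It assumes a hypothetical competitor socket that is still listening on the port. The helper name try_bind, the loopback address, and the kernel-chosen port are made up for the example. On Linux, SO_REUSEADDR does not allow a second socket to bind a port on which another socket is actively listening, so the first attempt is expected to fail with EADDRINUSE and the second to succeed once the competitor is closed.

```python
import errno
import socket


def try_bind(address):
    """Try to bind a TCP socket with SO_REUSEADDR set, as in the strace.

    Returns the bound socket on success, or the errno value on failure.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    try:
        sock.bind(address)
        return sock
    except OSError as err:
        sock.close()
        return err.errno


# Hypothetical competitor: some other socket already listening on the port.
competitor = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
competitor.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
competitor.bind(("127.0.0.1", 0))  # port 0: let the kernel pick a free port
competitor.listen()
address = competitor.getsockname()

# Attempt to bind while the competitor is still listening.
while_listening = try_bind(address)

# Attempt again after the competitor has gone away.
competitor.close()
after_close = try_bind(address)
```

The strace above does not show such a competitor, so this only demonstrates one condition under which the error can appear, not a diagnosis.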
I tried to reproduce it with something as brutal as:
  while sudo herd restart sshd ; do : ; done
… to no avail (I’m on current Shepherd ‘master’ though).
Maybe we should just have shepherd retry upon EADDRINUSE (like nginx
does, as you wrote), though I’d like to understand under what conditions
we can get EADDRINUSE in the first place.
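The retry idea could look roughly like the following Python sketch. It is not how nginx or the Shepherd implement it: the function name bind_with_retry, the attempt count, and the delay are assumptions chosen for illustration.

```python
import errno
import socket
import time


def bind_with_retry(address, attempts=5, delay=0.2):
    """Bind a listening TCP socket, retrying while the address is in use.

    `attempts` and `delay` are illustrative values, not taken from nginx
    or from the Shepherd.
    """
    for attempt in range(1, attempts + 1):
        sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        try:
            sock.bind(address)
            sock.listen()
            return sock
        except OSError as err:
            sock.close()
            # Only EADDRINUSE is treated as transient; any other error,
            # and the last failed attempt, is re-raised to the caller.
            if err.errno != errno.EADDRINUSE or attempt == attempts:
                raise
            time.sleep(delay)
```

A retry like this only papers over a transient conflict. If something else holds the port permanently, the last attempt still raises EADDRINUSE, which is why understanding the underlying condition remains worthwhile.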
Ludo’.
This bug report was last modified 2 years and 18 days ago.