GNU bug report logs - #72166
Shepherd periodically goes unresponsive on one of my machines
Message #5 received at submit <at> debbugs.gnu.org:
I've been running into an issue with Shepherd on one of my machines. Every so often (and I haven't figured out what conditions trigger it), my Shepherd instances (both home and PID 1) will go unresponsive. I thought I had tracked it down to a misbehaving home service that I had configured, but it's just happened again without that service running.
'herd status' hangs indefinitely:
jfred <at> terracard ~$ sudo herd status
Password:
<never returns>
...on both instances:
jfred <at> terracard ~$ herd status
<never returns>
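Since the command never returns on its own, a check like this needs a timeout to be usable from a script. The following is a minimal sketch of one way to do that; the helper name `responds_within` and the 10-second limit are illustrative assumptions, not part of Shepherd or of this report.

```python
import subprocess
import sys

def responds_within(cmd, seconds):
    """Return True if cmd exits within `seconds`, False if it timed out."""
    try:
        subprocess.run(
            cmd,
            stdin=subprocess.DEVNULL,   # never block on a prompt
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=seconds,            # the child is killed on expiry
        )
    except subprocess.TimeoutExpired:
        return False
    return True

# Example call (assumes `herd` is on PATH; the limit is arbitrary):
#   responds_within(["herd", "status"], 10)
```

This only tells you whether the command finished in time, not why it hung. It also does not cover the `sudo` case, because the password prompt cannot be answered with stdin closed.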
The PID 1 shepherd instance isn't reaping defunct processes:
jfred <at> terracard ~$ ps aux | grep -i lock
jfred 541 0.0 0.0 3700 2304 ? S 18:30 0:00 swayidle -w timeout 300 swaylock -f -i ~/.wallpapers/user-manual.jpg timeout 10 if pgrep swaylock; then swaymsg "output * dpms off"; fi resume swaymsg "output * dpms on" before-sleep swaylock -f -i ~/.wallpapers/user-manual.jpg
jfred 3111 0.0 0.0 0 0 ? Z 18:53 0:00 [swaylock] <defunct>
jfred 3112 0.0 0.0 0 0 ? Zs 18:53 0:00 [swaylock] <defunct>
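The `ps` output above does not show which process is the parent of the defunct entries. As a rough sketch, the same information can be read from `/proc/<pid>/stat` together with the parent PID, which would show whether the zombies are waiting on PID 1. The function names and the `proc_root` parameter are illustrative.

```python
import os

def parse_stat(text):
    """Return (pid, comm, state, ppid) from the contents of /proc/<pid>/stat."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so split on the first '(' and the last ')'.
    left = text.index("(")
    right = text.rindex(")")
    pid = int(text[:left])
    comm = text[left + 1:right]
    fields = text[right + 1:].split()
    return pid, comm, fields[0], int(fields[1])

def find_zombies(proc_root="/proc"):
    """Return sorted (pid, comm, ppid) tuples for processes in state Z."""
    zombies = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "stat")) as f:
                pid, comm, state, ppid = parse_stat(f.read())
        except (OSError, ValueError, IndexError):
            continue  # process exited or entry unreadable
        if state == "Z":
            zombies.append((pid, comm, ppid))
    return sorted(zombies)

# Example call:
#   for pid, comm, ppid in find_zombies():
#       print(pid, comm, "parent:", ppid)
```

A zombie whose parent is 1 can only be reaped by the PID 1 shepherd, so entries like that would be consistent with shepherd no longer handling child exits.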
Some further troubleshooting... strace indicates that one thread of PID 1 (144) is blocked in a read() on fd 9, while the other threads shown are waiting on a futex:
jfred <at> terracard ~ [env]$ sudo strace -fp 1
Password:
strace: Process 1 attached with 5 threads
[pid 144] read(9, <unfinished ...>
[pid 142] futex(0x7fa43892abe8, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 0, NULL, FUTEX_BITSET_MATCH_ANY <unfinished ...>
[pid 141] futex(0x7fa43892abe8, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 0, NULL, FUTEX_BITSET_MATCH_ANY <unfinished ...>
[pid 140] futex(0x7fa43892abe8, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 0, NULL, FUTEX_BITSET_MATCH_ANY^
...and fd 9 seems to be a pipe that only shepherd itself has open (fd 9 for reading, fd 11 for writing):
jfred <at> terracard ~ [env]$ sudo ls -l /proc/1/fd/9
lr-x------ 1 root root 64 Jul 17 20:39 /proc/1/fd/9 -> 'pipe:[4015]'
jfred <at> terracard ~ [env]$ sudo lsof -n | grep 4015
lsof: WARNING: can't stat() fuse.portal file system /run/user/1000/doc
Output information may be incomplete.
shepherd 1 root 9r FIFO 0,15 0t0 4015 pipe
shepherd 1 root 11w FIFO 0,15 0t0 4015 pipe
shepherd 1 140 GC-marker root 9r FIFO 0,15 0t0 4015 pipe
shepherd 1 140 GC-marker root 11w FIFO 0,15 0t0 4015 pipe
shepherd 1 141 GC-marker root 9r FIFO 0,15 0t0 4015 pipe
shepherd 1 141 GC-marker root 11w FIFO 0,15 0t0 4015 pipe
shepherd 1 142 GC-marker root 9r FIFO 0,15 0t0 4015 pipe
shepherd 1 142 GC-marker root 11w FIFO 0,15 0t0 4015 pipe
shepherd 1 144 shepherd root 9r FIFO 0,15 0t0 4015 pipe
shepherd 1 144 shepherd root 11w FIFO 0,15 0t0 4015 pipe
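lsof warns here that its output may be incomplete, so it may be worth cross-checking by reading the `/proc/<pid>/fd` symlinks directly. The following is a minimal sketch; `pipe_holders` and its `proc_root` parameter are hypothetical names, and it needs to run as root to see other users' descriptors.

```python
import os

def pipe_holders(inode, proc_root="/proc"):
    """Return sorted (pid, fd) pairs whose fd symlink is 'pipe:[inode]'."""
    target = f"pipe:[{inode}]"
    holders = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        fd_dir = os.path.join(proc_root, entry, "fd")
        try:
            fds = os.listdir(fd_dir)
        except OSError:
            continue  # no permission, or the process has exited
        for fd in fds:
            try:
                if os.readlink(os.path.join(fd_dir, fd)) == target:
                    holders.append((int(entry), int(fd)))
            except (OSError, ValueError):
                continue  # fd closed in the meantime
    return sorted(holders)

# Example call, using the inode from the output above:
#   pipe_holders(4015)
```

If this also returns only PID 1 for both descriptors, then nothing outside shepherd can write to the pipe, and the blocked read() can only complete when shepherd itself writes to fd 11.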
I last ran a 'guix pull' on June 21. My system configuration for this machine can be found here: https://github.com/jfrederickson/dotfiles/blob/master/guix/guix/system/machines/terracard/config.scm
Has anyone else run into this?
This bug report was last modified 282 days ago.