GNU bug report logs - #33239
'guix offload' regularly hangs in 'channel-get-exit-status' call


Package: guix

Reported by: ludo <at> gnu.org (Ludovic Courtès)

Date: Fri, 2 Nov 2018 10:58:02 UTC

Severity: important

Done: Ludovic Courtès <ludo <at> gnu.org>

Bug is archived. No further changes may be made.



Message #19 received at 33239 <at> debbugs.gnu.org:

From: ludo <at> gnu.org (Ludovic Courtès)
To: 33239 <at> debbugs.gnu.org
Subject: Re: bug#33239: 'guix offload' regularly hangs in
 'channel-get-exit-status' call
Date: Fri, 23 Nov 2018 18:25:21 +0100

ludo <at> gnu.org (Ludovic Courtès) writes:

> (gdb) bt
> #0  0x00007f299fb330f1 in __GI___poll (fds=0x1dd58c0, nfds=1, timeout=-1) at ../sysdeps/unix/sysv/linux/poll.c:29
> #1  0x00007f2994287577 in ssh_poll_ctx_dopoll () from target:/gnu/store/wmpg67bn7i7pqc0p4xjp1npnqixk9znd-libssh-0.7.6/lib/libssh.so.4
> #2  0x00007f29942884d9 in ssh_handle_packets () from target:/gnu/store/wmpg67bn7i7pqc0p4xjp1npnqixk9znd-libssh-0.7.6/lib/libssh.so.4
> #3  0x00007f29942885ad in ssh_handle_packets_termination () from target:/gnu/store/wmpg67bn7i7pqc0p4xjp1npnqixk9znd-libssh-0.7.6/lib/libssh.so.4
> #4  0x00007f2994275080 in ssh_channel_get_exit_status () from target:/gnu/store/wmpg67bn7i7pqc0p4xjp1npnqixk9znd-libssh-0.7.6/lib/libssh.so.4
> #5  0x00007f29946dd11a in guile_ssh_channel_get_exit_status () from target:/gnu/store/i3nfl17wfx7sryq6w15r9wxl7ilmq4rb-guile-ssh-0.11.3/lib/libguile-ssh.so.11
> #6  0x00007f29a1765965 in vm_regular_engine (thread=0x1dd58c0, vp=0x1d4df30, registers=0xffffffff, resume=-1615646479) at vm-engine.c:786
> #7  0x00007f29a1768fba in scm_call_n (proc=#<program 7f29a1be0030>, argv=argv <at> entry=0x7ffc76b1ece8, nargs=nargs <at> entry=1) at vm.c:1257
> #8  0x00007f29a16ecff7 in scm_primitive_eval (
>     exp=exp <at> entry=((@ (ice-9 control) %) (begin ((@@ (ice-9 command-line) load/lang) "/gnu/store/zz3b7j4iv6v143v7cqyr77k83zc5n3zw-guix-0.15.0-6.f9a8fce/bin/.guix-real") (main (command-line)) (quit)))) at eval.c:662
> #9  0x00007f29a16ed053 in scm_eval (
>     exp=((@ (ice-9 control) %) (begin ((@@ (ice-9 command-line) load/lang) "/gnu/store/zz3b7j4iv6v143v7cqyr77k83zc5n3zw-guix-0.15.0-6.f9a8fce/bin/.guix-real") (main (command-line)) (quit))), module_or_state=module_or_state <at> entry="#<struct module>" = {...}) at eval.c:696
> #10 0x00007f29a1738220 in scm_shell (argc=11, argv=0x1dd5280) at script.c:454
>
> (gdb) frame 0
> #0  0x00007f299fb330f1 in __GI___poll (fds=0x1dd58c0, nfds=1, timeout=-1) at ../sysdeps/unix/sysv/linux/poll.c:29
> 29      in ../sysdeps/unix/sysv/linux/poll.c
> (gdb) p *fds
> $1 = {fd = 14, events = 1, revents = 0}
> (gdb) shell ls -l /proc/12605/fd
> total 0
> lr-x------ 1 root root 64 Nov  2 11:20 0 -> 'pipe:[44413497]'
> l-wx------ 1 root root 64 Nov  2 11:33 1 -> 'pipe:[44413496]'
> lr-x------ 1 root root 64 Nov  2 11:33 10 -> 'pipe:[44459532]'
> l-wx------ 1 root root 64 Nov  2 11:33 11 -> 'pipe:[44459532]'
> lr-x------ 1 root root 64 Nov  2 11:33 12 -> 'pipe:[44429590]'
> l-wx------ 1 root root 64 Nov  2 11:33 13 -> 'pipe:[44429590]'
> lrwx------ 1 root root 64 Nov  2 11:33 14 -> 'socket:[44444783]'
> lrwx------ 1 root root 64 Nov  2 11:33 15 -> 'socket:[44444784]'
> l-wx------ 1 root root 64 Nov  2 11:33 16 -> /var/guix/offload/141.80.167.140/0
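
The `ls -l /proc/12605/fd` step above maps the polled descriptor (fd 14) to a socket. As a rough illustration only, the same lookup can be sketched in Python on Linux. The helper name `describe_fds` and the throwaway socket pair are hypothetical; nothing here comes from Guix or Guile-SSH.

```python
import os
import socket


def describe_fds(pid="self"):
    """Map each open descriptor of a process to its /proc symlink target.

    Hypothetical helper; assumes a Linux /proc filesystem.
    """
    fd_dir = f"/proc/{pid}/fd"
    targets = {}
    for name in os.listdir(fd_dir):
        try:
            targets[int(name)] = os.readlink(os.path.join(fd_dir, name))
        except OSError:
            # The descriptor was closed between listdir() and readlink().
            continue
    return targets


# Stand-in for the SSH connection: any socket shows up as 'socket:[inode]'.
left, right = socket.socketpair()
fds = describe_fds()
sock_target = fds[left.fileno()]
print(left.fileno(), "->", sock_target)
left.close()
right.close()
```

Passing a numeric PID instead of `"self"` inspects another process, subject to the usual permission checks.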

When that happens, the guile process on the remote node that runs the
‘redirect’ code of ‘remote-daemon-channel’ is stuck in select(2) with
infinite timeout.
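
To illustrate why both ends can sit there forever, here is a minimal Python sketch of a poll on a readable event, which is what `events = 1` (POLLIN) with `timeout=-1` amounts to in frame #0. It is not the libssh or Guile-SSH code: the socket pair stands in for the SSH connection, and `wait_readable` is a hypothetical helper. With a finite timeout the silent peer is detected; with a negative timeout the first call would simply never return.

```python
import select
import socket


def wait_readable(sock, timeout_ms):
    """Return True if sock is readable (or at EOF) within timeout_ms.

    A negative timeout_ms would block indefinitely, like timeout=-1 in poll(2).
    """
    poller = select.poll()
    poller.register(sock, select.POLLIN)
    return bool(poller.poll(timeout_ms))


# Stand-in for the two ends of the connection (assumption for illustration).
local, peer = socket.socketpair()

# The peer says nothing: only the finite timeout lets us return at all.
silent = wait_readable(local, 200)

# Once the peer sends something, the same poll returns immediately.
peer.sendall(b"exit-status")
after_data = wait_readable(local, 200)

print("readable while peer is silent:", silent)
print("readable after peer wrote:", after_data)

local.close()
peer.close()
```

Whether a finite timeout is the right fix for this hang is a separate question; the sketch only shows the blocking behaviour.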

Note that on berlin the build nodes are still running Guile 2.2.2, which
is vulnerable to the ‘select’ bug <https://bugs.gnu.org/30365> that
‘redirect’ supposedly works around.

Ludo’.




This bug report was last modified 6 years and 128 days ago.


