Package: guix
Reported by: Ludovic Courtès <ludovic.courtes <at> inria.fr>
Date: Tue, 22 Nov 2022 22:15:02 UTC
Severity: normal
Done: Ludovic Courtès <ludo <at> gnu.org>
Bug is archived. No further changes may be made.
From: Ludovic Courtès <ludovic.courtes <at> inria.fr>
To: 59493 <at> debbugs.gnu.org
Cc: Mathieu Othacehe <othacehe <at> gnu.org>
Subject: bug#59493: cuirass-remote-worker crash
Date: Tue, 22 Nov 2022 23:14:05 +0100
Hi,

In /var/log/cuirass-remote-worker.log on overdrive1.guix, I found this:

--8<---------------cut here---------------start------------->8---
2022-11-21 14:27:24 Backtrace:
2022-11-21 14:27:24 Backtrace:
2022-11-21 14:27:24 In ice-9/boot-9.scm:
2022-11-21 14:27:24 In ice-9/boot-9.scm:
2022-11-21 14:27:24   1752:10 10 (with-exception-handler _ _ #:unwind? _ # _)
2022-11-21 14:27:24 In unknown file:
2022-11-21 14:27:24   9 (apply-smob/0 #<thunk 3903a300>)
2022-11-21 14:27:24 In ice-9/boot-9.scm:
2022-11-21 14:27:24   724:2 8 (call-with-prompt _ _ #<procedure default-prompt-handle?>)
2022-11-21 14:27:24 In ice-9/eval.scm:
2022-11-21 14:27:24   1752:10 10 (with-exception-handler _ _ #:unwind? _ # _)
2022-11-21 14:27:24   619:8 7 (_ #(#(#<directory (guile-user) 3903dc80>)))
2022-11-21 14:27:24 In cuirass/ui.scm:
2022-11-21 14:27:24 In unknown file:
2022-11-21 14:27:24   9 (apply-smob/0 #<thunk 3903a300>)
2022-11-21 14:27:24   104:10 6 (run-cuirass-command _ . _)
2022-11-21 14:27:24 In ice-9/boot-9.scm:
2022-11-21 14:27:24 In ice-9/boot-9.scm:
2022-11-21 14:27:24   724:2 8 (call-with-prompt _ _ #<procedure default-prompt-handle?>)
2022-11-21 14:27:24   1752:10 5 (with-exception-handler _ _ #:unwind? _ # _)
2022-11-21 14:27:24 In ice-9/eval.scm:
2022-11-21 14:27:24 In cuirass/scripts/remote-worker.scm:
2022-11-21 14:27:24   619:8 7 (_ #(#(#<directory (guile-user) 3903dc80>)))
2022-11-21 14:27:24 In cuirass/ui.scm:
2022-11-21 14:27:24   104:10 6 (run-cuirass-command _ . _)
2022-11-21 14:27:24   435:12 4 (_)
2022-11-21 14:27:24 In srfi/srfi-1.scm:
2022-11-21 14:27:24 In ice-9/boot-9.scm:
2022-11-21 14:27:24   1752:10 5 (with-exception-handler _ _ #:unwind? _ # _)
2022-11-21 14:27:24   634:9 3 (for-each #<procedure 398a3510 at cuirass/scripts/remo?> ?)
2022-11-21 14:27:24 In cuirass/scripts/remote-worker.scm:
2022-11-21 14:27:24 In cuirass/scripts/remote-worker.scm:
2022-11-21 14:27:24   448:18 2 (_ _)
2022-11-21 14:27:24   435:12 4 (_)
2022-11-21 14:27:24 In srfi/srfi-1.scm:
2022-11-21 14:27:24   634:9 3 (for-each #<procedure 398a3510 at cuirass/scripts/remo?> ?)
2022-11-21 14:27:24   356:11 1 (start-worker _ _)
2022-11-21 14:27:24 In cuirass/scripts/remote-worker.scm:
2022-11-21 14:27:24 In ice-9/boot-9.scm:
2022-11-21 14:27:24   448:18 2 (_ _)
2022-11-21 14:27:24   1685:16 0 (raise-exception _ #:continuable? _)
2022-11-21 14:27:24
2022-11-21 14:27:24 ice-9/boot-9.scm:1685:16: In procedure raise-exception:
2022-11-21 14:27:24 Throw to key `match-error' with args `("match" "no matching pattern" (#vu8()))'.
2022-11-21 14:27:24   356:11 1 (start-worker _ _)
2022-11-21 14:27:24 In ice-9/boot-9.scm:
2022-11-21 14:27:24   1685:16 0 (raise-exception _ #:continuable? _)
2022-11-21 14:27:24
2022-11-21 14:27:24 ice-9/boot-9.scm:1685:16: In procedure raise-exception:
2022-11-21 14:27:24 Throw to key `match-error' with args `("match" "no matching pattern" (#vu8()))'.
--8<---------------cut here---------------end--------------->8---

(Stuttering is due to the unprotected use of ‘primitive-fork’: a
non-local exit in the child leads it to execute the same code as its
parent.  We should fix that, but should we really fork in the first
place?  :-))

This comes from here:

--8<---------------cut here---------------start------------->8---
(define (read-server-info socket)
  (request-info socket)
  (match (zmq-get-msg-parts-bytevector socket '())   ;<-- here
    ((empty info)
     (match (zmq-read-message (bv->string info))
       (('server-info
         ('worker-address worker-address)
         ('log-port log-port)
         ('publish-port publish-port))
        (list worker-address log-port publish-port))))))
--8<---------------cut here---------------end--------------->8---

This is the version being used:

--8<---------------cut here---------------start------------->8---
ludo <at> overdrive1 ~$ cat /proc/24019/cmdline |xargs -0
/gnu/store/zpir9n73amaxrwz2k7x46l73v21vxk6s-guile-3.0.8/bin/guile --no-auto-compile -e main -s /gnu/store/rlqdzmfyamjpn6lz07yqk2hsabv3l7g5-cuirass-1.1.0-11.9f08035/bin/.cuirass-real remote-worker --workers=2 --server=10.0.0.1:5555 --systems=armhf-linux,aarch64-linux --publish-port=5558 --substitute-urls=http://10.0.0.1
ludo <at> overdrive1 ~$ guix system describe
Generation 36 Sep 27 2022 09:06:48 (current)
  file name: /var/guix/profiles/system-36-link
  canonical file name: /gnu/store/m04qw6f0lfd0wpn1skiys4b56wqfc3b8-system
  label: GNU with Linux-Libre 5.19.11
  bootloader: grub-efi
  root device: /dev/sda3
  kernel: /gnu/store/09r4wbbabskmbrnwmshpdk7vh6g87gam-linux-libre-5.19.11/Image
  channels:
    guix:
      repository URL: https://git.savannah.gnu.org/git/guix.git
      commit: f15a141cf35bd4188767f0e91c0654991d4c49e0
  configuration file: /gnu/store/myvzd1kpw2pfzfj3krl4lzpcbqsdn48x-configuration.scm
--8<---------------cut here---------------end--------------->8---

The sequence leading to this seems to be:

--8<---------------cut here---------------start------------->8---
22340 eventfd2(0, EFD_CLOEXEC <unfinished ...>
[…]
22340 <... eventfd2 resumed>) = 15
[…]
22340 ppoll([{fd=15, events=POLLIN}], 1, NULL, NULL, 0 <unfinished ...>
[…]
22340 <... ppoll resumed>) = 1 ([{fd=15, revents=POLLIN}])
22343 epoll_pwait(8, <unfinished ...>
22340 read(15, "\1\0\0\0\0\0\0\0", 8) = 8
22340 ppoll([{fd=15, events=POLLIN}], 1, {tv_sec=0, tv_nsec=0}, NULL, 0) = 0 (Timeout)
22340 write(2, "Backtrace:\n", 11) = 11
--8<---------------cut here---------------end--------------->8---

Does that ring a bell?  Perhaps that was fixed in the meantime?

Right now it cannot be restarted: it always fails at start up with the
error above.  10.0.0.1 is reachable though so I’m not sure what’s up.

Ludo’.
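
For illustration only, here is a minimal sketch of the two failure modes described in this report. It is not the Cuirass code and not necessarily how the bug was fixed. The procedure names ‘parse-server-info-parts’ and ‘call-in-child’ are hypothetical, and ‘utf8->string’ stands in for ‘bv->string’. The first procedure shows how a catch-all clause would keep a one-element reply such as (#vu8()) from raising ‘match-error’. The second shows one way to protect ‘primitive-fork’, so that a non-local exit in the child terminates the child instead of letting it run its parent's code.

```scheme
(use-modules (ice-9 match)
             (rnrs bytevectors))

;; Hypothetical stand-in for the outer 'match' in 'read-server-info'.
;; PARTS plays the role of the list returned by
;; 'zmq-get-msg-parts-bytevector'; 'utf8->string' replaces 'bv->string'.
(define (parse-server-info-parts parts)
  (match parts
    ((empty info)                       ;expected: empty frame + payload
     (utf8->string info))
    (_                                  ;anything else, e.g. (#vu8())
     #f)))

;; Hypothetical wrapper around 'primitive-fork'.  In the child, any
;; exception raised by THUNK is caught and turned into a non-zero exit
;; status, so the child never unwinds into code meant for the parent.
(define (call-in-child thunk)
  (match (primitive-fork)
    (0                                  ;child
     (catch #t
       (lambda ()
         (thunk)
         (primitive-exit 0))
       (lambda _
         (primitive-exit 1))))
    (pid pid)))                         ;parent: return the child's PID
```

With these definitions, (parse-server-info-parts (list #vu8())) returns #f rather than throwing, and a caller can decide whether to retry or report the unexpected reply. A child started with (call-in-child (lambda () (throw 'oops))) exits with status 1, which the parent can observe through ‘waitpid’.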