From unknown Wed Jun 25 09:13:06 2025
From: Ludovic Courtès
To: 59493@debbugs.gnu.org
Cc: Mathieu Othacehe
Subject: bug#59493: cuirass-remote-worker crash
Date: Tue, 22 Nov 2022 23:14:05 +0100
Message-ID: <87ilj6hc2a.fsf@inria.fr>

Hi,

In /var/log/cuirass-remote-worker.log on overdrive1.guix, I found this:

--8<---------------cut here---------------start------------->8---
2022-11-21 14:27:24 Backtrace:
2022-11-21 14:27:24 Backtrace:
2022-11-21 14:27:24 In ice-9/boot-9.scm:
2022-11-21 14:27:24 In ice-9/boot-9.scm:
2022-11-21 14:27:24   1752:10 10 (with-exception-handler _ _ #:unwind? _ # _)
2022-11-21 14:27:24 In unknown file:
2022-11-21 14:27:24            9 (apply-smob/0 #)
2022-11-21 14:27:24 In ice-9/boot-9.scm:
2022-11-21 14:27:24     724:2  8 (call-with-prompt _ _ #)
2022-11-21 14:27:24 In ice-9/eval.scm:
2022-11-21 14:27:24   1752:10 10 (with-exception-handler _ _ #:unwind? _ # _)
2022-11-21 14:27:24     619:8  7 (_ #(#(#)))
2022-11-21 14:27:24 In cuirass/ui.scm:
2022-11-21 14:27:24 In unknown file:
2022-11-21 14:27:24            9 (apply-smob/0 #)
2022-11-21 14:27:24    104:10  6 (run-cuirass-command _ . _)
2022-11-21 14:27:24 In ice-9/boot-9.scm:
2022-11-21 14:27:24 In ice-9/boot-9.scm:
2022-11-21 14:27:24     724:2  8 (call-with-prompt _ _ #)
2022-11-21 14:27:24   1752:10  5 (with-exception-handler _ _ #:unwind? _ # _)
2022-11-21 14:27:24 In ice-9/eval.scm:
2022-11-21 14:27:24 In cuirass/scripts/remote-worker.scm:
2022-11-21 14:27:24     619:8  7 (_ #(#(#)))
2022-11-21 14:27:24 In cuirass/ui.scm:
2022-11-21 14:27:24    104:10  6 (run-cuirass-command _ . _)
2022-11-21 14:27:24    435:12  4 (_)
2022-11-21 14:27:24 In srfi/srfi-1.scm:
2022-11-21 14:27:24 In ice-9/boot-9.scm:
2022-11-21 14:27:24   1752:10  5 (with-exception-handler _ _ #:unwind? _ # _)
2022-11-21 14:27:24     634:9  3 (for-each # ?)
2022-11-21 14:27:24 In cuirass/scripts/remote-worker.scm:
2022-11-21 14:27:24 In cuirass/scripts/remote-worker.scm:
2022-11-21 14:27:24    448:18  2 (_ _)
2022-11-21 14:27:24    435:12  4 (_)
2022-11-21 14:27:24 In srfi/srfi-1.scm:
2022-11-21 14:27:24     634:9  3 (for-each # ?)
2022-11-21 14:27:24    356:11  1 (start-worker _ _)
2022-11-21 14:27:24 In cuirass/scripts/remote-worker.scm:
2022-11-21 14:27:24 In ice-9/boot-9.scm:
2022-11-21 14:27:24    448:18  2 (_ _)
2022-11-21 14:27:24   1685:16  0 (raise-exception _ #:continuable? _)
2022-11-21 14:27:24
2022-11-21 14:27:24 ice-9/boot-9.scm:1685:16: In procedure raise-exception:
2022-11-21 14:27:24 Throw to key `match-error' with args `("match" "no matching pattern" (#vu8()))'.
2022-11-21 14:27:24    356:11  1 (start-worker _ _)
2022-11-21 14:27:24 In ice-9/boot-9.scm:
2022-11-21 14:27:24   1685:16  0 (raise-exception _ #:continuable? _)
2022-11-21 14:27:24
2022-11-21 14:27:24 ice-9/boot-9.scm:1685:16: In procedure raise-exception:
2022-11-21 14:27:24 Throw to key `match-error' with args `("match" "no matching pattern" (#vu8()))'.
--8<---------------cut here---------------end--------------->8---

(Stuttering is due to the unprotected use of ‘primitive-fork’: a
non-local exit in the child leads it to execute the same code as its
parent.  We should fix that, but should we really fork in the first
place?  :-))

This comes from here:

--8<---------------cut here---------------start------------->8---
(define (read-server-info socket)
  (request-info socket)
  (match (zmq-get-msg-parts-bytevector socket '())  ;<-- here
    ((empty info)
     (match (zmq-read-message (bv->string info))
       (('server-info
         ('worker-address worker-address)
         ('log-port log-port)
         ('publish-port publish-port))
        (list worker-address log-port publish-port))))))
--8<---------------cut here---------------end--------------->8---

This is the version being used:

--8<---------------cut here---------------start------------->8---
ludo@overdrive1 ~$ cat /proc/24019/cmdline |xargs -0
/gnu/store/zpir9n73amaxrwz2k7x46l73v21vxk6s-guile-3.0.8/bin/guile --no-auto-compile -e main -s /gnu/store/rlqdzmfyamjpn6lz07yqk2hsabv3l7g5-cuirass-1.1.0-11.9f08035/bin/.cuirass-real remote-worker --workers=2 --server=10.0.0.1:5555 --systems=armhf-linux,aarch64-linux --publish-port=5558 --substitute-urls=http://10.0.0.1
ludo@overdrive1 ~$ guix system describe
Generation 36  Sep 27 2022 09:06:48  (current)
  file name: /var/guix/profiles/system-36-link
  canonical file name: /gnu/store/m04qw6f0lfd0wpn1skiys4b56wqfc3b8-system
  label: GNU with Linux-Libre 5.19.11
  bootloader: grub-efi
  root device: /dev/sda3
  kernel: /gnu/store/09r4wbbabskmbrnwmshpdk7vh6g87gam-linux-libre-5.19.11/Image
  channels:
    guix:
      repository URL: https://git.savannah.gnu.org/git/guix.git
      commit: f15a141cf35bd4188767f0e91c0654991d4c49e0
  configuration file: /gnu/store/myvzd1kpw2pfzfj3krl4lzpcbqsdn48x-configuration.scm
--8<---------------cut here---------------end--------------->8---

The sequence leading to this seems to be:

--8<---------------cut here---------------start------------->8---
22340 eventfd2(0, EFD_CLOEXEC
[…]
22340 <... eventfd2 resumed>) = 15
[…]
22340 ppoll([{fd=15, events=POLLIN}], 1, NULL, NULL, 0
[…]
22340 <... ppoll resumed>) = 1 ([{fd=15, revents=POLLIN}])
22343 epoll_pwait(8,
22340 read(15, "\1\0\0\0\0\0\0\0", 8) = 8
22340 ppoll([{fd=15, events=POLLIN}], 1, {tv_sec=0, tv_nsec=0}, NULL, 0) = 0 (Timeout)
22340 write(2, "Backtrace:\n", 11) = 11
--8<---------------cut here---------------end--------------->8---

Does that ring a bell?  Perhaps that was fixed in the meantime?

Right now it cannot be restarted: it always fails at start up with the
error above.  10.0.0.1 is reachable though so I’m not sure what’s up.

Ludo’.
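
To illustrate why the ‘match’ in ‘read-server-info’ quoted above raises
‘match-error’, here is a minimal sketch.  It assumes nothing beyond
(ice-9 match): ‘server-info-payload’ is a hypothetical helper, not part
of Cuirass, and PARTS stands in for whatever
‘zmq-get-msg-parts-bytevector’ returned.  The pattern ‘(empty info)’
only matches a two-element list, so the one-element list ‘(#vu8())’ from
the log falls through; adding a catch-all clause is one possible way to
turn that into a value the caller can test.

```scheme
(use-modules (ice-9 match)
             (rnrs bytevectors))

;; Hypothetical helper, not part of Cuirass.  PARTS stands in for the
;; list of frames returned by 'zmq-get-msg-parts-bytevector'.
(define (server-info-payload parts)
  (match parts
    ((empty info)
     ;; Two frames: a delimiter followed by the payload, which is the
     ;; shape 'read-server-info' expects.
     (utf8->string info))
    (_
     ;; Any other shape, including the one-element list (#vu8()) seen in
     ;; the log, yields #f instead of raising 'match-error'.
     #f)))

;; Expected shape: returns the payload as a string.
(server-info-payload (list #vu8() (string->utf8 "(server-info)")))

;; Shape seen in the log: returns #f.
(server-info-payload (list #vu8()))
```

Whether a worker should silently ignore such a message or report it is
a design decision this sketch does not settle.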
From unknown Wed Jun 25 09:13:06 2025
From: Mathieu Othacehe
To: Ludovic Courtès
Cc: 59493@debbugs.gnu.org
Subject: bug#59493: cuirass-remote-worker crash
Date: Wed, 23 Nov 2022 09:08:32 +0100
Message-ID: <87h6yqw0sf.fsf@gnu.org>
In-Reply-To: <87ilj6hc2a.fsf@inria.fr>
References: <87ilj6hc2a.fsf@inria.fr>

Hello Ludo,

Thanks for gathering this information.

> 2022-11-21 14:27:24   1685:16  0 (raise-exception _ #:continuable? _)
> 2022-11-21 14:27:24
> 2022-11-21 14:27:24 ice-9/boot-9.scm:1685:16: In procedure raise-exception:
> 2022-11-21 14:27:24 Throw to key `match-error' with args `("match" "no matching pattern" (#vu8()))'.

Yes, this is because a new remote-server is running on Berlin and it
sends an empty sequence at every connection:

https://git.savannah.gnu.org/cgit/guix/guix-cuirass.git/commit/?id=fc1641381d2a8a0472a71ef5ad2b64361faaaab4

All remote-workers must update, and I have deployed Cuirass
1.1.0-13.1341725 on all hydra workers + guix9p.

I have been trying to deploy that to overdrive1 for two days but Berlin
offloads the builds to kreuzberg which has some issues because a lot of
builds are timing out:

--8<---------------cut here---------------start------------->8---
\building of `/gnu/store/9jg75a8rvdz3qxcbbm95312rlc4hyi98-mrustc-0.10-2.597593a-checkout.drv' timed out after 3600 seconds of silence
build of /gnu/store/9jg75a8rvdz3qxcbbm95312rlc4hyi98-mrustc-0.10-2.597593a-checkout.drv failed
View build log at '/var/log/guix/drvs/9j/g75a8rvdz3qxcbbm95312rlc4hyi98-mrustc-0.10-2.597593a-checkout.drv.gz'.
cannot build derivation `/gnu/store/wavx7rl6h93fpmc46nggnhkyxm75lqa4-mrustc-0.10-2.597593a-checkout.drv': 1 dependencies couldn't be built
--8<---------------cut here---------------end--------------->8---

> (Stuttering is due to the unprotected use of ‘primitive-fork’: a
> non-local exit in the child leads it to execute the same code as its
> parent.  We should fix that, but should we really fork in the first
> place?  :-))

Right, this is problematic.  I can't remember why I chose to fork.

In the meantime, this should be fixed by updating to 1.1.0-13.1341725 so
we can close this one I guess.

Mathieu

From unknown Wed Jun 25 09:13:06 2025
From: Ludovic Courtès
To: Mathieu Othacehe
Cc: 59493@debbugs.gnu.org
Subject: bug#59493: cuirass-remote-worker crash
Date: Wed, 23 Nov 2022 16:47:32 +0100
Message-ID: <87tu2pfzaj.fsf@gnu.org>
In-Reply-To: <87h6yqw0sf.fsf@gnu.org>
References: <87ilj6hc2a.fsf@inria.fr> <87h6yqw0sf.fsf@gnu.org>

Hi,

Mathieu Othacehe wrote:

>> 2022-11-21 14:27:24   1685:16  0 (raise-exception _ #:continuable? _)
>> 2022-11-21 14:27:24
>> 2022-11-21 14:27:24 ice-9/boot-9.scm:1685:16: In procedure raise-exception:
>> 2022-11-21 14:27:24 Throw to key `match-error' with args `("match" "no matching pattern" (#vu8()))'.
>
> Yes, this is because a new remote-server is running on Berlin and it
> sends an empty sequence at every connection:
> https://git.savannah.gnu.org/cgit/guix/guix-cuirass.git/commit/?id=fc1641381d2a8a0472a71ef5ad2b64361faaaab4

Oh I see.  It would be nice to avoid non-backward-compatible changes in
the protocol so we can upgrade more smoothly.

> All remote-workers must update, and I have deployed Cuirass
> 1.1.0-13.1341725 on all hydra workers + guix9p.
>
> I have been trying to deploy that to overdrive1 for two days but Berlin
> offloads the builds to kreuzberg which has some issues because a lot of
> builds are timing out:

Done now!

--8<---------------cut here---------------start------------->8---
ludo@overdrive1 ~$ guix system describe
Generation 37  Nov 23 2022 15:58:08  (current)
  file name: /var/guix/profiles/system-37-link
  canonical file name: /gnu/store/62dr875n7i30l375j87flbqfym78kddg-system
  label: GNU with Linux-Libre 6.0.9
  bootloader: grub-efi
  root device: /dev/sda3
  kernel: /gnu/store/p4impcxw8lba8600acrxs21lgzc06xzq-linux-libre-6.0.9/Image
  channels:
    guix:
      repository URL: https://git.savannah.gnu.org/git/guix.git
      commit: 78f03567f44f704dfbc03cb64368aa42a01e78ad
  configuration file: /gnu/store/myvzd1kpw2pfzfj3krl4lzpcbqsdn48x-configuration.scm
--8<---------------cut here---------------end--------------->8---

Running the Shepherd 0.9.3 and all, wonderful.

>> (Stuttering is due to the unprotected use of ‘primitive-fork’: a
>> non-local exit in the child leads it to execute the same code as its
>> parent.  We should fix that, but should we really fork in the first
>> place?  :-))

Fixed in Cuirass commit 9fb6f21d29c5398b35f4c1a77cf6c20f207c9ebb.

> Right, this is problematic.  I can't remember why I chose to fork.

One concern is that, in the Avahi case, we create at least one thread
before forking, and as we know that doesn’t work (as in: it might work
sometimes).  ZMQ may also create threads behind our back.

The parent doesn’t call ‘waitpid’ on its children, which isn’t great.

To me, ideally this would be either multi-threaded or Fiberized.  The
latter would be more fruitful but what might be difficult is
guile-simple-zmq integration with Fibers (but maybe not: zmq_getsockopt
+ ZMQ_FD lets us get the file descriptor of a socket).

Something to consider…

Thanks,
Ludo’.
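
For readers unfamiliar with the ‘primitive-fork’ pitfall discussed
above, here is a minimal sketch of one common way to protect the child
and reap it from the parent.  It is an illustration only and not
necessarily what the Cuirass commit mentioned above does;
‘call-in-child’ is a hypothetical name.  It relies on ‘primitive-exit’
terminating the process immediately, so the exit guard of
‘dynamic-wind’ only runs when the thunk leaves through a non-local exit.

```scheme
(use-modules (ice-9 match))

;; Hypothetical helper, not the code from the Cuirass commit.
(define (call-in-child thunk)
  "Run THUNK in a forked child; return the child's PID to the parent."
  (match (primitive-fork)
    (0
     ;; Child: whatever happens in THUNK, never return into the code
     ;; that the parent is going to execute.
     (dynamic-wind
       (const #t)
       (lambda ()
         (thunk)
         (primitive-exit 0))
       (lambda ()
         ;; Reached only on a non-local exit, such as an uncaught
         ;; exception thrown by THUNK.
         (primitive-exit 1))))
    (pid pid)))

;; The child throws, as in the log above; it may still print a
;; backtrace, but it exits instead of continuing as a second "parent".
(define child
  (call-in-child
   (lambda ()
     (throw 'match-error "match" "no matching pattern"))))

;; Parent: reap the child so it does not linger as a zombie, then
;; report its exit value.
(match (waitpid child)
  ((_ . status)
   (format #t "child ~a exited with ~a~%" child (status:exit-val status))))
```

This does not address the separate concern about threads created
before the fork.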
From debbugs-submit-bounces@debbugs.gnu.org Wed Nov 23 10:47:51 2022
From: Ludovic Courtès
To: control@debbugs.gnu.org
Subject: control message for bug #59493
Date: Wed, 23 Nov 2022 16:47:41 +0100
Message-Id: <87sfi9fzaa.fsf@gnu.org>

close 59493
quit

From unknown Wed Jun 25 09:13:06 2025
From: Mathieu Othacehe
To: Ludovic Courtès
Cc: 59493-done@debbugs.gnu.org
Subject: bug#59493: cuirass-remote-worker crash
Date: Wed, 23 Nov 2022 17:03:32 +0100
Message-ID: <87k03lwtd7.fsf@gnu.org>
In-Reply-To: <87tu2pfzaj.fsf@gnu.org>
References: <87ilj6hc2a.fsf@inria.fr> <87h6yqw0sf.fsf@gnu.org> <87tu2pfzaj.fsf@gnu.org>

Hey,

> Oh I see.  It would be nice to avoid non-backward-compatible changes in
> the protocol so we can upgrade more smoothly.

Right, sorry.  We should introduce a protocol version to avoid that in
the future.

> Fixed in Cuirass commit 9fb6f21d29c5398b35f4c1a77cf6c20f207c9ebb.

Awesome, thanks :)

> To me, ideally this would be either multi-threaded or Fiberized.  The
> latter would be more fruitful but what might be difficult is
> guile-simple-zmq integration with Fibers (but maybe not: zmq_getsockopt
> + ZMQ_FD lets us get the file descriptor of a socket).

I would prefer the multi-threaded approach if possible.  While the
concept of Fibers is nice, it adds another layer of complexity and
instability to those programs which are already hard to debug.

Mathieu

From unknown Wed Jun 25 09:13:06 2025
From: Ludovic Courtès
To: Mathieu Othacehe
Cc: 59493-done@debbugs.gnu.org
Subject: bug#59493: cuirass-remote-worker crash
Date: Sat, 26 Nov 2022 16:04:20 +0100
Message-ID: <87edtp92q3.fsf@gnu.org>
In-Reply-To: <87k03lwtd7.fsf@gnu.org>
References: <87ilj6hc2a.fsf@inria.fr> <87h6yqw0sf.fsf@gnu.org> <87tu2pfzaj.fsf@gnu.org> <87k03lwtd7.fsf@gnu.org>

Hi,

Mathieu Othacehe wrote:

>> To me, ideally this would be either multi-threaded or Fiberized.  The
>> latter would be more fruitful but what might be difficult is
>> guile-simple-zmq integration with Fibers (but maybe not: zmq_getsockopt
>> + ZMQ_FD lets us get the file descriptor of a socket).
>
> I would prefer the multi-threaded approach if possible.  While the
> concept of Fibers is nice, it adds another layer of complexity and
> instability to those programs which are already hard to debug.

I guess it’s not black and white.  Shared-state multithreading is an
endless source of bugs, regardless of the language being used;
message-passing (what Fibers is about) is more tractable.  Sure, Fibers
can have bugs of its own (I’m well aware of that :-)) but Fiber-using
code can be simpler and less error-ridden than the equivalent
shared-state code.

Anyway, we’re not there yet.  Can you remember the rationale for forking
in remote-worker.scm, or do you think we might as well do it all in a
single process?

Thanks,
Ludo’.
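
On the protocol version suggested earlier in this thread, here is a
rough sketch of what a version check on the worker side might look
like.  Everything in it is an assumption made for illustration: the
thread does not say how a version would be encoded, so the ‘version’
field, the ‘%protocol-version’ constant, and ‘check-server-info’ are all
hypothetical and not part of Cuirass.

```scheme
(use-modules (ice-9 match))

;; Hypothetical constant: the protocol version this worker understands.
(define %protocol-version 1)

;; Hypothetical helper and message layout, not part of Cuirass.
(define (check-server-info message)
  "Return the remaining fields of MESSAGE when it announces a supported
protocol version, and #f otherwise."
  (match message
    (('server-info ('version (? integer? version)) fields ...)
     (and (= version %protocol-version)
          fields))
    (_
     ;; No version field at all: a peer that predates versioning, or a
     ;; message of an unexpected shape.
     #f)))

;; Supported version: returns ((log-port 5556)).
(check-server-info '(server-info (version 1) (log-port 5556)))

;; Newer server: returns #f, so the worker can log a clear error
;; instead of failing with 'match-error'.
(check-server-info '(server-info (version 2) (log-port 5556)))

;; Unversioned message: returns #f.
(check-server-info '(server-info (log-port 5556)))
```

With a check along these lines, an outdated worker could report a
version mismatch explicitly rather than crash at start-up.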