From unknown Sat Jun 21 10:43:15 2025
From: bug#67988 <67988@debbugs.gnu.org>
To: bug#67988 <67988@debbugs.gnu.org>
Subject: Status: [Cuirass] ‘request-work’ responses received by several workers
Date: Sat, 21 Jun 2025 17:43:15 +0000

retitle 67988 [Cuirass] ‘request-work’ responses received by several workers
reassign 67988 guix
submitter 67988 Ludovic Courtès
severity 67988 normal
thanks

From debbugs-submit-bounces@debbugs.gnu.org Sat Dec 23 04:13:26 2023
From: Ludovic Courtès
To: bug-guix@gnu.org
Subject: [Cuirass] ‘request-work’ responses received by several workers
Date: Sat, 23 Dec 2023 10:13:01 +0100
Message-ID: <87wmt5704i.fsf@inria.fr>

Hello,

I’m under the impression that sometimes, when the server replies to
‘worker-request-work’ messages, its reply is received by more than just
the target worker, leading to builds being performed twice:

--8<---------------cut here---------------start------------->8---
ludo@berlin ~$ sudo grep lyhz5d1jb396m32dy0fs9h8vqzw95ddp /var/log/cuirass-remote-server.log
2023-12-23 00:15:29 141.80.167.184 (0LFowqzr): build started: '/gnu/store/lyhz5d1jb396m32dy0fs9h8vqzw95ddp-cdrdao-1.2.5.drv'.
2023-12-23 00:18:41 fetching 1 outputs of '/gnu/store/lyhz5d1jb396m32dy0fs9h8vqzw95ddp-cdrdao-1.2.5.drv' from http://141.80.167.184:5558
2023-12-23 00:18:45 build succeeded: '/gnu/store/lyhz5d1jb396m32dy0fs9h8vqzw95ddp-cdrdao-1.2.5.drv'
2023-12-23 00:21:20 141.80.167.159 (oNzYXCv5): build started: '/gnu/store/lyhz5d1jb396m32dy0fs9h8vqzw95ddp-cdrdao-1.2.5.drv'.
2023-12-23 00:24:31 fetching 1 outputs of '/gnu/store/lyhz5d1jb396m32dy0fs9h8vqzw95ddp-cdrdao-1.2.5.drv' from http://141.80.167.159:5558
2023-12-23 00:24:32 build succeeded: '/gnu/store/lyhz5d1jb396m32dy0fs9h8vqzw95ddp-cdrdao-1.2.5.drv'
ludo@berlin ~$ sudo ssh root@141.80.167.184 grep lyhz5d1jb396m32dy0fs9h8vqzw95ddp /var/log/cuirass-remote-worker.log
2023-12-23 00:12:32 0LFowqzr: building derivation `/gnu/store/lyhz5d1jb396m32dy0fs9h8vqzw95ddp-cdrdao-1.2.5.drv' (system: x86_64-linux)
2023-12-23 00:12:54 0LFowqzr: derivation /gnu/store/lyhz5d1jb396m32dy0fs9h8vqzw95ddp-cdrdao-1.2.5.drv build succeeded.
ludo@berlin ~$ sudo ssh root@141.80.167.159 grep lyhz5d1jb396m32dy0fs9h8vqzw95ddp /var/log/cuirass-remote-worker.log
2023-12-23 00:17:51 oNzYXCv5: building derivation `/gnu/store/lyhz5d1jb396m32dy0fs9h8vqzw95ddp-cdrdao-1.2.5.drv' (system: x86_64-linux)
2023-12-23 00:18:17 oNzYXCv5: derivation /gnu/store/lyhz5d1jb396m32dy0fs9h8vqzw95ddp-cdrdao-1.2.5.drv build succeeded.
--8<---------------cut here---------------end--------------->8---

This is with Cuirass 1.2.0-1.bdc1f9f.

To be continued…

Ludo’.

From debbugs-submit-bounces@debbugs.gnu.org Tue May 28 17:51:01 2024
From: Ludovic Courtès
To: 67988@debbugs.gnu.org
Subject: Re: bug#67988: [Cuirass] ‘request-work’ responses received by several workers
In-Reply-To: <87wmt5704i.fsf@inria.fr>
References: <87wmt5704i.fsf@inria.fr>
Date: Tue, 28 May 2024 23:50:39 +0200
Message-ID: <8734q1wqq8.fsf@gnu.org>

Ludovic Courtès skribis:

> I’m under the impression that sometimes, when the server replies to
> ‘worker-request-work’ messages, its reply is received by more than just
> the target worker, leading to builds being performed twice:

Seen again:

--8<---------------cut here---------------start------------->8---
ludo@guix-hpc4 ~/src/cuirass$ sudo grep nmhvrka9i4qng54w3d478j1lsp9dn7r7 /var/log/cuirass-remote-server.log
2024-05-28 21:31:43 194.199.1.26 (PajrOfGX): build started: '/gnu/store/nmhvrka9i4qng54w3d478j1lsp9dn7r7-firefox-126.0.1.drv'.
2024-05-28 21:34:22 194.199.1.27 (exataaY9): build started: '/gnu/store/nmhvrka9i4qng54w3d478j1lsp9dn7r7-firefox-126.0.1.drv'.
2024-05-28 21:38:32 194.199.1.17 (DIwFaVSn): build started: '/gnu/store/nmhvrka9i4qng54w3d478j1lsp9dn7r7-firefox-126.0.1.drv'.
2024-05-28 22:16:13 fetching 1 outputs of '/gnu/store/nmhvrka9i4qng54w3d478j1lsp9dn7r7-firefox-126.0.1.drv' from http://194.199.1.26:5558
2024-05-28 22:16:18 build succeeded: '/gnu/store/nmhvrka9i4qng54w3d478j1lsp9dn7r7-firefox-126.0.1.drv'
2024-05-28 22:53:49 fetching 1 outputs of '/gnu/store/nmhvrka9i4qng54w3d478j1lsp9dn7r7-firefox-126.0.1.drv' from http://194.199.1.27:5558
2024-05-28 22:53:49 build succeeded: '/gnu/store/nmhvrka9i4qng54w3d478j1lsp9dn7r7-firefox-126.0.1.drv'
2024-05-28 23:03:50 fetching 1 outputs of '/gnu/store/nmhvrka9i4qng54w3d478j1lsp9dn7r7-firefox-126.0.1.drv' from http://194.199.1.17:5558
2024-05-28 23:03:50 build succeeded: '/gnu/store/nmhvrka9i4qng54w3d478j1lsp9dn7r7-firefox-126.0.1.drv'
--8<---------------cut here---------------end--------------->8---

And on workers:

--8<---------------cut here---------------start------------->8---
$ ssh root@guix-hpc3 grep nmhvrka9i4qng54w3d478j1lsp9dn7r7 /var/log/cuirass-remote-worker.log
2024-05-28 21:57:43 DIwFaVSn: building derivation `/gnu/store/nmhvrka9i4qng54w3d478j1lsp9dn7r7-firefox-126.0.1.drv' (system: x86_64-linux)
2024-05-28 23:22:58 DIwFaVSn: derivation /gnu/store/nmhvrka9i4qng54w3d478j1lsp9dn7r7-firefox-126.0.1.drv build succeeded.
$ ssh root@guix-hpc5 grep nmhvrka9i4qng54w3d478j1lsp9dn7r7 /var/log/cuirass-remote-worker.log
2024-05-28 21:34:13 PajrOfGX: building derivation `/gnu/store/nmhvrka9i4qng54w3d478j1lsp9dn7r7-firefox-126.0.1.drv' (system: x86_64-linux)
2024-05-28 22:18:40 PajrOfGX: derivation /gnu/store/nmhvrka9i4qng54w3d478j1lsp9dn7r7-firefox-126.0.1.drv build succeeded.
$ ssh root@guix-hpc7 grep nmhvrka9i4qng54w3d478j1lsp9dn7r7 /var/log/cuirass-remote-worker.log
2024-05-28 21:34:11 exataaY9: building derivation `/gnu/store/nmhvrka9i4qng54w3d478j1lsp9dn7r7-firefox-126.0.1.drv' (system: x86_64-linux)
2024-05-28 22:53:35 exataaY9: derivation /gnu/store/nmhvrka9i4qng54w3d478j1lsp9dn7r7-firefox-126.0.1.drv build succeeded.
--8<---------------cut here---------------end--------------->8---

Ludo’.

From debbugs-submit-bounces@debbugs.gnu.org Fri May 31 15:55:41 2024
From: Ludovic Courtès
To: 67988@debbugs.gnu.org
Subject: Re: bug#67988: [Cuirass] ‘request-work’ responses received by several workers
In-Reply-To: <87wmt5704i.fsf@inria.fr>
References: <87wmt5704i.fsf@inria.fr>
Date: Fri, 31 May 2024 21:55:16 +0200
Message-ID: <87ttidrc2j.fsf@gnu.org>

Ludovic Courtès skribis:

> I’m under the impression that sometimes, when the server replies to
> ‘worker-request-work’ messages, its reply is received by more than just
> the target worker, leading to builds being performed twice:

On closer inspection, the theory of the message being received by two
different peers doesn’t hold.

Instead, I believe ‘db-get-pending-build’ would return the same build at
two different points in time, typically while the first one is still
running. That’s normally not possible because the build’s status is
changed to ‘submitted’ once it’s been picked up. Turns out that, due to
slowness of the query in ‘db-get-pending-build’ (fixed in
17338588d4862b04e9e405c1244a2ea703b50d98), ‘remote-server’ would
sometimes fail to see worker pings in a timely fashion. Thus, it would
call ‘db-remove-unresponsive-workers’, which would reschedule builds
that were being carried out by said worker(s). And that’s how we would
end up with multiple concurrent builds of the same derivation.

I added logging in c2061ca845d05694ebeb88935a6ff2254711beb2, which
should give a hint, should that happen again.

Ludo’.
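
The failure mode described in the message above can be illustrated with a
toy model. This is only a sketch and not Cuirass code: the `Server` class,
the `scheduled` status name, the 60-second ping timeout, and the worker
and derivation names are all assumptions made for illustration. Only the
roles of ‘db-get-pending-build’ and ‘db-remove-unresponsive-workers’ and
the ‘submitted’ status are taken from the report.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

SCHEDULED = "scheduled"   # assumed name for the "pending" status
SUBMITTED = "submitted"   # status named in the report


@dataclass
class Build:
    derivation: str
    status: str = SCHEDULED
    worker: Optional[str] = None


@dataclass
class Server:
    """Toy stand-in for the remote server (hypothetical, not Cuirass code)."""
    ping_timeout: float  # hypothetical value, in seconds
    builds: List[Build] = field(default_factory=list)
    last_seen: Dict[str, float] = field(default_factory=dict)
    assignments: List[Tuple[float, str, str]] = field(default_factory=list)

    def record_ping(self, worker: str, now: float) -> None:
        self.last_seen[worker] = now

    def get_pending_build(self, worker: str, now: float) -> Optional[Build]:
        # Plays the role of 'db-get-pending-build': hand out one pending
        # build and mark it as submitted.
        for build in self.builds:
            if build.status == SCHEDULED:
                build.status, build.worker = SUBMITTED, worker
                self.assignments.append((now, worker, build.derivation))
                return build
        return None

    def remove_unresponsive_workers(self, now: float) -> None:
        # Plays the role of 'db-remove-unresponsive-workers': forget workers
        # whose last recorded ping is too old and reschedule their builds.
        for worker, seen in list(self.last_seen.items()):
            if now - seen > self.ping_timeout:
                del self.last_seen[worker]
                for build in self.builds:
                    if build.worker == worker and build.status == SUBMITTED:
                        build.status, build.worker = SCHEDULED, None


def simulate(query_stall: float) -> List[Tuple[float, str, str]]:
    """Return (time, worker, derivation) assignments after a server stall."""
    server = Server(ping_timeout=60.0, builds=[Build("example.drv")])
    server.record_ping("worker-a", now=0.0)
    server.get_pending_build("worker-a", now=0.0)
    # worker-a is still building and pinging, but the server is stuck in a
    # slow query for 'query_stall' seconds and records none of those pings.
    now = query_stall
    server.remove_unresponsive_workers(now)
    server.record_ping("worker-b", now)
    server.get_pending_build("worker-b", now)
    return server.assignments
```

With a stall shorter than the timeout, `simulate(10.0)` hands the build to
a single worker. With a stall longer than the timeout, `simulate(120.0)`
hands the same derivation to both workers, which matches the duplicate
‘build started’ lines in the server logs above.
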
From debbugs-submit-bounces@debbugs.gnu.org Tue Jun 04 09:57:03 2024
From: Ludovic Courtès
To: control@debbugs.gnu.org
Subject: control message for bug #67988
Date: Tue, 04 Jun 2024 15:56:40 +0200
Message-Id: <87jzj47qw7.fsf@gnu.org>

close 67988
quit

From unknown Sat Jun 21 10:43:15 2025
From: Debbugs Internal Request
To: internal_control@debbugs.gnu.org
Subject: Internal Control
Message-Id: bug archived.
Date: Wed, 03 Jul 2024 11:24:05 +0000

# This is a fake control message.
#
# The action:
# bug archived.
thanks
# This fakemail brought to you by your local debbugs
# administrator