From unknown Fri Aug 15 20:56:51 2025 X-Loop: help-debbugs@gnu.org Subject: bug#64297: [Cuirass] Remote server not picking up job, losing workers Resent-From: Ludovic =?UTF-8?Q?Court=C3=A8s?= Original-Sender: "Debbugs-submit" Resent-CC: bug-guix@gnu.org Resent-Date: Mon, 26 Jun 2023 08:55:02 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: report 64297 X-GNU-PR-Package: guix X-GNU-PR-Keywords: To: 64297@debbugs.gnu.org X-Debbugs-Original-To: bug-guix@gnu.org Received: via spool by submit@debbugs.gnu.org id=B.16877696815630 (code B ref -1); Mon, 26 Jun 2023 08:55:02 +0000 Received: (at submit) by debbugs.gnu.org; 26 Jun 2023 08:54:41 +0000 Received: from localhost ([127.0.0.1]:44437 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1qDhzd-0001Sk-DF for submit@debbugs.gnu.org; Mon, 26 Jun 2023 04:54:41 -0400 Received: from lists.gnu.org ([209.51.188.17]:42986) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1qDhza-0001Sa-78 for submit@debbugs.gnu.org; Mon, 26 Jun 2023 04:54:40 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qDhzZ-0005NN-Sp for bug-guix@gnu.org; Mon, 26 Jun 2023 04:54:37 -0400 Received: from mail2-relais-roc.national.inria.fr ([192.134.164.83]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qDhzX-00020J-Ix for bug-guix@gnu.org; Mon, 26 Jun 2023 04:54:37 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=inria.fr; s=dc; h=from:to:subject:date:message-id:mime-version: content-transfer-encoding; bh=4dOHBOu0R1ZSg1Z8Y5fXPZErjxj43EHEyIn5qmkO6hw=; b=e83Eq5gw+PHJgBInftp+fAbKz/KZ8xUwGQtduYZ5tBV/0PRQHMEZD3Kd 62HK/pe3c8m02K9RBqg/fIo2CE5C7LH8yZVDgndTvfhqcxrBFbMZOsnjj k2opWS4ZXrDnyro98+BhOn8Qu4sZjQxOg0fCGjuTubxSTsCkVRbtEFceV 0=; Authentication-Results: mail2-relais-roc.national.inria.fr; dkim=none 
(message not signed) header.i=none; spf=SoftFail smtp.mailfrom=ludovic.courtes@inria.fr; dmarc=fail (p=none dis=none) d=inria.fr X-IronPort-AV: E=Sophos;i="6.01,159,1684792800"; d="scan'208";a="114625635" Received: from unknown (HELO ribbon) ([193.50.110.146]) by mail2-relais-roc.national.inria.fr with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 26 Jun 2023 10:54:12 +0200 From: Ludovic =?UTF-8?Q?Court=C3=A8s?= X-URL: http://www.fdn.fr/~lcourtes/ X-Revolutionary-Date: Octidi 8 Messidor an 231 de la =?UTF-8?Q?R=C3=A9volution,?= jour de =?UTF-8?Q?l'=C3=89chalotte?= X-PGP-Key-ID: 0x090B11993D9AEBB5 X-PGP-Key: http://www.fdn.fr/~lcourtes/ludovic.asc X-PGP-Fingerprint: 3CE4 6455 8A84 FDC6 9DB4 0CFB 090B 1199 3D9A EBB5 X-OS: x86_64-pc-linux-gnu Date: Mon, 26 Jun 2023 10:54:12 +0200 Message-ID: <87fs6ey56z.fsf@inria.fr> User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/28.2 (gnu/linux) MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable Received-SPF: pass client-ip=192.134.164.83; envelope-from=ludovic.courtes@inria.fr; helo=mail2-relais-roc.national.inria.fr X-Spam_score_int: -43 X-Spam_score: -4.4 X-Spam_bar: ---- X-Spam_report: (-4.4 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_MED=-2.3, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=ham autolearn_force=no X-Spam_action: no action X-Spam-Score: -1.3 (-) X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -2.3 (--) As of cuirass@1.1.0-16.b825967, =E2=80=98cuirass remote-server=E2=80=99 app= ears to not pick jobs as quickly as it should and to lose sight of workers (you can see them come and go on ). 
/var/log/cuirass-remote-worker.log shows that it does build things, but only sporadically. Then there are things like: 2023-06-26 10:07:58 warning: Poll loop busy during 3404 seconds. This is presumably related to Cuirass commit c4743b54720e86b0e0b0295fb6d33977e4293644 (previously =E2=80=98remote-worker= =E2=80=99 would have a database worker thread; now it opens a new connection every time=E2=80=94a stopgap before it=E2=80=99s fiberized, but apparently not a = good one). Ludo=E2=80=99. From unknown Fri Aug 15 20:56:51 2025 X-Loop: help-debbugs@gnu.org Subject: bug#64297: [Cuirass] Remote server not picking up job, losing workers Resent-From: Ludovic =?UTF-8?Q?Court=C3=A8s?= Original-Sender: "Debbugs-submit" Resent-CC: bug-guix@gnu.org Resent-Date: Fri, 30 Jun 2023 15:46:02 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 64297 X-GNU-PR-Package: guix X-GNU-PR-Keywords: To: 64297@debbugs.gnu.org Received: via spool by 64297-submit@debbugs.gnu.org id=B64297.16881399594577 (code B ref 64297); Fri, 30 Jun 2023 15:46:02 +0000 Received: (at 64297) by debbugs.gnu.org; 30 Jun 2023 15:45:59 +0000 Received: from localhost ([127.0.0.1]:55642 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1qFGJq-0001Bi-Iv for submit@debbugs.gnu.org; Fri, 30 Jun 2023 11:45:59 -0400 Received: from eggs.gnu.org ([209.51.188.92]:49590) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1qFGJo-0001BQ-AC for 64297@debbugs.gnu.org; Fri, 30 Jun 2023 11:45:57 -0400 Received: from fencepost.gnu.org ([2001:470:142:3::e]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qFGJi-0000PA-VS for 64297@debbugs.gnu.org; Fri, 30 Jun 2023 11:45:50 -0400 DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=gnu.org; s=fencepost-gnu-org; h=MIME-Version:In-Reply-To:Date:References:Subject:To: From; bh=8rIKMXatkXoNMmANmTQ4/BnFhzdOWZ2SbYzTlmzDKpY=; 
b=T153Gm0XXJsZWnDXNXTx 0PAgNrr/nSaFNzc5qQm1drxazcg/41x+Q4pYBlSVg2sliOh+vbxZrbeMCARpmKupZYA426fzJuyFi jD6sLtFUBtRioYuDiAzu9IOs/Zd/HlDgPy41Z/t3TqwaX5/BVRCGTOYrMBdaqk8sOV/duL60RtzwK as1uVhognNtmtsTLhk2qhReU3yuS5M/Thok+32kmYgLemjdrDLBiO03aI6zcrQcr/0pZHKBq7JY/7 R1TVLZSrXGnZV1PV1y45QLok+nyESmH+xnZsWL3Sz+hoTy6dX+wp3xVjm828/YGg96/N4iNltY3Po 8+f6VAfC5Fdweg==; Received: from [193.50.110.213] (helo=ribbon) by fencepost.gnu.org with esmtpsa (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qFGJi-0002JO-Iu for 64297@debbugs.gnu.org; Fri, 30 Jun 2023 11:45:50 -0400 From: Ludovic =?UTF-8?Q?Court=C3=A8s?= References: <87fs6ey56z.fsf@inria.fr> Date: Fri, 30 Jun 2023 17:45:48 +0200 In-Reply-To: <87fs6ey56z.fsf@inria.fr> ("Ludovic =?UTF-8?Q?Court=C3=A8s?="'s message of "Mon, 26 Jun 2023 10:54:12 +0200") Message-ID: <87wmzlymvn.fsf@gnu.org> User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/28.2 (gnu/linux) MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable X-Spam-Score: -2.3 (--) X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -3.3 (---) Ludovic Court=C3=A8s skribis: > As of cuirass@1.1.0-16.b825967, =E2=80=98cuirass remote-server=E2=80=99 a= ppears to not > pick jobs as quickly as it should and to lose sight of workers (you can > see them come and go on ). > > /var/log/cuirass-remote-worker.log shows that it does build things, but > only sporadically. Then there are things like: > > 2023-06-26 10:07:58 warning: Poll loop busy during 3404 seconds. 
>
> This is presumably related to Cuirass commit
> c4743b54720e86b0e0b0295fb6d33977e4293644 (previously ‘remote-worker’
> would have a database worker thread; now it opens a new connection every
> time—a stopgap before it’s fiberized, but apparently not a good one).

Apparently this has to do with squee calling ‘current-read-waiter’
(i.e., poll(2)) while waiting for its response and checking the wrong
FD for some reason, as in this case:

--8<---------------cut here---------------start------------->8---
18484 15:01:00 connect(55, {sa_family=AF_UNIX, sun_path="/tmp/ephemeralpg.58xnKh/.s.PGSQL.5432"}, 110) = 0 <0.000019>
18484 15:01:00 getsockopt(55, SOL_SOCKET, SO_ERROR, [0], [4]) = 0 <0.000014>
18484 15:01:00 getsockname(55, {sa_family=AF_UNIX}, [128 => 2]) = 0 <0.000010>
18484 15:01:00 poll([{fd=55, events=POLLOUT|POLLERR}], 1, -1) = 1 ([{fd=55, revents=POLLOUT}]) <0.000011>
18484 15:01:00 sendto(55, "\0\0\0!\0\3\0\0user\0ludo\0database\0test\0\0", 33, MSG_NOSIGNAL, NULL, 0) = 33 <0.000014>
18484 15:01:00 poll([{fd=55, events=POLLIN|POLLERR}], 1, -1
18484 15:01:00 <... poll resumed>) = 1 ([{fd=55, revents=POLLIN}]) <0.001786>
18484 15:01:00 recvfrom(55, "R\0\0\0\10\0\0\0\0S\0\0\0\26application_name\0\0S\0\0\0\31client_encoding\0UTF8\0S\0\0\0\27DateStyle\0ISO, MD"..., 16384, 0, NULL, NULL) = 376 <0.000016>
18484 15:01:00 sendto(55, "P\0\0\0\366\0\nUPDATE Builds SET status = -2, worker = null FROM\n(SELECT id FROM Workers"..., 291, MSG_NOSIGNAL, NULL, 0
18484 15:01:00 <... sendto resumed>) = 291 <0.000589>
18484 15:01:00 poll([{fd=54, events=POLLIN}], 1, -1
18484 15:03:14 <... poll resumed>) = 1 ([{fd=54, revents=POLLIN}]) <134.319198>
18484 15:03:14 recvfrom(55,
18484 15:03:14 <... recvfrom resumed>"1\0\0\0\0042\0\0\0\4n\0\0\0\4C\0\0\0\rUPDATE 0\0Z\0\0\0\5I", 16384, 0, NULL, NULL) = 35 <0.000026>
18484 15:03:14 sendto(55, "P\0\0\0V\0DELETE FROM Workers WHERE\n(extract(epoch from now())::int - last_seen) > "..., 131, MSG_NOSIGNAL, NULL, 0
18484 15:03:14 <... sendto resumed>) = 131 <0.000084>
18484 15:03:14 poll([{fd=54, events=POLLIN}], 1, -1) = 1 ([{fd=54, revents=POLLNVAL}]) <0.000013>
18484 15:03:14 recvfrom(55, "1\0\0\0\0042\0\0\0\4n\0\0\0\4C\0\0\0\rDELETE 4\0Z\0\0\0\5I", 16384, 0, NULL, NULL) = 35 <0.000016>
18484 15:03:14 sendto(55, "X\0\0\0\4", 5, MSG_NOSIGNAL, NULL, 0) = 5 <0.000025>
18484 15:03:14 close(55) = 0 <0.000018>
18484 15:03:14 openat(AT_FDCWD, "/etc/localtime", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory) <0.000019>
18484 15:03:14 write(2, "2023-06-30T15:03:14 warning: Poll loop busy during 134 seconds.\n", 64) = 64 <0.000024>
--8<---------------cut here---------------end--------------->8---

In this case FD 54 is a connection with a worker process; terminating
that process led poll(2) to return, thus unblocking the “poll loop”.

The problem is most likely with the connection-to-port caching in
squee’s ‘connection-socket-port’, as can be seen in this other trace
where I added ‘pk’ calls in ‘connection-socket-port’:

--8<---------------cut here---------------start------------->8---
19468 15:37:43 connect(55, {sa_family=AF_UNIX, sun_path="/tmp/ephemeralpg.58xnKh/.s.PGSQL.5432"}, 110) = 0 <0.000018>
19468 15:37:43 getsockopt(55, SOL_SOCKET, SO_ERROR, [0], [4]) = 0 <0.000010>
19468 15:37:43 getsockname(55, {sa_family=AF_UNIX}, [128 => 2]) = 0 <0.000009>
19468 15:37:43 poll([{fd=55, events=POLLOUT|POLLERR}], 1, -1) = 1 ([{fd=55, revents=POLLOUT}]) <0.000009>
19468 15:37:43 sendto(55, "\0\0\0!\0\3\0\0user\0ludo\0database\0test\0\0", 33, MSG_NOSIGNAL, NULL, 0) = 33 <0.000012>
19468 15:37:43 poll([{fd=55, events=POLLIN|POLLERR}], 1, -1) = 1 ([{fd=55, revents=POLLIN}]) <0.002109>
19468 15:37:43 recvfrom(55, "R\0\0\0\10\0\0\0\0S\0\0\0\26application_name\0\0S\0\0\0\31client_encoding\0UTF8\0S\0\0\0\27DateStyle\0ISO, MD"..., 16384, 0, NULL, NULL) = 376 <0.000009>
19468 15:37:43 sendto(55, "P\0\0\0\366\0\nUPDATE Builds SET status = -2, worker = null FROM\n(SELECT id FROM Workers"..., 291, MSG_NOSIGNAL, NULL, 0) = 291 <0.000012>
19468 15:37:43 write(1, "\n", 1) = 1 <0.000015>
19468 15:37:43 ioctl(54, TCGETS, 0x7ffd8fde6660) = -1 ENOTTY (Inappropriate ioctl for device) <0.000012>
19468 15:37:43 write(1, ";;; (cached # #)\n", 68) = 68 <0.000012>
--8<---------------cut here---------------end--------------->8---

Above we open a fresh connection on FD 55, but ‘connection-socket-port’
determines that the connection object is cached and associated with a
port for FD 54.

To be continued…

Ludo’.
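
The failure mode described above can be sketched in a few lines. This
is a hypothetical Python model, not squee's code: Connection, Port,
port_cache and both lookup functions are invented names. It also
assumes that the stale entry was left behind by an earlier connection
whose handle was later reused, which the traces suggest but do not
prove.

```python
# Hypothetical illustration only; none of these names come from squee.


class Connection:
    """Stand-in for a database connection: an opaque handle plus its socket FD."""

    def __init__(self, handle, fd):
        self.handle = handle  # plays the role of a C pointer address, which can be reused
        self.fd = fd


class Port:
    """Stand-in for a port object wrapping a file descriptor."""

    def __init__(self, fd):
        self.fd = fd


port_cache = {}  # handle -> Port


def socket_port_buggy(conn):
    """Return the cached port for this handle without checking it still matches."""
    if conn.handle not in port_cache:
        port_cache[conn.handle] = Port(conn.fd)
    return port_cache[conn.handle]


def socket_port_checked(conn):
    """Same lookup, but replace a cached port whose FD no longer matches."""
    port = port_cache.get(conn.handle)
    if port is None or port.fd != conn.fd:
        port = Port(conn.fd)
        port_cache[conn.handle] = port
    return port


old = Connection(handle=0x1000, fd=54)
socket_port_buggy(old)                  # caches a port for FD 54
new = Connection(handle=0x1000, fd=55)  # handle reused, fresh socket on FD 55

stale_fd = socket_port_buggy(new).fd    # stale entry: the caller would poll FD 54
fresh_fd = socket_port_checked(new).fd  # entry replaced: the caller polls FD 55
print(stale_fd, fresh_fd)
```

The unchecked lookup hands back the port for FD 54 even though the
fresh connection lives on FD 55, which is consistent with the poll(2)
on the wrong descriptor in the first trace. Validating the cached
entry is only one possible remedy; evicting on close or dropping the
cache would also avoid the stale hit, and nothing here says which one
the real fix uses.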
From unknown Fri Aug 15 20:56:51 2025 X-Loop: help-debbugs@gnu.org Subject: bug#64297: [Cuirass] Remote server not picking up job, losing workers Resent-From: Ludovic =?UTF-8?Q?Court=C3=A8s?= Original-Sender: "Debbugs-submit" Resent-CC: bug-guix@gnu.org Resent-Date: Fri, 30 Jun 2023 22:43:02 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 64297 X-GNU-PR-Package: guix X-GNU-PR-Keywords: To: 64297@debbugs.gnu.org Received: via spool by 64297-submit@debbugs.gnu.org id=B64297.168816497715079 (code B ref 64297); Fri, 30 Jun 2023 22:43:02 +0000 Received: (at 64297) by debbugs.gnu.org; 30 Jun 2023 22:42:57 +0000 Received: from localhost ([127.0.0.1]:55969 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1qFMpM-0003v9-Rb for submit@debbugs.gnu.org; Fri, 30 Jun 2023 18:42:57 -0400 Received: from eggs.gnu.org ([209.51.188.92]:44616) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1qFMpK-0003ur-3O for 64297@debbugs.gnu.org; Fri, 30 Jun 2023 18:42:54 -0400 Received: from fencepost.gnu.org ([2001:470:142:3::e]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qFMpD-0001dW-LZ for 64297@debbugs.gnu.org; Fri, 30 Jun 2023 18:42:48 -0400 DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=gnu.org; s=fencepost-gnu-org; h=MIME-Version:In-Reply-To:Date:References:Subject:To: From; bh=URIoEpHpRLI2tJgeO+myryb11g5R0/o5Jk3UL0HuBn0=; b=Wz4Rzbf16RAJ1N5KPSXz nht/X5yTwczdQW3mZhz7I+CpGhuaWyiA0PvVau08aC9IbaSQNNdhSpaLJYheMhD90YqanPTnKVo00 5/XQHnITV7bCbM79j0AvTB8foFOL+hQzTqeEQ3V05A9LtuQc4V5dYqPkNmGniS4/C4bZ/n+hEEndH O4UJaoXTzH/7Q3lmgtsEqgs7iuhAIvhyQuxSwzkw3v6hHDR4hd3ccCkLXx8ATSrjh6RKXhNP/FkRI Sxm35n6tCQyBmDNKEzCp9wPLfXPl9MzV1CsWaP2MaDkMZaBjZf1sfY+wGyt1PH6SWoqRpTYkLjPqb LFJHX1Pviv+GNw==; Received: from 91-160-117-201.subs.proxad.net ([91.160.117.201] helo=ribbon) by fencepost.gnu.org with esmtpsa (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) 
(Exim 4.90_1) (envelope-from ) id 1qFMpD-0006iE-98 for 64297@debbugs.gnu.org; Fri, 30 Jun 2023 18:42:47 -0400 From: Ludovic =?UTF-8?Q?Court=C3=A8s?= References: <87fs6ey56z.fsf@inria.fr> <87wmzlymvn.fsf@gnu.org> Date: Sat, 01 Jul 2023 00:42:44 +0200 In-Reply-To: <87wmzlymvn.fsf@gnu.org> ("Ludovic =?UTF-8?Q?Court=C3=A8s?="'s message of "Fri, 30 Jun 2023 17:45:48 +0200") Message-ID: <87sfa8zi57.fsf@gnu.org> User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/28.2 (gnu/linux) MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable X-Spam-Score: -2.3 (--) X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -3.3 (---) Ludovic Court=C3=A8s skribis: > The problem is most likely with the connection-to-port caching in > squee=E2=80=99s =E2=80=98connection-socket-port=E2=80=99, as can be seen = in this other trace > where I added =E2=80=98pk=E2=80=99 calls in =E2=80=98connection-socket-po= rt=E2=80=99: Confirmed, with a fix! https://notabug.org/cwebber/guile-squee/pulls/8 Ludo=E2=80=99. 
From unknown Fri Aug 15 20:56:51 2025 X-Loop: help-debbugs@gnu.org Subject: bug#64297: [Cuirass] Remote server not picking up job, losing workers Resent-From: Christopher Baines Original-Sender: "Debbugs-submit" Resent-CC: bug-guix@gnu.org Resent-Date: Sat, 01 Jul 2023 10:31:02 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 64297 X-GNU-PR-Package: guix X-GNU-PR-Keywords: To: Ludovic =?UTF-8?Q?Court=C3=A8s?= Cc: 64297@debbugs.gnu.org Received: via spool by 64297-submit@debbugs.gnu.org id=B64297.16882074081821 (code B ref 64297); Sat, 01 Jul 2023 10:31:02 +0000 Received: (at 64297) by debbugs.gnu.org; 1 Jul 2023 10:30:08 +0000 Received: from localhost ([127.0.0.1]:56688 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1qFXrk-0000TJ-7z for submit@debbugs.gnu.org; Sat, 01 Jul 2023 06:30:08 -0400 Received: from mira.cbaines.net ([212.71.252.8]:42802) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1qFXrh-0000T5-0b for 64297@debbugs.gnu.org; Sat, 01 Jul 2023 06:30:05 -0400 Received: from localhost (unknown [IPv6:2a02:8010:68c1:0:54d1:d5d4:280e:f699]) by mira.cbaines.net (Postfix) with ESMTPSA id E778F27BBE2; Sat, 1 Jul 2023 11:30:03 +0100 (BST) Received: from felis (localhost [127.0.0.1]) by localhost (OpenSMTPD) with ESMTP id 25a0aa53; Sat, 1 Jul 2023 10:30:02 +0000 (UTC) References: <87fs6ey56z.fsf@inria.fr> <87wmzlymvn.fsf@gnu.org> <87sfa8zi57.fsf@gnu.org> User-agent: mu4e 1.10.2; emacs 28.2 From: Christopher Baines Date: Sat, 01 Jul 2023 11:28:54 +0100 In-reply-to: <87sfa8zi57.fsf@gnu.org> Message-ID: <87zg4fq5zs.fsf@cbaines.net> MIME-Version: 1.0 Content-Type: multipart/signed; boundary="=-=-="; micalg=pgp-sha512; protocol="application/pgp-signature" X-Spam-Score: -0.0 (/) X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: 
debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -1.0 (-) --=-=-= Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable Ludovic Court=C3=A8s writes: > Ludovic Court=C3=A8s skribis: > >> The problem is most likely with the connection-to-port caching in >> squee=E2=80=99s =E2=80=98connection-socket-port=E2=80=99, as can be seen= in this other trace >> where I added =E2=80=98pk=E2=80=99 calls in =E2=80=98connection-socket-p= ort=E2=80=99: > > Confirmed, with a fix! > > https://notabug.org/cwebber/guile-squee/pulls/8 I've merged that change, updated guile-squee in Guix, pulled on berlin, reconfigured and restarted Cuirass now. It seems to be building some now stuff at least. --=-=-= Content-Type: application/pgp-signature; name="signature.asc" -----BEGIN PGP SIGNATURE----- iQKlBAEBCgCPFiEEPonu50WOcg2XVOCyXiijOwuE9XcFAmSgACdfFIAAAAAALgAo aXNzdWVyLWZwckBub3RhdGlvbnMub3BlbnBncC5maWZ0aGhvcnNlbWFuLm5ldDNF ODlFRUU3NDU4RTcyMEQ5NzU0RTBCMjVFMjhBMzNCMEI4NEY1NzcRHG1haWxAY2Jh aW5lcy5uZXQACgkQXiijOwuE9Xfspw//cJ4hauzPD7e/PokmBaoz8ul61GsF670v XWvv4OFywZe3IBlVXkzGiIX731YBb02cdIFIeWK4YrAOTh5s53JaTLABRV7mByov ys3eFaZyMrZJkB0F8ROzjWDgDubXTY2coQqoN8Q/2Q7u5Jk7hJ2VVkAiIz8tqxJM QUvyg99tCUbSmgwl1A33GB/1n0z93MU9WNT8W62qoN3hFiN/+uJrxXltpjJn/Q8m N6koTLe1VAa90aLGcp54kJselDKVBZcH+AeOAxFL8JufmpfPq8eFAG3eMcFi0Z5o fp3tj3DGE8hEzNSgo+N6iHnk65cTGIvse9vhzNwmYzFlATyv4qC4Yeo/gSWrnOHs ZJAjxqEG/vPDyTicB3q4bp9cubcJZ154YF4I5XUXL0gzb4sVsyRZTRyQecApmaS/ rfTXGfoOFGbyLi9m2b8ymG2jSKk70hitHTAQLKQEO5U5TTcXq7in7Q4cDh0ZK7ho iPO8XOfyNa8WEkL+KfNy7FNzhlONPGDDtSLFU/dsf65N4MHbqPp9Qx/4sW1vGvJs 3glebYlKrqWlXYwJZkAXvBsI+bPdj/WMtVz07EviDjYNt1/KYdguNV7GekT+DK9j K/zaSI+9mylxhXjCK/BcrpHebeD9eppr4v/FDhGWAmFp8GMdPyuw9z0j1tZWTsz3 GAY9UCiZ3pg= =n00e -----END PGP SIGNATURE----- --=-=-=-- From unknown Fri Aug 15 20:56:51 2025 MIME-Version: 1.0 X-Mailer: MIME-tools 5.505 (Entity 5.505) X-Loop: help-debbugs@gnu.org From: help-debbugs@gnu.org (GNU bug Tracking System) To: Ludovic 
=?UTF-8?Q?Court=C3=A8s?= Subject: bug#64297: closed (Re: bug#64297: [Cuirass] Remote server not picking up job, losing workers) Message-ID: References: <87mt0fwovi.fsf@gnu.org> <87fs6ey56z.fsf@inria.fr> X-Gnu-PR-Message: they-closed 64297 X-Gnu-PR-Package: guix Reply-To: 64297@debbugs.gnu.org Date: Sat, 01 Jul 2023 16:59:02 +0000 Content-Type: multipart/mixed; boundary="----------=_1688230742-29556-1" This is a multi-part message in MIME format... ------------=_1688230742-29556-1 Content-Disposition: inline Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Your bug report #64297: [Cuirass] Remote server not picking up job, losing workers which was filed against the guix package, has been closed. The explanation is attached below, along with your original report. If you require more details, please reply to 64297@debbugs.gnu.org. --=20 64297: https://debbugs.gnu.org/cgi/bugreport.cgi?bug=3D64297 GNU Bug Tracking System Contact help-debbugs@gnu.org with problems ------------=_1688230742-29556-1 Content-Type: message/rfc822 Content-Disposition: inline Content-Transfer-Encoding: 7bit Received: (at 64297-done) by debbugs.gnu.org; 1 Jul 2023 16:58:03 +0000 Received: from localhost ([127.0.0.1]:58580 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1qFdv8-0007fO-PJ for submit@debbugs.gnu.org; Sat, 01 Jul 2023 12:58:03 -0400 Received: from eggs.gnu.org ([209.51.188.92]:53824) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1qFdv6-0007eq-O9 for 64297-done@debbugs.gnu.org; Sat, 01 Jul 2023 12:58:01 -0400 Received: from fencepost.gnu.org ([2001:470:142:3::e]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qFdv1-0004dR-BH; Sat, 01 Jul 2023 12:57:55 -0400 DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=gnu.org; s=fencepost-gnu-org; h=MIME-Version:In-Reply-To:Date:References:Subject:To: From; 
bh=G+YGC68MVfcdoWwSgzGyw+238P6tch7K6IFBpaF0fdU=; b=FzyEf4P5nNI8Jbfv3ZsB F4NGnZVg+sSQdlvwAh0/1D6SNuJhhvdqii2QToj+1zThERkOI+2pwaSuuX88lTfvdhw6xQH8Z/wTx TtJSLYMbdtEkHpKqDFVQGs8BAmFYqWF+miK1r5AW8eKJAusOhwy3yrcuj5E4pCi8cJi+8aj0FoxWf 1lU42YOxW05eJgohyyqZAJgehYLs4kQb96LBRcJI9NNjCdWyFxl0mfRpwcK/i7R5hCJjsIopkYHRd RzUMURBhg5yzZ3R9QkxI6fpcDFAeqGxDs5HVDL4Z6PA4gRXphSTkr1UQl0gAV/g6/FUmMO9U+V/j8 CdJuU7VHHHAT8g==; Received: from 91-160-117-201.subs.proxad.net ([91.160.117.201] helo=ribbon) by fencepost.gnu.org with esmtpsa (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qFdv0-0004zc-Uj; Sat, 01 Jul 2023 12:57:55 -0400 From: =?utf-8?Q?Ludovic_Court=C3=A8s?= To: Christopher Baines Subject: Re: bug#64297: [Cuirass] Remote server not picking up job, losing workers References: <87fs6ey56z.fsf@inria.fr> <87wmzlymvn.fsf@gnu.org> <87sfa8zi57.fsf@gnu.org> <87zg4fq5zs.fsf@cbaines.net> Date: Sat, 01 Jul 2023 18:57:53 +0200 In-Reply-To: <87zg4fq5zs.fsf@cbaines.net> (Christopher Baines's message of "Sat, 01 Jul 2023 11:28:54 +0100") Message-ID: <87mt0fwovi.fsf@gnu.org> User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/28.2 (gnu/linux) MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable X-Spam-Score: -2.3 (--) X-Debbugs-Envelope-To: 64297-done Cc: 64297-done@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -3.3 (---) Hello, Christopher Baines skribis: > Ludovic Court=C3=A8s writes: > >> Ludovic Court=C3=A8s skribis: >> >>> The problem is most likely with the connection-to-port caching in >>> squee=E2=80=99s =E2=80=98connection-socket-port=E2=80=99, as can be see= n in this other trace >>> where I added =E2=80=98pk=E2=80=99 calls in =E2=80=98connection-socket-= port=E2=80=99: >> >> Confirmed, with a fix! 
>> >> https://notabug.org/cwebber/guile-squee/pulls/8 > > I've merged that change, updated guile-squee in Guix, pulled on berlin, > reconfigured and restarted Cuirass now. Awesome, thanks a lot! It does seem to be working hard now. (I=E2=80=99ve just reconfigured as well, as I hadn=E2=80=99t seen your mess= age=E2=80=A6) Ludo=E2=80=99. ------------=_1688230742-29556-1 Content-Type: message/rfc822 Content-Disposition: inline Content-Transfer-Encoding: 7bit Received: (at submit) by debbugs.gnu.org; 26 Jun 2023 08:54:41 +0000 Received: from localhost ([127.0.0.1]:44437 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1qDhzd-0001Sk-DF for submit@debbugs.gnu.org; Mon, 26 Jun 2023 04:54:41 -0400 Received: from lists.gnu.org ([209.51.188.17]:42986) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1qDhza-0001Sa-78 for submit@debbugs.gnu.org; Mon, 26 Jun 2023 04:54:40 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qDhzZ-0005NN-Sp for bug-guix@gnu.org; Mon, 26 Jun 2023 04:54:37 -0400 Received: from mail2-relais-roc.national.inria.fr ([192.134.164.83]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qDhzX-00020J-Ix for bug-guix@gnu.org; Mon, 26 Jun 2023 04:54:37 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=inria.fr; s=dc; h=from:to:subject:date:message-id:mime-version: content-transfer-encoding; bh=4dOHBOu0R1ZSg1Z8Y5fXPZErjxj43EHEyIn5qmkO6hw=; b=e83Eq5gw+PHJgBInftp+fAbKz/KZ8xUwGQtduYZ5tBV/0PRQHMEZD3Kd 62HK/pe3c8m02K9RBqg/fIo2CE5C7LH8yZVDgndTvfhqcxrBFbMZOsnjj k2opWS4ZXrDnyro98+BhOn8Qu4sZjQxOg0fCGjuTubxSTsCkVRbtEFceV 0=; Authentication-Results: mail2-relais-roc.national.inria.fr; dkim=none (message not signed) header.i=none; spf=SoftFail smtp.mailfrom=ludovic.courtes@inria.fr; dmarc=fail (p=none dis=none) d=inria.fr X-IronPort-AV: 
E=Sophos;i="6.01,159,1684792800"; d="scan'208";a="114625635" Received: from unknown (HELO ribbon) ([193.50.110.146]) by mail2-relais-roc.national.inria.fr with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 26 Jun 2023 10:54:12 +0200 From: =?utf-8?Q?Ludovic_Court=C3=A8s?= To: bug-guix@gnu.org Subject: [Cuirass] Remote server not picking up job, losing workers X-URL: http://www.fdn.fr/~lcourtes/ X-Revolutionary-Date: Octidi 8 Messidor an 231 de la =?utf-8?Q?R=C3=A9volu?= =?utf-8?Q?tion=2C?= jour de =?utf-8?Q?l'=C3=89chalotte?= X-PGP-Key-ID: 0x090B11993D9AEBB5 X-PGP-Key: http://www.fdn.fr/~lcourtes/ludovic.asc X-PGP-Fingerprint: 3CE4 6455 8A84 FDC6 9DB4 0CFB 090B 1199 3D9A EBB5 X-OS: x86_64-pc-linux-gnu Date: Mon, 26 Jun 2023 10:54:12 +0200 Message-ID: <87fs6ey56z.fsf@inria.fr> User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/28.2 (gnu/linux) MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable Received-SPF: pass client-ip=192.134.164.83; envelope-from=ludovic.courtes@inria.fr; helo=mail2-relais-roc.national.inria.fr X-Spam_score_int: -43 X-Spam_score: -4.4 X-Spam_bar: ---- X-Spam_report: (-4.4 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_MED=-2.3, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=ham autolearn_force=no X-Spam_action: no action X-Spam-Score: -1.3 (-) X-Debbugs-Envelope-To: submit X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -2.3 (--) As of cuirass@1.1.0-16.b825967, =E2=80=98cuirass remote-server=E2=80=99 app= ears to not pick jobs as quickly as it should and to lose sight of workers (you can see them come and go on ). /var/log/cuirass-remote-worker.log shows that it does build things, but only sporadically. 
Then there are things like: 2023-06-26 10:07:58 warning: Poll loop busy during 3404 seconds. This is presumably related to Cuirass commit c4743b54720e86b0e0b0295fb6d33977e4293644 (previously =E2=80=98remote-worker= =E2=80=99 would have a database worker thread; now it opens a new connection every time=E2=80=94a stopgap before it=E2=80=99s fiberized, but apparently not a = good one). Ludo=E2=80=99. ------------=_1688230742-29556-1--