From unknown Wed Jun 18 00:16:48 2025
From: bug#41119 <41119@debbugs.gnu.org>
To: bug#41119 <41119@debbugs.gnu.org>
Subject: Status: [PATCH] fix some issues with (guix nar)
Date: Wed, 18 Jun 2025 07:16:48 +0000

retitle 41119 [PATCH] fix some issues with (guix nar)
reassign 41119 guix-patches
submitter 41119 Caleb Ristvedt
severity 41119 normal
tag 41119 fixed
thanks

From debbugs-submit-bounces@debbugs.gnu.org Wed May 06 23:52:21 2020
From: Caleb Ristvedt
To: guix-patches@gnu.org
Subject: [PATCH] fix some issues with (guix nar)
Date: Wed, 06 May 2020 22:52:11 -0500
Message-ID: <87h7wsqu50.fsf@cune.org>

I noticed two issues while looking at (guix nar):

1. The proper store-lock-handling protocol isn't used in
   FINALIZE-STORE-FILE. Lock acquisition needs to check for a deletion
   token, retrying if it exists, and lock release needs to delete the
   lock file and write the deletion token.

2. WITH-TEMPORARY-STORE-FILE opens a new daemon connection every time it
   retries with a new filename, and only closes any of them after the
   body has completed. So if we retry 20 times, we get 20 concurrent
   daemon connections. This also prevents the call to LOOP from being a
   tail call.

The attached patches resolve these issues. There are of course going to
be more places we need to (properly) acquire and release store locks as
guile-daemon code gets merged, but for now this should work as a bandaid
fix.

- reepca

Content-Disposition: attachment; filename=0001-nar-finalize-store-file-follows-proper-store-lock-pr.patch
Content-Description: FINALIZE-STORE-FILE fix

From b2c66b443bd42e05820cfb3920c96f1894820587 Mon Sep 17 00:00:00 2001
From: Caleb Ristvedt
Date: Wed, 6 May 2020 11:48:21 -0500
Subject: [PATCH 1/2] nar: 'finalize-store-file' follows proper store lock
 protocol.

* guix/nar.scm (finalize-store-file): check for deletion token when acquiring
lock, write deletion token and delete lock file before releasing lock.
---
 guix/nar.scm | 14 +++++++++++++-
 1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/guix/nar.scm b/guix/nar.scm
index 29636aa0f8..f91af72879 100644
--- a/guix/nar.scm
+++ b/guix/nar.scm
@@ -82,10 +82,19 @@ REFERENCES and DERIVER.  When LOCK? is true, acquire exclusive locks on TARGET
 before attempting to register it; otherwise, assume TARGET's locks are already
 held."
+  ;; TODO: make this reusable
+  (define (acquire-lock filename)
+    (let ((port (lock-file filename)))
+      (if (zero? (stat:size (stat port)))
+          port
+          (begin
+            (close port)
+            (acquire-lock filename)))))
+
   (with-database %default-database-file db
     (unless (path-id db target)
       (let ((lock (and lock?
-                       (lock-file (string-append target ".lock")))))
+                       (acquire-lock (string-append target ".lock")))))
 
         (unless (path-id db target)
           ;; If FILE already exists, delete it (it's invalid anyway.)
@@ -102,6 +111,9 @@ held."
                          #:deriver deriver))
 
         (when lock?
+          (delete-file (string-append target ".lock"))
+          (display "d" lock)
+          (force-output lock)
           (unlock-file lock))))))
 
 (define (temporary-store-file)
-- 
2.26.2

Content-Disposition: attachment; filename=0002-nar-with-temporary-store-file-uses-a-single-connecti.patch
Content-Description: WITH-TEMPORARY-STORE-FILE fix

From 43ee61b405b01038b3e7c84aba64521ab8a62236 Mon Sep 17 00:00:00 2001
From: Caleb Ristvedt
Date: Wed, 6 May 2020 11:52:16 -0500
Subject: [PATCH 2/2] nar: 'with-temporary-store-file' uses a single connection

Previously the 'with-store' form was entered every time a different temporary
file was tried.  This caused there to be as many simultaneous open connections
as there were attempts, and prevented the (loop ...) call from being a tail
call.  This change fixes that.

* guix/nar.scm (with-temporary-store-file): open connection once prior to
entering the loop.
---
 guix/nar.scm | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/guix/nar.scm b/guix/nar.scm
index f91af72879..404cef8b97 100644
--- a/guix/nar.scm
+++ b/guix/nar.scm
@@ -126,8 +126,8 @@ held."
 (define-syntax-rule (with-temporary-store-file name body ...)
   "Evaluate BODY with NAME bound to the file name of a temporary store item
 protected from GC."
-  (let loop ((name (temporary-store-file)))
-    (with-store store
+  (with-store store
+    (let loop ((name (temporary-store-file)))
       ;; Add NAME to the current process' roots.  (Opening this connection to
       ;; the daemon allows us to reuse its code that deals with the
       ;; per-process roots file.)
-- 
2.26.2

From debbugs-submit-bounces@debbugs.gnu.org Thu May 07 04:05:20 2020
From: Ludovic Courtès
To: Caleb Ristvedt
Cc: 41119@debbugs.gnu.org
Subject: Re: [bug#41119] [PATCH] fix some issues with (guix nar)
Date: Thu, 07 May 2020 10:05:08 +0200
Message-ID: <87o8r0dvbf.fsf@gnu.org>
In-Reply-To: <87h7wsqu50.fsf@cune.org>

Hi!

Caleb Ristvedt writes:

> From b2c66b443bd42e05820cfb3920c96f1894820587 Mon Sep 17 00:00:00 2001
> From: Caleb Ristvedt
> Date: Wed, 6 May 2020 11:48:21 -0500
> Subject: [PATCH 1/2] nar: 'finalize-store-file' follows proper store lock
>  protocol.
>
> * guix/nar.scm (finalize-store-file): check for deletion token when acquiring
> lock, write deletion token and delete lock file before releasing lock.

[...]

> +  ;; TODO: make this reusable
> +  (define (acquire-lock filename)

For consistency, s/filename/file/ please.  :-)

> +    (let ((port (lock-file filename)))
> +      (if (zero? (stat:size (stat port)))
> +          port
> +          (begin

Could you add a comment, like: “If FILE is non-empty, that’s because it
contains the deletion token, so try again.”

>          (when lock?
> +          (delete-file (string-append target ".lock"))
> +          (display "d" lock)
> +          (force-output lock)

Also a comment explaining why we’re writing a deletion token.

It’s a fine point of the daemon that I had totally overlooked.  I wonder
what the implications might have been.

> From 43ee61b405b01038b3e7c84aba64521ab8a62236 Mon Sep 17 00:00:00 2001
> From: Caleb Ristvedt
> Date: Wed, 6 May 2020 11:52:16 -0500
> Subject: [PATCH 2/2] nar: 'with-temporary-store-file' uses a single connection
>
> Previously the 'with-store' form was entered every time a different temporary
> file was tried.  This caused there to be as many simultaneous open connections
> as there were attempts, and prevented the (loop ...) call from being a tail
> call.  This change fixes that.
>
> * guix/nar.scm (with-temporary-store-file): open connection once prior to
> entering the loop.

LGTM!

You can push both patches to ‘master’ (make sure “make authenticate”
passes before you do).

Thanks a lot for the quick fixes!

Ludo’.

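The lock protocol discussed in the message above (retry on acquisition when the lock file carries a deletion token; on release, unlink the file and write the token before unlocking) can be sketched outside Guix. What follows is a minimal Python illustration, not the Guix implementation: the names `acquire_lock`, `release_lock` and `demo` are hypothetical, and `flock` is used as a stand-in for whatever locking primitive `lock-file` relies on.

```python
import fcntl
import os
import tempfile

DELETION_TOKEN = b"d"  # one-byte marker, mirroring (display "d" lock) in the patch


def acquire_lock(path):
    """Open and exclusively lock PATH; retry if the locked file is stale."""
    while True:
        fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
        fcntl.flock(fd, fcntl.LOCK_EX)   # assumption: flock stands in for lock-file
        if os.fstat(fd).st_size == 0:
            return fd                    # empty file: nobody marked it deleted
        os.close(fd)                     # non-empty: deletion token present, retry


def release_lock(fd, path):
    """Unlink the lock file, mark the open file as deleted, then unlock."""
    os.unlink(path)
    os.write(fd, DELETION_TOKEN)
    fcntl.flock(fd, fcntl.LOCK_UN)
    os.close(fd)


def demo():
    """Simulate a waiter that opened the lock file before it was released."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "item.lock")
        holder = acquire_lock(path)
        waiter = os.open(path, os.O_RDWR)      # a real waiter would block in flock
        release_lock(holder, path)
        stale_size = os.fstat(waiter).st_size  # the waiter sees the token
        os.close(waiter)
        fresh = acquire_lock(path)             # a new, empty lock file is created
        fresh_size = os.fstat(fresh).st_size
        release_lock(fresh, path)
        return stale_size, fresh_size
```

Without the token, a process that had already opened the old lock file would end up holding a lock on an unlinked inode while another process locks a freshly created file of the same name; the size check is what lets the waiter notice and retry.
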
From debbugs-submit-bounces@debbugs.gnu.org Mon May 11 17:39:14 2020
From: Ludovic Courtès
To: Caleb Ristvedt
Cc: 41119-done@debbugs.gnu.org
Subject: Re: [bug#41119] [PATCH] fix some issues with (guix nar)
Date: Mon, 11 May 2020 23:39:03 +0200
Message-ID: <877dxidudk.fsf@gnu.org>
In-Reply-To: <87o8r0dvbf.fsf@gnu.org>

This was pushed a couple of days ago:

  b338c41c82 nar: 'with-temporary-store-file' uses a single connection
  37edbc91e3 nar: 'finalize-store-file' follows proper store lock protocol.

Closing, and thanks again!

Ludo’.

From unknown Wed Jun 18 00:16:48 2025
To: internal_control@debbugs.gnu.org
From: Debbugs Internal Request
Subject: Internal Control
Date: Wed, 27 May 2020 21:39:02 +0000

# This is a fake control message.
#
# The action:
# Did not alter fixed versions and reopened.
thanks
# This fakemail brought to you by your local debbugs
# administrator

From debbugs-submit-bounces@debbugs.gnu.org Wed May 27 17:38:51 2020
From: Ludovic Courtès
To: control@debbugs.gnu.org
Subject: control message for bug #41119
Date: Wed, 27 May 2020 23:38:41 +0200
Message-Id: <87wo4xgise.fsf@gnu.org>

reopen 41119
tags 41119 - fixed patch
quit

From debbugs-submit-bounces@debbugs.gnu.org Wed May 27 17:54:51 2020
From: Ludovic Courtès
To: Caleb Ristvedt
Cc: guix-sysadmin@gnu.org, 41119@debbugs.gnu.org
Subject: Re: [bug#41119] [PATCH] fix some issues with (guix nar)
Date: Wed, 27 May 2020 23:54:38 +0200
Message-ID: <87k10xgi1t.fsf@gnu.org>
In-Reply-To: <87h7wsqu50.fsf@cune.org>

Hi,

Caleb Ristvedt writes:

> From 43ee61b405b01038b3e7c84aba64521ab8a62236 Mon Sep 17 00:00:00 2001
> From: Caleb Ristvedt
> Date: Wed, 6 May 2020 11:52:16 -0500
> Subject: [PATCH 2/2] nar: 'with-temporary-store-file' uses a single connection
>
> Previously the 'with-store' form was entered every time a different temporary
> file was tried.  This caused there to be as many simultaneous open connections
> as there were attempts, and prevented the (loop ...) call from being a tail
> call.  This change fixes that.
>
> * guix/nar.scm (with-temporary-store-file): open connection once prior to
> entering the loop.
> ---
>  guix/nar.scm | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/guix/nar.scm b/guix/nar.scm
> index f91af72879..404cef8b97 100644
> --- a/guix/nar.scm
> +++ b/guix/nar.scm
> @@ -126,8 +126,8 @@ held."
>  (define-syntax-rule (with-temporary-store-file name body ...)
>    "Evaluate BODY with NAME bound to the file name of a temporary store item
>  protected from GC."
> -  (let loop ((name (temporary-store-file)))
> -    (with-store store
> +  (with-store store
> +    (let loop ((name (temporary-store-file)))
>        ;; Add NAME to the current process' roots.  (Opening this connection to
>        ;; the daemon allows us to reuse its code that deals with the
>        ;; per-process roots file.)

This change had an undesirable effect: the connection would be kept for
the body of ‘with-temporary-store-file’, during which we’d call:

  finalize-store-file -> register-path

which accesses the database.  At this point, for each ‘guix offload’
process, we’d thus have the database open twice: once for the session’s
guix-daemon, and once for that ‘register-path’ call.

On berlin, the effect is that we see many ‘guix offload’ processes
stalled because the SQLite database is busy:

--8<---------------cut here---------------start------------->8---
ludo@berlin ~$ guix processes |grep ^SessionPID|wc -l
104
ludo@berlin ~$ guix processes |recsel -e 'ClientCommand ~ "offload"'|grep ^SessionPID |wc -l
69
ludo@berlin ~$ guix processes |recsel -e 'ClientCommand ~ "offload"'|head 
SessionPID: 10916
ClientPID: 7408
ClientCommand: /gnu/store/18hp7flyb3yid3yp49i6qcdq0sbi5l1n-guile-3.0.2/bin/guile \ /gnu/store/abiva5ivq99x30r2s9pa3jj0pv9g16sv-guix-1.1.0-4.bdc801e/bin/.guix-real offload x86_64-linux 3600 1 21600

SessionPID: 11333
ClientPID: 9505
ClientCommand: /gnu/store/18hp7flyb3yid3yp49i6qcdq0sbi5l1n-guile-3.0.2/bin/guile \ /gnu/store/abiva5ivq99x30r2s9pa3jj0pv9g16sv-guix-1.1.0-4.bdc801e/bin/.guix-real offload x86_64-linux 3600 1 21600

SessionPID: 16277
ClientPID: 9179
ludo@berlin ~$ sudo strace -p 7408
strace: Process 7408 attached
restart_syscall(<... resuming interrupted read ...>) = 0
fcntl(19, F_SETLK, {l_type=F_RDLCK, l_whence=SEEK_SET, l_start=125, l_len=1}) = 0
fcntl(19, F_SETLK, {l_type=F_WRLCK, l_whence=SEEK_SET, l_start=120, l_len=1}) = -1 EAGAIN (Resource temporarily unavailable)
fcntl(19, F_SETLK, {l_type=F_UNLCK, l_whence=SEEK_SET, l_start=125, l_len=1}) = 0
clock_nanosleep(CLOCK_REALTIME, 0, {tv_sec=0, tv_nsec=100000000}, NULL) = 0
fcntl(19, F_SETLK, {l_type=F_RDLCK, l_whence=SEEK_SET, l_start=125, l_len=1}) = 0
fcntl(19, F_SETLK, {l_type=F_WRLCK, l_whence=SEEK_SET, l_start=120, l_len=1}) = -1 EAGAIN (Resource temporarily unavailable)
fcntl(19, F_SETLK, {l_type=F_UNLCK, l_whence=SEEK_SET, l_start=125, l_len=1}) = 0
clock_nanosleep(CLOCK_REALTIME, 0, {tv_sec=0, tv_nsec=100000000}, NULL) = 0
fcntl(19, F_SETLK, {l_type=F_RDLCK, l_whence=SEEK_SET, l_start=125, l_len=1}) = 0
fcntl(19, F_SETLK, {l_type=F_WRLCK, l_whence=SEEK_SET, l_start=120, l_len=1}) = -1 EAGAIN (Resource temporarily unavailable)
fcntl(19, F_SETLK, {l_type=F_UNLCK, l_whence=SEEK_SET, l_start=125, l_len=1}) = 0
clock_nanosleep(CLOCK_REALTIME, 0, {tv_sec=0, tv_nsec=100000000}, ^Cstrace: Process 7408 detached
ludo@berlin ~$ sudo gdb -p 7408
[…]
(gdb) bt
#0  0x00007f2e2aa327a1 in clock_nanosleep@GLIBC_2.2.5 () from target:/gnu/store/fa6wj5bxkj5ll1d7292a70knmyl7a0cr-glibc-2.31/lib/libc.so.6
#1  0x00007f2e2aa37c03 in nanosleep () from target:/gnu/store/fa6wj5bxkj5ll1d7292a70knmyl7a0cr-glibc-2.31/lib/libc.so.6
#2  0x00007f2e2aa611a4 in usleep () from target:/gnu/store/fa6wj5bxkj5ll1d7292a70knmyl7a0cr-glibc-2.31/lib/libc.so.6
#3  0x00007f2e1e8245ea in unixSleep () from target:/gnu/store/807c6g9xqrxdjyhm8wm1r6jjjmc8q4vs-sqlite-3.31.1/lib/libsqlite3.so.0
#4  0x00007f2e1e81f56e in sqliteDefaultBusyCallback () from target:/gnu/store/807c6g9xqrxdjyhm8wm1r6jjjmc8q4vs-sqlite-3.31.1/lib/libsqlite3.so.0
#5  0x00007f2e1e81f5d9 in sqlite3InvokeBusyHandler () from target:/gnu/store/807c6g9xqrxdjyhm8wm1r6jjjmc8q4vs-sqlite-3.31.1/lib/libsqlite3.so.0
#6  0x00007f2e1e877ec1 in sqlite3BtreeBeginTrans () from target:/gnu/store/807c6g9xqrxdjyhm8wm1r6jjjmc8q4vs-sqlite-3.31.1/lib/libsqlite3.so.0
#7  0x00007f2e1e89fc64 in sqlite3VdbeExec () from target:/gnu/store/807c6g9xqrxdjyhm8wm1r6jjjmc8q4vs-sqlite-3.31.1/lib/libsqlite3.so.0
#8  0x00007f2e1e8a6d09 in sqlite3_step () from target:/gnu/store/807c6g9xqrxdjyhm8wm1r6jjjmc8q4vs-sqlite-3.31.1/lib/libsqlite3.so.0
#9  0x00007f2e1e8a7add in sqlite3_exec () from target:/gnu/store/807c6g9xqrxdjyhm8wm1r6jjjmc8q4vs-sqlite-3.31.1/lib/libsqlite3.so.0
#10 0x00007f2e2af0466d in ffi_call_unix64 () from target:/gnu/store/bw15z9kh9c65ycc2vbhl2izwfwfva7p1-libffi-3.3/lib/libffi.so.7
#11 0x00007f2e2af02ac0 in ffi_call_int () from target:/gnu/store/bw15z9kh9c65ycc2vbhl2izwfwfva7p1-libffi-3.3/lib/libffi.so.7
#12 0x00007f2e2aff148e in scm_i_foreign_call () from target:/gnu/store/18hp7flyb3yid3yp49i6qcdq0sbi5l1n-guile-3.0.2/lib/libguile-3.0.so.1
--8<---------------cut here---------------end--------------->8---

They loop pretty much indefinitely on this and nothing (or very little)
happens on the system.

I’ll revert this patch but I’m happy to hear what you think, Caleb.

Another reason to implement temp roots in Scheme, as it would allow us
to not open a connection to the daemon from ‘guix offload’!

Thanks,
Ludo’.

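The backtrace above shows a process sleeping in SQLite's default busy handler while it tries to begin a transaction. The following is a minimal, standalone Python sketch of that kind of contention; it is not Guix code, the function names are hypothetical, and it uses a throwaway database with SQLite's default settings rather than the store database's configuration.

```python
import os
import sqlite3
import tempfile


def second_writer_is_blocked(db_path, busy_timeout=0.1):
    """Return True if a second connection cannot begin a write transaction
    while a first connection is holding one open."""
    # isolation_level=None leaves transaction control to the explicit
    # BEGIN/ROLLBACK statements below.
    holder = sqlite3.connect(db_path, isolation_level=None)
    waiter = sqlite3.connect(db_path, timeout=busy_timeout, isolation_level=None)
    try:
        holder.execute("CREATE TABLE IF NOT EXISTS t (x INTEGER)")
        holder.execute("BEGIN IMMEDIATE")      # take the write lock and keep it
        try:
            waiter.execute("BEGIN IMMEDIATE")  # busy handler waits, then gives up
        except sqlite3.OperationalError as err:
            return "locked" in str(err)        # SQLITE_BUSY: "database is locked"
        waiter.execute("ROLLBACK")
        return False
    finally:
        if holder.in_transaction:
            holder.execute("ROLLBACK")
        holder.close()
        waiter.close()


def demo():
    """Run the scenario against a throwaway database file."""
    with tempfile.TemporaryDirectory() as tmp:
        return second_writer_is_blocked(os.path.join(tmp, "db.sqlite"))
```

Here the waiter gives up after `busy_timeout` seconds; a process with a much longer busy timeout would instead keep sleeping and retrying, which is consistent with the repeated `clock_nanosleep` calls in the trace.
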
From debbugs-submit-bounces@debbugs.gnu.org Thu May 28 04:50:52 2020
From: Caleb Ristvedt
To: Ludovic Courtès
Cc: guix-sysadmin@gnu.org, 41119@debbugs.gnu.org
Subject: Re: [bug#41119] [PATCH] fix some issues with (guix nar)
Date: Thu, 28 May 2020 03:50:36 -0500
Message-ID: <87mu5se943.fsf@cune.org>
In-Reply-To: <87k10xgi1t.fsf@gnu.org>

Ludovic Courtès writes:

> Hi,
>
> This change had an undesirable effect: the connection would be kept for
> the body of ‘with-temporary-store-file’, during which we’d call:
>
>   finalize-store-file -> register-path
>
> which accesses the database.  At this point, for each ‘guix offload’
> process, we’d thus have the database open twice: once for the session’s
> guix-daemon, and once for that ‘register-path’ call.

If the connection wasn't kept for the body of with-temporary-store-file,
the temporary store file wouldn't be protected from GC during the body
(the daemon treats unlocked temproots files as "stale"), thus rather
defeating the purpose. It makes sense, then, that the connection was
also kept for the body prior to this patch - indeed, unless Emacs's
parenthesis-matching capabilities are failing me, it appears that the
body is solidly within the 'with-store' form in
37edbc91e34fb5658261e637e6ffccdb381e5271.

> On berlin, the effect is that we see many ‘guix offload’ processes
> stalled because the SQLite database is busy:

... which makes this quite the mystery indeed. I assume you've tested
with the patch reverted and found that this issue goes away? If so, I am
very puzzled. One would expect that "database open twice" would tend to
have *fewer* contention issues than "database open at least twice".

AFAIK just having the database open doesn't by itself impose any
locks. The daemon process we're connected to should have it open, but
should just be blocked waiting for our next RPC. Database locks happen
when transactions are started (either explicitly or implicitly), and
implicitly-started transactions are automatically committed by SQLite
(specifically when the statement that started the transaction is either
reset or finalized).

The only loose end I can think of right now is that
call-with-transaction only catches exceptions of type 'sqlite-error, so
in theory if a different type of exception were to be thrown, it could
exit in a broken state where neither a commit nor a rollback has been
performed. Really it should catch all exception types, and use match in
the handler to pick out the sqlite-errors. If that were causing the
problems, though, we'd expect to see some errors appearing in the
offload output.

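The loose end described in the paragraph above (a transaction wrapper that only reacts to one exception type) can be illustrated outside Guix. This is a minimal Python sketch rather than the (guix store database) code; `call_with_transaction` here is a hypothetical stand-in that merely borrows the name, and it assumes a connection opened with `isolation_level=None` so that BEGIN and COMMIT are issued explicitly.

```python
import sqlite3


def call_with_transaction(conn, proc):
    """Run PROC(conn) between BEGIN IMMEDIATE and COMMIT.

    On *any* exception, not only sqlite3 errors, roll back and re-raise,
    so the connection is never left inside a half-finished transaction.
    """
    conn.execute("BEGIN IMMEDIATE")
    try:
        result = proc(conn)
    except BaseException:
        # SQLite may already have rolled back on its own for some errors;
        # only issue ROLLBACK if a transaction is still active.
        if conn.in_transaction:
            conn.execute("ROLLBACK")
        raise
    conn.execute("COMMIT")
    return result
```

Had the `except` clause named only `sqlite3.Error`, an unrelated exception raised by `proc` would propagate with the write transaction still open, and the lock would be held until the connection was closed or the transaction was ended some other way.
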
Actually, come to think of it, there could be another issue with call-with-transaction: if somehow it's possible for SQLITE_BUSY errors to occur despite the connection having succeeded with a 'begin immediate;' (which immediately starts a write transaction), then the rollback wouldn't occur, and what should be a failed transaction followed by a successful transaction becomes one long, restarted-in-the-middle transaction. I'm not sure if that's a problem in practice, though. And now that I look at it again, it turns out that most of our database query procedures in (guix store database) aren't finalizing their statements in case of a nonlocal exit... which would tend to happen if, for example, an SQLITE_BUSY error occurred. Which would cause the statement to not be finalized until the garbage collector got ahold of it. But due to statement caching the garbage collector likely won't get ahold of it until the database connection itself is destroyed. The wording at https://www.sqlite.org/lang_transaction.html makes me think that this shouldn't be an issue because the errors we'd expect all seem to roll back automatically, but if we somehow got one that didn't roll back automatically, there would potentially be an extended amount of time before the statement was finalized and the implicit transaction committed. Also, I've noticed that with the way that finalize-store-file is written, we actually already have a database open when we call register-path. This is because it's needed in order to call path-id, but the scope of that with-database form is rather larger than it needs to be. We may have a situation here where things go fine until a single SQLITE_BUSY error is produced by chance, and that causes more SQLITE_BUSY errors, and so on. In summary, there are many things I could imagine going wrong to cause / contribute to the observed behavior, but the patch, barring some absurd guile compilation bug, is not one of them. 
I do, however, think that (guix store database) needs some love.

> They loop pretty much indefinitely on this and nothing (or very little)
> happens on the system.

To be clear, the nothing-happening status is common to all processes
that use the database, including daemon processes? That's quite severe.

> I’ll revert this patch but I’m happy to hear what you think, Caleb.

If the data says it's causing those problems, I'd tend to agree with
that. I would really like to understand how, though, because even after
a few hours of brainstorming bizarre edge cases I still can't come up
with a satisfying explanation.

> Another reason to implement temp roots in Scheme, as it would allow us
> to not open a connection to the daemon from ‘guix offload’!

Soon™. Conceptually the code is there, I'm working towards a rebase that
tries to first make the rest of daemon-side guix compatible with fibers
- thread pools✓, eval-with-container✓, fibers-friendly waitpid✓, etc.

- reepca

From debbugs-submit-bounces@debbugs.gnu.org Thu May 28 06:33:08 2020
From: Ludovic Courtès
To: Caleb Ristvedt
Subject: Re: [bug#41119] [PATCH] fix some issues with (guix nar)
References: <87h7wsqu50.fsf@cune.org> <87k10xgi1t.fsf@gnu.org> <87mu5se943.fsf@cune.org>
Date: Thu, 28 May 2020 12:32:51 +0200
In-Reply-To: <87mu5se943.fsf@cune.org> (Caleb Ristvedt's message of "Thu, 28 May 2020 03:50:36 -0500")
Message-ID: <87imgge4do.fsf@gnu.org>
Cc: guix-sysadmin@gnu.org, 41119@debbugs.gnu.org

Hi!

Caleb Ristvedt wrote:

>> This change had an undesirable effect: the connection would be kept for
>> the body of ‘with-temporary-store-file’, during which we’d call:
>>
>>   finalize-store-file -> register-path
>>
>> which accesses the database. At this point, for each ‘guix offload’
>> process, we’d thus have the database open twice: once for the session’s
>> guix-daemon, and once for that ‘register-path’ call.
>
> If the connection wasn't kept for the body of with-temporary-store-file,
> the temporary store file wouldn't be protected from GC during the body
> (the daemon treats unlocked temproots files as "stale"), thus rather
> defeating the purpose. It makes sense, then, that the connection was
> also kept for the body prior to this patch - indeed, unless emacs's
> parenthesis-matching capabilities are failing me, it appears that the
> body is solidly within the 'with-store' form in
> 37edbc91e34fb5658261e637e6ffccdb381e5271.

Oh you’re right, sorry for the confusion.

>> On berlin, the effect is that we see many ‘guix offload’ processes
>> stalled because the SQLite database is busy:
>
> ... which makes this quite the mystery indeed. I assume you've tested
> with the patch reverted and found that this issue goes away?

No. I observed the behavior and looked for recent changes that could
cause the problem. But I guess I was tired and jumped to silly
conclusions.

> AFAIK just having the database open doesn't by itself impose any
> locks. The daemon process we're connected to should have it open, but
> should just be blocked waiting for our next RPC. Database locks happen
> when transactions are started (either explicitly or implicitly), and
> implicitly-started transactions are automatically committed by sqlite
> (specifically when the statement that started the transaction is either
> reset or finalized).
> The only loose end I can think of right now is that
> call-with-transaction only catches exceptions of type 'sqlite-error, so
> in theory if a different type of exception were to be thrown, it could
> exit in a broken state where neither a commit nor a rollback has been
> performed. Really it should catch all exception types, and use match in
> the handler to pick out the sqlite-errors. If that were causing the
> problems, though, we'd expect to see some errors appearing in the
> offload output.

Good point but yes, we’d see an error, and ‘guix offload’ would probably
exit right away.

> Actually, come to think of it, there could be another issue with
> call-with-transaction: if somehow it's possible for SQLITE_BUSY errors
> to occur despite the connection having succeeded with a 'begin
> immediate;' (which immediately starts a write transaction), then the
> rollback wouldn't occur, and what should be a failed transaction
> followed by a successful transaction becomes one long,
> restarted-in-the-middle transaction. I'm not sure if that's a problem in
> practice, though.
>
> And now that I look at it again, it turns out that most of our database
> query procedures in (guix store database) aren't finalizing their
> statements in case of a nonlocal exit... which would tend to happen if,
> for example, an SQLITE_BUSY error occurred. Which would cause the
> statement to not be finalized until the garbage collector got ahold of
> it. But due to statement caching the garbage collector likely won't get
> ahold of it until the database connection itself is destroyed. The
> wording at https://www.sqlite.org/lang_transaction.html makes me think
> that this shouldn't be an issue because the errors we'd expect all seem
> to roll back automatically, but if we somehow got one that didn't roll
> back automatically, there would potentially be an extended amount of
> time before the statement was finalized and the implicit transaction
> committed.
>
> Also, I've noticed that with the way that finalize-store-file is
> written, we actually already have a database open when we call
> register-path. This is because it's needed in order to call path-id, but
> the scope of that with-database form is rather larger than it needs to
> be.
>
> We may have a situation here where things go fine until a single
> SQLITE_BUSY error is produced by chance, and that causes more
> SQLITE_BUSY errors, and so on.

Hmm, sounds plausible.

> In summary, there are many things I could imagine going wrong to cause /
> contribute to the observed behavior, but the patch, barring some absurd
> guile compilation bug, is not one of them. I do, however, think that
> (guix store database) needs some love.

Yeah.

>> They loop pretty much indefinitely on this and nothing (or very little)
>> happens on the system.
>
> To be clear, the nothing-happening status is common to all processes
> that use the database, including daemon processes? That's quite severe.

I just did a random sample, but several offload processes were stuck
like the one I showed, and clients would usually get “database is
locked” messages from the daemon.

>> I’ll revert this patch but I’m happy to hear what you think, Caleb.
>
> If the data says it's causing those problems, I'd tend to agree with
> that. I would really like to understand how, though, because even after
> a few hours of brainstorming bizarre edge cases I still can't come up
> with a satisfying explanation.

No you’re right, my analysis was wrong. Further investigation needed…

>> Another reason to implement temp roots in Scheme, as it would allow us
>> to not open a connection to the daemon from ‘guix offload’!
>
> Soon™.
> Conceptually the code is there, I'm working towards a rebase that
> tries to first make the rest of daemon-side guix compatible with fibers
> - thread pools✓, eval-with-container✓, fibers-friendly waitpid✓, etc.

Neat! For master we could do with a simpler implementation, but we’ll
see.

Thanks,
Ludo’.

From debbugs-submit-bounces@debbugs.gnu.org Fri Jun 05 14:44:41 2020
From: Ludovic Courtès
To: control@debbugs.gnu.org
Subject: control message for bug #41119
Date: Fri, 05 Jun 2020 20:44:31 +0200
Message-Id: <87bllx1hf4.fsf@gnu.org>

tags 41119 fixed
close 41119
quit

From unknown Wed Jun 18 00:16:48 2025
From: Debbugs Internal Request
To: internal_control@debbugs.gnu.org
Subject: Internal Control
Message-Id: bug archived.
Date: Sat, 04 Jul 2020 11:24:04 +0000

# This is a fake control message.
#
# The action:
# bug archived.
thanks
# This fakemail brought to you by your local debbugs
# administrator