From unknown Fri Aug 15 12:50:33 2025 Content-Disposition: inline Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 X-Mailer: MIME-tools 5.509 (Entity 5.509) Content-Type: text/plain; charset=utf-8 From: bug#42023 <42023@debbugs.gnu.org> To: bug#42023 <42023@debbugs.gnu.org> Subject: Status: [PATCH] database: register-items: reduce transaction scope. Reply-To: bug#42023 <42023@debbugs.gnu.org> Date: Fri, 15 Aug 2025 19:50:33 +0000 retitle 42023 [PATCH] database: register-items: reduce transaction scope. reassign 42023 guix-patches submitter 42023 Christopher Baines severity 42023 normal tag 42023 fixed patch thanks From debbugs-submit-bounces@debbugs.gnu.org Tue Jun 23 12:37:03 2020 Received: (at submit) by debbugs.gnu.org; 23 Jun 2020 16:37:03 +0000 Received: from localhost ([127.0.0.1]:37026 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1jnlux-00021U-9H for submit@debbugs.gnu.org; Tue, 23 Jun 2020 12:37:03 -0400 Received: from lists.gnu.org ([209.51.188.17]:45702) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1jnluv-000215-Ef for submit@debbugs.gnu.org; Tue, 23 Jun 2020 12:37:01 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:43310) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1jnluv-00029C-6Y for guix-patches@gnu.org; Tue, 23 Jun 2020 12:37:01 -0400 Received: from mira.cbaines.net ([2a01:7e00:e000:2f8:fd4d:b5c7:13fb:3d27]:44423) by eggs.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1jnlup-0008GS-0E for guix-patches@gnu.org; Tue, 23 Jun 2020 12:37:00 -0400 Received: from localhost (unknown [46.237.174.39]) by mira.cbaines.net (Postfix) with ESMTPSA id D82EA27BBE1 for ; Tue, 23 Jun 2020 17:36:51 +0100 (BST) Received: from localhost (localhost [local]) by localhost (OpenSMTPD) with ESMTPA id 5d911ac2 for ; Tue, 23 Jun 2020 16:36:49 +0000 (UTC) From: Christopher Baines To: guix-patches@gnu.org Subject: [PATCH] database: register-items: reduce transaction scope. Date: Tue, 23 Jun 2020 17:36:49 +0100 Message-Id: <20200623163649.32444-1-mail@cbaines.net> X-Mailer: git-send-email 2.26.2 MIME-Version: 1.0 Content-Transfer-Encoding: 8bit Received-SPF: pass client-ip=2a01:7e00:e000:2f8:fd4d:b5c7:13fb:3d27; envelope-from=mail@cbaines.net; helo=mira.cbaines.net X-detected-operating-system: by eggs.gnu.org: First seen = 2020/06/23 12:36:52 X-ACL-Warn: Detected OS = ??? X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, SPF_HELO_PASS=-0.001, SPF_PASS=-0.001, UNPARSEABLE_RELAY=0.001 autolearn=_AUTOLEARN X-Spam_action: no action X-Spam-Score: -1.3 (-) X-Debbugs-Envelope-To: submit X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -2.3 (--) It was made transactional in a4678c6ba18d8dbd79d931f80426eebf61be7ebe, with the reasoning to prevent broken intermediate states from being visible. I think this means something like an entry being in ValidPaths, but the Refs not being inserted. Using a transaction for this makes sense, but I think using one single transaction for the whole register-items call is unnecessary to avoid broken states from being visible, and could block other writes to the store database while register-items is running. Because the deduplication and resetting timestamps happens within the transaction as well, even though these things don't involve the database, writes to the database will still be blocked while this is happening. To reduce the potential for register-items to block other writers to the database for extended periods, this commit moves the transaction to just wrap the call to sqlite-register. This is the one place where writes occur, so that should prevent the broken intermediate states issue above. The one difference this will make is some of the registered items will be visible to other connections while others may be still being added. I think this is OK, as it's equivalent to just registering different items. * guix/store/database.scm (register-items): Reduce transaction scope. --- guix/store/database.scm | 37 +++++++++++++++++++------------------ 1 file changed, 19 insertions(+), 18 deletions(-) diff --git a/guix/store/database.scm b/guix/store/database.scm index a38e4d7e52..767335bc0f 100644 --- a/guix/store/database.scm +++ b/guix/store/database.scm @@ -441,24 +441,25 @@ in the database; #f means \"now\". Write a progress report to LOG-PORT." (when reset-timestamps? (reset-timestamps real-file-name)) (let-values (((hash nar-size) (nar-sha256 real-file-name))) - (sqlite-register db #:path to-register - #:references (store-info-references item) - #:deriver (store-info-deriver item) - #:hash (string-append "sha256:" - (bytevector->base16-string hash)) - #:nar-size nar-size - #:time registration-time) + (call-with-retrying-transaction db + (lambda () + (sqlite-register db #:path to-register + #:references (store-info-references item) + #:deriver (store-info-deriver item) + #:hash (string-append + "sha256:" + (bytevector->base16-string hash)) + #:nar-size nar-size + #:time registration-time))) (when deduplicate? (deduplicate real-file-name hash #:store store-dir))))) - (call-with-retrying-transaction db - (lambda () - (let* ((prefix (format #f "registering ~a items" (length items))) - (progress (progress-reporter/bar (length items) - prefix log-port))) - (call-with-progress-reporter progress - (lambda (report) - (for-each (lambda (item) - (register db item) - (report)) - items))))))) + (let* ((prefix (format #f "registering ~a items" (length items))) + (progress (progress-reporter/bar (length items) + prefix log-port))) + (call-with-progress-reporter progress + (lambda (report) + (for-each (lambda (item) + (register db item) + (report)) + items))))) -- 2.26.2 From debbugs-submit-bounces@debbugs.gnu.org Tue Jun 23 18:15:42 2020 Received: (at 42023) by debbugs.gnu.org; 23 Jun 2020 22:15:42 +0000 Received: from localhost ([127.0.0.1]:37322 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1jnrCg-00084Y-4p for submit@debbugs.gnu.org; Tue, 23 Jun 2020 18:15:42 -0400 Received: from eggs.gnu.org ([209.51.188.92]:50624) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1jnrCd-00084H-Cd for 42023@debbugs.gnu.org; Tue, 23 Jun 2020 18:15:41 -0400 Received: from fencepost.gnu.org ([2001:470:142:3::e]:41073) by eggs.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1jnrCX-0006nb-Hq; Tue, 23 Jun 2020 18:15:33 -0400 Received: from [2a01:e0a:1d:7270:af76:b9b:ca24:c465] (port=58638 helo=ribbon) by fencepost.gnu.org with esmtpsa (TLS1.2:RSA_AES_256_CBC_SHA1:256) (Exim 4.82) (envelope-from ) id 1jnrCW-0005Qv-F3; Tue, 23 Jun 2020 18:15:33 -0400 From: =?utf-8?Q?Ludovic_Court=C3=A8s?= To: Christopher Baines Subject: Re: [bug#42023] [PATCH] database: register-items: reduce transaction scope. References: <20200623163649.32444-1-mail@cbaines.net> Date: Wed, 24 Jun 2020 00:15:30 +0200 In-Reply-To: <20200623163649.32444-1-mail@cbaines.net> (Christopher Baines's message of "Tue, 23 Jun 2020 17:36:49 +0100") Message-ID: <87366lxwcd.fsf@gnu.org> User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/26.3 (gnu/linux) MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable X-Spam-Score: -2.3 (--) X-Debbugs-Envelope-To: 42023 Cc: 42023@debbugs.gnu.org, Caleb Ristvedt X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -3.3 (---) Hi, (+Cc: reepca) Christopher Baines skribis: > It was made transactional in a4678c6ba18d8dbd79d931f80426eebf61be7ebe, wi= th > the reasoning to prevent broken intermediate states from being visible. I > think this means something like an entry being in ValidPaths, but the Ref= s not > being inserted. > > Using a transaction for this makes sense, but I think using one single > transaction for the whole register-items call is unnecessary to avoid bro= ken > states from being visible, and could block other writes to the store data= base > while register-items is running. Because the deduplication and resetting > timestamps happens within the transaction as well, even though these thin= gs > don't involve the database, writes to the database will still be blocked = while > this is happening. > > To reduce the potential for register-items to block other writers to the > database for extended periods, this commit moves the transaction to just = wrap > the call to sqlite-register. This is the one place where writes occur, so= that > should prevent the broken intermediate states issue above. The one differ= ence > this will make is some of the registered items will be visible to other > connections while others may be still being added. I think this is OK, as= it's > equivalent to just registering different items. > > * guix/store/database.scm (register-items): Reduce transaction scope. [...] > + (call-with-retrying-transaction db > + (lambda () ^^ Too much indentation (maybe we miss a rule in .dir-locals.el?). > + (sqlite-register db #:path to-register > + #:references (store-info-references item) > + #:deriver (store-info-deriver item) > + #:hash (string-append > + "sha256:" > + (bytevector->base16-string hash)) > + #:nar-size nar-size > + #:time registration-time))) I think it would be good to have a 2-line summary of the rationale right above =E2=80=98call-with-retrying-transaction=E2=80=99. Two questions: 1. Can another process come and fiddle with TO-REGISTER while we=E2=80=99= re still in =E2=80=98reset-timestamps=E2=80=99? Or can GC happen while w= e=E2=80=99re in =E2=80=98reset-timestamps=E2=80=99 and delete TO-REGISTER and remove i= t from the database? I think none of these scenarios can happen, as long as we=E2=80=99ve taken = the .lock file for TO-REGISTER before, like =E2=80=98finalize-store-file=E2=80= =99 does. 2. After the transaction, TO-REGISTER is considered valid. But are the effects of the on-going deduplication observable, due to non-atomicity of some operation? I think the =E2=80=98replace-with-link=E2=80=99 dance is atomic, so we shou= ld be fine. Thoughts? Ludo=E2=80=99. From debbugs-submit-bounces@debbugs.gnu.org Wed Jun 24 13:52:03 2020 Received: (at 42023) by debbugs.gnu.org; 24 Jun 2020 17:52:04 +0000 Received: from localhost ([127.0.0.1]:38856 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1jo9Z5-0000w7-EF for submit@debbugs.gnu.org; Wed, 24 Jun 2020 13:52:03 -0400 Received: from mail-io1-f65.google.com ([209.85.166.65]:41167) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1jo9Z3-0000vd-Ta for 42023@debbugs.gnu.org; Wed, 24 Jun 2020 13:52:02 -0400 Received: by mail-io1-f65.google.com with SMTP id o5so3121902iow.8 for <42023@debbugs.gnu.org>; Wed, 24 Jun 2020 10:52:01 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=cune-org.20150623.gappssmtp.com; s=20150623; h=from:to:cc:subject:in-reply-to:references:user-agent:date :message-id:mime-version; bh=qIuaswtnteYHfnXy1La8UAtAcxVT5kzAh7RuCRvzJSA=; b=ash13EVOu3gnfKUCCjXoTNzu7zmxmzmRzD9eDOno9KtaDs0syZZ9NAsbxRmGP7Y+Yy bsgs/hFcf3iBz1OWnpO9VUGJzRWZbzzZY/dy1kKRejXkWSKD/W0yjrRaar5n8INRkvdu 3hX8vX5w31ryDsjLMgt7gI6UFT13FggNU1vvS9nJ0eURFa4uofSKnyUeoEDv/YgH/6XL i8lphhBHbC3bOK+/7H/ez3hTh35CuQZd37mAAUvRWpVm+PaOEyUpw9E1kc7qWGlDFdDP QYZGD4UCXmYX5FDeuv5m5x1uaXZKvZ3/gdM1UmGm5LywxHL1OTuWaUqHZPeiOh0A3PuS ZBbQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:in-reply-to:references :user-agent:date:message-id:mime-version; bh=qIuaswtnteYHfnXy1La8UAtAcxVT5kzAh7RuCRvzJSA=; b=OQaC9bFyxrcvFxvPRBFr6dtswozJhVWOk/MADr2sPQ6JumXMmyAZVizM+s0efK65pU 4tVduZjOxQ3VEEYAOa3voiN3A5oXem6A1X4qy/VUEuTQy3cdySw73w3oQAuodm+HvPw+ YvtqBGUAU1uT5uNDqHi87fJdlO9ZVgQB+ui5zm6Sc0sXscogXQTkUtAhlJFMWsSyG3yA JzFGrxkbyrEKE2zoWx9rgnM1msKxQktFEvfT+BZkmohiiDzpOqf51phZox8yuII1Aa0X 9ZD+vCWO/idJ/e+8QrFhVBWUhlzEgLp6ZmPRQHcOcNnbewvaWS7tn8BeH8W7DVtynn/4 NFWw== X-Gm-Message-State: AOAM531e1ZNsnl42x4bvuMPKO7OfDAp8CO2B34cTryHNQCHqRA7Djn6x 0xW5LrwW7ChWNwcyGpTfQ+uMRhBNilXWEg== X-Google-Smtp-Source: ABdhPJzCePdhAq1B6ZXIJwS43hpXcypBgKMHHrzakU9kd7yxE38rvzgBku2wMcr9AUarGqBbhU720A== X-Received: by 2002:a05:6638:a83:: with SMTP id 3mr27946443jas.37.1593021114054; Wed, 24 Jun 2020 10:51:54 -0700 (PDT) Received: from GuixPotato ([208.89.170.24]) by smtp.gmail.com with ESMTPSA id f5sm11784799iog.49.2020.06.24.10.51.52 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 24 Jun 2020 10:51:53 -0700 (PDT) From: Caleb Ristvedt To: Ludovic =?utf-8?Q?Court=C3=A8s?= Subject: Re: [bug#42023] [PATCH] database: register-items: reduce transaction scope. In-Reply-To: <87366lxwcd.fsf@gnu.org> ("Ludovic \=\?utf-8\?Q\?Court\=C3\=A8s\=22'\?\= \=\?utf-8\?Q\?s\?\= message of "Wed, 24 Jun 2020 00:15:30 +0200") References: <20200623163649.32444-1-mail@cbaines.net> <87366lxwcd.fsf@gnu.org> User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/26.3 (gnu/linux) Date: Wed, 24 Jun 2020 12:51:48 -0500 Message-ID: <87v9jg4aiz.fsf@cune.org> MIME-Version: 1.0 Content-Type: multipart/signed; boundary="==-=-="; micalg=pgp-sha256; protocol="application/pgp-signature" X-Spam-Score: -0.0 (/) X-Debbugs-Envelope-To: 42023 Cc: 42023@debbugs.gnu.org, Christopher Baines X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -1.0 (-) --==-=-= Content-Type: multipart/mixed; boundary="=-=-=" --=-=-= Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable Ludovic Court=C3=A8s writes: >> + (call-with-retrying-transaction db >> + (lambda () > ^^ > Too much indentation (maybe we miss a rule in .dir-locals.el?). My bad, my understanding of our .dir-locals.el was more-or-less cargo culting until I read the lisp-indent-function documentation just now. The 'scheme-indent-function should be 1 for both call-with-transaction and call-with-retrying-transaction. Patch attached. > Two questions: > > 1. Can another process come and fiddle with TO-REGISTER while we=E2=80= =99re > still in =E2=80=98reset-timestamps=E2=80=99? Or can GC happen while= we=E2=80=99re in > =E2=80=98reset-timestamps=E2=80=99 and delete TO-REGISTER and remove= it from the > database? > > I think none of these scenarios can happen, as long as we=E2=80=99ve take= n the > .lock file for TO-REGISTER before, like =E2=80=98finalize-store-file=E2= =80=99 does. The lock file has no bearing on liveness of the path it locks, though the liveness of the path it locks *does* count as liveness for the lock file and for the .chroot file; see tryToDelete() in nix/libstore/gc.cc. (Well, semi-liveness; .lock and .chroot files won't show up in a list of live paths, but they will still be protected from deletion if their associated store item is a temp root) So general "fiddling" can't happen, but GC can. The responsibility for both locking and registering as temporary roots falls on the caller, currently, as I believe it should. We may want to document this responsibility in the docstring for register-items, though. So currently finalize-store-file is safe from clobbering by another process attempting to finalize to the same path, but not safe from garbage collection between when the temporary store file is renamed and when it is registered. It needs an add-temp-root prior to renaming. > > 2. After the transaction, TO-REGISTER is considered valid. But are > the effects of the on-going deduplication observable, due to > non-atomicity of some operation? Subdirectories of store items need to be made writable prior to renaming the temp link, so there will necessarily be a window of time during which various subdirectories will appear writable. I don't think this should cause problems; we already assume that the .lock file is held, so we should be the only process attempting to deduplicate it. On a related note, we should probably use dynamic-wind for setting and restoring the permissions in replace-with-link. Also, replace-with-link doesn't check whether the "containing directory" is the store like optimisePath_() does, so in theory if another process tried to make a permanent change to the store's permissions it could be clobbered when replace-with-link restores the permissions. I don't know of any instance where this could happen currently, but it's something to keep in mind. And, of course, there's the inherent visible change of deduplication - possible modification of inode number, but I don't see how that could cause problems with any reasonable programs. > I think the =E2=80=98replace-with-link=E2=80=99 dance is atomic, so we sh= ould be fine. > > Thoughts? replace-with-link is atomic, but it can fail if the "canonical" link in .links is deleted by the garbage collector between when its existence is checked in 'deduplicate' and when temp link creation in replace-with-link happens. Then ENOENT would be returned, and 'deduplicate' wouldn't catch that exception, nor would optimisePath_() in nix/libstore/optimise-store.cc. The proper behavior there, in my opinion, would be to retry the deduplication. Attached is a patch that makes 'deduplicate' do that. Also, while I'm perusing optimisePath_(), there's a minor bug in which EMLINK generated by renaming the temp link may still be treated as a fatal error. This is because errno may be clobbered by unlink() prior to being checked (according to the errno man page it can still be modified even if the call succeeds). In summary, there are potential issues, but nothing that should be affected by reducing the transaction scope AFAIK. =2D reepca --=-=-= Content-Type: text/x-patch Content-Disposition: attachment; filename=0001-deduplication-retry-on-ENOENT.patch Content-Transfer-Encoding: quoted-printable From=20461064da9e169df3dd939b734bb2860598d113f4 Mon Sep 17 00:00:00 2001 From: Caleb Ristvedt Date: Wed, 24 Jun 2020 00:56:50 -0500 Subject: [PATCH 1/2] deduplication: retry on ENOENT. It's possible for the garbage collector to remove the "canonical" link after it's been detected as existing by 'deduplicate'. This would cause an ENOENT error when replace-with-link attempts to create the temporary link. This changes it so that it will properly handle that by retrying. * guix/store/deduplication.scm (deduplicate): retry on ENOENT. =2D-- guix/store/deduplication.scm | 64 +++++++++++++++++++++++------------- 1 file changed, 41 insertions(+), 23 deletions(-) diff --git a/guix/store/deduplication.scm b/guix/store/deduplication.scm index 6784ee0b92..b6d94e49c2 100644 =2D-- a/guix/store/deduplication.scm +++ b/guix/store/deduplication.scm @@ -161,26 +161,44 @@ under STORE." (scandir* path)) (let ((link-file (string-append links-directory "/" (bytevector->nix-base32-string has= h)))) =2D (if (file-exists? link-file) =2D (replace-with-link link-file path =2D #:swap-directory links-directory) =2D (catch 'system-error =2D (lambda () =2D (link path link-file)) =2D (lambda args =2D (let ((errno (system-error-errno args))) =2D (cond ((=3D errno EEXIST) =2D ;; Someone else put an entry for PATH in =2D ;; LINKS-DIRECTORY before we could. Let's us= e it. =2D (replace-with-link path link-file =2D #:swap-directory links-dir= ectory)) =2D ((=3D errno ENOSPC) =2D ;; There's not enough room in the directory i= ndex for =2D ;; more entries in .links, but that's fine: w= e can =2D ;; just stop. =2D #f) =2D ((=3D errno EMLINK) =2D ;; PATH has reached the maximum number of lin= ks, but =2D ;; that's OK: we just can't deduplicate it mo= re. =2D #f) =2D (else (apply throw args))))))))))) + (let retry () + (if (file-exists? link-file) + (catch 'system-error + (lambda () + (replace-with-link link-file path + #:swap-directory links-directory)) + (lambda args + (if (and (=3D (system-error-errno args) ENOENT) + ;; ENOENT can be produced because: + ;; - LINK-FILE has missing directory componen= ts + ;; - LINKS-DIRECTORY has missing directory + ;; components + ;; - PATH has missing directory components + ;; + ;; the last two are errors, the first just + ;; requires a retry. + (file-exists? (dirname path)) + (file-exists? links-directory)) + (retry) + (apply throw args)))) + (catch 'system-error + (lambda () + (link path link-file)) + (lambda args + (let ((errno (system-error-errno args))) + (cond ((=3D errno EEXIST) + ;; Someone else put an entry for PATH in + ;; LINKS-DIRECTORY before we could. Let's use + ;; it. + (retry)) + ((=3D errno ENOSPC) + ;; There's not enough room in the directory i= ndex + ;; for more entries in .links, but that's fin= e: + ;; we can just stop. + #f) + ((=3D errno EMLINK) + ;; PATH has reached the maximum number of lin= ks, + ;; but that's OK: we just can't deduplicate it + ;; more. + #f) + (else (apply throw args)))))))))))) =2D-=20 2.26.2 --=-=-= Content-Type: text/x-patch Content-Disposition: attachment; filename=0002-.dir-locals.el-fix-call-with-retrying-transaction-in.patch Content-Transfer-Encoding: quoted-printable From=20a14ff0a9dab4d73680befaf9d244d6cce0a73ab3 Mon Sep 17 00:00:00 2001 From: Caleb Ristvedt Date: Wed, 24 Jun 2020 01:00:40 -0500 Subject: [PATCH 2/2] .dir-locals.el: fix call-with-{retrying-}transaction indenting. * .dir-locals.el (call-with-transaction, call-with-retrying-transaction): change scheme-indent-function property from 2 to 1. =2D-- .dir-locals.el | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/.dir-locals.el b/.dir-locals.el index 92979fc5ed..84e2738bca 100644 =2D-- a/.dir-locals.el +++ b/.dir-locals.el @@ -88,9 +88,9 @@ (eval . (put 'let-system 'scheme-indent-function 1)) =20 (eval . (put 'with-database 'scheme-indent-function 2)) =2D (eval . (put 'call-with-transaction 'scheme-indent-function 2)) + (eval . (put 'call-with-transaction 'scheme-indent-function 1)) (eval . (put 'with-statement 'scheme-indent-function 3)) =2D (eval . (put 'call-with-retrying-transaction 'scheme-indent-function = 2)) + (eval . (put 'call-with-retrying-transaction 'scheme-indent-function 1)) (eval . (put 'call-with-savepoint 'scheme-indent-function 1)) (eval . (put 'call-with-retrying-savepoint 'scheme-indent-function 1)) =20 =2D-=20 2.26.2 --=-=-=-- --==-=-= Content-Type: application/pgp-signature; name="signature.asc" -----BEGIN PGP SIGNATURE----- iQEzBAEBCAAdFiEEdNapMPRLm4SepVYGwWaqSV9/GJwFAl7zkrQACgkQwWaqSV9/ GJxIRgf/QT0sGFVpqv3WM9fzF7pNxR48ahSkyE2Y5bTnQBCzqVpwSu2Tf8l6j5GR Fyw25tTWvkid7cuXwsfXBP5/Qe1eh+J2N1XO73idmEoXXubXlz6/eqYLi0Q0KJ3+ 2tDQ0TKiugPvyv9r7UhMNxnGXJgaoJqcmSXLi816Vv2NgV7E7broeulY6e7Txvm4 uXAYbMdV5GiXxJ7XYSgIPiEuP/H3mGjAFScZDlt1kkdEdik523pXHKTaYDB2QBG8 eXd+Uxv3SxS7LDl8u/UxqPdOZGPSBDFKYtR+0SS004F/EC7QmVKc2GiEUobHgu+5 kbKZ0wDcUSNnzzS3bNM2ovQO/9CF+g== =Ueya -----END PGP SIGNATURE----- --==-=-=-- From debbugs-submit-bounces@debbugs.gnu.org Thu Jun 25 03:45:28 2020 Received: (at 42023) by debbugs.gnu.org; 25 Jun 2020 07:45:28 +0000 Received: from localhost ([127.0.0.1]:39663 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1joMZc-0006mx-8l for submit@debbugs.gnu.org; Thu, 25 Jun 2020 03:45:28 -0400 Received: from eggs.gnu.org ([209.51.188.92]:43754) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1joMZX-0006mf-2M for 42023@debbugs.gnu.org; Thu, 25 Jun 2020 03:45:26 -0400 Received: from fencepost.gnu.org ([2001:470:142:3::e]:35253) by eggs.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1joMZR-0000sj-AL; Thu, 25 Jun 2020 03:45:17 -0400 Received: from [2a01:e0a:1d:7270:af76:b9b:ca24:c465] (port=33090 helo=ribbon) by fencepost.gnu.org with esmtpsa (TLS1.2:RSA_AES_256_CBC_SHA1:256) (Exim 4.82) (envelope-from ) id 1joMZQ-0003Ss-Aj; Thu, 25 Jun 2020 03:45:16 -0400 From: =?utf-8?Q?Ludovic_Court=C3=A8s?= To: Caleb Ristvedt Subject: Re: [bug#42023] [PATCH] database: register-items: reduce transaction scope. References: <20200623163649.32444-1-mail@cbaines.net> <87366lxwcd.fsf@gnu.org> <87v9jg4aiz.fsf@cune.org> X-URL: http://www.fdn.fr/~lcourtes/ X-Revolutionary-Date: 8 Messidor an 228 de la =?utf-8?Q?R=C3=A9volution?= X-PGP-Key-ID: 0x090B11993D9AEBB5 X-PGP-Key: http://www.fdn.fr/~lcourtes/ludovic.asc X-PGP-Fingerprint: 3CE4 6455 8A84 FDC6 9DB4 0CFB 090B 1199 3D9A EBB5 X-OS: x86_64-pc-linux-gnu Date: Thu, 25 Jun 2020 09:45:14 +0200 In-Reply-To: <87v9jg4aiz.fsf@cune.org> (Caleb Ristvedt's message of "Wed, 24 Jun 2020 12:51:48 -0500") Message-ID: <87eeq3vbat.fsf@gnu.org> User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/26.3 (gnu/linux) MIME-Version: 1.0 Content-Type: multipart/mixed; boundary="=-=-=" X-Spam-Score: -2.3 (--) X-Debbugs-Envelope-To: 42023 Cc: 42023@debbugs.gnu.org, Christopher Baines X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -3.3 (---) --=-=-= Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable Hi, Caleb Ristvedt skribis: > Ludovic Court=C3=A8s writes: [...] >> Two questions: >> >> 1. Can another process come and fiddle with TO-REGISTER while we=E2=80= =99re >> still in =E2=80=98reset-timestamps=E2=80=99? Or can GC happen whil= e we=E2=80=99re in >> =E2=80=98reset-timestamps=E2=80=99 and delete TO-REGISTER and remov= e it from the >> database? > >> >> I think none of these scenarios can happen, as long as we=E2=80=99ve tak= en the >> .lock file for TO-REGISTER before, like =E2=80=98finalize-store-file=E2= =80=99 does. > > The lock file has no bearing on liveness of the path it locks, though > the liveness of the path it locks *does* count as liveness for the lock > file and for the .chroot file; see tryToDelete() in nix/libstore/gc.cc. > > (Well, semi-liveness; .lock and .chroot files won't show up in a list of > live paths, but they will still be protected from deletion if their > associated store item is a temp root) > > So general "fiddling" can't happen, but GC can. The responsibility for > both locking and registering as temporary roots falls on the caller, > currently, as I believe it should. We may want to document this > responsibility in the docstring for register-items, though. Yes, it would be good to add a sentence to document it. > So currently finalize-store-file is safe from clobbering by another > process attempting to finalize to the same path, but not safe from > garbage collection between when the temporary store file is renamed and > when it is registered. It needs an add-temp-root prior to renaming. Ah, so we=E2=80=99d need to do that before applying the patch that reduces = the scope of the transaction, right? >> 2. After the transaction, TO-REGISTER is considered valid. But are >> the effects of the on-going deduplication observable, due to >> non-atomicity of some operation? > > Subdirectories of store items need to be made writable prior to renaming > the temp link, so there will necessarily be a window of time during > which various subdirectories will appear writable. I don't think this > should cause problems; we already assume that the .lock file is held, so > we should be the only process attempting to deduplicate it. On a related > note, we should probably use dynamic-wind for setting and restoring the > permissions in replace-with-link. Yes. Here=E2=80=99s a patch for =E2=80=98dynamic-wind=E2=80=99: --=-=-= Content-Type: text/x-patch Content-Disposition: inline diff --git a/.dir-locals.el b/.dir-locals.el index 92979fc5ed..155255a770 100644 --- a/.dir-locals.el +++ b/.dir-locals.el @@ -37,6 +37,7 @@ (eval . (put 'with-file-lock 'scheme-indent-function 1)) (eval . (put 'with-file-lock/no-wait 'scheme-indent-function 1)) (eval . (put 'with-profile-lock 'scheme-indent-function 1)) + (eval . (put 'with-writable-file 'scheme-indent-function 1)) (eval . (put 'package 'scheme-indent-function 0)) (eval . (put 'origin 'scheme-indent-function 0)) diff --git a/guix/store/deduplication.scm b/guix/store/deduplication.scm index 6784ee0b92..af52c03370 100644 --- a/guix/store/deduplication.scm +++ b/guix/store/deduplication.scm @@ -94,6 +94,20 @@ LINK-PREFIX." (try (tempname-in link-prefix)) (apply throw args)))))) +(define (call-with-writable-file file thunk) + (let ((stat (lstat file))) + (dynamic-wind + (lambda () + (make-file-writable file)) + thunk + (lambda () + (set-file-time file stat) + (chmod file (stat:mode stat)))))) + +(define-syntax-rule (with-writable-file file exp ...) + "Make FILE writable for the dynamic extent of EXP..." + (call-with-writable-file file (lambda () exp ...))) + ;; There are 3 main kinds of errors we can get from hardlinking: "Too many ;; things link to this" (EMLINK), "this link already exists" (EEXIST), and ;; "can't fit more stuff in this directory" (ENOSPC). @@ -120,20 +134,14 @@ Note: TARGET, TO-REPLACE, and SWAP-DIRECTORY must be on the same file system." ;; If we couldn't create TEMP-LINK, that's OK: just don't do the ;; replacement, which means TO-REPLACE won't be deduplicated. (when temp-link - (let* ((parent (dirname to-replace)) - (stat (stat parent))) - (make-file-writable parent) + (with-writable-file (dirname to-replace) (catch 'system-error (lambda () (rename-file temp-link to-replace)) (lambda args (delete-file temp-link) (unless (= EMLINK (system-error-errno args)) - (apply throw args)))) - - ;; Restore PARENT's mtime and permissions. - (set-file-time parent stat) - (chmod parent (stat:mode stat))))) + (apply throw args))))))) (define* (deduplicate path hash #:key (store %store-directory)) "Check if a store item with sha256 hash HASH already exists. If so, --=-=-= Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable > Also, replace-with-link doesn't check whether the "containing directory" > is the store like optimisePath_() does, so in theory if another process > tried to make a permanent change to the store's permissions it could be > clobbered when replace-with-link restores the permissions. I don't know > of any instance where this could happen currently, but it's something to > keep in mind. I guess we should also avoid changing permissions on /gnu/store, it would be wiser. > And, of course, there's the inherent visible change of deduplication - > possible modification of inode number, but I don't see how that could > cause problems with any reasonable programs. Yes, that=E2=80=99s fine. >> I think the =E2=80=98replace-with-link=E2=80=99 dance is atomic, so we s= hould be fine. >> >> Thoughts? > > replace-with-link is atomic, but it can fail if the "canonical" link in > .links is deleted by the garbage collector between when its existence is > checked in 'deduplicate' and when temp link creation in > replace-with-link happens. Then ENOENT would be returned, and > 'deduplicate' wouldn't catch that exception, nor would optimisePath_() > in nix/libstore/optimise-store.cc. > > The proper behavior there, in my opinion, would be to retry the > deduplication. Attached is a patch that makes 'deduplicate' do that. > > Also, while I'm perusing optimisePath_(), there's a minor bug in which > EMLINK generated by renaming the temp link may still be treated as a > fatal error. This is because errno may be clobbered by unlink() prior to > being checked (according to the errno man page it can still be modified > even if the call succeeds). Indeed, good catch! > From 461064da9e169df3dd939b734bb2860598d113f4 Mon Sep 17 00:00:00 2001 > From: Caleb Ristvedt > Date: Wed, 24 Jun 2020 00:56:50 -0500 > Subject: [PATCH 1/2] deduplication: retry on ENOENT. > > It's possible for the garbage collector to remove the "canonical" link af= ter > it's been detected as existing by 'deduplicate'. This would cause an ENO= ENT > error when replace-with-link attempts to create the temporary link. This > changes it so that it will properly handle that by retrying. > > * guix/store/deduplication.scm (deduplicate): retry on ENOENT. > --- > guix/store/deduplication.scm | 64 +++++++++++++++++++++++------------- > 1 file changed, 41 insertions(+), 23 deletions(-) > > diff --git a/guix/store/deduplication.scm b/guix/store/deduplication.scm > index 6784ee0b92..b6d94e49c2 100644 > --- a/guix/store/deduplication.scm > +++ b/guix/store/deduplication.scm > @@ -161,26 +161,44 @@ under STORE." > (scandir* path)) > (let ((link-file (string-append links-directory "/" > (bytevector->nix-base32-string h= ash)))) > - (if (file-exists? link-file) > - (replace-with-link link-file path > - #:swap-directory links-directory) > - (catch 'system-error > - (lambda () > - (link path link-file)) > - (lambda args > - (let ((errno (system-error-errno args))) > - (cond ((=3D errno EEXIST) > - ;; Someone else put an entry for PATH in > - ;; LINKS-DIRECTORY before we could. Let's us= e it. > - (replace-with-link path link-file > - #:swap-directory links-dir= ectory)) > - ((=3D errno ENOSPC) > - ;; There's not enough room in the directory i= ndex for > - ;; more entries in .links, but that's fine: w= e can > - ;; just stop. > - #f) > - ((=3D errno EMLINK) > - ;; PATH has reached the maximum number of lin= ks, but > - ;; that's OK: we just can't deduplicate it mo= re. > - #f) > - (else (apply throw args))))))))))) > + (let retry () > + (if (file-exists? link-file) > + (catch 'system-error > + (lambda () > + (replace-with-link link-file path > + #:swap-directory links-directory)) > + (lambda args > + (if (and (=3D (system-error-errno args) ENOENT) > + ;; ENOENT can be produced because: > + ;; - LINK-FILE has missing directory compon= ents > + ;; - LINKS-DIRECTORY has missing directory > + ;; components > + ;; - PATH has missing directory components > + ;; > + ;; the last two are errors, the first just > + ;; requires a retry. > + (file-exists? (dirname path)) > + (file-exists? links-directory)) > + (retry) > + (apply throw args)))) I feel like there are TOCTTOU issues here and redundant =E2=80=98stat=E2=80= =99 calls, plus the risk of catching a =E2=80=98system-error=E2=80=99 coming from =E2= =80=9Csomewhere else.=E2=80=9D What about baking this logic directly in =E2=80=98replace-with-link=E2=80= =99, and replacing =E2=80=98file-exists?=E2=80=99 calls by 'system-error handling? > From a14ff0a9dab4d73680befaf9d244d6cce0a73ab3 Mon Sep 17 00:00:00 2001 > From: Caleb Ristvedt > Date: Wed, 24 Jun 2020 01:00:40 -0500 > Subject: [PATCH 2/2] .dir-locals.el: fix call-with-{retrying-}transaction > indenting. > > * .dir-locals.el (call-with-transaction, call-with-retrying-transaction): > change scheme-indent-function property from 2 to 1. LGTM! Thanks for your quick feedback and thorough analyses! Ludo=E2=80=99. --=-=-=-- From debbugs-submit-bounces@debbugs.gnu.org Sun Jul 26 11:31:38 2020 Received: (at 42023) by debbugs.gnu.org; 26 Jul 2020 15:31:38 +0000 Received: from localhost ([127.0.0.1]:53259 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1jzicj-0001Ix-Dz for submit@debbugs.gnu.org; Sun, 26 Jul 2020 11:31:37 -0400 Received: from eggs.gnu.org ([209.51.188.92]:45794) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1jzicg-0001Ii-Dw for 42023@debbugs.gnu.org; Sun, 26 Jul 2020 11:31:35 -0400 Received: from fencepost.gnu.org ([2001:470:142:3::e]:36326) by eggs.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1jzicb-0002uR-07; Sun, 26 Jul 2020 11:31:29 -0400 Received: from [2a01:e0a:1d:7270:af76:b9b:ca24:c465] (port=35368 helo=ribbon) by fencepost.gnu.org with esmtpsa (TLS1.2:RSA_AES_256_CBC_SHA1:256) (Exim 4.82) (envelope-from ) id 1jzicZ-0000yS-OG; Sun, 26 Jul 2020 11:31:28 -0400 From: =?utf-8?Q?Ludovic_Court=C3=A8s?= To: Caleb Ristvedt Subject: Re: [bug#42023] [PATCH] database: register-items: reduce transaction scope. References: <20200623163649.32444-1-mail@cbaines.net> <87366lxwcd.fsf@gnu.org> <87v9jg4aiz.fsf@cune.org> <87eeq3vbat.fsf@gnu.org> Date: Sun, 26 Jul 2020 17:31:25 +0200 In-Reply-To: <87eeq3vbat.fsf@gnu.org> ("Ludovic \=\?utf-8\?Q\?Court\=C3\=A8s\=22'\?\= \=\?utf-8\?Q\?s\?\= message of "Thu, 25 Jun 2020 09:45:14 +0200") Message-ID: <87tuxu2sz6.fsf@gnu.org> User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/26.3 (gnu/linux) MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable X-Spam-Score: -2.3 (--) X-Debbugs-Envelope-To: 42023 Cc: 42023@debbugs.gnu.org, Christopher Baines X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -3.3 (---) Hi reepca, Did you have time to look into this (see below)? I still see a lot of contention on /var/guix/db/db.sqlite-shm on berlin and I feel like these changes would be welcome. :-) Thanks, Ludo=E2=80=99. Ludovic Court=C3=A8s skribis: > Hi, > > Caleb Ristvedt skribis: > >> Ludovic Court=C3=A8s writes: > > [...] > >>> Two questions: >>> >>> 1. Can another process come and fiddle with TO-REGISTER while we=E2= =80=99re >>> still in =E2=80=98reset-timestamps=E2=80=99? Or can GC happen whi= le we=E2=80=99re in >>> =E2=80=98reset-timestamps=E2=80=99 and delete TO-REGISTER and remo= ve it from the >>> database? >> >>> >>> I think none of these scenarios can happen, as long as we=E2=80=99ve ta= ken the >>> .lock file for TO-REGISTER before, like =E2=80=98finalize-store-file=E2= =80=99 does. >> >> The lock file has no bearing on liveness of the path it locks, though >> the liveness of the path it locks *does* count as liveness for the lock >> file and for the .chroot file; see tryToDelete() in nix/libstore/gc.cc. >> >> (Well, semi-liveness; .lock and .chroot files won't show up in a list of >> live paths, but they will still be protected from deletion if their >> associated store item is a temp root) >> >> So general "fiddling" can't happen, but GC can. The responsibility for >> both locking and registering as temporary roots falls on the caller, >> currently, as I believe it should. We may want to document this >> responsibility in the docstring for register-items, though. > > Yes, it would be good to add a sentence to document it. > >> So currently finalize-store-file is safe from clobbering by another >> process attempting to finalize to the same path, but not safe from >> garbage collection between when the temporary store file is renamed and >> when it is registered. It needs an add-temp-root prior to renaming. > > Ah, so we=E2=80=99d need to do that before applying the patch that reduce= s the > scope of the transaction, right? > >>> 2. After the transaction, TO-REGISTER is considered valid. But are >>> the effects of the on-going deduplication observable, due to >>> non-atomicity of some operation? >> >> Subdirectories of store items need to be made writable prior to renaming >> the temp link, so there will necessarily be a window of time during >> which various subdirectories will appear writable. I don't think this >> should cause problems; we already assume that the .lock file is held, so >> we should be the only process attempting to deduplicate it. On a related >> note, we should probably use dynamic-wind for setting and restoring the >> permissions in replace-with-link. > > Yes. Here=E2=80=99s a patch for =E2=80=98dynamic-wind=E2=80=99: > > diff --git a/.dir-locals.el b/.dir-locals.el > index 92979fc5ed..155255a770 100644 > --- a/.dir-locals.el > +++ b/.dir-locals.el > @@ -37,6 +37,7 @@ > (eval . (put 'with-file-lock 'scheme-indent-function 1)) > (eval . (put 'with-file-lock/no-wait 'scheme-indent-function 1)) > (eval . (put 'with-profile-lock 'scheme-indent-function 1)) > + (eval . (put 'with-writable-file 'scheme-indent-function 1)) >=20=20 > (eval . (put 'package 'scheme-indent-function 0)) > (eval . (put 'origin 'scheme-indent-function 0)) > diff --git a/guix/store/deduplication.scm b/guix/store/deduplication.scm > index 6784ee0b92..af52c03370 100644 > --- a/guix/store/deduplication.scm > +++ b/guix/store/deduplication.scm > @@ -94,6 +94,20 @@ LINK-PREFIX." > (try (tempname-in link-prefix)) > (apply throw args)))))) >=20=20 > +(define (call-with-writable-file file thunk) > + (let ((stat (lstat file))) > + (dynamic-wind > + (lambda () > + (make-file-writable file)) > + thunk > + (lambda () > + (set-file-time file stat) > + (chmod file (stat:mode stat)))))) > + > +(define-syntax-rule (with-writable-file file exp ...) > + "Make FILE writable for the dynamic extent of EXP..." > + (call-with-writable-file file (lambda () exp ...))) > + > ;; There are 3 main kinds of errors we can get from hardlinking: "Too ma= ny > ;; things link to this" (EMLINK), "this link already exists" (EEXIST), a= nd > ;; "can't fit more stuff in this directory" (ENOSPC). > @@ -120,20 +134,14 @@ Note: TARGET, TO-REPLACE, and SWAP-DIRECTORY must b= e on the same file system." > ;; If we couldn't create TEMP-LINK, that's OK: just don't do the > ;; replacement, which means TO-REPLACE won't be deduplicated. > (when temp-link > - (let* ((parent (dirname to-replace)) > - (stat (stat parent))) > - (make-file-writable parent) > + (with-writable-file (dirname to-replace) > (catch 'system-error > (lambda () > (rename-file temp-link to-replace)) > (lambda args > (delete-file temp-link) > (unless (=3D EMLINK (system-error-errno args)) > - (apply throw args)))) > - > - ;; Restore PARENT's mtime and permissions. > - (set-file-time parent stat) > - (chmod parent (stat:mode stat))))) > + (apply throw args))))))) >=20=20 > (define* (deduplicate path hash #:key (store %store-directory)) > "Check if a store item with sha256 hash HASH already exists. If so, > > >> Also, replace-with-link doesn't check whether the "containing directory" >> is the store like optimisePath_() does, so in theory if another process >> tried to make a permanent change to the store's permissions it could be >> clobbered when replace-with-link restores the permissions. I don't know >> of any instance where this could happen currently, but it's something to >> keep in mind. > > I guess we should also avoid changing permissions on /gnu/store, it > would be wiser. > >> And, of course, there's the inherent visible change of deduplication - >> possible modification of inode number, but I don't see how that could >> cause problems with any reasonable programs. > > Yes, that=E2=80=99s fine. > >>> I think the =E2=80=98replace-with-link=E2=80=99 dance is atomic, so we = should be fine. >>> >>> Thoughts? >> >> replace-with-link is atomic, but it can fail if the "canonical" link in >> .links is deleted by the garbage collector between when its existence is >> checked in 'deduplicate' and when temp link creation in >> replace-with-link happens. Then ENOENT would be returned, and >> 'deduplicate' wouldn't catch that exception, nor would optimisePath_() >> in nix/libstore/optimise-store.cc. >> >> The proper behavior there, in my opinion, would be to retry the >> deduplication. Attached is a patch that makes 'deduplicate' do that. >> >> Also, while I'm perusing optimisePath_(), there's a minor bug in which >> EMLINK generated by renaming the temp link may still be treated as a >> fatal error. This is because errno may be clobbered by unlink() prior to >> being checked (according to the errno man page it can still be modified >> even if the call succeeds). > > Indeed, good catch! > >> From 461064da9e169df3dd939b734bb2860598d113f4 Mon Sep 17 00:00:00 2001 >> From: Caleb Ristvedt >> Date: Wed, 24 Jun 2020 00:56:50 -0500 >> Subject: [PATCH 1/2] deduplication: retry on ENOENT. >> >> It's possible for the garbage collector to remove the "canonical" link a= fter >> it's been detected as existing by 'deduplicate'. This would cause an EN= OENT >> error when replace-with-link attempts to create the temporary link. This >> changes it so that it will properly handle that by retrying. >> >> * guix/store/deduplication.scm (deduplicate): retry on ENOENT. >> --- >> guix/store/deduplication.scm | 64 +++++++++++++++++++++++------------- >> 1 file changed, 41 insertions(+), 23 deletions(-) >> >> diff --git a/guix/store/deduplication.scm b/guix/store/deduplication.scm >> index 6784ee0b92..b6d94e49c2 100644 >> --- a/guix/store/deduplication.scm >> +++ b/guix/store/deduplication.scm >> @@ -161,26 +161,44 @@ under STORE." >> (scandir* path)) >> (let ((link-file (string-append links-directory "/" >> (bytevector->nix-base32-string = hash)))) >> - (if (file-exists? link-file) >> - (replace-with-link link-file path >> - #:swap-directory links-directory) >> - (catch 'system-error >> - (lambda () >> - (link path link-file)) >> - (lambda args >> - (let ((errno (system-error-errno args))) >> - (cond ((=3D errno EEXIST) >> - ;; Someone else put an entry for PATH in >> - ;; LINKS-DIRECTORY before we could. Let's u= se it. >> - (replace-with-link path link-file >> - #:swap-directory links-di= rectory)) >> - ((=3D errno ENOSPC) >> - ;; There's not enough room in the directory = index for >> - ;; more entries in .links, but that's fine: = we can >> - ;; just stop. >> - #f) >> - ((=3D errno EMLINK) >> - ;; PATH has reached the maximum number of li= nks, but >> - ;; that's OK: we just can't deduplicate it m= ore. >> - #f) >> - (else (apply throw args))))))))))) >> + (let retry () >> + (if (file-exists? link-file) >> + (catch 'system-error >> + (lambda () >> + (replace-with-link link-file path >> + #:swap-directory links-directory= )) >> + (lambda args >> + (if (and (=3D (system-error-errno args) ENOENT) >> + ;; ENOENT can be produced because: >> + ;; - LINK-FILE has missing directory compo= nents >> + ;; - LINKS-DIRECTORY has missing directory >> + ;; components >> + ;; - PATH has missing directory components >> + ;; >> + ;; the last two are errors, the first just >> + ;; requires a retry. >> + (file-exists? (dirname path)) >> + (file-exists? links-directory)) >> + (retry) >> + (apply throw args)))) > > I feel like there are TOCTTOU issues here and redundant =E2=80=98stat=E2= =80=99 calls, > plus the risk of catching a =E2=80=98system-error=E2=80=99 coming from = =E2=80=9Csomewhere else.=E2=80=9D > > What about baking this logic directly in =E2=80=98replace-with-link=E2=80= =99, and > replacing =E2=80=98file-exists?=E2=80=99 calls by 'system-error handling? > >> From a14ff0a9dab4d73680befaf9d244d6cce0a73ab3 Mon Sep 17 00:00:00 2001 >> From: Caleb Ristvedt >> Date: Wed, 24 Jun 2020 01:00:40 -0500 >> Subject: [PATCH 2/2] .dir-locals.el: fix call-with-{retrying-}transaction >> indenting. >> >> * .dir-locals.el (call-with-transaction, call-with-retrying-transaction): >> change scheme-indent-function property from 2 to 1. > > LGTM! > > Thanks for your quick feedback and thorough analyses! > > Ludo=E2=80=99. From debbugs-submit-bounces@debbugs.gnu.org Sun Aug 09 00:14:23 2020 Received: (at 42023) by debbugs.gnu.org; 9 Aug 2020 04:14:23 +0000 Received: from localhost ([127.0.0.1]:59828 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1k4cj0-00006x-KT for submit@debbugs.gnu.org; Sun, 09 Aug 2020 00:14:23 -0400 Received: from mail-il1-f178.google.com ([209.85.166.178]:40404) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1k4civ-00006g-Uw for 42023@debbugs.gnu.org; Sun, 09 Aug 2020 00:14:20 -0400 Received: by mail-il1-f178.google.com with SMTP id x1so4980577ilp.7 for <42023@debbugs.gnu.org>; Sat, 08 Aug 2020 21:14:17 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=cune-org.20150623.gappssmtp.com; s=20150623; h=from:to:cc:subject:references:date:in-reply-to:message-id :user-agent:mime-version; bh=u4Ih1rN7QVYP7F1CB2xXuvtuPRWLWtBm8PnxjEMBCrs=; b=ke+aKyIV0amP+RYey8a/OxDllry3fNTSwukeENeOKdnzTE+CfAES5stmjI9GoZMpDa oDSHcBmKCCJlYRcDC3tBoOosJeodb7l5LaEqlyw6noIx3PvQJdde9TzC8/B+unSiCrCl vHF5EPok/scAN+Pvnr/UIek6o8xN0khoaP6teIBSQAWLcvXhCEOKKt62BTKujWPCcoJJ D0fDSZZyywwXJg96GYpxDD8cXz/SrQLGAhO3lHugPx2Cy6cjc9zRbiRi5V8DEvwtUQZo Q6dE8GxVvv5X47LWDOAqKl1IHhw9CyLHboZZJWv8rANX8Qpfv34FDM1MJuFWocyLhgDZ 8hYA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:references:date:in-reply-to :message-id:user-agent:mime-version; bh=u4Ih1rN7QVYP7F1CB2xXuvtuPRWLWtBm8PnxjEMBCrs=; b=scIT7gYr8jsXx000oqRHSUc6D02yvLcnO/JjC7WL9qNR7Rkct3RFa/chIw95EiLhQA 17VNvSq1HrEFa6epYP+l2JP+pXHYemKZmDuQbaBLD7R+M08aLiZC/JQUMHxS8NYbjtRe FwH/7CnnoQWAQPuy3Yr0fK5J0sTA/urszaE4cfUfFwyV7MHTGa9hld+aCdWKnR4eFnXD afgUEVKSGKQAMqu+nKgoDgjr36VugGPrOgFk1uDqla7ioY6F4fIf9Gt7EeSzYBOg8Nk7 on0lqe0aAchfFEwpznzwD08VKQp/ky5wVir98E0zI5CnM4bZMDy52XIHDdEPsaKbhdCU biyQ== X-Gm-Message-State: AOAM531isYPWsYQ3+XPCLnSzHs4jgNUWWKj5OvbdT7+ZRh184KHuaBv6 i/9iCwH+yqtMWHZYK5fiJ3j0eQ== X-Google-Smtp-Source: ABdhPJzL5ypgU/kjrV5J0rCnettKK+eJFKCXMzbfQAFedH3ySrRQD5G4nHY4dt3gUIiLJDLBKB6avA== X-Received: by 2002:a92:d289:: with SMTP id p9mr12442930ilp.100.1596946452061; Sat, 08 Aug 2020 21:14:12 -0700 (PDT) Received: from GuixPotato ([208.89.170.4]) by smtp.gmail.com with ESMTPSA id 25sm9762150ilv.85.2020.08.08.21.14.10 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sat, 08 Aug 2020 21:14:11 -0700 (PDT) From: Caleb Ristvedt To: Ludovic =?utf-8?Q?Court=C3=A8s?= Subject: Re: [bug#42023] [PATCH] database: register-items: reduce transaction scope. References: <20200623163649.32444-1-mail@cbaines.net> <87366lxwcd.fsf@gnu.org> <87v9jg4aiz.fsf@cune.org> <87eeq3vbat.fsf@gnu.org> <87tuxu2sz6.fsf@gnu.org> Date: Sat, 08 Aug 2020 23:13:35 -0500 In-Reply-To: <87tuxu2sz6.fsf@gnu.org> ("Ludovic \=\?utf-8\?Q\?Court\=C3\=A8s\=22'\?\= \=\?utf-8\?Q\?s\?\= message of "Sun, 26 Jul 2020 17:31:25 +0200") Message-ID: <87bljka234.fsf@cune.org> User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/26.3 (gnu/linux) MIME-Version: 1.0 Content-Type: multipart/signed; boundary="==-=-="; micalg=pgp-sha256; protocol="application/pgp-signature" X-Spam-Score: -0.0 (/) X-Debbugs-Envelope-To: 42023 Cc: 42023@debbugs.gnu.org, Christopher Baines X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -1.0 (-) --==-=-= Content-Type: multipart/mixed; boundary="=-=-=" --=-=-= Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable Ludovic Court=C3=A8s writes: > Hi reepca, > > Did you have time to look into this (see below)? I still see a lot of > contention on /var/guix/db/db.sqlite-shm on berlin and I feel like these > changes would be welcome. :-) Apologies for the long delay, I think I'm caught up on this issue now. Patch series detailed below. > > Thanks, > Ludo=E2=80=99. >>> So general "fiddling" can't happen, but GC can. The responsibility for >>> both locking and registering as temporary roots falls on the caller, >>> currently, as I believe it should. We may want to document this >>> responsibility in the docstring for register-items, though. >> >> Yes, it would be good to add a sentence to document it. Patch attached. >> >>> So currently finalize-store-file is safe from clobbering by another >>> process attempting to finalize to the same path, but not safe from >>> garbage collection between when the temporary store file is renamed and >>> when it is registered. It needs an add-temp-root prior to renaming. >> >> Ah, so we=E2=80=99d need to do that before applying the patch that reduc= es the >> scope of the transaction, right? AFAIK the only code path that actually uses finalize-store-file currently is from the build hook / offload thingy, and it turns out that the outputs of derivations should already be registered as temp roots by the daemon process that launched the offload process (specifically registered in haveDerivation() in nix/libstore/build.cc). So this is technically not currently necessary before or after the scope-reducing patch. But it makes sense to guarantee that the register-items invocation will only happen on items that are protected from garbage collection, so I've put a patch, right before the documenting of that requirement, that modifies finalize-store-file to always add the file being finalized as a temp root for the extent of the register-items invocation. >>> Also, replace-with-link doesn't check whether the "containing directory" >>> is the store like optimisePath_() does, so in theory if another process >>> tried to make a permanent change to the store's permissions it could be >>> clobbered when replace-with-link restores the permissions. I don't know >>> of any instance where this could happen currently, but it's something to >>> keep in mind. >> >> I guess we should also avoid changing permissions on /gnu/store, it >> would be wiser. While rebasing my changes I noticed that the current implementation of this uses (%store-directory) from (guix build utils), which may not correspond to the #:store keyword argument of 'deduplicate'. So I added a patch that propagates the store through to replace-with-link and from there to with-writable-file. >>> diff --git a/guix/store/deduplication.scm b/guix/store/deduplication.scm >>> index 6784ee0b92..b6d94e49c2 100644 >>> --- a/guix/store/deduplication.scm >>> +++ b/guix/store/deduplication.scm >>> @@ -161,26 +161,44 @@ under STORE." >>> (scandir* path)) >>> (let ((link-file (string-append links-directory "/" >>> (bytevector->nix-base32-string= hash)))) >>> - (if (file-exists? link-file) >>> - (replace-with-link link-file path >>> - #:swap-directory links-directory) >>> - (catch 'system-error >>> - (lambda () >>> - (link path link-file)) >>> - (lambda args >>> - (let ((errno (system-error-errno args))) >>> - (cond ((=3D errno EEXIST) >>> - ;; Someone else put an entry for PATH in >>> - ;; LINKS-DIRECTORY before we could. Let's = use it. >>> - (replace-with-link path link-file >>> - #:swap-directory links-d= irectory)) >>> - ((=3D errno ENOSPC) >>> - ;; There's not enough room in the directory= index for >>> - ;; more entries in .links, but that's fine:= we can >>> - ;; just stop. >>> - #f) >>> - ((=3D errno EMLINK) >>> - ;; PATH has reached the maximum number of l= inks, but >>> - ;; that's OK: we just can't deduplicate it = more. >>> - #f) >>> - (else (apply throw args))))))))))) >>> + (let retry () >>> + (if (file-exists? link-file) >>> + (catch 'system-error >>> + (lambda () >>> + (replace-with-link link-file path >>> + #:swap-directory links-director= y)) >>> + (lambda args >>> + (if (and (=3D (system-error-errno args) ENOENT) >>> + ;; ENOENT can be produced because: >>> + ;; - LINK-FILE has missing directory comp= onents >>> + ;; - LINKS-DIRECTORY has missing directory >>> + ;; components >>> + ;; - PATH has missing directory components >>> + ;; >>> + ;; the last two are errors, the first just >>> + ;; requires a retry. >>> + (file-exists? (dirname path)) >>> + (file-exists? links-directory)) >>> + (retry) >>> + (apply throw args)))) >> >> I feel like there are TOCTTOU issues here and redundant =E2=80=98stat=E2= =80=99 calls, >> plus the risk of catching a =E2=80=98system-error=E2=80=99 coming from = =E2=80=9Csomewhere else.=E2=80=9D >> >> What about baking this logic directly in =E2=80=98replace-with-link=E2= =80=99, and >> replacing =E2=80=98file-exists?=E2=80=99 calls by 'system-error handling? Patch attached. I've renamed replace-with-link to canonicalize-with-link since it may now have to create the target link. Unfortunately there are places where 'file-exists?' error handling is necessary, simply because of ambiguity in errnos. For example, link(oldpath, newpath) can return ENOENT, but there's no way of knowing from that alone whether this is because oldpath has missing directories or newpath has missing directories or the directory components of oldpath are all present but the file itself is missing (the case we need to be able to recognize and retry in case of). Also, I tried removing the first check for whether the canonical link exists in favor of error handling like you suggested, but this actually messes up the hard-coded number of link-invocations expected in tests/store-deduplication.scm - it expects a single link invocation when the canonical link already exists, but the error-handling approach uses two - one to discover it exists, and another to create the temporary link. I didn't know how to reconcile the testing strategy with this behavior, so I kept the behavior of first using a 'file-exists?' call to test for the existence of the canonical link. All tests pass except for tests/packages.scm, which failed even without the patches. =2D reepca --=-=-= Content-Type: text/x-patch Content-Disposition: attachment; filename=0001-.dir-locals.el-fix-call-with-retrying-transaction-in.patch Content-Transfer-Encoding: quoted-printable From=204c8f0cc50e2a1a33d9ce2f8e58cc426872676a7f Mon Sep 17 00:00:00 2001 From: Caleb Ristvedt Date: Wed, 24 Jun 2020 01:00:40 -0500 Subject: [PATCH 1/6] .dir-locals.el: fix call-with-{retrying-}transaction indenting. * .dir-locals.el (call-with-transaction, call-with-retrying-transaction): change scheme-indent-function property from 2 to 1. =2D-- .dir-locals.el | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/.dir-locals.el b/.dir-locals.el index 155255a770..d3ec2dd00d 100644 =2D-- a/.dir-locals.el +++ b/.dir-locals.el @@ -89,9 +89,9 @@ (eval . (put 'let-system 'scheme-indent-function 1)) =20 (eval . (put 'with-database 'scheme-indent-function 2)) =2D (eval . (put 'call-with-transaction 'scheme-indent-function 2)) + (eval . (put 'call-with-transaction 'scheme-indent-function 1)) (eval . (put 'with-statement 'scheme-indent-function 3)) =2D (eval . (put 'call-with-retrying-transaction 'scheme-indent-function = 2)) + (eval . (put 'call-with-retrying-transaction 'scheme-indent-function 1)) (eval . (put 'call-with-savepoint 'scheme-indent-function 1)) (eval . (put 'call-with-retrying-savepoint 'scheme-indent-function 1)) =20 =2D-=20 2.28.0 --=-=-= Content-Type: text/x-patch Content-Disposition: attachment; filename=0002-deduplication-pass-store-directory-to-replace-with-l.patch Content-Transfer-Encoding: quoted-printable From=209717568f922e0921e5fdc320cbe6689768d29a29 Mon Sep 17 00:00:00 2001 From: Caleb Ristvedt Date: Sat, 8 Aug 2020 10:05:22 -0500 Subject: [PATCH 2/6] deduplication: pass store directory to replace-with-li= nk. This causes with-writable-file to take into consideration the actual store being used, as passed to 'deduplicate', rather than whatever (%store-directory) may return. * guix/store/deduplication.scm (replace-with-link): new keyword argument 'store'. Pass to with-writable-file. (with-writable-file, call-with-writable-file): new store argument. (deduplicate): pass store to replace-with-link. =2D-- .dir-locals.el | 2 +- guix/store/deduplication.scm | 102 ++++++++++++++++++----------------- 2 files changed, 54 insertions(+), 50 deletions(-) diff --git a/.dir-locals.el b/.dir-locals.el index d3ec2dd00d..419911972d 100644 =2D-- a/.dir-locals.el +++ b/.dir-locals.el @@ -37,7 +37,7 @@ (eval . (put 'with-file-lock 'scheme-indent-function 1)) (eval . (put 'with-file-lock/no-wait 'scheme-indent-function 1)) (eval . (put 'with-profile-lock 'scheme-indent-function 1)) =2D (eval . (put 'with-writable-file 'scheme-indent-function 1)) + (eval . (put 'with-writable-file 'scheme-indent-function 2)) =20 (eval . (put 'package 'scheme-indent-function 0)) (eval . (put 'origin 'scheme-indent-function 0)) diff --git a/guix/store/deduplication.scm b/guix/store/deduplication.scm index df959bdd06..0655ceb890 100644 =2D-- a/guix/store/deduplication.scm +++ b/guix/store/deduplication.scm @@ -94,8 +94,8 @@ LINK-PREFIX." (try (tempname-in link-prefix)) (apply throw args)))))) =20 =2D(define (call-with-writable-file file thunk) =2D (if (string=3D? file (%store-directory)) +(define (call-with-writable-file file store thunk) + (if (string=3D? file store) (thunk) ;don't meddle with the store's permiss= ions (let ((stat (lstat file))) (dynamic-wind @@ -106,17 +106,18 @@ LINK-PREFIX." (set-file-time file stat) (chmod file (stat:mode stat))))))) =20 =2D(define-syntax-rule (with-writable-file file exp ...) +(define-syntax-rule (with-writable-file file store exp ...) "Make FILE writable for the dynamic extent of EXP..., except if FILE is = the store." =2D (call-with-writable-file file (lambda () exp ...))) + (call-with-writable-file file store (lambda () exp ...))) =20 ;; There are 3 main kinds of errors we can get from hardlinking: "Too many ;; things link to this" (EMLINK), "this link already exists" (EEXIST), and ;; "can't fit more stuff in this directory" (ENOSPC). =20 (define* (replace-with-link target to-replace =2D #:key (swap-directory (dirname target))) + #:key (swap-directory (dirname target)) + (store (%store-directory))) "Atomically replace the file TO-REPLACE with a link to TARGET. Use SWAP-DIRECTORY as the directory to store temporary hard links. Upon ENOSPC and EMLINK, TO-REPLACE is left unchanged. @@ -137,7 +138,7 @@ Note: TARGET, TO-REPLACE, and SWAP-DIRECTORY must be on= the same file system." ;; If we couldn't create TEMP-LINK, that's OK: just don't do the ;; replacement, which means TO-REPLACE won't be deduplicated. (when temp-link =2D (with-writable-file (dirname to-replace) + (with-writable-file (dirname to-replace) store (catch 'system-error (lambda () (rename-file temp-link to-replace)) @@ -154,46 +155,49 @@ under STORE." (define links-directory (string-append store "/.links")) =20 =2D (mkdir-p links-directory) =2D (let loop ((path path) =2D (type (stat:type (lstat path))) =2D (hash hash)) =2D (if (eq? 'directory type) =2D ;; Can't hardlink directories, so hardlink their atoms. =2D (for-each (match-lambda =2D ((file . properties) =2D (unless (member file '("." "..")) =2D (let* ((file (string-append path "/" file)) =2D (type (match (assoc-ref properties 'type) =2D ((or 'unknown #f) =2D (stat:type (lstat file))) =2D (type type)))) =2D (loop file type =2D (and (not (eq? 'directory type)) =2D (nar-sha256 file))))))) =2D (scandir* path)) =2D (let ((link-file (string-append links-directory "/" =2D (bytevector->nix-base32-string h= ash)))) =2D (if (file-exists? link-file) =2D (replace-with-link link-file path =2D #:swap-directory links-directory) =2D (catch 'system-error =2D (lambda () =2D (link path link-file)) =2D (lambda args =2D (let ((errno (system-error-errno args))) =2D (cond ((=3D errno EEXIST) =2D ;; Someone else put an entry for PATH in =2D ;; LINKS-DIRECTORY before we could. Let's us= e it. =2D (replace-with-link path link-file =2D #:swap-directory links-dir= ectory)) =2D ((=3D errno ENOSPC) =2D ;; There's not enough room in the directory i= ndex for =2D ;; more entries in .links, but that's fine: w= e can =2D ;; just stop. =2D #f) =2D ((=3D errno EMLINK) =2D ;; PATH has reached the maximum number of lin= ks, but =2D ;; that's OK: we just can't deduplicate it mo= re. =2D #f) =2D (else (apply throw args))))))))))) + (mkdir-p links-directory) + (let loop ((path path) + (type (stat:type (lstat path))) + (hash hash)) + (if (eq? 'directory type) + ;; Can't hardlink directories, so hardlink their atoms. + (for-each (match-lambda + ((file . properties) + (unless (member file '("." "..")) + (let* ((file (string-append path "/" file)) + (type (match (assoc-ref properties 'type) + ((or 'unknown #f) + (stat:type (lstat file))) + (type type)))) + (loop file type + (and (not (eq? 'directory type)) + (nar-sha256 file))))))) + (scandir* path)) + (let ((link-file (string-append links-directory "/" + (bytevector->nix-base32-string h= ash)))) + (if (file-exists? link-file) + (replace-with-link link-file path + #:swap-directory links-directory + #:store store) + (catch 'system-error + (lambda () + (link path link-file)) + (lambda args + (let ((errno (system-error-errno args))) + (cond ((=3D errno EEXIST) + ;; Someone else put an entry for PATH in + ;; LINKS-DIRECTORY before we could. Let's us= e it. + (replace-with-link path link-file + #:swap-directory + links-directory + #:store store)) + ((=3D errno ENOSPC) + ;; There's not enough room in the directory i= ndex for + ;; more entries in .links, but that's fine: w= e can + ;; just stop. + #f) + ((=3D errno EMLINK) + ;; PATH has reached the maximum number of lin= ks, but + ;; that's OK: we just can't deduplicate it mo= re. + #f) + (else (apply throw args))))))))))) =2D-=20 2.28.0 --=-=-= Content-Type: text/x-patch Content-Disposition: attachment; filename=0003-deduplication-retry-on-ENOENT.patch Content-Transfer-Encoding: quoted-printable From=20b992a3aaac7e3b30222e0bf1df09093f18e25e6a Mon Sep 17 00:00:00 2001 From: Caleb Ristvedt Date: Sat, 8 Aug 2020 11:25:57 -0500 Subject: [PATCH 3/6] deduplication: retry on ENOENT. It's possible for the garbage collector to remove the "canonical" link after it's been detected as existing by 'deduplicate'. This would cause an ENOENT error when replace-with-link attempts to create the temporary link. This changes it so that it will properly handle that by retrying. * guix/store/deduplication.scm (replace-with-link): renamed to canonicalize-with-link, now also handles the case where the target link doesn't exist yet, and retries on ENOENT. (deduplicate): modified to use canonicalize-with-link. =2D-- guix/store/deduplication.scm | 103 ++++++++++++++++++----------------- 1 file changed, 53 insertions(+), 50 deletions(-) diff --git a/guix/store/deduplication.scm b/guix/store/deduplication.scm index 0655ceb890..035a4218fb 100644 =2D-- a/guix/store/deduplication.scm +++ b/guix/store/deduplication.scm @@ -115,9 +115,9 @@ store." ;; things link to this" (EMLINK), "this link already exists" (EEXIST), and ;; "can't fit more stuff in this directory" (ENOSPC). =20 =2D(define* (replace-with-link target to-replace =2D #:key (swap-directory (dirname target)) =2D (store (%store-directory))) +(define* (canonicalize-with-link target to-replace + #:key (swap-directory (dirname target)) + (store (%store-directory))) "Atomically replace the file TO-REPLACE with a link to TARGET. Use SWAP-DIRECTORY as the directory to store temporary hard links. Upon ENOSPC and EMLINK, TO-REPLACE is left unchanged. @@ -126,7 +126,32 @@ Note: TARGET, TO-REPLACE, and SWAP-DIRECTORY must be o= n the same file system." (define temp-link (catch 'system-error (lambda () =2D (get-temp-link target swap-directory)) + (let retry () + (if (file-exists? target) + (catch 'system-error + (lambda () + (get-temp-link target swap-directory)) + (lambda args + (let ((errno (system-error-errno args))) + (if (=3D errno ENOENT) + ;; either SWAP-DIRECTORY has missing directory + ;; components or TARGET was deleted - this is a + ;; fundamental ambiguity to the errno produced by + ;; link() + (if (file-exists? swap-directory) + ;; we must assume link failed because target d= oesn't + ;; exist, so create it. + (retry) + (apply throw args)) + (apply throw args))))) + (catch 'system-error + (lambda () + (link to-replace target) + #f) + (lambda args + (if (=3D (system-error-errno args) EEXIST) + (retry) + (apply throw args))))))) (lambda args ;; We get ENOSPC when we can't fit an additional entry in ;; SWAP-DIRECTORY. If it's EMLINK, then TARGET has reached its @@ -155,49 +180,27 @@ under STORE." (define links-directory (string-append store "/.links")) =20 =2D (mkdir-p links-directory) =2D (let loop ((path path) =2D (type (stat:type (lstat path))) =2D (hash hash)) =2D (if (eq? 'directory type) =2D ;; Can't hardlink directories, so hardlink their atoms. =2D (for-each (match-lambda =2D ((file . properties) =2D (unless (member file '("." "..")) =2D (let* ((file (string-append path "/" file)) =2D (type (match (assoc-ref properties 'type) =2D ((or 'unknown #f) =2D (stat:type (lstat file))) =2D (type type)))) =2D (loop file type =2D (and (not (eq? 'directory type)) =2D (nar-sha256 file))))))) =2D (scandir* path)) =2D (let ((link-file (string-append links-directory "/" =2D (bytevector->nix-base32-string= hash)))) =2D (if (file-exists? link-file) =2D (replace-with-link link-file path =2D #:swap-directory links-directory =2D #:store store) =2D (catch 'system-error =2D (lambda () =2D (link path link-file)) =2D (lambda args =2D (let ((errno (system-error-errno args))) =2D (cond ((=3D errno EEXIST) =2D ;; Someone else put an entry for PATH in =2D ;; LINKS-DIRECTORY before we could. Let's = use it. =2D (replace-with-link path link-file =2D #:swap-directory =2D links-directory =2D #:store store)) =2D ((=3D errno ENOSPC) =2D ;; There's not enough room in the directory= index for =2D ;; more entries in .links, but that's fine:= we can =2D ;; just stop. =2D #f) =2D ((=3D errno EMLINK) =2D ;; PATH has reached the maximum number of l= inks, but =2D ;; that's OK: we just can't deduplicate it = more. =2D #f) =2D (else (apply throw args))))))))))) + (mkdir-p links-directory) + (let loop ((path path) + (type (stat:type (lstat path))) + (hash hash)) + (if (eq? 'directory type) + ;; Can't hardlink directories, so hardlink their atoms. + (for-each (match-lambda + ((file . properties) + (unless (member file '("." "..")) + (let* ((file (string-append path "/" file)) + (type (match (assoc-ref properties 'type) + ((or 'unknown #f) + (stat:type (lstat file))) + (type type)))) + (loop file type + (and (not (eq? 'directory type)) + (nar-sha256 file))))))) + (scandir* path)) + (let ((link-file (string-append links-directory "/" + (bytevector->nix-base32-string + hash)))) + (canonicalize-with-link link-file path + #:swap-directory links-directory + #:store store))))) =2D-=20 2.28.0 --=-=-= Content-Type: text/x-patch Content-Disposition: attachment; filename=0004-nar-ensure-finalization-target-is-a-temp-root-during.patch Content-Transfer-Encoding: quoted-printable From=206b7d011680c77642f24396be0eb0015c20413d1a Mon Sep 17 00:00:00 2001 From: Caleb Ristvedt Date: Sat, 8 Aug 2020 08:31:38 -0500 Subject: [PATCH 4/6] nar: ensure finalization target is a temp root during registration. Note that this is currently unnecessary, as finalize-store-file is only used from the offload hook, and the daemon process that spawned the offload hook will have already registered the derivation outputs as temp roots prior to attempting to offload (see haveDerivation() in nix/libstore/build.cc). But it's necessary to ensure that the register-items invocation works properly when finalize-store-file is used in other contexts. * guix/nar.scm (finalize-store-file): make target a temp root during extent= of register-items invocation. =2D-- guix/nar.scm | 23 +++++++++++++++++------ 1 file changed, 17 insertions(+), 6 deletions(-) diff --git a/guix/nar.scm b/guix/nar.scm index a2aacf585c..ffb487e185 100644 =2D-- a/guix/nar.scm +++ b/guix/nar.scm @@ -111,13 +111,24 @@ held." (when (file-exists? target) (delete-file-recursively target)) =20 =2D ;; Install the new TARGET. =2D (rename-file source target) + ;; Register TARGET as a temp root. Note that this may not alway= s be + ;; necessary, but adding a temp root multiple times doesn't chan= ge + ;; anything (except taking a little more space in the temproots + ;; file). Note that currently this will only ensure that TARGET + ;; doesn't get deleted between now and when registration finishe= s; + ;; it may well be deleted by the time this procedure returns. =20 =2D ;; Register TARGET. As a side effect, it resets the timestamp= s of all =2D ;; its files, recursively, and runs a deduplication pass. =2D (register-items db =2D (list (store-info target deriver references)))) + ;; TODO: don't use an RPC for this, add it to *this process's* t= emp + ;; roots file. + (with-store store + (add-temp-root store target) + ;; Install the new TARGET. + (rename-file source target) + + ;; Register TARGET. As a side effect, it resets the timestamp= s of + ;; all its files, recursively, and runs a deduplication pass. + (register-items db + (list (store-info target deriver references)))= )) =20 (when lock? (delete-file (string-append target ".lock")) =2D-=20 2.28.0 --=-=-= Content-Type: text/x-patch Content-Disposition: attachment; filename=0005-database-document-extra-registration-requirements.patch Content-Transfer-Encoding: quoted-printable From=2055dd48e88d641bbc17b4d9484d6ee84acfb29766 Mon Sep 17 00:00:00 2001 From: Caleb Ristvedt Date: Wed, 8 Jul 2020 11:33:23 -0500 Subject: [PATCH 5/6] database: document extra registration requirements. It's necessary that store items be locked and protected from garbage collection while they are being registered. This documents that. * guix/store/database.scm (register-path, register-items): document GC protection and locking requirements. =2D-- guix/store/database.scm | 9 +++++++-- 1 file changed, 7 insertions(+), 2 deletions(-) diff --git a/guix/store/database.scm b/guix/store/database.scm index 50b66ce282..e39a1603af 100644 =2D-- a/guix/store/database.scm +++ b/guix/store/database.scm @@ -397,7 +397,10 @@ absolute file name to the state directory of the store= being initialized. Return #t on success. =20 Use with care as it directly modifies the store! This is primarily meant = to =2Dbe used internally by the daemon's build hook." +be used internally by the daemon's build hook. + +PATH must be protected from GC and locked during execution of this, typica= lly +by adding it as a temp-root." (define db-file (store-database-file #:prefix prefix #:state-directory state-directory)) @@ -423,7 +426,9 @@ be used internally by the daemon's build hook." "Register all of ITEMS, a list of records as returned by 'read-reference-graph', in DB. ITEMS must be in topological order (with leaves first.) REGISTRATION-TIME must be the registration time to be reco= rded =2Din the database; #f means \"now\". Write a progress report to LOG-PORT." +in the database; #f means \"now\". Write a progress report to LOG-PORT. = All +of ITEMS must be protected from GC and locked during execution of this, +typically by adding them as temp-roots." (define store-dir (if prefix (string-append prefix %storedir) =2D-=20 2.28.0 --=-=-= Content-Type: text/x-patch Content-Disposition: attachment; filename=0006-database-register-items-reduce-transaction-scope.patch Content-Transfer-Encoding: quoted-printable From=2030afb453ce4eb161bb87645a0e6314e6af82a61f Mon Sep 17 00:00:00 2001 From: Christopher Baines Date: Tue, 23 Jun 2020 17:36:49 +0100 Subject: [PATCH 6/6] database: register-items: reduce transaction scope. It was made transactional in a4678c6ba18d8dbd79d931f80426eebf61be7ebe, with the reasoning to prevent broken intermediate states from being visible. I think this means something like an entry being in ValidPaths, but the Refs = not being inserted. Using a transaction for this makes sense, but I think using one single transaction for the whole register-items call is unnecessary to avoid broken states from being visible, and could block other writes to the store databa= se while register-items is running. Because the deduplication and resetting timestamps happens within the transaction as well, even though these things don't involve the database, writes to the database will still be blocked wh= ile this is happening. To reduce the potential for register-items to block other writers to the database for extended periods, this commit moves the transaction to just wr= ap the call to sqlite-register. This is the one place where writes occur, so t= hat should prevent the broken intermediate states issue above. The one differen= ce this will make is some of the registered items will be visible to other connections while others may be still being added. I think this is OK, as i= t's equivalent to just registering different items. * guix/store/database.scm (register-items): Reduce transaction scope. =2D-- guix/store/database.scm | 37 +++++++++++++++++++------------------ 1 file changed, 19 insertions(+), 18 deletions(-) diff --git a/guix/store/database.scm b/guix/store/database.scm index e39a1603af..2ea63b17aa 100644 =2D-- a/guix/store/database.scm +++ b/guix/store/database.scm @@ -457,24 +457,25 @@ typically by adding them as temp-roots." (when reset-timestamps? (reset-timestamps real-file-name)) (let-values (((hash nar-size) (nar-sha256 real-file-name))) =2D (sqlite-register db #:path to-register =2D #:references (store-info-references item) =2D #:deriver (store-info-deriver item) =2D #:hash (string-append "sha256:" =2D (bytevector->base16-strin= g hash)) =2D #:nar-size nar-size =2D #:time registration-time) + (call-with-retrying-transaction db + (lambda () + (sqlite-register db #:path to-register + #:references (store-info-references item) + #:deriver (store-info-deriver item) + #:hash (string-append + "sha256:" + (bytevector->base16-string hash)) + #:nar-size nar-size + #:time registration-time))) (when deduplicate? (deduplicate real-file-name hash #:store store-dir))))) =20 =2D (call-with-retrying-transaction db =2D (lambda () =2D (let* ((prefix (format #f "registering ~a items" (length items= ))) =2D (progress (progress-reporter/bar (length items) =2D prefix log-port))) =2D (call-with-progress-reporter progress =2D (lambda (report) =2D (for-each (lambda (item) =2D (register db item) =2D (report)) =2D items))))))) + (let* ((prefix (format #f "registering ~a items" (length items))) + (progress (progress-reporter/bar (length items) + prefix log-port))) + (call-with-progress-reporter progress + (lambda (report) + (for-each (lambda (item) + (register db item) + (report)) + items))))) =2D-=20 2.28.0 --=-=-=-- --==-=-= Content-Type: application/pgp-signature; name="signature.asc" -----BEGIN PGP SIGNATURE----- iQEzBAEBCAAdFiEEdNapMPRLm4SepVYGwWaqSV9/GJwFAl8vd/AACgkQwWaqSV9/ GJzdygf/fnQKTxIQzOI8DAvh8Avz3GXx7pESnxYhXCSa1vHN2LD0gQwytS4bHN1B 1vEWqaG+i4aIBF97oIEWMQXm/YrlhEburbQ+/aoKMJzYui4p3vQZedCEt0hrlEsD 5qf7AdIQDwPAScJeeeADvkRLy30+H9ui72iRKqpqPgm6kCHd+1NZS8KGqFYhTe2Y VPo+12xW8dkJguT8FMVYJYj7V/JRtSrGs+0sItBcwpd78HCSV914l0FUdEbPzcqk QbtzyY0a6r+Ul8FpPUYWLvqUNUCNK+kMWE57MpHEzKvosBB/9IerU00WBEC48LIP sZusS4tkpSmbUB3N19lSAV9XizrPfw== =CeVX -----END PGP SIGNATURE----- --==-=-=-- From debbugs-submit-bounces@debbugs.gnu.org Mon Sep 14 04:58:50 2020 Received: (at 42023) by debbugs.gnu.org; 14 Sep 2020 08:58:50 +0000 Received: from localhost ([127.0.0.1]:52486 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1kHkK2-0006JO-Af for submit@debbugs.gnu.org; Mon, 14 Sep 2020 04:58:50 -0400 Received: from eggs.gnu.org ([209.51.188.92]:36650) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1kHkJy-0006J6-Hy for 42023@debbugs.gnu.org; Mon, 14 Sep 2020 04:58:48 -0400 Received: from fencepost.gnu.org ([2001:470:142:3::e]:48170) by eggs.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1kHkJt-0006Nr-2A; Mon, 14 Sep 2020 04:58:41 -0400 Received: from [2001:660:6102:320:e120:2c8f:8909:cdfe] (port=44160 helo=ribbon) by fencepost.gnu.org with esmtpsa (TLS1.2:RSA_AES_256_CBC_SHA1:256) (Exim 4.82) (envelope-from ) id 1kHkJs-0001bu-25; Mon, 14 Sep 2020 04:58:40 -0400 From: =?utf-8?Q?Ludovic_Court=C3=A8s?= To: Caleb Ristvedt Subject: Re: [bug#42023] [PATCH] database: register-items: reduce transaction scope. References: <20200623163649.32444-1-mail@cbaines.net> <87366lxwcd.fsf@gnu.org> <87v9jg4aiz.fsf@cune.org> <87eeq3vbat.fsf@gnu.org> <87tuxu2sz6.fsf@gnu.org> <87bljka234.fsf@cune.org> X-URL: http://www.fdn.fr/~lcourtes/ X-Revolutionary-Date: 29 Fructidor an 228 de la =?utf-8?Q?R=C3=A9volution?= X-PGP-Key-ID: 0x090B11993D9AEBB5 X-PGP-Key: http://www.fdn.fr/~lcourtes/ludovic.asc X-PGP-Fingerprint: 3CE4 6455 8A84 FDC6 9DB4 0CFB 090B 1199 3D9A EBB5 X-OS: x86_64-pc-linux-gnu Date: Mon, 14 Sep 2020 10:58:30 +0200 In-Reply-To: <87bljka234.fsf@cune.org> (Caleb Ristvedt's message of "Sat, 08 Aug 2020 23:13:35 -0500") Message-ID: <877dswpweh.fsf@gnu.org> User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/27.1 (gnu/linux) MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable X-Spam-Score: -2.3 (--) X-Debbugs-Envelope-To: 42023 Cc: 42023@debbugs.gnu.org, Christopher Baines X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -3.3 (---) Hi Caleb, And apologies for the delay. I think we=E2=80=99ve drifted from the original patch and that=E2=80=99s be= come a tricky 7-patch series, which partly explains the delay=E2=80=94not that I=E2=80=99= m looking for an excuse. ;-) I decided to go ahead and apply some of these on your behalf. Comments below. Caleb Ristvedt skribis: > From 4c8f0cc50e2a1a33d9ce2f8e58cc426872676a7f Mon Sep 17 00:00:00 2001 > From: Caleb Ristvedt > Date: Wed, 24 Jun 2020 01:00:40 -0500 > Subject: [PATCH 1/6] .dir-locals.el: fix call-with-{retrying-}transaction > indenting. > > * .dir-locals.el (call-with-transaction, call-with-retrying-transaction): > change scheme-indent-function property from 2 to 1. Applied. > From 9717568f922e0921e5fdc320cbe6689768d29a29 Mon Sep 17 00:00:00 2001 > From: Caleb Ristvedt > Date: Sat, 8 Aug 2020 10:05:22 -0500 > Subject: [PATCH 2/6] deduplication: pass store directory to replace-with-= link. > > This causes with-writable-file to take into consideration the actual store > being used, as passed to 'deduplicate', rather than > whatever (%store-directory) may return. > > * guix/store/deduplication.scm (replace-with-link): new keyword argument > 'store'. Pass to with-writable-file. > (with-writable-file, call-with-writable-file): new store argument. > (deduplicate): pass store to replace-with-link. Applied. > From b992a3aaac7e3b30222e0bf1df09093f18e25e6a Mon Sep 17 00:00:00 2001 > From: Caleb Ristvedt > Date: Sat, 8 Aug 2020 11:25:57 -0500 > Subject: [PATCH 3/6] deduplication: retry on ENOENT. > > It's possible for the garbage collector to remove the "canonical" link af= ter > it's been detected as existing by 'deduplicate'. This would cause an ENO= ENT > error when replace-with-link attempts to create the temporary link. This > changes it so that it will properly handle that by retrying. Would that ENOENT cause an error, or just a missed deduplication opportunit= y? > * guix/store/deduplication.scm (replace-with-link): renamed to > canonicalize-with-link, now also handles the case where the target link > doesn't exist yet, and retries on ENOENT. > (deduplicate): modified to use canonicalize-with-link. There=E2=80=99s an issue with this patch. I gave it a spin (offloading a f= ew builds) and it got stuck in a infinite loop: --8<---------------cut here---------------start------------->8--- stat("/gnu/store/.links/0k63r0n3681r2gqd00blq4z5xd7cw1knv0x049p99f0pq31brhr= k", 0x7ffe43898cd0) =3D -1 ENOENT (Dosiero a=C5=AD dosierujo ne ekzistas) link("/gnu/store/83jy739bn644w3pnwgb5kwjig0kzs92f-libwps-0.4.12/lib/libwps-= 0.4.so", "/gnu/store/.links/0k63r0n3681r2gqd00blq4z5xd7cw1knv0x049p99f0pq31= brhrk") =3D -1 EEXIST (Dosiero jam ekzistas) stat("/gnu/store/.links/0k63r0n3681r2gqd00blq4z5xd7cw1knv0x049p99f0pq31brhr= k", 0x7ffe43898cd0) =3D -1 ENOENT (Dosiero a=C5=AD dosierujo ne ekzistas) link("/gnu/store/83jy739bn644w3pnwgb5kwjig0kzs92f-libwps-0.4.12/lib/libwps-= 0.4.so", "/gnu/store/.links/0k63r0n3681r2gqd00blq4z5xd7cw1knv0x049p99f0pq31= brhrk") =3D -1 EEXIST (Dosiero jam ekzistas) stat("/gnu/store/.links/0k63r0n3681r2gqd00blq4z5xd7cw1knv0x049p99f0pq31brhr= k", 0x7ffe43898cd0) =3D -1 ENOENT (Dosiero a=C5=AD dosierujo ne ekzistas) link("/gnu/store/83jy739bn644w3pnwgb5kwjig0kzs92f-libwps-0.4.12/lib/libwps-= 0.4.so", "/gnu/store/.links/0k63r0n3681r2gqd00blq4z5xd7cw1knv0x049p99f0pq31= brhrk") =3D -1 EEXIST (Dosiero jam ekzistas) --8<---------------cut here---------------end--------------->8--- I think we should work on reducing the complexity of that code (e.g., there are several layers of system-error handling). I=E2=80=99ve omitted it and I propose discussing it in a separate issue if = we need to. > From 6b7d011680c77642f24396be0eb0015c20413d1a Mon Sep 17 00:00:00 2001 > From: Caleb Ristvedt > Date: Sat, 8 Aug 2020 08:31:38 -0500 > Subject: [PATCH 4/6] nar: ensure finalization target is a temp root during > registration. > > Note that this is currently unnecessary, as finalize-store-file is only u= sed > from the offload hook, and the daemon process that spawned the offload ho= ok > will have already registered the derivation outputs as temp roots prior to > attempting to offload (see haveDerivation() in nix/libstore/build.cc). B= ut > it's necessary to ensure that the register-items invocation works properly > when finalize-store-file is used in other contexts. > > * guix/nar.scm (finalize-store-file): make target a temp root during exte= nt of > register-items invocation. [...] > + ;; TODO: don't use an RPC for this, add it to *this process's*= temp > + ;; roots file. > + (with-store store > + (add-temp-root store target) I agree that this is the right thing but as you note, it=E2=80=99s currently unnecessary in the context of =E2=80=98guix offload=E2=80=99, and I=E2=80= =99d rather avoid opening more connections to the daemon from =E2=80=98guix offload=E2=80=99 = and this increases load, pressure on the store database, etc. > From 55dd48e88d641bbc17b4d9484d6ee84acfb29766 Mon Sep 17 00:00:00 2001 > From: Caleb Ristvedt > Date: Wed, 8 Jul 2020 11:33:23 -0500 > Subject: [PATCH 5/6] database: document extra registration requirements. > > It's necessary that store items be locked and protected from garbage > collection while they are being registered. This documents that. > > * guix/store/database.scm (register-path, register-items): document GC > protection and locking requirements. Applied. > From 30afb453ce4eb161bb87645a0e6314e6af82a61f Mon Sep 17 00:00:00 2001 > From: Christopher Baines > Date: Tue, 23 Jun 2020 17:36:49 +0100 > Subject: [PATCH 6/6] database: register-items: reduce transaction scope. > > It was made transactional in a4678c6ba18d8dbd79d931f80426eebf61be7ebe, wi= th > the reasoning to prevent broken intermediate states from being visible. I > think this means something like an entry being in ValidPaths, but the Ref= s not > being inserted. > > Using a transaction for this makes sense, but I think using one single > transaction for the whole register-items call is unnecessary to avoid bro= ken > states from being visible, and could block other writes to the store data= base > while register-items is running. Because the deduplication and resetting > timestamps happens within the transaction as well, even though these thin= gs > don't involve the database, writes to the database will still be blocked = while > this is happening. > > To reduce the potential for register-items to block other writers to the > database for extended periods, this commit moves the transaction to just = wrap > the call to sqlite-register. This is the one place where writes occur, so= that > should prevent the broken intermediate states issue above. The one differ= ence > this will make is some of the registered items will be visible to other > connections while others may be still being added. I think this is OK, as= it's > equivalent to just registering different items. > > * guix/store/database.scm (register-items): Reduce transaction scope. Applied. Thanks Caleb & Chris! Ludo=E2=80=99. From debbugs-submit-bounces@debbugs.gnu.org Mon Sep 14 04:58:58 2020 Received: (at control) by debbugs.gnu.org; 14 Sep 2020 08:58:58 +0000 Received: from localhost ([127.0.0.1]:52489 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1kHkK9-0006Jl-Tq for submit@debbugs.gnu.org; Mon, 14 Sep 2020 04:58:58 -0400 Received: from eggs.gnu.org ([209.51.188.92]:36710) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1kHkK9-0006JW-0O for control@debbugs.gnu.org; Mon, 14 Sep 2020 04:58:57 -0400 Received: from fencepost.gnu.org ([2001:470:142:3::e]:48174) by eggs.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1kHkK3-0006Oo-Lh for control@debbugs.gnu.org; Mon, 14 Sep 2020 04:58:51 -0400 Received: from [2001:660:6102:320:e120:2c8f:8909:cdfe] (port=44162 helo=ribbon) by fencepost.gnu.org with esmtpsa (TLS1.2:RSA_AES_256_CBC_SHA1:256) (Exim 4.82) (envelope-from ) id 1kHkK2-0001mD-Ee for control@debbugs.gnu.org; Mon, 14 Sep 2020 04:58:51 -0400 Date: Mon, 14 Sep 2020 10:58:41 +0200 Message-Id: <875z8gpwe6.fsf@gnu.org> To: control@debbugs.gnu.org From: =?utf-8?Q?Ludovic_Court=C3=A8s?= Subject: control message for bug #42023 MIME-version: 1.0 Content-type: text/plain; charset=utf-8 Content-Transfer-Encoding: 8bit X-Spam-Score: -2.3 (--) X-Debbugs-Envelope-To: control X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -3.3 (---) tags 42023 fixed close 42023 quit From debbugs-submit-bounces@debbugs.gnu.org Tue Sep 15 16:29:45 2020 Received: (at submit) by debbugs.gnu.org; 15 Sep 2020 20:29:46 +0000 Received: from localhost ([127.0.0.1]:60656 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1kIHaD-0002W3-Ag for submit@debbugs.gnu.org; Tue, 15 Sep 2020 16:29:45 -0400 Received: from lists.gnu.org ([209.51.188.17]:58562) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1kIHa9-0002Vt-VT for submit@debbugs.gnu.org; Tue, 15 Sep 2020 16:29:44 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:33938) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1kIHa8-0008Fu-Dl for bug-guix@gnu.org; Tue, 15 Sep 2020 16:29:40 -0400 Received: from mail-io1-xd42.google.com ([2607:f8b0:4864:20::d42]:33706) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1kIHa4-0003Ik-Id for bug-guix@gnu.org; Tue, 15 Sep 2020 16:29:39 -0400 Received: by mail-io1-xd42.google.com with SMTP id r25so5700406ioj.0 for ; Tue, 15 Sep 2020 13:29:35 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=cune-org.20150623.gappssmtp.com; s=20150623; h=from:to:cc:subject:in-reply-to:references:user-agent:date :message-id:mime-version; bh=wVSL8s2Pxv/+/fZSnLK9LkpBL+wkUSNx9wtZHsCtxGU=; b=tDCEKOKL2M9z23e24ak687m5rGv6oh33uOm2TuFDD6IF8gKL5kiHPpOOum9rjy6c6p YTeCIFKNMN7ukcLlubBxaIh+4FOl/85aRRpAlhRr5FVujB5Peti1I/054VMfILEKr109 3ZhuZhWtECIyRzDVoFUGHWkofV1gMWs3vqJKYi0ma/Ll53x4IKPlRJXwnyvpZuxDb5Tp eW815GCtoDa1fW0f5swuUoB/7wBgFEheaWw3PChkyXVvjo6McZ283ZirZWpXTdUtjq43 Rbue3nXGyofYZzqy0y0WUbyRRziZd7quNmaO7onJKl5Dqo3KVK247IH/GUJhJjZf8S1L 0G3g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:in-reply-to:references :user-agent:date:message-id:mime-version; bh=wVSL8s2Pxv/+/fZSnLK9LkpBL+wkUSNx9wtZHsCtxGU=; b=savYWPJQ3I9m+kOg8FbotZyE3q4HnbfHuFwlK7j1lUZQeSRenvIpbC4Ng27iwwYt/y XM08piyoJiqVAwk3W3qFPVDIM9ACWT+XzQtGM5TCfwZhUMD6/CnhdX00MSj/lHJFWg7e UZKeqlfiCPFWWQ1gqfhfO2yy5rtCWNCmAzq8mPFtbLF/3kUPhO4RjU9FU2kBeaaQJwDE EGGfsJytcrxEBSVZiQyjYNEJ3uGnGYi72r9djUvOfY+tbv0aMM0JmEIwwGwnE3cK+zsF 6MHK1zYXOM5yAK8rXeIkJ+/8GYF3zmL4jj81xHqcX/a2CmBLvhLscg0G7eHPFgveKDLY 7IoA== X-Gm-Message-State: AOAM530QQ5K3kfcebUcHz7wYadLcvYPMqAvI4Jr80NPgudbZS8MiqueV Z319kHfM7kipHlU6BdX4O4Q3yA== X-Google-Smtp-Source: ABdhPJygC8j6PDipAbHImvF1GEoq57bOQXveMHWcO/PN3dMs4cB8Mu30bZZSnyptYBdgTaKBXvxHnA== X-Received: by 2002:a05:6638:168c:: with SMTP id f12mr19475580jat.16.1600201774841; Tue, 15 Sep 2020 13:29:34 -0700 (PDT) Received: from GuixPotato ([208.89.170.4]) by smtp.gmail.com with ESMTPSA id d19sm8288278iod.38.2020.09.15.13.29.33 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 15 Sep 2020 13:29:34 -0700 (PDT) From: Caleb Ristvedt To: bug-guix@gnu.org Subject: [PATCH] Retry deduplication on ENOENT In-Reply-To: <877dswpweh.fsf@gnu.org> ("Ludovic \=\?utf-8\?Q\?Court\=C3\=A8s\=22'\?\= \=\?utf-8\?Q\?s\?\= message of "Mon, 14 Sep 2020 10:58:30 +0200") References: <20200623163649.32444-1-mail@cbaines.net> <87366lxwcd.fsf@gnu.org> <87v9jg4aiz.fsf@cune.org> <87eeq3vbat.fsf@gnu.org> <87tuxu2sz6.fsf@gnu.org> <87bljka234.fsf@cune.org> <877dswpweh.fsf@gnu.org> User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/26.3 (gnu/linux) Date: Tue, 15 Sep 2020 15:29:32 -0500 Message-ID: <87sgbikclv.fsf_-_@cune.org> MIME-Version: 1.0 Content-Type: multipart/mixed; boundary="=-=-=" Received-SPF: pass client-ip=2607:f8b0:4864:20::d42; envelope-from=caleb.ristvedt@cune.org; helo=mail-io1-xd42.google.com X-detected-operating-system: by eggs.gnu.org: No matching host in p0f cache. That's all we know. X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=unavailable autolearn_force=no X-Spam_action: no action X-Spam-Score: -1.4 (-) X-Debbugs-Envelope-To: submit Cc: Ludovic =?utf-8?Q?Court=C3=A8s?= X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -2.4 (--) --=-=-= Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable Continued where left off from 42023: Ludovic Court=C3=A8s writes: >> From b992a3aaac7e3b30222e0bf1df09093f18e25e6a Mon Sep 17 00:00:00 2001 >> From: Caleb Ristvedt >> Date: Sat, 8 Aug 2020 11:25:57 -0500 >> Subject: [PATCH 3/6] deduplication: retry on ENOENT. >> >> It's possible for the garbage collector to remove the "canonical" link a= fter >> it's been detected as existing by 'deduplicate'. This would cause an EN= OENT >> error when replace-with-link attempts to create the temporary link. This >> changes it so that it will properly handle that by retrying. > > Would that ENOENT cause an error, or just a missed deduplication > opportunity? Depends on how we handle it. Previously the error would be uncaught entirely; simply skipping deduplication of that file would work, though it would be suboptimal. > >> * guix/store/deduplication.scm (replace-with-link): renamed to >> canonicalize-with-link, now also handles the case where the target link >> doesn't exist yet, and retries on ENOENT. >> (deduplicate): modified to use canonicalize-with-link. > > There=E2=80=99s an issue with this patch. I gave it a spin (offloading a= few > builds) and it got stuck in a infinite loop: > > stat("/gnu/store/.links/0k63r0n3681r2gqd00blq4z5xd7cw1knv0x049p99f0pq31br= hrk", 0x7ffe43898cd0) =3D -1 ENOENT (Dosiero a=C5=AD dosierujo ne ekzistas) > link("/gnu/store/83jy739bn644w3pnwgb5kwjig0kzs92f-libwps-0.4.12/lib/libwp= s-0.4.so", "/gnu/store/.links/0k63r0n3681r2gqd00blq4z5xd7cw1knv0x049p99f0pq= 31brhrk") =3D -1 EEXIST (Dosiero jam ekzistas) > stat("/gnu/store/.links/0k63r0n3681r2gqd00blq4z5xd7cw1knv0x049p99f0pq31br= hrk", 0x7ffe43898cd0) =3D -1 ENOENT (Dosiero a=C5=AD dosierujo ne ekzistas) > link("/gnu/store/83jy739bn644w3pnwgb5kwjig0kzs92f-libwps-0.4.12/lib/libwp= s-0.4.so", "/gnu/store/.links/0k63r0n3681r2gqd00blq4z5xd7cw1knv0x049p99f0pq= 31brhrk") =3D -1 EEXIST (Dosiero jam ekzistas) > stat("/gnu/store/.links/0k63r0n3681r2gqd00blq4z5xd7cw1knv0x049p99f0pq31br= hrk", 0x7ffe43898cd0) =3D -1 ENOENT (Dosiero a=C5=AD dosierujo ne ekzistas) > link("/gnu/store/83jy739bn644w3pnwgb5kwjig0kzs92f-libwps-0.4.12/lib/libwp= s-0.4.so", "/gnu/store/.links/0k63r0n3681r2gqd00blq4z5xd7cw1knv0x049p99f0pq= 31brhrk") =3D -1 EEXIST (Dosiero jam ekzistas) > I believe I can explain this. In 'deduplicate' we currently treat anything that isn't a directory as a hardlinkable thing. This includes symlinks (although it's implementation-defined whether symlinks can be hardlinked to - we use CAN_LINK_SYMLINK to test this in nix/libstore/optimise-store.cc). This means that at present we unconditionally attempt to deduplicate symlinks (which happens to work on linux). However, 'file-exists?' uses stat, not lstat, to check for file existence. Thus, if there is a dangling symlink, 'file-exists?' will return #f when passed it, but of course attempting to call link() to create it will fail with EEXIST. Attached is a modified patch that tests for file existence with lstat instead. I expect that will fix the problem. We should probably still add a test in 'deduplicate' for whether symlinks can be hardlinked to. Tangent: I was curious why libwps-0.4.so would be a dangling symlink, and it turns out that it's actually a relative symlink, so when accessing it via /gnu/store/...-libwps-0.4.12/lib/libwps-0.4.so it isn't dangling, but when accessing it via /gnu/store/.links/0k63r... it is. > I think we should work on reducing the complexity of that code (e.g., > there are several layers of system-error handling). I've since flattened it down into just one layer of catch'es. It adds a bit of redundancy, but might make it clearer. - reepca --=-=-= Content-Type: text/x-patch Content-Disposition: attachment; filename=0001-deduplication-retry-on-ENOENT.patch >From 12f5848e79b0ede95babebea240264b32e39812c Mon Sep 17 00:00:00 2001 From: Caleb Ristvedt Date: Sat, 8 Aug 2020 11:25:57 -0500 Subject: [PATCH] deduplication: retry on ENOENT. It's possible for the garbage collector to remove the "canonical" link after it's been detected as existing by 'deduplicate'. This would cause an ENOENT error when replace-with-link attempts to create the temporary link. This changes it so that it will properly handle that by retrying. * guix/store/deduplication.scm (replace-with-link): renamed to canonicalize-with-link, now also handles the case where the target link doesn't exist yet, and retries on ENOENT. Also modified to support canonicalizing symbolic links, though it is the caller's responsibility to ensure that the system supports hardlinking to a symbolic link (on Linux it does). (deduplicate): modified to use canonicalize-with-link. --- guix/store/deduplication.scm | 125 ++++++++++++++++++++--------------- 1 file changed, 70 insertions(+), 55 deletions(-) diff --git a/guix/store/deduplication.scm b/guix/store/deduplication.scm index 0655ceb890..42116ed47d 100644 --- a/guix/store/deduplication.scm +++ b/guix/store/deduplication.scm @@ -115,26 +115,63 @@ store." ;; things link to this" (EMLINK), "this link already exists" (EEXIST), and ;; "can't fit more stuff in this directory" (ENOSPC). -(define* (replace-with-link target to-replace - #:key (swap-directory (dirname target)) - (store (%store-directory))) +(define* (canonicalize-with-link target to-replace + #:key (swap-directory (dirname target)) + (store (%store-directory))) "Atomically replace the file TO-REPLACE with a link to TARGET. Use SWAP-DIRECTORY as the directory to store temporary hard links. Upon ENOSPC and EMLINK, TO-REPLACE is left unchanged. Note: TARGET, TO-REPLACE, and SWAP-DIRECTORY must be on the same file system." - (define temp-link + (define (file-or-symlink-exists? filename) + ;; Like 'file-exists?', but doesn't follow symlinks. (catch 'system-error (lambda () - (get-temp-link target swap-directory)) + (lstat filename) + #t) (lambda args - ;; We get ENOSPC when we can't fit an additional entry in - ;; SWAP-DIRECTORY. If it's EMLINK, then TARGET has reached its - ;; maximum number of links. - (if (memv (system-error-errno args) `(,ENOSPC ,EMLINK)) + (if (= (system-error-errno args) ENOENT) #f (apply throw args))))) + (define temp-link + (let retry () + (if (file-or-symlink-exists? target) + (catch 'system-error + (lambda () + (get-temp-link target swap-directory)) + (lambda args + (let ((errno (system-error-errno args))) + (cond + ((= errno ENOENT) + ;; either SWAP-DIRECTORY has missing directory + ;; components or TARGET was deleted - this is a + ;; fundamental ambiguity to the errno produced by + ;; link() + (if (file-exists? swap-directory) + ;; we must assume link failed because target doesn't + ;; exist, so create it. + (retry) + (apply throw args))) + ((memv errno `(,ENOSPC ,EMLINK)) + ;; We get ENOSPC when we can't fit an additional entry + ;; in SWAP-DIRECTORY. If it's EMLINK, then TARGET has + ;; reached its maximum number of links. In both cases, + ;; skip deduplication. + #f) + (else (apply throw args)))))) + (catch 'system-error + (lambda () + (link to-replace target) + #f) + (lambda args + (cond + ((= (system-error-errno args) EEXIST) + (retry)) + ((memv (system-error-errno args) `(,ENOSPC ,EMLINK)) + #f) + (else (apply throw args)))))))) + ;; If we couldn't create TEMP-LINK, that's OK: just don't do the ;; replacement, which means TO-REPLACE won't be deduplicated. (when temp-link @@ -155,49 +192,27 @@ under STORE." (define links-directory (string-append store "/.links")) - (mkdir-p links-directory) - (let loop ((path path) - (type (stat:type (lstat path))) - (hash hash)) - (if (eq? 'directory type) - ;; Can't hardlink directories, so hardlink their atoms. - (for-each (match-lambda - ((file . properties) - (unless (member file '("." "..")) - (let* ((file (string-append path "/" file)) - (type (match (assoc-ref properties 'type) - ((or 'unknown #f) - (stat:type (lstat file))) - (type type)))) - (loop file type - (and (not (eq? 'directory type)) - (nar-sha256 file))))))) - (scandir* path)) - (let ((link-file (string-append links-directory "/" - (bytevector->nix-base32-string hash)))) - (if (file-exists? link-file) - (replace-with-link link-file path - #:swap-directory links-directory - #:store store) - (catch 'system-error - (lambda () - (link path link-file)) - (lambda args - (let ((errno (system-error-errno args))) - (cond ((= errno EEXIST) - ;; Someone else put an entry for PATH in - ;; LINKS-DIRECTORY before we could. Let's use it. - (replace-with-link path link-file - #:swap-directory - links-directory - #:store store)) - ((= errno ENOSPC) - ;; There's not enough room in the directory index for - ;; more entries in .links, but that's fine: we can - ;; just stop. - #f) - ((= errno EMLINK) - ;; PATH has reached the maximum number of links, but - ;; that's OK: we just can't deduplicate it more. - #f) - (else (apply throw args))))))))))) + (mkdir-p links-directory) + (let loop ((path path) + (type (stat:type (lstat path))) + (hash hash)) + (if (eq? 'directory type) + ;; Can't hardlink directories, so hardlink their atoms. + (for-each (match-lambda + ((file . properties) + (unless (member file '("." "..")) + (let* ((file (string-append path "/" file)) + (type (match (assoc-ref properties 'type) + ((or 'unknown #f) + (stat:type (lstat file))) + (type type)))) + (loop file type + (and (not (eq? 'directory type)) + (nar-sha256 file))))))) + (scandir* path)) + (let ((link-file (string-append links-directory "/" + (bytevector->nix-base32-string + hash)))) + (canonicalize-with-link link-file path + #:swap-directory links-directory + #:store store))))) -- 2.28.0 --=-=-=-- From debbugs-submit-bounces@debbugs.gnu.org Wed Sep 16 16:37:21 2020 Received: (at submit) by debbugs.gnu.org; 16 Sep 2020 20:37:21 +0000 Received: from localhost ([127.0.0.1]:36641 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1kIeB7-0003Bx-Hj for submit@debbugs.gnu.org; Wed, 16 Sep 2020 16:37:21 -0400 Received: from lists.gnu.org ([209.51.188.17]:34428) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1kIeB6-0003Bq-AS for submit@debbugs.gnu.org; Wed, 16 Sep 2020 16:37:20 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:32960) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1kIeB6-0005ru-29 for bug-guix@gnu.org; Wed, 16 Sep 2020 16:37:20 -0400 Received: from fencepost.gnu.org ([2001:470:142:3::e]:48808) by eggs.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1kIeB5-0002cb-Jp; Wed, 16 Sep 2020 16:37:19 -0400 Received: from [2a01:e0a:1d:7270:af76:b9b:ca24:c465] (port=40112 helo=ribbon) by fencepost.gnu.org with esmtpsa (TLS1.2:RSA_AES_256_CBC_SHA1:256) (Exim 4.82) (envelope-from ) id 1kIeB5-0001jc-7K; Wed, 16 Sep 2020 16:37:19 -0400 From: =?utf-8?Q?Ludovic_Court=C3=A8s?= To: Caleb Ristvedt Subject: Re: [PATCH] Retry deduplication on ENOENT References: <20200623163649.32444-1-mail@cbaines.net> <87366lxwcd.fsf@gnu.org> <87v9jg4aiz.fsf@cune.org> <87eeq3vbat.fsf@gnu.org> <87tuxu2sz6.fsf@gnu.org> <87bljka234.fsf@cune.org> <877dswpweh.fsf@gnu.org> <87sgbikclv.fsf_-_@cune.org> X-URL: http://www.fdn.fr/~lcourtes/ X-Revolutionary-Date: Jour de la Vertu de =?utf-8?Q?l'Ann=C3=A9e?= 228 de la =?utf-8?Q?R=C3=A9volution?= X-PGP-Key-ID: 0x090B11993D9AEBB5 X-PGP-Key: http://www.fdn.fr/~lcourtes/ludovic.asc X-PGP-Fingerprint: 3CE4 6455 8A84 FDC6 9DB4 0CFB 090B 1199 3D9A EBB5 X-OS: x86_64-pc-linux-gnu Date: Wed, 16 Sep 2020 22:37:05 +0200 In-Reply-To: <87sgbikclv.fsf_-_@cune.org> (Caleb Ristvedt's message of "Tue, 15 Sep 2020 15:29:32 -0500") Message-ID: <87k0wtlaq6.fsf@gnu.org> User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/27.1 (gnu/linux) MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable X-Spam-Score: -2.3 (--) X-Debbugs-Envelope-To: submit Cc: bug-guix@gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -3.3 (---) Hi! Caleb Ristvedt skribis: [...] >> There=E2=80=99s an issue with this patch. I gave it a spin (offloading = a few >> builds) and it got stuck in a infinite loop: >> >> stat("/gnu/store/.links/0k63r0n3681r2gqd00blq4z5xd7cw1knv0x049p99f0pq31b= rhrk", 0x7ffe43898cd0) =3D -1 ENOENT (Dosiero a=C5=AD dosierujo ne ekzistas) >> link("/gnu/store/83jy739bn644w3pnwgb5kwjig0kzs92f-libwps-0.4.12/lib/libw= ps-0.4.so", "/gnu/store/.links/0k63r0n3681r2gqd00blq4z5xd7cw1knv0x049p99f0p= q31brhrk") =3D -1 EEXIST (Dosiero jam ekzistas) >> stat("/gnu/store/.links/0k63r0n3681r2gqd00blq4z5xd7cw1knv0x049p99f0pq31b= rhrk", 0x7ffe43898cd0) =3D -1 ENOENT (Dosiero a=C5=AD dosierujo ne ekzistas) >> link("/gnu/store/83jy739bn644w3pnwgb5kwjig0kzs92f-libwps-0.4.12/lib/libw= ps-0.4.so", "/gnu/store/.links/0k63r0n3681r2gqd00blq4z5xd7cw1knv0x049p99f0p= q31brhrk") =3D -1 EEXIST (Dosiero jam ekzistas) >> stat("/gnu/store/.links/0k63r0n3681r2gqd00blq4z5xd7cw1knv0x049p99f0pq31b= rhrk", 0x7ffe43898cd0) =3D -1 ENOENT (Dosiero a=C5=AD dosierujo ne ekzistas) >> link("/gnu/store/83jy739bn644w3pnwgb5kwjig0kzs92f-libwps-0.4.12/lib/libw= ps-0.4.so", "/gnu/store/.links/0k63r0n3681r2gqd00blq4z5xd7cw1knv0x049p99f0p= q31brhrk") =3D -1 EEXIST (Dosiero jam ekzistas) >> > > I believe I can explain this. In 'deduplicate' we currently treat > anything that isn't a directory as a hardlinkable thing. This includes > symlinks (although it's implementation-defined whether symlinks can be > hardlinked to - we use CAN_LINK_SYMLINK to test this in > nix/libstore/optimise-store.cc). This means that at present we > unconditionally attempt to deduplicate symlinks (which happens to work > on linux). However, 'file-exists?' uses stat, not lstat, to check for > file existence. Thus, if there is a dangling symlink, 'file-exists?' > will return #f when passed it, but of course attempting to call link() > to create it will fail with EEXIST. Attached is a modified patch that > tests for file existence with lstat instead. I expect that will fix the > problem. Ah ha! > We should probably still add a test in 'deduplicate' for whether > symlinks can be hardlinked to. If GNU/Linux and GNU/Hurd support it, it=E2=80=99s unnecessary. > Tangent: I was curious why libwps-0.4.so would be a dangling symlink, > and it turns out that it's actually a relative symlink, so when > accessing it via /gnu/store/...-libwps-0.4.12/lib/libwps-0.4.so it isn't > dangling, but when accessing it via /gnu/store/.links/0k63r... it is. I see, good catch! > From 12f5848e79b0ede95babebea240264b32e39812c Mon Sep 17 00:00:00 2001 > From: Caleb Ristvedt > Date: Sat, 8 Aug 2020 11:25:57 -0500 > Subject: [PATCH] deduplication: retry on ENOENT. > > It's possible for the garbage collector to remove the "canonical" link af= ter > it's been detected as existing by 'deduplicate'. This would cause an ENO= ENT > error when replace-with-link attempts to create the temporary link. This > changes it so that it will properly handle that by retrying. > > * guix/store/deduplication.scm (replace-with-link): renamed to > canonicalize-with-link, now also handles the case where the target link > doesn't exist yet, and retries on ENOENT. Also modified to support > canonicalizing symbolic links, though it is the caller's responsibility= to > ensure that the system supports hardlinking to a symbolic link (on Linu= x it > does). > (deduplicate): modified to use canonicalize-with-link. [...] > + (lambda args > + (let ((errno (system-error-errno args))) > + (cond > + ((=3D errno ENOENT) > + ;; either SWAP-DIRECTORY has missing directory > + ;; components or TARGET was deleted - this is a > + ;; fundamental ambiguity to the errno produced by > + ;; link() > + (if (file-exists? swap-directory) > + ;; we must assume link failed because target doesn= 't > + ;; exist, so create it. Nitpick: Please capitalize sentences, add a period at the end, and write =E2=80=9C'link'=E2=80=9D instead of =E2=80=9Clink()=E2=80=9D or =E2=80=9Cli= nk=E2=80=9D for clarity. Otherwise LGTM. I think we=E2=80=99ll have to stress-test it through offloading to catch any remaining issues. Thank you! Ludo=E2=80=99. From unknown Fri Aug 15 12:50:33 2025 Received: (at fakecontrol) by fakecontrolmessage; To: internal_control@debbugs.gnu.org From: Debbugs Internal Request Subject: Internal Control Message-Id: bug archived. Date: Thu, 15 Oct 2020 11:24:05 +0000 User-Agent: Fakemail v42.6.9 # This is a fake control message. # # The action: # bug archived. thanks # This fakemail brought to you by your local debbugs # administrator