From unknown Mon Sep 08 15:07:26 2025 Content-Disposition: inline Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 X-Mailer: MIME-tools 5.509 (Entity 5.509) Content-Type: text/plain; charset=utf-8 From: bug#44760 <44760@debbugs.gnu.org> To: bug#44760 <44760@debbugs.gnu.org> Subject: Status: Closure copy in =?UTF-8?Q?=E2=80=98guix?= system =?UTF-8?Q?init=E2=80=99?= is inefficient Reply-To: bug#44760 <44760@debbugs.gnu.org> Date: Mon, 08 Sep 2025 22:07:26 +0000 retitle 44760 Closure copy in =E2=80=98guix system init=E2=80=99 is ineffic= ient reassign 44760 guix submitter 44760 Ludovic Court=C3=A8s severity 44760 important thanks From debbugs-submit-bounces@debbugs.gnu.org Fri Nov 20 06:02:31 2020 Received: (at submit) by debbugs.gnu.org; 20 Nov 2020 11:02:31 +0000 Received: from localhost ([127.0.0.1]:42181 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1kg4BS-0005zO-V2 for submit@debbugs.gnu.org; Fri, 20 Nov 2020 06:02:31 -0500 Received: from lists.gnu.org ([209.51.188.17]:57686) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1kg4BR-0005zG-3E for submit@debbugs.gnu.org; Fri, 20 Nov 2020 06:02:29 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]:60234) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1kg4BQ-0005Oy-HP for bug-guix@gnu.org; Fri, 20 Nov 2020 06:02:28 -0500 Received: from fencepost.gnu.org ([2001:470:142:3::e]:49361) by eggs.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1kg4BQ-0006Zs-9r for bug-guix@gnu.org; Fri, 20 Nov 2020 06:02:28 -0500 Received: from [2a01:e0a:1d:7270:af76:b9b:ca24:c465] (port=39250 helo=ribbon) by fencepost.gnu.org with esmtpsa (TLS1.2:RSA_AES_256_CBC_SHA1:256) (Exim 4.82) (envelope-from ) id 1kg4BP-0005BX-C4 for bug-guix@gnu.org; Fri, 20 Nov 2020 06:02:28 -0500 From: =?utf-8?Q?Ludovic_Court=C3=A8s?= To: Subject: Closure copy in =?utf-8?Q?=E2=80=98guix?= system =?utf-8?Q?init?= =?utf-8?Q?=E2=80=99?= is inefficient X-URL: http://www.fdn.fr/~lcourtes/ X-Revolutionary-Date: 30 Brumaire an 229 de la =?utf-8?Q?R=C3=A9volution?= X-PGP-Key-ID: 0x090B11993D9AEBB5 X-PGP-Key: http://www.fdn.fr/~lcourtes/ludovic.asc X-PGP-Fingerprint: 3CE4 6455 8A84 FDC6 9DB4 0CFB 090B 1199 3D9A EBB5 X-OS: x86_64-pc-linux-gnu Date: Fri, 20 Nov 2020 12:02:25 +0100 Message-ID: <87h7pkffzy.fsf@inria.fr> User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/27.1 (gnu/linux) MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable X-Spam-Score: -2.3 (--) X-Debbugs-Envelope-To: submit X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -3.3 (---) =E2=80=98guix system init=E2=80=99 ends by copying the system=E2=80=99s clo= sure from the =E2=80=9Chost=E2=80=9D store to the target store; it also initializes the database of that target store. That copy is inefficient for several reasons. Let=E2=80=99s pick one file, shred.1.gz, that ends up being copied, and let=E2=80=99s look at its occurr= ences in the strace log of =E2=80=98guix system init config.scm /tmp/os=E2=80=99: --8<---------------cut here---------------start------------->8--- $ grep -A2 '/shred.1.gz' ,,s lstat("/gnu/store/57xj5gcy1jbl9ai2lnrqnpr0dald9i65-coreutils-8.32/share/man= /man1/shred.1.gz", {st_mode=3DS_IFREG|0444, st_size=3D1490, ...}) =3D 0 openat(AT_FDCWD, "/gnu/store/57xj5gcy1jbl9ai2lnrqnpr0dald9i65-coreutils-8.3= 2/share/man/man1/shred.1.gz", O_RDONLY) =3D 15 fstat(15, {st_mode=3DS_IFREG|0444, st_size=3D1490, ...}) =3D 0 openat(AT_FDCWD, "/tmp/os/gnu/store/57xj5gcy1jbl9ai2lnrqnpr0dald9i65-coreut= ils-8.32/share/man/man1/shred.1.gz", O_WRONLY|O_CREAT|O_TRUNC, 0444) =3D 16 read(15, "\37\213\10\0\0\0\0\0\2\3\215VMs\3336\20\275\363Wluh\354\251L%vg\3= 22:M"..., 8192) =3D 1490 write(16, "\37\213\10\0\0\0\0\0\2\3\215VMs\3336\20\275\363Wluh\354\251L%vg\= 322:M"..., 1490) =3D 1490 -- utimensat(AT_FDCWD, "/tmp/os/gnu/store/57xj5gcy1jbl9ai2lnrqnpr0dald9i65-cor= eutils-8.32/share/man/man1/shred.1.gz", [{tv_sec=3D1605721025, tv_nsec=3D61= 6985411} /* 2020-11-18T18:37:05.616985411+0100 */, {tv_sec=3D1, tv_nsec=3D0= } /* 1970-01-01T01:00:01+0100 */], 0) =3D 0 lstat("/gnu/store/57xj5gcy1jbl9ai2lnrqnpr0dald9i65-coreutils-8.32/share/man= /man1/sleep.1.gz", {st_mode=3DS_IFREG|0444, st_size=3D813, ...}) =3D 0 openat(AT_FDCWD, "/gnu/store/57xj5gcy1jbl9ai2lnrqnpr0dald9i65-coreutils-8.3= 2/share/man/man1/sleep.1.gz", O_RDONLY) =3D 15 -- lstat("/tmp/os/gnu/store/57xj5gcy1jbl9ai2lnrqnpr0dald9i65-coreutils-8.32/sh= are/man/man1/shred.1.gz", {st_mode=3DS_IFREG|0444, st_size=3D1490, ...}) = =3D 0 lstat("/tmp/os/gnu/store/57xj5gcy1jbl9ai2lnrqnpr0dald9i65-coreutils-8.32/sh= are/man/man1/shuf.1.gz", {st_mode=3DS_IFREG|0444, st_size=3D972, ...}) =3D 0 lstat("/tmp/os/gnu/store/57xj5gcy1jbl9ai2lnrqnpr0dald9i65-coreutils-8.32/sh= are/man/man1/sleep.1.gz", {st_mode=3DS_IFREG|0444, st_size=3D813, ...}) =3D= 0 -- lstat("/tmp/os/gnu/store/57xj5gcy1jbl9ai2lnrqnpr0dald9i65-coreutils-8.32/sh= are/man/man1/shred.1.gz", {st_mode=3DS_IFREG|0444, st_size=3D1490, ...}) = =3D 0 openat(AT_FDCWD, "/tmp/os/gnu/store/57xj5gcy1jbl9ai2lnrqnpr0dald9i65-coreut= ils-8.32/share/man/man1/shred.1.gz", O_RDONLY) =3D 17 lseek(17, 0, SEEK_CUR) =3D 0 read(17, "\37\213\10\0\0\0\0\0\2\3\215VMs\3336\20\275\363Wluh\354\251L%vg\3= 22:M"..., 1490) =3D 1490 -- lstat("/tmp/os/gnu/store/57xj5gcy1jbl9ai2lnrqnpr0dald9i65-coreutils-8.32/sh= are/man/man1/shred.1.gz", {st_mode=3DS_IFREG|0444, st_size=3D1490, ...}) = =3D 0 openat(AT_FDCWD, "/tmp/os/gnu/store/57xj5gcy1jbl9ai2lnrqnpr0dald9i65-coreut= ils-8.32/share/man/man1/shred.1.gz", O_RDONLY) =3D 17 lseek(17, 0, SEEK_CUR) =3D 0 read(17, "\37\213\10\0\0\0\0\0\2\3\215VMs\3336\20\275\363Wluh\354\251L%vg\3= 22:M"..., 1490) =3D 1490 -- link("/tmp/os/gnu/store/57xj5gcy1jbl9ai2lnrqnpr0dald9i65-coreutils-8.32/sha= re/man/man1/shred.1.gz", "/tmp/os/gnu/store/.links/0w0qcs5lp36i89yry91r2ixl= ghihzf0vc56bpd9yylj342gv82xl") =3D 0 lstat("/tmp/os/gnu/store/57xj5gcy1jbl9ai2lnrqnpr0dald9i65-coreutils-8.32/sh= are/man/man1/shuf.1.gz", {st_mode=3DS_IFREG|0444, st_size=3D972, ...}) =3D 0 openat(AT_FDCWD, "/tmp/os/gnu/store/57xj5gcy1jbl9ai2lnrqnpr0dald9i65-coreut= ils-8.32/share/man/man1/shuf.1.gz", O_RDONLY) =3D 17 --8<---------------cut here---------------end--------------->8--- First, /tmp/os/=E2=80=A6/shred.1.gz is read entirely twice: once in =E2=80=98register-items=E2=80=99 (in the =E2=80=98nar-sha256=E2=80=99 call)= to compute its hash, and a second time for deduplication (the =E2=80=98deduplicate=E2=80=99 call in th= ere.) The =E2=80=98nar-sha256=E2=80=99 call could be avoided because the database= of /gnu/store contains that value. As for deduplication, we could perhaps create those =E2=80=98.links=E2=80=99 entries as we copy files instead of r= e-traversing the whole thing afterwards. Second, all of /tmp/os is traversed to reset timestamps, although we could have cleared those timestamps when we created those files in the first place ( prevents that though, unless we keep a bug-fixed copy of =E2=80=98copy-recursively=E2=80=99 in th= ere.) Third, in the case of the installer, we=E2=80=99re really copying from /mnt/guix-inst/store to /mnt/gnu/store, which is likely the same device. In this case we could create hard links instead of actually copying files. Fourth, we=E2=80=99re adding items one by one in the target store database,= but it may be more efficient to more or less dump the subset of the source database in bulk. Surely we can do better. Ludo=E2=80=99. From debbugs-submit-bounces@debbugs.gnu.org Sat Nov 21 06:02:05 2020 Received: (at control) by debbugs.gnu.org; 21 Nov 2020 11:02:06 +0000 Received: from localhost ([127.0.0.1]:44971 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1kgQeb-0001WP-K2 for submit@debbugs.gnu.org; Sat, 21 Nov 2020 06:02:05 -0500 Received: from eggs.gnu.org ([209.51.188.92]:43478) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1kgQeZ-0001VS-QQ for control@debbugs.gnu.org; Sat, 21 Nov 2020 06:02:04 -0500 Received: from fencepost.gnu.org ([2001:470:142:3::e]:56143) by eggs.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1kgQeU-0007M3-Jq for control@debbugs.gnu.org; Sat, 21 Nov 2020 06:01:58 -0500 Received: from [2a01:e0a:1d:7270:af76:b9b:ca24:c465] (port=45532 helo=ribbon) by fencepost.gnu.org with esmtpsa (TLS1.2:RSA_AES_256_CBC_SHA1:256) (Exim 4.82) (envelope-from ) id 1kgQeU-00025j-45 for control@debbugs.gnu.org; Sat, 21 Nov 2020 06:01:58 -0500 Date: Sat, 21 Nov 2020 12:01:57 +0100 Message-Id: <87o8jrc6sa.fsf@gnu.org> To: control@debbugs.gnu.org From: =?utf-8?Q?Ludovic_Court=C3=A8s?= Subject: control message for bug #44760 MIME-version: 1.0 Content-type: text/plain; charset=utf-8 Content-Transfer-Encoding: 8bit X-Spam-Score: -2.3 (--) X-Debbugs-Envelope-To: control X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -3.3 (---) severity 44760 important quit From debbugs-submit-bounces@debbugs.gnu.org Sun Nov 22 14:54:19 2020 Received: (at 44760) by debbugs.gnu.org; 22 Nov 2020 19:54:19 +0000 Received: from localhost ([127.0.0.1]:48336 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1kgvRC-0005l8-MZ for submit@debbugs.gnu.org; Sun, 22 Nov 2020 14:54:19 -0500 Received: from mx1.riseup.net ([198.252.153.129]:51800) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1kgvR7-0005kx-Oe for 44760@debbugs.gnu.org; Sun, 22 Nov 2020 14:54:17 -0500 Received: from bell.riseup.net (bell-pn.riseup.net [10.0.1.178]) (using TLSv1 with cipher ECDHE-RSA-AES256-SHA (256/256 bits)) (Client CN "*.riseup.net", Issuer "Sectigo RSA Domain Validation Secure Server CA" (not verified)) by mx1.riseup.net (Postfix) with ESMTPS id 4CfLZq0SYwzFcQJ; Sun, 22 Nov 2020 11:54:11 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=riseup.net; s=squak; t=1606074851; bh=uMMIgSwyK5XGW13lnRTmQDn8ZlxyciCFEiaa4aeoaRA=; h=Date:From:To:Cc:Subject:In-Reply-To:References:From; b=X7mWqlK78rUE2gAUM92jR1toHmUn/XaRRc8nLKhcaAmgrBChTcXxPA1xg5rhZZuHF hZISYJrBxbRfx/p3+YACRXDiHekDYngKb4dlUQGJM64cmcGbG9IuHOJS8qeYwTZaoE 7IZ00lkHmMqR01h5PQOhpEy3Y/JCmoOUHy/V791g= X-Riseup-User-ID: EBAE9E51C33841F11A62CC299348D72DB52B9D82E70607CCEE9719F2BE95B275 Received: from [127.0.0.1] (localhost [127.0.0.1]) by bell.riseup.net (Postfix) with ESMTPSA id 4CfLZk3wp1zJmmn; Sun, 22 Nov 2020 11:54:06 -0800 (PST) Date: Sun, 22 Nov 2020 20:46:34 +0100 From: raingloom To: Ludovic =?UTF-8?B?Q291cnTDqHM=?= Subject: Re: bug#44760: Closure copy in =?UTF-8?B?4oCYZ3VpeA==?= system =?UTF-8?B?aW5pdOKAmQ==?= is inefficient Message-ID: <20201122204634.2730df12@riseup.net> In-Reply-To: <87h7pkffzy.fsf@inria.fr> References: <87h7pkffzy.fsf@inria.fr> MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable X-Spam-Score: -0.0 (/) X-Debbugs-Envelope-To: 44760 Cc: 44760@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -1.7 (-) On Fri, 20 Nov 2020 12:02:25 +0100 Ludovic Court=C3=A8s wrote: > =E2=80=98guix system init=E2=80=99 ends by copying the system=E2=80=99s c= losure from the > =E2=80=9Chost=E2=80=9D store to the target store; it also initializes the= database of > that target store. >=20 > That copy is inefficient for several reasons. Let=E2=80=99s pick one fil= e, > shred.1.gz, that ends up being copied, and let=E2=80=99s look at its > occurrences in the strace log of =E2=80=98guix system init config.scm > /tmp/os=E2=80=99: >=20 > --8<---------------cut here---------------start------------->8--- > $ grep -A2 '/shred.1.gz' ,,s > lstat("/gnu/store/57xj5gcy1jbl9ai2lnrqnpr0dald9i65-coreutils-8.32/share/m= an/man1/shred.1.gz", > {st_mode=3DS_IFREG|0444, st_size=3D1490, ...}) =3D 0 openat(AT_FDCWD, > "/gnu/store/57xj5gcy1jbl9ai2lnrqnpr0dald9i65-coreutils-8.32/share/man/man= 1/shred.1.gz", > O_RDONLY) =3D 15 fstat(15, {st_mode=3DS_IFREG|0444, st_size=3D1490, ...})= =3D > 0 openat(AT_FDCWD, > "/tmp/os/gnu/store/57xj5gcy1jbl9ai2lnrqnpr0dald9i65-coreutils-8.32/share/= man/man1/shred.1.gz", > O_WRONLY|O_CREAT|O_TRUNC, 0444) =3D 16 read(15, > "\37\213\10\0\0\0\0\0\2\3\215VMs\3336\20\275\363Wluh\354\251L%vg\322:M"..= ., > 8192) =3D 1490 write(16, > "\37\213\10\0\0\0\0\0\2\3\215VMs\3336\20\275\363Wluh\354\251L%vg\322:M"..= ., > 1490) =3D 1490 -- utimensat(AT_FDCWD, > "/tmp/os/gnu/store/57xj5gcy1jbl9ai2lnrqnpr0dald9i65-coreutils-8.32/share/= man/man1/shred.1.gz", > [{tv_sec=3D1605721025, tv_nsec=3D616985411} /* > 2020-11-18T18:37:05.616985411+0100 */, {tv_sec=3D1, tv_nsec=3D0} /* > 1970-01-01T01:00:01+0100 */], 0) =3D 0 > lstat("/gnu/store/57xj5gcy1jbl9ai2lnrqnpr0dald9i65-coreutils-8.32/share/m= an/man1/sleep.1.gz", > {st_mode=3DS_IFREG|0444, st_size=3D813, ...}) =3D 0 openat(AT_FDCWD, > "/gnu/store/57xj5gcy1jbl9ai2lnrqnpr0dald9i65-coreutils-8.32/share/man/man= 1/sleep.1.gz", > O_RDONLY) =3D 15 -- > lstat("/tmp/os/gnu/store/57xj5gcy1jbl9ai2lnrqnpr0dald9i65-coreutils-8.32/= share/man/man1/shred.1.gz", > {st_mode=3DS_IFREG|0444, st_size=3D1490, ...}) =3D 0 > lstat("/tmp/os/gnu/store/57xj5gcy1jbl9ai2lnrqnpr0dald9i65-coreutils-8.32/= share/man/man1/shuf.1.gz", > {st_mode=3DS_IFREG|0444, st_size=3D972, ...}) =3D 0 > lstat("/tmp/os/gnu/store/57xj5gcy1jbl9ai2lnrqnpr0dald9i65-coreutils-8.32/= share/man/man1/sleep.1.gz", > {st_mode=3DS_IFREG|0444, st_size=3D813, ...}) =3D 0 -- > lstat("/tmp/os/gnu/store/57xj5gcy1jbl9ai2lnrqnpr0dald9i65-coreutils-8.32/= share/man/man1/shred.1.gz", > {st_mode=3DS_IFREG|0444, st_size=3D1490, ...}) =3D 0 openat(AT_FDCWD, > "/tmp/os/gnu/store/57xj5gcy1jbl9ai2lnrqnpr0dald9i65-coreutils-8.32/share/= man/man1/shred.1.gz", > O_RDONLY) =3D 17 lseek(17, 0, SEEK_CUR) =3D 0 read(17, > "\37\213\10\0\0\0\0\0\2\3\215VMs\3336\20\275\363Wluh\354\251L%vg\322:M"..= ., > 1490) =3D 1490 -- > lstat("/tmp/os/gnu/store/57xj5gcy1jbl9ai2lnrqnpr0dald9i65-coreutils-8.32/= share/man/man1/shred.1.gz", > {st_mode=3DS_IFREG|0444, st_size=3D1490, ...}) =3D 0 openat(AT_FDCWD, > "/tmp/os/gnu/store/57xj5gcy1jbl9ai2lnrqnpr0dald9i65-coreutils-8.32/share/= man/man1/shred.1.gz", > O_RDONLY) =3D 17 lseek(17, 0, SEEK_CUR) =3D 0 read(17, > "\37\213\10\0\0\0\0\0\2\3\215VMs\3336\20\275\363Wluh\354\251L%vg\322:M"..= ., > 1490) =3D 1490 -- > link("/tmp/os/gnu/store/57xj5gcy1jbl9ai2lnrqnpr0dald9i65-coreutils-8.32/s= hare/man/man1/shred.1.gz", > "/tmp/os/gnu/store/.links/0w0qcs5lp36i89yry91r2ixlghihzf0vc56bpd9yylj342g= v82xl") > =3D 0 > lstat("/tmp/os/gnu/store/57xj5gcy1jbl9ai2lnrqnpr0dald9i65-coreutils-8.32/= share/man/man1/shuf.1.gz", > {st_mode=3DS_IFREG|0444, st_size=3D972, ...}) =3D 0 openat(AT_FDCWD, > "/tmp/os/gnu/store/57xj5gcy1jbl9ai2lnrqnpr0dald9i65-coreutils-8.32/share/= man/man1/shuf.1.gz", > O_RDONLY) =3D 17 --8<---------------cut > here---------------end--------------->8--- >=20 > First, /tmp/os/=E2=80=A6/shred.1.gz is read entirely twice: once in > =E2=80=98register-items=E2=80=99 (in the =E2=80=98nar-sha256=E2=80=99 cal= l) to compute its hash, and a > second time for deduplication (the =E2=80=98deduplicate=E2=80=99 call in = there.) >=20 > The =E2=80=98nar-sha256=E2=80=99 call could be avoided because the databa= se of > /gnu/store contains that value. As for deduplication, we could > perhaps create those =E2=80=98.links=E2=80=99 entries as we copy files in= stead of > re-traversing the whole thing afterwards. >=20 > Second, all of /tmp/os is traversed to reset timestamps, although we > could have cleared those timestamps when we created those files in the > first place ( prevents that though, > unless we keep a bug-fixed copy of =E2=80=98copy-recursively=E2=80=99 in = there.) >=20 > Third, in the case of the installer, we=E2=80=99re really copying from > /mnt/guix-inst/store to /mnt/gnu/store, which is likely the same > device. In this case we could create hard links instead of actually > copying files. >=20 > Fourth, we=E2=80=99re adding items one by one in the target store databas= e, > but it may be more efficient to more or less dump the subset of the > source database in bulk. >=20 > Surely we can do better. >=20 > Ludo=E2=80=99. >=20 >=20 >=20 Also, if a store is already present (eg.: because of a partial install), it could make sense to (optionally) keep its contents. AFAIK this is still not possible. It was one the bigger time sinks while I was working on the F2FS support. From debbugs-submit-bounces@debbugs.gnu.org Sun Nov 22 16:10:28 2020 Received: (at 44760) by debbugs.gnu.org; 22 Nov 2020 21:10:28 +0000 Received: from localhost ([127.0.0.1]:48425 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1kgwct-0007c2-Py for submit@debbugs.gnu.org; Sun, 22 Nov 2020 16:10:28 -0500 Received: from eggs.gnu.org ([209.51.188.92]:33150) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1kgwcp-0007bn-Gq for 44760@debbugs.gnu.org; Sun, 22 Nov 2020 16:10:27 -0500 Received: from fencepost.gnu.org ([2001:470:142:3::e]:52179) by eggs.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1kgwck-0002q2-Ab; Sun, 22 Nov 2020 16:10:18 -0500 Received: from [2a01:e0a:1d:7270:af76:b9b:ca24:c465] (port=46738 helo=ribbon) by fencepost.gnu.org with esmtpsa (TLS1.2:RSA_AES_256_CBC_SHA1:256) (Exim 4.82) (envelope-from ) id 1kgwcj-00013y-V1; Sun, 22 Nov 2020 16:10:18 -0500 From: =?utf-8?Q?Ludovic_Court=C3=A8s?= To: raingloom Subject: Re: bug#44760: Closure copy in =?utf-8?Q?=E2=80=98guix?= system =?utf-8?Q?init=E2=80=99?= is inefficient References: <87h7pkffzy.fsf@inria.fr> <20201122204634.2730df12@riseup.net> X-URL: http://www.fdn.fr/~lcourtes/ X-Revolutionary-Date: 2 Frimaire an 229 de la =?utf-8?Q?R=C3=A9volution?= X-PGP-Key-ID: 0x090B11993D9AEBB5 X-PGP-Key: http://www.fdn.fr/~lcourtes/ludovic.asc X-PGP-Fingerprint: 3CE4 6455 8A84 FDC6 9DB4 0CFB 090B 1199 3D9A EBB5 X-OS: x86_64-pc-linux-gnu Date: Sun, 22 Nov 2020 22:10:16 +0100 In-Reply-To: <20201122204634.2730df12@riseup.net> (raingloom@riseup.net's message of "Sun, 22 Nov 2020 20:46:34 +0100") Message-ID: <87v9dxkshz.fsf@gnu.org> User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/27.1 (gnu/linux) MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable X-Spam-Score: -2.3 (--) X-Debbugs-Envelope-To: 44760 Cc: 44760@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -3.3 (---) Hi, raingloom skribis: > Also, if a store is already present (eg.: because of a partial > install), it could make sense to (optionally) keep its contents. AFAIK > this is still not possible. It was one the bigger time sinks while I > was working on the F2FS support. It=E2=80=99s on purpose that =E2=80=98guix system init=E2=80=99 forcefully = wipes store items already present before copying them; some time ago it was not the case and that led to problems: https://issues.guix.gnu.org/20722 I think it=E2=80=99s safer to keep it unchanged. Ludo=E2=80=99. From debbugs-submit-bounces@debbugs.gnu.org Fri Dec 11 10:09:40 2020 Received: (at 44760) by debbugs.gnu.org; 11 Dec 2020 15:09:40 +0000 Received: from localhost ([127.0.0.1]:42559 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1knk3A-0007gk-HW for submit@debbugs.gnu.org; Fri, 11 Dec 2020 10:09:40 -0500 Received: from eggs.gnu.org ([209.51.188.92]:56534) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1knk38-0007gM-MP for 44760@debbugs.gnu.org; Fri, 11 Dec 2020 10:09:39 -0500 Received: from fencepost.gnu.org ([2001:470:142:3::e]:47525) by eggs.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1knk32-0000S7-Oz; Fri, 11 Dec 2020 10:09:32 -0500 Received: from [2a01:e0a:1d:7270:af76:b9b:ca24:c465] (port=59324 helo=gnu.org) by fencepost.gnu.org with esmtpsa (TLS1.2:DHE_RSA_AES_256_CBC_SHA1:256) (Exim 4.82) (envelope-from ) id 1knk30-0004Tz-Tn; Fri, 11 Dec 2020 10:09:32 -0500 From: =?UTF-8?q?Ludovic=20Court=C3=A8s?= To: 44760@debbugs.gnu.org Subject: [PATCH 00/15] Speed up 'guix system init' & co. Date: Fri, 11 Dec 2020 16:09:10 +0100 Message-Id: <20201211150919.18435-1-ludo@gnu.org> X-Mailer: git-send-email 2.29.2 In-Reply-To: <87h7pkffzy.fsf@inria.fr> References: <87h7pkffzy.fsf@inria.fr> MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-Spam-Score: -2.3 (--) X-Debbugs-Envelope-To: 44760 Cc: =?UTF-8?q?Ludovic=20Court=C3=A8s?= X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -3.3 (---) Hi there! Here’s a long and rather boring patch series to address and a bit more. To avoid traversing store items repeatedly as described in the issue above, the strategy here is to gradually move the reset-timestamps and deduplicate phases as part of the file copying process, such that each file is accessed only once. Consequently, the kitchen sink that ‘register-items’ once was is now very focused. Furthermore, it changes ‘guix system init’ so that it reuses the already-known store item hashes when populating the target database instead of re-traversing store items. On my laptop (SSD, warm cache, derivations already built), the command: guix system init gnu/system/examples/bare-bones.tmpl /tmp/sys goes from 32s to 22s, a 33% improvement. Feedback welcome! Ludo’. Ludovic Courtès (15): serialization: 'fold-archive' notifies about directory processing completion. serialization: 'restore-file' sets canonical timestamp and permissions. nar: Deduplicate files right as they are restored. store-copy: 'populate-store' resets timestamps. image: 'register-closure' assumes already-reset timestamps. database: Remove #:reset-timestamps? from 'register-items'. store-copy: 'populate-store' can optionally deduplicate files. image: 'register-closure' leaves it up to the caller to deduplicate. database: Remove #:deduplicate? from 'register-items'. guix system: 'init' copies, resets timestamps, and deduplicates at once. database: Remove #:deduplicate? and #:reset-timestamps? from 'register-path'. system: 'init' does not recompute the hash of each store item. database: Remove 'register-path'. database: Honor 'SOURCE_DATE_EPOCH'. deduplicate: Create the '.links' directory lazily. .dir-locals.el | 1 + gnu/build/image.scm | 16 +- gnu/build/install.scm | 3 +- gnu/build/linux-initrd.scm | 3 +- gnu/build/vm.scm | 14 +- gnu/system/install.scm | 12 +- gnu/system/linux-initrd.scm | 10 +- guix/build/store-copy.scm | 133 ++++++++++++---- guix/nar.scm | 8 +- guix/scripts/archive.scm | 2 + guix/scripts/challenge.scm | 1 + guix/scripts/pack.scm | 276 +++++++++++++++++----------------- guix/scripts/system.scm | 64 ++++---- guix/serialization.scm | 36 +++-- guix/store/database.scm | 58 ++----- guix/store/deduplication.scm | 167 ++++++++++++++------ tests/gexp.scm | 20 ++- tests/guix-archive.sh | 4 +- tests/nar.scm | 21 ++- tests/store-database.scm | 18 ++- tests/store-deduplication.scm | 20 ++- 21 files changed, 544 insertions(+), 343 deletions(-) -- 2.29.2 From debbugs-submit-bounces@debbugs.gnu.org Fri Dec 11 10:09:41 2020 Received: (at 44760) by debbugs.gnu.org; 11 Dec 2020 15:09:41 +0000 Received: from localhost ([127.0.0.1]:42561 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1knk3A-0007gn-SA for submit@debbugs.gnu.org; Fri, 11 Dec 2020 10:09:41 -0500 Received: from eggs.gnu.org ([209.51.188.92]:56542) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1knk39-0007gO-HK for 44760@debbugs.gnu.org; Fri, 11 Dec 2020 10:09:39 -0500 Received: from fencepost.gnu.org ([2001:470:142:3::e]:47528) by eggs.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1knk34-0000Sc-Db; Fri, 11 Dec 2020 10:09:34 -0500 Received: from [2a01:e0a:1d:7270:af76:b9b:ca24:c465] (port=59324 helo=gnu.org) by fencepost.gnu.org with esmtpsa (TLS1.2:DHE_RSA_AES_256_CBC_SHA1:256) (Exim 4.82) (envelope-from ) id 1knk33-0004Tz-2B; Fri, 11 Dec 2020 10:09:33 -0500 From: =?UTF-8?q?Ludovic=20Court=C3=A8s?= To: 44760@debbugs.gnu.org Subject: [PATCH 01/15] serialization: 'fold-archive' notifies about directory processing completion. Date: Fri, 11 Dec 2020 16:09:11 +0100 Message-Id: <20201211150919.18435-2-ludo@gnu.org> X-Mailer: git-send-email 2.29.2 In-Reply-To: <20201211150919.18435-1-ludo@gnu.org> References: <87h7pkffzy.fsf@inria.fr> <20201211150919.18435-1-ludo@gnu.org> MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-Spam-Score: -2.3 (--) X-Debbugs-Envelope-To: 44760 Cc: =?UTF-8?q?Ludovic=20Court=C3=A8s?= X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -3.3 (---) * guix/serialization.scm (fold-archive): Call PROC with a 'directory-complete tag when done with a directory. (restore-file): Handle it. * guix/scripts/archive.scm (list-contents): Likewise. * guix/scripts/challenge.scm (archive-contents): Likewise. * tests/nar.scm ("write-file-tree + fold-archive"): Adjust accordingly. --- guix/scripts/archive.scm | 2 ++ guix/scripts/challenge.scm | 1 + guix/serialization.scm | 5 ++++- tests/nar.scm | 6 ++++-- 4 files changed, 11 insertions(+), 3 deletions(-) diff --git a/guix/scripts/archive.scm b/guix/scripts/archive.scm index c04baf9784..1f73fff711 100644 --- a/guix/scripts/archive.scm +++ b/guix/scripts/archive.scm @@ -347,6 +347,8 @@ output port." (match type ('directory (format #t "D ~a~%" file)) + ('directory-complete + #t) ('symlink (format #t "S ~a -> ~a~%" file content)) ((or 'regular 'executable) diff --git a/guix/scripts/challenge.scm b/guix/scripts/challenge.scm index 39bd2c1c0f..d0a456ac1d 100644 --- a/guix/scripts/challenge.scm +++ b/guix/scripts/challenge.scm @@ -210,6 +210,7 @@ taken since we do not import the archives." (cons `(,file ,type ,(port-sha256* port size)) result)))) ('directory result) + ('directory-complete result) ('symlink (cons `(,file ,type ,contents) result)))) '() diff --git a/guix/serialization.scm b/guix/serialization.scm index 836ad06caf..cc56134ef4 100644 --- a/guix/serialization.scm +++ b/guix/serialization.scm @@ -444,7 +444,8 @@ depends on TYPE." (file file) (token x)))))) (loop (read-string port) result))))) - (")" result) ;done with DIR + (")" ;done with DIR + (proc file 'directory-complete #f result)) (x (raise (condition @@ -463,6 +464,8 @@ Restore it as FILE." (match type ('directory (mkdir file)) + ('directory-complete + #t) ('symlink (symlink content file)) ((or 'regular 'executable) diff --git a/tests/nar.scm b/tests/nar.scm index aeff3d3330..b542ebd47c 100644 --- a/tests/nar.scm +++ b/tests/nar.scm @@ -1,5 +1,5 @@ ;;; GNU Guix --- Functional package management for GNU -;;; Copyright © 2012, 2013, 2014, 2015, 2016, 2017, 2018, 2019 Ludovic Courtès +;;; Copyright © 2012, 2013, 2014, 2015, 2016, 2017, 2018, 2019, 2020 Ludovic Courtès ;;; ;;; This file is part of GNU Guix. ;;; @@ -218,8 +218,10 @@ '(("R" directory #f) ("R/dir" directory #f) ("R/dir/exe" executable "1234") + ("R/dir" directory-complete #f) ("R/foo" regular "abcdefg") - ("R/lnk" symlink "foo")) + ("R/lnk" symlink "foo") + ("R" directory-complete #f)) (let () (define-values (port get-bytevector) -- 2.29.2 From debbugs-submit-bounces@debbugs.gnu.org Fri Dec 11 10:09:46 2020 Received: (at 44760) by debbugs.gnu.org; 11 Dec 2020 15:09:46 +0000 Received: from localhost ([127.0.0.1]:42569 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1knk3G-0007hW-72 for submit@debbugs.gnu.org; Fri, 11 Dec 2020 10:09:46 -0500 Received: from eggs.gnu.org ([209.51.188.92]:56552) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1knk3B-0007gQ-3G for 44760@debbugs.gnu.org; Fri, 11 Dec 2020 10:09:41 -0500 Received: from fencepost.gnu.org ([2001:470:142:3::e]:47529) by eggs.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1knk35-0000Sk-LP; Fri, 11 Dec 2020 10:09:35 -0500 Received: from [2a01:e0a:1d:7270:af76:b9b:ca24:c465] (port=59324 helo=gnu.org) by fencepost.gnu.org with esmtpsa (TLS1.2:DHE_RSA_AES_256_CBC_SHA1:256) (Exim 4.82) (envelope-from ) id 1knk34-0004Tz-Nf; Fri, 11 Dec 2020 10:09:35 -0500 From: =?UTF-8?q?Ludovic=20Court=C3=A8s?= To: 44760@debbugs.gnu.org Subject: [PATCH 02/15] serialization: 'restore-file' sets canonical timestamp and permissions. Date: Fri, 11 Dec 2020 16:09:12 +0100 Message-Id: <20201211150919.18435-3-ludo@gnu.org> X-Mailer: git-send-email 2.29.2 In-Reply-To: <20201211150919.18435-1-ludo@gnu.org> References: <87h7pkffzy.fsf@inria.fr> <20201211150919.18435-1-ludo@gnu.org> MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-Spam-Score: -2.3 (--) X-Debbugs-Envelope-To: 44760 Cc: =?UTF-8?q?Ludovic=20Court=C3=A8s?= X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -3.3 (---) * guix/serialization.scm (restore-file): Set the permissions and mtime of FILE. * guix/nar.scm (finalize-store-file): Pass #:reset-timestamps? #f to 'register-items'. * tests/nar.scm (rm-rf): Add 'chmod' calls to ensure files are writable. ("write-file + restore-file with symlinks"): Ensure every file in OUTPUT passes 'canonical-file?'. * tests/guix-archive.sh: Run "chmod -R +w" before "rm -rf". --- guix/nar.scm | 8 +++++--- guix/serialization.scm | 14 +++++++++----- tests/guix-archive.sh | 4 ++-- tests/nar.scm | 12 ++++++++++-- 4 files changed, 26 insertions(+), 12 deletions(-) diff --git a/guix/nar.scm b/guix/nar.scm index a23af2e5de..edfcc9aab5 100644 --- a/guix/nar.scm +++ b/guix/nar.scm @@ -114,10 +114,12 @@ held." ;; Install the new TARGET. (rename-file source target) - ;; Register TARGET. As a side effect, it resets the timestamps of all - ;; its files, recursively, and runs a deduplication pass. + ;; Register TARGET. As a side effect, run a deduplication pass. + ;; Timestamps and permissions are already correct thanks to + ;; 'restore-file'. (register-items db - (list (store-info target deriver references)))) + (list (store-info target deriver references)) + #:reset-timestamps? #f)) (when lock? (delete-file (string-append target ".lock")) diff --git a/guix/serialization.scm b/guix/serialization.scm index cc56134ef4..677ca60b66 100644 --- a/guix/serialization.scm +++ b/guix/serialization.scm @@ -459,23 +459,27 @@ depends on TYPE." (define (restore-file port file) "Read a file (possibly a directory structure) in Nar format from PORT. -Restore it as FILE." +Restore it as FILE with canonical permissions and timestamps." (fold-archive (lambda (file type content result) (match type ('directory (mkdir file)) ('directory-complete - #t) + (chmod file #o555) + (utime file 1 1 0 0)) ('symlink - (symlink content file)) + (symlink content file) + (utime file 1 1 0 0 AT_SYMLINK_NOFOLLOW)) ((or 'regular 'executable) (match content ((input . size) (call-with-output-file file (lambda (output) (dump input output size) - (when (eq? type 'executable) - (chmod output #o755))))))))) + (chmod output (if (eq? type 'executable) + #o555 + #o444)))) + (utime file 1 1 0 0)))))) #t port file)) diff --git a/tests/guix-archive.sh b/tests/guix-archive.sh index e796c62f9a..00b87ff0ac 100644 --- a/tests/guix-archive.sh +++ b/tests/guix-archive.sh @@ -1,5 +1,5 @@ # GNU Guix --- Functional package management for GNU -# Copyright © 2013, 2014, 2015, 2019 Ludovic Courtès +# Copyright © 2013, 2014, 2015, 2019, 2020 Ludovic Courtès # # This file is part of GNU Guix. # @@ -28,7 +28,7 @@ tmpdir="t-archive-dir-$$" rm -f "$archive" "$archive_alt" rm -rf "$tmpdir" -trap 'rm -f "$archive" "$archive_alt"; rm -rf "$tmpdir"' EXIT +trap 'rm -f "$archive" "$archive_alt"; chmod -R +w "$tmpdir"; rm -rf "$tmpdir"' EXIT guix archive --export guile-bootstrap > "$archive" guix archive --export guile-bootstrap:out > "$archive_alt" diff --git a/tests/nar.scm b/tests/nar.scm index b542ebd47c..59616659c8 100644 --- a/tests/nar.scm +++ b/tests/nar.scm @@ -136,8 +136,11 @@ (define (rm-rf dir) (file-system-fold (const #t) ; enter? (lambda (file stat result) ; leaf + (unless (eq? 'symlink (stat:type stat)) + (chmod file #o644)) (delete-file file)) - (const #t) ; down + (lambda (dir stat result) ; down + (chmod dir #o755)) (lambda (dir stat result) ; up (rmdir dir)) (const #t) ; skip @@ -363,7 +366,12 @@ (cut write-file input <>)) (call-with-input-file nar (cut restore-file <> output)) - (file-tree-equal? input output)) + + (and (file-tree-equal? input output) + (every (lambda (file) + (canonical-file? + (string-append output "/" file))) + '("root" "root/reg" "root/exe")))) (lambda () (false-if-exception (delete-file nar)) (false-if-exception (rm-rf output))))))) -- 2.29.2 From debbugs-submit-bounces@debbugs.gnu.org Fri Dec 11 10:09:47 2020 Received: (at 44760) by debbugs.gnu.org; 11 Dec 2020 15:09:47 +0000 Received: from localhost ([127.0.0.1]:42572 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1knk3G-0007hY-Ij for submit@debbugs.gnu.org; Fri, 11 Dec 2020 10:09:47 -0500 Received: from eggs.gnu.org ([209.51.188.92]:56558) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1knk3C-0007gT-C9 for 44760@debbugs.gnu.org; Fri, 11 Dec 2020 10:09:42 -0500 Received: from fencepost.gnu.org ([2001:470:142:3::e]:47530) by eggs.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1knk37-0000Td-8V; Fri, 11 Dec 2020 10:09:37 -0500 Received: from [2a01:e0a:1d:7270:af76:b9b:ca24:c465] (port=59324 helo=gnu.org) by fencepost.gnu.org with esmtpsa (TLS1.2:DHE_RSA_AES_256_CBC_SHA1:256) (Exim 4.82) (envelope-from ) id 1knk35-0004Tz-TC; Fri, 11 Dec 2020 10:09:36 -0500 From: =?UTF-8?q?Ludovic=20Court=C3=A8s?= To: 44760@debbugs.gnu.org Subject: [PATCH 03/15] nar: Deduplicate files right as they are restored. Date: Fri, 11 Dec 2020 16:09:13 +0100 Message-Id: <20201211150919.18435-4-ludo@gnu.org> X-Mailer: git-send-email 2.29.2 In-Reply-To: <20201211150919.18435-1-ludo@gnu.org> References: <87h7pkffzy.fsf@inria.fr> <20201211150919.18435-1-ludo@gnu.org> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Spam-Score: -2.3 (--) X-Debbugs-Envelope-To: 44760 Cc: =?UTF-8?q?Ludovic=20Court=C3=A8s?= X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -3.3 (---) This avoids having to traverse and re-read the files that we have just restored, thereby reducing I/O. * guix/serialization.scm (dump-file): New procedure. (restore-file): Add #:dump-file parameter and honor it. * guix/store/deduplication.scm (tee, dump-file/deduplicate): New procedures. * guix/nar.scm (restore-one-item): Pass #:dump-file to 'restore-file'. (finalize-store-file): Pass #:deduplicate? #f to 'register-items'. * tests/nar.scm : Call 'setenv' to set "NIX_STORE". --- guix/nar.scm | 12 ++++---- guix/serialization.scm | 27 ++++++++++++----- guix/store/deduplication.scm | 57 +++++++++++++++++++++++++++++++++++- tests/nar.scm | 3 ++ 4 files changed, 85 insertions(+), 14 deletions(-) diff --git a/guix/nar.scm b/guix/nar.scm index edfcc9aab5..ba035ca6dc 100644 --- a/guix/nar.scm +++ b/guix/nar.scm @@ -27,6 +27,7 @@ ;; (guix store) since this is "daemon-side" code. #:use-module (guix store) #:use-module (guix store database) + #:use-module ((guix store deduplication) #:select (dump-file/deduplicate)) #:use-module ((guix build store-copy) #:select (store-info)) #:use-module (guix i18n) @@ -114,12 +115,12 @@ held." ;; Install the new TARGET. (rename-file source target) - ;; Register TARGET. As a side effect, run a deduplication pass. - ;; Timestamps and permissions are already correct thanks to - ;; 'restore-file'. + ;; Register TARGET. The 'restore-file' call took care of + ;; deduplication, timestamps, and permissions. (register-items db (list (store-info target deriver references)) - #:reset-timestamps? #f)) + #:reset-timestamps? #f + #:deduplicate? #f)) (when lock? (delete-file (string-append target ".lock")) @@ -212,7 +213,8 @@ s-expression")) (let-values (((port get-hash) (open-sha256-input-port port))) (with-temporary-store-file temp - (restore-file port temp) + (restore-file port temp + #:dump-file dump-file/deduplicate) (let ((magic (read-int port))) (unless (= magic %export-magic) diff --git a/guix/serialization.scm b/guix/serialization.scm index 677ca60b66..9e2dce8bb0 100644 --- a/guix/serialization.scm +++ b/guix/serialization.scm @@ -457,9 +457,22 @@ depends on TYPE." (&message (message "unsupported nar entry type")) (&nar-read-error (port port) (file file) (token x))))))))) -(define (restore-file port file) +(define (dump-file file input size type) + "Dump SIZE bytes from INPUT to FILE." + (call-with-output-file file + (lambda (output) + (dump input output size)))) + +(define* (restore-file port file + #:key (dump-file dump-file)) "Read a file (possibly a directory structure) in Nar format from PORT. -Restore it as FILE with canonical permissions and timestamps." +Restore it as FILE with canonical permissions and timestamps. To write a +regular or executable file, call: + + (DUMP-FILE FILE INPUT SIZE TYPE) + +The default is to dump SIZE bytes from INPUT to FILE, but callers can provide +a custom procedure, for instance to deduplicate FILE on the fly." (fold-archive (lambda (file type content result) (match type ('directory @@ -473,12 +486,10 @@ Restore it as FILE with canonical permissions and timestamps." ((or 'regular 'executable) (match content ((input . size) - (call-with-output-file file - (lambda (output) - (dump input output size) - (chmod output (if (eq? type 'executable) - #o555 - #o444)))) + (dump-file file input size type) + (chmod file (if (eq? type 'executable) + #o555 + #o444)) (utime file 1 1 0 0)))))) #t port diff --git a/guix/store/deduplication.scm b/guix/store/deduplication.scm index 0655ceb890..b4d37d4525 100644 --- a/guix/store/deduplication.scm +++ b/guix/store/deduplication.scm @@ -26,12 +26,15 @@ #:use-module (guix build syscalls) #:use-module (guix base32) #:use-module (srfi srfi-11) + #:use-module (srfi srfi-34) + #:use-module (srfi srfi-35) #:use-module (rnrs io ports) #:use-module (ice-9 ftw) #:use-module (ice-9 match) #:use-module (guix serialization) #:export (nar-sha256 - deduplicate)) + deduplicate + dump-file/deduplicate)) ;; XXX: This port is used as a workaround on Guile <= 2.2.4 where ;; 'port-position' throws to 'out-of-range' when the offset is great than or @@ -201,3 +204,55 @@ under STORE." ;; that's OK: we just can't deduplicate it more. #f) (else (apply throw args))))))))))) + +(define (tee input len output) + "Return a port that reads up to LEN bytes from INPUT and writes them to +OUTPUT as it goes." + (define bytes-read 0) + + (define (fail) + ;; Reached EOF before we had read LEN bytes from INPUT. + (raise (condition + (&nar-error (port input) + (file (port-filename output)))))) + + (define (read! bv start count) + ;; Read at most LEN bytes in total. + (let ((count (min count (- len bytes-read)))) + (let loop ((ret (get-bytevector-n! input bv start count))) + (cond ((eof-object? ret) + (if (= bytes-read len) + 0 ; EOF + (fail))) + ((and (zero? ret) (> count 0)) + ;; Do not return zero since zero means EOF, so try again. + (loop (get-bytevector-n! input bv start count))) + (else + (put-bytevector output bv start ret) + (set! bytes-read (+ bytes-read ret)) + ret))))) + + (make-custom-binary-input-port "tee input port" read! #f #f #f)) + +(define* (dump-file/deduplicate file input size type + #:key (store (%store-directory))) + "Write SIZE bytes read from INPUT to FILE. TYPE is a symbol, either +'regular or 'executable. + +This procedure is suitable as a #:dump-file argument to 'restore-file'. When +used that way, it deduplicates files on the fly as they are restored, thereby +removing the need to a deduplication pass that would re-read all the files +down the road." + (define hash + (call-with-output-file file + (lambda (output) + (let-values (((hash-port get-hash) + (open-hash-port (hash-algorithm sha256)))) + (write-file-tree file hash-port + #:file-type+size (lambda (_) (values type size)) + #:file-port + (const (tee input size output))) + (close-port hash-port) + (get-hash))))) + + (deduplicate file hash #:store store)) diff --git a/tests/nar.scm b/tests/nar.scm index 59616659c8..ba4881caaa 100644 --- a/tests/nar.scm +++ b/tests/nar.scm @@ -452,6 +452,9 @@ (false-if-exception (rm-rf %test-dir)) (setlocale LC_ALL locale))))) +;; XXX: Tell the 'deduplicate' procedure what store we're actually using. +(setenv "NIX_STORE" (%store-prefix)) + (test-assert "restore-file-set (signed, valid)" (with-store store (let* ((texts (unfold (cut >= <> 10) -- 2.29.2 From debbugs-submit-bounces@debbugs.gnu.org Fri Dec 11 10:09:48 2020 Received: (at 44760) by debbugs.gnu.org; 11 Dec 2020 15:09:48 +0000 Received: from localhost ([127.0.0.1]:42574 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1knk3H-0007hs-EL for submit@debbugs.gnu.org; Fri, 11 Dec 2020 10:09:48 -0500 Received: from eggs.gnu.org ([209.51.188.92]:56564) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1knk3D-0007gV-5T for 44760@debbugs.gnu.org; Fri, 11 Dec 2020 10:09:43 -0500 Received: from fencepost.gnu.org ([2001:470:142:3::e]:47531) by eggs.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1knk38-0000Tj-1x; Fri, 11 Dec 2020 10:09:38 -0500 Received: from [2a01:e0a:1d:7270:af76:b9b:ca24:c465] (port=59324 helo=gnu.org) by fencepost.gnu.org with esmtpsa (TLS1.2:DHE_RSA_AES_256_CBC_SHA1:256) (Exim 4.82) (envelope-from ) id 1knk37-0004Tz-Io; Fri, 11 Dec 2020 10:09:37 -0500 From: =?UTF-8?q?Ludovic=20Court=C3=A8s?= To: 44760@debbugs.gnu.org Subject: [PATCH 04/15] store-copy: 'populate-store' resets timestamps. Date: Fri, 11 Dec 2020 16:09:14 +0100 Message-Id: <20201211150919.18435-5-ludo@gnu.org> X-Mailer: git-send-email 2.29.2 In-Reply-To: <20201211150919.18435-1-ludo@gnu.org> References: <87h7pkffzy.fsf@inria.fr> <20201211150919.18435-1-ludo@gnu.org> MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-Spam-Score: -2.3 (--) X-Debbugs-Envelope-To: 44760 Cc: =?UTF-8?q?Ludovic=20Court=C3=A8s?= X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -3.3 (---) Until now, 'populate-store' would reset permissions but not timestamps, so callers would resort to going through an extra directory traversal to reset timestamps. * guix/build/store-copy.scm (reset-permissions): Remove. (copy-recursively): New procedure. (populate-store): Pass #:keep-permissions? to 'copy-recursively'. Remove call to 'reset-permissions'. * tests/gexp.scm ("gexp->derivation, store copy"): In BUILD-DRV, check whether 'populate-store' canonicalizes permissions and timestamps. * gnu/build/image.scm (initialize-root-partition): Pass #:reset-timestamps? #f to 'register-closure'. * gnu/build/vm.scm (root-partition-initializer): Likewise. --- gnu/build/image.scm | 5 +- gnu/build/vm.scm | 2 +- guix/build/store-copy.scm | 103 +++++++++++++++++++++++++++----------- tests/gexp.scm | 19 ++++++- 4 files changed, 95 insertions(+), 34 deletions(-) diff --git a/gnu/build/image.scm b/gnu/build/image.scm index 640a784204..2857362914 100644 --- a/gnu/build/image.scm +++ b/gnu/build/image.scm @@ -196,9 +196,8 @@ register-closure." (when register-closures? (for-each (lambda (closure) - (register-closure root - closure - #:reset-timestamps? #t + (register-closure root closure + #:reset-timestamps? #f #:deduplicate? deduplicate? #:wal-mode? wal-mode?)) references-graphs)) diff --git a/gnu/build/vm.scm b/gnu/build/vm.scm index 287d099f79..30feaf800f 100644 --- a/gnu/build/vm.scm +++ b/gnu/build/vm.scm @@ -414,7 +414,7 @@ system that is passed to 'populate-root-file-system'." (for-each (lambda (closure) (register-closure target (string-append "/xchg/" closure) - #:reset-timestamps? copy-closures? + #:reset-timestamps? #f #:deduplicate? deduplicate?)) closures) (unless copy-closures? diff --git a/guix/build/store-copy.scm b/guix/build/store-copy.scm index ad551bca98..95dcb8e114 100644 --- a/guix/build/store-copy.scm +++ b/guix/build/store-copy.scm @@ -1,5 +1,5 @@ ;;; GNU Guix --- Functional package management for GNU -;;; Copyright © 2013, 2014, 2017, 2018 Ludovic Courtès +;;; Copyright © 2013, 2014, 2017, 2018, 2020 Ludovic Courtès ;;; ;;; This file is part of GNU Guix. ;;; @@ -17,7 +17,7 @@ ;;; along with GNU Guix. If not, see . (define-module (guix build store-copy) - #:use-module (guix build utils) + #:use-module ((guix build utils) #:hide (copy-recursively)) #:use-module (guix sets) #:use-module (guix progress) #:use-module (srfi srfi-1) @@ -169,32 +169,83 @@ REFERENCE-GRAPHS, a list of reference-graph files." (reduce + 0 (map file-size items))) -(define (reset-permissions file) - "Reset the permissions on FILE and its sub-directories so that they are all -read-only." - ;; XXX: This procedure exists just to work around the inability of - ;; 'copy-recursively' to preserve permissions. - (file-system-fold (const #t) ;enter? - (lambda (file stat _) ;leaf - (unless (eq? 'symlink (stat:type stat)) - (chmod file - (if (zero? (logand (stat:mode stat) - #o100)) - #o444 - #o555)))) - (const #t) ;down - (lambda (directory stat _) ;up - (chmod directory #o555)) - (const #f) ;skip - (const #f) ;error +;; TODO: Remove when the one in (guix build utils) has #:keep-permissions?, +;; the fix for , and when #:keep-mtime? works for +;; symlinks. +(define* (copy-recursively source destination + #:key + (log (current-output-port)) + (follow-symlinks? #f) + (copy-file copy-file) + keep-mtime? keep-permissions?) + "Copy SOURCE directory to DESTINATION. Follow symlinks if FOLLOW-SYMLINKS? +is true; otherwise, just preserve them. Call COPY-FILE to copy regular files. +When KEEP-MTIME? is true, keep the modification time of the files in SOURCE on +those of DESTINATION. When KEEP-PERMISSIONS? is true, preserve file +permissions. Write verbose output to the LOG port." + (define AT_SYMLINK_NOFOLLOW + ;; Guile 2.0 did not define this constant, hence this hack. + (let ((variable (module-variable the-root-module 'AT_SYMLINK_NOFOLLOW))) + (if variable + (variable-ref variable) + 256))) ;for GNU/Linux + + (define (set-file-time file stat) + (utime file + (stat:atime stat) + (stat:mtime stat) + (stat:atimensec stat) + (stat:mtimensec stat) + AT_SYMLINK_NOFOLLOW)) + + (define strip-source + (let ((len (string-length source))) + (lambda (file) + (substring file len)))) + + (file-system-fold (const #t) ; enter? + (lambda (file stat result) ; leaf + (let ((dest (string-append destination + (strip-source file)))) + (format log "`~a' -> `~a'~%" file dest) + (case (stat:type stat) + ((symlink) + (let ((target (readlink file))) + (symlink target dest))) + (else + (copy-file file dest) + (when keep-permissions? + (chmod dest (stat:perms stat))))) + (when keep-mtime? + (set-file-time dest stat)))) + (lambda (dir stat result) ; down + (let ((target (string-append destination + (strip-source dir)))) + (mkdir-p target))) + (lambda (dir stat result) ; up + (let ((target (string-append destination + (strip-source dir)))) + (when keep-mtime? + (set-file-time target stat)) + (when keep-permissions? + (chmod target (stat:perms stat))))) + (const #t) ; skip + (lambda (file stat errno result) + (format (current-error-port) "i/o error: ~a: ~a~%" + file (strerror errno)) + #f) #t - file - lstat)) + source + + (if follow-symlinks? + stat + lstat))) (define* (populate-store reference-graphs target #:key (log-port (current-error-port))) "Populate the store under directory TARGET with the items specified in -REFERENCE-GRAPHS, a list of reference-graph files." +REFERENCE-GRAPHS, a list of reference-graph files. Items copied to TARGET +maintain timestamps and permissions." (define store (string-append target (%store-directory))) @@ -221,12 +272,8 @@ REFERENCE-GRAPHS, a list of reference-graph files." (copy-recursively thing (string-append target thing) #:keep-mtime? #t + #:keep-permissions? #t #:log (%make-void-port "w")) - - ;; XXX: Since 'copy-recursively' doesn't allow us to - ;; preserve permissions, we have to traverse TARGET to - ;; make sure everything is read-only. - (reset-permissions (string-append target thing)) (report)) things))))) diff --git a/tests/gexp.scm b/tests/gexp.scm index 686334af61..a0e55178fa 100644 --- a/tests/gexp.scm +++ b/tests/gexp.scm @@ -723,10 +723,25 @@ (lambda (port) (display "This is the second one." port)))))) (build-drv #~(begin - (use-modules (guix build store-copy)) + (use-modules (guix build store-copy) + (guix build utils) + (srfi srfi-1)) + + (define (canonical-file? file) + ;; Copied from (guix tests). + (let ((st (lstat file))) + (or (not (string-prefix? (%store-directory) file)) + (eq? 'symlink (stat:type st)) + (and (= 1 (stat:mtime st)) + (zero? (logand #o222 (stat:mode st))))))) (mkdir #$output) - (populate-store '("graph") #$output)))) + (populate-store '("graph") #$output) + + ;; Check whether 'populate-store' canonicalizes + ;; permissions and timestamps. + (unless (every canonical-file? (find-files #$output)) + (error "not canonical!" #$output))))) (mlet* %store-monad ((one (gexp->derivation "one" build-one)) (two (gexp->derivation "two" (build-two one))) (drv (gexp->derivation "store-copy" build-drv -- 2.29.2 From debbugs-submit-bounces@debbugs.gnu.org Fri Dec 11 10:09:48 2020 Received: (at 44760) by debbugs.gnu.org; 11 Dec 2020 15:09:48 +0000 Received: from localhost ([127.0.0.1]:42577 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1knk3I-0007i5-8Q for submit@debbugs.gnu.org; Fri, 11 Dec 2020 10:09:48 -0500 Received: from eggs.gnu.org ([209.51.188.92]:56568) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1knk3D-0007gX-UR for 44760@debbugs.gnu.org; Fri, 11 Dec 2020 10:09:44 -0500 Received: from fencepost.gnu.org ([2001:470:142:3::e]:47532) by eggs.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1knk38-0000U3-Qr; Fri, 11 Dec 2020 10:09:38 -0500 Received: from [2a01:e0a:1d:7270:af76:b9b:ca24:c465] (port=59324 helo=gnu.org) by fencepost.gnu.org with esmtpsa (TLS1.2:DHE_RSA_AES_256_CBC_SHA1:256) (Exim 4.82) (envelope-from ) id 1knk38-0004Tz-Bx; Fri, 11 Dec 2020 10:09:38 -0500 From: =?UTF-8?q?Ludovic=20Court=C3=A8s?= To: 44760@debbugs.gnu.org Subject: [PATCH 05/15] image: 'register-closure' assumes already-reset timestamps. Date: Fri, 11 Dec 2020 16:09:15 +0100 Message-Id: <20201211150919.18435-6-ludo@gnu.org> X-Mailer: git-send-email 2.29.2 In-Reply-To: <20201211150919.18435-1-ludo@gnu.org> References: <87h7pkffzy.fsf@inria.fr> <20201211150919.18435-1-ludo@gnu.org> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Spam-Score: -2.3 (--) X-Debbugs-Envelope-To: 44760 Cc: =?UTF-8?q?Ludovic=20Court=C3=A8s?= X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -3.3 (---) * gnu/build/image.scm (register-closure): Remove #:reset-timestamps? parameter. Pass #:reset-timestamps? #f to 'register-items'. (initialize-root-partition): Adjust accordingly. * gnu/build/vm.scm (register-closure, root-partition-initializer): Likewise. --- gnu/build/image.scm | 8 +++----- gnu/build/vm.scm | 8 +++----- 2 files changed, 6 insertions(+), 10 deletions(-) diff --git a/gnu/build/image.scm b/gnu/build/image.scm index 2857362914..4f80a1964f 100644 --- a/gnu/build/image.scm +++ b/gnu/build/image.scm @@ -140,13 +140,12 @@ given CONFIG file." (define* (register-closure prefix closure #:key - (deduplicate? #t) (reset-timestamps? #t) + (deduplicate? #t) (schema (sql-schema)) (wal-mode? #t)) "Register CLOSURE in PREFIX, where PREFIX is the directory name of the target store and CLOSURE is the name of a file containing a reference graph as -produced by #:references-graphs.. As a side effect, if RESET-TIMESTAMPS? is -true, reset timestamps on store files and, if DEDUPLICATE? is true, +produced by #:references-graphs. As a side effect, if DEDUPLICATE? is true, deduplicates files common to CLOSURE and the rest of PREFIX. Pass WAL-MODE? to call-with-database." (let ((items (call-with-input-file closure read-reference-graph))) @@ -156,7 +155,7 @@ to call-with-database." (register-items db items #:prefix prefix #:deduplicate? deduplicate? - #:reset-timestamps? reset-timestamps? + #:reset-timestamps? #f #:registration-time %epoch))))) (define* (initialize-efi-partition root @@ -197,7 +196,6 @@ register-closure." (when register-closures? (for-each (lambda (closure) (register-closure root closure - #:reset-timestamps? #f #:deduplicate? deduplicate? #:wal-mode? wal-mode?)) references-graphs)) diff --git a/gnu/build/vm.scm b/gnu/build/vm.scm index 30feaf800f..f700e08b25 100644 --- a/gnu/build/vm.scm +++ b/gnu/build/vm.scm @@ -215,12 +215,11 @@ the #:references-graphs parameter of 'derivation'." (define* (register-closure prefix closure #:key - (deduplicate? #t) (reset-timestamps? #t) + (deduplicate? #t) (schema (sql-schema))) "Register CLOSURE in PREFIX, where PREFIX is the directory name of the target store and CLOSURE is the name of a file containing a reference graph as -produced by #:references-graphs.. As a side effect, if RESET-TIMESTAMPS? is -true, reset timestamps on store files and, if DEDUPLICATE? is true, +produced by #:references-graphs. As a side effect, if DEDUPLICATE? is true, deduplicates files common to CLOSURE and the rest of PREFIX." (let ((items (call-with-input-file closure read-reference-graph))) (parameterize ((sql-schema schema)) @@ -228,7 +227,7 @@ deduplicates files common to CLOSURE and the rest of PREFIX." (register-items db items #:prefix prefix #:deduplicate? deduplicate? - #:reset-timestamps? reset-timestamps? + #:reset-timestamps? #f #:registration-time %epoch))))) @@ -414,7 +413,6 @@ system that is passed to 'populate-root-file-system'." (for-each (lambda (closure) (register-closure target (string-append "/xchg/" closure) - #:reset-timestamps? #f #:deduplicate? deduplicate?)) closures) (unless copy-closures? -- 2.29.2 From debbugs-submit-bounces@debbugs.gnu.org Fri Dec 11 10:09:49 2020 Received: (at 44760) by debbugs.gnu.org; 11 Dec 2020 15:09:49 +0000 Received: from localhost ([127.0.0.1]:42579 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1knk3I-0007iC-Mi for submit@debbugs.gnu.org; Fri, 11 Dec 2020 10:09:49 -0500 Received: from eggs.gnu.org ([209.51.188.92]:56572) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1knk3E-0007gd-OH for 44760@debbugs.gnu.org; Fri, 11 Dec 2020 10:09:45 -0500 Received: from fencepost.gnu.org ([2001:470:142:3::e]:47534) by eggs.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1knk39-0000Uk-Jh; Fri, 11 Dec 2020 10:09:39 -0500 Received: from [2a01:e0a:1d:7270:af76:b9b:ca24:c465] (port=59324 helo=gnu.org) by fencepost.gnu.org with esmtpsa (TLS1.2:DHE_RSA_AES_256_CBC_SHA1:256) (Exim 4.82) (envelope-from ) id 1knk39-0004Tz-4f; Fri, 11 Dec 2020 10:09:39 -0500 From: =?UTF-8?q?Ludovic=20Court=C3=A8s?= To: 44760@debbugs.gnu.org Subject: [PATCH 06/15] database: Remove #:reset-timestamps? from 'register-items'. Date: Fri, 11 Dec 2020 16:09:16 +0100 Message-Id: <20201211150919.18435-7-ludo@gnu.org> X-Mailer: git-send-email 2.29.2 In-Reply-To: <20201211150919.18435-1-ludo@gnu.org> References: <87h7pkffzy.fsf@inria.fr> <20201211150919.18435-1-ludo@gnu.org> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Spam-Score: -2.3 (--) X-Debbugs-Envelope-To: 44760 Cc: =?UTF-8?q?Ludovic=20Court=C3=A8s?= X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -3.3 (---) The assumption now is that the caller took care of resetting timestamps and permissions. * guix/store/database.scm (register-items): Remove #:reset-timestamps? parameter and the call to 'reset-timestamps'. (register-path): Adjust accordingly and add call to 'reset-timestamps'. * gnu/build/image.scm (register-closure): Remove #:reset-timestamps? parameter to 'register-items'. * gnu/build/vm.scm (register-closure): Likewise. * guix/nar.scm (finalize-store-file): Adjust accordingly. * guix/scripts/pack.scm (store-database)[build]: Likewise. --- gnu/build/image.scm | 1 - gnu/build/vm.scm | 1 - guix/nar.scm | 1 - guix/scripts/pack.scm | 1 - guix/store/database.scm | 13 ++++++++----- 5 files changed, 8 insertions(+), 9 deletions(-) diff --git a/gnu/build/image.scm b/gnu/build/image.scm index 4f80a1964f..0deea10a9d 100644 --- a/gnu/build/image.scm +++ b/gnu/build/image.scm @@ -155,7 +155,6 @@ to call-with-database." (register-items db items #:prefix prefix #:deduplicate? deduplicate? - #:reset-timestamps? #f #:registration-time %epoch))))) (define* (initialize-efi-partition root diff --git a/gnu/build/vm.scm b/gnu/build/vm.scm index f700e08b25..abb0317faf 100644 --- a/gnu/build/vm.scm +++ b/gnu/build/vm.scm @@ -227,7 +227,6 @@ deduplicates files common to CLOSURE and the rest of PREFIX." (register-items db items #:prefix prefix #:deduplicate? deduplicate? - #:reset-timestamps? #f #:registration-time %epoch))))) diff --git a/guix/nar.scm b/guix/nar.scm index ba035ca6dc..947b393d84 100644 --- a/guix/nar.scm +++ b/guix/nar.scm @@ -119,7 +119,6 @@ held." ;; deduplication, timestamps, and permissions. (register-items db (list (store-info target deriver references)) - #:reset-timestamps? #f #:deduplicate? #f)) (when lock? diff --git a/guix/scripts/pack.scm b/guix/scripts/pack.scm index ba9a6dc1b2..1612ec8f04 100644 --- a/guix/scripts/pack.scm +++ b/guix/scripts/pack.scm @@ -168,7 +168,6 @@ dependencies are registered." (with-database db-file db (register-items db items #:deduplicate? #f - #:reset-timestamps? #f #:registration-time %epoch))))))) (computed-file "store-database" build diff --git a/guix/store/database.scm b/guix/store/database.scm index b36b127630..0ed66a6e2c 100644 --- a/guix/store/database.scm +++ b/guix/store/database.scm @@ -392,7 +392,8 @@ references, and DERIVER as its deriver (.drv that led to it.) If PREFIX is given, it must be the name of the directory containing the new store to initialize; if STATE-DIRECTORY is given, it must be a string containing the absolute file name to the state directory of the store being initialized. -Return #t on success. +Return #t on success. As a side effect, reset timestamps on PATH, unless +RESET-TIMESTAMPS? is false. Use with care as it directly modifies the store! This is primarily meant to be used internally by the daemon's build hook. @@ -403,12 +404,17 @@ by adding it as a temp-root." (store-database-file #:prefix prefix #:state-directory state-directory)) + (define real-file-name + (string-append (or prefix "") path)) + + (when reset-timestamps? + (reset-timestamps real-file-name)) + (parameterize ((sql-schema schema)) (with-database db-file db (register-items db (list (store-info path deriver references)) #:prefix prefix #:deduplicate? deduplicate? - #:reset-timestamps? reset-timestamps? #:log-port (%make-void-port "w"))))) (define %epoch @@ -418,7 +424,6 @@ by adding it as a temp-root." (define* (register-items db items #:key prefix (deduplicate? #t) - (reset-timestamps? #t) registration-time (log-port (current-error-port))) "Register all of ITEMS, a list of records as returned by @@ -452,8 +457,6 @@ typically by adding them as temp-roots." ;; significant differences when 'register-closures' is called ;; consecutively for overlapping closures such as 'system' and 'bootcfg'. (unless (path-id db to-register) - (when reset-timestamps? - (reset-timestamps real-file-name)) (let-values (((hash nar-size) (nar-sha256 real-file-name))) (call-with-retrying-transaction db (lambda () -- 2.29.2 From debbugs-submit-bounces@debbugs.gnu.org Fri Dec 11 10:09:55 2020 Received: (at 44760) by debbugs.gnu.org; 11 Dec 2020 15:09:55 +0000 Received: from localhost ([127.0.0.1]:42581 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1knk3O-0007iT-5u for submit@debbugs.gnu.org; Fri, 11 Dec 2020 10:09:55 -0500 Received: from eggs.gnu.org ([209.51.188.92]:56578) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1knk3F-0007gl-JI for 44760@debbugs.gnu.org; Fri, 11 Dec 2020 10:09:46 -0500 Received: from fencepost.gnu.org ([2001:470:142:3::e]:47535) by eggs.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1knk3A-0000Uv-FR; Fri, 11 Dec 2020 10:09:40 -0500 Received: from [2a01:e0a:1d:7270:af76:b9b:ca24:c465] (port=59324 helo=gnu.org) by fencepost.gnu.org with esmtpsa (TLS1.2:DHE_RSA_AES_256_CBC_SHA1:256) (Exim 4.82) (envelope-from ) id 1knk39-0004Tz-Ti; Fri, 11 Dec 2020 10:09:40 -0500 From: =?UTF-8?q?Ludovic=20Court=C3=A8s?= To: 44760@debbugs.gnu.org Subject: [PATCH 07/15] store-copy: 'populate-store' can optionally deduplicate files. Date: Fri, 11 Dec 2020 16:09:17 +0100 Message-Id: <20201211150919.18435-8-ludo@gnu.org> X-Mailer: git-send-email 2.29.2 In-Reply-To: <20201211150919.18435-1-ludo@gnu.org> References: <87h7pkffzy.fsf@inria.fr> <20201211150919.18435-1-ludo@gnu.org> MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-Spam-Score: -2.3 (--) X-Debbugs-Envelope-To: 44760 Cc: =?UTF-8?q?Ludovic=20Court=C3=A8s?= X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -3.3 (---) Until now deduplication was performed as an additional pass after copying files, which involve re-traversing all the files that had just been copied. * guix/store/deduplication.scm (copy-file/deduplicate): New procedure. * tests/store-deduplication.scm ("copy-file/deduplicate"): New test. * guix/build/store-copy.scm (populate-store): Add #:deduplicate? parameter and honor it. * tests/gexp.scm ("gexp->derivation, store copy"): Pass #:deduplicate? #f to 'populate-store'. * gnu/build/image.scm (initialize-root-partition): Pass #:deduplicate? to 'populate-store'. Pass #:deduplicate? #f to 'register-closure'. * gnu/build/vm.scm (root-partition-initializer): Likewise. * gnu/build/install.scm (populate-single-profile-directory): Pass #:deduplicate? #f to 'populate-store'. * gnu/build/linux-initrd.scm (build-initrd): Likewise. * guix/scripts/pack.scm (self-contained-tarball)[import-module?]: New procedure. [build]: Pass it as an argument to 'source-module-closure'. * guix/scripts/pack.scm (squashfs-image)[build]: Wrap in 'with-extensions'. * gnu/system/linux-initrd.scm (expression->initrd)[import-module?]: New procedure. [builder]: Pass it to 'source-module-closure'. * gnu/system/install.scm (cow-store-service-type)[import-module?]: New procedure. Pass it to 'source-module-closure'. --- gnu/build/image.scm | 5 +- gnu/build/install.scm | 3 +- gnu/build/linux-initrd.scm | 3 +- gnu/build/vm.scm | 5 +- gnu/system/install.scm | 12 +- gnu/system/linux-initrd.scm | 10 +- guix/build/store-copy.scm | 13 +- guix/scripts/pack.scm | 274 +++++++++++++++++----------------- guix/store/deduplication.scm | 16 +- tests/gexp.scm | 3 +- tests/store-deduplication.scm | 18 ++- 11 files changed, 215 insertions(+), 147 deletions(-) diff --git a/gnu/build/image.scm b/gnu/build/image.scm index 0deea10a9d..8f50f27f78 100644 --- a/gnu/build/image.scm +++ b/gnu/build/image.scm @@ -186,7 +186,8 @@ rest of the store when registering the closures. SYSTEM-DIRECTORY is the name of the directory of the 'system' derivation. Pass WAL-MODE? to register-closure." (populate-root-file-system system-directory root) - (populate-store references-graphs root) + (populate-store references-graphs root + #:deduplicate? deduplicate?) ;; Populate /dev. (when make-device-nodes @@ -195,7 +196,7 @@ register-closure." (when register-closures? (for-each (lambda (closure) (register-closure root closure - #:deduplicate? deduplicate? + #:deduplicate? #f #:wal-mode? wal-mode?)) references-graphs)) diff --git a/gnu/build/install.scm b/gnu/build/install.scm index 63995e1d09..f5c8407b89 100644 --- a/gnu/build/install.scm +++ b/gnu/build/install.scm @@ -214,7 +214,8 @@ This is used to create the self-contained tarballs with 'guix pack'." (symlink old (scope new))) ;; Populate the store. - (populate-store (list closure) directory) + (populate-store (list closure) directory + #:deduplicate? #f) (when database (install-database-and-gc-roots directory database profile diff --git a/gnu/build/linux-initrd.scm b/gnu/build/linux-initrd.scm index 99796adba6..bb2ed0db0c 100644 --- a/gnu/build/linux-initrd.scm +++ b/gnu/build/linux-initrd.scm @@ -127,7 +127,8 @@ REFERENCES-GRAPHS." (mkdir "contents") ;; Copy the closures of all the items referenced in REFERENCES-GRAPHS. - (populate-store references-graphs "contents") + (populate-store references-graphs "contents" + #:deduplicate? #f) (with-directory-excursion "contents" ;; Make '/init'. diff --git a/gnu/build/vm.scm b/gnu/build/vm.scm index abb0317faf..03be5697b7 100644 --- a/gnu/build/vm.scm +++ b/gnu/build/vm.scm @@ -395,7 +395,8 @@ system that is passed to 'populate-root-file-system'." (when copy-closures? ;; Populate the store. (populate-store (map (cut string-append "/xchg/" <>) closures) - target)) + target + #:deduplicate? deduplicate?)) ;; Populate /dev. (make-device-nodes target) @@ -412,7 +413,7 @@ system that is passed to 'populate-root-file-system'." (for-each (lambda (closure) (register-closure target (string-append "/xchg/" closure) - #:deduplicate? deduplicate?)) + #:deduplicate? #f)) closures) (unless copy-closures? (umount target-store))) diff --git a/gnu/system/install.scm b/gnu/system/install.scm index 7701297411..06f8043bb7 100644 --- a/gnu/system/install.scm +++ b/gnu/system/install.scm @@ -1,5 +1,5 @@ ;;; GNU Guix --- Functional package management for GNU -;;; Copyright © 2014, 2015, 2016, 2017, 2018, 2019 Ludovic Courtès +;;; Copyright © 2014, 2015, 2016, 2017, 2018, 2019, 2020 Ludovic Courtès ;;; Copyright © 2015 Mark H Weaver ;;; Copyright © 2016 Andreas Enge ;;; Copyright © 2017 Marius Bakke @@ -175,6 +175,13 @@ manual." (shepherd-service-type 'cow-store (lambda _ + (define (import-module? module) + ;; Since we don't use deduplication support in 'populate-store', don't + ;; import (guix store deduplication) and its dependencies, which + ;; includes Guile-Gcrypt. + (and (guix-module-name? module) + (not (equal? module '(guix store deduplication))))) + (shepherd-service (requirement '(root-file-system user-processes)) (provision '(cow-store)) @@ -189,7 +196,8 @@ the given target.") ,@%default-modules)) (start (with-imported-modules (source-module-closure - '((gnu build install))) + '((gnu build install)) + #:select? import-module?) #~(case-lambda ((target) (mount-cow-store target #$%backing-directory) diff --git a/gnu/system/linux-initrd.scm b/gnu/system/linux-initrd.scm index 4fb1d863c9..c6ba9bb560 100644 --- a/gnu/system/linux-initrd.scm +++ b/gnu/system/linux-initrd.scm @@ -76,12 +76,20 @@ the derivations referenced by EXP are automatically copied to the initrd." (define init (program-file "init" exp #:guile guile)) + (define (import-module? module) + ;; Since we don't use deduplication support in 'populate-store', don't + ;; import (guix store deduplication) and its dependencies, which includes + ;; Guile-Gcrypt. That way we can run tests with '--bootstrap'. + (and (guix-module-name? module) + (not (equal? module '(guix store deduplication))))) + (define builder ;; Do not use "guile-zlib" extension here, otherwise it would drag the ;; non-static "zlib" package to the initrd closure. It is not needed ;; anyway because the modules are stored uncompressed within the initrd. (with-imported-modules (source-module-closure - '((gnu build linux-initrd))) + '((gnu build linux-initrd)) + #:select? import-module?) #~(begin (use-modules (gnu build linux-initrd)) diff --git a/guix/build/store-copy.scm b/guix/build/store-copy.scm index 95dcb8e114..7f0672cd9d 100644 --- a/guix/build/store-copy.scm +++ b/guix/build/store-copy.scm @@ -20,6 +20,7 @@ #:use-module ((guix build utils) #:hide (copy-recursively)) #:use-module (guix sets) #:use-module (guix progress) + #:autoload (guix store deduplication) (copy-file/deduplicate) #:use-module (srfi srfi-1) #:use-module (srfi srfi-9) #:use-module (srfi srfi-26) @@ -242,10 +243,13 @@ permissions. Write verbose output to the LOG port." lstat))) (define* (populate-store reference-graphs target - #:key (log-port (current-error-port))) + #:key + (deduplicate? #t) + (log-port (current-error-port))) "Populate the store under directory TARGET with the items specified in REFERENCE-GRAPHS, a list of reference-graph files. Items copied to TARGET -maintain timestamps and permissions." +maintain timestamps and permissions. When DEDUPLICATE? is true, deduplicate +regular files as they are copied to TARGET." (define store (string-append target (%store-directory))) @@ -273,6 +277,11 @@ maintain timestamps and permissions." (string-append target thing) #:keep-mtime? #t #:keep-permissions? #t + #:copy-file + (if deduplicate? + (cut copy-file/deduplicate <> <> + #:store store) + copy-file) #:log (%make-void-port "w")) (report)) things))))) diff --git a/guix/scripts/pack.scm b/guix/scripts/pack.scm index 1612ec8f04..440c4b0903 100644 --- a/guix/scripts/pack.scm +++ b/guix/scripts/pack.scm @@ -203,12 +203,19 @@ added to the pack." #+(file-append glibc-utf8-locales "/lib/locale")) (setlocale LC_ALL "en_US.utf8")))) + (define (import-module? module) + ;; Since we don't use deduplication support in 'populate-store', don't + ;; import (guix store deduplication) and its dependencies, which includes + ;; Guile-Gcrypt. That way we can run tests with '--bootstrap'. + (and (not-config? module) + (not (equal? '(guix store deduplication) module)))) + (define build (with-imported-modules (source-module-closure `((guix build utils) (guix build union) (gnu build install)) - #:select? not-config?) + #:select? import-module?) #~(begin (use-modules (guix build utils) ((guix build union) #:select (relative-file-name)) @@ -382,138 +389,139 @@ added to the pack." `(("/bin" -> "bin") ,@symlinks))) (define build - (with-imported-modules (source-module-closure - '((guix build utils) - (guix build store-copy) - (guix build union) - (gnu build install)) - #:select? not-config?) - #~(begin - (use-modules (guix build utils) - (guix build store-copy) - ((guix build union) #:select (relative-file-name)) - (gnu build install) - (srfi srfi-1) - (srfi srfi-26) - (ice-9 match)) - - (define database #+database) - (define entry-point #$entry-point) - - (define (mksquashfs args) - (apply invoke "mksquashfs" - `(,@args - - ;; Do not create a "recovery file" when appending to the - ;; file system since it's useless in this case. - "-no-recovery" - - ;; Do not attempt to store extended attributes. - ;; See . - "-no-xattrs" - - ;; Set file times and the file system creation time to - ;; one second after the Epoch. - "-all-time" "1" "-mkfs-time" "1" - - ;; Reset all UIDs and GIDs. - "-force-uid" "0" "-force-gid" "0"))) - - (setenv "PATH" #+(file-append archiver "/bin")) - - ;; We need an empty file in order to have a valid file argument when - ;; we reparent the root file system. Read on for why that's - ;; necessary. - (with-output-to-file ".empty" (lambda () (display ""))) - - ;; Create the squashfs image in several steps. - ;; Add all store items. Unfortunately mksquashfs throws away all - ;; ancestor directories and only keeps the basename. We fix this - ;; in the following invocations of mksquashfs. - (mksquashfs `(,@(map store-info-item - (call-with-input-file "profile" - read-reference-graph)) - #$environment - ,#$output - - ;; Do not perform duplicate checking because we - ;; don't have any dupes. - "-no-duplicates" - "-comp" - ,#+(compressor-name compressor))) - - ;; Here we reparent the store items. For each sub-directory of - ;; the store prefix we need one invocation of "mksquashfs". - (for-each (lambda (dir) - (mksquashfs `(".empty" - ,#$output - "-root-becomes" ,dir))) - (reverse (string-tokenize (%store-directory) - (char-set-complement (char-set #\/))))) - - ;; Add symlinks and mount points. - (mksquashfs - `(".empty" - ,#$output - ;; Create SYMLINKS via pseudo file definitions. - ,@(append-map - (match-lambda - ((source '-> target) - ;; Create relative symlinks to work around a bug in - ;; Singularity 2.x: - ;; https://bugs.gnu.org/34913 - ;; https://github.com/sylabs/singularity/issues/1487 - (let ((target (string-append #$profile "/" target))) - (list "-p" - (string-join - ;; name s mode uid gid symlink - (list source - "s" "777" "0" "0" - (relative-file-name (dirname source) - target))))))) - '#$symlinks*) - - "-p" "/.singularity.d d 555 0 0" - - ;; Create the environment file. - "-p" "/.singularity.d/env d 555 0 0" - "-p" ,(string-append - "/.singularity.d/env/90-environment.sh s 777 0 0 " - (relative-file-name "/.singularity.d/env" - #$environment)) - - ;; Create /.singularity.d/actions, and optionally the 'run' - ;; script, used by 'singularity run'. - "-p" "/.singularity.d/actions d 555 0 0" - - ,@(if entry-point - `(;; This one if for Singularity 2.x. - "-p" - ,(string-append - "/.singularity.d/actions/run s 777 0 0 " - (relative-file-name "/.singularity.d/actions" - (string-append #$profile "/" - entry-point))) - - ;; This one is for Singularity 3.x. - "-p" - ,(string-append - "/.singularity.d/runscript s 777 0 0 " - (relative-file-name "/.singularity.d" - (string-append #$profile "/" - entry-point)))) - '()) - - ;; Create empty mount points. - "-p" "/proc d 555 0 0" - "-p" "/sys d 555 0 0" - "-p" "/dev d 555 0 0" - "-p" "/home d 555 0 0")) - - (when database - ;; Initialize /var/guix. - (install-database-and-gc-roots "var-etc" database #$profile) - (mksquashfs `("var-etc" ,#$output)))))) + (with-extensions (list guile-gcrypt) + (with-imported-modules (source-module-closure + '((guix build utils) + (guix build store-copy) + (guix build union) + (gnu build install)) + #:select? not-config?) + #~(begin + (use-modules (guix build utils) + (guix build store-copy) + ((guix build union) #:select (relative-file-name)) + (gnu build install) + (srfi srfi-1) + (srfi srfi-26) + (ice-9 match)) + + (define database #+database) + (define entry-point #$entry-point) + + (define (mksquashfs args) + (apply invoke "mksquashfs" + `(,@args + + ;; Do not create a "recovery file" when appending to the + ;; file system since it's useless in this case. + "-no-recovery" + + ;; Do not attempt to store extended attributes. + ;; See . + "-no-xattrs" + + ;; Set file times and the file system creation time to + ;; one second after the Epoch. + "-all-time" "1" "-mkfs-time" "1" + + ;; Reset all UIDs and GIDs. + "-force-uid" "0" "-force-gid" "0"))) + + (setenv "PATH" #+(file-append archiver "/bin")) + + ;; We need an empty file in order to have a valid file argument when + ;; we reparent the root file system. Read on for why that's + ;; necessary. + (with-output-to-file ".empty" (lambda () (display ""))) + + ;; Create the squashfs image in several steps. + ;; Add all store items. Unfortunately mksquashfs throws away all + ;; ancestor directories and only keeps the basename. We fix this + ;; in the following invocations of mksquashfs. + (mksquashfs `(,@(map store-info-item + (call-with-input-file "profile" + read-reference-graph)) + #$environment + ,#$output + + ;; Do not perform duplicate checking because we + ;; don't have any dupes. + "-no-duplicates" + "-comp" + ,#+(compressor-name compressor))) + + ;; Here we reparent the store items. For each sub-directory of + ;; the store prefix we need one invocation of "mksquashfs". + (for-each (lambda (dir) + (mksquashfs `(".empty" + ,#$output + "-root-becomes" ,dir))) + (reverse (string-tokenize (%store-directory) + (char-set-complement (char-set #\/))))) + + ;; Add symlinks and mount points. + (mksquashfs + `(".empty" + ,#$output + ;; Create SYMLINKS via pseudo file definitions. + ,@(append-map + (match-lambda + ((source '-> target) + ;; Create relative symlinks to work around a bug in + ;; Singularity 2.x: + ;; https://bugs.gnu.org/34913 + ;; https://github.com/sylabs/singularity/issues/1487 + (let ((target (string-append #$profile "/" target))) + (list "-p" + (string-join + ;; name s mode uid gid symlink + (list source + "s" "777" "0" "0" + (relative-file-name (dirname source) + target))))))) + '#$symlinks*) + + "-p" "/.singularity.d d 555 0 0" + + ;; Create the environment file. + "-p" "/.singularity.d/env d 555 0 0" + "-p" ,(string-append + "/.singularity.d/env/90-environment.sh s 777 0 0 " + (relative-file-name "/.singularity.d/env" + #$environment)) + + ;; Create /.singularity.d/actions, and optionally the 'run' + ;; script, used by 'singularity run'. + "-p" "/.singularity.d/actions d 555 0 0" + + ,@(if entry-point + `( ;; This one if for Singularity 2.x. + "-p" + ,(string-append + "/.singularity.d/actions/run s 777 0 0 " + (relative-file-name "/.singularity.d/actions" + (string-append #$profile "/" + entry-point))) + + ;; This one is for Singularity 3.x. + "-p" + ,(string-append + "/.singularity.d/runscript s 777 0 0 " + (relative-file-name "/.singularity.d" + (string-append #$profile "/" + entry-point)))) + '()) + + ;; Create empty mount points. + "-p" "/proc d 555 0 0" + "-p" "/sys d 555 0 0" + "-p" "/dev d 555 0 0" + "-p" "/home d 555 0 0")) + + (when database + ;; Initialize /var/guix. + (install-database-and-gc-roots "var-etc" database #$profile) + (mksquashfs `("var-etc" ,#$output))))))) (gexp->derivation (string-append name (compressor-extension compressor) diff --git a/guix/store/deduplication.scm b/guix/store/deduplication.scm index b4d37d4525..8564f12107 100644 --- a/guix/store/deduplication.scm +++ b/guix/store/deduplication.scm @@ -34,7 +34,8 @@ #:use-module (guix serialization) #:export (nar-sha256 deduplicate - dump-file/deduplicate)) + dump-file/deduplicate + copy-file/deduplicate)) ;; XXX: This port is used as a workaround on Guile <= 2.2.4 where ;; 'port-position' throws to 'out-of-range' when the offset is great than or @@ -256,3 +257,16 @@ down the road." (get-hash))))) (deduplicate file hash #:store store)) + +(define* (copy-file/deduplicate source target + #:key (store (%store-directory))) + "Like 'copy-file', but additionally deduplicate TARGET in STORE." + (call-with-input-file source + (lambda (input) + (let ((stat (stat input))) + (dump-file/deduplicate target input (stat:size stat) + (if (zero? (logand (stat:mode stat) + #o100)) + 'regular + 'executable) + #:store store))))) diff --git a/tests/gexp.scm b/tests/gexp.scm index a0e55178fa..6e92f0e4b3 100644 --- a/tests/gexp.scm +++ b/tests/gexp.scm @@ -736,7 +736,8 @@ (zero? (logand #o222 (stat:mode st))))))) (mkdir #$output) - (populate-store '("graph") #$output) + (populate-store '("graph") #$output + #:deduplicate? #f) ;; Check whether 'populate-store' canonicalizes ;; permissions and timestamps. diff --git a/tests/store-deduplication.scm b/tests/store-deduplication.scm index e2870a363d..7b01acae24 100644 --- a/tests/store-deduplication.scm +++ b/tests/store-deduplication.scm @@ -1,5 +1,5 @@ ;;; GNU Guix --- Functional package management for GNU -;;; Copyright © 2018 Ludovic Courtès +;;; Copyright © 2018, 2020 Ludovic Courtès ;;; ;;; This file is part of GNU Guix. ;;; @@ -25,6 +25,7 @@ #:use-module (rnrs bytevectors) #:use-module (ice-9 binary-ports) #:use-module (srfi srfi-1) + #:use-module (srfi srfi-26) #:use-module (srfi srfi-64)) (test-begin "store-deduplication") @@ -106,4 +107,19 @@ (cons (apply = (map (compose stat:ino stat) identical)) (map (compose stat:nlink stat) identical)))))) +(test-assert "copy-file/deduplicate" + (call-with-temporary-directory + (lambda (store) + (let ((source (search-path %load-path "gnu/packages/emacs-xyz.scm"))) + (for-each (lambda (target) + (copy-file/deduplicate source + (string-append store target) + #:store store)) + '("/a" "/b" "/c")) + (and (directory-exists? (string-append store "/.links")) + (file=? source (string-append store "/a")) + (apply = (map (compose stat:ino stat + (cut string-append store <>)) + '("/a" "/b" "/c")))))))) + (test-end "store-deduplication") -- 2.29.2 From debbugs-submit-bounces@debbugs.gnu.org Fri Dec 11 10:09:55 2020 Received: (at 44760) by debbugs.gnu.org; 11 Dec 2020 15:09:55 +0000 Received: from localhost ([127.0.0.1]:42583 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1knk3P-0007iW-5F for submit@debbugs.gnu.org; Fri, 11 Dec 2020 10:09:55 -0500 Received: from eggs.gnu.org ([209.51.188.92]:56590) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1knk3G-0007h0-CC for 44760@debbugs.gnu.org; Fri, 11 Dec 2020 10:09:47 -0500 Received: from fencepost.gnu.org ([2001:470:142:3::e]:47536) by eggs.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1knk3B-0000V2-8p; Fri, 11 Dec 2020 10:09:41 -0500 Received: from [2a01:e0a:1d:7270:af76:b9b:ca24:c465] (port=59324 helo=gnu.org) by fencepost.gnu.org with esmtpsa (TLS1.2:DHE_RSA_AES_256_CBC_SHA1:256) (Exim 4.82) (envelope-from ) id 1knk3A-0004Tz-PL; Fri, 11 Dec 2020 10:09:41 -0500 From: =?UTF-8?q?Ludovic=20Court=C3=A8s?= To: 44760@debbugs.gnu.org Subject: [PATCH 08/15] image: 'register-closure' leaves it up to the caller to deduplicate. Date: Fri, 11 Dec 2020 16:09:18 +0100 Message-Id: <20201211150919.18435-9-ludo@gnu.org> X-Mailer: git-send-email 2.29.2 In-Reply-To: <20201211150919.18435-1-ludo@gnu.org> References: <87h7pkffzy.fsf@inria.fr> <20201211150919.18435-1-ludo@gnu.org> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Spam-Score: -2.3 (--) X-Debbugs-Envelope-To: 44760 Cc: =?UTF-8?q?Ludovic=20Court=C3=A8s?= X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -3.3 (---) * gnu/build/image.scm (register-closure): Remove #:deduplicate? parameter and pass #:deduplicate? #f to 'register-items'. (initialize-root-partition): Adjust accordingly. * gnu/build/vm.scm (register-closure, root-partition-initializer): Likewise. --- gnu/build/image.scm | 8 ++------ gnu/build/vm.scm | 9 +++------ 2 files changed, 5 insertions(+), 12 deletions(-) diff --git a/gnu/build/image.scm b/gnu/build/image.scm index 8f50f27f78..8d5fc603d9 100644 --- a/gnu/build/image.scm +++ b/gnu/build/image.scm @@ -140,21 +140,18 @@ given CONFIG file." (define* (register-closure prefix closure #:key - (deduplicate? #t) (schema (sql-schema)) (wal-mode? #t)) "Register CLOSURE in PREFIX, where PREFIX is the directory name of the target store and CLOSURE is the name of a file containing a reference graph as -produced by #:references-graphs. As a side effect, if DEDUPLICATE? is true, -deduplicates files common to CLOSURE and the rest of PREFIX. Pass WAL-MODE? -to call-with-database." +produced by #:references-graphs. Pass WAL-MODE? to call-with-database." (let ((items (call-with-input-file closure read-reference-graph))) (parameterize ((sql-schema schema)) (with-database (store-database-file #:prefix prefix) db #:wal-mode? wal-mode? (register-items db items #:prefix prefix - #:deduplicate? deduplicate? + #:deduplicate? #f #:registration-time %epoch))))) (define* (initialize-efi-partition root @@ -196,7 +193,6 @@ register-closure." (when register-closures? (for-each (lambda (closure) (register-closure root closure - #:deduplicate? #f #:wal-mode? wal-mode?)) references-graphs)) diff --git a/gnu/build/vm.scm b/gnu/build/vm.scm index 03be5697b7..8c6ab648ac 100644 --- a/gnu/build/vm.scm +++ b/gnu/build/vm.scm @@ -215,18 +215,16 @@ the #:references-graphs parameter of 'derivation'." (define* (register-closure prefix closure #:key - (deduplicate? #t) (schema (sql-schema))) "Register CLOSURE in PREFIX, where PREFIX is the directory name of the target store and CLOSURE is the name of a file containing a reference graph as -produced by #:references-graphs. As a side effect, if DEDUPLICATE? is true, -deduplicates files common to CLOSURE and the rest of PREFIX." +produced by #:references-graphs." (let ((items (call-with-input-file closure read-reference-graph))) (parameterize ((sql-schema schema)) (with-database (store-database-file #:prefix prefix) db (register-items db items #:prefix prefix - #:deduplicate? deduplicate? + #:deduplicate? #f #:registration-time %epoch))))) @@ -412,8 +410,7 @@ system that is passed to 'populate-root-file-system'." (display "registering closures...\n") (for-each (lambda (closure) (register-closure target - (string-append "/xchg/" closure) - #:deduplicate? #f)) + (string-append "/xchg/" closure))) closures) (unless copy-closures? (umount target-store))) -- 2.29.2 From debbugs-submit-bounces@debbugs.gnu.org Fri Dec 11 10:09:55 2020 Received: (at 44760) by debbugs.gnu.org; 11 Dec 2020 15:09:55 +0000 Received: from localhost ([127.0.0.1]:42585 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1knk3P-0007id-Fq for submit@debbugs.gnu.org; Fri, 11 Dec 2020 10:09:55 -0500 Received: from eggs.gnu.org ([209.51.188.92]:56600) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1knk3H-0007h7-Jv for 44760@debbugs.gnu.org; Fri, 11 Dec 2020 10:09:48 -0500 Received: from fencepost.gnu.org ([2001:470:142:3::e]:47537) by eggs.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1knk3C-0000X5-G9; Fri, 11 Dec 2020 10:09:42 -0500 Received: from [2a01:e0a:1d:7270:af76:b9b:ca24:c465] (port=59324 helo=gnu.org) by fencepost.gnu.org with esmtpsa (TLS1.2:DHE_RSA_AES_256_CBC_SHA1:256) (Exim 4.82) (envelope-from ) id 1knk3B-0004Tz-Ie; Fri, 11 Dec 2020 10:09:42 -0500 From: =?UTF-8?q?Ludovic=20Court=C3=A8s?= To: 44760@debbugs.gnu.org Subject: [PATCH 09/15] database: Remove #:deduplicate? from 'register-items'. Date: Fri, 11 Dec 2020 16:09:19 +0100 Message-Id: <20201211150919.18435-10-ludo@gnu.org> X-Mailer: git-send-email 2.29.2 In-Reply-To: <20201211150919.18435-1-ludo@gnu.org> References: <87h7pkffzy.fsf@inria.fr> <20201211150919.18435-1-ludo@gnu.org> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Spam-Score: -2.3 (--) X-Debbugs-Envelope-To: 44760 Cc: =?UTF-8?q?Ludovic=20Court=C3=A8s?= X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -3.3 (---) It is now up to the caller to deduplicate store contents. * guix/store/database.scm (register-items): Remove #:deduplicate? parameter and call to 'deduplicate'. (register-path): Call 'deduplicate' when #:deduplicate? is true. * gnu/build/image.scm (register-closure): Adjust call accordingly. * gnu/build/vm.scm (register-closure): Likewise. * guix/nar.scm (finalize-store-file): Likewise. * guix/scripts/pack.scm (store-database): Likewise. --- gnu/build/image.scm | 1 - gnu/build/vm.scm | 1 - guix/nar.scm | 3 +-- guix/scripts/pack.scm | 1 - guix/store/database.scm | 11 ++++++----- 5 files changed, 7 insertions(+), 10 deletions(-) diff --git a/gnu/build/image.scm b/gnu/build/image.scm index 8d5fc603d9..f6e5cb42f6 100644 --- a/gnu/build/image.scm +++ b/gnu/build/image.scm @@ -151,7 +151,6 @@ produced by #:references-graphs. Pass WAL-MODE? to call-with-database." #:wal-mode? wal-mode? (register-items db items #:prefix prefix - #:deduplicate? #f #:registration-time %epoch))))) (define* (initialize-efi-partition root diff --git a/gnu/build/vm.scm b/gnu/build/vm.scm index 8c6ab648ac..bd59916bf3 100644 --- a/gnu/build/vm.scm +++ b/gnu/build/vm.scm @@ -224,7 +224,6 @@ produced by #:references-graphs." (with-database (store-database-file #:prefix prefix) db (register-items db items #:prefix prefix - #:deduplicate? #f #:registration-time %epoch))))) diff --git a/guix/nar.scm b/guix/nar.scm index 947b393d84..a817b56007 100644 --- a/guix/nar.scm +++ b/guix/nar.scm @@ -118,8 +118,7 @@ held." ;; Register TARGET. The 'restore-file' call took care of ;; deduplication, timestamps, and permissions. (register-items db - (list (store-info target deriver references)) - #:deduplicate? #f)) + (list (store-info target deriver references)))) (when lock? (delete-file (string-append target ".lock")) diff --git a/guix/scripts/pack.scm b/guix/scripts/pack.scm index 440c4b0903..8ecdcb823f 100644 --- a/guix/scripts/pack.scm +++ b/guix/scripts/pack.scm @@ -167,7 +167,6 @@ dependencies are registered." (let ((items (append-map read-closure '#$labels))) (with-database db-file db (register-items db items - #:deduplicate? #f #:registration-time %epoch))))))) (computed-file "store-database" build diff --git a/guix/store/database.scm b/guix/store/database.scm index 0ed66a6e2c..31ea9add78 100644 --- a/guix/store/database.scm +++ b/guix/store/database.scm @@ -407,6 +407,11 @@ by adding it as a temp-root." (define real-file-name (string-append (or prefix "") path)) + (when deduplicate? + (deduplicate real-file-name (nar-sha256 real-file-name) + #:store (string-append (or prefix "") + %store-directory))) + (when reset-timestamps? (reset-timestamps real-file-name)) @@ -414,7 +419,6 @@ by adding it as a temp-root." (with-database db-file db (register-items db (list (store-info path deriver references)) #:prefix prefix - #:deduplicate? deduplicate? #:log-port (%make-void-port "w"))))) (define %epoch @@ -423,7 +427,6 @@ by adding it as a temp-root." (define* (register-items db items #:key prefix - (deduplicate? #t) registration-time (log-port (current-error-port))) "Register all of ITEMS, a list of records as returned by @@ -467,9 +470,7 @@ typically by adding them as temp-roots." "sha256:" (bytevector->base16-string hash)) #:nar-size nar-size - #:time registration-time))) - (when deduplicate? - (deduplicate real-file-name hash #:store store-dir))))) + #:time registration-time)))))) (let* ((prefix (format #f "registering ~a items" (length items))) (progress (progress-reporter/bar (length items) -- 2.29.2 From debbugs-submit-bounces@debbugs.gnu.org Fri Dec 11 10:10:13 2020 Received: (at 44760) by debbugs.gnu.org; 11 Dec 2020 15:10:13 +0000 Received: from localhost ([127.0.0.1]:42600 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1knk3g-0007ky-TS for submit@debbugs.gnu.org; Fri, 11 Dec 2020 10:10:13 -0500 Received: from eggs.gnu.org ([209.51.188.92]:56736) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1knk3e-0007kD-8t for 44760@debbugs.gnu.org; Fri, 11 Dec 2020 10:10:11 -0500 Received: from fencepost.gnu.org ([2001:470:142:3::e]:47542) by eggs.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1knk3Z-0000eM-3s; Fri, 11 Dec 2020 10:10:05 -0500 Received: from [2a01:e0a:1d:7270:af76:b9b:ca24:c465] (port=59326 helo=gnu.org) by fencepost.gnu.org with esmtpsa (TLS1.2:DHE_RSA_AES_256_CBC_SHA1:256) (Exim 4.82) (envelope-from ) id 1knk3Y-0004Vl-LA; Fri, 11 Dec 2020 10:10:04 -0500 From: =?UTF-8?q?Ludovic=20Court=C3=A8s?= To: 44760@debbugs.gnu.org Subject: [PATCH 11/15] database: Remove #:deduplicate? and #:reset-timestamps? from 'register-path'. Date: Fri, 11 Dec 2020 16:09:47 +0100 Message-Id: <20201211150951.18508-2-ludo@gnu.org> X-Mailer: git-send-email 2.29.2 In-Reply-To: <20201211150951.18508-1-ludo@gnu.org> References: <87h7pkffzy.fsf@inria.fr> <20201211150951.18508-1-ludo@gnu.org> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Spam-Score: -2.3 (--) X-Debbugs-Envelope-To: 44760 Cc: =?UTF-8?q?Ludovic=20Court=C3=A8s?= X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -3.3 (---) * guix/store/database.scm (register-path): Remove #:deduplicate? and #:reset-timestamps?. * guix/scripts/system.scm (copy-item): Adjust accordingly. * tests/store-database.scm ("register-path") ("register-path, directory"): Call 'reset-timestamps'. --- guix/scripts/system.scm | 6 +----- guix/store/database.scm | 17 ++--------------- tests/store-database.scm | 5 +++-- 3 files changed, 6 insertions(+), 22 deletions(-) diff --git a/guix/scripts/system.scm b/guix/scripts/system.scm index c08929066b..0e543d9460 100644 --- a/guix/scripts/system.scm +++ b/guix/scripts/system.scm @@ -158,11 +158,7 @@ REFERENCES as its set of references." (unless (register-path item #:prefix target #:state-directory state - #:references references - - ;; Those are taken care of by 'copy-store-item'. - #:reset-timestamps? #f - #:deduplicate? #f) + #:references references) (leave (G_ "failed to register '~a' under '~a'~%") item target)))) diff --git a/guix/store/database.scm b/guix/store/database.scm index 31ea9add78..c0010b72b9 100644 --- a/guix/store/database.scm +++ b/guix/store/database.scm @@ -384,16 +384,14 @@ is true." (define* (register-path path #:key (references '()) deriver prefix - state-directory (deduplicate? #t) - (reset-timestamps? #t) + state-directory (schema (sql-schema))) "Register PATH as a valid store file, with REFERENCES as its list of references, and DERIVER as its deriver (.drv that led to it.) If PREFIX is given, it must be the name of the directory containing the new store to initialize; if STATE-DIRECTORY is given, it must be a string containing the absolute file name to the state directory of the store being initialized. -Return #t on success. As a side effect, reset timestamps on PATH, unless -RESET-TIMESTAMPS? is false. +Return #t on success. Use with care as it directly modifies the store! This is primarily meant to be used internally by the daemon's build hook. @@ -404,17 +402,6 @@ by adding it as a temp-root." (store-database-file #:prefix prefix #:state-directory state-directory)) - (define real-file-name - (string-append (or prefix "") path)) - - (when deduplicate? - (deduplicate real-file-name (nar-sha256 real-file-name) - #:store (string-append (or prefix "") - %store-directory))) - - (when reset-timestamps? - (reset-timestamps real-file-name)) - (parameterize ((sql-schema schema)) (with-database db-file db (register-items db (list (store-info path deriver references)) diff --git a/tests/store-database.scm b/tests/store-database.scm index 3b4ef43f6d..33fd6cfbad 100644 --- a/tests/store-database.scm +++ b/tests/store-database.scm @@ -34,8 +34,7 @@ (test-begin "store-database") -(test-equal "register-path" - '(1 1) +(test-assert "register-path" (let ((file (string-append (%store-prefix) "/" (make-string 32 #\f) "-fake"))) (when (valid-path? %store file) @@ -46,6 +45,7 @@ (drv (string-append file ".drv"))) (call-with-output-file file (cut display "This is a fake store item.\n" <>)) + (reset-timestamps file) (register-path file #:references (list ref) #:deriver drv) @@ -69,6 +69,7 @@ (mkdir-p (string-append file "/a")) (call-with-output-file (string-append file "/a/b") (const #t)) + (reset-timestamps file) (register-path file #:deriver drv) (and (valid-path? %store file) -- 2.29.2 From debbugs-submit-bounces@debbugs.gnu.org Fri Dec 11 10:10:13 2020 Received: (at 44760) by debbugs.gnu.org; 11 Dec 2020 15:10:13 +0000 Received: from localhost ([127.0.0.1]:42602 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1knk3h-0007l0-7J for submit@debbugs.gnu.org; Fri, 11 Dec 2020 10:10:13 -0500 Received: from eggs.gnu.org ([209.51.188.92]:56756) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1knk3f-0007kZ-Jy for 44760@debbugs.gnu.org; Fri, 11 Dec 2020 10:10:11 -0500 Received: from fencepost.gnu.org ([2001:470:142:3::e]:47541) by eggs.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1knk3Y-0000dY-Eh; Fri, 11 Dec 2020 10:10:04 -0500 Received: from [2a01:e0a:1d:7270:af76:b9b:ca24:c465] (port=59326 helo=gnu.org) by fencepost.gnu.org with esmtpsa (TLS1.2:DHE_RSA_AES_256_CBC_SHA1:256) (Exim 4.82) (envelope-from ) id 1knk3X-0004Vl-0F; Fri, 11 Dec 2020 10:10:03 -0500 From: =?UTF-8?q?Ludovic=20Court=C3=A8s?= To: 44760@debbugs.gnu.org Subject: [PATCH 10/15] guix system: 'init' copies, resets timestamps, and deduplicates at once. Date: Fri, 11 Dec 2020 16:09:46 +0100 Message-Id: <20201211150951.18508-1-ludo@gnu.org> X-Mailer: git-send-email 2.29.2 In-Reply-To: <87h7pkffzy.fsf@inria.fr> References: <87h7pkffzy.fsf@inria.fr> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Spam-Score: -2.3 (--) X-Debbugs-Envelope-To: 44760 Cc: =?UTF-8?q?Ludovic=20Court=C3=A8s?= X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -3.3 (---) Partly fixes . * guix/build/store-copy.scm (copy-store-item): New procedure. (populate-store): Use it instead of the inline 'copy-recursively' call. * guix/scripts/system.scm (copy-item): Likewise. Pass #:reset-timestamps? and #:deduplicate? to 'register-path'. --- guix/build/store-copy.scm | 31 +++++++++++++++++++++---------- guix/scripts/system.scm | 11 ++++++++--- 2 files changed, 29 insertions(+), 13 deletions(-) diff --git a/guix/build/store-copy.scm b/guix/build/store-copy.scm index 7f0672cd9d..01e1f41870 100644 --- a/guix/build/store-copy.scm +++ b/guix/build/store-copy.scm @@ -38,6 +38,7 @@ file-size closure-size + copy-store-item populate-store)) ;;; Commentary: @@ -242,6 +243,24 @@ permissions. Write verbose output to the LOG port." stat lstat))) +(define* (copy-store-item item target + #:key + (deduplicate? #t) + (log-port (%make-void-port "w"))) + "Copy ITEM, a store item, to the store under TARGET, the target root +directory. When DEDUPLICATE? is true, deduplicate it within TARGET." + (define store + (string-append target (%store-directory))) + + (copy-recursively item (string-append target item) + #:keep-mtime? #t + #:keep-permissions? #t + #:copy-file + (if deduplicate? + (cut copy-file/deduplicate <> <> #:store store) + copy-file) + #:log log-port)) + (define* (populate-store reference-graphs target #:key (deduplicate? #t) @@ -273,16 +292,8 @@ regular files as they are copied to TARGET." (call-with-progress-reporter progress (lambda (report) (for-each (lambda (thing) - (copy-recursively thing - (string-append target thing) - #:keep-mtime? #t - #:keep-permissions? #t - #:copy-file - (if deduplicate? - (cut copy-file/deduplicate <> <> - #:store store) - copy-file) - #:log (%make-void-port "w")) + (copy-store-item thing target + #:deduplicate? deduplicate?) (report)) things))))) diff --git a/guix/scripts/system.scm b/guix/scripts/system.scm index db80e0be8f..c08929066b 100644 --- a/guix/scripts/system.scm +++ b/guix/scripts/system.scm @@ -30,6 +30,7 @@ #:use-module ((guix status) #:select (with-status-verbosity)) #:use-module (guix store) #:autoload (guix store database) (register-path) + #:autoload (guix build store-copy) (copy-store-item) #:use-module (guix describe) #:use-module (guix grafts) #:use-module (guix gexp) @@ -147,8 +148,8 @@ REFERENCES as its set of references." #:directories? #t)) (delete-file-recursively dest)) - (copy-recursively item dest - #:log (%make-void-port "w")) + (copy-store-item item target + #:deduplicate? #t) ;; Register ITEM; as a side-effect, it resets timestamps, etc. ;; Explicitly use "TARGET/var/guix" as the state directory, to avoid @@ -157,7 +158,11 @@ REFERENCES as its set of references." (unless (register-path item #:prefix target #:state-directory state - #:references references) + #:references references + + ;; Those are taken care of by 'copy-store-item'. + #:reset-timestamps? #f + #:deduplicate? #f) (leave (G_ "failed to register '~a' under '~a'~%") item target)))) -- 2.29.2 From debbugs-submit-bounces@debbugs.gnu.org Fri Dec 11 10:10:17 2020 Received: (at 44760) by debbugs.gnu.org; 11 Dec 2020 15:10:17 +0000 Received: from localhost ([127.0.0.1]:42607 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1knk3k-0007lb-KV for submit@debbugs.gnu.org; Fri, 11 Dec 2020 10:10:16 -0500 Received: from eggs.gnu.org ([209.51.188.92]:56768) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1knk3i-0007kg-4k for 44760@debbugs.gnu.org; Fri, 11 Dec 2020 10:10:14 -0500 Received: from fencepost.gnu.org ([2001:470:142:3::e]:47545) by eggs.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1knk3d-0000g5-0B; Fri, 11 Dec 2020 10:10:09 -0500 Received: from [2a01:e0a:1d:7270:af76:b9b:ca24:c465] (port=59326 helo=gnu.org) by fencepost.gnu.org with esmtpsa (TLS1.2:DHE_RSA_AES_256_CBC_SHA1:256) (Exim 4.82) (envelope-from ) id 1knk3a-0004Vl-9Y; Fri, 11 Dec 2020 10:10:08 -0500 From: =?UTF-8?q?Ludovic=20Court=C3=A8s?= To: 44760@debbugs.gnu.org Subject: [PATCH 13/15] database: Remove 'register-path'. Date: Fri, 11 Dec 2020 16:09:49 +0100 Message-Id: <20201211150951.18508-4-ludo@gnu.org> X-Mailer: git-send-email 2.29.2 In-Reply-To: <20201211150951.18508-1-ludo@gnu.org> References: <87h7pkffzy.fsf@inria.fr> <20201211150951.18508-1-ludo@gnu.org> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Spam-Score: -2.3 (--) X-Debbugs-Envelope-To: 44760 Cc: =?UTF-8?q?Ludovic=20Court=C3=A8s?= X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -3.3 (---) * guix/store/database.scm (register-path): Remove. * tests/store-database.scm ("register-path"): Rename to... ("register-items"): ... this, and use 'register-items' instead of 'register-path'. ("register-path, directory"): Rename to... ("register-items, directory"): ... this, and use 'register-items' instead of 'register-path'. ("register-path with unregistered references"): Rename to... ("sqlite-register with unregistered references"): ... this. --- guix/store/database.scm | 27 --------------------------- tests/store-database.scm | 15 ++++++++------- 2 files changed, 8 insertions(+), 34 deletions(-) diff --git a/guix/store/database.scm b/guix/store/database.scm index 9d5bc531bb..4579b05261 100644 --- a/guix/store/database.scm +++ b/guix/store/database.scm @@ -43,7 +43,6 @@ with-database path-id sqlite-register - register-path register-items %epoch reset-timestamps)) @@ -383,32 +382,6 @@ is true." (chmod file (if (executable-file? file) #o555 #o444))) (utime file 1 1 0 0))))) -(define* (register-path path - #:key (references '()) deriver prefix - state-directory - (schema (sql-schema))) - "Register PATH as a valid store file, with REFERENCES as its list of -references, and DERIVER as its deriver (.drv that led to it.) If PREFIX is -given, it must be the name of the directory containing the new store to -initialize; if STATE-DIRECTORY is given, it must be a string containing the -absolute file name to the state directory of the store being initialized. -Return #t on success. - -Use with care as it directly modifies the store! This is primarily meant to -be used internally by the daemon's build hook. - -PATH must be protected from GC and locked during execution of this, typically -by adding it as a temp-root." - (define db-file - (store-database-file #:prefix prefix - #:state-directory state-directory)) - - (parameterize ((sql-schema schema)) - (with-database db-file db - (register-items db (list (store-info path deriver references)) - #:prefix prefix - #:log-port (%make-void-port "w"))))) - (define %epoch ;; When it all began. (make-time time-utc 0 1)) diff --git a/tests/store-database.scm b/tests/store-database.scm index 33fd6cfbad..17eea38c63 100644 --- a/tests/store-database.scm +++ b/tests/store-database.scm @@ -20,6 +20,7 @@ #:use-module (guix tests) #:use-module (guix store) #:use-module (guix store database) + #:use-module (guix build store-copy) #:use-module ((guix utils) #:select (call-with-temporary-output-file)) #:use-module ((guix build utils) #:select (mkdir-p delete-file-recursively)) @@ -34,7 +35,7 @@ (test-begin "store-database") -(test-assert "register-path" +(test-assert "register-items" (let ((file (string-append (%store-prefix) "/" (make-string 32 #\f) "-fake"))) (when (valid-path? %store file) @@ -46,9 +47,8 @@ (call-with-output-file file (cut display "This is a fake store item.\n" <>)) (reset-timestamps file) - (register-path file - #:references (list ref) - #:deriver drv) + (with-database (store-database-file) db + (register-items db (list (store-info file drv (list ref))))) (and (valid-path? %store file) (equal? (references %store file) (list ref)) @@ -57,7 +57,7 @@ (list (stat:mtime (lstat file)) (stat:mtime (lstat ref))))))) -(test-equal "register-path, directory" +(test-equal "register-items, directory" '(1 1 1) (let ((file (string-append (%store-prefix) "/" (make-string 32 #\f) "-fake-directory"))) @@ -70,7 +70,8 @@ (call-with-output-file (string-append file "/a/b") (const #t)) (reset-timestamps file) - (register-path file #:deriver drv) + (with-database (store-database-file) db + (register-items db (list (store-info file drv '())))) (and (valid-path? %store file) (null? (references %store file)) @@ -102,7 +103,7 @@ (list (path-id db "/gnu/foo") (path-id db "/gnu/bar"))))))) -(test-assert "register-path with unregistered references" +(test-assert "sqlite-register with unregistered references" ;; Make sure we get a "NOT NULL constraint failed: Refs.reference" error ;; when we try to add references that are not registered yet. Better safe ;; than sorry. -- 2.29.2 From debbugs-submit-bounces@debbugs.gnu.org Fri Dec 11 10:10:17 2020 Received: (at 44760) by debbugs.gnu.org; 11 Dec 2020 15:10:17 +0000 Received: from localhost ([127.0.0.1]:42609 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1knk3l-0007le-0D for submit@debbugs.gnu.org; Fri, 11 Dec 2020 10:10:17 -0500 Received: from eggs.gnu.org ([209.51.188.92]:56750) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1knk3f-0007kY-4B for 44760@debbugs.gnu.org; Fri, 11 Dec 2020 10:10:16 -0500 Received: from fencepost.gnu.org ([2001:470:142:3::e]:47543) by eggs.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1knk3Z-0000ep-Vu; Fri, 11 Dec 2020 10:10:05 -0500 Received: from [2a01:e0a:1d:7270:af76:b9b:ca24:c465] (port=59326 helo=gnu.org) by fencepost.gnu.org with esmtpsa (TLS1.2:DHE_RSA_AES_256_CBC_SHA1:256) (Exim 4.82) (envelope-from ) id 1knk3Z-0004Vl-Dx; Fri, 11 Dec 2020 10:10:05 -0500 From: =?UTF-8?q?Ludovic=20Court=C3=A8s?= To: 44760@debbugs.gnu.org Subject: [PATCH 12/15] system: 'init' does not recompute the hash of each store item. Date: Fri, 11 Dec 2020 16:09:48 +0100 Message-Id: <20201211150951.18508-3-ludo@gnu.org> X-Mailer: git-send-email 2.29.2 In-Reply-To: <20201211150951.18508-1-ludo@gnu.org> References: <87h7pkffzy.fsf@inria.fr> <20201211150951.18508-1-ludo@gnu.org> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Spam-Score: -2.3 (--) X-Debbugs-Envelope-To: 44760 Cc: =?UTF-8?q?Ludovic=20Court=C3=A8s?= X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -3.3 (---) Fixes . Previously, the 'register-path' call would re-traverse ITEM to compute its nar hash, even though that hash is already known in the initial store. This patch also avoids repeated opening/closing of the database. * guix/store/database.scm (call-with-database): Export. * guix/scripts/system.scm (copy-item): Add 'db' parameter. Call 'sqlite-register' instead of 'register-path'. (copy-closure): Remove redundant call to 'references*'. Call 'call-with-database' and pass the database to 'copy-item'. --- .dir-locals.el | 1 + guix/scripts/system.scm | 59 ++++++++++++++++++++++------------------- guix/store/database.scm | 1 + 3 files changed, 34 insertions(+), 27 deletions(-) diff --git a/.dir-locals.el b/.dir-locals.el index 4eb27d8b1b..8f07a08eb5 100644 --- a/.dir-locals.el +++ b/.dir-locals.el @@ -121,6 +121,7 @@ (eval . (put 'let-system 'scheme-indent-function 1)) (eval . (put 'with-database 'scheme-indent-function 2)) + (eval . (put 'call-with-database 'scheme-indent-function 1)) (eval . (put 'call-with-transaction 'scheme-indent-function 1)) (eval . (put 'with-statement 'scheme-indent-function 3)) (eval . (put 'call-with-retrying-transaction 'scheme-indent-function 1)) diff --git a/guix/scripts/system.scm b/guix/scripts/system.scm index 0e543d9460..5427f875ec 100644 --- a/guix/scripts/system.scm +++ b/guix/scripts/system.scm @@ -29,7 +29,9 @@ #:use-module (guix ui) #:use-module ((guix status) #:select (with-status-verbosity)) #:use-module (guix store) - #:autoload (guix store database) (register-path) + #:autoload (guix base16) (bytevector->base16-string) + #:autoload (guix store database) + (sqlite-register store-database-file call-with-database) #:autoload (guix build store-copy) (copy-store-item) #:use-module (guix describe) #:use-module (guix grafts) @@ -130,12 +132,11 @@ BODY..., and restore them." (store-lift topologically-sorted)) -(define* (copy-item item references target +(define* (copy-item item info target db #:key (log-port (current-error-port))) - "Copy ITEM to the store under root directory TARGET and register it with -REFERENCES as its set of references." - (let ((dest (string-append target item)) - (state (string-append target "/var/guix"))) + "Copy ITEM to the store under root directory TARGET and populate DB with the +given INFO, a record." + (let ((dest (string-append target item))) (format log-port "copying '~a'...~%" item) ;; Remove DEST if it exists to make sure that (1) we do not fail badly @@ -151,41 +152,45 @@ REFERENCES as its set of references." (copy-store-item item target #:deduplicate? #t) - ;; Register ITEM; as a side-effect, it resets timestamps, etc. - ;; Explicitly use "TARGET/var/guix" as the state directory, to avoid - ;; reproducing the user's current settings; see - ;; . - (unless (register-path item - #:prefix target - #:state-directory state - #:references references) - (leave (G_ "failed to register '~a' under '~a'~%") - item target)))) + (sqlite-register db + #:path item + #:references (path-info-references info) + #:deriver (path-info-deriver info) + #:hash (string-append + "sha256:" + (bytevector->base16-string (path-info-hash info))) + #:nar-size (path-info-nar-size info)))) (define* (copy-closure item target #:key (log-port (current-error-port))) "Copy ITEM and all its dependencies to the store under root directory TARGET, and register them." (mlet* %store-monad ((to-copy (topologically-sorted* (list item))) - (refs (mapm %store-monad references* to-copy)) - (info (mapm %store-monad query-path-info* - (delete-duplicates - (append to-copy (concatenate refs))))) + (info (mapm %store-monad query-path-info* to-copy)) (size -> (reduce + 0 (map path-info-nar-size info)))) (define progress-bar (progress-reporter/bar (length to-copy) (format #f (G_ "copying to '~a'...") target))) + (define state + (string-append target "/var/guix")) + (check-available-space size target) - (call-with-progress-reporter progress-bar - (lambda (report) - (let ((void (%make-void-port "w"))) - (for-each (lambda (item refs) - (copy-item item refs target #:log-port void) - (report)) - to-copy refs)))) + ;; Explicitly use "TARGET/var/guix" as the state directory to avoid + ;; reproducing the user's current settings; see + ;; . + (call-with-database (store-database-file #:prefix target + #:state-directory state) + (lambda (db) + (call-with-progress-reporter progress-bar + (lambda (report) + (let ((void (%make-void-port "w"))) + (for-each (lambda (item info) + (copy-item item info target db #:log-port void) + (report)) + to-copy info)))))) (return *unspecified*))) diff --git a/guix/store/database.scm b/guix/store/database.scm index c0010b72b9..9d5bc531bb 100644 --- a/guix/store/database.scm +++ b/guix/store/database.scm @@ -39,6 +39,7 @@ #:export (sql-schema %default-database-file store-database-file + call-with-database with-database path-id sqlite-register -- 2.29.2 From debbugs-submit-bounces@debbugs.gnu.org Fri Dec 11 10:10:17 2020 Received: (at 44760) by debbugs.gnu.org; 11 Dec 2020 15:10:18 +0000 Received: from localhost ([127.0.0.1]:42612 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1knk3l-0007ll-FB for submit@debbugs.gnu.org; Fri, 11 Dec 2020 10:10:17 -0500 Received: from eggs.gnu.org ([209.51.188.92]:56776) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1knk3k-0007kk-3G for 44760@debbugs.gnu.org; Fri, 11 Dec 2020 10:10:16 -0500 Received: from fencepost.gnu.org ([2001:470:142:3::e]:47547) by eggs.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1knk3e-0000gS-Uq; Fri, 11 Dec 2020 10:10:10 -0500 Received: from [2a01:e0a:1d:7270:af76:b9b:ca24:c465] (port=59326 helo=gnu.org) by fencepost.gnu.org with esmtpsa (TLS1.2:DHE_RSA_AES_256_CBC_SHA1:256) (Exim 4.82) (envelope-from ) id 1knk3d-0004Vl-8u; Fri, 11 Dec 2020 10:10:09 -0500 From: =?UTF-8?q?Ludovic=20Court=C3=A8s?= To: 44760@debbugs.gnu.org Subject: [PATCH 14/15] database: Honor 'SOURCE_DATE_EPOCH'. Date: Fri, 11 Dec 2020 16:09:50 +0100 Message-Id: <20201211150951.18508-5-ludo@gnu.org> X-Mailer: git-send-email 2.29.2 In-Reply-To: <20201211150951.18508-1-ludo@gnu.org> References: <87h7pkffzy.fsf@inria.fr> <20201211150951.18508-1-ludo@gnu.org> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Spam-Score: -2.3 (--) X-Debbugs-Envelope-To: 44760 Cc: =?UTF-8?q?Ludovic=20Court=C3=A8s?= X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -3.3 (---) * guix/store/database.scm (timestamp): New procedure. (sqlite-register): Use it as the default for #:time. (register-items): Likewise for #:registeration-time. --- guix/store/database.scm | 19 ++++++++++++++----- 1 file changed, 14 insertions(+), 5 deletions(-) diff --git a/guix/store/database.scm b/guix/store/database.scm index 4579b05261..0a84bbddb9 100644 --- a/guix/store/database.scm +++ b/guix/store/database.scm @@ -323,8 +323,19 @@ ids of items referred to." (sqlite-fold cons '() stmt)) references))) +(define (timestamp) + "Return a timestamp, either the current time of SOURCE_DATE_EPOCH." + (match (getenv "SOURCE_DATE_EPOCH") + (#f + (current-time time-utc)) + ((= string->number seconds) + (if seconds + (make-time time-utc 0 seconds) + (current-time time-utc))))) + (define* (sqlite-register db #:key path (references '()) - deriver hash nar-size time) + deriver hash nar-size + (time (timestamp))) "Registers this stuff in DB. PATH is the store item to register and REFERENCES is the list of store items PATH refers to; DERIVER is the '.drv' that produced PATH, HASH is the base16-encoded Nix sha256 hash of @@ -337,9 +348,7 @@ Every store item in REFERENCES must already be registered." #:deriver deriver #:hash hash #:nar-size nar-size - #:time (time-second - (or time - (current-time time-utc)))))) + #:time (time-second time)))) ;; Call 'path-id' on each of REFERENCES. This ensures we get a ;; "non-NULL constraint" failure if one of REFERENCES is unregistered. (add-references db id @@ -388,7 +397,7 @@ is true." (define* (register-items db items #:key prefix - registration-time + (registration-time (timestamp)) (log-port (current-error-port))) "Register all of ITEMS, a list of records as returned by 'read-reference-graph', in DB. ITEMS must be in topological order (with -- 2.29.2 From debbugs-submit-bounces@debbugs.gnu.org Fri Dec 11 10:10:22 2020 Received: (at 44760) by debbugs.gnu.org; 11 Dec 2020 15:10:22 +0000 Received: from localhost ([127.0.0.1]:42615 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1knk3q-0007mG-0p for submit@debbugs.gnu.org; Fri, 11 Dec 2020 10:10:22 -0500 Received: from eggs.gnu.org ([209.51.188.92]:56786) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1knk3l-0007kw-9Z for 44760@debbugs.gnu.org; Fri, 11 Dec 2020 10:10:17 -0500 Received: from fencepost.gnu.org ([2001:470:142:3::e]:47548) by eggs.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1knk3g-0000hM-1a; Fri, 11 Dec 2020 10:10:12 -0500 Received: from [2a01:e0a:1d:7270:af76:b9b:ca24:c465] (port=59326 helo=gnu.org) by fencepost.gnu.org with esmtpsa (TLS1.2:DHE_RSA_AES_256_CBC_SHA1:256) (Exim 4.82) (envelope-from ) id 1knk3f-0004Vl-7h; Fri, 11 Dec 2020 10:10:11 -0500 From: =?UTF-8?q?Ludovic=20Court=C3=A8s?= To: 44760@debbugs.gnu.org Subject: [PATCH 15/15] deduplicate: Create the '.links' directory lazily. Date: Fri, 11 Dec 2020 16:09:51 +0100 Message-Id: <20201211150951.18508-6-ludo@gnu.org> X-Mailer: git-send-email 2.29.2 In-Reply-To: <20201211150951.18508-1-ludo@gnu.org> References: <87h7pkffzy.fsf@inria.fr> <20201211150951.18508-1-ludo@gnu.org> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Spam-Score: -2.3 (--) X-Debbugs-Envelope-To: 44760 Cc: =?UTF-8?q?Ludovic=20Court=C3=A8s?= X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -3.3 (---) This avoids repeated (mkdir-p "/gnu/store/.links") calls when deduplicating lots of files. * guix/store/deduplication.scm (deduplicate): Remove initial call to 'mkdir-p'. Add ENOENT case in 'link' exception handler. Reindent. * tests/store-deduplication.scm ("deduplicate, ENOSPC"): Check for (<= links 4) to account for the initial 'link' call. --- guix/store/deduplication.scm | 96 ++++++++++++++++++----------------- tests/store-deduplication.scm | 2 +- 2 files changed, 51 insertions(+), 47 deletions(-) diff --git a/guix/store/deduplication.scm b/guix/store/deduplication.scm index 8564f12107..a72a43bf79 100644 --- a/guix/store/deduplication.scm +++ b/guix/store/deduplication.scm @@ -159,52 +159,56 @@ under STORE." (define links-directory (string-append store "/.links")) - (mkdir-p links-directory) - (let loop ((path path) - (type (stat:type (lstat path))) - (hash hash)) - (if (eq? 'directory type) - ;; Can't hardlink directories, so hardlink their atoms. - (for-each (match-lambda - ((file . properties) - (unless (member file '("." "..")) - (let* ((file (string-append path "/" file)) - (type (match (assoc-ref properties 'type) - ((or 'unknown #f) - (stat:type (lstat file))) - (type type)))) - (loop file type - (and (not (eq? 'directory type)) - (nar-sha256 file))))))) - (scandir* path)) - (let ((link-file (string-append links-directory "/" - (bytevector->nix-base32-string hash)))) - (if (file-exists? link-file) - (replace-with-link link-file path - #:swap-directory links-directory - #:store store) - (catch 'system-error - (lambda () - (link path link-file)) - (lambda args - (let ((errno (system-error-errno args))) - (cond ((= errno EEXIST) - ;; Someone else put an entry for PATH in - ;; LINKS-DIRECTORY before we could. Let's use it. - (replace-with-link path link-file - #:swap-directory - links-directory - #:store store)) - ((= errno ENOSPC) - ;; There's not enough room in the directory index for - ;; more entries in .links, but that's fine: we can - ;; just stop. - #f) - ((= errno EMLINK) - ;; PATH has reached the maximum number of links, but - ;; that's OK: we just can't deduplicate it more. - #f) - (else (apply throw args))))))))))) + (let loop ((path path) + (type (stat:type (lstat path))) + (hash hash)) + (if (eq? 'directory type) + ;; Can't hardlink directories, so hardlink their atoms. + (for-each (match-lambda + ((file . properties) + (unless (member file '("." "..")) + (let* ((file (string-append path "/" file)) + (type (match (assoc-ref properties 'type) + ((or 'unknown #f) + (stat:type (lstat file))) + (type type)))) + (loop file type + (and (not (eq? 'directory type)) + (nar-sha256 file))))))) + (scandir* path)) + (let ((link-file (string-append links-directory "/" + (bytevector->nix-base32-string hash)))) + (if (file-exists? link-file) + (replace-with-link link-file path + #:swap-directory links-directory + #:store store) + (catch 'system-error + (lambda () + (link path link-file)) + (lambda args + (let ((errno (system-error-errno args))) + (cond ((= errno EEXIST) + ;; Someone else put an entry for PATH in + ;; LINKS-DIRECTORY before we could. Let's use it. + (replace-with-link path link-file + #:swap-directory + links-directory + #:store store)) + ((= errno ENOENT) + ;; This most likely means that LINKS-DIRECTORY does + ;; not exist. Attempt to create it and try again. + (mkdir-p links-directory) + (loop path type hash)) + ((= errno ENOSPC) + ;; There's not enough room in the directory index for + ;; more entries in .links, but that's fine: we can + ;; just stop. + #f) + ((= errno EMLINK) + ;; PATH has reached the maximum number of links, but + ;; that's OK: we just can't deduplicate it more. + #f) + (else (apply throw args))))))))))) (define (tee input len output) "Return a port that reads up to LEN bytes from INPUT and writes them to diff --git a/tests/store-deduplication.scm b/tests/store-deduplication.scm index 7b01acae24..b1c2d93bbd 100644 --- a/tests/store-deduplication.scm +++ b/tests/store-deduplication.scm @@ -95,7 +95,7 @@ (lambda () (set! link (lambda (old new) (set! links (+ links 1)) - (if (<= links 3) + (if (<= links 4) (true-link old new) (throw 'system-error "link" "~A" '("Whaaat?!") (list ENOSPC)))))) -- 2.29.2 From debbugs-submit-bounces@debbugs.gnu.org Tue Dec 15 11:34:10 2020 Received: (at 44760-done) by debbugs.gnu.org; 15 Dec 2020 16:34:10 +0000 Received: from localhost ([127.0.0.1]:58941 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1kpDH7-0003lT-PR for submit@debbugs.gnu.org; Tue, 15 Dec 2020 11:34:10 -0500 Received: from eggs.gnu.org ([209.51.188.92]:49210) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1kpDH6-0003lG-AR for 44760-done@debbugs.gnu.org; Tue, 15 Dec 2020 11:34:08 -0500 Received: from fencepost.gnu.org ([2001:470:142:3::e]:53805) by eggs.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1kpDH1-0004vz-3n for 44760-done@debbugs.gnu.org; Tue, 15 Dec 2020 11:34:03 -0500 Received: from [2a01:e0a:1d:7270:af76:b9b:ca24:c465] (port=35584 helo=ribbon) by fencepost.gnu.org with esmtpsa (TLS1.2:RSA_AES_256_CBC_SHA1:256) (Exim 4.82) (envelope-from ) id 1kpDGx-0001Gk-9h for 44760-done@debbugs.gnu.org; Tue, 15 Dec 2020 11:34:00 -0500 From: =?utf-8?Q?Ludovic_Court=C3=A8s?= To: 44760-done@debbugs.gnu.org Subject: Re: bug#44760: [PATCH 00/15] Speed up 'guix system init' & co. References: <87h7pkffzy.fsf@inria.fr> <20201211150919.18435-1-ludo@gnu.org> Date: Tue, 15 Dec 2020 17:33:57 +0100 In-Reply-To: <20201211150919.18435-1-ludo@gnu.org> ("Ludovic =?utf-8?Q?Cou?= =?utf-8?Q?rt=C3=A8s=22's?= message of "Fri, 11 Dec 2020 16:09:10 +0100") Message-ID: <873607m3ka.fsf@gnu.org> User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/27.1 (gnu/linux) MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable X-Spam-Score: -2.3 (--) X-Debbugs-Envelope-To: 44760-done X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -3.3 (---) Ludovic Court=C3=A8s skribis: > serialization: 'fold-archive' notifies about directory processing > completion. > serialization: 'restore-file' sets canonical timestamp and > permissions. > nar: Deduplicate files right as they are restored. > store-copy: 'populate-store' resets timestamps. > image: 'register-closure' assumes already-reset timestamps. > database: Remove #:reset-timestamps? from 'register-items'. > store-copy: 'populate-store' can optionally deduplicate files. > image: 'register-closure' leaves it up to the caller to deduplicate. > database: Remove #:deduplicate? from 'register-items'. > guix system: 'init' copies, resets timestamps, and deduplicates at > once. > database: Remove #:deduplicate? and #:reset-timestamps? from > 'register-path'. > system: 'init' does not recompute the hash of each store item. > database: Remove 'register-path'. > database: Honor 'SOURCE_DATE_EPOCH'. > deduplicate: Create the '.links' directory lazily. Pushed as 7530e491b517497b7b8166b5ccecdc3d4cdb468d! Ludo'. From debbugs-submit-bounces@debbugs.gnu.org Tue Dec 15 11:38:40 2020 Received: (at 44760) by debbugs.gnu.org; 15 Dec 2020 16:38:40 +0000 Received: from localhost ([127.0.0.1]:58956 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1kpDLU-0003tK-6M for submit@debbugs.gnu.org; Tue, 15 Dec 2020 11:38:40 -0500 Received: from eggs.gnu.org ([209.51.188.92]:50192) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1kpDLS-0003t2-VG for 44760@debbugs.gnu.org; Tue, 15 Dec 2020 11:38:39 -0500 Received: from fencepost.gnu.org ([2001:470:142:3::e]:53924) by eggs.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1kpDLN-0005j5-Pf for 44760@debbugs.gnu.org; Tue, 15 Dec 2020 11:38:33 -0500 Received: from [2a01:e0a:1d:7270:af76:b9b:ca24:c465] (port=35588 helo=ribbon) by fencepost.gnu.org with esmtpsa (TLS1.2:RSA_AES_256_CBC_SHA1:256) (Exim 4.82) (envelope-from ) id 1kpDLN-0000hU-D6 for 44760@debbugs.gnu.org; Tue, 15 Dec 2020 11:38:33 -0500 From: =?utf-8?Q?Ludovic_Court=C3=A8s?= To: 44760@debbugs.gnu.org Subject: Re: bug#44760: Closure copy in =?utf-8?Q?=E2=80=98guix?= system =?utf-8?Q?init=E2=80=99?= is inefficient References: <87h7pkffzy.fsf@inria.fr> Date: Tue, 15 Dec 2020 17:38:32 +0100 In-Reply-To: <87h7pkffzy.fsf@inria.fr> ("Ludovic =?utf-8?Q?Court=C3=A8s=22?= =?utf-8?Q?'s?= message of "Fri, 20 Nov 2020 12:02:25 +0100") Message-ID: <87y2hzkos7.fsf@gnu.org> User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/27.1 (gnu/linux) MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable X-Spam-Score: -2.3 (--) X-Debbugs-Envelope-To: 44760 X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -3.3 (---) Ludovic Court=C3=A8s skribis: > =E2=80=98guix system init=E2=80=99 ends by copying the system=E2=80=99s c= losure from the =E2=80=9Chost=E2=80=9D > store to the target store; it also initializes the database of that > target store. There=E2=80=99s one =E2=80=98guix system init=E2=80=99 optimization that wo= uld be nice to have in the installation image: instead of copying store items at all, simply move the store from its temporary location in /mnt/tmp/guix-inst to /mnt/gnu/store. That=E2=80=99d make it zero-copy. It would probably require a separate code path, but it=E2=80=99s worth considering. Thoughts? Ludo=E2=80=99. From debbugs-submit-bounces@debbugs.gnu.org Wed Dec 16 16:53:31 2020 Received: (at 44760) by debbugs.gnu.org; 16 Dec 2020 21:53:31 +0000 Received: from localhost ([127.0.0.1]:34100 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1kpejj-0007Q6-6n for submit@debbugs.gnu.org; Wed, 16 Dec 2020 16:53:31 -0500 Received: from mout.web.de ([212.227.15.4]:46509) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1kpejh-0007Pp-52 for 44760@debbugs.gnu.org; Wed, 16 Dec 2020 16:53:29 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=web.de; s=dbaedf251592; t=1608155603; bh=+1ACPWVEinG3Xk3fiyHqQsFAkSPQQG7xuMzKJ+16/m4=; h=X-UI-Sender-Class:To:From:Subject:Date; b=L28BRVVtoWpP5RcPu2+qrh+3jlII/sizXFxX6hRnKwCwLYtXFkJQZXt9bE4tEdh5e cVkk5vblCkgoBSlGZS9BFv6ZPdCSLrLNyV02qmnookJSaSgSueY6RyBapCYEX4C4/m 5yPXxjEstn7x+87wtL66WeaOxZS6tXR3FYUcR/YI= X-UI-Sender-Class: c548c8c5-30a9-4db5-a2e7-cb6cb037b8f9 Received: from [192.168.178.74] ([5.146.192.224]) by smtp.web.de (mrweb005 [213.165.67.108]) with ESMTPSA (Nemesis) id 1MIL0C-1kvDBG4AD6-00En1g for <44760@debbugs.gnu.org>; Wed, 16 Dec 2020 22:53:23 +0100 To: 44760@debbugs.gnu.org From: Jonathan Brielmaier Subject: =?UTF-8?Q?Closure_copy_in_=e2=80=98guix_system_init=e2=80=99_is_ine?= =?UTF-8?Q?fficient?= Message-ID: Date: Wed, 16 Dec 2020 22:53:22 +0100 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Icedove/78.6.0 MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Language: en-GB X-Provags-ID: V03:K1:9A8LxnezX/koySqFRHaejzY3X5NWzoYAeb9no2c89JZC3pBssZH B/wTS+4vaVCRUYBScIOtJsV8JCNrQ56vDmrWpnl/6EVYlyOBpuuzFEZ1nX/XG4LdlFidgC9 xjy0scvunM4KccUGthjKWutgD2aZaprD6sLC9LGSSZka9bXD2x2ua8y5EUDu2rx3u27Qlqr VcIKX8Bzs3YV8sZr/V+Yw== X-Spam-Flag: NO X-UI-Out-Filterresults: notjunk:1;V03:K0:TtgurN567ak=:YeTdkzkb2TXFJ0Y+yPkcJE vAS2JKmMIRDDOw8lYvSsJLHCQVU4RbsUbJJn13O2MODbEEjRzqlb+2MaZ08lWAoI4NHWwO9vK siRlm/IrmcUG0Zc+J+nzsGorWRpeU+CpLdzxjTeguVW/KoRbuT31RA0BeJuo+hsvpOfRB/56T ZR0gcQhXaacbDYTgRTkh19S+GShnVgrVjT5tyn8Z7n1/o1asv/GQzZugN6fIL+iW9Lcolzfg7 QkFJZlH3fv7sMM3ZINi/Qozz0Xmry9q2i7NJf85CSj4CaD/jmnaEFA/+q62tiLGetMoUxBxvs oTBAk2/KgyDKccFb1IVSHVLE2H2vVqlEK1kthuDxRu8VrqWyk25z41RHz78zUsfvvNoNJVlwY HheUdAesapMxxJV5uHpz1cdpEOlx1+AoubM3t2Hg/cGwUALRBZ688l6xy4oX6bhyUDaGw4aS/ XL/MTzxKtUkiv3oZrIl3LyvCQ5bRW8F769aR5LD1ANNUp23aeu0Wa8tzBxZARlTRcf9ChhAYD v2PlsDZX6P6k6oKk5iy/IJ0YmAuoswkNKJh35EiSP2hIRuCLt+55431MWO2ogWzjPTYyaz8nr JmakRjwogjUp6wdrUUud/VzTFeFQ13wPGqbh5MfwmISURMsxlH1MZ/+7If9MYqoTkoKiaXXBb Ou85Nson61IiD4P4EsLkNI7pxU9QjwyfCnpM/mO54vrmA4B6c+epl+T2CG5IwkPusZ955p/DU 5YW2c42gEsQEiYmYGeL7e8b2E70qAhD7E/J+XxTdbXmjElyQIXbuQP6NGEqFnE1VZHTdsZLYa vfru4JTtIOOrGHczF5wvQxP8u2ABDGpOSz2vN0oGFYYuhalvUSvPwTGlIlOCgPUmQN4aoLM8g a7PMo/4z9rp6LNKPzv2Og3YJgd4L2rXXYAEgphGkw= X-Spam-Score: -0.7 (/) X-Debbugs-Envelope-To: 44760 X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -1.7 (-) Hi Ludo, there seems to be another regression in this series. Pulling from 6a060ff27ff68384d7c90076baa36c349fff689d gives: ``` [ 1/20] Loading './guix/base32.scm'... [ 2/20] Loading './guix/build/store-copy.scm'... [ 3/20] Loading './guix/build/syscalls.scm'... [ 4/20] Loading './guix/build/utils.scm'... [ 5/20] Loading './guix/combinators.scm'... [ 6/20] Loading './guix/progress.scm'... [ 7/20] Loading './guix/records.scm'... [ 8/20] Loading './guix/serialization.scm'... [ 9/20] Loading './guix/sets.scm'... [10/20] Loading './guix/store/deduplication.scm'... ;;; Failed to autoload copy-file/deduplicate in (guix store deduplication): ;;; no code for module (gcrypt hash) Backtrace: 17 (primitive-load "/gnu/store/1fwvy0fz6zfgnyq1mj5pn8mfin7?") In ice-9/eval.scm: 619:8 16 (_ #f) In srfi/srfi-1.scm: 460:18 15 (fold # ?) 460:18 14 (fold # ?) 460:18 13 (fold # ?) In ice-9/eval.scm: 619:8 12 (_ #(#(#) # ?)) In ice-9/boot-9.scm: 2806:4 11 (save-module-excursion #) In unknown file: 10 (primitive-load "./guix/store/deduplication.scm") In ice-9/eval.scm: 721:20 9 (primitive-eval (define-module (guix store #) # (# #) ?)) In ice-9/psyntax.scm: 1241:36 8 (expand-top-sequence ((define-module (guix store #) ?)) ?) 1233:19 7 (parse _ (("placeholder" placeholder)) ((top) #(# # ?)) ?) 285:10 6 (parse _ (("placeholder" placeholder)) (()) _ c&e (eval) ?) In ice-9/eval.scm: 293:34 5 (_ #) In ice-9/boot-9.scm: 3380:4 4 (define-module* _ #:filename _ #:pure _ #:version _ # _ ?) 2565:24 3 (call-with-deferred-observers #) 3393:24 2 (_) 222:17 1 (map1 (((gcrypt hash)) ((guix build utils)) ((guix ?)) ?)) 3300:6 0 (resolve-interface (gcrypt hash) #:select _ #:hide _ # _ ?) ice-9/boot-9.scm:3300:6: In procedure resolve-interface: no code for module (gcrypt hash) ``` All commits before in this series are fine. The fix in d88ff09ea3138fc85c1463b0b345bd6ba71ca568 does not really help. From debbugs-submit-bounces@debbugs.gnu.org Thu Dec 17 08:24:51 2020 Received: (at 44760) by debbugs.gnu.org; 17 Dec 2020 13:24:51 +0000 Received: from localhost ([127.0.0.1]:35068 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1kptH1-00024L-63 for submit@debbugs.gnu.org; Thu, 17 Dec 2020 08:24:51 -0500 Received: from eggs.gnu.org ([209.51.188.92]:49384) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1kptGz-000246-Ds for 44760@debbugs.gnu.org; Thu, 17 Dec 2020 08:24:50 -0500 Received: from fencepost.gnu.org ([2001:470:142:3::e]:51912) by eggs.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1kptGt-0000T9-TM; Thu, 17 Dec 2020 08:24:43 -0500 Received: from [2a01:e0a:1d:7270:af76:b9b:ca24:c465] (port=39718 helo=ribbon) by fencepost.gnu.org with esmtpsa (TLS1.2:RSA_AES_256_CBC_SHA1:256) (Exim 4.82) (envelope-from ) id 1kptGs-0004pb-7B; Thu, 17 Dec 2020 08:24:43 -0500 From: =?utf-8?Q?Ludovic_Court=C3=A8s?= To: Jonathan Brielmaier Subject: Re: bug#44760: Closure copy in =?utf-8?Q?=E2=80=98guix?= system =?utf-8?Q?init=E2=80=99?= is inefficient References: <87h7pkffzy.fsf@inria.fr> Date: Thu, 17 Dec 2020 14:24:41 +0100 In-Reply-To: (Jonathan Brielmaier's message of "Wed, 16 Dec 2020 22:53:22 +0100") Message-ID: <87v9d0imzq.fsf@gnu.org> User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/27.1 (gnu/linux) MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable X-Spam-Score: -2.3 (--) X-Debbugs-Envelope-To: 44760 Cc: 44760@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -3.3 (---) Hi, Jonathan Brielmaier skribis: > there seems to be another regression in this series. > > Pulling from 6a060ff27ff68384d7c90076baa36c349fff689d gives: > ``` > [ 1/20] Loading './guix/base32.scm'... [...] > [10/20] Loading './guix/store/deduplication.scm'... > > ;;; Failed to autoload copy-file/deduplicate in (guix store deduplication= ): > > ;;; no code for module (gcrypt hash) > > Backtrace: > > 17 (primitive-load "/gnu/store/1fwvy0fz6zfgnyq1mj5pn8mfin7?") > > In ice-9/eval.scm: > > 619:8 16 (_ #f) > > In srfi/srfi-1.scm: > > 460:18 15 (fold # ?) > > 460:18 14 (fold # ?) > > 460:18 13 (fold # ?) > > In ice-9/eval.scm: > > 619:8 12 (_ #(#(#) # ?)) > > In ice-9/boot-9.scm: > > 2806:4 11 (save-module-excursion #) > > In unknown file: > > 10 (primitive-load "./guix/store/deduplication.scm") > > In ice-9/eval.scm: > > 721:20 9 (primitive-eval (define-module (guix store #) # (# #) ?)) > > In ice-9/psyntax.scm: > > 1241:36 8 (expand-top-sequence ((define-module (guix store #) ?)) ?) > > 1233:19 7 (parse _ (("placeholder" placeholder)) ((top) #(# # ?)) ?) > > 285:10 6 (parse _ (("placeholder" placeholder)) (()) _ c&e (eval) ?) > > In ice-9/eval.scm: > > 293:34 5 (_ #) > > In ice-9/boot-9.scm: > > 3380:4 4 (define-module* _ #:filename _ #:pure _ #:version _ # _ ?) > > 2565:24 3 (call-with-deferred-observers #) > > 3393:24 2 (_) > > 222:17 1 (map1 (((gcrypt hash)) ((guix build utils)) ((guix ?)) ?)) > > 3300:6 0 (resolve-interface (gcrypt hash) #:select _ #:hide _ # _ ?) > > > > ice-9/boot-9.scm:3300:6: In procedure resolve-interface: > > no code for module (gcrypt hash) As discussed on IRC, the following command works for me: guix time-machine \ --commit=3D6a060ff27ff68384d7c90076baa36c349fff689d -- \ pull -p /tmp/test However, I don=E2=80=99t have enough information about the issue. Could you provide information on how to reproduce the issue? Thanks in advance, Ludo=E2=80=99. From unknown Mon Sep 08 15:07:26 2025 Received: (at fakecontrol) by fakecontrolmessage; To: internal_control@debbugs.gnu.org From: Debbugs Internal Request Subject: Internal Control Message-Id: bug archived. Date: Fri, 15 Jan 2021 12:24:05 +0000 User-Agent: Fakemail v42.6.9 # This is a fake control message. # # The action: # bug archived. thanks # This fakemail brought to you by your local debbugs # administrator