GNU bug report logs - #51427
[PATCH] nix: libstore: Do not remove unused links when deleting specific items.

Previous Next

Package: guix-patches;

Reported by: Maxim Cournoyer <maxim.cournoyer <at> gmail.com>

Date: Wed, 27 Oct 2021 03:50:02 UTC

Severity: normal

Tags: patch

Full log


View this message in rfc822 format

From: Maxim Cournoyer <maxim.cournoyer <at> gmail.com>
To: Ludovic Courtès <ludo <at> gnu.org>
Cc: 51427 <at> debbugs.gnu.org, Liliana Marie Prikler <liliana.prikler <at> gmail.com>
Subject: [bug#51427] [PATCH] nix: libstore: Do not remove unused links when deleting specific items.
Date: Mon, 08 Nov 2021 23:11:22 -0500
Hi,

Ludovic Courtès <ludo <at> gnu.org> writes:

> Hi Maxim and all,
>
> Maxim Cournoyer <maxim.cournoyer <at> gmail.com> skribis:
>
>> Ludovic Courtès <ludo <at> gnu.org> writes:
>
> [...]
>
>>> You seem to be proposing to remove ‘-D’ altogether.  I agree it has the
>>> shortcomings you write, but I think it’s occasionally useful
>>> nonetheless.
>>>
>>> My proposal would be either the status quo, or removing just the one
>>> link that matters from /gnu/store/.links upon ‘-D’.
>>
>> The second proposal makes sense.
>
> Maybe that proposal is bogus though because you’d need to know the hash
> of the files being removed, which means reading them…

Oops :-).

>> I didn't care about freeing space, as my use case was getting around
>> corrupting an item in my store due to
>> https://issues.guix.gnu.org/51400, which the patch proposed here
>> allowed me to do without wasting hours of cleaning up links (nearly 1
>> GiB of store on spinning drives).
>
> The ideal solution as zimoun writes would be to address
> <https://issues.guix.gnu.org/24937>.

Seems there's some improvement ready, but which needs more
testing/measurements?  I'd suggest simply invoking GNU sort; if it has
many pages of program for doing what it does, it's probably doing
something fancier/faster than we can (are ready to) emulate -- for free!

> Perhaps that phase needs to be implemented using a different strategy,
> say an sqlite database that records the current link count (hoping that
> ‘SELECT * FROM links WHERE NLINKS = 1’ would be faster than traversing
> all of ‘.links’) as well as a mapping from store item to file hashes.

Hmm.  I'll need to dive in the problem a bit more before I can comment
on this.

> BTW, those using Btrfs can probably use ‘--disable-deduplication’ and be
> done with it.

I erroneously used to think that Btrfs could do live deduplication, but
it doesn't.  There are external tools to do out of band / batch
deduplication though [0]; so if they perform better than the guix daemon's
own dedup, perhaps we could document this way out for our Btrfs users.

[0]  https://btrfs.wiki.kernel.org/index.php/Deduplication

Thank you,

Maxim




This bug report was last modified 2 years and 1 day ago.

Previous Next


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.