GNU bug report logs - #45676
Store references inside compressed data


Package: guix

Reported by: Miguel Ángel Arruga Vivas <rosen644835 <at> gmail.com>

Date: Tue, 5 Jan 2021 14:44:02 UTC

Severity: wishlist

To reply to this bug, email your comments to 45676 AT debbugs.gnu.org.


Report forwarded to bug-guix <at> gnu.org:
bug#45676; Package guix. (Tue, 05 Jan 2021 14:44:02 GMT) Full text and rfc822 format available.

Acknowledgement sent to Miguel Ángel Arruga Vivas <rosen644835 <at> gmail.com>:
New bug report received and forwarded. Copy sent to bug-guix <at> gnu.org. (Tue, 05 Jan 2021 14:44:02 GMT) Full text and rfc822 format available.

Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):

From: Miguel Ángel Arruga Vivas <rosen644835 <at> gmail.com>
To: bug-guix <at> gnu.org
Subject: Store references inside compressed data
Date: Tue, 05 Jan 2021 15:36:07 +0100
There are several binary formats that allow compression of the
executable image, or some of its data, which is decompressed at runtime:

  - Kernel images.
  - Compressed libraries: e.g. Smalltalk modules.
  - Compressed executable or data files: e.g. library.el.gz.

These aren't taken into account by the grafting process, which may lead
to issues when store paths are located inside that kind of file.
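
To make the concern concrete, here is a minimal Python sketch (not Guix
code).  It assumes that references are found by searching a file's raw
bytes for store file names and that grafting substitutes equal-length
names byte for byte; the store file names and helper functions below are
made up for illustration.

```python
import gzip

# Hypothetical store file names of equal length, as a graft would need.
OLD = b"/gnu/store/aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa-flac-1.3.3"
NEW = b"/gnu/store/bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb-flac-1.3.3"


def has_reference(data: bytes, store_name: bytes) -> bool:
    """Naive scanner: look for the store file name in the raw bytes."""
    return store_name in data


def naive_graft(data: bytes, old: bytes, new: bytes) -> bytes:
    """Naive graft: byte-for-byte substitution of equal-length names."""
    assert len(old) == len(new)
    return data.replace(old, new)


# Stand-in for the content of a library.el file.
source = b'(defvar player "' + OLD + b'/bin/flac")\n' * 20
compressed = gzip.compress(source)  # stand-in for library.el.gz

print(has_reference(source, OLD))      # plain file: the name is visible
print(has_reference(compressed, OLD))  # gzipped file: the name is hidden

# Grafting the compressed bytes changes nothing, so the old name is
# still there once the file is decompressed at runtime.
grafted = naive_graft(compressed, OLD, NEW)
print(OLD in gzip.decompress(grafted))
```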




Message #8 received at 45676 <at> debbugs.gnu.org (full text, mbox):

From: Leo Famulari <leo <at> famulari.name>
To: Miguel Ángel Arruga Vivas <rosen644835 <at> gmail.com>
Cc: 45676 <at> debbugs.gnu.org
Subject: Re: bug#45676: Store references inside compressed data
Date: Tue, 5 Jan 2021 15:22:10 -0500
On Tue, Jan 05, 2021 at 03:36:07PM +0100, Miguel Ángel Arruga Vivas wrote:
> There are several binary formats that allow compression of the
> executable image, or some of its data, which is decompressed at runtime:
> 
>   - Kernel images.
>   - Compressed libraries: e.g. Smalltalk modules.
>   - Compressed executable or data files: e.g. library.el.gz.
> 
> These aren't taken into account by the grafting process, which may lead
> to issues when store paths are located inside that kind of file.

It's a serious problem, and not just because of grafting. These obscured
references can cause things to be garbage collected inappropriately.

Here is an older case of the same problem:

https://bugs.gnu.org/24703

It was resolved by patching GCC.




Message #11 received at 45676 <at> debbugs.gnu.org (full text, mbox):

From: Leo Famulari <leo <at> famulari.name>
To: Miguel Ángel Arruga Vivas <rosen644835 <at> gmail.com>
Cc: 45676 <at> debbugs.gnu.org
Subject: Re: bug#45676: Store references inside compressed data
Date: Tue, 5 Jan 2021 15:22:33 -0500
On Tue, Jan 05, 2021 at 03:36:07PM +0100, Miguel Ángel Arruga Vivas wrote:
> There are several binary formats that allow compression of the
> executable image, or some of its data, which is decompressed at runtime:
> 
>   - Kernel images.
>   - Compressed libraries: e.g. Smalltalk modules.
>   - Compressed executable or data files: e.g. library.el.gz.
> 
> These aren't taken into account by the grafting process, which may lead
> to issues when store paths are located inside that kind of file.

If you have specific instances of this type of bug, please report them.




Message #14 received at submit <at> debbugs.gnu.org (full text, mbox):

From: Tobias Geerinckx-Rice <me <at> tobias.gr>
To: Miguel Ángel Arruga Vivas <rosen644835 <at> gmail.com>
Cc: 45676 <at> debbugs.gnu.org, bug-guix <at> gnu.org
Subject: Re: bug#45676: Store references inside compressed data
Date: Tue, 05 Jan 2021 23:33:59 +0100
Hi!

Miguel Ángel Arruga Vivas wrote:
> These aren't taken into account by the grafting process, which 
> may lead
> to issues when store paths are located inside that kind of
> file.

It's true.  It's a known trade-off of an otherwise 
almost-zero-effort yet fast reference scanner.  I don't think it's 
a bug per se, but it is something of which to be aware.  I also 
think this trade-off is worth it.

Luckily, this case is easier to fix than the infamous 
<http://issues.guix.gnu.org/24703>, because the right solution is 
simple:

>   - Compressed libraries: e.g. Smalltalk modules.
>   - Compressed executable or data files: e.g. library.el.gz.

Let's stop installing compressed executables & data files.  We 
already avoid compressed .jars and other renamed zip files.  It 
ain't right.

It's not 1998, my hard drive isn't 1.1GB, and I didn't just 
reinstall Slackware because I ‘accidentally’ gzexe'd gzip.

Gzipping a tiny handful of Lisp or Smalltalk files is pointless 
when zstd {,de}compresses my entire 500GB SSD better and faster, 
at the file system level where it now squarely belongs.  Without 
breaking Guix.

Kind regards,

T G-R

Message #20 received at 45676 <at> debbugs.gnu.org (full text, mbox):

From: Leo Prikler <leo.prikler <at> student.tugraz.at>
To: Tobias Geerinckx-Rice <me <at> tobias.gr>, Miguel Ángel
 Arruga Vivas <rosen644835 <at> gmail.com>
Cc: 45676 <at> debbugs.gnu.org
Subject: Re: bug#45676: Store references inside compressed data
Date: Wed, 06 Jan 2021 09:54:25 +0100
Hi!
On Tuesday, 05.01.2021 at 23:33 +0100, Tobias Geerinckx-Rice wrote:
> Let's stop installing compressed executables & data files.  We 
> already avoid compressed .jars and other renamed zip files.  It 
> ain't right.
Would this be strictly necessary even if the same references are kept
through other files, e.g. uncompressed binaries?
I'll attach a patch that fixes Emacs, just in case.

Regards, Leo
[0001-gnu-emacs-Don-t-install-compressed-archives.patch (text/x-patch, attachment)]

Message #23 received at 45676 <at> debbugs.gnu.org (full text, mbox):

From: Ludovic Courtès <ludo <at> gnu.org>
To: Leo Famulari <leo <at> famulari.name>
Cc: 45676 <at> debbugs.gnu.org,
 Miguel Ángel Arruga Vivas <rosen644835 <at> gmail.com>
Subject: Re: bug#45676: Store references inside compressed data
Date: Wed, 06 Jan 2021 12:35:34 +0100
Hi,

Leo Famulari <leo <at> famulari.name> writes:

> On Tue, Jan 05, 2021 at 03:36:07PM +0100, Miguel Ángel Arruga Vivas wrote:
>> There are several binary formats that allow compression of the
>> executable image, or some of its data, which is decompressed at runtime:
>> 
>>   - Kernel images.
>>   - Compressed libraries: e.g. Smalltalk modules.
>>   - Compressed executable or data files: e.g. library.el.gz.
>> 
>> These aren't taken into account by the grafting process, which may lead
>> to issues when store paths are located inside that kind of file.
>
> If you have specific instances of this type of bug, please report them.

Agreed.  The general issue is “well known” as we say, but what I think
we need to do is look for specific instances and address them.

Ludo’.




Severity set to 'wishlist' from 'normal'.  Request was from Miguel Ángel Arruga Vivas <rosen644835 <at> gmail.com> to control <at> debbugs.gnu.org. (Wed, 06 Jan 2021 15:04:02 GMT)

Message #28 received at 45676 <at> debbugs.gnu.org (full text, mbox):

From: Miguel Ángel Arruga Vivas <rosen644835 <at> gmail.com>
To: Ludovic Courtès <ludo <at> gnu.org>
Cc: 45676 <at> debbugs.gnu.org, Leo Famulari <leo <at> famulari.name>
Subject: Re: bug#45676: Store references inside compressed data
Date: Wed, 06 Jan 2021 17:57:43 +0100
Hi Ludo and Leo,

Ludovic Courtès <ludo <at> gnu.org> writes:

> Hi,
>
> Leo Famulari <leo <at> famulari.name> writes:
>
>> On Tue, Jan 05, 2021 at 03:36:07PM +0100, Miguel Ángel Arruga Vivas wrote:
>>> There are several binary formats that allow compression of the
>>> executable image, or some of its data, which is decompressed at runtime:
>>> 
>>>   - Kernel images.
>>>   - Compressed libraries: e.g. Smalltalk modules.
>>>   - Compressed executable or data files: e.g. library.el.gz.
>>> 
>>> These aren't taken into account by the grafting process, which may lead
>>> to issues when store paths are located inside that kind of file.
>>
>> If you have specific instances of this type of bug, please report them.
>
> Agreed.  The general issue is “well known” as we say, but what I think
> we need to do is look for specific instances and address them.

It can be tagged notabug if you think that is appropriate.  I've tagged
it as wishlist (I should have done so before) for that reason (it's "well
known"), but I haven't found any specific instance yet.  OTOH, I think
it might be closely related to #33848, as both issues could be solved
by extending the dumpPath code path, or its Scheme equivalent, as
pointed out there.
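
The thread does not show what such an extension would look like.  As a
rough Python sketch only (hypothetical names, not the daemon's dumpPath
code nor its Scheme counterpart), a scanner could look at the
decompressed payload in addition to the raw bytes whenever a file starts
with the gzip magic number.  Treating candidates as plain byte strings is
an assumption made for the example.

```python
import gzip
import zlib

GZIP_MAGIC = b"\x1f\x8b"


def scan_references(data: bytes, candidates: list[bytes]) -> set[bytes]:
    """Return the candidates found in data or inside its gzip payload."""
    views = [data]
    if data.startswith(GZIP_MAGIC):
        try:
            views.append(gzip.decompress(data))
        except (OSError, EOFError, zlib.error):
            pass  # looks like gzip but is not; fall back to raw bytes
    return {c for c in candidates if any(c in view for view in views)}
```

Other container formats (xz, zstd, zip, compressed kernel images) would
each need their own handling, which is where the extra cost at build and
graft time would come from.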

Happy hacking!
Miguel




Message #31 received at 45676 <at> debbugs.gnu.org (full text, mbox):

From: Miguel Ángel Arruga Vivas <rosen644835 <at> gmail.com>
To: Tobias Geerinckx-Rice <me <at> tobias.gr>
Cc: 45676 <at> debbugs.gnu.org
Subject: Re: bug#45676: Store references inside compressed data
Date: Wed, 06 Jan 2021 19:40:55 +0100
Hi!

Tobias Geerinckx-Rice <me <at> tobias.gr> writes:

> It's true.  It's a known trade-off of an otherwise almost-zero-effort
> yet fast reference scanner.  I don't think it's a bug per se, but it
> is something of which to be aware.
>
> Let's stop installing compressed executables & data files.  We already
> avoid compressed .jars and other renamed zip files.

This is the current trade-off between build time and closure size for
executable code, but it isn't the current status regarding data files.

> Gzipping a tiny handful of Lisp or Smalltalk files is pointless when
> zstd {,de}compresses my entire 500GB SSD better and faster, at the
> file system level where it now squarely belongs.

Not every system has a file system with compression, nor do most of us
mortals have an SSD to test that. ;-)

> Without breaking Guix.

The number of software bugs grows with the number of lines of code, and
this would probably end up adding more, so I get the idea, hehe. :-P

With your proposal, closures wouldn't benefit from the "standard tricks"
used by package maintainers to reduce their footprint on uncompressed
file systems.  Having an option to remove that compression seems best
for handling it at the file system level (perhaps wrappers that make
the compression tools always use -0 would do most of the trick), but
I'd still like to have the option of paying for the storage savings at
build/graft time.  Of course, this is still only a wish.
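
As a small illustration of the -0 idea, here is a Python sketch using the
standard gzip module rather than any actual wrapper.  At compression
level 0 the DEFLATE stream consists of "stored" blocks that copy the
input verbatim, so a byte-level search can still see store file names.
The store file name and the Lisp line are made up.

```python
import gzip

# Made-up store file name and file content, for illustration only.
STORE_NAME = b"/gnu/store/aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa-emms-6.0"
source = b'(setq emms-info "' + STORE_NAME + b'/bin/emms-print-metadata")\n' * 10

stored = gzip.compress(source, compresslevel=0)    # like "gzip -0"
deflated = gzip.compress(source, compresslevel=9)  # like "gzip -9"

print(STORE_NAME in stored)    # name is still visible to a byte search
print(STORE_NAME in deflated)  # name is hidden by real compression
print(len(source), len(stored), len(deflated))
```

The result is still a valid .gz file, so consumers need no changes, but
it saves no space.  In a large file, a block header could in principle
fall in the middle of a name and split it, so this is not a complete
answer either.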

Happy hacking!
Miguel




Message #34 received at 45676 <at> debbugs.gnu.org (full text, mbox):

From: Ludovic Courtès <ludo <at> gnu.org>
To: Miguel Ángel Arruga Vivas <rosen644835 <at> gmail.com>
Cc: 45676 <at> debbugs.gnu.org, Leo Famulari <leo <at> famulari.name>
Subject: Re: bug#45676: Store references inside compressed data
Date: Thu, 07 Jan 2021 12:05:30 +0100
Howdy,

Miguel Ángel Arruga Vivas <rosen644835 <at> gmail.com> writes:

> Ludovic Courtès <ludo <at> gnu.org> writes:
>
>> Hi,
>>
>> Leo Famulari <leo <at> famulari.name> writes:
>>
>>> On Tue, Jan 05, 2021 at 03:36:07PM +0100, Miguel Ángel Arruga Vivas wrote:
>>>> There are several binary formats that allow compression of the
>>>> executable image, or some of its data, which is decompressed at runtime:
>>>> 
>>>>   - Kernel images.
>>>>   - Compressed libraries: e.g. Smalltalk modules.
>>>>   - Compressed executable or data files: e.g. library.el.gz.
>>>> 
>>>> These aren't taken into account by the grafting process, which may lead
>>>> to issues when store paths are located inside that kind of file.
>>>
>>> If you have specific instances of this type of bug, please report them.
>>
>> Agreed.  The general issue is “well known” as we say, but what I think
>> we need to do is look for specific instances and address them.
>
> It can be tagged notabug if you think that is appropriate.  I've tagged
> it as wishlist (I should have done so before) for that reason (it's "well
> known"), but I haven't found any specific instance yet.  OTOH, I think
> it might be closely related to #33848, as both issues could be solved
> by extending the dumpPath code path, or its Scheme equivalent, as
> pointed out there.

Yes, though I’d prefer simple workarounds if possible—after all, we’ve
lived with it since the beginning and there’s only ever been a handful
of instances of that problem (one of them was really tricky, see
‘gcc-strmov-store-file-names.patch’…).

Ludo’.




Message #37 received at 45676 <at> debbugs.gnu.org (full text, mbox):

From: Ludovic Courtès <ludo <at> gnu.org>
To: Leo Prikler <leo.prikler <at> student.tugraz.at>
Cc: 45676 <at> debbugs.gnu.org, Tobias Geerinckx-Rice <me <at> tobias.gr>,
 Miguel Ángel Arruga Vivas <rosen644835 <at> gmail.com>
Subject: Re: bug#45676: Store references inside compressed data
Date: Thu, 14 Jan 2021 22:31:30 +0100
Hi Leo,

Leo Prikler <leo.prikler <at> student.tugraz.at> writes:

> From 57c23bf6ecac79c397cb49ff251176ec3a7b1cf5 Mon Sep 17 00:00:00 2001
> From: Leo Prikler <leo.prikler <at> student.tugraz.at>
> Date: Wed, 6 Jan 2021 09:24:07 +0100
> Subject: [PATCH] gnu: emacs: Don't install compressed archives.
>
> See <http://issues.guix.gnu.org/45676#3>.

Perhaps make it a comment next to the option.

> * gnu/packages/emacs.scm (emacs)[#:configure-flags]:
> Add --without-compress-install.
> (emacs-minimal)[#:configure-flags]: Likewise.

[...]

> +                               "--without-compress-install"

Does that disable .el file compression altogether for Emacs’ own files?

If so, isn’t it too much?  Do these files currently contain store file
names?

(I know EMMS .el files for instance are full of store file names, so
that one should definitely not be gzipped, but Emacs itself may be
fine?)
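
One way to answer that question for a given package would be to look
inside its installed .gz files.  A minimal Python sketch, assuming that
store file names can be recognized by the "/gnu/store/" prefix; the
function name is hypothetical and this is not an existing Guix tool.

```python
import gzip
import zlib
from pathlib import Path


def gz_files_with_store_names(root, marker=b"/gnu/store/"):
    """List *.gz files under root whose decompressed content has marker."""
    hits = []
    for path in sorted(Path(root).rglob("*.gz")):
        try:
            content = gzip.decompress(path.read_bytes())
        except (OSError, EOFError, zlib.error):
            continue  # unreadable, or not actually gzip data
        if marker in content:
            hits.append(path)
    return hits
```

Calling it on the output directory of a built package returns the
compressed files that would need attention; an empty list suggests that
compression is harmless for that package.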

Ludo’.




Message #40 received at 45676 <at> debbugs.gnu.org (full text, mbox):

From: Leo Prikler <leo.prikler <at> student.tugraz.at>
To: Ludovic Courtès <ludo <at> gnu.org>
Cc: 45676 <at> debbugs.gnu.org, Tobias Geerinckx-Rice <me <at> tobias.gr>,
 Miguel Ángel Arruga Vivas <rosen644835 <at> gmail.com>
Subject: Re: bug#45676: Store references inside compressed data
Date: Thu, 14 Jan 2021 23:24:18 +0100
Hi Ludo,

On Thursday, 14.01.2021 at 22:31 +0100, Ludovic Courtès wrote:
> Hi Leo,
> 
> Leo Prikler <leo.prikler <at> student.tugraz.at> writes:
> 
> > From 57c23bf6ecac79c397cb49ff251176ec3a7b1cf5 Mon Sep 17 00:00:00
> > 2001
> > From: Leo Prikler <leo.prikler <at> student.tugraz.at>
> > Date: Wed, 6 Jan 2021 09:24:07 +0100
> > Subject: [PATCH] gnu: emacs: Don't install compressed archives.
> > 
> > See <http://issues.guix.gnu.org/45676#3>.
> 
> Perhaps make it a comment next to the option.
I'll keep that in mind, but I wasn't going to commit this unless it is
absolutely needed.

> > * gnu/packages/emacs.scm (emacs)[#:configure-flags]:
> > Add --without-compress-install.
> > (emacs-minimal)[#:configure-flags]: Likewise.
> 
> [...]
> 
> > +                               "--without-compress-install"
> 
> Does that disable .el file compression altogether for Emacs’ own
> files?
> 
> If so, isn’t it too much?  Do these files currently contain store file
> names?
> 
> (I know EMMS .el files for instance are full of store file names, so
> that one should definitely not be gzipped, but Emacs itself may be
> fine?)
As far as I know, this is an all-or-nothing deal.  If I'm not mistaken,
however, all those references should still exist in the compiled (and
not compressed) .elc files, so it makes little difference.
Perhaps time stamps could be added during compression, but I think our
Emacs reproducibility issues lie elsewhere as well.

All in all, I don't think there's a technical reason to do this (yet),
merely the somewhat purist stance of "no compressed source files".

Regards,
Leo





This bug report was last modified 4 years and 152 days ago.
