Package: guix-patches
Reported by: Maxim Cournoyer <maxim.cournoyer <at> gmail.com>
Date: Thu, 4 Jan 2024 15:18:01 UTC
Severity: normal
Tags: patch
Done: Maxim Cournoyer <maxim.cournoyer <at> gmail.com>
Bug is archived. No further changes may be made.
Message #58 received at 68242 <at> debbugs.gnu.org:
From: Maxim Cournoyer <maxim.cournoyer <at> gmail.com>
To: Ludovic Courtès <ludo <at> gnu.org>
Cc: 68242 <at> debbugs.gnu.org
Subject: Re: bug#68242: [core-updates] Compress man pages using zstd
Date: Mon, 08 Jan 2024 20:17:51 -0500

Hi Ludovic!

Ludovic Courtès <ludo <at> gnu.org> writes:

> Maxim Cournoyer <maxim.cournoyer <at> gmail.com> skribis:
>
>> The aim is to improve the efficiency of computing the man pages database,
>> which must decompress the man pages. Zstd is faster than gzip, especially for
>> decompression, and has a similar compression ratio.
>>
>> * gnu/packages/commencement.scm (%final-inputs): Add zstd.
>> * guix/build/gnu-build-system.scm
>> (compress-documentation) Update doc.
>> <info-compressor, info-compressor-flags, man-compressor, man-compressor-flags>
>> <man-compressor-file-extension>: New arguments.
>> <compressed-documentation-extension>: Rename argument to...
>> <info-compressor-file-extension>: ... this. Add an 'extension' argument to
>> the retarget-symlink nested procedure. Use new arguments in nested
>> 'maybe-compress' procedure.
>>
>> Change-Id: Ibaad4658f8e5151633714d263d9198f56d255020
>
> That’s a great idea, LGTM!

Thank you for the review!

> Do you have figures on the space savings of a package with many man
> pages such as gnutls:doc or openssl:doc?

Surprisingly, all of the ones I've checked weighed the same. Here's
gnutls:doc from my local (master) Guix:

--8<---------------cut here---------------start------------->8---
$ du -sh /gnu/store/8i3bas6lhziqi2n5wg6qzzhlddkb502c-gnutls-3.7.7-doc
4,9M	/gnu/store/8i3bas6lhziqi2n5wg6qzzhlddkb502c-gnutls-3.7.7-doc
--8<---------------cut here---------------end--------------->8---

Compared to core-updates with these changes:

--8<---------------cut here---------------start------------->8---
$ du -sh /gnu/store/h3lbj1g64lkn9rd9xp86dphqnblxqkl6-gnutls-3.8.1-doc
4.9M	/gnu/store/h3lbj1g64lkn9rd9xp86dphqnblxqkl6-gnutls-3.8.1-doc
--8<---------------cut here---------------end--------------->8---

That's because all the compressed man pages appear to fit within the
4 KiB minimum that a single file occupies on disk, whether they are
compressed with gzip or with zstd.

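The block-size effect described above can be checked by summing the byte
sizes of the compressed pages instead of their on-disk usage. What follows
is only a minimal sketch, assuming a POSIX shell with find, cat and wc; the
helper name apparent_bytes is made up for illustration and is not part of
Guix.

```shell
# apparent_bytes DIR PATTERN
# Print the total size in bytes of all regular files under DIR whose
# name matches PATTERN.  This counts file contents, not allocated
# blocks, so it is not rounded up to the file system block size.
# (Hypothetical helper, for illustration only.)
apparent_bytes() {
    find "$1" -type f -name "$2" -exec cat {} + | wc -c | tr -d ' '
}
```

Calling it as apparent_bytes DIR '*.gz' on the gzip-compressed store item
and as apparent_bytes DIR '*.zst' on the zstd-compressed one should expose
the byte-level difference that 'du -sh' rounds away. GNU du also offers
the --apparent-size option for the same purpose.
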
Both man-pages packages weigh 11 MiB, but we can get an idea of the
compression ratio by listing the largest compressed pages in each.

With my local Guix:

--8<---------------cut here---------------start------------->8---
$ find $(guix build man-pages) -name '*.gz' | xargs -n1 du | sort -rn | head -n20
64	/gnu/store/93fjc9hv5canvs2lpya0qsbcm44hq7hh-man-pages-6.02/share/man/man5/proc.5.gz
44	/gnu/store/93fjc9hv5canvs2lpya0qsbcm44hq7hh-man-pages-6.02/share/man/man7/bpf-helpers.7.gz
32	/gnu/store/93fjc9hv5canvs2lpya0qsbcm44hq7hh-man-pages-6.02/share/man/man2/perf_event_open.2.gz
28	/gnu/store/93fjc9hv5canvs2lpya0qsbcm44hq7hh-man-pages-6.02/share/man/man2/ptrace.2.gz
20	/gnu/store/93fjc9hv5canvs2lpya0qsbcm44hq7hh-man-pages-6.02/share/man/man7/tcp.7.gz
20	/gnu/store/93fjc9hv5canvs2lpya0qsbcm44hq7hh-man-pages-6.02/share/man/man7/cgroups.7.gz
20	/gnu/store/93fjc9hv5canvs2lpya0qsbcm44hq7hh-man-pages-6.02/share/man/man2/seccomp_unotify.2.gz
20	/gnu/store/93fjc9hv5canvs2lpya0qsbcm44hq7hh-man-pages-6.02/share/man/man2/prctl.2.gz
20	/gnu/store/93fjc9hv5canvs2lpya0qsbcm44hq7hh-man-pages-6.02/share/man/man2/open.2.gz
20	/gnu/store/93fjc9hv5canvs2lpya0qsbcm44hq7hh-man-pages-6.02/share/man/man2/futex.2.gz
20	/gnu/store/93fjc9hv5canvs2lpya0qsbcm44hq7hh-man-pages-6.02/share/man/man2/fcntl.2.gz
16	/gnu/store/93fjc9hv5canvs2lpya0qsbcm44hq7hh-man-pages-6.02/share/man/man7/user_namespaces.7.gz
16	/gnu/store/93fjc9hv5canvs2lpya0qsbcm44hq7hh-man-pages-6.02/share/man/man7/socket.7.gz
16	/gnu/store/93fjc9hv5canvs2lpya0qsbcm44hq7hh-man-pages-6.02/share/man/man7/man-pages.7.gz
16	/gnu/store/93fjc9hv5canvs2lpya0qsbcm44hq7hh-man-pages-6.02/share/man/man7/ip.7.gz
16	/gnu/store/93fjc9hv5canvs2lpya0qsbcm44hq7hh-man-pages-6.02/share/man/man7/cpuset.7.gz
16	/gnu/store/93fjc9hv5canvs2lpya0qsbcm44hq7hh-man-pages-6.02/share/man/man7/capabilities.7.gz
16	/gnu/store/93fjc9hv5canvs2lpya0qsbcm44hq7hh-man-pages-6.02/share/man/man5/elf.5.gz
16	/gnu/store/93fjc9hv5canvs2lpya0qsbcm44hq7hh-man-pages-6.02/share/man/man2/seccomp.2.gz
16	/gnu/store/93fjc9hv5canvs2lpya0qsbcm44hq7hh-man-pages-6.02/share/man/man2/keyctl.2.gz
--8<---------------cut here---------------end--------------->8---

On core-updates with these changes:

--8<---------------cut here---------------start------------->8---
$ find /gnu/store/nqp5mmi1kb4xp7nkqsybrp5i18lygsl2-man-pages-6.02 -name '*.zst' | xargs -n1 du | sort -rn | head -n20
56	/gnu/store/nqp5mmi1kb4xp7nkqsybrp5i18lygsl2-man-pages-6.02/share/man/man5/proc.5.zst
36	/gnu/store/nqp5mmi1kb4xp7nkqsybrp5i18lygsl2-man-pages-6.02/share/man/man7/bpf-helpers.7.zst
28	/gnu/store/nqp5mmi1kb4xp7nkqsybrp5i18lygsl2-man-pages-6.02/share/man/man2/perf_event_open.2.zst
24	/gnu/store/nqp5mmi1kb4xp7nkqsybrp5i18lygsl2-man-pages-6.02/share/man/man2/ptrace.2.zst
20	/gnu/store/nqp5mmi1kb4xp7nkqsybrp5i18lygsl2-man-pages-6.02/share/man/man7/tcp.7.zst
20	/gnu/store/nqp5mmi1kb4xp7nkqsybrp5i18lygsl2-man-pages-6.02/share/man/man2/seccomp_unotify.2.zst
20	/gnu/store/nqp5mmi1kb4xp7nkqsybrp5i18lygsl2-man-pages-6.02/share/man/man2/prctl.2.zst
20	/gnu/store/nqp5mmi1kb4xp7nkqsybrp5i18lygsl2-man-pages-6.02/share/man/man2/futex.2.zst
20	/gnu/store/nqp5mmi1kb4xp7nkqsybrp5i18lygsl2-man-pages-6.02/share/man/man2/fcntl.2.zst
16	/gnu/store/nqp5mmi1kb4xp7nkqsybrp5i18lygsl2-man-pages-6.02/share/man/man7/user_namespaces.7.zst
16	/gnu/store/nqp5mmi1kb4xp7nkqsybrp5i18lygsl2-man-pages-6.02/share/man/man7/man-pages.7.zst
16	/gnu/store/nqp5mmi1kb4xp7nkqsybrp5i18lygsl2-man-pages-6.02/share/man/man7/ip.7.zst
16	/gnu/store/nqp5mmi1kb4xp7nkqsybrp5i18lygsl2-man-pages-6.02/share/man/man7/cpuset.7.zst
16	/gnu/store/nqp5mmi1kb4xp7nkqsybrp5i18lygsl2-man-pages-6.02/share/man/man7/cgroups.7.zst
16	/gnu/store/nqp5mmi1kb4xp7nkqsybrp5i18lygsl2-man-pages-6.02/share/man/man7/capabilities.7.zst
16	/gnu/store/nqp5mmi1kb4xp7nkqsybrp5i18lygsl2-man-pages-6.02/share/man/man5/elf.5.zst
16	/gnu/store/nqp5mmi1kb4xp7nkqsybrp5i18lygsl2-man-pages-6.02/share/man/man2/seccomp.2.zst
16	/gnu/store/nqp5mmi1kb4xp7nkqsybrp5i18lygsl2-man-pages-6.02/share/man/man2/open.2.zst
16	/gnu/store/nqp5mmi1kb4xp7nkqsybrp5i18lygsl2-man-pages-6.02/share/man/man2/keyctl.2.zst
16	/gnu/store/nqp5mmi1kb4xp7nkqsybrp5i18lygsl2-man-pages-6.02/share/man/man2/clone.2.zst
--8<---------------cut here---------------end--------------->8---

So for larger man pages, it seems we're talking about roughly a 10%
improvement. That's not much, but decompression is more efficient.

Compare decompression of the gzipped man-pages:

--8<---------------cut here---------------start------------->8---
$ find /gnu/store/93fjc9hv5canvs2lpya0qsbcm44hq7hh-man-pages-6.02 -name '*.gz' | sh -c 'time xargs gunzip -ck > /dev/null'

real	0m0.137s
user	0m0.106s
sys	0m0.032s

$ find /gnu/store/93fjc9hv5canvs2lpya0qsbcm44hq7hh-man-pages-6.02 -name '*.gz' | sh -c 'time xargs gunzip -ck > /dev/null'

real	0m0.137s
user	0m0.104s
sys	0m0.035s

$ find /gnu/store/93fjc9hv5canvs2lpya0qsbcm44hq7hh-man-pages-6.02 -name '*.gz' | sh -c 'time xargs gunzip -ck > /dev/null'

real	0m0.138s
user	0m0.103s
sys	0m0.036s
--8<---------------cut here---------------end--------------->8---

With decompression of the zstd-compressed man-pages:

--8<---------------cut here---------------start------------->8---
$ find /gnu/store/nqp5mmi1kb4xp7nkqsybrp5i18lygsl2-man-pages-6.02 -name '*.zst' | sh -c 'time xargs zstd -dkc > /dev/null'

real	0m0.091s
user	0m0.033s
sys	0m0.059s

$ find /gnu/store/nqp5mmi1kb4xp7nkqsybrp5i18lygsl2-man-pages-6.02 -name '*.zst' | sh -c 'time xargs zstd -dkc > /dev/null'

real	0m0.091s
user	0m0.035s
sys	0m0.058s

$ find /gnu/store/nqp5mmi1kb4xp7nkqsybrp5i18lygsl2-man-pages-6.02 -name '*.zst' | sh -c 'time xargs zstd -dkc > /dev/null'

real	0m0.090s
user	0m0.027s
sys	0m0.063s
--8<---------------cut here---------------end--------------->8---

Assuming guile-zstd fares as well as zstd itself, we're looking at
about 1.5x faster decompression.

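The size and speed figures above can be turned into ratios with a little
arithmetic. The sketch below assumes a POSIX shell with awk; the helper
names ratio and percent_saved are made up for illustration. Note that du
reports allocated disk space (in 1 KiB units by default with GNU du), so
the per-file savings are coarse estimates rather than exact compression
ratios.

```shell
# ratio A B: print A divided by B, rounded to two decimals.
# (Hypothetical helper, for illustration only.)
ratio() {
    LC_ALL=C awk -v a="$1" -v b="$2" 'BEGIN { printf "%.2f\n", a / b }'
}

# percent_saved OLD NEW: print how much smaller NEW is than OLD,
# as a percentage of OLD, rounded to one decimal.
# (Hypothetical helper, for illustration only.)
percent_saved() {
    LC_ALL=C awk -v o="$1" -v n="$2" \
        'BEGIN { printf "%.1f\n", (o - n) * 100 / o }'
}

# Wall-clock ("real") time of the gunzip run versus the zstd run.
ratio 0.137 0.091        # prints 1.51

# Disk usage of proc.5 with gzip (64) versus zstd (56), as shown by du.
percent_saved 64 56      # prints 12.5
```

Applying percent_saved to the other large pages in the listings gives
figures in the same general range, while the many pages that occupy the
same number of blocks in both formats show no saving at this granularity.
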
Past measurements, though, had suggested that decompression was not the
limiting factor in making the man pages database faster to compute;
rather, the cost had to do with building the database with Guile (sorry,
I can't find a reference to it anymore).

-- 
Thanks,
Maxim