GNU bug report logs - #67022
Gzip decompression can be 60% faster using zlib's CRC32
Reported by: Young Mo Kang <kym327 <at> gmail.com>
Date: Thu, 9 Nov 2023 17:42:01 UTC
Severity: normal
Done: Paul Eggert <eggert <at> cs.ucla.edu>
Bug is archived. No further changes may be made.
[Message part 1 (text/plain, inline)]
Your message dated Mon, 10 Feb 2025 23:46:40 -0800
with message-id <04c457fc-c804-4da9-b09f-1a45e10e8808 <at> cs.ucla.edu>
and subject line Re: bug#67022: Gzip decompression can be 60% faster using zlib's CRC32
has caused the debbugs.gnu.org bug report #67022,
regarding Gzip decompression can be 60% faster using zlib's CRC32
to be marked as done.
(If you believe you have received this mail in error, please contact
help-debbugs <at> gnu.org.)
--
67022: https://debbugs.gnu.org/cgi/bugreport.cgi?bug=67022
GNU Bug Tracking System
Contact help-debbugs <at> gnu.org with problems
[Message part 2 (message/rfc822, inline)]
Hello,
I have noticed that GNU Gzip's CRC32 calculation is the main bottleneck
in decompression, and that decompression runs more than 60% faster if we
replace it with the crc32 function from zlib.
I tested the decompression speed of a Linux source code tar.gz file
before and after replacing the CRC32 computation. On an AMD 7735HS
system, I get:
GNU Gzip unmodified
Elapsed (wall clock) time (h:mm:ss or m:ss): 0:05.11
GNU Gzip with CRC32 from zlib
Elapsed (wall clock) time (h:mm:ss or m:ss): 0:03.16
I saw an even larger improvement when testing on an Apple Silicon M1
system:
GNU Gzip unmodified
Elapsed (wall clock) time (h:mm:ss or m:ss): 0:06.83
GNU Gzip with CRC32 from zlib
Elapsed (wall clock) time (h:mm:ss or m:ss): 0:03.72
Since both GNU Gzip and zlib are written by the same authors, I was
wondering if GNU Gzip can share zlib's CRC32 calculation and obtain this
performance gain--I am not sure if there would be a license issue though.
The following bash script should reproduce the result:
```
# download GNU Gzip and zlib
wget -O- https://ftp.gnu.org/gnu/gzip/gzip-1.13.tar.gz | tar xzf -
wget -O- https://zlib.net/zlib-1.3.tar.gz | tar xzf -
# download linux source code as a test file for decompression speed
wget -O- https://cdn.kernel.org/pub/linux/kernel/v6.x/linux-6.6.1.tar.xz \
  | xz -d | gzip > linux.tar.gz
# compile zlib
cd zlib-1.3
CFLAGS="-O2 -g" ./configure --static && make -j
cd ..
# compile GNU Gzip
cd gzip-1.13
CFLAGS="-O2 -g" ./configure && make -j
# measure decompression speed
/usr/bin/time -v ./gzip -d < ../linux.tar.gz > linux.tar 2> ../gzip1.time
# use crc32 from zlib
cat > util.diff << EOF
@@ -27,6 +27,7 @@
#include <stdlib.h>
#include <errno.h>
+#include "crc32.h"
#include "tailor.h"
#include "gzip.h"
#include <dirname.h>
@@ -136,25 +137,14 @@ copy (int in, int out)
ulg
updcrc (uch const *s, unsigned n)
{
- register ulg c; /* temporary variable */
-
- if (s == NULL) {
- c = 0xffffffffL;
- } else {
- c = crc;
- if (n) do {
- c = crc_32_tab[((int)c ^ (*s++)) & 0xff] ^ (c >> 8);
- } while (--n);
- }
- crc = c;
- return c ^ 0xffffffffL; /* (instead of ~c for 64-bit machines) */
+ crc = crc32(crc, s, n);
}
/* Return a current CRC value. */
ulg
getcrc ()
{
- return crc ^ 0xffffffffL;
+ return crc;
}
#ifdef IBM_Z_DFLTCC
EOF
patch < util.diff util.c
# create header file
cat > crc32.h << EOF
#pragma once
unsigned long crc32(unsigned long crc, const unsigned char *buf,
unsigned int len);
EOF
# copy crc32 object file from zlib
cp ../zlib-1.3/crc32.o .
# re-compile GNU Gzip
gcc -O2 -g -c util.c -Ilib
gcc -O2 -g *.o lib/libgzip.a -o gzip
# measure decompression speed
/usr/bin/time -v ./gzip -d < ../linux.tar.gz > linux.tar 2> ../gzip2.time
# print out time difference
cd ..
echo
echo "GNU Gzip unmodified"
grep Elapsed gzip1.time
echo "GNU Gzip with CRC32 from zlib"
grep Elapsed gzip2.time
```
[Message part 3 (message/rfc822, inline)]
On 2023-11-09 10:32, Paul Eggert wrote:
> On 2023-11-09 09:40, Young Mo Kang wrote:
>> Since both GNU Gzip and zlib are written by the same authors, I was
>> wondering if GNU Gzip can share zlib's CRC32 calculation and obtain
>> this performance gain--I am not sure if there would be a license issue
>> though.
>
> Shouldn't be a license issue. It's just a lack of time issue.
Due to work by Sam Russell and others it looks like gzip now has faster
CRC32 code on Savannah master, so closing this old bug report. See:
https://bugs.gnu.org/74927
https://bugs.gnu.org/74192
This bug report was last modified 100 days ago.