GNU bug report logs - #67022
Gzip decompression can be 60% faster using zlib's CRC32


Package: gzip

Reported by: Young Mo Kang <kym327 <at> gmail.com>

Date: Thu, 9 Nov 2023 17:42:01 UTC

Severity: normal

Done: Paul Eggert <eggert <at> cs.ucla.edu>

Bug is archived. No further changes may be made.


From: help-debbugs <at> gnu.org (GNU bug Tracking System)
To: Young Mo Kang <kym327 <at> gmail.com>
Subject: bug#67022: closed (Re: bug#67022: Gzip decompression can be 60% faster using zlib's CRC32)
Date: Tue, 11 Feb 2025 07:47:02 +0000
[Message part 1 (text/plain, inline)]
Your bug report

#67022: Gzip decompression can be 60% faster using zlib's CRC32

which was filed against the gzip package, has been closed.

The explanation is attached below, along with your original report.
If you require more details, please reply to 67022 <at> debbugs.gnu.org.

-- 
67022: https://debbugs.gnu.org/cgi/bugreport.cgi?bug=67022
GNU Bug Tracking System
Contact help-debbugs <at> gnu.org with problems
[Message part 2 (message/rfc822, inline)]
From: Paul Eggert <eggert <at> cs.ucla.edu>
To: Young Mo Kang <kym327 <at> gmail.com>
Cc: 67022-done <at> debbugs.gnu.org
Subject: Re: bug#67022: Gzip decompression can be 60% faster using zlib's CRC32
Date: Mon, 10 Feb 2025 23:46:40 -0800
On 2023-11-09 10:32, Paul Eggert wrote:
> On 2023-11-09 09:40, Young Mo Kang wrote:
>> Since both GNU Gzip and zlib are written by the same authors, I was 
>> wondering if GNU Gzip can share zlib's CRC32 calculation and obtain 
>> this performance gain--I am not sure if there would be a license issue 
>> though.
> 
> Shouldn't be a license issue. It's just a lack of time issue.

Thanks to work by Sam Russell and others, gzip now appears to have faster
CRC32 code on Savannah master, so I am closing this old bug report. See:

https://bugs.gnu.org/74927
https://bugs.gnu.org/74192

[Message part 3 (message/rfc822, inline)]
From: Young Mo Kang <kym327 <at> gmail.com>
To: bug-gzip <at> gnu.org
Subject: Gzip decompression can be 60% faster using zlib's CRC32
Date: Thu, 9 Nov 2023 12:40:24 -0500
Hello,


I have noticed that GNU Gzip's CRC32 calculation is the main bottleneck
in decompression, and that decompression runs more than 60% faster if we
replace it with the crc32 function from zlib.
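
For context, the CRC32 at issue is the checksum that RFC 1952 places in the last 8 bytes of every gzip member (CRC32, then the uncompressed size, both little-endian). The decompressor has to recompute it over every output byte, which is why its speed matters. The sketch below illustrates that check under stated assumptions: a POSIX shell with 64-bit arithmetic, and the hypothetical helper names `crc32_hex` and `gzip_trailer_crc32`. The bit-at-a-time loop is for illustration only; it is neither gzip's table-driven code nor zlib's implementation.

```shell
# Hypothetical helper: bit-at-a-time CRC32 (reflected polynomial 0xEDB88320)
# of a string, printed as 8 hex digits. Illustration only, very slow.
crc32_hex() {
  crc=$(( 0xFFFFFFFF ))                      # pre-conditioning
  for byte in $(printf '%s' "$1" | od -An -v -tu1); do
    crc=$(( crc ^ byte ))
    i=0
    while [ "$i" -lt 8 ]; do
      if [ $(( crc & 1 )) -eq 1 ]; then
        crc=$(( (crc >> 1) ^ 0xEDB88320 ))
      else
        crc=$(( crc >> 1 ))
      fi
      i=$(( i + 1 ))
    done
  done
  printf '%08x\n' $(( crc ^ 0xFFFFFFFF ))    # post-conditioning
}

# Hypothetical helper: print the CRC32 stored in a gzip file's trailer.
# The trailer is the last 8 bytes: CRC32 then ISIZE, both little-endian.
gzip_trailer_crc32() {
  hex=$(tail -c 8 "$1" | od -An -v -tx1 | tr -d ' \n')
  # Reverse the first four bytes to get the usual big-endian hex spelling.
  printf '%s%s%s%s\n' \
    "$(printf '%s' "$hex" | cut -c7-8)" \
    "$(printf '%s' "$hex" | cut -c5-6)" \
    "$(printf '%s' "$hex" | cut -c3-4)" \
    "$(printf '%s' "$hex" | cut -c1-2)"
}

# Demo: both values should agree for the same input.
tmp=$(mktemp)
printf '123456789' | gzip -c > "$tmp"
crc32_hex '123456789'        # prints cbf43926
gzip_trailer_crc32 "$tmp"    # prints cbf43926
rm -f "$tmp"
```

The pre- and post-conditioning XORs in `crc32_hex` correspond to the `0xffffffffL` constants in gzip's `updcrc` and `getcrc`; zlib's `crc32` applies them internally, which is why the patch further down can drop them.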


I tested the decompression speed on a Linux source code tar.gz file before
and after replacing the CRC32 computation. On an AMD 7735HS system, I get:

GNU Gzip unmodified
    Elapsed (wall clock) time (h:mm:ss or m:ss): 0:05.11
GNU Gzip with CRC32 from zlib
    Elapsed (wall clock) time (h:mm:ss or m:ss): 0:03.16


I saw an even larger improvement when testing on an Apple Silicon M1
system:

GNU Gzip unmodified
    Elapsed (wall clock) time (h:mm:ss or m:ss): 0:06.83
GNU Gzip with CRC32 from zlib
    Elapsed (wall clock) time (h:mm:ss or m:ss): 0:03.72
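
The speedup implied by these timings can be worked out from the elapsed fields. A small sketch, assuming `awk` is available; `elapsed_to_seconds` and `percent_faster` are hypothetical helper names introduced here for illustration:

```shell
# Convert an "h:mm:ss" or "m:ss.cc" elapsed field (as printed by
# /usr/bin/time -v) to seconds with two decimals.
elapsed_to_seconds() {
  printf '%s\n' "$1" | awk -F: '{
    s = 0
    for (i = 1; i <= NF; i++) s = s * 60 + $i
    printf "%.2f\n", s
  }'
}

# Percent by which throughput improves when the run time drops
# from $1 seconds to $2 seconds: (before / after - 1) * 100.
percent_faster() {
  awk -v before="$1" -v after="$2" \
    'BEGIN { printf "%.0f\n", (before / after - 1) * 100 }'
}

# AMD 7735HS: 5.11 s -> 3.16 s
percent_faster "$(elapsed_to_seconds 0:05.11)" "$(elapsed_to_seconds 0:03.16)"  # prints 62
# Apple Silicon M1: 6.83 s -> 3.72 s
percent_faster "$(elapsed_to_seconds 0:06.83)" "$(elapsed_to_seconds 0:03.72)"  # prints 84
```

Measured this way, throughput is about 62% higher on the AMD system and about 84% higher on the M1, which matches the "60% faster" in the subject line. Expressed as time saved instead, the same runs take roughly 38% and 46% less wall-clock time.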


Since both GNU Gzip and zlib are written by the same authors, I was 
wondering if GNU Gzip can share zlib's CRC32 calculation and obtain this 
performance gain--I am not sure if there would be a license issue though.


The following bash script should reproduce the result:

```

# download GNU Gzip and zlib
wget -O- https://ftp.gnu.org/gnu/gzip/gzip-1.13.tar.gz | tar xzf -
wget -O- https://zlib.net/zlib-1.3.tar.gz | tar xzf -

# download linux source code as a test file for decompression speed
wget -O- https://cdn.kernel.org/pub/linux/kernel/v6.x/linux-6.6.1.tar.xz |
  xz -d | gzip > linux.tar.gz

# compile zlib
cd zlib-1.3
CFLAGS="-O2 -g" ./configure --static && make -j
cd ..

# compile GNU Gzip
cd gzip-1.13
CFLAGS="-O2 -g" ./configure && make -j

# measure decompression speed
/usr/bin/time -v ./gzip -d < ../linux.tar.gz > linux.tar 2> ../gzip1.time

# use crc32 from zlib
cat > util.diff << EOF
@@ -27,6 +27,7 @@
 #include <stdlib.h>
 #include <errno.h>

+#include "crc32.h"
 #include "tailor.h"
 #include "gzip.h"
 #include <dirname.h>
@@ -136,25 +137,15 @@ copy (int in, int out)
 ulg
 updcrc (uch const *s, unsigned n)
 {
-    register ulg c;         /* temporary variable */
-
-    if (s == NULL) {
-        c = 0xffffffffL;
-    } else {
-        c = crc;
-        if (n) do {
-            c = crc_32_tab[((int)c ^ (*s++)) & 0xff] ^ (c >> 8);
-        } while (--n);
-    }
-    crc = c;
-    return c ^ 0xffffffffL;       /* (instead of ~c for 64-bit machines) */
+    crc = crc32(crc, s, n);
+    return crc;
 }

 /* Return a current CRC value.  */
 ulg
 getcrc ()
 {
-  return crc ^ 0xffffffffL;
+  return crc;
 }

 #ifdef IBM_Z_DFLTCC
EOF
patch < util.diff util.c

# create header file
cat > crc32.h << EOF
#pragma once

unsigned long  crc32(unsigned long crc, const unsigned char  *buf,
                            unsigned int len);
EOF

# copy crc32 object file from zlib
cp ../zlib-1.3/crc32.o .

# re-compile GNU Gzip
gcc -O2 -g -c util.c -Ilib
gcc -O2 -g *.o lib/libgzip.a -o gzip

# measure decompression speed
/usr/bin/time -v ./gzip -d < ../linux.tar.gz > linux.tar 2> ../gzip2.time

# print out time difference
cd ..
echo
echo "GNU Gzip unmodified"
grep Elapsed gzip1.time
echo "GNU Gzip with CRC32 from zlib"
grep Elapsed gzip2.time
```
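
The script above measures speed only. A swapped-in CRC32 routine should also still produce correct output and still reject corrupted input. A minimal sketch of such a check, assuming the system `gzip`, `cmp`, `dd`, and `mktemp` are available; `check_gzip_crc` is a hypothetical helper, not part of gzip:

```shell
# Hypothetical helper: return 0 if the gzip binary given as $1
# (a) decompresses a known file back to the original bytes, and
# (b) fails on a copy whose stored CRC32 has been corrupted.
check_gzip_crc() {
  gzip_bin=$1
  tmp=$(mktemp -d) || return 2
  printf '123456789' > "$tmp/plain"
  gzip -c < "$tmp/plain" > "$tmp/good.gz"     # reference file from system gzip

  # (a) Round trip: decompressed bytes must equal the original.
  if ! "$gzip_bin" -dc < "$tmp/good.gz" | cmp -s - "$tmp/plain"; then
    rm -rf "$tmp"; return 1
  fi

  # (b) Overwrite the first CRC32 byte (8 bytes before the end of the file).
  # For this input that byte is 0x26, so writing 0xff always changes it.
  cp "$tmp/good.gz" "$tmp/bad.gz"
  size=$(wc -c < "$tmp/bad.gz")
  printf '\377' | dd of="$tmp/bad.gz" bs=1 seek=$(( size - 8 )) conv=notrunc 2>/dev/null
  if "$gzip_bin" -dc < "$tmp/bad.gz" > /dev/null 2>&1; then
    rm -rf "$tmp"; return 1                   # corruption went unnoticed
  fi

  rm -rf "$tmp"
  return 0
}
```

For the build in this report it could be called as `check_gzip_crc ./gzip` from inside `gzip-1.13` after the re-compile step, and a nonzero status would indicate that the replacement CRC32 is not wired in correctly.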




This bug report was last modified 100 days ago.
