GNU bug report logs - #21270
gzip huge filesize problem

Package: gzip

Reported by: Alexander Kleinsorge <aleks <at> physik.tu-berlin.de>

Date: Sat, 15 Aug 2015 21:57:01 UTC

Severity: normal

Done: Jim Meyering <jim <at> meyering.net>

Bug is archived. No further changes may be made.

From: help-debbugs <at> gnu.org (GNU bug Tracking System)
To: Jim Meyering <jim <at> meyering.net>
Cc: tracker <at> debbugs.gnu.org
Subject: bug#21270: closed (gzip huge filesize problem)
Date: Mon, 17 Aug 2015 03:46:02 +0000
[Message part 1 (text/plain, inline)]
Your message dated Sun, 16 Aug 2015 21:44:54 -0600
with message-id <CA+8g5KEDYxgsuJpb-iaiRADNb7JsgieaXRvF3qVQwipLf55Xmg <at> mail.gmail.com>
and subject line Re: bug#21270: gzip huge filesize problem
has caused the debbugs.gnu.org bug report #21270,
regarding gzip huge filesize problem
to be marked as done.

(If you believe you have received this mail in error, please contact
help-debbugs <at> gnu.org.)


-- 
21270: http://debbugs.gnu.org/cgi/bugreport.cgi?bug=21270
GNU Bug Tracking System
Contact help-debbugs <at> gnu.org with problems
[Message part 2 (message/rfc822, inline)]
From: Alexander Kleinsorge <aleks <at> physik.tu-berlin.de>
To: bug-gzip <at> gnu.org
Subject: gzip huge filesize problem
Date: Sat, 15 Aug 2015 23:42:20 +0200
Hi Gzip team,

I compressed a 500 GB file (raw HDD image) using gzip 1.6 under Ubuntu
14.10 (64-bit). Uncompressing the file gives a file of 500 GB
(checked).
But "gzip -l" shows a wrong (too small) uncompressed size and a wrong
ratio (-5167%).

Below you can see some details, but I think it is a general bug.
Thanks for help, Alexander


gzip -l asus.gz
 compressed  uncompressed    ratio uncompressed_name
99630975185    1891655680 -5166.9% asus

gzip --version
gzip 1.6

Linux myname 3.16.0-43-generic #58-Ubuntu SMP Fri Jun 19 11:04:02 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux

The two files (compressed: 93 GB, uncompressed: 500 GB):

-rwxrwx--- 1 root plugdev 99630975185 Aug 15 21:39 asus.gz
-rwxrwx--- 1 root plugdev 500107862016 Aug 14 09:00 sdc.raw
-rwxrwx--- 1 root plugdev 93G Aug 15 21:39 asus.gz
-rwxrwx--- 1 root plugdev 466G Aug 14 09:00 sdc.raw



[Message part 3 (message/rfc822, inline)]
From: Jim Meyering <jim <at> meyering.net>
To: Mark Adler <madler <at> alumni.caltech.edu>
Cc: Alexander Kleinsorge <aleks <at> physik.tu-berlin.de>,
 21270-done <at> debbugs.gnu.org
Subject: Re: bug#21270: gzip huge filesize problem
Date: Sun, 16 Aug 2015 21:44:54 -0600
tags 21270 notabug
thanks

On Sun, Aug 16, 2015 at 1:58 AM, Mark Adler <madler <at> alumni.caltech.edu> wrote:
> Alexander,
>
> Thank you for your report.  This is a well-known limitation of the gzip format.  The -l function makes use of the uncompressed length stored in the last four bytes of a gzip stream.  Therein lies the rub, since four bytes can represent no more than 4 GB - 1.
>
> There is another problem with that approach, in that a valid gzip file may consist of a series of concatenated gzip streams, in which case -l will report only on the last one.  In that case, even if the entire stream decompresses to less than 4 GB, the result will still be incorrect.
>
> The only reliable way to determine the uncompressed size of a gzip file is to decompress the entire file (which can be done without storing the result).  This in fact is what "pigz -lt file.gz" does.  It will correctly report the uncompressed length, but takes much longer than "gzip -l".
>
> -l remains useful however in most cases, so it remains a gzip and pigz option.
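
The figures in the original report are consistent with the 32-bit wraparound described in the quoted reply: the real size of sdc.raw, reduced modulo 2^32, gives exactly the value that "gzip -l" printed. The sketch below is illustrative Python, not gzip's own code. The constant names are made up for this example, and the ratio formula (1 - compressed/uncompressed, ignoring header bytes) is an assumption that reproduces the reported figure at this precision.

```python
# Sizes taken from the "ls -l" listing in the original report.
UNCOMPRESSED_SIZE = 500107862016  # bytes, sdc.raw
COMPRESSED_SIZE = 99630975185     # bytes, asus.gz

# The gzip trailer has only four bytes for the uncompressed length,
# so it can hold the length modulo 2**32 and nothing more.
isize = UNCOMPRESSED_SIZE % 2**32

# Assumed ratio formula: 1 - compressed/uncompressed, as a percentage.
# Header bytes are ignored; they do not matter at this precision.
ratio = 100.0 * (1 - COMPRESSED_SIZE / isize)

print(isize)            # 1891655680
print(f"{ratio:.1f}%")  # -5166.9%
```

Both values match the "gzip -l asus.gz" output in the report, which supports reading the small size and negative ratio as a format limitation rather than data corruption.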

Thank you for replying, Mark.
I've marked this as "notabug" with the in-line comment above, and am
closing the auto-created issue with the "-done" part of the debbugs
email recipient address.
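
For readers who want the true size without pigz, the approach described in the quoted reply (decompress everything, keep nothing) can be sketched in Python. This is a minimal sketch, not what gzip or pigz actually runs: the function names trailer_isize and true_uncompressed_size are hypothetical, and it relies on Python's standard gzip module reading concatenated members as one stream.

```python
import gzip
import os
import struct
import tempfile


def trailer_isize(path):
    """Return the 4-byte little-endian length stored at the end of the file.

    This is the value a trailer-based listing relies on: the length of the
    last member only, modulo 2**32.
    """
    with open(path, "rb") as f:
        f.seek(-4, os.SEEK_END)
        return struct.unpack("<I", f.read(4))[0]


def true_uncompressed_size(path, chunk_size=1 << 20):
    """Count uncompressed bytes by decompressing and discarding the data."""
    total = 0
    with gzip.open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)  # at most chunk_size bytes in memory
            if not chunk:
                break
            total += len(chunk)
    return total


if __name__ == "__main__":
    # Build a small file made of two concatenated gzip members.
    data = gzip.compress(b"a" * 1000) + gzip.compress(b"b" * 10)
    with tempfile.NamedTemporaryFile(suffix=".gz", delete=False) as tmp:
        tmp.write(data)
    try:
        print(trailer_isize(tmp.name))           # 10 (last member only)
        print(true_uncompressed_size(tmp.name))  # 1010 (all members)
    finally:
        os.remove(tmp.name)
```

The demo uses a two-member file rather than a file over 4 GB because it shows the same mismatch cheaply. The trade-off is the one named in the reply: the full pass is correct but takes time proportional to the size of the data.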


This bug report was last modified 9 years and 340 days ago.

GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.