GNU bug report logs -
#21270
gzip huge filesize problem
Message #13 received at 21270-done <at> debbugs.gnu.org:
tags 21270 notabug
thanks
On Sun, Aug 16, 2015 at 1:58 AM, Mark Adler <madler <at> alumni.caltech.edu> wrote:
> Alexander,
>
> Thank you for your report. This is a well-known limitation of the gzip format. The -l function makes use of the uncompressed length stored in the last four bytes of a gzip stream. Therein lies the rub, since four bytes can represent no more than 4 GB - 1.
>
> There is another problem with that approach, in that a valid gzip file may consist of a series of concatenated gzip streams, in which case -l will report only on the last one. In that case, even if the entire stream decompresses to less than 4 GB, the result will still be incorrect.
>
> The only reliable way to determine the uncompressed size of a gzip file is to decompress the entire file (which can be done without storing the result). This in fact is what "pigz -lt file.gz" does. It will correctly report the uncompressed length, but takes much longer than "gzip -l".
>
> -l remains useful however in most cases, so it remains a gzip and pigz option.
Thank you for replying, Mark.
I've marked this as "notabug" with the in-line comment above, and am
closing the auto-created issue with the "-done" part of the debbugs
email recipient address.
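
To make the difference described above concrete, here is a minimal Python sketch. It is not taken from the gzip or pigz sources. The function names isize_field and true_uncompressed_size are hypothetical, chosen only for this illustration. The first function reads the four-byte trailer field that "gzip -l" relies on, which holds the uncompressed length of the last stream modulo 2^32. The second decompresses the whole file, including any concatenated streams, and counts the bytes without storing them, which is the approach Mark describes for "pigz -lt".

```python
import gzip
import struct


def isize_field(path):
    """Return the trailer length field of the last gzip stream in the file.

    The field is a 32-bit little-endian integer, so it holds the
    uncompressed length modulo 2**32 and covers only the final stream.
    """
    with open(path, "rb") as f:
        f.seek(-4, 2)  # position at the last four bytes of the file
        (isize,) = struct.unpack("<I", f.read(4))
    return isize


def true_uncompressed_size(path, chunk_size=1 << 20):
    """Decompress the whole file, discard the output, and count the bytes.

    Python's gzip module reads concatenated streams as one continuous
    stream, so the total covers every stream in the file.
    """
    total = 0
    with gzip.open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            total += len(chunk)
    return total
```

For a file made of two concatenated streams, the first function should report only the length of the second stream, while the second function should report the sum of both. For a single stream of 4 GB or more, the first function should return the true length reduced modulo 2^32.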
This bug report was last modified 9 years and 340 days ago.