From debbugs-submit-bounces@debbugs.gnu.org Tue Oct 31 14:04:23 2017
To: bug-gzip@gnu.org
From: Alex Peshkoff
Subject: Truncated size of big file
Date: Tue, 31 Oct 2017 20:59:33 +0300

Before decompressing a copy of a database I decided to take a look at its size:

localhost stg # gunzip -l SWHTOROLT_20171019.GBK.gz
         compressed        uncompressed  ratio uncompressed_name
         3645968323          1782666240 -104.5% SWHTOROLT_20171019.GBK

The uncompressed size is reported as 1.7 GB, which is definitely unreal, as is the -104.5% compression ratio.

Actual size after unzip is:

localhost stg # gunzip SWHTOROLT_20171019.GBK.gz
localhost stg # ls -l SWHTOROLT_20171019.GBK
-rw-r--r-- 1 root root 18962535424 Oct 19 15:59 SWHTOROLT_20171019.GBK

Luckily I had enough disk space, but let me not attach the problematic archive to this email; I suppose it's easier to reproduce this locally ;)

Alex.

From debbugs-submit-bounces@debbugs.gnu.org Tue Oct 31 14:20:39 2017
Subject: Re: bug#29089: Truncated size of big file
From: Mark Adler
Date: Tue, 31 Oct 2017 11:20:29 -0700
Message-Id: <16601711-B48B-4FBC-A7E7-14EC227388CE@alumni.caltech.edu>
To: Alex Peshkoff
Cc: 29089@debbugs.gnu.org

Alex,

This is inherent in the gzip format, and is not really a bug in gzip. (Though gzip could notice the problem and not display a large negative compression ratio.)

The gzip format stores the uncompressed length at the end using four bytes, which can only represent up to 2^32-1. So what you are seeing is the low 32 bits of 18962535424, which is in fact 1782666240. When gzip uses that truncated value to compute a compression ratio, it gets a nonsensical result.

Unfortunately the only way to get the real uncompressed length and compute a real ratio is to decompress the entire file. (In fact, pigz will do this with "pigz -lt", which tests the entire file without storing the result, and reports the correct uncompressed size and compression ratio. "pigz -l" will do the same bad thing that "gzip -l" does on > 4 GB uncompressed sizes, though it will report “unk” for questionable ratios, i.e. expansions of the data beyond what would be expected for incompressible data.)

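The truncation described above can be checked with a minimal Python sketch. This is an illustration only, not the code of gzip or pigz. The helper names (isize_field, naive_ratio, read_isize, true_uncompressed_size) are hypothetical, and the ratio formula is an assumption that ignores the header bytes "gzip -l" accounts for, so it only approximates the printed value.

```python
import gzip
import io
import struct


def isize_field(uncompressed_size: int) -> int:
    """Value that fits in the 4-byte length trailer: the size modulo 2**32."""
    return uncompressed_size % 2**32


def naive_ratio(compressed: int, uncompressed: int) -> float:
    """Approximate compression ratio (assumption: header bytes are ignored)."""
    return (uncompressed - compressed) / uncompressed


def read_isize(fileobj) -> int:
    """Read the last four bytes of a gzip stream as a little-endian uint32.

    For a multi-member file this covers only the final member.
    """
    fileobj.seek(-4, io.SEEK_END)
    return struct.unpack("<I", fileobj.read(4))[0]


def true_uncompressed_size(fileobj, chunk_size: int = 1 << 20) -> int:
    """Decompress the whole stream, discarding the data, and count the bytes."""
    total = 0
    with gzip.GzipFile(fileobj=fileobj) as gz:
        while True:
            block = gz.read(chunk_size)
            if not block:
                break
            total += len(block)
    return total


if __name__ == "__main__":
    real_size = 18962535424   # from "ls -l" in the report
    compressed = 3645968323   # from "gunzip -l" in the report
    stored = isize_field(real_size)
    print(stored)                                   # low 32 bits of the real size
    print(f"{naive_ratio(compressed, stored):.1%}")     # ratio from the truncated size
    print(f"{naive_ratio(compressed, real_size):.1%}")  # ratio from the real size
```

Counting bytes while decompressing is the same idea as "pigz -lt": the real size only becomes known after reading the entire file, but nothing has to be written to disk.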
Mark

> On Oct 31, 2017, at 10:59 AM, Alex Peshkoff wrote:
>
> Before decompressing a copy of a database I decided to take a look at its size:
>
> localhost stg # gunzip -l SWHTOROLT_20171019.GBK.gz
>          compressed        uncompressed  ratio uncompressed_name
>          3645968323          1782666240 -104.5% SWHTOROLT_20171019.GBK
>
> The uncompressed size is reported as 1.7 GB, which is definitely unreal, as is the -104.5% compression ratio.
>
> Actual size after unzip is:
>
> localhost stg # gunzip SWHTOROLT_20171019.GBK.gz
> localhost stg # ls -l SWHTOROLT_20171019.GBK
> -rw-r--r-- 1 root root 18962535424 Oct 19 15:59 SWHTOROLT_20171019.GBK
>
> Luckily I had enough disk space, but let me not attach the problematic archive to this email; I suppose it's easier to reproduce this locally ;)
>
> Alex.

From debbugs-submit-bounces@debbugs.gnu.org Wed Dec 01 18:33:43 2021
Date: Wed, 1 Dec 2021 15:33:33 -0800
To: GNU bug control
From: Paul Eggert
Subject: merge gzip -l bugs
Organization: UCLA Computer Science Department

severity 17804 normal
merge 17804 29089 30935 30936 38766 42965 48424 52227

From unknown Tue Jun 17 01:49:44 2025
To: internal_control@debbugs.gnu.org
From: Debbugs Internal Request
Subject: Internal Control
Message-Id: bug archived.
Date: Thu, 13 Jan 2022 12:24:04 +0000
User-Agent: Fakemail v42.6.9

# This is a fake control message.
#
# The action:
# bug archived.
thanks
# This fakemail brought to you by your local debbugs
# administrator