From debbugs-submit-bounces@debbugs.gnu.org Thu Jan 16 18:51:25 2014 Received: (at submit) by debbugs.gnu.org; 16 Jan 2014 23:51:25 +0000 Received: from localhost ([127.0.0.1]:54196 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1W3wia-0003Qf-Ej for submit@debbugs.gnu.org; Thu, 16 Jan 2014 18:51:24 -0500 Received: from eggs.gnu.org ([208.118.235.92]:48647) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1W3wfB-0003GQ-Iy for submit@debbugs.gnu.org; Thu, 16 Jan 2014 18:47:54 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1W3wf6-0006hF-OK for submit@debbugs.gnu.org; Thu, 16 Jan 2014 18:47:53 -0500 X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on eggs.gnu.org X-Spam-Level: X-Spam-Status: No, score=0.8 required=5.0 tests=BAYES_50 autolearn=disabled version=3.3.2 Received: from lists.gnu.org ([2001:4830:134:3::11]:58383) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1W3wf6-0006hB-Kv for submit@debbugs.gnu.org; Thu, 16 Jan 2014 18:47:48 -0500 Received: from eggs.gnu.org ([2001:4830:134:3::10]:50339) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1W3wf2-0004dc-6w for bug-gzip@gnu.org; Thu, 16 Jan 2014 18:47:48 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1W3wex-0006gJ-UF for bug-gzip@gnu.org; Thu, 16 Jan 2014 18:47:44 -0500 Received: from mailhub129.itcs.purdue.edu ([128.210.5.129]:53790) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1W3wex-0006g5-OS for bug-gzip@gnu.org; Thu, 16 Jan 2014 18:47:39 -0500 Received: from [10.184.172.48] (pal-nat184-172-048.itap.purdue.edu [10.184.172.48]) (authenticated bits=0) by mailhub129.itcs.purdue.edu (8.14.4/8.14.4/mta-auth.smtp.purdue.edu) with ESMTP id s0GNlcWW032218 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NOT) for ; Thu, 16 Jan 2014 18:47:38 -0500 Message-ID: <52D86F98.1030301@purdue.edu> Date: Thu, 16 Jan 2014 18:47:36 -0500 From: =?UTF-8?B?SmFyb3PFgmF3IER1ZGE=?= User-Agent: Mozilla/5.0 (Windows NT 6.2; WOW64; rv:17.0) Gecko/20130801 Thunderbird/17.0.8 MIME-Version: 1.0 To: bug-gzip@gnu.org Subject: New entropy coding: faster than Huffman, compression rate like arithmetic Content-Type: text/plain; charset=UTF-8; format=flowed X-PMX-Version: 6.0.2.2308539 X-PerlMx-Virus-Scanned: Yes Content-Transfer-Encoding: quoted-printable X-MIME-Autoconverted: from 8bit to quoted-printable by mailhub129.itcs.purdue.edu id s0GNlcWW032218 X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-detected-operating-system: by eggs.gnu.org: Error: Malformed IPv6 address (bad octet value). X-Received-From: 2001:4830:134:3::11 X-Spam-Score: -5.0 (-----) X-Debbugs-Envelope-To: submit X-Mailman-Approved-At: Thu, 16 Jan 2014 18:51:22 -0500 X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -5.0 (-----) Hallo, Since I haven't found gzip development mailing list, I am sending it here. There is a new approach to entropy coding: asymmetric numeral systems.=20 It has some connection with arithmetic coding, but is simpler: uses a=20 single natural number as the state, instead of two to represent the=20 range. Thanks of it, we can put the entire behavior for given large=20 alphabet probability distribution into a relatively small table: a few=20 kilobytes for 256 size alphabet. This way it turns out about 50% faster than Huffman decoding, still=20 providing accuracy (compression rate) like arithmetic coding. Here is some implementation: https://github.com/Cyan4973/FiniteStateEntro= py Just replacing Huffman with it, we get improvement of both speed and=20 compression rate, like recently in Zhuff of Yann Collet:=20 http://fastcompression.blogspot.fr/2013/12/zhuff-v09-or-first-fse-applica= tion.html I thought it could also improve gzip DEFLATE in the same way? Best, Jarek -- dr Jaros=C5=82aw Duda Center for Science of Information, Purdue University, USA http://www.soihub.org/people.php?id=3D484 From debbugs-submit-bounces@debbugs.gnu.org Mon Nov 10 13:39:41 2014 Received: (at control) by debbugs.gnu.org; 10 Nov 2014 18:39:42 +0000 Received: from localhost ([127.0.0.1]:56265 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1XntsL-0000ZK-Ki for submit@debbugs.gnu.org; Mon, 10 Nov 2014 13:39:41 -0500 Received: from smtp.cs.ucla.edu ([131.179.128.62]:60213) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1XntsJ-0000ZB-VI for control@debbugs.gnu.org; Mon, 10 Nov 2014 13:39:40 -0500 Received: from localhost (localhost.localdomain [127.0.0.1]) by smtp.cs.ucla.edu (Postfix) with ESMTP id 41002A60040 for ; Mon, 10 Nov 2014 10:39:39 -0800 (PST) X-Virus-Scanned: amavisd-new at smtp.cs.ucla.edu Received: from smtp.cs.ucla.edu ([127.0.0.1]) by localhost (smtp.cs.ucla.edu [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id G1K0liKed5Yl for ; Mon, 10 Nov 2014 10:39:30 -0800 (PST) Received: from penguin.cs.ucla.edu (Penguin.CS.UCLA.EDU [131.179.64.200]) by smtp.cs.ucla.edu (Postfix) with ESMTPSA id 70436A60043 for ; Mon, 10 Nov 2014 10:39:30 -0800 (PST) Message-ID: <54610662.6060706@cs.ucla.edu> Date: Mon, 10 Nov 2014 10:39:30 -0800 From: Paul Eggert Organization: UCLA Computer Science Department User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.2.0 MIME-Version: 1.0 To: control@debbugs.gnu.org Subject: gzip bug report maintenance Content-Type: text/plain; charset=utf-8; format=flowed Content-Transfer-Encoding: 7bit X-Spam-Score: -2.9 (--) X-Debbugs-Envelope-To: control X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -2.9 (--) severity 18000 wishlist severity 17804 wishlist severity 16471 wishlist