GNU bug report logs - #16471
New entropy coding: faster than Huffman, compression rate like arithmetic

Previous Next

Package: gzip;

Reported by: Jarosław Duda <dudaj <at> purdue.edu>

Date: Thu, 16 Jan 2014 23:52:01 UTC

Severity: wishlist

Full log


View this message in rfc822 format

From: Jarosław Duda <dudaj <at> purdue.edu>
To: 16471 <at> debbugs.gnu.org
Subject: bug#16471: New entropy coding: faster than Huffman, compression rate like arithmetic
Date: Thu, 16 Jan 2014 18:47:36 -0500
Hallo,

Since I haven't found gzip development mailing list, I am sending it here.
There is a new approach to entropy coding: asymmetric numeral systems. 
It has some connection with arithmetic coding, but is simpler: uses a 
single natural number as the state, instead of two to represent the 
range. Thanks of it, we can put the entire behavior for given large 
alphabet probability distribution into a relatively small table: a few 
kilobytes for 256 size alphabet.
This way it turns out about 50% faster than Huffman decoding, still 
providing accuracy (compression rate) like arithmetic coding.

Here is some implementation: https://github.com/Cyan4973/FiniteStateEntropy
Just replacing Huffman with it, we get improvement of both speed and 
compression rate, like recently in Zhuff of Yann Collet: 
http://fastcompression.blogspot.fr/2013/12/zhuff-v09-or-first-fse-application.html
I thought it could also improve gzip DEFLATE in the same way?

Best,
Jarek

-- dr Jarosław Duda
Center for Science of Information, Purdue University, USA
http://www.soihub.org/people.php?id=484





This bug report was last modified 10 years and 219 days ago.

Previous Next


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.