GNU bug report logs - #33133
26.1.50; zlib-decompress-region too rigid

Previous Next

Package: emacs;

Reported by: Katsumi Yamaoka <yamaoka <at> jpl.org>

Date: Tue, 23 Oct 2018 23:10:02 UTC

Severity: normal

Tags: fixed, patch

Found in version 26.1.50

Fixed in version 27.1

Done: Noam Postavsky <npostavs <at> gmail.com>

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 33133 in the body.
You can then email your comments to 33133 AT debbugs.gnu.org in the normal way.

Toggle the display of automated, internal messages from the tracker.

View this report as an mbox folder, status mbox, maintainer mbox


Report forwarded to bug-gnu-emacs <at> gnu.org:
bug#33133; Package emacs. (Tue, 23 Oct 2018 23:10:02 GMT) Full text and rfc822 format available.

Acknowledgement sent to Katsumi Yamaoka <yamaoka <at> jpl.org>:
New bug report received and forwarded. Copy sent to bug-gnu-emacs <at> gnu.org. (Tue, 23 Oct 2018 23:10:04 GMT) Full text and rfc822 format available.

Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):

From: Katsumi Yamaoka <yamaoka <at> jpl.org>
To: bug-gnu-emacs <at> gnu.org
Cc: Kevin Ryde <user42_kevin <at> yahoo.com.au>
Subject: 26.1.50; zlib-decompress-region too rigid
Date: Wed, 24 Oct 2018 08:07:50 +0900
[Message part 1 (text/plain, inline)]
Hi,

Whereas `gzip -d' does, zlib-decompress-region doesn't decompress
corrupted data of a certain kind.  For instance, visiting

https://www.gutenberg.org/no-such-page-exists

using eww shows raw gzipped data.  The data extracted is attached.
As for `gzip -d', it says "unexpected end of file" in stderr.
Here is a recipe to reproduce zlib-decompress-region not working:

(let ((buffer (get-buffer-create "*testing*"))
      (coding-system-for-read 'binary)
      (cw (selected-window))
      jka-compr-compression-info-list format-alist)
  (switch-to-buffer-other-window buffer)
  (erase-buffer)
  (set-buffer-multibyte nil)
  (insert-file-contents "/TEMP/corrupted-data.gz")
  (sit-for 1)
  (prog1
      (zlib-decompress-region (point-min) (point-max))
    (select-window cw)))

If there is no prospect to improve zlib-decompress-region, how
about this workaround?

--- url-http.el~	2018-09-12 07:48:16.110765500 +0000
+++ url-http.el	2018-10-23 23:04:48.060829900 +0000
@@ -951,7 +951,12 @@
 	(widen)
 	(goto-char (point-min))
 	(when (search-forward "\n\n")
-	  (zlib-decompress-region (point) (point-max)))))))
+	  (or (zlib-decompress-region (point) (point-max))
+	      (let ((coding-system-for-write 'binary)
+		    (coding-system-for-read 'binary)
+		    (default-process-coding-system (cons 'binary 'binary)))
+		(zerop (call-process-region (point) (point-max) "gzip"
+					    t '(t nil) nil "-d")))))))))
 
 ;; Miscellaneous
 (defun url-http-activate-callback ()


In GNU Emacs 26.1.50 (build 1, x86_64-unknown-cygwin, GTK+ Version 3.22.28)
 of 2018-10-22 built on localhost
Windowing system distributor 'The Cygwin/X Project', version 11.0.12001000
[corrupted-data.gz (application/octet-stream, attachment)]

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#33133; Package emacs. (Wed, 24 Oct 2018 00:28:02 GMT) Full text and rfc822 format available.

Message #8 received at 33133 <at> debbugs.gnu.org (full text, mbox):

From: Noam Postavsky <npostavs <at> gmail.com>
To: Katsumi Yamaoka <yamaoka <at> jpl.org>
Cc: Kevin Ryde <user42_kevin <at> yahoo.com.au>, 33133 <at> debbugs.gnu.org
Subject: Re: bug#33133: 26.1.50; zlib-decompress-region too rigid
Date: Tue, 23 Oct 2018 20:26:59 -0400
Katsumi Yamaoka <yamaoka <at> jpl.org> writes:

> Whereas `gzip -d' does, zlib-decompress-region doesn't decompress
> corrupted data of a certain kind.  For instance, visiting
>
> https://www.gutenberg.org/no-such-page-exists
>
> using eww shows raw gzipped data.  The data extracted is attached.
> As for `gzip -d', it says "unexpected end of file" in stderr.

> If there is no prospect to improve zlib-decompress-region, how
> about this workaround?

We could have zlib-decompress-region ignore unexpected eof as well,
e.g., the below (though it should obviously be enhanced to depend on a
new parameter):

--- i/src/decompress.c
+++ w/src/decompress.c
@@ -206,7 +206,7 @@ DEFUN ("zlib-decompress-region", Fzlib_decompress_region,
     }
   while (inflate_status == Z_OK);
 
-  if (inflate_status != Z_STREAM_END)
+  if (inflate_status != Z_STREAM_END && inflate_status != Z_BUF_ERROR)
     return unbind_to (count, Qnil);
 
   unwind_data.start = 0;






Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#33133; Package emacs. (Wed, 24 Oct 2018 01:17:02 GMT) Full text and rfc822 format available.

Message #11 received at 33133 <at> debbugs.gnu.org (full text, mbox):

From: Katsumi Yamaoka <yamaoka <at> jpl.org>
To: Noam Postavsky <npostavs <at> gmail.com>
Cc: Kevin Ryde <user42_kevin <at> yahoo.com.au>, 33133 <at> debbugs.gnu.org
Subject: Re: bug#33133: 26.1.50; zlib-decompress-region too rigid
Date: Wed, 24 Oct 2018 10:16:16 +0900
On Tue, 23 Oct 2018 20:26:59 -0400, Noam Postavsky wrote:
> We could have zlib-decompress-region ignore unexpected eof as well,
> e.g., the below (though it should obviously be enhanced to depend on a
> new parameter):

> --- i/src/decompress.c
> +++ w/src/decompress.c
> @@ -206,7 +206,7 @@ DEFUN ("zlib-decompress-region", Fzlib_decompress_region,

>    while (inflate_status == Z_OK);

> -  if (inflate_status != Z_STREAM_END)
> +  if (inflate_status != Z_STREAM_END && inflate_status != Z_BUF_ERROR)
>      return unbind_to (count, Qnil);

>    unwind_data.start = 0;

I confirmed that it makes it work for the corrupted web site in
question.  Thank you!




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#33133; Package emacs. (Sat, 27 Oct 2018 21:49:02 GMT) Full text and rfc822 format available.

Message #14 received at 33133 <at> debbugs.gnu.org (full text, mbox):

From: Noam Postavsky <npostavs <at> gmail.com>
To: Katsumi Yamaoka <yamaoka <at> jpl.org>
Cc: Kevin Ryde <user42_kevin <at> yahoo.com.au>, 33133 <at> debbugs.gnu.org
Subject: Re: bug#33133: 26.1.50; zlib-decompress-region too rigid
Date: Sat, 27 Oct 2018 17:48:26 -0400
[Message part 1 (text/plain, inline)]
tags 33133 + patch
quit

Katsumi Yamaoka <yamaoka <at> jpl.org> writes:

> On Tue, 23 Oct 2018 20:26:59 -0400, Noam Postavsky wrote:
>> --- i/src/decompress.c
>> +++ w/src/decompress.c
>> @@ -206,7 +206,7 @@ DEFUN ("zlib-decompress-region", Fzlib_decompress_region,
>
>>    while (inflate_status == Z_OK);
>
>> -  if (inflate_status != Z_STREAM_END)
>> +  if (inflate_status != Z_STREAM_END && inflate_status != Z_BUF_ERROR)
>>      return unbind_to (count, Qnil);
>
>>    unwind_data.start = 0;
>
> I confirmed that it makes it work for the corrupted web site in
> question.  Thank you!

Here's a proper patch.

[v1-0001-Allow-partial-decompression-Bug-33133.patch (text/plain, attachment)]

Added tag(s) patch. Request was from Noam Postavsky <npostavs <at> gmail.com> to control <at> debbugs.gnu.org. (Sat, 27 Oct 2018 21:49:02 GMT) Full text and rfc822 format available.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#33133; Package emacs. (Sun, 28 Oct 2018 15:42:02 GMT) Full text and rfc822 format available.

Message #19 received at 33133 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Noam Postavsky <npostavs <at> gmail.com>
Cc: user42_kevin <at> yahoo.com.au, yamaoka <at> jpl.org, 33133 <at> debbugs.gnu.org
Subject: Re: bug#33133: 26.1.50; zlib-decompress-region too rigid
Date: Sun, 28 Oct 2018 17:41:17 +0200
> From: Noam Postavsky <npostavs <at> gmail.com>
> Date: Sat, 27 Oct 2018 17:48:26 -0400
> Cc: Kevin Ryde <user42_kevin <at> yahoo.com.au>, 33133 <at> debbugs.gnu.org
> 
> Here's a proper patch.

Thanks.  I have a few comments:

> +data.  If @var{allow-partial} is @code{nil}, on failure, the function

We usually say "nil or omitted" for optional arguments.  Also, I'd say
"then on failure, ...", otherwise this could be misinterpreted as if
"on failure" qualifies the "is nil" part.

Same comment regarding the doc string of the function.

> +leaves the region unchanged and returns @code{nil}.  Otherwise, it
> +returns the number of bytes that were not decompressed and replaces
> +the region text by whatever data was successfully decompressed.  This
> +function can be called only in unibyte buffers.

Maybe it would make sense here to say that this emulates what 'gzip'
does?

> +  Lisp_Object ret = Qt;
>    if (inflate_status != Z_STREAM_END)
> -    return unbind_to (count, Qnil);
> +    {
> +      if (!NILP (allow_partial))
> +        ret = make_int (iend - pos_byte);
> +      else
> +        return unbind_to (count, Qnil);
> +    }

Hmm... should we display a warning message, like gzip does?




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#33133; Package emacs. (Wed, 31 Oct 2018 00:26:02 GMT) Full text and rfc822 format available.

Message #22 received at 33133 <at> debbugs.gnu.org (full text, mbox):

From: Noam Postavsky <npostavs <at> gmail.com>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: user42_kevin <at> yahoo.com.au, yamaoka <at> jpl.org, 33133 <at> debbugs.gnu.org
Subject: Re: bug#33133: 26.1.50; zlib-decompress-region too rigid
Date: Tue, 30 Oct 2018 20:25:10 -0400
[Message part 1 (text/plain, inline)]
Eli Zaretskii <eliz <at> gnu.org> writes:

>> +data.  If @var{allow-partial} is @code{nil}, on failure, the function
>
> We usually say "nil or omitted" for optional arguments.  Also, I'd say
> "then on failure, ...", otherwise this could be misinterpreted as if
> "on failure" qualifies the "is nil" part.
>
> Same comment regarding the doc string of the function.

Makes sense, done.

>> +leaves the region unchanged and returns @code{nil}.  Otherwise, it
>> +returns the number of bytes that were not decompressed and replaces
>> +the region text by whatever data was successfully decompressed.  This
>> +function can be called only in unibyte buffers.
>
> Maybe it would make sense here to say that this emulates what 'gzip'
> does?

Hmm, maybe.  I've added a mention of this, not sure if it actually
helps.

>> +  Lisp_Object ret = Qt;
>>    if (inflate_status != Z_STREAM_END)
>> +    {
>> +      if (!NILP (allow_partial))
>> +        ret = make_int (iend - pos_byte);

> Hmm... should we display a warning message, like gzip does?

Not unconditionally, I'd say.  In the example which prompted this bug
thread, a warning would just be a nuisance.  We could just leave it up
to the caller to print a warning message if they want, e.g.:

   (unless (eq t (zlib-decompress-region START END t))
     (message "Incomplete decompression"))

Or perhaps instead of returning the byte count, return an error
indicator which the caller could use to contruct a warning message (this
could allow for a slightly more specific message)?  Or maybe it's easier
if the ALLOW-PARTIAL parameter could have another possible value to
control display of the warning message?

[v2-0001-Allow-partial-decompression-Bug-33133.patch (text/plain, attachment)]

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#33133; Package emacs. (Wed, 31 Oct 2018 16:08:01 GMT) Full text and rfc822 format available.

Message #25 received at 33133 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Noam Postavsky <npostavs <at> gmail.com>
Cc: user42_kevin <at> yahoo.com.au, yamaoka <at> jpl.org, 33133 <at> debbugs.gnu.org
Subject: Re: bug#33133: 26.1.50; zlib-decompress-region too rigid
Date: Wed, 31 Oct 2018 18:07:33 +0200
> From: Noam Postavsky <npostavs <at> gmail.com>
> Cc: user42_kevin <at> yahoo.com.au,  yamaoka <at> jpl.org,  33133 <at> debbugs.gnu.org
> Date: Tue, 30 Oct 2018 20:25:10 -0400
> 
> >From 43a912181a4a30d826b2c016ca05b5c9d8daf3f4 Mon Sep 17 00:00:00 2001
> From: Noam Postavsky <npostavs <at> gmail.com>
> Date: Sat, 27 Oct 2018 17:45:52 -0400
> Subject: [PATCH v2] Allow partial decompression (Bug#33133)
> 
> * src/decompress.c (Fzlib_decompress_region): Add optional
> ALLOW-PARTIAL parameter.
> * lisp/url/url-http.el (url-handle-content-transfer-encoding): Use it.
> * doc/lispref/text.texi (Decompression): Document it.
> * etc/NEWS: Announce it.

Thanks, this LGTM.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#33133; Package emacs. (Wed, 03 Apr 2019 02:10:02 GMT) Full text and rfc822 format available.

Message #28 received at 33133 <at> debbugs.gnu.org (full text, mbox):

From: Noam Postavsky <npostavs <at> gmail.com>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: user42_kevin <at> yahoo.com.au, yamaoka <at> jpl.org, 33133 <at> debbugs.gnu.org
Subject: Re: bug#33133: 26.1.50; zlib-decompress-region too rigid
Date: Tue, 02 Apr 2019 22:09:24 -0400
tags 33133 fixed
close 33133 27.1
quit

Eli Zaretskii <eliz <at> gnu.org> writes:

>> Subject: [PATCH v2] Allow partial decompression (Bug#33133)
>> 
>> * src/decompress.c (Fzlib_decompress_region): Add optional
>> ALLOW-PARTIAL parameter.
>> * lisp/url/url-http.el (url-handle-content-transfer-encoding): Use it.
>> * doc/lispref/text.texi (Decompression): Document it.
>> * etc/NEWS: Announce it.
>
> Thanks, this LGTM.

Pushed.

[1: b36913d803]: 2019-04-02 22:02:32 -0400
  Allow partial decompression (Bug#33133)
  https://git.savannah.gnu.org/cgit/emacs.git/commit/?id=b36913d803ee22a314f2e0a27523fbadeb60dd2c




Added tag(s) fixed. Request was from Noam Postavsky <npostavs <at> gmail.com> to control <at> debbugs.gnu.org. (Wed, 03 Apr 2019 02:10:04 GMT) Full text and rfc822 format available.

bug marked as fixed in version 27.1, send any further explanations to 33133 <at> debbugs.gnu.org and Katsumi Yamaoka <yamaoka <at> jpl.org> Request was from Noam Postavsky <npostavs <at> gmail.com> to control <at> debbugs.gnu.org. (Wed, 03 Apr 2019 02:10:04 GMT) Full text and rfc822 format available.

bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Wed, 01 May 2019 11:24:04 GMT) Full text and rfc822 format available.

This bug report was last modified 6 years and 51 days ago.

Previous Next


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.