GNU bug report logs - #15380
Bug in gzip.c buffering, and two feature requests
To reply to this bug, email your comments to 15380 AT debbugs.gnu.org.
Report forwarded to bug-gzip <at> gnu.org: bug#15380; Package gzip. (Sun, 15 Sep 2013 01:00:04 GMT)
Acknowledgement sent to Martin Langhoff <martin.langhoff <at> gmail.com>: New bug report received and forwarded. Copy sent to bug-gzip <at> gnu.org. (Sun, 15 Sep 2013 01:00:06 GMT)
Message #5 received at submit <at> debbugs.gnu.org:
Hi gzip maintainers!
In the course of trying to use gzip as a stream-compressor with
Apache's "piped logs" feature I hit what I think is a bug in gzip.
Additionally, using gzip as a stream log compressor triggered two
related "feature requests".
For the time being, I have rolled my own mini-reimplementation of
gzip.c (in python, I was in a rush...) and the tool I will describe
below. However, I think these improvements belong in gzip proper.
Bug: gzip.c discards its buffers when it gets SIGTERM, with no effort
made to flush them to disk. Logging via gzip adds buffering; when
Apache is stopped (or restarted, usually daily to rotate logs), it
sends SIGTERM before it closes the file handle, and gzip discards
everything it has buffered.
[ Apache logs are highly compressible, so the loss is quite significant! ]
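As an illustration of the behaviour being asked for, here is a minimal sketch in Python (the language of the stand-in tool mentioned above, not of gzip.c). The names `compress_stream` and `Terminated` are hypothetical, and this is neither gzip's code nor the compresslog implementation. It assumes a POSIX system and that a SIGTERM handler may unwind the copy loop so the gzip stream is closed properly rather than dropped:

```python
import gzip
import signal


class Terminated(Exception):
    """Raised from the SIGTERM handler to unwind the copy loop (hypothetical name)."""


def _on_sigterm(signum, frame):
    raise Terminated()


def compress_stream(src, dst_path):
    """Copy binary lines from `src` into a gzip file, finishing it on SIGTERM."""
    previous = signal.signal(signal.SIGTERM, _on_sigterm)
    out = gzip.open(dst_path, "ab")  # append mode starts a new gzip member
    try:
        for line in src:
            out.write(line)
    except Terminated:
        pass  # stop copying; the cleanup below still runs
    finally:
        # Ignore further SIGTERMs so the close itself is not interrupted.
        signal.signal(signal.SIGTERM, signal.SIG_IGN)
        out.close()  # flushes the deflate buffers and writes the gzip trailer
        signal.signal(signal.SIGTERM, previous)
```

A piped-log wrapper could call `compress_stream(sys.stdin.buffer, "/path/to/access.log.gz")` (the path is a placeholder). A signal that lands in the middle of a write may still lose the line being written, so this narrows the window rather than closing it.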
Feature requests:
 - for debugging / diagnostics of live services, gzip (when in
   "stream" mode) should flush its buffers to disk when it receives a
   SIGUSR1
 - zcat needs to support a "followtail" option, where it reads to the
   end of an open gzipped file and then "follows" it (as in tail -f).
   This tool would be one user of the "flush on SIGUSR1" feature.
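The two requests above can be sketched together in Python. This is an illustrative sketch, not the compresslog/ztail code and not a proposal for gzip.c; `compress_with_flush`, `follow_gzip`, `poll_interval` and `stop` are hypothetical names, and a POSIX system is assumed for SIGUSR1. The writer only records the signal in a flag and performs a sync flush from its copy loop, so in this simple form the flush happens after the next line is written; the reader decompresses incrementally and polls for more data at end of file:

```python
import gzip
import signal
import time
import zlib

_flush_requested = False


def _on_sigusr1(signum, frame):
    # Only set a flag; flushing from inside the handler could interleave
    # with a write that is already in progress.
    global _flush_requested
    _flush_requested = True


def compress_with_flush(src, dst_path):
    """Copy binary lines from `src` into a gzip file, syncing after SIGUSR1."""
    global _flush_requested
    signal.signal(signal.SIGUSR1, _on_sigusr1)
    with gzip.open(dst_path, "ab") as out:
        for line in src:
            out.write(line)
            if _flush_requested:
                _flush_requested = False
                out.flush()  # Z_SYNC_FLUSH: data written so far becomes readable


def follow_gzip(path, poll_interval=1.0, stop=lambda: False):
    """Yield decompressed chunks from a growing gzip file, like `tail -f`."""
    gzip_wbits = zlib.MAX_WBITS | 16  # expect a gzip header and trailer
    decomp = zlib.decompressobj(gzip_wbits)
    with open(path, "rb") as raw:
        while not stop():
            chunk = raw.read(65536)
            if not chunk:
                time.sleep(poll_interval)  # at EOF: wait for the writer
                continue
            while chunk:
                data = decomp.decompress(chunk)
                if data:
                    yield data
                if decomp.eof:
                    # End of one gzip member; another may follow (appended logs).
                    chunk = decomp.unused_data
                    decomp = zlib.decompressobj(gzip_wbits)
                else:
                    chunk = b""
```

With this sketch, `kill -USR1 <pid>` against the writer followed by a loop over `follow_gzip(path)` would show everything logged up to the flush point, at the cost of slightly worse compression for each sync flush.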
Working implementations of "compresslog" (a gzip-stream-compressor)
and "ztail", implemented in Python, are available here:
http://repo.or.cz/w/compresslog.git
Is there any interest in this? I am not familiar with the gzip.c
sources, and I am hesitant to put a lot of effort into this without
hearing from the maintainers.
cheers,
m
--
martin.langhoff <at> gmail.com
- ask interesting questions
- don't get distracted with shiny stuff - working code first
~ http://docs.moodle.org/en/User:Martin_Langhoff
Information forwarded to bug-gzip <at> gnu.org: bug#15380; Package gzip. (Sun, 15 Sep 2013 04:15:02 GMT)
Message #8 received at submit <at> debbugs.gnu.org:
Your gzip feature requests sound reasonable, except for this one:
On 09/14/2013 07:57 PM, Martin Langhoff wrote:
> Bug: gzip.c discards its buffers when it gets SIGTERM, with no
> effort made to flush them to disk.
That's the normal behavior of utilities when they get SIGTERM; they
exit right away with minimal fuss, and they typically do not attempt
to flush output buffers.
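The default behaviour described here can be observed with a small experiment. The sketch below uses a Python child process rather than gzip itself, so it only illustrates the general point about the default SIGTERM disposition on a POSIX system; the helper name `sigterm_loses_buffered_output` is hypothetical:

```python
import signal
import subprocess
import sys

# The child buffers one line on stdout (block-buffered because stdout is a
# pipe), announces readiness on stderr, then sleeps without ever flushing.
CHILD = r"""
import sys, time
sys.stdout.write("log line held in the output buffer\n")
sys.stderr.write("ready\n")
sys.stderr.flush()
time.sleep(60)
"""


def sigterm_loses_buffered_output():
    """Return (stdout_bytes, returncode) of a child terminated by SIGTERM."""
    child = subprocess.Popen(
        [sys.executable, "-c", CHILD],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
    )
    child.stderr.readline()            # wait until the child has buffered output
    child.send_signal(signal.SIGTERM)  # no handler installed: default action
    out, _ = child.communicate()
    return out, child.returncode
```

Since the child installs no handler, it is terminated without running any cleanup, and the buffered line never reaches the pipe.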
> Logging via gzip adds buffering, when apache is stopped (or
> restarted, usually daily to rotate logs), apache sends SIGTERM
> before it closes the fh;
That seems wrong. Apache should close the file handle, and let gzip
finish up cleanly. Maybe Apache should kill gzip if gzip doesn't exit
reasonably soon after the file handle is closed, but it'd be misguided
for Apache to kill off gzip first, before closing the file handle. If
this is really Apache's behavior, perhaps you should file an Apache
bug report.
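The shutdown order suggested here can be sketched from the parent's side. This is an illustrative Python sketch, not Apache's piped-log code; `log_through_gzip` and `grace_seconds` are hypothetical names, and it assumes a `gzip` binary on the PATH that compresses stdin to stdout with `-c`:

```python
import subprocess


def log_through_gzip(lines, dst_path, cmd=("gzip", "-c"), grace_seconds=10):
    """Pipe `lines` through a compressor child, close the pipe, then wait."""
    with open(dst_path, "wb") as dst:
        child = subprocess.Popen(cmd, stdin=subprocess.PIPE, stdout=dst)
        for line in lines:
            child.stdin.write(line)
        # Closing the write end gives the child EOF on stdin, so it can
        # flush its buffers and write the gzip trailer on its own.
        child.stdin.close()
        try:
            return child.wait(timeout=grace_seconds)
        except subprocess.TimeoutExpired:
            child.terminate()  # SIGTERM only as a fallback after the grace period
            return child.wait()
```

The point of the ordering is that the signal becomes a last resort: a well-behaved compressor exits with status 0 as soon as it sees EOF, and nothing is lost.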
Information forwarded to bug-gzip <at> gnu.org: bug#15380; Package gzip. (Sun, 15 Sep 2013 13:37:02 GMT)
Message #11 received at submit <at> debbugs.gnu.org:
On Sun, Sep 15, 2013 at 12:13 AM, Paul Eggert <eggert <at> cs.ucla.edu> wrote:
> Your gzip feature requests sound reasonable, except for this one:
Thanks for taking the time to reply!
>> Bug: gzip.c discards its buffers when it gets SIGTERM, with no
>> effort made to flush them to disk.
>
> That's the normal behavior of utilities when they get SIGTERM; they
> exit right away with minimal fuss, and they typically do not attempt
> to flush output buffers.
I am no POSIX lawyer. Googling, I find two diverging interpretations
of SIGTERM -- one is close to SIGABRT, the other one is for a more
elegant exit.
For example, Wikipedia has:
"SIGTERM - The SIGTERM signal is sent to a process to request its
termination. Unlike the SIGKILL signal, it can be caught and
interpreted or ignored by the process. This allows the process to
perform nice termination releasing resources and saving state if
appropriate. It should be noted that SIGINT is nearly identical to
SIGTERM."
I did a bit of reading around POSIX 2008 at
http://pubs.opengroup.org/onlinepubs/9699919799/ -- in
http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/signal.h.html
I do find that the default action for SIGTERM is "T", "Abnormal
termination of process". But it also lists SIGALRM, SIGUSR1 and SIGUSR2
as "T", so this is not the information we are looking for :-)
At the end of the day, we didn't get SIGABRT. We got SIGTERM. IMHO,
flushing of buffers is reasonable in this context.
>> Logging via gzip adds buffering, when apache is stopped (or
>> restarted, usually daily to rotate logs), apache sends SIGTERM
>> before it closes the fh;
>
> That seems wrong. Apache should close the file handle, and let gzip
> finish up cleanly. Maybe Apache should kill gzip if gzip doesn't exit
I am double-checking the apache sources to confirm my initial reading,
but TBH I am not yet convinced that SIGTERM is wrong in this context.
(closing the fh would be nice, though).
cheers,
m
--
martin.langhoff <at> gmail.com
- ask interesting questions
- don't get distracted with shiny stuff - working code first
~ http://docs.moodle.org/en/User:Martin_Langhoff
This bug report was last modified 11 years and 322 days ago.