Package: emacs
Reported by: Eli Zaretskii <eliz <at> gnu.org>
Date: Thu, 2 Feb 2012 18:19:02 UTC
Severity: important
Found in version 24.0.93
Fixed in version 24.0.94
Done: Glenn Morris <rgm <at> gnu.org>
Bug is archived. No further changes may be made.
Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):
From: Eli Zaretskii <eliz <at> gnu.org>
To: bug-gnu-emacs <at> gnu.org
Cc: Kenichi Handa <handa <at> m17n.org>
Subject: 24.0.93; Crash while decoding input with DOS EOLs
Date: Thu, 02 Feb 2012 20:15:39 +0200
This bug report will be sent to the Bug-GNU-Emacs mailing list and the GNU bug tracker at debbugs.gnu.org. Please check that the From: line contains a valid email address. After a delay of up to one day, you should receive an acknowledgement at that address.

Please write in English if possible, as the Emacs maintainers usually do not have translators for other languages.

Please describe exactly what actions triggered the bug, and the precise symptoms of the bug. If you can, give a recipe starting from `emacs -Q':

I see this both with today's trunk and in the 24.0.93 pretest, both on GNU/Linux and on MS-Windows.

To reproduce:

  emacs -Q
  C-x b foo RET
  M-: (set-buffer-multibyte nil) RET
  C-x RET c undecided-dos RET C-u M-! gunzip -c emacs-24.0.93.tar.gz RET

(It must be the tarball of Emacs 24.0.93, because the bug is data-dependent. It doesn't have to be .tar.gz, as long as you use the correct decompressor: bunzip2 for .tar.bz2, xz for .tar.xz, etc. You can even do this with an uncompressed tarball and cat. The important part is that Emacs gets the byte stream of that tarball, and that it gets it from a subprocess.)

This crashes somewhere in the middle of reading the output from the subprocess. The immediate reason for the crash can be seen from this fragment of the backtrace:

  #0  w32_abort () at w32fns.c:7196
  #1  0x012eea83 in temp_set_point_both (buffer=0x10137600, charpos=45817604, bytepos=45817605) at intervals.c:1870
  #2  0x01135816 in Fcall_process (nargs=6, args=0x82f644) at callproc.c:846

As you see, temp_set_point_both gets a character position and a byte position that are different, which cannot happen in a unibyte buffer (as can be seen above, the recipe makes the buffer `foo' a unibyte one). There's an assertion inside temp_set_point_both that aborts due to this.
The call to temp_set_point_both is in call-process:

    TEMP_SET_PT_BOTH (PT + process_coding.produced_char,
                      PT_BYTE + process_coding.produced);
    carryover = process_coding.carryover_bytes;
    if (carryover > 0)
      memcpy (buf, process_coding.carryover,
              process_coding.carryover_bytes);

The crash happens at the point in the input byte stream where the last byte in the chunk we read from the pipe is \r. Since the stream is decoded with the raw-text-dos coding-system, this last \r is left as a "carryover", for the possibility that there will be a \n at the beginning of the next chunk. However, process_coding.produced does not account for this single byte that was not processed, and gets a value one more than it should.

As far as I could see, the problematic code that sets process_coding.produced to an incorrect value is in decode_coding, around line 7176:

    else
      {
        /* Record unprocessed bytes in coding->carryover.  We are
           sure that the number of data is less than the size of
           coding->carryover.  */
        unsigned char *p = coding->carryover;

        if (nbytes > sizeof coding->carryover)
          nbytes = sizeof coding->carryover;
        coding->carryover_bytes = nbytes;
        while (nbytes-- > 0)
          *p++ = *src++;
      }
    coding->consumed = coding->src_bytes;  <<<<<<<<<<<<<<<<<<<

This last assignment then causes produce_chars to set coding->produced to an incorrect value:

    /* Source characters are at coding->source.  */
    const unsigned char *src = coding->source;
    const unsigned char *src_end = src + coding->consumed;  <<<<<<<<<<<<
    ...
            produced_chars = coding->consumed_char;
            while (src < src_end)
              *dst++ = *src++;
          }
      }

    produced = dst - (coding->destination + coding->produced);  <<<<<<<<<<<
    if (BUFFERP (coding->dst_object) && produced_chars > 0)
      insert_from_gap (produced_chars, produced);
    coding->produced += produced;  <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
    coding->produced_char += produced_chars;

I don't understand the logic of "carryover" in decode_coding well enough to decide how to fix it.
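The carryover bookkeeping described above can be illustrated with a minimal, self-contained sketch. This is not the Emacs code and not a proposed patch: decode_dos_chunk and decode_stream are hypothetical names, the 64-byte staging buffer is an arbitrary assumption, and only CRLF-to-LF conversion on raw bytes is modeled. What it tries to show is that a trailing \r held back as carryover has to be excluded from the bytes counted as consumed, so that the produced byte count matches what was actually written to the destination.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not an Emacs function): decode LEN bytes of
   DOS-EOL text from SRC into DST, which must have room for LEN bytes.
   A trailing '\r' is NOT decoded, because its '\n' may only arrive in
   the next chunk; it is reported through *CARRYOVER (0 or 1) instead.
   Returns the number of bytes written to DST.  */
static size_t
decode_dos_chunk (const unsigned char *src, size_t len,
                  unsigned char *dst, size_t *carryover)
{
  size_t consumed = len;
  size_t produced = 0;

  *carryover = 0;
  if (len > 0 && src[len - 1] == '\r')
    {
      /* Hold the last byte back and do not count it as consumed.  */
      consumed = len - 1;
      *carryover = 1;
    }

  for (size_t i = 0; i < consumed; i++)
    {
      /* Drop the '\r' of a "\r\n" pair; a lone '\r' is copied as is.  */
      if (src[i] == '\r' && i + 1 < consumed && src[i + 1] == '\n')
        continue;
      dst[produced++] = src[i];
    }
  return produced;
}

/* Hypothetical driver, loosely modeled on the read loop quoted above:
   carried-over bytes are moved to the start of the staging buffer and
   the next chunk is appended after them.  OUT must be large enough for
   all the input.  Returns the total number of bytes produced.  */
static size_t
decode_stream (const char *const *chunks, size_t nchunks,
               unsigned char *out)
{
  unsigned char buf[64];        /* assumption: chunks are short */
  size_t carryover = 0;
  size_t total = 0;

  for (size_t k = 0; k < nchunks; k++)
    {
      size_t n = strlen (chunks[k]);
      assert (carryover + n <= sizeof buf);
      memcpy (buf + carryover, chunks[k], n);

      size_t len = carryover + n;
      total += decode_dos_chunk (buf, len, out + total, &carryover);
      if (carryover > 0)
        memmove (buf, buf + len - carryover, carryover);
    }

  /* At end of input a held-back '\r' has no partner, so emit it.  */
  if (carryover > 0)
    out[total++] = buf[0];
  return total;
}
```

With the three chunks "ab\r", "\ncd\r\n" and "e\r", decode_stream produces the 8 bytes "ab\ncd\ne\r": the \r that ends the first chunk is paired with the \n that starts the second, and the total never includes a byte that is still waiting in the carryover. Setting consumed to len unconditionally in decode_dos_chunk would count the held-back byte as produced, which is the kind of off-by-one that the backtrace shows.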
If Emacs crashed, and you have the Emacs process in the gdb debugger, please include the output from the following gdb commands: `bt full' and `xbacktrace'. For information about debugging Emacs, please read the file d:/gnu/bzr/emacs/trunk/etc/DEBUG.

In GNU Emacs 24.0.93.1 (i386-mingw-nt5.1.2600)
 of 2012-02-02 on HOME-C4E4A596F7
Windowing system distributor `Microsoft Corp.', version 5.1.2600
Configured using:
 `configure --with-gcc (3.4) --no-opt'

Important settings:
  value of $LC_ALL: nil
  value of $LC_COLLATE: nil
  value of $LC_CTYPE: nil
  value of $LC_MESSAGES: nil
  value of $LC_MONETARY: nil
  value of $LC_NUMERIC: nil
  value of $LC_TIME: nil
  value of $LANG: ENU
  value of $XMODIFIERS: nil
  locale-coding-system: cp1255
  default enable-multibyte-characters: t

Major mode: Lisp Interaction

Minor modes in effect:
  tooltip-mode: t
  mouse-wheel-mode: t
  tool-bar-mode: t
  menu-bar-mode: t
  file-name-shadow-mode: t
  global-font-lock-mode: t
  font-lock-mode: t
  blink-cursor-mode: t
  auto-composition-mode: t
  auto-encryption-mode: t
  auto-compression-mode: t
  line-number-mode: t
  transient-mark-mode: t

Recent input:
M-x r e p o r t - e m <tab> <return>

Recent messages:
For information about GNU Emacs and the GNU system, type C-h C-a.

Load-path shadows:
None found.
Features: (shadow sort gnus-util mail-extr message format-spec rfc822 mml easymenu mml-sec mm-decode mm-bodies mm-encode mail-parse rfc2231 rfc2047 rfc2045 ietf-drums mm-util mail-prsvr mailabbrev mail-utils gmm-utils mailheader emacsbug time-date tooltip ediff-hook vc-hooks lisp-float-type mwheel dos-w32 disp-table ls-lisp w32-win w32-vars tool-bar dnd fontset image fringe lisp-mode register page menu-bar rfn-eshadow timer select scroll-bar mouse jit-lock font-lock syntax facemenu font-core frame cham georgian utf-8-lang misc-lang vietnamese tibetan thai tai-viet lao korean japanese hebrew greek romanian slovak czech european ethiopic indian cyrillic chinese case-table epa-hook jka-cmpr-hook help simple abbrev minibuffer loaddefs button faces cus-face files text-properties overlay sha1 md5 base64 format env code-pages mule custom widget hashtable-print-readable backquote make-network-process multi-tty emacs)