GNU bug report logs - #5251
23.1; problem in decode_eol of coding.c


Package: emacs

Reported by: Toru TSUNEYOSHI <t_tuneyosi <at> hotmail.com>

Date: Tue, 6 Oct 2009 18:55:07 UTC

Severity: normal

Tags: patch

Done: Eli Zaretskii <eliz <at> gnu.org>

Bug is archived. No further changes may be made.


Report forwarded to bug-submit-list <at> lists.donarmstrong.com, Emacs Bugs <bug-gnu-emacs <at> gnu.org>:
bug#4360; Package emacs. (Sun, 06 Sep 2009 17:00:03 GMT)

Message #3 received at submit <at> debbugs.gnu.org:

From: Toru TSUNEYOSHI <t_tuneyosi <at> hotmail.com>
To: bug-gnu-emacs <at> gnu.org
Subject: bug#5251: 23.1; problem in decode_eol of coding.c
Date: Sun, 20 Dec 2009 17:57:09 +0900
In GNU Emacs 23.1, there is a problem in decode_eol of coding.c.

In a buffer where enable-multibyte-characters is nil,
the function `decode-coding-region' should delete the '^M' character
(code: 0x0d) at the end of each line
when its `coding-system' argument is *-dos (and
the variable `inhibit-eol-conversion' is nil).
In practice, however, the function does not delete every '^M'.

You can reproduce the problem with the following.

========================================================================
(progn
  (pop-to-buffer (generate-new-buffer-name "*scratch*"))
  (set-buffer-multibyte nil)
  (insert (encode-coding-string "あ" 'euc-jp) "\xd" "\n")
  (read-char "(press any key)")
  (decode-coding-region (point-min) (point-max) 'euc-jp-dos)
  (read-char "doesn't delete all of the character '^M'. (press any key)")
  (set-buffer-multibyte t))
========================================================================

I made a patch to fix the problem. Please check it.
[coding.c__decode_eol.diff (text/x-patch, inline)]
--- coding.c.orig	2009-07-08 12:09:16.000000000 +0900
+++ coding.c	2009-12-20 16:36:45.887121600 +0900
@@ -6598,7 +6598,8 @@
 	{
 	  int pos_byte = coding->dst_pos_byte;
 	  int pos = coding->dst_pos;
-	  int pos_end = pos + coding->produced_char - 1;
+	  int pos_end = pos + (coding->dst_multibyte
+			       ? coding->produced_char : coding->produced) - 1;
 
 	  while (pos < pos_end)
 	    {
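
The effect of the loop bound can be illustrated with a small standalone C sketch. It is not the Emacs source: it models a unibyte destination as a plain byte array, and the helper name `strip_crlf`, its `limit` parameter, and the sample bytes are all assumptions made for illustration. Assuming the decoded text takes more bytes than characters (here one 3-byte character followed by CR LF, so 5 bytes but 3 characters), a limit taken from the character count stops the scan before it reaches the CR, while a limit taken from the byte count lets the CR be removed.

```c
#include <stddef.h>
#include <string.h>

/* Remove the CR of every CR LF pair found while scanning the first
   LIMIT bytes of BUF, which holds LEN bytes in total.  Lone CRs are
   kept.  Returns the new length.
   Hypothetical helper for illustration, not taken from coding.c.  */
static size_t
strip_crlf (unsigned char *buf, size_t len, size_t limit)
{
  size_t i = 0;

  if (limit > len)
    limit = len;

  /* Stop one byte early so that buf[i + 1] is always inside the scan.  */
  while (i + 1 < limit)
    {
      if (buf[i] == '\r' && buf[i + 1] == '\n')
        {
          /* Shift the tail left by one byte, overwriting the CR.  */
          memmove (buf + i, buf + i + 1, len - i - 1);
          len--;
          limit--;
        }
      i++;
    }
  return len;
}
```

For the five bytes `E3 81 82 0D 0A`, `strip_crlf (buf, 5, 3)` (limit from the character count) leaves the length at 5 and the CR in place, whereas `strip_crlf (buf, 5, 5)` (limit from the byte count) returns 4.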

Added tag(s) patch. Request was from Glenn Morris <rgm <at> gnu.org> to control <at> debbugs.gnu.org. (Thu, 28 Jan 2010 00:14:02 GMT)

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#5251; Package emacs. (Wed, 17 Feb 2016 03:42:02 GMT)

Message #8 received at 5251 <at> debbugs.gnu.org:

From: Andrew Hyatt <ahyatt <at> gmail.com>
To: Toru TSUNEYOSHI <t_tuneyosi <at> hotmail.com>
Cc: 5251 <at> debbugs.gnu.org
Subject: Re: bug#5251: 23.1; problem in decode_eol of coding.c
Date: Tue, 16 Feb 2016 22:41:32 -0500
I can verify that this issue still happens in Emacs 25.  This simple fix
seems to have slipped by unnoticed many years ago; hopefully someone
who is qualified to comment on the patch will see it now.

BTW, this seems like a nice thing to unit test with ert, if a fix does
take place.

Toru TSUNEYOSHI <t_tuneyosi <at> hotmail.com> writes:

> In GNU Emacs 23.1, there is a problem in decode_eol of coding.c.
>
> In a buffer where enable-multibyte-characters is nil,
> the function `decode-coding-region' should delete the '^M' character
> (code: 0x0d) at the end of each line
> when its `coding-system' argument is *-dos (and
> the variable `inhibit-eol-conversion' is nil).
> In practice, however, the function does not delete every '^M'.
>
> You can reproduce the problem with the following.
>
> ========================================================================
> (progn
>   (pop-to-buffer (generate-new-buffer-name "*scratch*"))
>   (set-buffer-multibyte nil)
>   (insert (encode-coding-string "あ" 'euc-jp) "\xd" "\n")
>   (read-char "(press any key)")
>   (decode-coding-region (point-min) (point-max) 'euc-jp-dos)
>   (read-char "doesn't delete all of the character '^M'. (press any key)")
>   (set-buffer-multibyte t))
> ========================================================================
>
> I made a patch to fix the problem. Please check it.
>
> --- coding.c.orig	2009-07-08 12:09:16.000000000 +0900
> +++ coding.c	2009-12-20 16:36:45.887121600 +0900
> @@ -6598,7 +6598,8 @@
>  	{
>  	  int pos_byte = coding->dst_pos_byte;
>  	  int pos = coding->dst_pos;
> -	  int pos_end = pos + coding->produced_char - 1;
> +	  int pos_end = pos + (coding->dst_multibyte
> +			       ? coding->produced_char : coding->produced) - 1;
>  
>  	  while (pos < pos_end)
>  	    {




Reply sent to Eli Zaretskii <eliz <at> gnu.org>:
You have taken responsibility. (Wed, 17 Feb 2016 15:54:02 GMT)

Notification sent to Toru TSUNEYOSHI <t_tuneyosi <at> hotmail.com>:
bug acknowledged by developer. (Wed, 17 Feb 2016 15:54:02 GMT)

Message #13 received at 5251-done <at> debbugs.gnu.org:

From: Eli Zaretskii <eliz <at> gnu.org>
To: Andrew Hyatt <ahyatt <at> gmail.com>
Cc: t_tuneyosi <at> hotmail.com, 5251-done <at> debbugs.gnu.org
Subject: Re: bug#5251: 23.1; problem in decode_eol of coding.c
Date: Wed, 17 Feb 2016 17:53:41 +0200
> From: Andrew Hyatt <ahyatt <at> gmail.com>
> Date: Tue, 16 Feb 2016 22:41:32 -0500
> Cc: 5251 <at> debbugs.gnu.org
> 
> 
> I can verify that this issue still happens in Emacs 25.  This simple fix
> seems to have slipped by unnoticed many years ago; hopefully someone
> who is qualified to comment on the patch will see it now.

Thanks for the reminder, I fixed this now on the emacs-25 branch.
(The proposed patch was not entirely correct, although it was in the
right direction, and happened to fix this particular case.)

It is indeed a shame that such a simple bug was left unresolved for
such a long time, but better late than never.

> BTW, this seems like a nice thing to unit test with ert, if a fix does
> take place.

Yes, that would be very welcome, thanks.  Looks like coding-tests.el
is a good place to add such a test.  Please be sure to include lone ^M
characters, as well as those followed by a newline, so that we are
sure lone CRs are not removed.
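
The cases such a test should cover can be sketched as input/expected pairs. An actual test for Emacs would be written with ERT in coding-tests.el; the standalone C sketch below only enumerates the cases, and the names `dos_to_unix`, `eol_cases`, and `run_eol_cases` are hypothetical, not part of Emacs.

```c
#include <stddef.h>
#include <string.h>

/* Copy SRC to DST, dropping each CR that is immediately followed by LF.
   DST must have room for strlen (SRC) + 1 bytes.
   Hypothetical helper for illustration; it is not the Emacs decoder.  */
static void
dos_to_unix (const char *src, char *dst)
{
  for (; *src != '\0'; src++)
    {
      if (src[0] == '\r' && src[1] == '\n')
        continue;               /* skip the CR; the LF is copied next */
      *dst++ = *src;
    }
  *dst = '\0';
}

/* Input/expected pairs covering CR LF pairs and lone CRs.  */
struct eol_case { const char *input; const char *expected; };

static const struct eol_case eol_cases[] = {
  { "a\r\nb\r\n", "a\nb\n" },   /* CR LF pairs lose their CR */
  { "a\rb\n",     "a\rb\n" },   /* a lone CR inside a line is kept */
  { "a\r",        "a\r"    },   /* a lone CR at the end is kept */
  { "\r\r\n",     "\r\n"   },   /* only the CR next to the LF is dropped */
};

/* Return the number of failing cases (0 means all passed).  */
static int
run_eol_cases (void)
{
  int failures = 0;
  char out[16];                 /* large enough for every case above */

  for (size_t i = 0; i < sizeof eol_cases / sizeof eol_cases[0]; i++)
    {
      dos_to_unix (eol_cases[i].input, out);
      if (strcmp (out, eol_cases[i].expected) != 0)
        failures++;
    }
  return failures;
}
```

The second and third rows are the ones that guard against removing lone CRs; a fix that simply deleted every 0x0d byte would pass the first row and fail those.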

Thanks.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#5251; Package emacs. (Wed, 17 Feb 2016 17:04:01 GMT)

Message #16 received at 5251-done <at> debbugs.gnu.org:

From: Toru TSUNEYOSHI <t_tuneyosi <at> hotmail.com>
To: eliz <at> gnu.org
Cc: ahyatt <at> gmail.com, 5251-done <at> debbugs.gnu.org
Subject: Re: bug#5251: 23.1; problem in decode_eol of coding.c
Date: Thu, 18 Feb 2016 02:03:08 +0900
Oh, I had forgotten this issue completely. :-)

From: Andrew Hyatt <ahyatt <at> gmail.com>
Subject: Re: bug#5251: 23.1; problem in decode_eol of coding.c
Date: Tue, 16 Feb 2016 22:41:32 -0500
Message-ID: <m2bn7gypeb.fsf <at> gmail.com>

> BTW, this seems like a nice thing to unit test with ert, if a fix does
> take place.

What does "ert" mean? Could you tell me?

From: Eli Zaretskii <eliz <at> gnu.org>
Subject: Re: bug#5251: 23.1; problem in decode_eol of coding.c
Date: Wed, 17 Feb 2016 17:53:41 +0200
Message-ID: <8337srwcxm.fsf <at> gnu.org>

> Thanks for the reminder, I fixed this now on the emacs-25 branch.
> (The proposed patch was not entirely correct, although it was in the
> right direction, and happened to fix this particular case.)

After checking my patch again, I thought it was probably not sufficient.

> better late than never.

Yes, that's right.

Thanks, Mr. Andrew Hyatt and Mr. Eli Zaretskii.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#5251; Package emacs. (Wed, 17 Feb 2016 19:25:02 GMT)

Message #19 received at 5251-done <at> debbugs.gnu.org:

From: Eli Zaretskii <eliz <at> gnu.org>
To: Toru TSUNEYOSHI <t_tuneyosi <at> hotmail.com>
Cc: ahyatt <at> gmail.com, 5251-done <at> debbugs.gnu.org
Subject: Re: bug#5251: 23.1; problem in decode_eol of coding.c
Date: Wed, 17 Feb 2016 21:24:23 +0200
> Date: Thu, 18 Feb 2016 02:03:08 +0900
> CC: ahyatt <at> gmail.com, 5251-done <at> debbugs.gnu.org
> From: Toru TSUNEYOSHI <t_tuneyosi <at> hotmail.com>
> 
> What does "ert" mean? Could you tell me?

ERT, the Emacs Lisp Regression Testing library.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#5251; Package emacs. (Thu, 18 Feb 2016 02:58:02 GMT)

Message #22 received at 5251-done <at> debbugs.gnu.org:

From: Toru TSUNEYOSHI <t_tuneyosi <at> hotmail.com>
To: eliz <at> gnu.org
Cc: ahyatt <at> gmail.com, 5251-done <at> debbugs.gnu.org
Subject: Re: bug#5251: 23.1; problem in decode_eol of coding.c
Date: Thu, 18 Feb 2016 11:56:54 +0900
From: Eli Zaretskii <eliz <at> gnu.org>
Subject: Re: bug#5251: 23.1; problem in decode_eol of coding.c
Date: Wed, 17 Feb 2016 21:24:23 +0200
Message-ID: <83mvqzuom0.fsf <at> gnu.org>

> ERT, the Emacs Lisp Regression Testing library.

I didn't know about it at all.
Thanks.




bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Thu, 17 Mar 2016 11:24:04 GMT)

This bug report was last modified 9 years and 98 days ago.
