GNU bug report logs - #25658
26.0.50; ELisp part in a mail isn't encoded properly

Package: emacs;

Reported by: Katsumi Yamaoka <yamaoka <at> jpl.org>

Date: Thu, 9 Feb 2017 02:37:02 UTC

Severity: normal

Found in version 26.0.50

Done: Katsumi Yamaoka <yamaoka <at> jpl.org>

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 25658 in the body.
You can then email your comments to 25658 AT debbugs.gnu.org in the normal way.


Report forwarded to bug-gnu-emacs <at> gnu.org:
bug#25658; Package emacs. (Thu, 09 Feb 2017 02:37:02 GMT)

Acknowledgement sent to Katsumi Yamaoka <yamaoka <at> jpl.org>:
New bug report received and forwarded. Copy sent to bug-gnu-emacs <at> gnu.org. (Thu, 09 Feb 2017 02:37:02 GMT)

Message #5 received at submit <at> debbugs.gnu.org:

From: Katsumi Yamaoka <yamaoka <at> jpl.org>
To: bug-gnu-emacs <at> gnu.org
Subject: 26.0.50; ELisp part in a mail isn't encoded properly
Date: Thu, 09 Feb 2017 11:35:50 +0900

Hi,

In a message draft, an ELisp part containing non-ASCII letters,
like the following, is not encoded properly.

<#part type="application/emacs-lisp" disposition=inline>
(defun mm-shr (handle)
  ...
	 ;; Remove "soft hyphens".
	 (goto-char (point-min))
	 (while (search-forward "­" nil t)
	   (replace-match "" t t))
<#/part>

This doesn't happen with Emacs 25.1.  Specifying the charset
explicitly, as follows, doesn't help.

<#part type="application/emacs-lisp" disposition=inline charset="utf-8">

Please note that you may want to quote mml tags when replying.

Thanks.

In GNU Emacs 26.0.50.1 (i686-pc-cygwin, GTK+ Version 3.18.9)
 of 2017-02-08 built on localhost
Windowing system distributor 'The Cygwin/X Project', version 11.0.11900000




Added indication that bug 25658 blocks 24655. Request was from Glenn Morris <rgm <at> gnu.org> to control <at> debbugs.gnu.org. (Thu, 09 Feb 2017 17:58:01 GMT)

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#25658; Package emacs. (Fri, 10 Feb 2017 00:57:01 GMT)

Message #10 received at 25658 <at> debbugs.gnu.org:

From: Katsumi Yamaoka <yamaoka <at> jpl.org>
To: 25658 <at> debbugs.gnu.org
Subject: Re: bug#25658: 26.0.50; ELisp part in a mail isn't encoded properly
Date: Fri, 10 Feb 2017 09:56:47 +0900

On Thu, 09 Feb 2017 11:35:50 +0900, Katsumi Yamaoka wrote:
> In a message draft, an ELisp part containing non-ASCII letters,
> like the following, is not encoded properly.

> <#part type="application/emacs-lisp" disposition=inline>
> (defun mm-shr (handle)
>   ...
> 	 ;; Remove "soft hyphens".
> 	 (goto-char (point-min))
> 	 (while (search-forward "­" nil t)
> 	   (replace-match "" t t))
> <#/part>

;; Note that "­" is a soft hyphen.

What Gnus wants to do is:

(quoted-printable-encode-string
 (encode-coding-string "­" 'iso-8859-1))
 => "=AD"

However what is actually done is:

(with-temp-buffer
  ;; `mml-generate-mime-1' does:
  (set-buffer-multibyte t)
  (insert "­")
  ;; `mm-encode-body' does:
  (encode-coding-region (point-min) (point-max) 'iso-8859-1)
  ;; `mm-encode-buffer' does:
  (quoted-printable-encode-region (point-min) (point-max))
  (buffer-string))
 => "=3FFFAD"

Hmm.

(with-temp-buffer
  (set-buffer-multibyte t)
  (insert "­")
  (encode-coding-region (point-min) (point-max) 'iso-8859-1)
  (append (buffer-string) nil))
 => (4194221)

This would probably be the multibyte version of:

(append (encode-coding-string "­" 'iso-8859-1) nil)
 => (173)
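
For context, 4194221 is #x3FFFAD in hex, which would explain the
"=3FFFAD" output above: the byte is being formatted as a character
code.  A minimal sketch, assuming an Emacs version in which raw
8-bit bytes occupy the character range #x3FFF80..#x3FFFFF:

```elisp
;; 4194221 is the multibyte "raw byte" character for byte #xAD (173).
(format "#x%X" 4194221)              ; => "#x3FFFAD"

;; Converting between the raw-byte character and the plain byte.
(multibyte-char-to-unibyte 4194221)  ; => 173
(unibyte-char-to-multibyte 173)      ; => 4194221

;; Raw-byte characters belong to the `eight-bit' charset.
(char-charset 4194221)               ; => eight-bit
```

So the two results, (4194221) and (173), denote the same byte in
its multibyte and unibyte representations respectively.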

Doesn't this mean we ought not to use `encode-coding-region'?
Anyway, I think we should do one of the following two things
here:

(with-temp-buffer
  (set-buffer-multibyte t)
  (insert "­")
  (encode-coding-region (point-min) (point-max) 'iso-8859-1)
  (set-buffer-multibyte nil)
  (quoted-printable-encode-region (point-min) (point-max))
  (buffer-string))
 => "=AD"

I'm not sure whether (set-buffer-multibyte nil) above does anything
beyond converting the characters to their unibyte versions one by
one.  OTOH, this is what I often do:

(with-temp-buffer
  (set-buffer-multibyte t)
  (insert "­")
  (insert (prog1
	      (encode-coding-string (buffer-string) 'iso-8859-1)
	    (erase-buffer)
	    (set-buffer-multibyte nil)))
  (quoted-printable-encode-region (point-min) (point-max))
  (buffer-string))
 => "=AD"
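
The string-only route shown at the top of this message could also be
wrapped in a small helper.  This is only a sketch: the name
`my-qp-encode-text' is hypothetical (it is not a Gnus function), and
it is not meant to represent the change actually made for this bug.

```elisp
(require 'qp)  ; provides `quoted-printable-encode-string'

(defun my-qp-encode-text (text coding-system)
  "Hypothetical helper: return TEXT as quoted-printable in CODING-SYSTEM.
Only string functions are used, so the encoded bytes are never
stored in a multibyte buffer."
  (quoted-printable-encode-string
   (encode-coding-string text coding-system)))

;; "\u00AD" is the soft hyphen from the report.
(my-qp-encode-text "\u00AD" 'iso-8859-1)  ; => "=AD"
```

Because `encode-coding-string' returns a unibyte string, the
quoted-printable step sees the byte 173 rather than the raw-byte
character 4194221.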

Regards,




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#25658; Package emacs. (Fri, 10 Feb 2017 07:52:01 GMT)

Message #13 received at 25658 <at> debbugs.gnu.org:

From: Eli Zaretskii <eliz <at> gnu.org>
To: Katsumi Yamaoka <yamaoka <at> jpl.org>
Cc: 25658 <at> debbugs.gnu.org
Subject: Re: bug#25658: 26.0.50; ELisp part in a mail isn't encoded properly
Date: Fri, 10 Feb 2017 09:51:32 +0200

> Date: Fri, 10 Feb 2017 09:56:47 +0900
> From: Katsumi Yamaoka <yamaoka <at> jpl.org>
> 
> On Thu, 09 Feb 2017 11:35:50 +0900, Katsumi Yamaoka wrote:
> > In a message draft, an ELisp part containing non-ASCII letters,
> > like the following, is not encoded properly.
> 
> > <#part type="application/emacs-lisp" disposition=inline>
> > (defun mm-shr (handle)
> >   ...
> > 	 ;; Remove "soft hyphens".
> > 	 (goto-char (point-min))
> > 	 (while (search-forward "­" nil t)
> > 	   (replace-match "" t t))
> > <#/part>
> 
> ;; Note that "­" is a soft hyphen.
> 
> What Gnus wants to do is:
> 
> (quoted-printable-encode-string
>  (encode-coding-string "­" 'iso-8859-1))
>  => "=AD"

Then why doesn't Gnus do exactly that?




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#25658; Package emacs. (Fri, 10 Feb 2017 17:35:02 GMT)

Message #16 received at 25658 <at> debbugs.gnu.org:

From: Glenn Morris <rgm <at> gnu.org>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: Katsumi Yamaoka <yamaoka <at> jpl.org>, 25658 <at> debbugs.gnu.org
Subject: Re: bug#25658: 26.0.50; ELisp part in a mail isn't encoded properly
Date: Fri, 10 Feb 2017 12:33:26 -0500

Eli Zaretskii wrote:

> Then why doesn't Gnus do exactly that?

Could it be... a bug?!   ;)




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#25658; Package emacs. (Sun, 12 Feb 2017 23:06:02 GMT)

Message #19 received at 25658 <at> debbugs.gnu.org:

From: Katsumi Yamaoka <yamaoka <at> jpl.org>
To: Glenn Morris <rgm <at> gnu.org>
Cc: Eli Zaretskii <eliz <at> gnu.org>, 25658 <at> debbugs.gnu.org
Subject: Re: bug#25658: 26.0.50; ELisp part in a mail isn't encoded properly
Date: Mon, 13 Feb 2017 08:05:41 +0900

On Fri, 10 Feb 2017 12:33:26 -0500, Glenn Morris wrote:
> Eli Zaretskii wrote:
>> Then why doesn't Gnus do exactly that?
> Could it be... a bug?!   ;)

Ok.  So,

cd lisp/gnus
egrep '\((decode|encode)-coding-region' *.el|wc -l
 => 10

are they all potentially bugs?




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#25658; Package emacs. (Mon, 13 Feb 2017 02:05:02 GMT)

Message #22 received at 25658 <at> debbugs.gnu.org:

From: Glenn Morris <rgm <at> gnu.org>
To: Katsumi Yamaoka <yamaoka <at> jpl.org>
Cc: Eli Zaretskii <eliz <at> gnu.org>, 25658 <at> debbugs.gnu.org
Subject: Re: bug#25658: 26.0.50; ELisp part in a mail isn't encoded properly
Date: Sun, 12 Feb 2017 21:03:29 -0500

Katsumi Yamaoka wrote:

>>> Then why doesn't Gnus do exactly that?
>> Could it be... a bug?!   ;)
>
> Ok.  So,
>
> cd lisp/gnus
> egrep '\((decode|encode)-coding-region' *.el|wc -l
>  => 10
>
> are they all potentially bugs?

Don't ask me, I was only being flippant. :)




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#25658; Package emacs. (Mon, 13 Feb 2017 05:45:01 GMT)

Message #25 received at 25658 <at> debbugs.gnu.org:

From: Eli Zaretskii <eliz <at> gnu.org>
To: Katsumi Yamaoka <yamaoka <at> jpl.org>
Cc: rgm <at> gnu.org, 25658 <at> debbugs.gnu.org
Subject: Re: bug#25658: 26.0.50; ELisp part in a mail isn't encoded properly
Date: Mon, 13 Feb 2017 07:44:07 +0200

> Date: Mon, 13 Feb 2017 08:05:41 +0900
> From: Katsumi Yamaoka <yamaoka <at> jpl.org>
> Cc: Eli Zaretskii <eliz <at> gnu.org>, 25658 <at> debbugs.gnu.org
> 
> On Fri, 10 Feb 2017 12:33:26 -0500, Glenn Morris wrote:
> > Eli Zaretskii wrote:
> >> Then why doesn't Gnus do exactly that?
> > Could it be... a bug?!   ;)
> 
> Ok.  So,
> 
> cd lisp/gnus
> egrep '\((decode|encode)-coding-region' *.el|wc -l
>  => 10
> 
> are they all potentially bugs?

Not necessarily, they need to be reviewed one by one.

My question was triggered by the fact that "what Gnus wants" was so
much simpler and obviously correct that it was a clear winner IMO.  If
the other places are all of the same variety, then yes, I'd suggest to
make similar replacements there as well.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#25658; Package emacs. (Mon, 13 Feb 2017 08:32:02 GMT)

Message #28 received at 25658 <at> debbugs.gnu.org:

From: Katsumi Yamaoka <yamaoka <at> jpl.org>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: rgm <at> gnu.org, 25658 <at> debbugs.gnu.org
Subject: Re: bug#25658: 26.0.50; ELisp part in a mail isn't encoded properly
Date: Mon, 13 Feb 2017 17:31:03 +0900

On Mon, 13 Feb 2017 07:44:07 +0200, Eli Zaretskii wrote:
>> cd lisp/gnus
>> egrep '\((decode|encode)-coding-region' *.el|wc -l
>>  => 10
>> are they all potentially bugs?

> Not necessarily, they need to be reviewed one by one.

Ok.  But I have personally come to think that *-coding-region
should never be used anymore.

> My question was triggered by the fact that "what Gnus wants" was so
> much simpler and obviously correct that it was a clear winner IMO.  If
> the other places are all of the same variety, then yes, I'd suggest to
> make similar replacements there as well.

I see; however, it's not so easy to simplify the code so as to
achieve exactly "what Gnus wants" (I mean using *-coding-string
in all cases).

Instead, I've modified `mm-encode-body' as an emergency fix.
In the Emacs master, only `mml-generate-mime-1' uses it.
(`rfc2231-encode-string' uses it as well, but now we use
 `rfc2047-encode-parameter' instead for encoding a file name.)

Regards,




bug closed, send any further explanations to 25658 <at> debbugs.gnu.org and Katsumi Yamaoka <yamaoka <at> jpl.org>. Request was from Katsumi Yamaoka <yamaoka <at> jpl.org> to control <at> debbugs.gnu.org. (Wed, 15 Feb 2017 22:12:02 GMT)

bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Thu, 16 Mar 2017 11:24:04 GMT)

This bug report was last modified 8 years and 102 days ago.


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.