GNU bug report logs - #39689
26.3; browse-url-mail not supporting RFC6068 (UTF-8-Based Percent-Encoding)

Package: emacs;

Reported by: Vegard Vesterheim <vegard.vesterheim <at> uninett.no>

Date: Thu, 20 Feb 2020 13:49:01 UTC

Severity: normal

Found in version 26.3

Fixed in version 28.1

Done: Lars Ingebrigtsen <larsi <at> gnus.org>

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 39689 in the body.
You can then email your comments to 39689 AT debbugs.gnu.org in the normal way.


Report forwarded to bug-gnu-emacs <at> gnu.org:
bug#39689; Package emacs. (Thu, 20 Feb 2020 13:49:01 GMT)

Acknowledgement sent to Vegard Vesterheim <vegard.vesterheim <at> uninett.no>:
New bug report received and forwarded. Copy sent to bug-gnu-emacs <at> gnu.org. (Thu, 20 Feb 2020 13:49:01 GMT)

Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):

From: Vegard Vesterheim <vegard.vesterheim <at> uninett.no>
To: bug-gnu-emacs <at> gnu.org
Subject: 26.3;
 browse-url-mail not supporting RFC6068 (UTF-8-Based Percent-Encoding)
Date: Thu, 20 Feb 2020 14:48:54 +0100
Emacs does not seem to handle UTF-8-based percent-encoding correctly, as
illustrated in Section 6.2 of RFC 6068.

The command 
   emacs -Q -l browse-url -eval '(browse-url-mail "mailto:user@example.org?subject=caf%C3%A9&body=caf%C3%A9")'

should result in a message buffer with the string "café" inserted into the
body part of the message. Instead, the string "cafÃ©" is inserted.
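
As an illustration only (not part of the original report), the decoding
that RFC 6068 calls for can be sketched in Python. The helper name
parse_mailto is hypothetical, and this is a simplified model of the
expected result, not of what browse-url-mail does internally:

```python
from urllib.parse import unquote


def parse_mailto(url):
    """Split a mailto URL into (recipient, headers), decoding %XX as UTF-8."""
    scheme, _, rest = url.partition(":")
    if scheme.lower() != "mailto":
        raise ValueError("not a mailto URL")
    to, _, query = rest.partition("?")
    headers = {}
    for field in filter(None, query.split("&")):
        name, _, value = field.partition("=")
        # unquote() gathers consecutive %XX escapes into bytes and decodes
        # them together as UTF-8, so %C3%A9 becomes a single character.
        headers[unquote(name).lower()] = unquote(value, encoding="utf-8")
    return unquote(to), headers


to, headers = parse_mailto(
    "mailto:user@example.org?subject=caf%C3%A9&body=caf%C3%A9"
)
print(to, headers)
```

The query is split by hand rather than with urllib.parse.parse_qsl
because parse_qsl also turns "+" into a space, which is not appropriate
for mailto URLs.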

I am running Ubuntu 18.04.3 LTS.

M-x emacs-version returns:
  "GNU Emacs 26.3 (build 2, x86_64-pc-linux-gnu, GTK+ Version 3.22.30) of 2019-09-16" 

$ locale -a | grep -i utf
C.UTF-8
en_AG.utf8
en_AU.utf8
en_BW.utf8
en_CA.utf8
en_DK.utf8
en_GB.utf8
en_HK.utf8
en_IE.utf8
en_IL.utf8
en_IN.utf8
en_NG.utf8
en_NZ.utf8
en_PH.utf8
en_SG.utf8
en_US.utf8
en_ZA.utf8
en_ZM.utf8
en_ZW.utf8
nb_NO.utf8

$ env | grep LC
LC_MEASUREMENT=en_US.UTF-8
LC_PAPER=en_US.UTF-8
LC_MONETARY=en_US.UTF-8
LC_NAME=en_US.UTF-8
LC_ADDRESS=en_US.UTF-8
LC_NUMERIC=en_US.UTF-8
LC_TELEPHONE=en_US.UTF-8
LC_IDENTIFICATION=en_US.UTF-8
LC_TIME=nb_NO.utf8

$ env | grep LANG
LANG=en_US.UTF-8
GDM_LANG=en
NLS_LANG=NORWEGIAN_NORWAY.WE8ISO8859P1
LANGUAGE=en


--

Vennlig hilsen/Best regards
Vegard Vesterheim
Senior Software engineer
+47 48 11 98 98
vegard.vesterheim <at> uninett.no




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#39689; Package emacs. (Thu, 27 Feb 2020 10:52:02 GMT)

Message #8 received at 39689 <at> debbugs.gnu.org (full text, mbox):

From: Robert Pluim <rpluim <at> gmail.com>
To: Vegard Vesterheim <vegard.vesterheim <at> uninett.no>
Cc: 39689 <at> debbugs.gnu.org, larsi <at> gnus.org
Subject: Re: bug#39689: 26.3; browse-url-mail not supporting RFC6068
 (UTF-8-Based Percent-Encoding)
Date: Thu, 27 Feb 2020 11:51:35 +0100
>>>>> On Thu, 20 Feb 2020 14:48:54 +0100, Vegard Vesterheim via "Bug reports for GNU Emacs, the Swiss army knife of text editors" <bug-gnu-emacs <at> gnu.org> said:

    Vegard> Emacs does not seem to handle UTF-8-based percent-encoding correctly, as
    Vegard> illustrated in Section 6.2 of RFC 6068.

    Vegard> The command 
    Vegard>    emacs -Q -l browse-url -eval '(browse-url-mail "mailto:user@example.org?subject=caf%C3%A9&body=caf%C3%A9")'

    Vegard> should result in a message buffer with the string "café" inserted into the
    Vegard> body part of the message. Instead, the string "cafÃ©" is inserted.

Yes, rfc2368-unhexify-string assumes that only ASCII characters are
percent-escaped.
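
To make the difference concrete, here is a rough Python model of the two
behaviours. It is not the Emacs code; the function names
unhex_per_escape and unhex_utf8 are made up for this sketch, and the
first one only mimics the assumption described above (each %XX is
turned into a character on its own):

```python
import re
from urllib.parse import unquote_to_bytes


def unhex_per_escape(s):
    # Convert every %XX independently to the character with that code.
    # This is only correct when the escaped values are ASCII.
    return re.sub(
        r"%([0-9A-Fa-f]{2})",
        lambda m: chr(int(m.group(1), 16)),
        s,
    )


def unhex_utf8(s):
    # Collect the raw bytes first, then decode the whole sequence as UTF-8,
    # so multi-byte characters such as %C3%A9 are reassembled.
    return unquote_to_bytes(s).decode("utf-8")


print(unhex_per_escape("caf%C3%A9"))
print(unhex_utf8("caf%C3%A9"))
```

The first call yields two Latin-1 characters (U+00C3 and U+00A9) in
place of the single "é" that the second call produces.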

epg--decode-percent-escape-as-utf-8 in epg.el does the
right thing, it could be renamed and moved. I think rfc2047 decoding
needs doing on the result as well. Lars, should I just stick these in
rfc2368.el but named something like rfc6068-unhexify-string and
rfc6068-decode-2047-string or something?
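
A sketch of that two-step decoding, written in Python purely for
illustration: percent-decode as UTF-8 first, then decode any RFC 2047
encoded words in the result. The function name decode_mailto_value is
hypothetical, and the check for "=?" is a simplification assumed for
this sketch:

```python
from email.header import decode_header, make_header
from urllib.parse import unquote


def decode_mailto_value(value):
    # Step 1: undo the percent-encoding, treating the bytes as UTF-8.
    unescaped = unquote(value, encoding="utf-8")
    # Step 2: if the result appears to contain an RFC 2047 encoded word
    # ("=?charset?Q?...?="), decode that as well.
    if "=?" in unescaped:
        return str(make_header(decode_header(unescaped)))
    return unescaped


# An encoded word that has itself been percent-encoded.
print(decode_mailto_value("%3D%3Futf-8%3FQ%3Fcaf%3DC3%3DA9%3F%3D"))
# Plain UTF-8 percent-encoding.
print(decode_mailto_value("caf%C3%A9"))
```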

Robert




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#39689; Package emacs. (Thu, 27 Feb 2020 15:09:02 GMT)

Message #11 received at 39689 <at> debbugs.gnu.org (full text, mbox):

From: Robert Pluim <rpluim <at> gmail.com>
To: Vegard Vesterheim <vegard.vesterheim <at> uninett.no>
Cc: 39689 <at> debbugs.gnu.org, larsi <at> gnus.org
Subject: Re: bug#39689: 26.3; browse-url-mail not supporting RFC6068
 (UTF-8-Based Percent-Encoding)
Date: Thu, 27 Feb 2020 16:08:25 +0100
>>>>> On Thu, 27 Feb 2020 11:51:35 +0100, Robert Pluim <rpluim <at> gmail.com> said:

    Robert> >>>>> On Thu, 20 Feb 2020 14:48:54 +0100, Vegard Vesterheim via "Bug
    Robert> reports for GNU Emacs, the Swiss army knife of text editors"
    Robert> <bug-gnu-emacs <at> gnu.org> said:

    Vegard> Emacs does not seem to handle UTF-8-based percent-encoding correctly, as
    Vegard> illustrated in Section 6.2 of RFC 6068.

    Vegard> The command 
    Vegard> emacs -Q -l browse-url -eval '(browse-url-mail "mailto:user@example.org?subject=caf%C3%A9&body=caf%C3%A9")'

    Vegard> should result in a message buffer with the string "café" inserted into the
    Vegard> body part of the message. Instead, the string "cafÃ©" is inserted.

    Robert> Yes, rfc2368-unhexify-string assumes that only ASCII characters are
    Robert> percent-escaped.

    Robert> epg--decode-percent-escape-as-utf-8 in epg.el does the
    Robert> right thing, it could be renamed and moved. I think rfc2047 decoding
    Robert> needs doing on the result as well. Lars, should I just stick these in
    Robert> rfc2368.el but named something like rfc6068-unhexify-string and
    Robert> rfc6068-decode-2047-string or something?

Oh, and thereʼs another version in gnus-util, and one in url, and an
almost compatible one in org [1]. The gnus and url ones suffer from
this same issue, although they both return different wrong results :-)

At least the epg, rfc2368, gnus, and the url versions look like they
can be unified. Not sure where to put them though.

Footnotes:
[1]  It supports % representation of the UTF-8 encoding of chars, but
     also of the unicode code point of chars, so eg %E1 gets turned
     into á. Iʼm sure thereʼs some historical reason for that.
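
The dual behaviour described in the footnote can be approximated with a
short Python sketch. This is not the org implementation; the name
unhex_lenient and the whole-string fallback are assumptions made for
this illustration:

```python
from urllib.parse import unquote_to_bytes


def unhex_lenient(s):
    raw = unquote_to_bytes(s)
    try:
        # Preferred reading: the escapes spell out UTF-8 bytes.
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # Fallback: treat each byte 0xNN as the code point U+00NN,
        # which is exactly what decoding as Latin-1 does.
        return raw.decode("latin-1")


print(unhex_lenient("caf%C3%A9"))
print(unhex_lenient("%E1"))
```

A lone %E1 is not valid UTF-8, so it falls through to the code point
reading. The fallback applies to the whole string at once, so input
that mixes both conventions would not decode cleanly with this sketch.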





Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#39689; Package emacs. (Sat, 14 Mar 2020 12:26:02 GMT)

Message #14 received at 39689 <at> debbugs.gnu.org (full text, mbox):

From: Lars Ingebrigtsen <larsi <at> gnus.org>
To: Robert Pluim <rpluim <at> gmail.com>
Cc: 39689 <at> debbugs.gnu.org, Vegard Vesterheim <vegard.vesterheim <at> uninett.no>
Subject: Re: bug#39689: 26.3; browse-url-mail not supporting RFC6068
 (UTF-8-Based Percent-Encoding)
Date: Sat, 14 Mar 2020 13:25:49 +0100
Robert Pluim <rpluim <at> gmail.com> writes:

> Oh, and thereʼs another version in gnus-util, and one in url, and an
> almost compatible one in org [1]. The gnus and url ones suffer from
> this same issue, although they both return different wrong results :-)

:-)

> At least the epg, rfc2368, gnus, and the url versions look like they
> can be unified. Not sure where to put them though.

Putting them in either rfc2368 or url.el would make sense.  Hm...
perhaps url-util.el?

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#39689; Package emacs. (Sun, 25 Oct 2020 11:44:01 GMT)

Message #17 received at 39689 <at> debbugs.gnu.org (full text, mbox):

From: Vegard Vesterheim <vegard.vesterheim <at> uninett.no>
To: Lars Ingebrigtsen <larsi <at> gnus.org>
Cc: 39689 <at> debbugs.gnu.org, Robert Pluim <rpluim <at> gmail.com>
Subject: Re: bug#39689: 26.3;
 browse-url-mail not supporting RFC6068 (UTF-8-Based Percent-Encoding)
Date: Sun, 25 Oct 2020 12:41:53 +0100
On Sat, 14 Mar 2020 13:25:49 +0100 Lars Ingebrigtsen <larsi <at> gnus.org> wrote:

> Robert Pluim <rpluim <at> gmail.com> writes:
>
>> Oh, and thereʼs another version in gnus-util, and one in url, and an
>> almost compatible one in org [1]. The gnus and url ones suffer from
>> this same issue, although they both return different wrong results :-)
>
> :-)
>
>> At least the epg, rfc2368, gnus, and the url versions look like they
>> can be unified. Not sure where to put them though.
>
> Putting them in either rfc2368 or url.el would make sense.  Hm...
> perhaps url-util.el?

I am assuming this bug is not yet fixed. Can anyone advise on a
workaround I can apply for this bug? I am using Emacs 26.1 (as packaged
in Debian buster).

-- 
Vennlig hilsen/Best regards
Vegard Vesterheim
Senior Software engineer
+47 48 11 98 98
vegard.vesterheim <at> uninett.no




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#39689; Package emacs. (Mon, 30 Aug 2021 00:06:01 GMT)

Message #20 received at 39689 <at> debbugs.gnu.org (full text, mbox):

From: Lars Ingebrigtsen <larsi <at> gnus.org>
To: Robert Pluim <rpluim <at> gmail.com>
Cc: 39689 <at> debbugs.gnu.org, Vegard Vesterheim <vegard.vesterheim <at> uninett.no>
Subject: Re: bug#39689: 26.3; browse-url-mail not supporting RFC6068
 (UTF-8-Based Percent-Encoding)
Date: Mon, 30 Aug 2021 02:05:12 +0200
Lars Ingebrigtsen <larsi <at> gnus.org> writes:

>> At least the epg, rfc2368, gnus, and the url versions look like they
>> can be unified. Not sure where to put them though.
>
> Putting them in either rfc2368 or url.el would make sense.  Hm...
> perhaps url-util.el?

I made a new file, obsoleted rfc2368, and adjusted browse-url and epg.
So the original reported problem should now be fixed in Emacs 28.

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no




bug marked as fixed in version 28.1, send any further explanations to 39689 <at> debbugs.gnu.org and Vegard Vesterheim <vegard.vesterheim <at> uninett.no> Request was from Lars Ingebrigtsen <larsi <at> gnus.org> to control <at> debbugs.gnu.org. (Mon, 30 Aug 2021 00:06:01 GMT)

bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Mon, 27 Sep 2021 11:24:08 GMT)

This bug report was last modified 3 years and 260 days ago.

